Bad Index - use Clustering Factor to determine when a range scan is expensive

Oracle's Optimizer attempts to pick the most efficient plan for resolving a SQL statement.  Most of the time, it succeeds and execution time is fast.  Sometimes, it doesn't and execution time is slow.  If the Optimizer chooses a bad index, it will not only be slow, it will consume storage resources and potentially negatively impact other processes in a shared SAN environment.

The crucial metric is "clustering factor".  Is shows the rough number of I/O operations that will be performed during an index range scan.  This metric can span on the low side from the number of blocks in the table to the number of rows in the table on the high side.  Even when the storage subsystem runs at a high IOPS (I/Os Per Second), if an index range scan drives a high number, the elapsed time will add up.  Reduce the number of I/Os and reduce the execution time.

Index Clustering Factor Very High

Intensive I/O wait activity on table due to a range scan carried out by an index with a bad clustering factor (mismatch between physical order of rows in table and order of ROWIDs from the index range scan leads to re-reading of table blocks).  In Precise, "I/O Wait" is the total time that the SQL statements wait for the storage subsystem whether direct attached flash or a shared SAN.  Oracle performance is dependent on storage performance.

The following example shows the effect that a bad clustered index can have on performance when an index is scanned:

The figure below shows an example of an index with a good clustering factor. In this example, the root is read first, followed by the first leaf page. Then the first data block that serves the first three keys matching the three rows in the data block is fetched. In this way the keys and data blocks that follow are read. The I/O operations required by this scan include five index blocks and six data blocks, which is the equivalence of 11 I/O operations.

Index with good clustering factor (low = number of table blocks)

The following figure shows an example of an index with a bad clustering factor.

The index with the bad clustering factor starts in the same way as the index with the good clustering factor. However, when the second key in the index is read, the row for the second key in the first data block has not yet been fetched, so another block must be fetched. By the time Oracle accesses the index key matching the second row in the first table block, it has already been swapped out of memory and needs to be re-read. In the worse case scenario, I/O for the table blocks will be required for every index key. The I/O operations required by this scan include five index blocks and 16 table blocks, which is equivalence of 21 I/O operations. When the difference between the number of blocks and number of rows is great, performance can be greatly impacted.

Index with bad clustering factor (high = number of rows)

The Objects button in Precise makes it easy to find the relevant metrics.

Remedies to consider when an index has a high clustering factor and is used in a scan:

  1. Use the HINT, full(table_name), to force a full table scan and save I/Os.
  2.  Covering Index - use an index only access path to eliminate the reading of the table itself.  This is accomplished by adding the additional columns used by the SQL statement to the index. 
  3.  Reorder the data in the table to be in the same order as the most frequently used index.
  4. The order of columns in the index will change the clustering factor.
  5.  Partitioning to limit the span of the scan.
  6.  Archiving to minimize the data in the active tables.