Quantcast
Channel: Mohamed Houri’s Oracle Notes
Viewing all articles
Browse latest Browse all 224

CBO decision: unique or non-unique index?

$
0
0

I have been asked to look at one of those few particular frustrating situations that only running systems can procure. It is an update of a single table using a complete set of its primary key columns in order to locate and update a unique row. This update looks like:

UPDATE T1
SET
  {list of columns}
WHERE
    T1_DATE    = :B9
AND T1_I_E_ID  = :B8
AND T1_TYPE    = :B7
AND DATE_TYPE  = :B6
AND T1_AG_ID   = :B5
AND T1_ACC_ID  = :B4
AND T1_SEC_ID  = :B3
AND T1_B_ID    = :B2
AND T1_FG_ID   = :B1;

The 9 columns in the above where clause represent the primary key of the T1 table. You might be surprised to know that this update didn’t used the primary key index and preferred instead a range scan of an existing 3 columns index plus a table access by index rowid to locate and update a unique row:

----------------------------------------------------------------------------
| Id  | Operation                    | Name   | Rows  | Bytes | Cost (%CPU)|
----------------------------------------------------------------------------
|   0 | UPDATE STATEMENT             |        |       |       |     1 (100)|
|   1 |  UPDATE                      | T1     |       |       |            |
|   2 |   TABLE ACCESS BY INDEX ROWID| T1     |     1 |   122 |     1   (0)|
|   3 |    INDEX RANGE SCAN          | IDX_T1 |     1 |       |     1   (0)|
----------------------------------------------------------------------------
   2 - filter("T1_TYPE"=:B7 AND "DATE_TYPE"=:B6 AND "T1_I_E_ID"=:B8 AND
              "T1_AG_ID"=TO_NUMBER(:B5) AND "T1_DATE"=TO_TIMESTAMP(:B9) AND
              "T1_FG_ID"=TO_NUMBER(:B1))
   3 - access("T1_SEC_ID"=TO_NUMBER(:B3) AND "T1_B_ID"=TO_NUMBER(:B2) AND
              "T1_ACC_ID"=TO_NUMBER(:B4))

The same update, when hinted with the primary key index, uses the following much desirable execution plan:

------------------------------------------------------------------
| Id  | Operation          | Name   | Rows  | Bytes | Cost (%CPU)|
------------------------------------------------------------------
|   0 | UPDATE STATEMENT   |        |     1 |   126 |     1   (0)|
|   1 |  UPDATE            | T1     |       |       |            |
|*  2 |   INDEX UNIQUE SCAN| PK_T19 |     1 |   126 |     1   (0)|
------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("T1_I_E_ID"=:B8 AND "T1_TYPE"=:B7 AND "DATE_TYPE"=:B6
              AND "T1_AG_ID"=TO_NUMBER(:B5) AND "T1_ACC_ID"=TO_NUMBER(:B4) AND
              "T1_SEC_ID"=TO_NUMBER(:B3) AND "T1_B_ID"=TO_NUMBER(:B2) AND
              "T1_FG_ID"=TO_NUMBER(:B1) AND "T1_DATE"=TO_TIMESTAMP(:B9))

I don’t know how Oracle manage to get the same cost (Cost = 1) for two completely different indexes one with 9 columns and one(which is a subset of the first index) with only 3 columns (not necessarily of the same starting order though)?

So why Oracle has not selected the primary key unique index?

First here are below the available statistics on the primary key columns:

SQL> select
      column_name
     ,num_distinct
     ,num_nulls
     ,histogram
    from
      all_tab_col_statistics
    where
     table_name = 'T1'
   and
    column_name in ('T1_DATE'
                   ,'T1_I_E_ID'
                   ,'T1_TYPE'
                   ,'DATE_TYPE'
                   ,'T1_AG_ID'
                   ,'T1_ACC_ID'
                   ,'T1_SEC_ID'
                   ,'T1_B_ID'
                   ,'T1_FG_ID' );

COLUMN_NAME         NUM_DISTINCT  NUM_NULLS HISTOGRAM
------------------- ------------ ---------- ---------------
T1_I_E_ID              2          0 		FREQUENCY
T1_TYPE                5          0 		FREQUENCY
DATE_TYPE              5          0 		FREQUENCY
T1_AG_ID               106        0 		FREQUENCY
T1_ACC_ID             182         0 		FREQUENCY
T1_DATE               2861        0 		HEIGHT BALANCED
T1_SEC_ID             3092480     0 		NONE
T1_B_ID               1452        0 		HEIGHT BALANCED
T1_FG_ID              1           0 		FREQUENCY

And here’s the corresponding “update” 10053 trace file restricted only to the important part related to my investigations

***************************************
BASE STATISTICAL INFORMATION
***********************
Table Stats::
  Table: T1  Alias: T1
    #Rows: 387027527  #Blks:  6778908  AvgRowLen:  126.00  ChainCnt:  0.00

Index Stats::
Index: IDX_T1  Col#: 7 8 5
 LVLS: 3  #LB: 1568007  #DK: 3314835  LB/K: 1.00   DB/K: 25.00  CLUF: 84758374.00

Index: PK_T19  Col#: 1 2 3 4 5 7 8 37 6
 LVLS: 4  #LB: 4137117  #DK: 377310281  LB/K: 1.00  DB/K: 1.00  CLUF: 375821219.00

And the part of same trace file where the index choice is done

Access Path: index (UniqueScan)
    Index: PK_T19
    resc_io: 5.00  resc_cpu: 37647 ----------------------> spot this
    ix_sel: 0.000000  ix_sel_with_filters: 0.000000
    Cost: 1.00  Resp: 1.00  Degree: 1
    ColGroup Usage:: PredCnt: 3  Matches Full:  Partial:
    ColGroup Usage:: PredCnt: 3  Matches Full:  Partial:

Access Path: index (AllEqRange)
    Index: IDX_T1
    resc_io: 5.00  resc_cpu: 36797 ----------------------> spot this
    ix_sel: 0.000000  ix_sel_with_filters: 0.000000
    Cost: 1.00  Resp: 1.00  Degree: 1

Best:: AccessPath: IndexRange
  Index: IDX_T1
    Cost: 1.00  Degree: 1  Resp: 1.00  Card: 0.00  Bytes: 0

Looking closely to the above trace file I didn’t find any difference in the index costing information (ix_sel_with_filters, resc_io, cost) which favours the non-unique IDX_T1 index over the PK_T19 primary key unique index except the resc_cpu value which equals 36797 for the former and 37647 for the latter index. I haven’t considered the clustering factor information because the index I wanted the CBO to use is a unique index. The two indexes have the same cost in this case. What extra information the CBO is using in this case to prefer the non-unique index over the unique one?

This issue remembered me an old otn thread in which the Original Poster says that, under the default CPU costing model, when two indexes have the same cost, Oracle will consider using the less CPU expensive index.

As far as I have a practical case of two different indexes with the same cost, I have decided to check this assumption by changing the costing model from CPU to I/O

SQL> alter session set "_optimizer_cost_model"=io;

SQL> explain plan for
UPDATE T1
SET
  {list of columns}
WHERE
    T1_DATE   = :B9
AND T1_I_E_ID = :B8
AND T1_TYPE   = :B7
AND DATE_TYPE = :B6
AND T1_AG_ID  = :B5
AND T1_ACC_ID = :B4
AND T1_SEC_ID = :B3
AND T1_B_ID   = :B2
AND T1_FG_ID  = :B1;

SQL> select * from table(dbms_xplan.display);

Plan hash value: 704748203
-------------------------------------------------------------
| Id  | Operation          | Name   | Rows  | Bytes | Cost  |
-------------------------------------------------------------
|   0 | UPDATE STATEMENT   |        |     1 |   126 |     1 |
|   1 |  UPDATE            | T1     |       |       |       |
|*  2 |   INDEX UNIQUE SCAN| PK_T19 |     1 |   126 |     1 |
-------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("T1_I_E_ID"=:B8 AND "T1_TYPE"=:B7 AND
              "DATE_TYPE"=:B6 AND "T1_AG_ID"=TO_NUMBER(:B5) AND
              "T1_ACC_ID"=TO_NUMBER(:B4) AND "T1_SEC_ID"=TO_NUMBER(:B3) AND
              "T1_B_ID"=TO_NUMBER(:B2) AND "T1_FG_ID"=TO_NUMBER(:B1) AND
              "T1_DATE"=TO_TIMESTAMP(:B9))

Note
-----
   - cpu costing is off (consider enabling it)

Spot on. We get the desired primary key index without any help.

However changing the default costing model is not acceptable in the client PRODUCTION database. Continuing my root causes investigations I was getting the feeling that histograms are messing up this CBO index choice. This is why I decided to give it a try and get rid of histograms and analyze the corresponding 10053 trace file:

SQL> exec dbms_stats.gather_table_stats (user, 'T1', method_opt => 'for all columns size 1');


***************************************
BASE STATISTICAL INFORMATION
***********************
Table Stats::
  Table: T1  Alias: T1
    #Rows: 387029216  #Blks:  6778908  AvgRowLen:  126.00  ChainCnt:  0.00

Index Stats::
  Index: IDX_T1  Col#: 7 8 5
  LVLS: 3  #LB: 1625338  #DK: 3270443  LB/K: 1.00   DB/K: 26.00  CLUF: 87831324.00

  Index: PK_T19  Col#: 1 2 3 4 5 7 8 37 6
  LVLS: 4  #LB: 4335908  #DK: 395902926  LB/K: 1.00  DB/K: 1.00  CLUF: 394578898.00

Access Path: index (UniqueScan)
    Index: PK_T19
    resc_io: 5.00  resc_cpu: 37647
    ix_sel: 0.000000  ix_sel_with_filters: 0.000000
    Cost: 1.00  Resp: 1.00  Degree: 1
  ColGroup Usage:: PredCnt: 3  Matches Full: #1  Partial:  Sel: 0.0000
  ColGroup Usage:: PredCnt: 3  Matches Full: #1  Partial:  Sel: 0.0000

  Access Path: index (AllEqRange)
    Index: IDX_T1
    resc_io: 31.00  resc_cpu: 370081
    ix_sel: 0.000000  ix_sel_with_filters: 0.000000
    Cost: 6.20  Resp: 6.20  Degree: 1 ------> spot the cost

Access Path: index (AllEqUnique)
    Index: PK_T19
    resc_io: 5.00  resc_cpu: 37647
    ix_sel: 0.000000  ix_sel_with_filters: 0.000000
    Cost: 1.00  Resp: 1.00  Degree: 1
 One row Card: 1.000000

  Best:: AccessPath: IndexUnique
  Index: PK_T19
    Cost: 1.00  Degree: 1  Resp: 1.00  Card: 1.00  Bytes: 0

And now that the cost of accessing the non-unique index became 6 times (Cost = 6,2) greater than that of the unique index (Cost =1) Oracle preferred the primary key index without any help:

============
Plan Table
============
-------------------------------------------------+----------------------+
| Id  | Operation           | Name  | Rows  | Bytes | Cost  | Time      |
-------------------------------------------------+----------------------+
| 0   | UPDATE STATEMENT    |       |       |       |     1 |           |
| 1   |  UPDATE             | T1    |       |       |       |           |
| 2   |   INDEX UNIQUE SCAN | PK_T19|     1 |   126 |     1 |  00:00:01 |
-------------------------------------------------+-----------------------+

The final acceptable decision to solve this issue was to hint an instance of the same query to use the primary key index and attach an SQL profile to the original packaged query using the plan_hash_value of the primary key index execution plan.

Bottom Line: under the default CPU costing model, when two (or more) indexes have the same cost Oracle will prefer using the index that it is going to consume the less amount of CPU (resc_cpu). And, before jumping on collecting histograms (particularly the Height Balanced ones) by default, be informed that they do participate in the perception Oracle has on the amount of CPU the different indexes will need and ultimately on the CBO index desirability.



Viewing all articles
Browse latest Browse all 224

Trending Articles