Here is a very brief discussion I had with one of my colleagues about index design.
Colleague: what kind of index would you suggest to cover the following query?
SELECT
rowid
,a.*
FROM message_out a
WHERE sms_status in (700, 707)
AND (scheduled_time is null
OR scheduled_time <= :1)
AND provider_id in (0,0)
ORDER BY
priority_level desc,
creation_time asc;
----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)|
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 5529 | 1376K | | 4769 (1) |
| 1 | SORT ORDER BY | | 5529 | 1376K | 1856K | 4769 (1) |
|* 2 | TABLE ACCESS FULL| MESSAGE_OUT | 5529 | 1376K | | 4462 (1) |
----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(("SMS_STATUS"=700 OR "SMS_STATUS"=707)
AND ("SCHEDULED_TIME" IS NULL OR "SCHEDULED_TIME"<=:1)
AND "PROVIDER_ID"=0)
Me: and what have you come up with so far?
Colleague: here is my suggested index and the related execution plan
CREATE INDEX virtual_index ON MESSAGE_OUT(sms_status,scheduled_time,provider_id) ;
---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)|
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 5529 | 1376K | | 446 (1) |
| 1 | SORT ORDER BY | | 5529 | 1376K | 1856K | 446 (1) |
| 2 | INLIST ITERATOR | | | | | |
|* 3 | TABLE ACCESS BY INDEX ROWID| MESSAGE_OUT | 5529 | 1376K | | 140 (0) |
|* 4 | INDEX RANGE SCAN | VIRTUAL_INDEX | 5529 | | | 6 (0) |
---------------------------------------------------------------------------------------
Me: I would not have created the same index
Me: here is the index I would have created (after asking several questions about the data distribution, the table data volume, the use of bind variables, etc.)
create index mho_ind on MESSAGE_OUT (sms_status, provider_id, scheduled_time);
Me: and if sms_status contains repetitive values then I would have added a COMPRESS clause to that index creation
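As a minimal sketch of that compression idea: the prefix length of 1 below is an assumption (only the leading column sms_status is taken to be repetitive) and has not been measured on the real table.

```sql
-- Assumption: only the leading column (sms_status) is highly repetitive,
-- so the index is compressed on a prefix of 1 column.
create index mho_ind on message_out (sms_status, provider_id, scheduled_time) compress 1;

-- Optional check of the prefix length Oracle itself would suggest.
-- Note: VALIDATE STRUCTURE blocks DML on the table while it runs.
analyze index mho_ind validate structure;

-- INDEX_STATS holds a single row for the last index analyzed in this session.
select name, opt_cmpr_count, opt_cmpr_pctsave
from   index_stats;
```

If opt_cmpr_count comes back different from 1, the index can be rebuilt with that prefix length instead.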
Colleague: there is no difference in the execution plan whether I use my index or yours
------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)|
------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 5529 | 1376K | | 446 (1) |
| 1 | SORT ORDER BY | | 5529 | 1376K | 1856K | 446 (1) |
| 2 | INLIST ITERATOR | | | | | |
|* 3 | TABLE ACCESS BY INDEX ROWID | MESSAGE_OUT | 5529 | 1376K | | 140 (0) |
|* 4 | INDEX RANGE SCAN | VIRTUAL_INDEX | 5529 | | | 6 (0) |
------------------------------------------------------------------------------------------
Me: no, it is not the same plan. Please always consider the predicate part
Me: what is the predicate part of the plan using your index?
Colleague: this is my index predicate part
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter("SCHEDULED_TIME" IS NULL OR "SCHEDULED_TIME"<=:1)
4 - access(("SMS_STATUS"=700 OR "SMS_STATUS"=707) AND "PROVIDER_ID"=0)
filter("PROVIDER_ID"=0) --> additional filter operation
Colleague: and this is your index predicate part
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter("SCHEDULED_TIME" IS NULL OR "SCHEDULED_TIME"<=:1)
4 - access(("SMS_STATUS"=700 OR "SMS_STATUS"=707) AND "PROVIDER_ID"=0)
--> no additional filter operation
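For readers who want to reproduce such a predicate section, one possible way is sketched here. It assumes the standard DBMS_XPLAN package and a SQL*Plus session, and leaves :1 as a bind variable exactly as in the original query.

```sql
-- EXPLAIN PLAN does not peek at bind values, so the plan it reports
-- may differ from the one actually used at run time.
explain plan for
select rowid, a.*
from   message_out a
where  sms_status in (700, 707)
and    (scheduled_time is null or scheduled_time <= :1)
and    provider_id in (0,0)
order by priority_level desc, creation_time asc;

-- The default TYPICAL format includes the "Predicate Information" section,
-- where access() and filter() entries are listed per operation id.
select * from table(dbms_xplan.display);
```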
Me: and have you spotted the difference yet?
Colleague: no, same plan, same cost and same execution time
Me: there is a fundamental difference between your plan and mine. With your index the range scan needs two operations, an “ACCESS” plus a “FILTER”, while with my index it needs only one precise operation: “ACCESS”
Me: and when it comes to performance you should always prefer a precise index ACCESS operation to an ACCESS followed by a FILTER.
Me: the second column of your index is a column on which an inequality predicate is applied
SCHEDULED_TIME <= :1
You should always start your index with the columns on which an equality predicate is applied. Because SCHEDULED_TIME sits between SMS_STATUS and PROVIDER_ID in your index, the PROVIDER_ID = 0 predicate cannot fully bound the range scan and has to be re-checked as a filter on the index entries visited. In my index, I put the SCHEDULED_TIME column at the trailing edge, and by doing so I avoided a costly filter operation on the index, while your index is subject to that filter operation.
If you want to test this behaviour, below is an example to play with. I hope you will enjoy it.
SQL> create table t1
(id number,
n_1000 number,
n_5000 number,
n_10000 number,
small_vc varchar2(20),
padding varchar2(100)
);
Table created.
SQL> insert into t1
with generator as (
select --+ materialize
rownum id
from dual
connect by
rownum <= 10000
)
select
rownum id,
mod(rownum,1000) n_1000,
mod(rownum,5000) n_5000,
mod(rownum,10000) n_10000,
lpad(rownum,10,'0') small_vc,
rpad('x',100) padding
from
generator v1,
generator v2
where
rownum <= 100000
;
SQL> create index my_ind on t1(id, n_5000, n_1000);
Index created.
SQL> create index colleague_ind on t1(id, n_1000, n_5000);
Index created.
SQL> alter index my_ind invisible;
Index altered.
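To double check which of the two indexes the optimizer is allowed to use at any point in the test, the data dictionary can be queried. A small sketch, assuming the table was created in the current schema:

```sql
-- Lists each index on T1 with its current visibility (VISIBLE or INVISIBLE).
select index_name, visibility
from   user_indexes
where  table_name = 'T1';
```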
SQL> exec dbms_stats.gather_table_stats(user, 't1');
PL/SQL procedure successfully completed.
SQL> select
a.*
from t1 a
where id in (112,120)
and (n_1000 is null
or n_1000 <= 3000)
and n_5000 in (120);
Statistics
------------------------------------------------------
65 recursive calls
0 db block gets
95 consistent gets ---> spot this
0 physical reads
0 redo size
1005 bytes sent via SQL*Net to client
543 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
6 sorts (memory)
0 sorts (disk)
1 rows processed
SQL_ID 7d6ag1m1ztpgr, child number 1
-------------------------------------
select a.* from t1 a where id in (112,120) and (n_1000 is null
or n_1000 <= 3000) and n_5000 in (120)
Plan hash value: 3644584748
-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 4 (100)|
| 1 | INLIST ITERATOR | | | | |
|* 2 | TABLE ACCESS BY INDEX ROWID BATCHED| T1 | 2 | 258 | 4 (0)|
|* 3 | INDEX RANGE SCAN | COLLEAGUE_IND | 2 | | 3 (0)|
-------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(("N_1000"<=3000 OR "N_1000" IS NULL))
3 - access((("ID"=112 OR "ID"=120)) AND "N_5000"=120)
filter("N_5000"=120) ---> spot this
SQL> alter index colleague_ind invisible;
Index altered.
SQL> alter index my_ind visible;
Index altered.
SQL> select
a.*
from t1 a
where id in (112,120)
and (n_1000 is null
or n_1000 <= 3000)
and n_5000 in (120);
Statistics
------------------------------------------------------
33 recursive calls
0 db block gets
49 consistent gets --> spot the reduction
0 physical reads
0 redo size
1005 bytes sent via SQL*Net to client
543 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
6 sorts (memory)
0 sorts (disk)
1 rows processed
SQL_ID 7d6ag1m1ztpgr, child number 1
-------------------------------------
select a.* from t1 a where id in (112,120) and (n_1000 is null
or n_1000 <= 3000) and n_5000 in (120)
Plan hash value: 4286547933
------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 4 (100)|
| 1 | INLIST ITERATOR | | | | |
|* 2 | TABLE ACCESS BY INDEX ROWID BATCHED| T1 | 2 | 258 | 4 (0)|
|* 3 | INDEX RANGE SCAN | MY_IND | 2 | | 3 (0)|
------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(("N_1000"<=3000 OR "N_1000" IS NULL))
3 - access((("ID"=112 OR "ID"=120)) AND "N_5000"=120)
--> spot the absence of filter on index
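The autotrace figures above were taken with 65 and 33 recursive calls respectively, so part of the consistent gets may come from parsing rather than from the index scan itself. One way to isolate the work done by each plan operation is sketched below. It assumes a SQL*Plus session and uses the gather_plan_statistics hint together with DBMS_XPLAN.DISPLAY_CURSOR. Run it once with each index visible.

```sql
-- serveroutput must be off, otherwise display_cursor(null, null, ...)
-- reports on the DBMS_OUTPUT call instead of the query just executed.
set serveroutput off

select /*+ gather_plan_statistics */
       a.*
from   t1 a
where  id in (112,120)
and    (n_1000 is null
        or n_1000 <= 3000)
and    n_5000 in (120);

-- ALLSTATS LAST adds actual rows (A-Rows) and buffer gets (Buffers)
-- per operation for the last execution of the cursor.
select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
```

Comparing the Buffers column on the INDEX RANGE SCAN line of the two plans gives a figure that is not affected by recursive SQL.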