Quantcast
Channel: Mohamed Houri’s Oracle Notes
Viewing all 224 articles
Browse latest View live

Indexing Foreign Key: the proof of the FK index use

$
0
0

If you have been looking for a material proof showing Oracle using the foreign key index created on the child table while deleting from a parent table then here it is

drop table t2;
drop table t1;

create table t1
(col1 number primary key);

create table t2
(col1    number primary key
,status  varchar2(12) not null
,col2    number
,col2v   number generated always as (case when status = 'ACTIVE' then col2 end) VIRTUAL
,constraint t2_fk foreign key (col2v) references t1(col1)
,constraint t2_ck check (status in ('ACTIVE','INACTIVE') and (status = 'INACTIVE' or col2 is not null))
);

create index t2_ind_fk on t2(col2v);

insert into t1
 select rownum
from dual
connect by level <=100;

commit;

insert into t2 (col1, status, col2) values (1, 'ACTIVE',50);

alter session set skip_unusable_indexes = false;

alter index t2_ind_fk unusable; -- implicit commit

I have created a pair of parent-child table (t1 and t2), an index on the foreign key on the t2 child table, set this index into an unusable state and changed the default skip_unusable_indexes parameter to false so that unusable indexes will not be skipped.

Now, I am going in the next PL/SQL anonymous block, to simulate a delete from a parent table using an autonomous transaction in order to mimic a different session (in fact a different transaction within the same session)

declare
 pragma autonomous_transaction;
begin
 delete from t1 –- deleting from the parent table
 where col1 = 99;
 commit;
end;
/

declare
*
ERROR at line 1:
ORA-01502: index 'XXX.T2_IND_FK' or partition of such index is in unusable state
ORA-06512: at line 4

See how deleting from the parent table (t1) triggered an error on the index of the foreign key constraint created on the child table (t2). This is a simple way to show the mechanism used by Oracle in order to avoid a child table lock (before eventually a deadlock situation) simply by using the index on the foreign key.



Window SORT and Window BUFFER

$
0
0

I was thinking that I will see a WINDOW SORT operation in an execution plan for every over ( ) clause statement that differs in the partition and the order by options. For example

SQL> select
        id
       ,n_5000
       ,lead(id) over (partition by n_5000 order by id)
     from t1
     where n_5000 = 1778;

---------------------------------------------------------------------
| Id  | Operation             | Name   | Rows  | Bytes | Cost (%CPU)|
---------------------------------------------------------------------
|   0 | SELECT STATEMENT      |        |       |       |    94 (100)|
|   1 |  WINDOW SORT          |        |    22 |   572 |    94   (6)|
|*  2 |   INDEX FAST FULL SCAN| MY_IND |    22 |   572 |    93   (5)|
---------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("N_5000"=1778)

But it suffices to have (1) an index that starts by the order by column (id) or by the partition by column(n_5000) and (2) add a predicate on the order by column or on the partition by column to that original query and the WINDOW SORT will be transformed into a less costly WINDOW BUFFER.

SQL> select
        id
       ,n_5000
       ,lead(id) over (partition by n_5000 order by id)
    from t1
    where id = 1778;

----------------------------------------------------------------------------
| Id  | Operation         | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |        |       |       |     3 (100)|          |
|   1 |  WINDOW BUFFER    |        |     1 |    26 |     3  (34)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN| MY_IND |     1 |    26 |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("ID"=1778)

That’s pretty straightforward. The index MY_IND (id, n_5000, n_10000) has been used to avoid the WINDOW SORT analytical operation. This transforms my initial thinking to: “I will see a WINDOW SORT operation in an execution plan for every over () clause statement that differs in the partition by and the order by options unless the CBO finds a suitable index that permits bypassing a SORT operation”

But does this mean that I will not see a parent WINDOW BUFFER operation without a child index scan operation?

SQL> select
       id
      ,n_5000
      ,padding
      ,sum(id) over (partition by n_5000)
     from t1
     where n_5000 = 444;

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |   478 (100)|          |
|   1 |  WINDOW BUFFER     |      |    20 |  2200 |   478   (2)| 00:00:02 |
|*  2 |   TABLE ACCESS FULL| T1   |    20 |  2200 |   477   (2)| 00:00:02 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("N_5000"=444)

I get rid of the order by option in the over() operation and added a predicate on the partition by column.

And maybe I can add this:

I will NOT see a WINDOW SORT operation in an execution plan for every over () clause statement that contains ONLY a partition by option and where the container query includes the partition by column in the predicate part”.

Bottom Line: from now and on when I see an over () clause statement I will be paying more attention to the partition by clause to see if its related column is in the predicate part or not. It might explain why I have a WINDOW SORT instead of a WINDOW BUFFER

FootNote:  I have to warn that the above conclusions, despite they might be correct, they nevertheless remains to be sourced from a one day experiment. As such you should consider them with a careful attention before taken them as definitely demonstrated. It is not because they have been published that they are correct.


Redundant Indexes

$
0
0

I am very often warning:  dropping redundant indexes in production is not 100% safe. I have instead always been advocating paying a careful attention during design time to avoid creating redundant indexes. In my professional experience I have realized that, it is very often when creating indexes to cover the lock threat of unindexed foreign key constraints, that developers are creating unintentionally redundant indexes. It has been irritating me so that I have created a script which checks if a given foreign key is already indexed or not before creating supplementary non wanted indexes damaging the DML part of the application.

Having said that I still have not defined what is redundant indexes

Indexes ind_1 and ind_2 are said redundant when leading columns of one index are a superset of the leading columns of the other one

For example

Ind_1 (a,b) and ind_2(a,b,c) are redundant because ind_2 contains index ind_1.

If you are at design time it is obvious that you should not create index ind_1. However, once in production, it is not 100% safe to drop index ind_1 without any impact. There are, for sure, occasions were the clustering factor of ind_2 is so dramatic when compared to index ind_1 so that if the later index is dropped the optimizer will opt for a full table scan traumatizing the queries that were perfectly happy with the dropped redundant index.

I can also show you another type of what people might consider redundant indexes while they aren’t. Consider the following model where I have created a range partitioned table (mho_date being the partition key and I have created 1493 partitions) and two indexes as shown below

desc partitioned_tab

Name                            Null?    Type
------------------------------- -------- ------------
1      MHO_ID                          NOT NULL NUMBER(10)
2      MHO_DATE                        NOT NULL DATE
3      MHO_CODE                        NOT NULL VARCHAR2(1)
4      MHO_TYP_ID                      NOT NULL NUMBER(10)

create index local_ind_1 on partitioned_tab (mho_typ_id,mho_code) local;

create index global_ind_1 on partitioned_tab (mho_typ_id);

I am going to execute a simple query against the above engineered partitioned table.

select * from partitioned_tab where mho_typ_id = 0;

Which, in the presence of the above two indexes is honored via the following execution plan

----------------------------------------------------------------------
Plan hash value: 3042058313
----------------------------------------------------------------------------------------------------------
| Id  | Operation                                  | Name            | Rows  |Cost (%CPU)| Pstart| Pstop |
----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                           |                 |  1493 | 1496   (0)|       |       |
|   1 |  TABLE ACCESS BY GLOBAL INDEX ROWID BATCHED| PARTITIONED_TAB |  1493 | 1496   (0)| ROWID | ROWID |
|*  2 |   INDEX RANGE SCAN                         | GLOBAL_IND_1    |  1493 |    4   (0)|       |       |
----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("MHO_TYP_ID"=0)

Statistics
-------------------------------------------------------
48    recursive calls
0     db block gets
3201  consistent gets
0     physical reads
0     redo size
51244 bytes sent via SQL*Net to client
1632  bytes received via SQL*Net from client
101   SQL*Net roundtrips to/from client
7     sorts (memory)
0     sorts (disk)
1493  rows processed

As you might already know I am one of the fans of SQLT tool developed by Carlos Sierra. And here below what this tool said about redundant indexes in this particular case

Redundant_indexes

It is clearly suggesting considering dropping the redundant index global_ind_1 index

So let’s follow this advice and see what happens. Hopefully with the recent Oracle release I will first make the index invisible ( by the way that’s a good point to signal for Carlos Sierra and Mauro Pagano for them to suggest setting first the index invisible before considering dropping it)

alter index global_ind_1 invisible;

select * from partitioned_tab where mho_typ_id = 0;

-------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                  | Name            | Rows  | Bytes | Cost (%CPU)| Pstart| Pstop |
-------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                           |                 |  1493 | 23888 |  2985   (1)|       |       |
|   1 |  PARTITION RANGE ALL                       |                 |  1493 | 23888 |  2985   (1)|     1 |  1493 |
|   2 |   TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| PARTITIONED_TAB |  1493 | 23888 |  2985   (1)|     1 |  1493 |
|*  3 |    INDEX RANGE SCAN                        | LOCAL_IND_1     |  1493 |       |  1492   (0)|     1 |  1493 |
-------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("MHO_TYP_ID"=0)

Statistics
--------------------------------------------------------
48    recursive calls
0     db block gets
4588  consistent gets
0     physical reads
0     redo size
47957 bytes sent via SQL*Net to client
1632  bytes received via SQL*Net from client
101   SQL*Net roundtrips to/from client
7     sorts (memory)
0     sorts (disk)
1493  rows processed

By making the ”redundant” index invisible we went from a smooth one global index range scan with a cost of 4 to 1493 index range scans with a cost of 1492 with an additional 1387 logical reads.

Bottom line: You should always consider dropping (avoiding) redundant index at design time. Once in production consider making them invisible first before thinking carefully about dropping them.

PS : if you want to create the table the script can be found partitioned table


DBMS_SCHEDULER and CALL statement

$
0
0

This is a brief reminder for those using intensively Oracle dbms_scheduler package to schedule a launch of a stored PL/SQL procedure. Recently I was investigating a wide range performance problem via a 60 minutes AWR snapshot until the following action that appears in the TOP SQL ordered by Gets has captured my attention:

SQL ordered by Gets

Buffer Gets Executions Gets per Exec %Total Elapsed Time (s) %CPU %IO SQL Id SQL Module SQL Text
858,389,984 2 429,194,992.00 89.58 3,606.79 97.2 1.9 6cjn7jnzpc160 DBMS_SCHEDULER call xxx_PA_REDPC.P_EXPORT_DA…
542,731,679 32,021 16,949.24 56.64 3,201.40 98.4 .6 4bnh2nc4shkz3 w3wp.exe SELECT WAGD_ID FROM S1

It is not the enormous Logical I/O done by this scheduled stored procedure that has retained my attention but it is the appearance of the call statement in the corresponding SQL Text.

Where does this come from?

Let’s get the DDL of the corresponding program

 SELECT dbms_metadata.get_ddl('PROCOBJ','XXX_PARSE_MESSAGE_PRG', 'SXXX') from dual;

Which gives this

 BEGIN
  DBMS_SCHEDULER.CREATE_PROGRAM
    (program_name         => 'XXX_PARSE_MESSAGE_PRG'
    ,program_type         => 'STORED_PROCEDURE'
    ,program_action       => 'XXX_PA_REDPC.P_EXPORT_DA.P_xxxx'
    ,number_of_arguments  => 0
    ,enabled              => FALSE
    ,comments             => NULL);
 COMMIT;
 END;
 /
 

I have  arranged a little bit the generated DDL script for clarity.

Now things become clear.

When you define your program using

     program_type         => 'STORED_PROCEDURE'
 

Then your job will be executed using the SQL command call

    call XXX_PA_REDPC.P_EXPORT_DA..'
 

This is in contrast to when you define your program using

  program_type         => 'PLSQL_BLOCK'
 

which has the consequence of making your job being executed using an anonymous PL/SQL block

 BEGIN
    XXX_PA_REDPC.P_EXPORT_DA..
 END;
 

And now the question: how would you prefer your scheduled stored procedure to be executed?

  1. via the  SQL call statement
  2. via the anonymous PL/SQL block

Well, after a brief research on My Oracle support I found a bug that seems closely related to it

DBMS_SCHEDULER Is Suppressing NO_DATA_FOUND Exceptions for Jobs that Execute Stored Procedures (Doc ID 1331778.1)

In fact, there is one fundamental threat when opting for the call statement. Consider this

 SQL> create or replace procedure p1 as
    begin
     insert into t values (1,2);
      raise no_data_found;
      commit;
    end;
 /
Procedure created.

SQL> select count(1) from t;   

   COUNT(1)
   ----------        
    0

I am going to call this procedure using the call statement which normally will raise a no_data_found exception and the inserted data into table t will be rolled back

SQL> call p1();

Call completed.

SQL> select * from t;

        N1         N2
---------- ----------
         1          2

Despite the raise of the no_data_found exception inserted data has been committed and the exception ignored. This will not have happened if I have managed to execute the stored procedure using an anonymous PL/SQL block as shown below:

SQL> truncate table t;

Table truncated.

SQL> begin
p1();
end;
/
begin
*
ERROR at line 1:
ORA-01403: no data found
ORA-06512: at "XXX.P1", line 4
ORA-06512: at line 2

SQL> select count(1) from t;

COUNT(1)
----------
0

So, please be informed :-)


12c display_awr

$
0
0

It is well known that display_awr function is unable to show the predicate part of a captured execution plan. Is this still the case with the 12c advent?

 select * from v$version;

BANNER                                                                           CON_ID
 -------------------------------------------------------------------------------- ----------
 Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production     0
 PL/SQL Release 12.1.0.1.0 - Production                                           0
 CORE 12.1.0.1.0 Production                                                       0
 TNS for 64-bit Windows: Version 12.1.0.1.0 - Production                          0
 NLSRTL Version 12.1.0.1.0 - Production                                           0

 create table t2
  (col1 number
  ,col2 varchar2(50)
  ,flag varchar2(2));

 var n varchar2(2);
 exec :n := 'Y1';

 select count(*), max(col2) from t2 where flag = :n;

 COUNT(*) MAX(COL2)
 --------- ------------------------------------------
 0

 select * from table(dbms_xplan.display_cursor);

 SQL_ID 47n9zu0w7ht8d, child number 0
 -------------------------------------
 select count(*), max(col2) from t2 where flag = :n

 Plan hash value: 3321871023
 ---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |     2 (100)|          |
|   1 |  SORT AGGREGATE    |      |     1 |    30 |            |          |
|*  2 |   TABLE ACCESS FULL| T2   |     1 |    30 |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------

 Predicate Information (identified by operation id):
 ---------------------------------------------------
  2 - filter("FLAG"=:N)

 Note
 -----
  - dynamic statistics used: dynamic sampling (level=2)
 

Let’s “color” the above sql query so that it will be captured by the next AWR snapshot and check the captured execution plan to see if it reports the predicate part or not

exec dbms_workload_repository.add_colored_sql('47n9zu0w7ht8d');

exec dbms_workload_repository.create_snapshot;

select * from table(dbms_xplan.display_awr('47n9zu0w7ht8d'));

SQL_ID 47n9zu0w7ht8d
--------------------
select count(*), max(col2) from t2 where flag = :n

Plan hash value: 3321871023
---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |     2 (100)|          |
|   1 |  SORT AGGREGATE    |      |     1 |    30 |            |          |
|   2 |   TABLE ACCESS FULL| T2   |     1 |    30 |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------
 Note
 -----
  - dynamic statistics used: dynamic sampling (level=2)
 

The predicate part is still not shown in the execution plan taken from AWR. Damn!!


Why my query is not using my unique index

$
0
0

Here’s a point that I have been asked to explain and that I failed to answer. It is a simple select on a table with an equality predicate on two columns. These two columns are indexed using a two columns unique index. So it seems quite obvious that when selecting from this table using these two columns to (a) select a unique row and (b) use that unique index for that purpose. However, the CBO seems making an illogic decision per regards to the choice of the index access when there is no statistics on the table (num_rows and last_analyzed are null) while there is 0 statistics in the indexes. Things will become clear with a concrete example.

As very often, I will use one of the Jonathan Lewis table scripts creation and population.

 create table t1
   (id number,
    n_1000 number,
    n_5000 number,
    n_10000 number,
    small_vc varchar2(20),
    padding  varchar2(100)
   );

create unique index a_unq_ind on t1(id, n_1000);
create index b_non_unq_ind on t1(n_1000, n_5000);

Spot how I have managed to name my indexes(unique and non-unique) in an alphabetic order so that when this specific order matters the CBO will choose the unique index as far as it starts with the ‘a’ letter.

Now that I have created my table and my two indexes I can populate the table with data without transmitting statistics to the indexes.

 insert into t1
  with generator as (
   select   --+ materialize
    rownum id
   from dual
  connect by
  rownum <= 10000
)
select
    rownum                    id,
    mod(rownum,1000) n_1000,
    mod(rownum,5000) n_5000,
    mod(rownum,10000) n_10000,
    lpad(rownum,10,'0')       small_vc,
    rpad('x',100)             padding
from
    generator        v1,
    generator        v2
where
rownum <= 100000
;

commit;

Before starting my select, let me show you the actual table and index statistics


SQL> select table_name, num_rows, last_analyzed from user_tables where table_name = 'T1';

TABLE_NAME   NUM_ROWS LAST_ANALYZED
---------- ---------- -----------------
T1

SQL> select index_name, num_rows, clustering_factor from user_indexes where table_name = 'T1';

INDEX_NAME        NUM_ROWS CLUSTERING_FACTOR
--------------- ---------- -----------------
B_NON_UNQ_IND            0                 0
A_UNQ_IND                0                 0

And now my desired select and its corresponding execution plan (statistics_level have been set to all before)

SQL> select * from t1
     where id = 1
     and   n_1000 = 1;

ID     N_1000     N_5000    N_10000 SMALL_VC             PADDING
---------- ---------- ---------- ---------- -------------------- -----------
1          1          1          1 0000000001           x

SQL_ID  29tnq7b69swdr, child number 0
-------------------------------------
select * from t1  where id = 1  and   n_1000 = 1

Plan hash value: 3790258116

---------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name          | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
--------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |               |      1 |        |      1 |00:00:00.01 |     103 |
|*  1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T1            |      1 |     15 |      1 |00:00:00.01 |     103 |
|*  2 |   INDEX RANGE SCAN                  | B_NON_UNQ_IND |      1 |    615 |    100 |00:00:00.01 |       3 |
---------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("ID"=1)
2 - access("N_1000"=1)

It is clear by comparing the E-Rows and A-Rows that the statistics are not reflecting the reality.

And when I force the use of the unique index the CBO seems doing good estimations and generation less buffer gets

SQL> select /*+ index( t1 A_UNQ_IND ) */ * from t1  where id = 1  and   n_1000= 1;

ID     N_1000     N_5000    N_10000 SMALL_VC             PADDING
---------- ---------- ---------- ---------- --------------------
1          1          1          1 0000000001           x

SQL_ID  bt11jwur90xg0, child number 0
-------------------------------------
select /*+ index( t1 A_UNQ_IND ) */ * from t1  where id = 1  and n_1000= 1;

Plan hash value: 3585360496
---------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |      1 |        |      1 |00:00:00.01 |       3 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1        |      1 |      1 |      1 |00:00:00.01 |       3 |
|*  2 |   INDEX UNIQUE SCAN         | A_UNQ_IND |      1 |      1 |      1 |00:00:00.01 |       2 |
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("ID"=1 AND "N_1000"=1)

While this unique index execution plan seems optimal with correct estimations and less Buffers gets it is nevertheless not the index that the CBO desires to use.

Anyway, you are going to object and to say we are not going to run a real life application without having correct statistics gathered. Believe me that if I am writing this case here it is because it has happened in a real life production application. Having said that let’s collect statistics and see the new reaction of the CBO

SQL> exec dbms_stats.gather_table_stats(user ,'t1');

PL/SQL procedure successfully completed.

SQL> select * from t1
where id = 1
and   n_1000 = 1;

ID     N_1000     N_5000    N_10000 SMALL_VC             PADDING
---------------------------------------------------------------------------------------
1          1          1          1 0000000001           x

SQL_ID  29tnq7b69swdr, child number 1
-------------------------------------
select * from t1  where id = 1  and   n_1000 = 1

Plan hash value: 3585360496

---------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |      1 |        |      1 |00:00:00.01 |       3 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1        |      1 |      1 |      1 |00:00:00.01 |       3 |
|*  2 |   INDEX UNIQUE SCAN         | A_UNQ_IND |      1 |      1 |      1 |00:00:00.01 |       2 |
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("ID"=1 AND "N_1000"=1)

Note
-----
- statistics feedback used for this statement

After having collected statistics the CBO is now choosing the right index.

By the way I am doing this test case on a 12c data base (you might have already guessed that through the presence of the word BATCHED in the execution plan) and seeing this statistics feedback used note, pushed me to flush the shared pool and re-run my query in order to be sure that the use of the unique index is not due to this statistics feedback feature

SQL> alter system flush shared_pool;

System altered.

SQL> select * from t1
where id = 1
and   n_1000 = 1;

ID     N_1000     N_5000    N_10000 SMALL_VC             PADDING
--------- ---------- ---------- ---------- -------------------- ---------
1          1          1          1 0000000001           x

SQL_ID  29tnq7b69swdr, child number 0
-------------------------------------
select * from t1  where id = 1  and   n_1000 = 1

Plan hash value: 3585360496
---------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |      1 |        |      1 |00:00:00.01 |       3 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1        |      1 |      1 |      1 |00:00:00.01 |       3 |
|*  2 |   INDEX UNIQUE SCAN         | A_UNQ_IND |      1 |      1 |      1 |00:00:00.01 |       2 |
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("ID"=1 AND "N_1000"=1)

That is the whole story. If you want to learn more about this particular situation then you have to read this related otn thread.

Bottom line: always be sure to have adequate statistics so that you will give the CBO all chances to produce an adequate plan

PS : I started by trying to explain why the CBO didn’t choose the unique index and ended up while writing this article to two questions

1 : why dynamic sampling didn’t occur : Martin Press gave a clue that needs only to be tested

2 : when the CBO uses the index range scan the table is accessed via this new 12c  TABLE ACCESS BY INDEX ROWID BATCHED  that has been explained by Timur Akhmadeed but when it uses the unique index scan the CBO opted for the classical TABLE ACCESS BY INDEX ROWID


Index design

$
0
0

Here it is a very brief discussion I have had with one of my colleagues about index design

Colleague: what kind of index would you suggest to cover the following query?

SELECT
     rowid
    ,a.*
FROM  message_out a
WHERE sms_status    in (700, 707)
AND  (scheduled_time is null
      OR scheduled_time   <= :1)
AND   provider_id     in (0,0)
ORDER BY
      priority_level desc,
      creation_time asc;

----------------------------------------------------------------------------
| Id | Operation         | Name        | Rows | Bytes |TempSpc| Cost (%CPU)|
----------------------------------------------------------------------------
| 0  | SELECT STATEMENT  |             | 5529 | 1376K |       | 4769 (1)   |
| 1  | SORT ORDER BY     |             | 5529 | 1376K | 1856K | 4769 (1)   |
|* 2 |  TABLE ACCESS FULL| MESSAGE_OUT | 5529 | 1376K |       | 4462 (1)   |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(("SMS_STATUS"=700 OR "SMS_STATUS"=707)
AND ("SCHEDULED_TIME" IS NULL OR "SCHEDULED_TIME"<=:1)
AND "PROVIDER_ID"=0)

Me: and what have you ended up with until now?

Colleague: here my suggested index and the related execution plan

CREATE INDEX virtual_index ON MESSAGE_OUT(sms_status,scheduled_time,provider_id) ;

---------------------------------------------------------------------------------------
| Id | Operation                   | Name          | Rows | Bytes |TempSpc| Cost (%CPU)|
---------------------------------------------------------------------------------------
| 0  | SELECT STATEMENT            |               | 5529 | 1376K |       | 446 (1)    |
| 1  | SORT ORDER BY               |               | 5529 | 1376K | 1856K | 446 (1)    |
| 2  | INLIST ITERATOR             |               |      |       |       |            |
|* 3 |  TABLE ACCESS BY INDEX ROWID| MESSAGE_OUT   | 5529 | 1376K |       | 140 (0)    |
|* 4 |   INDEX RANGE SCAN          | VIRTUAL_INDEX | 5529 |       |       | 6 (0)      |
---------------------------------------------------------------------------------------

Me: I would not have created the same index

Me: here it is the index I would have created (after several questions regarding the data distribution, the table data volume, the use of bind variables, etc…)

create index mho_ind on MESSAGE_OUT (status, provider_id, scheduled_time);

Me: and if sms_status contains repetitive values then I would have added a compress command to that index creation

Colleague: there is no difference in the execution plan either by using my index or your index

------------------------------------------------------------------------------------------
| Id | Operation                     | Name          | Rows | Bytes |TempSpc| Cost (%CPU)|
------------------------------------------------------------------------------------------
| 0  | SELECT STATEMENT              |               | 5529 | 1376K |       | 446 (1)    |
| 1  | SORT ORDER BY                 |               | 5529 | 1376K | 1856K | 446 (1)    |
| 2  |  INLIST ITERATOR              |               |      |       |       |            |
|* 3 |   TABLE ACCESS BY INDEX ROWID | MESSAGE_OUT   | 5529 | 1376K |       | 140 (0)    |
|* 4 |    INDEX RANGE SCAN           | VIRTUAL_INDEX | 5529 |       |       | 6 (0)      |
------------------------------------------------------------------------------------------

Me: no, it is not the same plan. Please always consider the predicate part

Me: what is the predicate part of the plan using your index

Colleague: this is my index predicate part


Predicate Information (identified by operation id):
 ---------------------------------------------------
 3 - filter("SCHEDULED_TIME" IS NULL OR "SCHEDULED_TIME"<=:1)
 4 - access(("SMS_STATUS"=700 OR "SMS_STATUS"=707) AND "PROVIDER_ID"=0)
     filter("PROVIDER_ID"=0) --> additional filter operation

Colleague: and this is your index predicate part

Predicate Information (identified by operation id):
---------------------------------------------------
 3 - filter("SCHEDULED_TIME" IS NULL OR "SCHEDULED_TIME"<=:1)
 4 - access(("SMS_STATUS"=700 OR "SMS_STATUS"=707) AND "PROVIDER_ID"=0)
--> no additional filter operation

Me: and did you pointed out the difference or not yet?

Colleague: no, same plan, same cost and same execution time

Me: there is a fundamental difference between your plan and mine. In your plan there is a double operation on your engineered index:  “ACCESS + FILTER” operation while my engineered index needs only one precise operation :  “ACCESS

Me: and when it comes to performance you always prefers a precise index ACCESS operation to that double ACCESS and FILTER operations.

Me: your engineered index has a second columns on which an inequality predicate is applied

SCHEDULED_TIME <= :1

You should always start your index by the columns on which an equality predicate is applied. In my case, I put the SCHEDULED_TIME column at the trailing edge of my index and doing as such I have avoided a costly filter operation on my index while your engineered index has been subject to that costly filter operation

If you want to test this behaviour then below is an example to play with. I hope you will enjoy it

SQL> create table t1
     (id number,
     n_1000 number,
     n_5000 number,
     n_10000 number,
     small_vc varchar2(20),
     padding varchar2(100)
     );

Table created.

SQL> insert into t1
  with generator as (
  select --+ materialize
  rownum id
  from dual
  connect by
  rownum <= 10000
  )
  select
    rownum id,
    mod(rownum,1000) n_1000,
    mod(rownum,5000) n_5000,
    mod(rownum,10000) n_10000,
    lpad(rownum,10,'0') small_vc,
    rpad('x',100) padding
  from
    generator v1,
    generator v2
  where
  rownum <= 100000
  ;

SQL> create index my_ind on t1(id, n_5000, n_1000);

Index created.

SQL> create index colleague_ind on t1(id, n_1000, n_5000);

Index created.

SQL> alter index my_ind invisible;

Index altered.

SQL> exec dbms_stats.gather_table_stats(user, 't1');

PL/SQL procedure successfully completed.

SQL> select
        a.*
  from t1 a
  where id in (112,120)
  and (n_1000 is null
  or n_1000 <= 3000)
  and n_5000 in (120);

Statistics
------------------------------------------------------
 65 recursive calls
 0 db block gets
 95 consistent gets ---> spot this
 0 physical reads
 0 redo size
 1005 bytes sent via SQL*Net to client
 543 bytes received via SQL*Net from client
 2 SQL*Net roundtrips to/from client
 6 sorts (memory)
 0 sorts (disk)
 1 rows processed

SQL_ID 7d6ag1m1ztpgr, child number 1
-------------------------------------
select a.* from t1 a where id in (112,120) and (n_1000 is null
or n_1000 <= 3000) and n_5000 in (120)

Plan hash value: 3644584748
-------------------------------------------------------------------------------------------
| Id  | Operation                            | Name          | Rows  | Bytes | Cost (%CPU)|
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |               |       |       |     4 (100)|
|   1 |  INLIST ITERATOR                     |               |       |       |            |
|*  2 |   TABLE ACCESS BY INDEX ROWID BATCHED| T1            |     2 |   258 |     4   (0)|
|*  3 |    INDEX RANGE SCAN                  | COLLEAGUE_IND |     2 |       |     3   (0)|
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
 2 - filter(("N_1000"<=3000 OR "N_1000" IS NULL))
 3 - access((("ID"=112 OR "ID"=120)) AND "N_5000"=120)
 filter("N_5000"=120) ---> spot this

SQL> alter index colleague_ind invisible;

Index altered.

SQL> alter index my_ind visible;

Index altered.

SQL> select
       a.*
  from t1 a
  where id in (112,120)
  and (n_1000 is null
  or n_1000 <= 3000)
  and n_5000 in (120);

Statistics
------------------------------------------------------
 33 recursive calls
 0 db block gets
 49 consistent gets --> spot the reduction
 0 physical reads
 0 redo size
 1005 bytes sent via SQL*Net to client
 543 bytes received via SQL*Net from client
 2 SQL*Net roundtrips to/from client
 6 sorts (memory)
 0 sorts (disk)
 1 rows processed

SQL_ID 7d6ag1m1ztpgr, child number 1
-------------------------------------
select a.* from t1 a where id in (112,120) and (n_1000 is null
or n_1000 <= 3000) and n_5000 in (120)

Plan hash value: 4286547933</pre>
------------------------------------------------------------------------------------
| Id  | Operation                            | Name   | Rows  | Bytes | Cost (%CPU)|
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |        |       |       |     4 (100)|
|   1 |  INLIST ITERATOR                     |        |       |       |            |
|*  2 |   TABLE ACCESS BY INDEX ROWID BATCHED| T1     |     2 |   258 |     4   (0)|
|*  3 |    INDEX RANGE SCAN                  | MY_IND |     2 |       |     3   (0)|
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
 2 - filter(("N_1000"<=3000 OR "N_1000" IS NULL))
 3 - access((("ID"=112 OR "ID"=120)) AND "N_5000"=120)
     --> spot the absence of filter on index

NLS_SORT and SPM reproducibility

$
0
0

I wrote a blog article which aims mainly to show what optimizer parameter the CBO will use to reproduce aSPM baseline plan: the current environment parameters or the parameters used during the SPM plan baseline capture? Having tested in this blog article only the influence of the optimizer_mode (all_rows versus first_rows mode) I ended up with a conclusion that the CBO will use the optimizer_mode parameter stored during the SPM plan capture. I have also put a careful warning that this is not a conclusion one can spread to other CBO parameters without testing; particularly that I have already seen an otn thread dealing with the inability to reproduce a SPM plan because of a change in the _optim_peek_user_binds  hidden parameter. Then, a post on oracle list about the non reproducibility of a SPM baseline following an upgrade from 10gR2 to 11gR2 prompted me to investigate the influence the NLS_SORT parameter can have on the reproducibility of a SPM baseline plan. Below are my investigations and findings:

First, the model

CREATE TABLE t
(c1 VARCHAR2(64), c2 CHAR(15), d1 DATE);

INSERT INTO t
SELECT
    mod(ABS(dbms_random.random),3)+ 1||chr(ascii('Y')) ,
    dbms_random.string('L',dbms_random.value(1,5))||rownum ,
    to_date(TO_CHAR(to_date('01/01/1980','dd/mm/yyyy'),'J') + TRUNC(dbms_random.value(1,11280)),'J')
FROM dual
CONNECT BY level <= 2e6;

ALTER TABLE t ADD CONSTRAINT t_pk PRIMARY KEY (c1,c2) USING INDEX;

EXEC dbms_stats.gather_table_stats (USER, 't', CASCADE => true, method_opt => 'FOR ALL COLUMNS SIZE 1');

Second, the preliminaries

SQL> select * from dba_sql_plan_baselines;

no rows selected  --  no baseline yet

SQL> show parameter nls_sort

NAME                                 TYPE        VALUE
------------------------------------ ----------- --------------
nls_sort                             string      BINARY

SQL> alter session set optimizer_capture_sql_plan_baselines = TRUE;

SQL> SELECT  c1
FROM t
GROUP BY c1
ORDER BY c1 ASC NULLS LAST;

C1
------
1Y
2Y
3Y

SQL> /

C1
------
1Y
2Y
3Y

SQL> alter session set optimizer_capture_sql_plan_baselines = FALSE;

SQL> select plan_name from dba_sql_plan_baselines;

PLAN_NAME
------------------------------
SQL_PLAN_90sg67694zwyj641607ca  -- one SPM plan baseline

So far I have engineered a model against which I executed a query returning data in a certain order under the classical Binary NLS_SORT parameter. I added to that situation a SPM baseline plan so that any “resembling” query will use that SPM plan. That is plan stability. This SPM plan looks like:

SQL> select * from table(dbms_xplan.display_sql_plan_baseline(plan_name => 'SQL_PLAN_90sg67694zwyj641607ca', format => 'ADVANCED'));

--------------------------------------------------------------------------------
SQL handle: SQL_9061e639924ff3d1
SQL text: SELECT  c1     FROM t     GROUP BY c1    ORDER BY c1 ASC NULLS LAST
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Plan name: SQL_PLAN_90sg67694zwyj641607ca         Plan id: 1679165386
Enabled: YES     Fixed: NO      Accepted: YES     Origin: AUTO-CAPTURE
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Outline Data from SMB:
/*+
BEGIN_OUTLINE_DATA
INDEX(@"SEL$1" "T"@"SEL$1" ("T"."C1" "T"."C2"))
OUTLINE_LEAF(@"SEL$1")
ALL_ROWS
DB_VERSION('11.2.0.3')
OPTIMIZER_FEATURES_ENABLE('11.2.0.3')
IGNORE_OPTIM_EMBEDDED_HINTS
END_OUTLINE_DATA
*/
--------------------------------------------------------------------------------
Plan hash value: 2111031280
-----------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |     3 |     9 |  2069   (5)| 00:00:06 |
|   1 |  SORT GROUP BY NOSORT|      |     3 |     9 |  2069   (5)| 00:00:06 |
|   2 |   INDEX FULL SCAN    | T_PK |  2000K|  5859K|  2069   (5)| 00:00:06 |
-----------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
1 - SEL$1
2 - SEL$1 / T@SEL$1

Outline Data
------------
/*+
BEGIN_OUTLINE_DATA
INDEX(@"SEL$1" "T"@"SEL$1" ("T"."C1" "T"."C2"))
OUTLINE_LEAF(@"SEL$1")
ALL_ROWS
DB_VERSION('11.2.0.3')
OPTIMIZER_FEATURES_ENABLE('11.2.0.3')
IGNORE_OPTIM_EMBEDDED_HINTS
END_OUTLINE_DATA
*/

Column Projection Information (identified by operation id):
-----------------------------------------------------------
1 - (#keys=1) "C1"[VARCHAR2,256]
2 - "C1"[VARCHAR2,256]

An index full scan with which the CBO has avoided the order by sort operation.

Now, for my “tranquility”,  I will execute my query and check if the SPM plan is used or not.

SQL> SELECT  c1
     FROM t
     GROUP BY c1
     ORDER BY c1 ASC NULLS LAST;

C1
----
1Y
2Y
3Y

SQL_ID  28dazsm20sbw6, child number 2
-------------------------------------
SELECT  c1     FROM t     GROUP BY c1    ORDER BY c1 ASC NULLS LAST
Plan hash value: 2111031280
-----------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |       |       |  2069 (100)|          |
|   1 |  SORT GROUP BY NOSORT|      |     3 |     9 |  2069   (5)| 00:00:06 |
|   2 |   INDEX FULL SCAN    | T_PK |  2000K|  5859K|  2069   (5)| 00:00:06 |
-----------------------------------------------------------------------------

Note
-----
- SQL plan baseline SQL_PLAN_90sg67694zwyj641607ca used for this statement

Great it is used.

Third, the issue presentation and discussion

What happens if, in my current environment, I change the NLS_SORT parameter?

SQL> alter session set nls_sort=french; -- I altered my current environment when compared to the SPM capture time environment

SQL> SELECT  c1
FROM t
GROUP BY c1
ORDER BY c1 ASC NULLS LAST;

C1
----
1Y
2Y
3Y

SQL_ID  28dazsm20sbw6, child number 2
-------------------------------------
SELECT  c1     FROM t     GROUP BY c1    ORDER BY c1 ASC NULLS LAST

Plan hash value: 1760210272
------------------------------------------------------------------------------
| Id  | Operation             | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |       |       |  2451 (100)|          |
|   1 |  SORT ORDER BY        |      |     3 |     9 |  2451  (20)| 00:00:07 |
|   2 |   SORT GROUP BY NOSORT|      |     3 |     9 |  2451  (20)| 00:00:07 |
|   3 |    INDEX FULL SCAN    | T_PK |  2000K|  5859K|  2069   (5)| 00:00:06 |
------------------------------------------------------------------------------

See how my query is not using the SPM baseline plan anymore. It is using a new plan where the sort order by operation has not been eliminated by the CBO. If my query is not using the SPM plan this is because the CBO was not able to reproduce the stored Baseline plan because of nls_sort parameter change. If this means something it then means that when trying to reproduce the SPM plan the CBO uses the current nls_sort parameter.

The new CBO plan have been added to the SPM baseline for an eventual evolution

SQL> select plan_name from dba_sql_plan_baselines;

PLAN_NAME
------------------------------
SQL_PLAN_90sg67694zwyj297df088
SQL_PLAN_90sg67694zwyj641607ca

If I alter again my current nls_sort parameter so that it will match the one stored against the baseline plan, then my query will be back to its initial use of the SPM plan

 SQL> alter session set nls_sort=binary;

 SQL> SELECT  c1
      FROM t
      GROUP BY c1
      ORDER BY c1 ASC NULLS LAST;

SQL_ID  5hrfv0352fzdr, child number 0
-------------------------------------
SELECT  c1 FROM t GROUP BY c1     ORDER BY c1 ASC NULLS LAST

Plan hash value: 2111031280
-----------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |       |       |  2069 (100)|          |
|   1 |  SORT GROUP BY NOSORT|      |     3 |     9 |  2069   (5)| 00:00:06 |
|   2 |   INDEX FULL SCAN    | T_PK |  2000K|  5859K|  2069   (5)| 00:00:06 |
-----------------------------------------------------------------------------

Note
-----
- SQL plan baseline SQL_PLAN_90sg67694zwyj641607ca used for this statement

Bottom line: in contrast to the optimizer_mode parameter when it comes to NLS_SORT (and NLS_LANG) parameter, the CBO seems to use the current environment NLS_SORT value (and not the one that existed at the baseline time capture) to reproduce the stored SPM plan baseline.



parallel insert

$
0
0

Do you know that a direct path insert can be silently ignored when the inserted table has an insert trigger?

SQL> create table t1 as select rownum n1 from dual connect by level <=1e5;

SQL> create table t2 as select * from t1 where 0 = 1;

SQL> create or replace trigger t2_trg
      before insert on t2
      for each row
    begin
       null;
    end;
   /

SQL> insert /*+ append */ into t2 select * from t1;

100000 rows created.

SQL> select count(1) from t2;

COUNT(1)
---------
100000

SQL> alter trigger t2_trg disable;

SQL> insert /*+ append */ into t2 select * from t1;

100000 rows created.

SQL> select count(1) from t2;

select count(1) from t2
*
ERROR at line 1:
ORA-12838: cannot read/modify an object after modifying it in parallel

It is only when the trigger has been disabled that the insert has been done via a direct path load.

But are you aware that a parallel insert can also be silently ignored in presence of an insert trigger?

SQL> alter session enable parallel dml;

SQL> insert /*+ parallel(t2) */ into t2 select * from t1;

100000 rows created.

SQL_ID  5npb49pus3wtr, child number 0
-------------------------------------
insert /*+ parallel(t2) */ into t2 select * from t1

Plan hash value: 2315600204
-----------------------------------------------------------------------------------------------------------------
| Id  | Operation               | Name     | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
-----------------------------------------------------------------------------------------------------------------
|   0 | INSERT STATEMENT        |          |       |       |    47 (100)|          |        |      |            |
|   1 |  PX COORDINATOR         |          |       |       |            |          |        |      |            |
|   2 |   PX SEND QC (RANDOM)   | :TQ10001 |   100K|  1269K|    47   (7)| 00:00:01 |  Q1,01 | P->S | QC (RAND)  |
|   3 |    LOAD AS SELECT       |          |       |       |            |          |  Q1,01 | PCWP |            |
|   4 |     PX RECEIVE          |          |   100K|  1269K|    47   (7)| 00:00:01 |  Q1,01 | PCWP |            |
|   5 |      PX SEND ROUND-ROBIN| :TQ10000 |   100K|  1269K|    47   (7)| 00:00:01 |        | S->P | RND-ROBIN  |
|   6 |       TABLE ACCESS FULL | T1       |   100K|  1269K|    47   (7)| 00:00:01 |        |      |            |
-----------------------------------------------------------------------------------------------------------------

Note
-----
- dynamic sampling used for this statement (level=2)

SQL> select count(1) from t2;

select count(1) from t2
*
ERROR at line 1:
ORA-12838: cannot read/modify an object after modifying it in parallel

SQL> rollback;

SQL> alter trigger t2_trg enable;

SQL> insert /*+ parallel(t2) */ into t2 select * from t1;

100000 rows created.

SQL_ID  5npb49pus3wtr, child number 0
-------------------------------------
insert /*+ parallel(t2) */ into t2 select * from t1

Plan hash value: 3617692013
---------------------------------------------------------------------------------
| Id  | Operation                | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | INSERT STATEMENT         |      |       |       |    47 (100)|          |
|   1 |  LOAD TABLE CONVENTIONAL |      |       |       |            |          |
|   2 |   TABLE ACCESS FULL      | T1   |   100K|  1269K|    47   (7)| 00:00:01 |
---------------------------------------------------------------------------------

Note
-----
- dynamic sampling used for this statement (level=2)

SQL> select count(1) from t2;

COUNT(1)
----------
300000

When the trigger has been disabled the insert went through a parallel load path in contrast to when the same trigger has been enabled where the insert went through to a classical conventional load.

A classical conventional load have I said?

Yes. If you look closely to the case where the parallel insert has been honored you will realize that not only the insert run in parallel but it has also been done via a direct path mode.

Tell me: does this means that a parallel insert (when honored) is always accomplished via a direct path load?

Yes it is.

So is the append hint in the following instruction redundant?


SQL> insert /*+ append parallel(t2) */ into t2 select * from t1;

No it is not always redundant.

Simply because a parallel run and a direct path load share several common “impeachment” reasons like the presence of insert triggers or foreign integrity constraints((yes I have also tested this case also)  but they also have their proper source of impeachment. While the direct path is not a CBO decision, the parallel run is. So, what do you think will happen when the CBO decides that the parallel insert is not the best path to follow and you have already get rid of the append hint from your insert statement? The direct path, when possible, will not be followed.

If I simulate a parallel run impeachment by disabling the parallel dml and get rid of the append hint from my insert statement then the direct path load will not happen

SQL> alter session disable parallel dml;

SQL> insert /*+ parallel(t2) */ into t2 select * from t1;

100000 rows created.

SQL> select count(1) from t2;

COUNT(1)
----------
100000

However if, in the same situation,  I will add the append hint to my insert statement then the insert will follow a direct path load path as demonstrated below:

SQL> insert /*+ parallel(t2) append */ into t2 select * from t1;

100000 rows created.

SQL> select count(1) from t2;
select count(1) from t2
*
ERROR at line 1:
ORA-12838: cannot read/modify an object after modifying it in parallel

This is the otn thread that prompted this blog post

Bottom line: I learnt yesterday that a parallel insert when followed by the CBO will use a direct path load which is not always the best solution. I learnt also in this case that a parallel insert suffers from the same impeachment raisons as the direct path load does: presence of insert triggers and foreign integrity constraint. And I learnt that to secure your direct path load add the append hint  to your insert statement even when this statement contains a parallel hint


SPM reproducibility: changing the _optimizer_skip_scan_enabled value

$
0
0

I have had several articles about Sql Plan Management stability and evolution. I have also been recently focusing my attention particularly on the criteria, scenarios and circumstances that impeach an enabled and accepted SPM plan to be reproducible. Below are summarized those impeachment reasons:

  1. Changing the index name
  2. Changing the index type (with particular situations for function based and reverse indexes)
  3. Changing the index leading column(s)

Then I embarked on the investigation of the reaction a SPM plan could manifest, during a query execution time, to a change in one of the environment parameters that have been used during the SPM plan capture. I have investigated two situations in this context

  1. One for a change in an optimizer parameter (optimizer mode) value
  2. And the other one for a NLS parameter (nls_sort) change

For the optimizer mode I came up to a conclusion that a change of that mode in the current environment will not have an effect on the reproducibility of the SPM plan as far as this one will be reproduced using the optimizer mode stored against the SPM plan.

However, an nls_sort parameter change reveals to be a serious threat for the SPM reproducibility as far the CBO will use in this particular case the current nls_sort parameter value which might end up with an impossibility to reproduce the stored SPM plan.

Despite the above investigations, there is one thing that I wanted to investigate the effect it has on the reproducibility of a SPM plan: a change of an undocumented optimizer parameter value. This post is for that purpose. Follow me please

 create table t1
    (col1  number
    ,col2  varchar2(50)
    ,flag  varchar2(2));

 insert into t1
    select rownum
          ,lpad('X',50,'X')
          ,case when rownum = 1
            then 'Y1'
               when rownum = 2
            then 'Y2'
               when mod(rownum,2) = 0
            then 'N1'
            else 'N2'
           end
    from   dual
connect by rownum <= 100000;

create index i1 on t1(col1,flag);

exec dbms_stats.gather_table_stats(user ,'t1');

A simple model with an index which, as we will see later, has been engineered so that it will be skip scanned. The next lines of code will store a SPM plan for a given query using default optimizer parameters

SQL> var n varchar2(2);
SQL> exec :n := 'Y1'

SQL> alter session set optimizer_capture_sql_plan_baselines=TRUE;

SQL> select count(1) from t1 where flag = :n;

SQL> /

SQL> alter session set optimizer_capture_sql_plan_baselines=FALSE;

SQL> select count(1) from t1 where flag = :n;

SQL_ID  5k94675mwqz5j, child number 0
-------------------------------------
select count(1) from t1 where flag = :n

Plan hash value: 2775586896
-------------------------------------------------------------------------
| Id  | Operation        | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT |      |       |       |    67 (100)|          |
|   1 |  SORT AGGREGATE  |      |     1 |     3 |            |          |
|*  2 |   INDEX SKIP SCAN| I1   | 25000 | 75000 |    67   (2)| 00:00:01 |
-------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
1 - SEL$1
2 - SEL$1 / T1@SEL$1

Outline Data
-------------
/*+
BEGIN_OUTLINE_DATA
IGNORE_OPTIM_EMBEDDED_HINTS
OPTIMIZER_FEATURES_ENABLE('11.2.0.3')
DB_VERSION('11.2.0.3')
ALL_ROWS
OUTLINE_LEAF(@"SEL$1")
INDEX_SS(@"SEL$1" "T1"@"SEL$1" ("T1"."COL1" "T1"."FLAG"))
END_OUTLINE_DATA
*/

Peeked Binds (identified by position):
--------------------------------------
1 - :N (VARCHAR2(30), CSID=873): 'Y1'

Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("FLAG"=:N)
   filter("FLAG"=:N)

Column Projection Information (identified by operation id):
-----------------------------------------------------------
1 - (#keys=0) COUNT(*)[22]

Note
-----
- SQL plan baseline SQL_PLAN_4njm93y44xuu38bc11ac1 used for this statement

A simple query constrained by a SPM plan using index skip scan access.

Having presented the model  I can now start my real investigations which can be summarize via this simple question : if I alter my current environment so that I will disable the use of index skip scan, will my query still be using the SPM plan? Or in more subtle and clear words: will the SPM plan be reproducible under such a change of an undocumented parameter value?

And the answer is : see below

SQL> alter session set "_optimizer_skip_scan_enabled"=FALSE;

SQL> alter session set optimizer_use_sql_plan_baselines = FALSE;

I have altered the hidden parameter and put on hold for a moment the use of sql plan baseline in order to show you that under this circumstances the CBO will come up with a new plan as shown below:

SQL> select count(1) from t1 where flag = :n;

SQL_ID  5k94675mwqz5j, child number 3
-------------------------------------
select count(1) from t1 where flag = :n

Plan hash value: 129980005
------------------------------------------------------------------------------
| Id  | Operation             | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |       |       |    71 (100)|          |
|   1 |  SORT AGGREGATE       |      |     1 |     3 |            |          |
|*  2 |   INDEX FAST FULL SCAN| I1   | 25000 | 75000 |    71   (6)| 00:00:01 |
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("FLAG"=:N)

There is now a new plan using an INDEX FAST FULL SCAN since I have disabled the use of INDEX SKIP SCAN. It is a clear indication that under this undocumented parameter change the CBO is not able to reproduce the index skip scan plan. Let’s see then what happen to our query.

SQL> alter session set optimizer_use_sql_plan_baselines = TRUE;

SQL> select count(1) from t1 where flag = :n;

---------------------------------------------------
SQL_ID  5k94675mwqz5j, child number 5
-------------------------------------
select count(1) from t1 where flag = :n

Plan hash value: 2775586896
-------------------------------------------------------------------------
| Id  | Operation        | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT |      |       |       |    67 (100)|          |
|   1 |  SORT AGGREGATE  |      |     1 |     3 |            |          |
|*  2 |   INDEX SKIP SCAN| I1   | 25000 | 75000 |    67   (2)| 00:00:01 |
-------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
1 - SEL$1
2 - SEL$1 / T1@SEL$1

Outline Data
-------------
/*+
BEGIN_OUTLINE_DATA
IGNORE_OPTIM_EMBEDDED_HINTS
OPTIMIZER_FEATURES_ENABLE('11.2.0.3')
DB_VERSION('11.2.0.3')
ALL_ROWS
OUTLINE_LEAF(@"SEL$1")
INDEX_SS(@"SEL$1" "T1"@"SEL$1" ("T1"."COL1" "T1"."FLAG"))
END_OUTLINE_DATA
*/

Peeked Binds (identified by position):
--------------------------------------
1 - :N (VARCHAR2(30), CSID=873): 'Y1'

Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("FLAG"=:N)
   filter("FLAG"=:N)

Column Projection Information (identified by operation id):
-----------------------------------------------------------
1 - (#keys=0) COUNT(*)[22]

Note
-----
- SQL plan baseline SQL_PLAN_4njm93y44xuu38bc11ac1 used for this statement

The CBO has been able to reproduce the SPM INDEX SKIP SCAN plan.

Bottom line: if the CBO has been able to reproduce the SPM INDEX SKIP SCAN plan under a disabled _optimizer_skip_scan_enabled parameter in the query execution environment, this means that the CBO will use the value of that hidden parameter stored during the SPM plan capture and not the value this parameter has in the current execution environment.


Indexing Foreign Key: the proof of the FK index use

$
0
0

If you have been looking for a material proof showing Oracle using the foreign key index created on the child table while deleting from a parent table then here it is

drop table t2;
drop table t1;

create table t1
(col1 number primary key);

create table t2
(col1    number primary key
,status  varchar2(12) not null
,col2    number
,col2v   number generated always as (case when status = 'ACTIVE' then col2 end) VIRTUAL
,constraint t2_fk foreign key (col2v) references t1(col1)
,constraint t2_ck check (status in ('ACTIVE','INACTIVE') and (status = 'INACTIVE' or col2 is not null))
);

create index t2_ind_fk on t2(col2v);

insert into t1
 select rownum
from dual
connect by level <=100;

commit;

insert into t2 (col1, status, col2) values (1, 'ACTIVE',50);

alter session set skip_unusable_indexes = false;

alter index t2_ind_fk unusable; -- implicit commit

I have created a pair of parent-child table (t1 and t2), an index on the foreign key on the t2 child table, set this index into an unusable state and changed the default skip_unusable_indexes parameter to false so that unusable indexes will not be skipped.

Now, I am going in the next PL/SQL anonymous block, to simulate a delete from a parent table using an autonomous transaction in order to mimic a different session (in fact a different transaction within the same session)

declare
 pragma autonomous_transaction;
begin
 delete from t1 –- deleting from the parent table
 where col1 = 99;
 commit;
end;
/

declare
*
ERROR at line 1:
ORA-01502: index 'XXX.T2_IND_FK' or partition of such index is in unusable state
ORA-06512: at line 4

See how deleting from the parent table (t1) triggered an error on the index of the foreign key constraint created on the child table (t2). This is a simple way to show the mechanism used by Oracle in order to avoid a child table lock (before eventually a deadlock situation) simply by using the index on the foreign key.


Window SORT and Window BUFFER

$
0
0

I was thinking that I will see a WINDOW SORT operation in an execution plan for every over ( ) clause statement that differs in the partition and the order by options. For example

SQL> select
        id
       ,n_5000
       ,lead(id) over (partition by n_5000 order by id)
     from t1
     where n_5000 = 1778;

---------------------------------------------------------------------
| Id  | Operation             | Name   | Rows  | Bytes | Cost (%CPU)|
---------------------------------------------------------------------
|   0 | SELECT STATEMENT      |        |       |       |    94 (100)|
|   1 |  WINDOW SORT          |        |    22 |   572 |    94   (6)|
|*  2 |   INDEX FAST FULL SCAN| MY_IND |    22 |   572 |    93   (5)|
---------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("N_5000"=1778)

But it suffices to have (1) an index that starts by the order by column (id) or by the partition by column(n_5000) and (2) add a predicate on the order by column or on the partition by column to that original query and the WINDOW SORT will be transformed into a less costly WINDOW BUFFER.

SQL> select
        id
       ,n_5000
       ,lead(id) over (partition by n_5000 order by id)
    from t1
    where id = 1778;

----------------------------------------------------------------------------
| Id  | Operation         | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |        |       |       |     3 (100)|          |
|   1 |  WINDOW BUFFER    |        |     1 |    26 |     3  (34)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN| MY_IND |     1 |    26 |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("ID"=1778)

That’s pretty straightforward. The index MY_IND (id, n_5000, n_10000) has been used to avoid the WINDOW SORT analytical operation. This transforms my initial thinking to: “I will see a WINDOW SORT operation in an execution plan for every over () clause statement that differs in the partition by and the order by options unless the CBO finds a suitable index that permits bypassing a SORT operation”

But does this mean that I will not see a parent WINDOW BUFFER operation without a child index scan operation?

SQL> select
       id
      ,n_5000
      ,padding
      ,sum(id) over (partition by n_5000)
     from t1
     where n_5000 = 444;

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |   478 (100)|          |
|   1 |  WINDOW BUFFER     |      |    20 |  2200 |   478   (2)| 00:00:02 |
|*  2 |   TABLE ACCESS FULL| T1   |    20 |  2200 |   477   (2)| 00:00:02 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("N_5000"=444)

I get rid of the order by option in the over() operation and added a predicate on the partition by column.

And maybe I can add this:

I will NOT see a WINDOW SORT operation in an execution plan for every over () clause statement that contains ONLY a partition by option and where the container query includes the partition by column in the predicate part”.

Bottom Line: from now and on when I see an over () clause statement I will be paying more attention to the partition by clause to see if its related column is in the predicate part or not. It might explain why I have a WINDOW SORT instead of a WINDOW BUFFER

FootNote:  I have to warn that the above conclusions, despite they might be correct, they nevertheless remains to be sourced from a one day experiment. As such you should consider them with a careful attention before taken them as definitely demonstrated. It is not because they have been published that they are correct.


Redundant Indexes

$
0
0

I am very often warning:  dropping redundant indexes in production is not 100% safe. I have instead always been advocating paying a careful attention during design time to avoid creating redundant indexes. In my professional experience I have realized that, it is very often when creating indexes to cover the lock threat of unindexed foreign key constraints, that developers are creating unintentionally redundant indexes. It has been irritating me so that I have created a script which checks if a given foreign key is already indexed or not before creating supplementary non wanted indexes damaging the DML part of the application.

Having said that I still have not defined what is redundant indexes

Indexes ind_1 and ind_2 are said redundant when leading columns of one index are a superset of the leading columns of the other one

For example

Ind_1 (a,b) and ind_2(a,b,c) are redundant because ind_2 contains index ind_1.

If you are at design time it is obvious that you should not create index ind_1. However, once in production, it is not 100% safe to drop index ind_1 without any impact. There are, for sure, occasions were the clustering factor of ind_2 is so dramatic when compared to index ind_1 so that if the later index is dropped the optimizer will opt for a full table scan traumatizing the queries that were perfectly happy with the dropped redundant index.

I can also show you another type of what people might consider redundant indexes while they aren’t. Consider the following model where I have created a range partitioned table (mho_date being the partition key and I have created 1493 partitions) and two indexes as shown below

desc partitioned_tab

Name                            Null?    Type
------------------------------- -------- ------------
1      MHO_ID                          NOT NULL NUMBER(10)
2      MHO_DATE                        NOT NULL DATE
3      MHO_CODE                        NOT NULL VARCHAR2(1)
4      MHO_TYP_ID                      NOT NULL NUMBER(10)

create index local_ind_1 on partitioned_tab (mho_typ_id,mho_code) local;

create index global_ind_1 on partitioned_tab (mho_typ_id);

I am going to execute a simple query against the above engineered partitioned table.

select * from partitioned_tab where mho_typ_id = 0;

Which, in the presence of the above two indexes is honored via the following execution plan

----------------------------------------------------------------------
Plan hash value: 3042058313
----------------------------------------------------------------------------------------------------------
| Id  | Operation                                  | Name            | Rows  |Cost (%CPU)| Pstart| Pstop |
----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                           |                 |  1493 | 1496   (0)|       |       |
|   1 |  TABLE ACCESS BY GLOBAL INDEX ROWID BATCHED| PARTITIONED_TAB |  1493 | 1496   (0)| ROWID | ROWID |
|*  2 |   INDEX RANGE SCAN                         | GLOBAL_IND_1    |  1493 |    4   (0)|       |       |
----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("MHO_TYP_ID"=0)

Statistics
-------------------------------------------------------
48    recursive calls
0     db block gets
3201  consistent gets
0     physical reads
0     redo size
51244 bytes sent via SQL*Net to client
1632  bytes received via SQL*Net from client
101   SQL*Net roundtrips to/from client
7     sorts (memory)
0     sorts (disk)
1493  rows processed

As you might already know I am one of the fans of SQLT tool developed by Carlos Sierra. And here below what this tool said about redundant indexes in this particular case

Redundant_indexes

It is clearly suggesting considering dropping the redundant index global_ind_1 index

So let’s follow this advice and see what happens. Hopefully with the recent Oracle release I will first make the index invisible ( by the way that’s a good point to signal for Carlos Sierra and Mauro Pagano for them to suggest setting first the index invisible before considering dropping it)

alter index global_ind_1 invisible;

select * from partitioned_tab where mho_typ_id = 0;

-------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                  | Name            | Rows  | Bytes | Cost (%CPU)| Pstart| Pstop |
-------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                           |                 |  1493 | 23888 |  2985   (1)|       |       |
|   1 |  PARTITION RANGE ALL                       |                 |  1493 | 23888 |  2985   (1)|     1 |  1493 |
|   2 |   TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| PARTITIONED_TAB |  1493 | 23888 |  2985   (1)|     1 |  1493 |
|*  3 |    INDEX RANGE SCAN                        | LOCAL_IND_1     |  1493 |       |  1492   (0)|     1 |  1493 |
-------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("MHO_TYP_ID"=0)

Statistics
--------------------------------------------------------
48    recursive calls
0     db block gets
4588  consistent gets
0     physical reads
0     redo size
47957 bytes sent via SQL*Net to client
1632  bytes received via SQL*Net from client
101   SQL*Net roundtrips to/from client
7     sorts (memory)
0     sorts (disk)
1493  rows processed

By making the ”redundant” index invisible we went from a smooth one global index range scan with a cost of 4 to 1493 index range scans with a cost of 1492 with an additional 1387 logical reads.

Bottom line: You should always consider dropping (avoiding) redundant index at design time. Once in production consider making them invisible first before thinking carefully about dropping them.

PS : if you want to create the table the script can be found partitioned table


DBMS_SCHEDULER and CALL statement

$
0
0

This is a brief reminder for those using intensively Oracle dbms_scheduler package to schedule a launch of a stored PL/SQL procedure. Recently I was investigating a wide range performance problem via a 60 minutes AWR snapshot until the following action that appears in the TOP SQL ordered by Gets has captured my attention:

SQL ordered by Gets

Buffer Gets Executions Gets per Exec %Total Elapsed Time (s) %CPU %IO SQL Id SQL Module SQL Text
858,389,984 2 429,194,992.00 89.58 3,606.79 97.2 1.9 6cjn7jnzpc160 DBMS_SCHEDULER call xxx_PA_REDPC.P_EXPORT_DA…
542,731,679 32,021 16,949.24 56.64 3,201.40 98.4 .6 4bnh2nc4shkz3 w3wp.exe SELECT WAGD_ID FROM S1

It is not the enormous Logical I/O done by this scheduled stored procedure that has retained my attention but it is the appearance of the call statement in the corresponding SQL Text.

Where does this come from?

Let’s get the DDL of the corresponding program

 SELECT dbms_metadata.get_ddl('PROCOBJ','XXX_PARSE_MESSAGE_PRG', 'SXXX') from dual;

Which gives this

 BEGIN
  DBMS_SCHEDULER.CREATE_PROGRAM
    (program_name         => 'XXX_PARSE_MESSAGE_PRG'
    ,program_type         => 'STORED_PROCEDURE'
    ,program_action       => 'XXX_PA_REDPC.P_EXPORT_DA.P_xxxx'
    ,number_of_arguments  => 0
    ,enabled              => FALSE
    ,comments             => NULL);
 COMMIT;
 END;
 /
 

I have  arranged a little bit the generated DDL script for clarity.

Now things become clear.

When you define your program using

     program_type         => 'STORED_PROCEDURE'
 

Then your job will be executed using the SQL command call

    call XXX_PA_REDPC.P_EXPORT_DA..'
 

This is in contrast to when you define your program using

  program_type         => 'PLSQL_BLOCK'
 

which has the consequence of making your job being executed using an anonymous PL/SQL block

 BEGIN
    XXX_PA_REDPC.P_EXPORT_DA..
 END;
 

And now the question: how would you prefer your scheduled stored procedure to be executed?

  1. via the  SQL call statement
  2. via the anonymous PL/SQL block

Well, after a brief research on My Oracle support I found a bug that seems closely related to it

DBMS_SCHEDULER Is Suppressing NO_DATA_FOUND Exceptions for Jobs that Execute Stored Procedures (Doc ID 1331778.1)

In fact, there is one fundamental threat when opting for the call statement. Consider this

 SQL> create or replace procedure p1 as
    begin
     insert into t values (1,2);
      raise no_data_found;
      commit;
    end;
 /
Procedure created.

SQL> select count(1) from t;   

   COUNT(1)
   ----------        
    0

I am going to call this procedure using the call statement which normally will raise a no_data_found exception and the inserted data into table t will be rolled back

SQL> call p1();

Call completed.

SQL> select * from t;

        N1         N2
---------- ----------
         1          2

Despite the raise of the no_data_found exception inserted data has been committed and the exception ignored. This will not have happened if I have managed to execute the stored procedure using an anonymous PL/SQL block as shown below:

SQL> truncate table t;

Table truncated.

SQL> begin
p1();
end;
/
begin
*
ERROR at line 1:
ORA-01403: no data found
ORA-06512: at "XXX.P1", line 4
ORA-06512: at line 2

SQL> select count(1) from t;

COUNT(1)
----------
0

So, please be informed :-)


Why my query is not using my unique index

$
0
0

Here’s a point that I have been asked to explain and that I failed to answer. It is a simple select on a table with an equality predicate on two columns. These two columns are indexed using a two columns unique index. So it seems quite obvious that when selecting from this table using these two columns to (a) select a unique row and (b) use that unique index for that purpose. However, the CBO seems making an illogic decision per regards to the choice of the index access when there is no statistics on the table (num_rows and last_analyzed are null) while there is 0 statistics in the indexes. Things will become clear with a concrete example.

As very often, I will use one of the Jonathan Lewis table scripts creation and population.

 create table t1
   (id number,
    n_1000 number,
    n_5000 number,
    n_10000 number,
    small_vc varchar2(20),
    padding  varchar2(100)
   );

create unique index a_unq_ind on t1(id, n_1000);
create index b_non_unq_ind on t1(n_1000, n_5000);

Spot how I have managed to name my indexes(unique and non-unique) in an alphabetic order so that when this specific order matters the CBO will choose the unique index as far as it starts with the ‘a’ letter.

Now that I have created my table and my two indexes I can populate the table with data without transmitting statistics to the indexes.

 insert into t1
  with generator as (
   select   --+ materialize
    rownum id
   from dual
  connect by
  rownum <= 10000
)
select
    rownum                    id,
    mod(rownum,1000) n_1000,
    mod(rownum,5000) n_5000,
    mod(rownum,10000) n_10000,
    lpad(rownum,10,'0')       small_vc,
    rpad('x',100)             padding
from
    generator        v1,
    generator        v2
where
rownum <= 100000
;

commit;

Before starting my select, let me show you the actual table and index statistics


SQL> select table_name, num_rows, last_analyzed from user_tables where table_name = 'T1';

TABLE_NAME   NUM_ROWS LAST_ANALYZED
---------- ---------- -----------------
T1

SQL> select index_name, num_rows, clustering_factor from user_indexes where table_name = 'T1';

INDEX_NAME        NUM_ROWS CLUSTERING_FACTOR
--------------- ---------- -----------------
B_NON_UNQ_IND            0                 0
A_UNQ_IND                0                 0

And now my desired select and its corresponding execution plan (statistics_level have been set to all before)

SQL> select * from t1
     where id = 1
     and   n_1000 = 1;

ID     N_1000     N_5000    N_10000 SMALL_VC             PADDING
---------- ---------- ---------- ---------- -------------------- -----------
1          1          1          1 0000000001           x

SQL_ID  29tnq7b69swdr, child number 0
-------------------------------------
select * from t1  where id = 1  and   n_1000 = 1

Plan hash value: 3790258116

---------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name          | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
--------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |               |      1 |        |      1 |00:00:00.01 |     103 |
|*  1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T1            |      1 |     15 |      1 |00:00:00.01 |     103 |
|*  2 |   INDEX RANGE SCAN                  | B_NON_UNQ_IND |      1 |    615 |    100 |00:00:00.01 |       3 |
---------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("ID"=1)
2 - access("N_1000"=1)

It is clear by comparing the E-Rows and A-Rows that the statistics are not reflecting the reality.

And when I force the use of the unique index the CBO seems doing good estimations and generation less buffer gets

SQL> select /*+ index( t1 A_UNQ_IND ) */ * from t1  where id = 1  and   n_1000= 1;

ID     N_1000     N_5000    N_10000 SMALL_VC             PADDING
---------- ---------- ---------- ---------- --------------------
1          1          1          1 0000000001           x

SQL_ID  bt11jwur90xg0, child number 0
-------------------------------------
select /*+ index( t1 A_UNQ_IND ) */ * from t1  where id = 1  and n_1000= 1;

Plan hash value: 3585360496
---------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |      1 |        |      1 |00:00:00.01 |       3 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1        |      1 |      1 |      1 |00:00:00.01 |       3 |
|*  2 |   INDEX UNIQUE SCAN         | A_UNQ_IND |      1 |      1 |      1 |00:00:00.01 |       2 |
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("ID"=1 AND "N_1000"=1)

While this unique index execution plan seems optimal with correct estimations and less Buffers gets it is nevertheless not the index that the CBO desires to use.

Anyway, you are going to object and to say we are not going to run a real life application without having correct statistics gathered. Believe me that if I am writing this case here it is because it has happened in a real life production application. Having said that let’s collect statistics and see the new reaction of the CBO

SQL> exec dbms_stats.gather_table_stats(user ,'t1');

PL/SQL procedure successfully completed.

SQL> select * from t1
where id = 1
and   n_1000 = 1;

ID     N_1000     N_5000    N_10000 SMALL_VC             PADDING
---------------------------------------------------------------------------------------
1          1          1          1 0000000001           x

SQL_ID  29tnq7b69swdr, child number 1
-------------------------------------
select * from t1  where id = 1  and   n_1000 = 1

Plan hash value: 3585360496

---------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |      1 |        |      1 |00:00:00.01 |       3 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1        |      1 |      1 |      1 |00:00:00.01 |       3 |
|*  2 |   INDEX UNIQUE SCAN         | A_UNQ_IND |      1 |      1 |      1 |00:00:00.01 |       2 |
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("ID"=1 AND "N_1000"=1)

Note
-----
- statistics feedback used for this statement

After having collected statistics the CBO is now choosing the right index.

By the way I am doing this test case on a 12c data base (you might have already guessed that through the presence of the word BATCHED in the execution plan) and seeing this statistics feedback used note, pushed me to flush the shared pool and re-run my query in order to be sure that the use of the unique index is not due to this statistics feedback feature

SQL> alter system flush shared_pool;

System altered.

SQL> select * from t1
where id = 1
and   n_1000 = 1;

ID     N_1000     N_5000    N_10000 SMALL_VC             PADDING
--------- ---------- ---------- ---------- -------------------- ---------
1          1          1          1 0000000001           x

SQL_ID  29tnq7b69swdr, child number 0
-------------------------------------
select * from t1  where id = 1  and   n_1000 = 1

Plan hash value: 3585360496
---------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |      1 |        |      1 |00:00:00.01 |       3 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1        |      1 |      1 |      1 |00:00:00.01 |       3 |
|*  2 |   INDEX UNIQUE SCAN         | A_UNQ_IND |      1 |      1 |      1 |00:00:00.01 |       2 |
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("ID"=1 AND "N_1000"=1)

That is the whole story. If you want to learn more about this particular situation then you have to read this related otn thread.

Bottom line: always be sure to have adequate statistics so that you will give the CBO all chances to produce an adequate plan

PS : I started by trying to explain why the CBO didn’t choose the unique index and ended up while writing this article to two questions

1 : why dynamic sampling didn’t occur : Martin Press gave a clue that needs only to be tested

2 : when the CBO uses the index range scan the table is accessed via this new 12c  TABLE ACCESS BY INDEX ROWID BATCHED  that has been explained by Timur Akhmadeed but when it uses the unique index scan the CBO opted for the classical TABLE ACCESS BY INDEX ROWID



Index design

$
0
0

Here it is a very brief discussion I have had with one of my colleagues about index design

Colleague: what kind of index would you suggest to cover the following query?

SELECT
     rowid
    ,a.*
FROM  message_out a
WHERE sms_status    in (700, 707)
AND  (scheduled_time is null
      OR scheduled_time   <= :1)
AND   provider_id     in (0,0)
ORDER BY
      priority_level desc,
      creation_time asc;

----------------------------------------------------------------------------
| Id | Operation         | Name        | Rows | Bytes |TempSpc| Cost (%CPU)|
----------------------------------------------------------------------------
| 0  | SELECT STATEMENT  |             | 5529 | 1376K |       | 4769 (1)   |
| 1  | SORT ORDER BY     |             | 5529 | 1376K | 1856K | 4769 (1)   |
|* 2 |  TABLE ACCESS FULL| MESSAGE_OUT | 5529 | 1376K |       | 4462 (1)   |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(("SMS_STATUS"=700 OR "SMS_STATUS"=707)
AND ("SCHEDULED_TIME" IS NULL OR "SCHEDULED_TIME"<=:1)
AND "PROVIDER_ID"=0)

Me: and what have you ended up with until now?

Colleague: here my suggested index and the related execution plan

CREATE INDEX virtual_index ON MESSAGE_OUT(sms_status,scheduled_time,provider_id) ;

---------------------------------------------------------------------------------------
| Id | Operation                   | Name          | Rows | Bytes |TempSpc| Cost (%CPU)|
---------------------------------------------------------------------------------------
| 0  | SELECT STATEMENT            |               | 5529 | 1376K |       | 446 (1)    |
| 1  | SORT ORDER BY               |               | 5529 | 1376K | 1856K | 446 (1)    |
| 2  | INLIST ITERATOR             |               |      |       |       |            |
|* 3 |  TABLE ACCESS BY INDEX ROWID| MESSAGE_OUT   | 5529 | 1376K |       | 140 (0)    |
|* 4 |   INDEX RANGE SCAN          | VIRTUAL_INDEX | 5529 |       |       | 6 (0)      |
---------------------------------------------------------------------------------------

Me: I would not have created the same index

Me: here it is the index I would have created (after several questions regarding the data distribution, the table data volume, the use of bind variables, etc…)

create index mho_ind on MESSAGE_OUT (status, provider_id, scheduled_time);

Me: and if sms_status contains repetitive values then I would have added a compress command to that index creation

Colleague: there is no difference in the execution plan either by using my index or your index

------------------------------------------------------------------------------------------
| Id | Operation                     | Name          | Rows | Bytes |TempSpc| Cost (%CPU)|
------------------------------------------------------------------------------------------
| 0  | SELECT STATEMENT              |               | 5529 | 1376K |       | 446 (1)    |
| 1  | SORT ORDER BY                 |               | 5529 | 1376K | 1856K | 446 (1)    |
| 2  |  INLIST ITERATOR              |               |      |       |       |            |
|* 3 |   TABLE ACCESS BY INDEX ROWID | MESSAGE_OUT   | 5529 | 1376K |       | 140 (0)    |
|* 4 |    INDEX RANGE SCAN           | VIRTUAL_INDEX | 5529 |       |       | 6 (0)      |
------------------------------------------------------------------------------------------

Me: no, it is not the same plan. Please always consider the predicate part

Me: what is the predicate part of the plan using your index

Colleague: this is my index predicate part


Predicate Information (identified by operation id):
 ---------------------------------------------------
 3 - filter("SCHEDULED_TIME" IS NULL OR "SCHEDULED_TIME"<=:1)
 4 - access(("SMS_STATUS"=700 OR "SMS_STATUS"=707) AND "PROVIDER_ID"=0)
     filter("PROVIDER_ID"=0) --> additional filter operation

Colleague: and this is your index predicate part

Predicate Information (identified by operation id):
---------------------------------------------------
 3 - filter("SCHEDULED_TIME" IS NULL OR "SCHEDULED_TIME"<=:1)
 4 - access(("SMS_STATUS"=700 OR "SMS_STATUS"=707) AND "PROVIDER_ID"=0)
--> no additional filter operation

Me: and did you pointed out the difference or not yet?

Colleague: no, same plan, same cost and same execution time

Me: there is a fundamental difference between your plan and mine. In your plan there is a double operation on your engineered index:  “ACCESS + FILTER” operation while my engineered index needs only one precise operation :  “ACCESS

Me: and when it comes to performance you always prefers a precise index ACCESS operation to that double ACCESS and FILTER operations.

Me: your engineered index has a second columns on which an inequality predicate is applied

SCHEDULED_TIME <= :1

You should always start your index by the columns on which an equality predicate is applied. In my case, I put the SCHEDULED_TIME column at the trailing edge of my index and doing as such I have avoided a costly filter operation on my index while your engineered index has been subject to that costly filter operation

If you want to test this behaviour then below is an example to play with. I hope you will enjoy it

SQL> create table t1
     (id number,
     n_1000 number,
     n_5000 number,
     n_10000 number,
     small_vc varchar2(20),
     padding varchar2(100)
     );

Table created.

SQL> insert into t1
  with generator as (
  select --+ materialize
  rownum id
  from dual
  connect by
  rownum <= 10000
  )
  select
    rownum id,
    mod(rownum,1000) n_1000,
    mod(rownum,5000) n_5000,
    mod(rownum,10000) n_10000,
    lpad(rownum,10,'0') small_vc,
    rpad('x',100) padding
  from
    generator v1,
    generator v2
  where
  rownum <= 100000
  ;

SQL> create index my_ind on t1(id, n_5000, n_1000);

Index created.

SQL> create index colleague_ind on t1(id, n_1000, n_5000);

Index created.

SQL> alter index my_ind invisible;

Index altered.

SQL> exec dbms_stats.gather_table_stats(user, 't1');

PL/SQL procedure successfully completed.

SQL> select
        a.*
  from t1 a
  where id in (112,120)
  and (n_1000 is null
  or n_1000 <= 3000)
  and n_5000 in (120);

Statistics
------------------------------------------------------
 65 recursive calls
 0 db block gets
 95 consistent gets ---> spot this
 0 physical reads
 0 redo size
 1005 bytes sent via SQL*Net to client
 543 bytes received via SQL*Net from client
 2 SQL*Net roundtrips to/from client
 6 sorts (memory)
 0 sorts (disk)
 1 rows processed

SQL_ID 7d6ag1m1ztpgr, child number 1
-------------------------------------
select a.* from t1 a where id in (112,120) and (n_1000 is null
or n_1000 <= 3000) and n_5000 in (120)

Plan hash value: 3644584748
-------------------------------------------------------------------------------------------
| Id  | Operation                            | Name          | Rows  | Bytes | Cost (%CPU)|
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |               |       |       |     4 (100)|
|   1 |  INLIST ITERATOR                     |               |       |       |            |
|*  2 |   TABLE ACCESS BY INDEX ROWID BATCHED| T1            |     2 |   258 |     4   (0)|
|*  3 |    INDEX RANGE SCAN                  | COLLEAGUE_IND |     2 |       |     3   (0)|
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
 2 - filter(("N_1000"<=3000 OR "N_1000" IS NULL))
 3 - access((("ID"=112 OR "ID"=120)) AND "N_5000"=120)
 filter("N_5000"=120) ---> spot this

SQL> alter index colleague_ind invisible;

Index altered.

SQL> alter index my_ind visible;

Index altered.

SQL> select
       a.*
  from t1 a
  where id in (112,120)
  and (n_1000 is null
  or n_1000 <= 3000)
  and n_5000 in (120);

Statistics
------------------------------------------------------
 33 recursive calls
 0 db block gets
 49 consistent gets --> spot the reduction
 0 physical reads
 0 redo size
 1005 bytes sent via SQL*Net to client
 543 bytes received via SQL*Net from client
 2 SQL*Net roundtrips to/from client
 6 sorts (memory)
 0 sorts (disk)
 1 rows processed

SQL_ID 7d6ag1m1ztpgr, child number 1
-------------------------------------
select a.* from t1 a where id in (112,120) and (n_1000 is null
or n_1000 <= 3000) and n_5000 in (120)

Plan hash value: 4286547933</pre>
------------------------------------------------------------------------------------
| Id  | Operation                            | Name   | Rows  | Bytes | Cost (%CPU)|
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |        |       |       |     4 (100)|
|   1 |  INLIST ITERATOR                     |        |       |       |            |
|*  2 |   TABLE ACCESS BY INDEX ROWID BATCHED| T1     |     2 |   258 |     4   (0)|
|*  3 |    INDEX RANGE SCAN                  | MY_IND |     2 |       |     3   (0)|
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
 2 - filter(("N_1000"<=3000 OR "N_1000" IS NULL))
 3 - access((("ID"=112 OR "ID"=120)) AND "N_5000"=120)
     --> spot the absence of filter on index

NLS_SORT and SPM reproducibility

$
0
0

I wrote a blog article which aims mainly to show what optimizer parameter the CBO will use to reproduce aSPM baseline plan: the current environment parameters or the parameters used during the SPM plan baseline capture? Having tested in this blog article only the influence of the optimizer_mode (all_rows versus first_rows mode) I ended up with a conclusion that the CBO will use the optimizer_mode parameter stored during the SPM plan capture. I have also put a careful warning that this is not a conclusion one can spread to other CBO parameters without testing; particularly that I have already seen an otn thread dealing with the inability to reproduce a SPM plan because of a change in the _optim_peek_user_binds  hidden parameter. Then, a post on oracle list about the non reproducibility of a SPM baseline following an upgrade from 10gR2 to 11gR2 prompted me to investigate the influence the NLS_SORT parameter can have on the reproducibility of a SPM baseline plan. Below are my investigations and findings:

First, the model

CREATE TABLE t
(c1 VARCHAR2(64), c2 CHAR(15), d1 DATE);

INSERT INTO t
SELECT
    mod(ABS(dbms_random.random),3)+ 1||chr(ascii('Y')) ,
    dbms_random.string('L',dbms_random.value(1,5))||rownum ,
    to_date(TO_CHAR(to_date('01/01/1980','dd/mm/yyyy'),'J') + TRUNC(dbms_random.value(1,11280)),'J')
FROM dual
CONNECT BY level <= 2e6;

ALTER TABLE t ADD CONSTRAINT t_pk PRIMARY KEY (c1,c2) USING INDEX;

EXEC dbms_stats.gather_table_stats (USER, 't', CASCADE => true, method_opt => 'FOR ALL COLUMNS SIZE 1');

Second, the preliminaries

SQL> select * from dba_sql_plan_baselines;

no rows selected  --  no baseline yet

SQL> show parameter nls_sort

NAME                                 TYPE        VALUE
------------------------------------ ----------- --------------
nls_sort                             string      BINARY

SQL> alter session set optimizer_capture_sql_plan_baselines = TRUE;

SQL> SELECT  c1
FROM t
GROUP BY c1
ORDER BY c1 ASC NULLS LAST;

C1
------
1Y
2Y
3Y

SQL> /

C1
------
1Y
2Y
3Y

SQL> alter session set optimizer_capture_sql_plan_baselines = FALSE;

SQL> select plan_name from dba_sql_plan_baselines;

PLAN_NAME
------------------------------
SQL_PLAN_90sg67694zwyj641607ca  -- one SPM plan baseline

So far I have engineered a model against which I executed a query returning data in a certain order under the classical Binary NLS_SORT parameter. I added to that situation a SPM baseline plan so that any “resembling” query will use that SPM plan. That is plan stability. This SPM plan looks like:

SQL> select * from table(dbms_xplan.display_sql_plan_baseline(plan_name => 'SQL_PLAN_90sg67694zwyj641607ca', format => 'ADVANCED'));

--------------------------------------------------------------------------------
SQL handle: SQL_9061e639924ff3d1
SQL text: SELECT  c1     FROM t     GROUP BY c1    ORDER BY c1 ASC NULLS LAST
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Plan name: SQL_PLAN_90sg67694zwyj641607ca         Plan id: 1679165386
Enabled: YES     Fixed: NO      Accepted: YES     Origin: AUTO-CAPTURE
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Outline Data from SMB:
/*+
BEGIN_OUTLINE_DATA
INDEX(@"SEL$1" "T"@"SEL$1" ("T"."C1" "T"."C2"))
OUTLINE_LEAF(@"SEL$1")
ALL_ROWS
DB_VERSION('11.2.0.3')
OPTIMIZER_FEATURES_ENABLE('11.2.0.3')
IGNORE_OPTIM_EMBEDDED_HINTS
END_OUTLINE_DATA
*/
--------------------------------------------------------------------------------
Plan hash value: 2111031280
-----------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |     3 |     9 |  2069   (5)| 00:00:06 |
|   1 |  SORT GROUP BY NOSORT|      |     3 |     9 |  2069   (5)| 00:00:06 |
|   2 |   INDEX FULL SCAN    | T_PK |  2000K|  5859K|  2069   (5)| 00:00:06 |
-----------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
1 - SEL$1
2 - SEL$1 / T@SEL$1

Outline Data
------------
/*+
BEGIN_OUTLINE_DATA
INDEX(@"SEL$1" "T"@"SEL$1" ("T"."C1" "T"."C2"))
OUTLINE_LEAF(@"SEL$1")
ALL_ROWS
DB_VERSION('11.2.0.3')
OPTIMIZER_FEATURES_ENABLE('11.2.0.3')
IGNORE_OPTIM_EMBEDDED_HINTS
END_OUTLINE_DATA
*/

Column Projection Information (identified by operation id):
-----------------------------------------------------------
1 - (#keys=1) "C1"[VARCHAR2,256]
2 - "C1"[VARCHAR2,256]

An index full scan with which the CBO has avoided the order by sort operation.

Now, for my “tranquility”,  I will execute my query and check if the SPM plan is used or not.

SQL> SELECT  c1
     FROM t
     GROUP BY c1
     ORDER BY c1 ASC NULLS LAST;

C1
----
1Y
2Y
3Y

SQL_ID  28dazsm20sbw6, child number 2
-------------------------------------
SELECT  c1     FROM t     GROUP BY c1    ORDER BY c1 ASC NULLS LAST
Plan hash value: 2111031280
-----------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |       |       |  2069 (100)|          |
|   1 |  SORT GROUP BY NOSORT|      |     3 |     9 |  2069   (5)| 00:00:06 |
|   2 |   INDEX FULL SCAN    | T_PK |  2000K|  5859K|  2069   (5)| 00:00:06 |
-----------------------------------------------------------------------------

Note
-----
- SQL plan baseline SQL_PLAN_90sg67694zwyj641607ca used for this statement

Great it is used.

Third, the issue presentation and discussion

What happens if, in my current environment, I change the NLS_SORT parameter?

SQL> alter session set nls_sort=french; -- I altered my current environment when compared to the SPM capture time environment

SQL> SELECT  c1
FROM t
GROUP BY c1
ORDER BY c1 ASC NULLS LAST;

C1
----
1Y
2Y
3Y

SQL_ID  28dazsm20sbw6, child number 2
-------------------------------------
SELECT  c1     FROM t     GROUP BY c1    ORDER BY c1 ASC NULLS LAST

Plan hash value: 1760210272
------------------------------------------------------------------------------
| Id  | Operation             | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |       |       |  2451 (100)|          |
|   1 |  SORT ORDER BY        |      |     3 |     9 |  2451  (20)| 00:00:07 |
|   2 |   SORT GROUP BY NOSORT|      |     3 |     9 |  2451  (20)| 00:00:07 |
|   3 |    INDEX FULL SCAN    | T_PK |  2000K|  5859K|  2069   (5)| 00:00:06 |
------------------------------------------------------------------------------

See how my query is not using the SPM baseline plan anymore. It is using a new plan where the sort order by operation has not been eliminated by the CBO. If my query is not using the SPM plan this is because the CBO was not able to reproduce the stored Baseline plan because of nls_sort parameter change. If this means something it then means that when trying to reproduce the SPM plan the CBO uses the current nls_sort parameter.

The new CBO plan have been added to the SPM baseline for an eventual evolution

SQL> select plan_name from dba_sql_plan_baselines;

PLAN_NAME
------------------------------
SQL_PLAN_90sg67694zwyj297df088
SQL_PLAN_90sg67694zwyj641607ca

If I alter again my current nls_sort parameter so that it will match the one stored against the baseline plan, then my query will be back to its initial use of the SPM plan

 SQL> alter session set nls_sort=binary;

 SQL> SELECT  c1
      FROM t
      GROUP BY c1
      ORDER BY c1 ASC NULLS LAST;

SQL_ID  5hrfv0352fzdr, child number 0
-------------------------------------
SELECT  c1 FROM t GROUP BY c1     ORDER BY c1 ASC NULLS LAST

Plan hash value: 2111031280
-----------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |       |       |  2069 (100)|          |
|   1 |  SORT GROUP BY NOSORT|      |     3 |     9 |  2069   (5)| 00:00:06 |
|   2 |   INDEX FULL SCAN    | T_PK |  2000K|  5859K|  2069   (5)| 00:00:06 |
-----------------------------------------------------------------------------

Note
-----
- SQL plan baseline SQL_PLAN_90sg67694zwyj641607ca used for this statement

Bottom line: in contrast to the optimizer_mode parameter when it comes to NLS_SORT (and NLS_LANG) parameter, the CBO seems to use the current environment NLS_SORT value (and not the one that existed at the baseline time capture) to reproduce the stored SPM plan baseline.


parallel insert

$
0
0

Do you know that a direct path insert can be silently ignored when the inserted table has an insert trigger?

SQL> create table t1 as select rownum n1 from dual connect by level <=1e5;

SQL> create table t2 as select * from t1 where 0 = 1;

SQL> create or replace trigger t2_trg
      before insert on t2
      for each row
    begin
       null;
    end;
   /

SQL> insert /*+ append */ into t2 select * from t1;

100000 rows created.

SQL> select count(1) from t2;

COUNT(1)
---------
100000

SQL> alter trigger t2_trg disable;

SQL> insert /*+ append */ into t2 select * from t1;

100000 rows created.

SQL> select count(1) from t2;

select count(1) from t2
*
ERROR at line 1:
ORA-12838: cannot read/modify an object after modifying it in parallel

It is only when the trigger has been disabled that the insert has been done via a direct path load.

But are you aware that a parallel insert can also be silently ignored in presence of an insert trigger?

SQL> alter session enable parallel dml;

SQL> insert /*+ parallel(t2) */ into t2 select * from t1;

100000 rows created.

SQL_ID  5npb49pus3wtr, child number 0
-------------------------------------
insert /*+ parallel(t2) */ into t2 select * from t1

Plan hash value: 2315600204
-----------------------------------------------------------------------------------------------------------------
| Id  | Operation               | Name     | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
-----------------------------------------------------------------------------------------------------------------
|   0 | INSERT STATEMENT        |          |       |       |    47 (100)|          |        |      |            |
|   1 |  PX COORDINATOR         |          |       |       |            |          |        |      |            |
|   2 |   PX SEND QC (RANDOM)   | :TQ10001 |   100K|  1269K|    47   (7)| 00:00:01 |  Q1,01 | P->S | QC (RAND)  |
|   3 |    LOAD AS SELECT       |          |       |       |            |          |  Q1,01 | PCWP |            |
|   4 |     PX RECEIVE          |          |   100K|  1269K|    47   (7)| 00:00:01 |  Q1,01 | PCWP |            |
|   5 |      PX SEND ROUND-ROBIN| :TQ10000 |   100K|  1269K|    47   (7)| 00:00:01 |        | S->P | RND-ROBIN  |
|   6 |       TABLE ACCESS FULL | T1       |   100K|  1269K|    47   (7)| 00:00:01 |        |      |            |
-----------------------------------------------------------------------------------------------------------------

Note
-----
- dynamic sampling used for this statement (level=2)

SQL> select count(1) from t2;

select count(1) from t2
*
ERROR at line 1:
ORA-12838: cannot read/modify an object after modifying it in parallel

SQL> rollback;

SQL> alter trigger t2_trg enable;

SQL> insert /*+ parallel(t2) */ into t2 select * from t1;

100000 rows created.

SQL_ID  5npb49pus3wtr, child number 0
-------------------------------------
insert /*+ parallel(t2) */ into t2 select * from t1

Plan hash value: 3617692013
---------------------------------------------------------------------------------
| Id  | Operation                | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | INSERT STATEMENT         |      |       |       |    47 (100)|          |
|   1 |  LOAD TABLE CONVENTIONAL |      |       |       |            |          |
|   2 |   TABLE ACCESS FULL      | T1   |   100K|  1269K|    47   (7)| 00:00:01 |
---------------------------------------------------------------------------------

Note
-----
- dynamic sampling used for this statement (level=2)

SQL> select count(1) from t2;

COUNT(1)
----------
300000

When the trigger has been disabled the insert went through a parallel load path in contrast to when the same trigger has been enabled where the insert went through to a classical conventional load.

A classical conventional load have I said?

Yes. If you look closely to the case where the parallel insert has been honored you will realize that not only the insert run in parallel but it has also been done via a direct path mode.

Tell me: does this means that a parallel insert (when honored) is always accomplished via a direct path load?

Yes it is.

So is the append hint in the following instruction redundant?


SQL> insert /*+ append parallel(t2) */ into t2 select * from t1;

No it is not always redundant.

Simply because a parallel run and a direct path load share several common “impeachment” reasons like the presence of insert triggers or foreign integrity constraints((yes I have also tested this case also)  but they also have their proper source of impeachment. While the direct path is not a CBO decision, the parallel run is. So, what do you think will happen when the CBO decides that the parallel insert is not the best path to follow and you have already get rid of the append hint from your insert statement? The direct path, when possible, will not be followed.

If I simulate a parallel run impeachment by disabling the parallel dml and get rid of the append hint from my insert statement then the direct path load will not happen

SQL> alter session disable parallel dml;

SQL> insert /*+ parallel(t2) */ into t2 select * from t1;

100000 rows created.

SQL> select count(1) from t2;

COUNT(1)
----------
100000

However if, in the same situation,  I will add the append hint to my insert statement then the insert will follow a direct path load path as demonstrated below:

SQL> insert /*+ parallel(t2) append */ into t2 select * from t1;

100000 rows created.

SQL> select count(1) from t2;
select count(1) from t2
*
ERROR at line 1:
ORA-12838: cannot read/modify an object after modifying it in parallel

This is the otn thread that prompted this blog post

Bottom line: I learnt yesterday that a parallel insert when followed by the CBO will use a direct path load which is not always the best solution. I learnt also in this case that a parallel insert suffers from the same impeachment raisons as the direct path load does: presence of insert triggers and foreign integrity constraint. And I learnt that to secure your direct path load add the append hint  to your insert statement even when this statement contains a parallel hint


SPM reproducibility: changing the _optimizer_skip_scan_enabled value

$
0
0

I have had several articles about Sql Plan Management stability and evolution. I have also been recently focusing my attention particularly on the criteria, scenarios and circumstances that impeach an enabled and accepted SPM plan to be reproducible. Below are summarized those impeachment reasons:

  1. Changing the index name
  2. Changing the index type (with particular situations for function based and reverse indexes)
  3. Changing the index leading column(s)

Then I embarked on the investigation of the reaction a SPM plan could manifest, during a query execution time, to a change in one of the environment parameters that have been used during the SPM plan capture. I have investigated two situations in this context

  1. One for a change in an optimizer parameter (optimizer mode) value
  2. And the other one for a NLS parameter (nls_sort) change

For the optimizer mode I came up to a conclusion that a change of that mode in the current environment will not have an effect on the reproducibility of the SPM plan as far as this one will be reproduced using the optimizer mode stored against the SPM plan.

However, an nls_sort parameter change reveals to be a serious threat for the SPM reproducibility as far the CBO will use in this particular case the current nls_sort parameter value which might end up with an impossibility to reproduce the stored SPM plan.

Despite the above investigations, there is one thing that I wanted to investigate the effect it has on the reproducibility of a SPM plan: a change of an undocumented optimizer parameter value. This post is for that purpose. Follow me please

 create table t1
    (col1  number
    ,col2  varchar2(50)
    ,flag  varchar2(2));

 insert into t1
    select rownum
          ,lpad('X',50,'X')
          ,case when rownum = 1
            then 'Y1'
               when rownum = 2
            then 'Y2'
               when mod(rownum,2) = 0
            then 'N1'
            else 'N2'
           end
    from   dual
connect by rownum <= 100000;

create index i1 on t1(col1,flag);

exec dbms_stats.gather_table_stats(user ,'t1');

A simple model with an index which, as we will see later, has been engineered so that it will be skip scanned. The next lines of code will store a SPM plan for a given query using default optimizer parameters

SQL> var n varchar2(2);
SQL> exec :n := 'Y1'

SQL> alter session set optimizer_capture_sql_plan_baselines=TRUE;

SQL> select count(1) from t1 where flag = :n;

SQL> /

SQL> alter session set optimizer_capture_sql_plan_baselines=FALSE;

SQL> select count(1) from t1 where flag = :n;

SQL_ID  5k94675mwqz5j, child number 0
-------------------------------------
select count(1) from t1 where flag = :n

Plan hash value: 2775586896
-------------------------------------------------------------------------
| Id  | Operation        | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT |      |       |       |    67 (100)|          |
|   1 |  SORT AGGREGATE  |      |     1 |     3 |            |          |
|*  2 |   INDEX SKIP SCAN| I1   | 25000 | 75000 |    67   (2)| 00:00:01 |
-------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
1 - SEL$1
2 - SEL$1 / T1@SEL$1

Outline Data
-------------
/*+
BEGIN_OUTLINE_DATA
IGNORE_OPTIM_EMBEDDED_HINTS
OPTIMIZER_FEATURES_ENABLE('11.2.0.3')
DB_VERSION('11.2.0.3')
ALL_ROWS
OUTLINE_LEAF(@"SEL$1")
INDEX_SS(@"SEL$1" "T1"@"SEL$1" ("T1"."COL1" "T1"."FLAG"))
END_OUTLINE_DATA
*/

Peeked Binds (identified by position):
--------------------------------------
1 - :N (VARCHAR2(30), CSID=873): 'Y1'

Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("FLAG"=:N)
   filter("FLAG"=:N)

Column Projection Information (identified by operation id):
-----------------------------------------------------------
1 - (#keys=0) COUNT(*)[22]

Note
-----
- SQL plan baseline SQL_PLAN_4njm93y44xuu38bc11ac1 used for this statement

A simple query constrained by a SPM plan using index skip scan access.

Having presented the model  I can now start my real investigations which can be summarize via this simple question : if I alter my current environment so that I will disable the use of index skip scan, will my query still be using the SPM plan? Or in more subtle and clear words: will the SPM plan be reproducible under such a change of an undocumented parameter value?

And the answer is : see below

SQL> alter session set "_optimizer_skip_scan_enabled"=FALSE;

SQL> alter session set optimizer_use_sql_plan_baselines = FALSE;

I have altered the hidden parameter and put on hold for a moment the use of sql plan baseline in order to show you that under this circumstances the CBO will come up with a new plan as shown below:

SQL> select count(1) from t1 where flag = :n;

SQL_ID  5k94675mwqz5j, child number 3
-------------------------------------
select count(1) from t1 where flag = :n

Plan hash value: 129980005
------------------------------------------------------------------------------
| Id  | Operation             | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |       |       |    71 (100)|          |
|   1 |  SORT AGGREGATE       |      |     1 |     3 |            |          |
|*  2 |   INDEX FAST FULL SCAN| I1   | 25000 | 75000 |    71   (6)| 00:00:01 |
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("FLAG"=:N)

There is now a new plan using an INDEX FAST FULL SCAN since I have disabled the use of INDEX SKIP SCAN. It is a clear indication that under this undocumented parameter change the CBO is not able to reproduce the index skip scan plan. Let’s see then what happen to our query.

SQL> alter session set optimizer_use_sql_plan_baselines = TRUE;

SQL> select count(1) from t1 where flag = :n;

---------------------------------------------------
SQL_ID  5k94675mwqz5j, child number 5
-------------------------------------
select count(1) from t1 where flag = :n

Plan hash value: 2775586896
-------------------------------------------------------------------------
| Id  | Operation        | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT |      |       |       |    67 (100)|          |
|   1 |  SORT AGGREGATE  |      |     1 |     3 |            |          |
|*  2 |   INDEX SKIP SCAN| I1   | 25000 | 75000 |    67   (2)| 00:00:01 |
-------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
1 - SEL$1
2 - SEL$1 / T1@SEL$1

Outline Data
-------------
/*+
BEGIN_OUTLINE_DATA
IGNORE_OPTIM_EMBEDDED_HINTS
OPTIMIZER_FEATURES_ENABLE('11.2.0.3')
DB_VERSION('11.2.0.3')
ALL_ROWS
OUTLINE_LEAF(@"SEL$1")
INDEX_SS(@"SEL$1" "T1"@"SEL$1" ("T1"."COL1" "T1"."FLAG"))
END_OUTLINE_DATA
*/

Peeked Binds (identified by position):
--------------------------------------
1 - :N (VARCHAR2(30), CSID=873): 'Y1'

Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("FLAG"=:N)
   filter("FLAG"=:N)

Column Projection Information (identified by operation id):
-----------------------------------------------------------
1 - (#keys=0) COUNT(*)[22]

Note
-----
- SQL plan baseline SQL_PLAN_4njm93y44xuu38bc11ac1 used for this statement

The CBO has been able to reproduce the SPM INDEX SKIP SCAN plan.

Bottom line: if the CBO has been able to reproduce the SPM INDEX SKIP SCAN plan under a disabled _optimizer_skip_scan_enabled parameter in the query execution environment, this means that the CBO will use the value of that hidden parameter stored during the SPM plan capture and not the value this parameter has in the current execution environment.


Indexing Foreign Key: the proof of the FK index use

$
0
0

If you have been looking for a material proof showing Oracle using the foreign key index created on the child table while deleting from a parent table then here it is

drop table t2;
drop table t1;

create table t1
(col1 number primary key);

create table t2
(col1    number primary key
,status  varchar2(12) not null
,col2    number
,col2v   number generated always as (case when status = 'ACTIVE' then col2 end) VIRTUAL
,constraint t2_fk foreign key (col2v) references t1(col1)
,constraint t2_ck check (status in ('ACTIVE','INACTIVE') and (status = 'INACTIVE' or col2 is not null))
);

create index t2_ind_fk on t2(col2v);

insert into t1
 select rownum
from dual
connect by level <=100;

commit;

insert into t2 (col1, status, col2) values (1, 'ACTIVE',50);

alter session set skip_unusable_indexes = false;

alter index t2_ind_fk unusable; -- implicit commit

I have created a pair of parent-child table (t1 and t2), an index on the foreign key on the t2 child table, set this index into an unusable state and changed the default skip_unusable_indexes parameter to false so that unusable indexes will not be skipped.

Now, I am going in the next PL/SQL anonymous block, to simulate a delete from a parent table using an autonomous transaction in order to mimic a different session (in fact a different transaction within the same session)

declare
 pragma autonomous_transaction;
begin
 delete from t1 –- deleting from the parent table
 where col1 = 99;
 commit;
end;
/

declare
*
ERROR at line 1:
ORA-01502: index 'XXX.T2_IND_FK' or partition of such index is in unusable state
ORA-06512: at line 4

See how deleting from the parent table (t1) triggered an error on the index of the foreign key constraint created on the child table (t2). This is a simple way to show the mechanism used by Oracle in order to avoid a child table lock (before eventually a deadlock situation) simply by using the index on the foreign key.


Viewing all 224 articles
Browse latest View live