
12c Adaptive Cursor Sharing


This is neither a 12c new feature you are still unaware of, nor an extension of the 11g Adaptive Cursor Sharing that I am going to scoop here. It is rather something I noticed while writing a complete chapter on Adaptive Cursor Sharing and that I wanted to share with you. Here we go.

11g Release

SQL> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
PL/SQL Release 11.2.0.3.0 - Production
CORE    11.2.0.3.0      Production
TNS for Linux: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production

I have a simple query executed with two bind variable values. The first one favors an index range scan path while the second one calls for a full table scan. The bind variable is compared to a column that has a frequency histogram, so that after several executions the underlying cursor starts by being bind sensitive and, following a warm-up period, ends up being bind aware. Once the cursor is bind aware, the Extended Cursor Sharing (ECS) layer code kicks in at each execution by peeking at the bind variable (there might be several), checking its selectivity and deciding whether to share an existing child cursor or compile a new one, updating the v$sql_cs_selectivity dynamic view accordingly.

SQL> select
        sql_id
       ,child_number
       ,is_bind_aware
       ,is_bind_sensitive
       ,to_char(exact_matching_signature) sig
       ,executions
       ,plan_hash_value
    from v$sql
    where sql_id = '6fbvysnhkvugw'
    and is_shareable = 'Y';

SQL_ID        CHILD_NUMBER I I SIG                  EXECUTIONS PLAN_HASH_VALUE
------------- ------------ - - -------------------- ---------- ---------------
6fbvysnhkvugw            3 Y Y 15340826253708983785    1       3625400295
6fbvysnhkvugw            4 Y Y 15340826253708983785    1       3724264953
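
The warm-up just mentioned can be watched through the ACS monitoring views. Here is a minimal sketch against v$sql_cs_histogram and v$sql_cs_statistics for the test query's sql_id; the bucket counts in the histogram view are, roughly speaking, what drives the transition from bind sensitive to bind aware:

SQL> -- ACS warm-up: execution-count buckets per child cursor
SQL> select child_number, bucket_id, count
       from v$sql_cs_histogram
      where sql_id = '6fbvysnhkvugw'
      order by child_number, bucket_id;

SQL> -- peeked binds and work done per child cursor
SQL> select child_number, peeked, executions, rows_processed
       from v$sql_cs_statistics
      where sql_id = '6fbvysnhkvugw';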

In order to finish setting up the scene for this blog article, I have previously loaded the above two execution plans (plan_hash_value) from the cursor cache into a SPM baseline, so that I pre-empt the CBO from using any plan I have not accepted.

SQL> declare
           rs pls_integer;
     begin
           rs := dbms_spm.load_plans_from_cursor_cache('6fbvysnhkvugw');
     end;
     /
PL/SQL procedure successfully completed.
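
A quick sanity check that both plans have indeed landed in the baseline can be made against dba_sql_plan_baselines, for instance:

SQL> select plan_name, origin, enabled, accepted
       from dba_sql_plan_baselines
      where sql_text like 'select count(*), max(col2) from t1%';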

The engineered model (ACS + SPM) is such that when I run the following query, alternating between a "full table scan" bind variable value and an "index range scan" one, I get the following picture:

SQL> select count(*), max(col2) from t1 where flag = 'N1';-- full table scan

  COUNT(*) MAX(COL2)
---------- --------------------------------------------------
     49999 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

Plan hash value: 3724264953
---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |   273 (100)|          |
|   1 |  SORT AGGREGATE    |      |     1 |    54 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   | 50135 |  2643K|   273   (1)| 00:00:04 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("FLAG"=:SYS_B_0)
Note
-----
   - SQL plan baseline SQL_PLAN_d9tch6banyzg9616acf47 used for this statement

And

SQL> select count(*), max(col2) from t1 where flag = 'Y1'; -- index range scan

  COUNT(*) MAX(COL2)
---------- --------------------------------------------------
         1 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

-------------------------------------------------------------------------------------
| Id  | Operation                    | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |      |       |       |     2 (100)|          |
|   1 |  SORT AGGREGATE              |      |     1 |    54 |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID| T1   |     9 |   486 |     2   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | I1   |     9 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("FLAG"=:SYS_B_0)

Note
-----
   - SQL plan baseline SQL_PLAN_d9tch6banyzg98576eb1f used for this statement

After a short ACS warm-up period, whose length depends on the number of executions, we start observing a perfect harmony between ACS and SPM: an index range scan plan for an "index range scan" bind variable and a full table scan plan for a "full table scan" bind variable.

However, it suffices to create an extra index for this perfect harmony to cease working, as shown below:

SQL> create index i2 on t1(flag,col2);

SQL> select count(*), max(col2) from t1 where flag = 'N2';

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |   273 (100)|          |
|   1 |  SORT AGGREGATE    |      |     1 |    54 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   | 49846 |  2628K|   273   (1)| 00:00:04 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("FLAG"=:SYS_B_0)

Note
-----
   - SQL plan baseline SQL_PLAN_d9tch6banyzg9616acf47 used for this statement

SQL> select plan_name
    from dba_sql_plan_baselines
    where accepted = 'NO';

PLAN_NAME
------------------------------
SQL_PLAN_d9tch6banyzg9495f4ddb

The CBO has come up with a new execution plan (an index fast full scan) but has been constrained by SPM to keep using the accepted full table scan plan. The new, not yet accepted, CBO plan looks like this:

SQL> select * from table(dbms_xplan.display_sql_plan_baseline(plan_name => 'SQL_PLAN_d9tch6banyzg9495f4ddb'));

--------------------------------------------------------------------------------
SQL handle: SQL_d4e59032d54f7de9
SQL text: select count(*), max(col2) from t1 where flag = :"SYS_B_0"
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Plan name: SQL_PLAN_d9tch6banyzg9495f4ddb         Plan id: 1230982619
Enabled: YES     Fixed: NO      Accepted: NO      Origin: AUTO-CAPTURE
--------------------------------------------------------------------------------
Plan hash value: 2348726875
------------------------------------------------------------------------------
| Id  | Operation             | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |     1 |    54 |   249   (1)| 00:00:03 |
|   1 |  SORT AGGREGATE       |      |     1 |    54 |            |          |
|*  2 |   INDEX FAST FULL SCAN| I2   | 25000 |  1318K|   249   (1)| 00:00:03 |
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("FLAG"=:SYS_B_0)
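
As a side note, if the goal were to let the optimizer actually use this auto-captured index fast full scan plan, the usual route would be to evolve it. This is not what I am after in this experiment, but a minimal sketch with the handle and plan name reported above would look like this:

SQL> set long 100000
SQL> select dbms_spm.evolve_sql_plan_baseline(
              sql_handle => 'SQL_d4e59032d54f7de9',
              plan_name  => 'SQL_PLAN_d9tch6banyzg9495f4ddb',
              verify     => 'YES',
              commit     => 'YES') evolve_report
       from dual;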

Let's now execute the query with a bind variable favoring an index range scan:

SQL> select count(*), max(col2) from t1 where flag = 'Y2'; -- index range scan
---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |   273 (100)|          |
|   1 |  SORT AGGREGATE    |      |     1 |    54 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   | 49846 |  2628K|   273   (1)| 00:00:04 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
--------------------------------------------------
   2 - filter("FLAG"=:SYS_B_0)

Note
-----
   - SQL plan baseline SQL_PLAN_d9tch6banyzg9616acf47 used for this statement

The index range scan plan has not been used. Maybe it still needs a warm-up period?

SQL> select count(*), max(col2) from t1 where flag = 'Y2';
SQL> /
SQL> /
SQL> /
SQL> /
SQL> /

SQL> select * from table(dbms_xplan.display_cursor);
---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |   273 (100)|          |
|   1 |  SORT AGGREGATE    |      |     1 |    54 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   | 49846 |  2628K|   273   (1)| 00:00:04 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("FLAG"=:SYS_B_0)

Note
-----
   - SQL plan baseline SQL_PLAN_d9tch6banyzg9616acf47 used for this statement

SQL> select
          sql_id
         ,child_number
         ,is_bind_aware
         ,is_bind_sensitive
         ,to_char(exact_matching_signature) sig
         ,executions
         ,plan_hash_value
     from v$sql
     where sql_id = '6fbvysnhkvugw'
     and is_shareable = 'Y'
     ;

SQL_ID        CHILD_NUMBER I I SIG                  EXECUTIONS PLAN_HASH_VALUE
------------- ------------ - - -------------------- ---------- ---------------
6fbvysnhkvugw            3 N N 15340826253708983785  10      3724264953

Ten executions later, the switch to the index range scan still has not occurred.

Bizarrely, it suffices to disable the use of SQL plan baselines for the plan switch to occur immediately:

SQL> alter session set optimizer_use_sql_plan_baselines = FALSE;

SQL> select count(*), max(col2) from t1 where flag = 'Y2';

Plan hash value: 3625400295
-------------------------------------------------------------------------------------
| Id  | Operation                    | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |      |       |       |     2 (100)|          |
|   1 |  SORT AGGREGATE              |      |     1 |    54 |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID| T1   |    18 |   972 |     2   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | I1   |    18 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("FLAG"=:SYS_B_0)

Re-enable the use of SQL plan baselines and the plan switch ceases just as immediately:

SQL> alter session set optimizer_use_sql_plan_baselines = TRUE;

SQL> select count(*), max(col2) from t1 where flag = 'Y2';

Plan hash value: 3724264953
---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |   273 (100)|          |
|   1 |  SORT AGGREGATE    |      |     1 |    54 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   | 49846 |  2628K|   273   (1)| 00:00:04 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("FLAG"=:SYS_B_0)

Note
-----
- SQL plan baseline SQL_PLAN_d9tch6banyzg9616acf47 used for this statement

12c Release

Repeat exactly the same experiment in 12c and you will realize that things have changed:

SQL> select * from v$version where rownum =1;

BANNER                                                                       CON_ID
---------------------------------------------------------------------------- ------
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production   0

SQL> create index i2 on t1(flag,col2);

SQL> select count(*), max(col2) from t1 where flag = 'N2';

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |   273 (100)|          |
|   1 |  SORT AGGREGATE    |      |     1 |    54 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   | 49874 |  2630K|   273   (1)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("FLAG"=:SYS_B_0)

Note
-----
- SQL plan baseline SQL_PLAN_d9tch6banyzg9616acf47 used for this statement

SQL> select count(*), max(col2) from t1 where flag = 'Y2';
---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |   273 (100)|          |
|   1 |  SORT AGGREGATE    |      |     1 |    54 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   | 49874 |  2630K|   273   (1)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("FLAG"=:SYS_B_0)

Note
-----
- SQL plan baseline SQL_PLAN_d9tch6banyzg9616acf47 used for this statement

Let's now see whether a second execution of the same query with a bind variable favoring an index range scan will switch to an index range scan plan:


SQL> select count(*), max(col2) from t1 where flag = 'Y2';

----------------------------------------------------------------------------------
| Id  | Operation                            | Name | Rows  | Bytes | Cost (%CPU)|
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |      |       |       |     2 (100)|
|   1 |  SORT AGGREGATE                      |      |     1 |    54 |            |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| T1   |     1 |    54 |     2   (0)|
|*  3 |    INDEX RANGE SCAN                  | I1   |     1 |       |     1   (0)|
----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("FLAG"=:SYS_B_0)

Note
-----
- SQL plan baseline SQL_PLAN_d9tch6banyzg97823646b used for this statement

And the plan switch happens.

Something new has been introduced in Oracle 12c which has dramatically enhanced the collaboration between ACS and SPM.

In the 11g investigation I executed the same query 10 times without provoking a plan switch. In 12c, I only had to execute the query with the "index range scan" bind variable as many times as it had already been executed with the "full table scan" bind variable (2 executions) for ACS to kick in, as shown below:

SQL> select
      sql_id
     ,child_number
     ,is_bind_aware
     ,is_bind_sensitive
     ,to_char(exact_matching_signature) sig
     ,executions
     ,plan_hash_value
    from v$sql
    where sql_id = '6fbvysnhkvugw'
    and is_shareable = 'Y';

SQL_ID        CHILD_NUMBER I I SIG                  EXECUTIONS PLAN_HASH_VALUE
------------- ------------ - - -------------------- ---------- ---------------
6fbvysnhkvugw            1 Y Y 15340826253708983785     1      3724264953
6fbvysnhkvugw            2 Y Y 15340826253708983785     2       497086120
6fbvysnhkvugw            4 Y Y 15340826253708983785     1       497086120
6fbvysnhkvugw            6 Y Y 15340826253708983785     1      3724264953


BIND_EQUIV_FAILURE – Or when you will regret using Adaptive Cursor Sharing


Previously, when I was asked to define the Adaptive Cursor Sharing (ACS) feature, I often used the following definition: "it represents an answer to the always threatening and challenging Oracle task of sharing cursors and optimizing SQL".

Time passed and I altered this definition a little to become: "it represents a short answer to the always threatening and challenging Oracle task of sharing cursors and optimizing SQL".

Time passed again and I ended up drastically altering my initial ACS definition to become: "in certain very plausible situations, it might represent a serious threat to your application, one you will be happy to disable provided you have enough experience to identify the link between ACS and your threat".

If you want to know what changed my mind about ACS, then follow this situation taken from a real-life running system, which I am going to summarize.

When you see something like this in the library cache (11.2.0.3.0)

SQL> select
        sql_id
       ,count(1)
     from
        v$sql
     where executions < 2
     group by sql_id
     having count(1) > 10
     order by 2 desc;

SQL_ID          COUNT(1)
------------- ----------
7zwq7z1nj7vga      44217

You start wondering what makes this sql_id have such a big count of different versions in memory.

After a few minutes of investigation you end up ruling out the bind variable hypothesis. And then you finish by asking yourself: what the heck is this sql_id?

Fortunately, Tanel Poder's nonshared script sheds some light on that:

SQL> @nonshared 7zwq7z1nj7vga
Show why existing SQL child cursors were not reused (V$SQL_SHARED_CURSOR)...

-----------------
SQL_ID               : 7zwq7z1nj7vga
ADDRESS              : 000000406DBB30F8
CHILD_ADDRESS        : 00000042CE36F7E8
CHILD_NUMBER         : 0
BIND_EQUIV_FAILURE   : Y
REASON               :<ChildNode><ChildNumber>0</ChildNumber><ID>40</ID>
                      <reason>Bindmismatch(33)</reason><size>2x4</size>
                      <init_ranges_in_first_pass>0</init_ranges_in_first_pass>
                       <selectivity>1097868685</selectivity>
                      </ChildNode>

-----------------
SQL_ID               : 7zwq7z1nj7vga
ADDRESS              : 000000406DBB30F8
CHILD_ADDRESS        : 00000045B5C5E478
CHILD_NUMBER         : 1
BIND_EQUIV_FAILURE   : Y
REASON               :<ChildNode><ChildNumber>1</ChildNumber><ID>40</ID>
                      <reason>Bindmismatch(33)</reason><size>2x4</size>
                      <init_ranges_in_first_pass>0</init_ranges_in_first_pass>
                      <selectivity>915662630</selectivity>
                      </ChildNode>
-----------------
SQL_ID               : 7zwq7z1nj7vga
ADDRESS              : 000000406DBB30F8
CHILD_ADDRESS        : 00000038841E2868
CHILD_NUMBER         : 2
BIND_EQUIV_FAILURE   : Y
REASON               :<ChildNode><ChildNumber>2</ChildNumber><ID>40</ID>
                      <reason>Bindmismatch(33)</reason><size>2x4</size>
                      <init_ranges_in_first_pass>0</init_ranges_in_first_pass>
                      <selectivity>163647208</selectivity>
                      </ChildNode>
-----------------
SQL_ID               : 7zwq7z1nj7vga
ADDRESS              : 000000406DBB30F8
CHILD_ADDRESS        : 00000038841E2708
CHILD_NUMBER         : 3
BIND_EQUIV_FAILURE   : Y
REASON               :<ChildNode><ChildNumber>3</ChildNumber><ID>40</ID>
                      <reason>Bindmismatch(33)</reason><size>2x4</size>
                      <init_ranges_in_first_pass>0</init_ranges_in_first_pass>
                      <selectivity>4075662961</selectivity>
                      </ChildNode>

…/…

-----------------
SQL_ID               : 7zwq7z1nj7vga
ADDRESS              : 000000406DBB30F8
CHILD_ADDRESS        : 00000045B5C5E5D8
CHILD_NUMBER         : 99
BIND_EQUIV_FAILURE   : Y
REASON               :<ChildNode><ChildNumber>99</ChildNumber><ID>40</ID>
                      <reason>Bindmismatch(33)</reason><size>2x4</size>
                      <init_ranges_in_first_pass>0</init_ranges_in_first_pass>
                      <selectivity>3246589452</selectivity>
                      </ChildNode>

Moreover, a direct select against the v$sql_shared_cursor view shows this:

SQL> select
       count(1)
     from
         v$sql_shared_cursor
     where
        sql_id = '7zwq7z1nj7vga';

  COUNT(1)
----------
     45125

SQL> select
        count(1)
     from
       v$sql_shared_cursor
     where
        sql_id = '7zwq7z1nj7vga'
     and BIND_EQUIV_FAILURE = 'Y';

  COUNT(1)
----------
     45121

Hmmm… a huge count of non-shared child cursors due to BIND_EQUIV_FAILURE.

The official Oracle documentation about BIND_EQUIV_FAILURE says: the bind value's selectivity does not match that used to optimize the existing child cursor. This definition, together with the selectivity XML tag mentioned above, gave me a first clue: Adaptive Cursor Sharing (in this case Extended Cursor Sharing).

SQL> select
       count(1)
    from
        v$sql_cs_selectivity
    where
      sql_id = '7zwq7z1nj7vga';

  COUNT(1)
----------
  16,847,320

That is an impressive number of records in this dynamic view. For a single sql_id we have about 17 million rows in this ACS monitoring view!!! This dramatically alters the execution time of the underlying sql_id query.

If you don't know what the v$sql_cs_selectivity view stands for, here is how it works:

Once a cursor becomes bind aware, each time it is executed the Extended Cursor Sharing layer code peeks at the bind variable values (and in this particular case there are 9 bind variables) and checks, behind the scenes, against the v$sql_cs_selectivity view whether any existing child cursor already covers the selectivity of the peeked bind variables. If such a child cursor is found, it is shared. If not, a new child cursor is optimized and inserted into v$sql_cs_selectivity with a new range of bind variable selectivity.
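
The selectivity ranges the ECS layer works with are visible directly in that view. A minimal sketch, using the documented columns of v$sql_cs_selectivity:

SQL> select child_number, predicate, range_id, low, high
       from v$sql_cs_selectivity
      where sql_id = '7zwq7z1nj7vga'
        and rownum <= 20;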

In this particular case, each time the Extended Cursor Sharing layer code fails to find a child cursor in v$sql_cs_selectivity with an adequate range of selectivity (BIND_EQUIV_FAILURE), it compiles a new execution plan, ending up dramatically filling the v$sql view with multiple "optimal" plans.

We have been asked to use ACS to answer the need for sharing cursors and optimizing SQL. In this particular and very plausible case, we end up with neither.
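
When such a pathological case is identified, the workaround alluded to in my third definition above is to disable ACS. The two hidden parameters below are the ones commonly cited for that purpose; take this as a sketch to be validated with Oracle Support before touching a real system:

SQL> -- commonly cited hidden parameters to switch off ACS/ECS (assumption:
SQL> -- names and values to be confirmed with Oracle Support for your version)
SQL> alter session set "_optimizer_adaptive_cursor_sharing" = false;
SQL> alter session set "_optimizer_extended_cursor_sharing_rel" = 'NONE';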

A few extra words about this running system case:

  • The query uses 9 bind variables in 9 different predicates
  • The columns against which the bind variables are compared have histograms (frequency and height-balanced) collected on them
  • The query is a simple select on a single heap table
  • The table has 4 indexes (which produce 4 distinct execution plans among all those child cursors)

 


The dark side of using bind variables : sharing everything


An interesting academic situation happened last week which, I honestly believe, is worth a blog article, since experienced DBAs had spent time trying to solve it without success. An overnight job had been running for hours in the night from 01/04 to 02/04. The on-call DBA spent the whole night killing and re-launching the job (sql_id) several times without any success. When I arrived at work the next day I was asked to help. Since the job was still running, I generated the Real Time SQL Monitoring (RTSM) report for the corresponding sql_id, which showed the classical NESTED LOOP with a huge outer data row set driving an inner data set in which at least 50 different operations had been started 519K times while one operation had been executed 2M times. The corresponding execution plan contains 213 operations. The underlying query uses 628 user bind variables and 48 system-generated bind variables (thanks to cursor sharing set to FORCE).
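
For reference, RTSM reports like the ones shown in this post can be produced with dbms_sqltune.report_sql_monitor; a minimal sketch for the sql_id at hand:

SQL> set long 1000000 longchunksize 1000000 pagesize 0 linesize 250
SQL> select dbms_sqltune.report_sql_monitor(
              sql_id       => 'dmh5vhkcm877v',
              type         => 'TEXT',
              report_level => 'ALL') report
       from dual;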

SQL Plan Monitoring Details (Plan Hash Value=1511784243)

Global Information
------------------------------
 Status              :  EXECUTING               
 Instance ID         :  2                       
 Session             :  xxxxx (350:9211)   
 SQL ID              :  dmh5vhkcm877v           
 SQL Execution ID    :  33554436                
 Execution Started   :  04/02/2015 07:52:03     
 First Refresh Time  :  04/02/2015 07:52:47     
 Last Refresh Time   :  04/02/2015 10:04:28     
 Duration            :  7947s                   
 Module/Action       :  wwwww
 Service             :  zzzzz               
 Program             :  wwwww  
 DOP Downgrade       :  100%   

Global Stats
===================================================================================
| Elapsed |   Cpu   |    IO    | Application | Concurrency | Buffer | Read | Read  |
| Time(s) | Time(s) | Waits(s) |  Waits(s)   |  Waits(s)   |  Gets  | Reqs | Bytes |
====================================================================================
|    7900 |    7839 |       20 |        0.00 |        0.82 |   243M | 5946 | 659MB |
====================================================================================

The above 7,839 seconds spent consuming CPU with almost no user wait time are the classical signature of a wrong NESTED LOOP operation starting its inner operations a huge number of times, as mentioned above.

The job was running without any sign of improvement, the client was waiting for its critical report, and I had a query with 700 bind variables honored via an execution plan of 213 operations with which to figure out how to make this report finish smoothly as soon as possible.

I was dissecting the execution plan when the end user sent me an e-mail saying that the same job had run successfully the day before within 6 minutes. With that information in mind I managed to get the RTSM report of the previous day's successful run. The first key piece of information was that yesterday's query and today's never-ending one used the same plan_hash_value (same execution plan). Comparing the 628 input bind variable values of both runs, I found that yesterday's job ran for a one-month period (monthly job) while the current job was running for a one-day interval (daily job). Of course, the end user had not supplied any information about the kind of job they were currently running compared to the previous one. All I had been told was that yesterday's job completed in 6 minutes. It was only when I found the difference in the input bind variable values that the end user said "the current run is for the daily job while the previous one was for the monthly job".

And the sun started rising. I was able to figure out that the two sets of bind variables were not doing the same amount of work, so sharing the same execution plan was probably not a good idea. This is why I suggested that the DBA do the following:

  • Kill the not ending session
  • Purge the sql_id from the shared pool
  • Ask the end user to re-launch the job
  • Cross fingers :-)
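
For the purge step, the classical route is dbms_shared_pool.purge fed with the cursor address and hash value taken from v$sqlarea; a minimal sketch (depending on the version, dbms_shared_pool may first need to be installed via dbmspool.sql):

SQL> declare
       l_name varchar2(100);
     begin
       -- build the 'address,hash_value' string expected by purge
       select address || ',' || hash_value
         into l_name
         from v$sqlarea
        where sql_id = 'dmh5vhkcm877v';
       -- 'C' flags the object as a cursor
       dbms_shared_pool.purge(l_name, 'C');
     end;
     /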

And you know what? The job completed within a couple of hundred seconds:

SQL Plan Monitoring Details (Plan Hash Value=2729107228)

Global Information
------------------------------
 Status              :  DONE (ALL ROWS)         
 Instance ID         :  2                       
 Session             :  xxxxx (1063:62091) 
 SQL ID              :  dmh5vhkcm877v           
 SQL Execution ID    :  33554437                
 Execution Started   :  04/02/2015 10:43:17     
 First Refresh Time  :  04/02/2015 10:43:20     
 Last Refresh Time   :  04/02/2015 10:47:38     
 Duration            :  261s                    
 Module/Action       :  wwwww
 Service             :  zzzzz
 Program             :  wwwww
 Fetch Calls         :  57790    

Global Stats
==============================================================================
| Elapsed |   Cpu   |    IO    | Concurrency | Fetch | Buffer | Read | Read  |
| Time(s) | Time(s) | Waits(s) |  Waits(s)   | Calls |  Gets  | Reqs | Bytes |
==============================================================================
|     134 |     107 |     7.10 |          17 | 57790 |    18M | 7857 | 402MB |
==============================================================================

This is the dark side of using bind variables: when sharing resources we also share execution plans. The current daily job was running with the plan optimized for the monthly job. The solution was to force the CBO to compile a new execution plan for the new input bind variables. The new plan (2729107228) still shows 200 operations and several operations started 578K times. I intend to study both execution plans to know exactly where the enhancement comes from. The clue here might be that the first, shared, monthly execution plan ran serially for a reason I am unable to figure out:

 DOP Downgrade       :  100%   

while the newly hard-parsed execution has been run in parallel:

 Parallel Execution Details (DOP=4 , Servers Allocated=20)

Bottom line: when you intend to run a critical report once per day (and once per month), it is worth letting the CBO compile a new execution plan for each execution. All you will pay is one hard parse per execution, which will never hurt from a memory and CPU point of view.


Parallel refreshing a materialized view


I have been asked to troubleshoot a monthly on-demand materialized view refresh job which had the bad idea to crash with the ORA-01555 error after 25,833 seconds (more than 7 hours) of execution. Despite my several years of professional experience this is the first time I have been asked to look at a materialized view refresh. The issue came up on a Friday afternoon, so I was given a week-end to familiarize myself with materialized views. Coincidentally, a couple of days before there had been an Oracle webcast on materialized view basics, architecture and internal workings, which I replayed on Saturday, practicing its demo along the way. Christian Antognini's book contains a chapter on this topic which I also went through, since Christian's book is where I always like to start when trying to learn an Oracle concept.

Materialized view capabilities

The following Monday morning, armed with this week-end of accelerated self-training, I opened again the e-mail I had been sent about the failing refresh job and started re-reading it. The first thing that caught my attention this time, in contrast to my quick pass through it the previous Friday, was a suggestion made by the DBA to try fast refreshing the materialized view instead of completely refreshing it. I learnt from the Oracle webcast that Oracle is able to tell us whether a materialized view can be fast (also known as incrementally) refreshed or not. Here are the steps to follow if you want to get this information:

You first need to create the mv_capabilities_table table (in the schema from which you are going to run the dbms_mview package) using the following script:

SQL> $ORACLE_HOME/rdbms/admin/utlxmv.sql

SQL> select * from mv_capabilities_table;
no rows selected

Once this table is created you can execute the dbms_mview.explain_mview procedure as shown below:

SQL> exec dbms_mview.explain_mview ('my_materialied_mv');

PL/SQL procedure successfully completed.

SQL> select
  2     mvname
  3    ,capability_name
  4    ,possible
  5  from
  6    mv_capabilities_table
  7  where
  8     mvname = 'MY_MATERIALIED_MV'
  9  and
 10    capability_name  like '%REFRESH%';

MVNAME                         CAPABILITY_NAME                P
------------------------------ ------------------------------ -
MY_MATERIALIED_MV              REFRESH_COMPLETE               Y  
MY_MATERIALIED_MV              REFRESH_FAST                   N --> spot this
MY_MATERIALIED_MV              REFRESH_FAST_AFTER_INSERT      N
MY_MATERIALIED_MV              REFRESH_FAST_AFTER_INSERT      N
MY_MATERIALIED_MV              REFRESH_FAST_AFTER_INSERT      N
MY_MATERIALIED_MV              REFRESH_FAST_AFTER_INSERT      N
MY_MATERIALIED_MV              REFRESH_FAST_AFTER_ONETAB_DML  N
MY_MATERIALIED_MV              REFRESH_FAST_AFTER_ANY_DML     N
MY_MATERIALIED_MV              REFRESH_FAST_PCT               N

As spotted above, fast refreshing this materialized view is impossible.

The first learned lesson: instead of trying to create a materialized view log and fast refresh a complex materialized view which might be impossible to refresh incrementally, first get the capabilities of the view using the explain_mview procedure. You will certainly save time and resources.

SQL> SELECT
         refresh_method
       , refresh_mode
       , staleness
       , last_refresh_type
       , last_refresh_date
    FROM
          user_mviews
    WHERE mview_name = 'MY_MATERIALIED_MV';

REFRESH_ REFRES STALENESS           LAST_REF LAST_REFRES
-------- ------ ------------------- -------- --------------------
COMPLETE DEMAND NEEDS_COMPILE       COMPLETE 02-APR-2015 16:16:35

Parallel clause in the SQL create statement: any effect on the mview creation?

Since I had ruled out an incremental refresh, I decided to get the materialized view definition so that I could investigate its content:

SQL> SELECT
       replace (dbms_metadata.get_ddl(replace(
                                      OBJECT_TYPE, ' ', '_'),    
                                      OBJECT_NAME,OWNER)
                    ,'q#"#'
                    ,'q#''#'
                    )
     FROM DBA_OBJECTS
     WHERE OBJECT_TYPE = 'MATERIALIZED VIEW'
     AND object_name = 'MY_MATERIALIED_MV';

------------------------------------------------------------------
CREATE MATERIALIZED VIEW MY_MATERIALIED_MV
   ({list of columns}) 
  TABLESPACE xxxx
  PARALLEL 16 –----------------------------------> spot this
  BUILD IMMEDIATE
  USING INDEX
  REFRESH COMPLETE ON DEMAND
  USING DEFAULT LOCAL ROLLBACK SEGMENT
  USING ENFORCED CONSTRAINTS DISABLE QUERY REWRITE
AS
-- select n°1
 SELECT
    {list of columns}
 FROM
  {list of tables}
 WHERE
  {list of predicates}
 GROUP BY
  {list of columns}
.../...
UNION ALL
-- select n°5
SELECT
    {list of columns}
 FROM
  {list of tables}
 WHERE
  {list of predicates}
GROUP BY
  {list of columns} ;

Have you noticed that parallel 16 clause in the materialized view create script? The developer's intention was to create the materialized view using parallel processes. Having a Production-equivalent database at hand, I was happy to try re-creating this materialized view:

SQL> set timing on

SQL> start ddl_mv1.sql

Materialized view created.

Elapsed: 00:22:33.52

Global Information
------------------------------
 Status              :  DONE               
 Instance ID         :  1                  
 Session             :  XZYY (901:25027)  
 SQL ID              :  f9s6kdyysz84m      
 SQL Execution ID    :  16777216           
 Execution Started   :  04/16/2015 09:49:22
 First Refresh Time  :  04/16/2015 09:49:23
 Last Refresh Time   :  04/16/2015 10:11:48
 Duration            :  1346s              
 Module/Action       :  SQL*Plus/-         
 Service             :  XZYY
 Program             :  sqlplus.exe        

Global Stats
========================================================================
| Elapsed |   Cpu   |    IO    | Buffer | Read | Read  | Write | Write |
| Time(s) | Time(s) | Waits(s) |  Gets  | Reqs | Bytes | Reqs  | Bytes |
========================================================================
|   20338 |    5462 |    14205 |    63M |   3M | 716GB |    2M | 279GB |
========================================================================

Parallel Execution Details (DOP=16 , Servers Allocated=32)

SQL Plan Monitoring Details (Plan Hash Value=853136481)
==================================================================================================
| Id  |                       Operation            | Name    |  Rows   | Execs |   Rows   |Temp  |
|     |                                            |         | (Estim) |       | (Actual) |(Max) |
==================================================================================================
|   0 | CREATE TABLE STATEMENT                     |         |         |    33 |       16 |      |
|   1 |   PX COORDINATOR                           |         |         |    33 |       16 |      |
|   2 |    PX SEND QC (RANDOM)                     | :TQ10036|         |    16 |       16 |      |
|   3 |     LOAD AS SELECT                         |         |         |    16 |       16 |      |
|   4 |      UNION-ALL                             |         |         |    16 |     117M |      |
|   5 |       HASH GROUP BY                        |         |    259M |    16 |      58M |  36G |
|   6 |        PX RECEIVE                          |         |    259M |    16 |     264M |      |
|   7 |         PX SEND HASH                       | :TQ10031|    259M |    16 |     264M |      |
|   8 |          HASH JOIN RIGHT OUTER BUFFERED    |         |    259M |    16 |     264M |  61G |
|   9 |           PX RECEIVE                       |         |      4M |    16 |       4M |      |
|  10 |            PX SEND HASH                    | :TQ10013|      4M |    16 |       4M |      |
|  11 |             PX BLOCK ITERATOR              |         |      4M |    16 |       4M |      |
|     |                                            |         |         |       |          |      |
| 180 |                PX RECEIVE                  |         |     19M |    16 |      20M |      |
| 181 |                 PX SEND HASH               | :TQ10012|     19M |    16 |      20M |      |
| 182 |                  PX BLOCK ITERATOR         |         |     19M |    16 |      20M |      |
| 183 |                   TABLE ACCESS FULL        | TABLE_M |     19M |   268 |      20M |      |
==================================================================================================

Surprisingly, the materialized view was created in less than 23 minutes. The creation has been parallelised with a DOP of 16, as shown by the corresponding Real Time SQL Monitoring (RTSM) report. The underlying table has consequently been created with a degree of 16, as shown below:

SQL> select
  2    table_name
  3   ,degree
  4  from
  5    user_tables
  6  where table_name = 'MY_MATERIALIED_MV';

TABLE_NAME                     DEGREE
------------------------------ ----------
MY_MATERIALIED_MV               16

A simple select against the created materialized view will go parallel as well:

SQL> select count(1) from MY_MATERIALIED_MV;              

SQL Plan Monitoring Details (Plan Hash Value=3672954679)
============================================================================================
| Id |          Operation          |           Name           |  Rows   | Execs |   Rows   |
|    |                             |                          | (Estim) |       | (Actual) |
============================================================================================
|  0 | SELECT STATEMENT            |                          |         |     1 |        1 |
|  1 |   SORT AGGREGATE            |                          |       1 |     1 |        1 |
|  2 |    PX COORDINATOR           |                          |         |    17 |       16 |
|  3 |     PX SEND QC (RANDOM)     | :TQ10000                 |       1 |    16 |       16 |
|  4 |      SORT AGGREGATE         |                          |       1 |    16 |       16 |
|  5 |       PX BLOCK ITERATOR     |                          |    104M |    16 |     117M |
|  6 |        MAT_VIEW ACCESS FULL | MY_MATERIALIED_MV        |    104M |   191 |     117M |
============================================================================================

You might have already noticed in the above RTSM report that the select part of the "create as select" statement has been parallelised as well. It is as if the parallel 16 clause of the "create" part of the materialized view SQL script implicitly induced its "select" part to run in parallel with a DOP of 16.

Parallel clause in the SQL create statement: any effect on the mview refresh?

As far as I am concerned, however, the problem I have been asked to troubleshoot resides in refreshing the materialized view, not in creating it. Since the materialized view was created in 23 minutes, I should be optimistic about its refresh time, shouldn't I?

SQL> exec dbms_mview.refresh ('MY_MATERIALIED_MV','C',atomic_refresh=>FALSE);

After more than 4,200 seconds of execution time I finally gave up and decided to stop this refresh. Below is an overview of its corresponding Real Time Sql Monitoring (RTSM) report:

Global Information
------------------------------
 Status              :  DONE (ERROR) --> I have cancelled it after more than 1 hour   
 Instance ID         :  1                  
 Session             :  XZYY (901:25027)  
 SQL ID              :  d5n03tuht2cg8      
 SQL Execution ID    :  16777216           
 Execution Started   :  04/16/2015 10:55:46
 First Refresh Time  :  04/16/2015 10:55:52
 Last Refresh Time   :  04/16/2015 12:06:39
 Duration            :  4253s               
 Module/Action       :  SQL*Plus/-         
 Service             :  XZYY
 Program             :  sqlplus.exe        

Global Stats
===================================================================================
| Elapsed |   Cpu   |    IO    |  Other   | Buffer | Read | Read  | Write | Write |
| Time(s) | Time(s) | Waits(s) | Waits(s) |  Gets  | Reqs | Bytes | Reqs  | Bytes |
===================================================================================
|    4253 |    1640 |     2563 |       50 |    53M | 824K | 227GB |  570K | 120GB |
===================================================================================

SQL Plan Monitoring Details (Plan Hash Value=998958099)
=============================================================================
| Id |                  Operation                   |Name |  Rows   | Cost  |
|    |                                              |     | (Estim) |       |
=============================================================================
|  0 | INSERT STATEMENT                             |     |         |       |
|  1 |   LOAD AS SELECT                             |     |         |       |
|  2 |    UNION-ALL                                 |     |         |       |
|  3 |     HASH GROUP BY                            |     |    259M |       |
|  4 |      CONCATENATION                           |     |         |       |
|  5 |       NESTED LOOPS OUTER                     |     |       7 |  4523 |
|  6 |        NESTED LOOPS OUTER                    |     |       7 |  4495 |
|  7 |         NESTED LOOPS                         |     |       7 |  4474 |
|  8 |          NESTED LOOPS                        |     |       7 |  4460 |
|  9 |           PARTITION REFERENCE ALL            |     |       7 |  4439 |
…/…

In contrast to the creation process, the materialized view refresh has been done serially. This confirms that the above parallel 16 clause in the create DDL script concerns only the materialized view creation and not its refresh process.

The second learned lesson: a parallel clause specified in the create statement of a materialized view appears not to be used during the refresh of that same materialized view. In this kind of situation the parallel run is considered only at materialized view creation time.

dbms_mview.refresh and its parallelism parameter: any effect on the mview refresh?

The tables on which the materialized view is based all have a degree of 1:

 SQL> select
  2      table_name
  3    , degree
  4  from user_tables
  5  where trim(degree) <> '1';

TABLE_NAME            DEGREE
--------------------- -------
MY_MATERIALIED_MV     16

Having said that, what if I try refreshing this materialized view using the parallelism parameter of the dbms_mview.refresh procedure, as shown below:

SQL> exec dbms_mview.refresh ('MY_MATERIALIED_MV','C', atomic_refresh=>FALSE, parallelism =>16);

SQL Plan Monitoring Details (Plan Hash Value=998958099)
==========================================================================================
| Id |                  Operation                   |           Name           |  Rows   |
|    |                                              |                          | (Estim) |
==========================================================================================
|  0 | INSERT STATEMENT                             |                          |         |
|  1 |   LOAD AS SELECT                             |                          |         |
|  2 |    UNION-ALL                                 |                          |         |
|  3 |     HASH GROUP BY                            |                          |    259M |
|  4 |      CONCATENATION                           |                          |         |
|  5 |       NESTED LOOPS OUTER                     |                          |       7 |
|  6 |        NESTED LOOPS OUTER                    |                          |       7 |
|  7 |         NESTED LOOPS                         |                          |       7 |
|  8 |          NESTED LOOPS                        |                          |       7 |
|  9 |           PARTITION REFERENCE ALL            |                          |       7 |
| 10 |            TABLE ACCESS BY LOCAL INDEX ROWID | TABLE_XX_ZZ              |       7 |
../..
| 94 |           PARTITION RANGE ALL                |                          |    369M |
| 95 |            PARTITION LIST ALL                |                          |    369M |
| 96 |             TABLE ACCESS FULL                | TABLE_AA_BB_123          |    369M |
==========================================================================================

As confirmed by the above corresponding RTSM report, the parallelism parameter has not been obeyed and the refresh has been done serially in this case as well.

The third learned lesson: using the parallelism parameter of the dbms_mview.refresh procedure has no effect on the parallel refresh of the underlying materialized view.

Adding a parallel hint in the select part of the mview: any effect on the mview refresh?

At this stage of the troubleshooting process I had established the following points:

  • The parallel clause used in the create statement of a materialized view is considered only during the materialized view creation. This parallel clause is ignored during the refresh process
  • The parallelism parameter of the dbms_mview.refresh procedure will not refresh the materialized view in parallel

Having ruled out all the above options, I was almost convinced that to expedite the refresh process I needed to add a parallel hint directly in the materialized view definition (ddl_mv2.sql):

CREATE MATERIALIZED VIEW MY_MATERIALIED_MV
   ({list of columns}) 
  TABLESPACE xxxx
  PARALLEL 16
  BUILD IMMEDIATE
  USING INDEX
  REFRESH COMPLETE ON DEMAND
  USING DEFAULT LOCAL ROLLBACK SEGMENT
  USING ENFORCED CONSTRAINTS DISABLE QUERY REWRITE
AS
 SELECT /*+ parallel(8) pq_distribute(tab1 hash hash)*/
    {list of columns}
 FROM
  {list of tables}
 WHERE
  {list of predicates}
 GROUP BY
  {list of columns}
UNION ALL
 SELECT /*+ parallel(8) pq_distribute(tab1 hash hash)*/
    {list of columns}
 FROM
  {list of tables}
 WHERE
  {list of predicates}
 GROUP BY
    {list of columns}
;

Having changed the select part of the materialized view DDL script, I launched its creation again, which completed in 25 minutes as shown below:

SQL> start ddl_mv2.sql
Materialized view created.
Elapsed: 00:25:05.37

And immediately after the creation I launched the refresh process:

SQL> exec dbms_mview.refresh ('MY_MATERIALIED_MV','C',atomic_refresh=>FALSE);

PL/SQL procedure successfully completed.
Elapsed: 00:26:11.12

And happily, this time the refresh completed in 26 minutes thanks to the parallel run exposed below in the corresponding RTSM report:

Global Information
------------------------------
 Status              :  DONE               
 Instance ID         :  1                  
 Session             :  XZYY
 SQL ID              :  1w1v742mr35g3      
 SQL Execution ID    :  16777216           
 Execution Started   :  04/16/2015 13:38:13
 First Refresh Time  :  04/16/2015 13:38:13
 Last Refresh Time   :  04/16/2015 14:04:24
 Duration            :  1571s              
 Module/Action       :  SQL*Plus/-         
 Service             :  XZYY            
 Program             :  sqlplus.exe        

Parallel Execution Details (DOP=8, Servers Allocated=80)

SQL Plan Monitoring Details (Plan Hash Value=758751629)
===============================================================================
| Id  |                       Operation          |           Name   |  Rows   |
|     |                                          |                  | (Estim) |
===============================================================================
|   0 | INSERT STATEMENT                         |                  |         |
|   1 |   LOAD AS SELECT                         |                  |         |
|   2 |    UNION-ALL                             |                  |         |
|   3 |     PX COORDINATOR                       |                  |         |
|   4 |      PX SEND QC (RANDOM)                 | :TQ10005         |    259M |
|   5 |       HASH GROUP BY                      |                  |    259M |
| 177 |                PX RECEIVE                |                  |     19M |
| 178 |                 PX SEND HASH             | :TQ50004         |     19M |
| 179 |                  PX BLOCK ITERATOR       |                  |     19M |
| 180 |                   TABLE ACCESS FULL      | TABLE_KZ_YX      |     19M |
===============================================================================

I added the pq_distribute(tab1 hash hash) hint above because several refresh attempts had crashed when the broadcast distribution ended up over-consuming TEMP space, raising the now classical error:

ERROR at line 484:
ORA-12801: error signaled in parallel query server P012
ORA-01652: unable to extend temp segment by 128 in tablespace TEMP

The fourth learned lesson: if you want to parallelise your materialized view refresh process, you had better include the parallel hint in the select part of the materialized view. This is better than changing the parallel degree of the tables on which the materialized view is based.
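
Another option I have seen suggested, although not tested in this particular case, is to force parallelism at the session level just before the refresh call, which leaves the materialized view definition untouched; a sketch under that assumption:

SQL> -- assumption: session-level forced parallelism applies to the
SQL> -- insert/select issued by the complete refresh
SQL> alter session force parallel query parallel 16;
SQL> alter session force parallel dml parallel 16;
SQL> exec dbms_mview.refresh('MY_MATERIALIED_MV','C', atomic_refresh=>FALSE);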


Real time SQL monitoring limitation


I was trying to explain a performance deterioration of a very complex query honoured via an execution plan with 386 operations (386 lines). Where would someone start deciphering such a complex and big execution plan without the help of a Real Time SQL Monitoring report? Since this query took 2 hours to complete, it is fairly likely that Oracle has monitored it. Unfortunately, a select against the v$sql_monitor view didn't return any rows for this particular sql_id. What came to my mind in front of this situation was that the report had been flushed from memory due to stress on the library cache. Fortunately, I was able to get the bind variables and re-execute the same query. While the query was running I opened a sqlplus window and ran this:

SQL> select sql_id from v$sql_monitor where status = 'EXECUTING';
no rows selected

The query was still running after a couple of minutes but was still not monitored. I suspected the number of operations in the execution plan but had no way to prove the correlation between this number of lines and the absence of monitoring.

Plan hash value: 1504525856
----------------------------------------------------------------------
| Id  | Operation                                                    |
----------------------------------------------------------------------
|   0 | SELECT STATEMENT                                             |
|   1 |  UNION-ALL                                                   |
|   2 |   SORT UNIQUE                                                |
|   3 |    MERGE JOIN CARTESIAN                                      |
|   4 |     MERGE JOIN CARTESIAN                                     |
|   5 |      NESTED LOOPS                                            |
|*  6 |       HASH JOIN OUTER                                        |
|   7 |        MERGE JOIN CARTESIAN                                  |
|   8 |         NESTED LOOPS OUTER                                   |
|*  9 |          HASH JOIN OUTER                                     |
|* 10 |           HASH JOIN OUTER                                    |
|* 11 |            HASH JOIN OUTER                                   |
|* 12 |             HASH JOIN OUTER                                  |
|* 13 |              HASH JOIN OUTER                                 |
.../...
|*202 |                                               HASH JOIN      |
| 203 | OUTER                                          NESTED LOOPS  |
| 204 |  OUTER                                          NESTED LOOPS |
|*205 |                                                  HASH JOIN   |
|*206 | OUTER                                             HASH JOIN  |
.../...
| 383 |          BUFFER SORT                                         |
| 384 |           PX RECEIVE                                         |
| 385 |            PX SEND BROADCAST                                 |
| 386 |             TABLE ACCESS FULL                                |
----------------------------------------------------------------------

Spot in passing how the OUTER keyword has been displaced onto the following lines in the formatted plan (see operations 203 to 206, for example).

Google being a good friend, I asked it and it directed me to an article where Doug Burns pointed out that there is a hidden parameter (_sqlmon_max_planlines) which fixes the maximum number of lines an execution plan must not exceed in order to be, all other things being equal, monitored.
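
For the curious, the current value of this hidden parameter (whose default is commonly reported as 300 lines) can be checked when connected as SYS with the classical x$ksppi/x$ksppcv query:

SQL> -- run as SYS: show the current value of _sqlmon_max_planlines
SQL> select i.ksppinm  parameter_name
           ,v.ksppstvl parameter_value
       from x$ksppi  i
           ,x$ksppcv v
      where i.indx = v.indx
        and i.ksppinm = '_sqlmon_max_planlines';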

I then decided to give it a try and altered this parameter so that my 386-operation plan could be monitored:

SQL> alter session set "_sqlmon_max_planlines" = 400;

Session altered.

And to my pleasant surprise I found that my query started being monitored:


SQL Monitoring Report

Global Information
------------------------------
 Status              :  EXECUTING
 Instance ID         :  1
 Session             :  xxxx (626:3043)
 SQL ID              :  315sc2w0cy05w
 SQL Execution ID    :  16777216
 Execution Started   :  04/29/2015 11:29:39
 First Refresh Time  :  04/29/2015 11:29:46
 Last Refresh Time   :  04/29/2015 11:35:26
 Duration            :  348s
 Module/Action       :  sqldeveloper64W.exe/-
 Service             :  SYS$USERS
 Program             :  sqldeveloper64W.exe

 SQL Plan Monitoring Details (Plan Hash Value=1504525856)
==============================================================================================
| Id    |                                     Operation                                      |
|       |                                                                                    |
==============================================================================================
|     0 | SELECT STATEMENT                                                                   |
|     1 |   UNION-ALL                                                                        |
|     2 |    SORT UNIQUE                                                                     |
|     3 |     MERGE JOIN CARTESIAN                                                           |
|     4 |      MERGE JOIN CARTESIAN                                                          |
|     5 |       NESTED LOOPS                                                                 |
|     6 |        HASH JOIN OUTER                                                             |
|     7 |         MERGE JOIN CARTESIAN                                                       |

|   383 |           BUFFER SORT                                                              |
|   384 |            PX RECEIVE                                                              |
|   385 |             PX SEND BROADCAST                                                      |
|   386 |              TABLE ACCESS FULL                                                     |
==============================================================================================

And now the serious stuff can start :-)


Index Efficiency


I used Jonathan Lewis' script to locate degenerated indexes, that is indexes occupying far more space than they should. Among those indexes I isolated this one:

16:20:33:TABLE1 - PK_TAB1
Current Leaf blocks: 2,846,555 Target size:1,585,492

According to this SQL script the above index possesses 2.8 million leaf blocks while it should normally occupy roughly half that number.
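
A quick way to cross-check the current figure is to look at the dictionary statistics of the index; a simple sketch against user_indexes:

SQL> select index_name
           ,blevel
           ,leaf_blocks
           ,num_rows
     from   user_indexes
     where  index_name = 'PK_TAB1';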

The sys_op_lbid function, when applied to this index, gives the following picture of index keys per leaf block:

ROWS_PER_BLOCK BLOCKS
-------------- ----------
 2              1
 7              1
 27             1
 32             1
 92             1
 94             1
 103            1
 107            1
 108            1
 111            1
 112            800
 113            1,627,529
……
 422            980,894
 423            40
 432            1
 434            1
 448            5496
 449            32803
 450            7
 456            3
 458            1
 466            1
 478            54
 479            200
 487            1
 ----------    -----------
sum             2,979,747

Spot that odd figure of 1.6 million leaf blocks (out of a total of 2.9 million) each holding only 113 index keys. Add to this the other 980,894 leaf blocks holding only 422 keys apiece and you end up visiting approximately the whole set of index leaf blocks to retrieve only a few hundred index keys per block. That is a completely degenerated index.

Let's then rebuild it and check whether we get back the predicted space:

SQL> alter index PK_TAB1 rebuild parallel 8;

SQL> alter index PK_TAB1 noparallel;

SQL> break on report skip 1

SQL> compute sum of blocks on report

SQL> select
        rows_per_block,
        count(*) blocks
     from
       (
        select
            /*+
              cursor_sharing_exact
              dynamic_sampling(0)
              no_monitoring
              no_expand
              index_ffs(t1,t1_i1)
              noparallel_index(t,t1_i1)
             */
        sys_op_lbid( &m_ind_id ,'L',t1.rowid) as block_id,
        count(*) as rows_per_block
      from
        TABLE1 t1
      where
        tab_id is not null
      group by
       sys_op_lbid( &m_ind_id ,'L',t1.rowid)
     )
   group by rows_per_block
   order by rows_per_block
   ;
Enter value for m_ind_id: 53213
Enter value for m_ind_id: 53213

ROWS_PER_BLOCK BLOCKS
-------------- ----------
 26            1
 206           1
 208           1
 243           1
 249           1
 272           1
 316           1
 339           1
 422           1,558,800
 423           53
 432           1
 448           5496
 449           32803
 458           1
 478           54
 479           200
 487           1
 ----------------------
sum          1,597,417

Notice the new number of index leaf blocks we got after rebuilding the index (1,597,417) and compare it with the number predicted by Jonathan Lewis' script (1,585,492): the initial estimate is almost 100% accurate. In passing, the index size has been reduced by about 46%.

While rebuilding the index has drastically reduced the number of leaf blocks and the disk space they occupy, that odd pattern of 1,558,800 leaf blocks each holding only 422 index keys is still present. This prompted me to try coalescing the index, even though I was not very confident that such a large number of leaf blocks could be merged with adjacent ones to make the index more compact.

SQL> alter index PK_TAB1 coalesce;

ROWS_PER_BLOCK    BLOCKS
-------------- ----------
           26          1
          206          1
          208          1
          243          1
          249          1
          272          1
          316          1
          339          1
          422          1,558,800
          423          53
          432          1
          448          5496
          449          32803
          458          1
          478          54
          479          200
          487          1
              -------------
sum              1,597,417

This primary key index definitely has a strange fill pattern, which I will have to investigate with the Java developers.

The bottom line of this article is that Jonathan Lewis' script for locating degenerated indexes is remarkably precise.


Extended Statistics Part I : histogram effect


Extended statistics, also known as column group extensions, are one of the important statistics improvements introduced with Oracle 11g. While the Oracle Cost Based Optimizer is able to get a correct single-column selectivity estimate, it is unable to figure out the cardinality of a conjunction of two or more correlated columns present in a query predicate. A column group extension created on this conjunction of columns aims to describe the column correlation to the CBO so that it can produce an accurate estimate. But there are cases where the CBO refuses to use a column group extension. This article shows one of those cases via a concrete example.

The scene

Below is the table and its unique index on which I am going to show you when the CBO will not use the column group extension:

create table t_ext_stat
  ( dvpk_id    number(10) not null
  , vpk_id     number(10) not null
  , layer_code varchar2(1 char) not null
  , dvpk_day   date not null
  , cre_date   date not null
  , cre_usr    varchar2(40 char) not null
  , mod_date   date not null
  , mod_usr    varchar2(40 char) not null
 );

create unique index t_ext_uk_i on t_ext_stat(vpk_id, layer_code, dvpk_day);

And this is the query I will be using all over the article

select
  count(1)
from
  t_ext_stat
where
  vpk_id = 63148
and
  layer_code = 'R';

 COUNT(1)
----------
 338

The two columns in the predicate, layer_code and vpk_id, are compared via equality predicates, which makes them candidates for a column group extension; but let's first see how skewed these two columns are, starting with layer_code.

The layer_code column has 4 distinct values with two popular ones, R (400,087 rows) and S (380,069 rows), a skew that can be captured via a frequency histogram.
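
The skew can be verified with a simple aggregate; a minimal sketch:

SQL> select layer_code
           ,count(1)
     from   t_ext_stat
     group  by layer_code
     order  by 2 desc;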

The vpk_id column does not present such a noticeable skewness in its data distribution. It has 4,947 distinct values, ranging from vpk_id 62866 with 1,456 occurrences down to vpk_id 62972 with a single occurrence:

SQL> select
       vpk_id
      ,count(1)
    from
      t_ext_stat
    group by
      vpk_id
    order by 2 desc;

    VPK_ID   COUNT(1)
---------- ----------
     62866       1456
     62953       1456
     63528       1456
     63526       1456
     63518       1456
     62947       1456
     62850       1456
     62849       1456
     62851       1456
     62954       1456
     64362       1452
     64538       1424
     64483       1358
….
     63207          1
     63021          1
     62972          1

4947 rows selected.

Extended Statistics and histogram

In order to create a column group extension we need to call dbms_stats.create_extended_stats as follows:

SQL> SELECT
         dbms_stats.create_extended_stats
         (ownname   => user
         ,tabname   => 't_ext_stat'
         ,extension =>'(vpk_id,layer_code)'
         )
    FROM dual;

which creates a virtual column (SYS_STUMVIRBZA6_$QWEX6DE2NGQA1) supporting the correlation between the two predicate columns.
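
If you want to list the extensions that already exist on the table, the user_stat_extensions dictionary view shows them; a minimal sketch:

SQL> select extension_name
           ,extension
     from   user_stat_extensions
     where  table_name = 'T_EXT_STAT';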

Next, I will gather statistics with histogram for all t_ext_stat table columns including the above newly created virtual one:

BEGIN

dbms_stats.gather_table_stats
 (user
 ,'t_ext_stat'
 ,method_opt => 'for all columns size auto'
 ,cascade => true
 ,no_invalidate => false
 );
END;
/

And let's check the collected column statistics:

SQL> SELECT
       column_name
      ,num_distinct
      ,density
      ,histogram
    FROM
       user_tab_col_statistics
    WHERE
       table_name = 'T_EXT_STAT'
    AND
      column_name in ('VPK_ID','LAYER_CODE','SYS_STUMVIRBZA6_$QWEX6DE2NGQA1');

COLUMN_NAME                    NUM_DISTINCT    DENSITY HISTOGRAM
------------------------------ ------------ ---------- ---------------
SYS_STUMVIRBZA6_$QWEX6DE2NGQA1         4967 .000201329  NONE
LAYER_CODE                                4  6.2471E-07 FREQUENCY
VPK_ID                                 2862 .000349406  NONE

As expected, a skew has been identified on the layer_code column and a frequency histogram has therefore been gathered on it. There are nevertheless two remarks worth mentioning:

  • Since one of the columns forming the extension has a histogram, why has the extension itself not been identified as a skewed column as well?
  • What happens in this particular case, where there is no histogram on the extension but there is one on one of the columns forming it?

It is easy to answer the first question by looking directly at the column group data distribution: the extension does not present any noticeable skew. In fact the extension has 10,078 distinct values, where the most popular value appears 728 times while the least popular appears only once:

SQL> select
        to_char(SYS_STUMVIRBZA6_$QWEX6DE2NGQA1) extension
       ,count(1)
     from
       t_ext_stat
     group by
       SYS_STUMVIRBZA6_$QWEX6DE2NGQA1
      order by 2 desc;

EXTENSION               COUNT(1)
--------------------- ----------
10113707817839868275         728
6437420856234749785          728
6264201076174478674          728
7804673458963442057          728
2433504440213765306          728
6976215179539283979          728
493591537539092624           728

6710977030485345437            1
18158393637293365880           1
5275318825200713603            1
13895660777899711317           1

This is a clear demonstration that a massive skew in one of the columns forming the extension does not necessarily translate into a skew in the resulting column group. This is particularly true when the other column has a large number of distinct values (> 254).

But why should one care about this absence of a histogram on the extension? Christian Antognini has already answered this question in an article where he wrote "be careful of extensions without histograms. They might be bypassed by the query optimizer". In fact, if one of the columns forming the extension has a histogram while the extension itself has none, the optimizer will not use the extension.

Here below is a demonstration of this claim taken from this current model:

select
   count(1)
from
   t_ext_stat
where vpk_id = 63148
and layer_code = 'R';

COUNT(1)
----------
338

SQL_ID  d26ra17afbfyh, child number 0
-------------------------------------
-------------------------------------------------------------------
| Id  | Operation         | Name       | Starts | E-Rows | A-Rows |
-------------------------------------------------------------------
|   0 | SELECT STATEMENT  |            |      1 |        |      1 |
|   1 |  SORT AGGREGATE   |            |      1 |      1 |      1 |
|*  2 |   INDEX RANGE SCAN| T_EXT_UK_I |      1 |    142 |    338 |
-------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("VPK_ID"=63148 AND "LAYER_CODE"='R')

How can we prove that Oracle didn't use the extension to compute the 142 estimated rows when accessing the index? By looking at the corresponding 10053 trace file:

Access path analysis for T_EXT_STAT
***************************************
SINGLE TABLE ACCESS PATH
  Single Table Cardinality Estimation for T_EXT_STAT[T_EXT_STAT]
SPD: Directive valid: dirid = 17542990197075222359, state = 5, flags = 1, loc = 1 {EC(98564)[2, 3]}
SPD: Return code in qosdDSDirSetup: EXISTS, estType = TABLE
Column (#2): VPK_ID(NUMBER)
  AvgLen: 5 NDV: 2862 Nulls: 0 Density: 0.000000 Min: 0.000000 Max: 62849.000000
Column (#3):
   NewDensity:0.002043, OldDensity:0.000001 BktCnt:5873.000000, PopBktCnt:5873.000000, PopValCnt:4, NDV:4

Column (#3): LAYER_CODE(VARCHAR2)
    AvgLen: 2 NDV: 4 Nulls: 0 Density: 0.000000
    Histogram: Freq  #Bkts: 4  UncompBkts: 5873  EndPtVals: 4  ActualVal: no

Column (#9): SYS_STUMVIRBZA6_$QWEX6DE2NGQA1(NUMBER)
    AvgLen: 12 NDV: 4967 Nulls: 0 Density: 0.000000 Min: 0.000000 Max: 1980066.000000
ColGroup (#2, Index) T_EXT_UK_I
    Col#: 2 3 4    CorStregth: -1.00
ColGroup (#1, VC) SYS_STUMVIRBZA6_$QWEX6DE2NGQA1
    Col#: 2 3    CorStregth: 2.30

ColGroup Usage:: PredCnt: 2  Matches Full:  Partial:
Table: T_EXT_STAT  Alias: T_EXT_STAT
Card: Original: 803809.000000  Rounded: 142  Computed: 141.74  Non Adjusted: 141.74

Had Oracle used the extension to compute the estimated rows, it would have applied the following formula and produced 162 rather than 142:

E-rows = num_rows(t_ext_stat) * 1/NDV(SYS_STUMVIRBZA6_$QWEX6DE2NGQA1)
E-rows = 803809 * 1/4967 = 161.829877 --> rounded to 162
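
For comparison, without the extension the CBO simply multiplies the individual column selectivities, i.e. 1/NDV for vpk_id and the histogram-derived selectivity of the popular value 'R' (roughly 0.5 here, an approximation on my part), which is what produces the Computed: 141.74 figure shown in the trace:

E-rows = num_rows(t_ext_stat) * 1/NDV(vpk_id) * sel(layer_code = 'R')
E-rows ~ 803809 * 1/2862 * 0.5 ~ 141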

Another clue showing that the optimizer didn't use the extension is visible in the same 10053 trace file via the following line:

ColGroup Usage:: PredCnt: 2  Matches Full:  Partial:

where the Matches Full and Partial fields are empty.
As mentioned in Christian's article, there is a fix control, addressing what seems to have been identified as a bug, which we can set to make Oracle use the extension:

SQL> alter session set "_fix_control"="6972291:ON";

SQL> alter session set events '10053 trace name context forever, level 1';

SQL> select
      count(1)
    from
     t_ext_stat
    where
      vpk_id = 63148
    and
      layer_code = 'R';

  COUNT(1)
----------
       338

SQL> alter session set events '10053 trace name context off';

Below are the corresponding execution plan (with a new estimate of 162) and the part of the 10053 trace file related to the extension:

============
Plan Table
============
---------------------------------------+-----------------------------------+
| Id  | Operation          | Name      | Rows  | Bytes | Cost  | Time      |
---------------------------------------+-----------------------------------+
| 0   | SELECT STATEMENT   |           |       |       |     3 |           |
| 1   |  SORT AGGREGATE    |           |     1 |     7 |       |           |
| 2   |   INDEX RANGE SCAN | T_EXT_UK_I|   162 |  1134 |     3 |  00:00:01 |
---------------------------------------+-----------------------------------+
Predicate Information:
----------------------
2 - access("VPK_ID"=63148 AND "LAYER_CODE"='R')

=====================================
Access path analysis for T_EXT_STAT
***************************************
SINGLE TABLE ACCESS PATH
  Single Table Cardinality Estimation for T_EXT_STAT[T_EXT_STAT]
SPD: Directive valid: dirid = 17542990197075222359, state = 5, flags = 1, loc = 1 {EC(98564)[2, 3]}
SPD: Return code in qosdDSDirSetup: EXISTS, estType = TABLE
  Column (#2): VPK_ID(NUMBER)
    AvgLen: 5 NDV: 2899 Nulls: 0 Density: 0.000000 Min: 0.000000 Max: 62849.000000
  Column (#3):
    NewDensity:0.001753, OldDensity:0.000001 BktCnt:6275.000000, PopBktCnt:6275.000000, PopValCnt:4, NDV:4
  Column (#3): LAYER_CODE(VARCHAR2)
    AvgLen: 2 NDV: 4 Nulls: 0 Density: 0.000000
    Histogram: Freq  #Bkts: 4  UncompBkts: 6275  EndPtVals: 4  ActualVal: no
  Column (#9): SYS_STUMVIRBZA6_$QWEX6DE2NGQA1(NUMBER)
    AvgLen: 12 NDV: 4985 Nulls: 0 Density: 0.000000 Min: 0.000000 Max: 1980066.000000
  ColGroup (#2, Index) T_EXT_UK_I
    Col#: 2 3 4    CorStregth: -1.00
  ColGroup (#1, VC) SYS_STUMVIRBZA6_$QWEX6DE2NGQA1
    Col#: 2 3    CorStregth: 2.33
  ColGroup Usage:: PredCnt: 2  Matches Full: #1  Partial:  Sel: 0.0002
  Table: T_EXT_STAT  Alias: T_EXT_STAT
    Card: Original: 806857.000000  Rounded: 162  Computed: 161.86  Non Adjusted: 161.86

Where we can notice that, this time, the CBO has used the extension to compute its cardinality estimate, since 162 comes from the following formula:

E-rows = num_rows(t_ext_stat) * 1/NDV(SYS_STUMVIRBZA6_$QWEX6DE2NGQA1)
E-rows = 806857 * 1/4985 = 161.856971 --> rounded to 162

But instead of setting the fix control I would rather delete the histogram from the layer_code column, so that neither the extension nor its underlying columns have a histogram:

SQL> exec dbms_stats.gather_table_stats(user ,'t_ext_stat', method_opt => 'for all columns size 1');

SQL> SELECT
       column_name
      ,num_distinct
      ,density
      ,histogram
    FROM
       user_tab_col_statistics
    WHERE
       table_name = 'T_EXT_STAT'
    AND
      column_name in ('VPK_ID','LAYER_CODE','SYS_STUMVIRBZA6_$QWEX6DE2NGQA1');

COLUMN_NAME                    NUM_DISTINCT    DENSITY HISTOGRAM
------------------------------ ------------ ---------- ----------
SYS_STUMVIRBZA6_$QWEX6DE2NGQA1         5238 .000190913 NONE
LAYER_CODE                                4        .25 NONE
VPK_ID                                 2982 .000335345 NONE

In which case the extension would be used as shown below:

-------------------------------------------------------------------
| Id  | Operation         | Name       | Starts | E-Rows | A-Rows |
-------------------------------------------------------------------
|   0 | SELECT STATEMENT  |            |      1 |        |      1 |
|   1 |  SORT AGGREGATE   |            |      1 |      1 |      1 |
|*  2 |   INDEX RANGE SCAN| T_EXT_UK_I |      1 |    154 |    338 |
-------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("VPK_ID"=63148 AND "LAYER_CODE"='R')

 Column (#2): VPK_ID(NUMBER)
    AvgLen: 5 NDV: 2982 Nulls: 0 Density: 0.000000 Min: 0.000000 Max: 62849.000000
  Column (#3): LAYER_CODE(VARCHAR2)
    AvgLen: 2 NDV: 4 Nulls: 0 Density: 0.000000
  Column (#9): SYS_STUMVIRBZA6_$QWEX6DE2NGQA1(NUMBER)
    AvgLen: 12 NDV: 5238 Nulls: 0 Density: 0.000000
  ColGroup (#2, Index) T_EXT_UK_I
    Col#: 2 3 4    CorStregth: -1.00
  ColGroup (#1, VC) SYS_STUMVIRBZA6_$QWEX6DE2NGQA1
    Col#: 2 3    CorStregth: 2.28
  ColGroup Usage:: PredCnt: 2  Matches Full: #1  Partial:  Sel: 0.0002
  Table: T_EXT_STAT  Alias: T_EXT_STAT
    Card: Original: 807515.000000  Rounded: 154  Computed: 154.16  Non Adjusted: 154.16

Where it is clearly shown that the extension has been used:

E-rows = num_rows(t_ext_stat) * 1/(NDV(SYS_STUMVIRBZA6_$QWEX6DE2NGQA1))
E-rows = 807515 * 1/(5238) = 154.164758 --> rounded to 154

Notice, by the way, that even though the extension has been used, the estimate is not as good as expected (154 instead of 338). An explanation of this discrepancy might come from the very weak correlation strength between layer_code and vpk_id (CorStregth: 2.30 in the trace), which will be considered in a separate article.
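
Another option, which I have not tested on this model and sketch here only as an idea, is to keep the layer_code histogram but gather a histogram on the extension itself so that it is no longer bypassed (assuming the column-group syntax of method_opt is available in your release):

BEGIN
 dbms_stats.gather_table_stats
  (user
  ,'t_ext_stat'
  ,method_opt => 'for columns (vpk_id,layer_code) size 254'
  ,cascade => true
  ,no_invalidate => false
  );
END;
/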

The bottom line of this article is: be careful about collecting histograms when you intend to use extended statistics. A skew in one of the columns of the combination does not imply a skew in the extension itself, and if that column ends up with a histogram while the extension has none, Oracle will bypass the extension.


SUBQ INTO VIEW FOR COMPLEX UNNEST


If you are a regular reader of Jonathan Lewis' blog you will probably have come across an article in which he explains why an "OR subquery" prevents the optimizer from unnesting the subquery and merging it with its parent query for a possibly better join path. When unnesting is impossible the "OR subquery" is executed as a FILTER operation which, when applied to a huge row source, dramatically penalizes the performance of the whole query. In the same article you will hopefully also have learned how, by rewriting the query using a UNION ALL (and taking care of the always threatening NULL via the LNNVL() function), you can open a new path for the CBO allowing the subquery to be unnested.

Unfortunately, third-party software where changing the SQL code is not possible is nowadays widespread, so I hoped the optimizer would be able to automatically refactor a disjunctive subquery and consider unnesting it along the lines of the UNION ALL workaround.

I was under the impression that the optimizer never fulfilled this hope, until last week when I received an e-mail from my friend Ahmed Aangour showing a particular disjunctive subquery that had been unnested by the optimizer without any rewrite of the original query by the developer. I found the case interesting enough to model it and share it with you. First, take a look at the query and its execution plan under 11.2.0.2 (the table creation script is supplied at the end of the article):

SQL> alter session set statistics_level=all;

SQL> alter session set optimizer_features_enable='11.2.0.2';

SQL> select
         a.id1
        ,a.n1
        ,a.start_date
     from t1 a
     where (a.id1 in (select b.id
                      from   t2 b
                      where  b.status = 'COM')
            or
            a.id1 in (select c.id1
                      from   t2 c
                      where  c.status = 'ERR')
           );

SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));

-------------------------------------------------------------------------------------
| Id  | Operation          | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |      1 |        |   9890 |00:00:02.23 |     742K|<--
|*  1 |  FILTER            |      |      1 |        |   9890 |00:00:02.23 |     742K|
|   2 |   TABLE ACCESS FULL| T1   |      1 |  10000 |  10000 |00:00:00.01 |    1686 |
|*  3 |   TABLE ACCESS FULL| T2   |  10000 |      1 |   9890 |00:00:02.16 |     725K|
|*  4 |   TABLE ACCESS FULL| T2   |    110 |      1 |      0 |00:00:00.05 |   15400 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
 1 - filter(( IS NOT NULL OR IS NOT NULL))
 3 - filter(("B"."ID"=:B1 AND "B"."STATUS"='COM'))
 4 - filter(("C"."ID1"=:B1 AND "C"."STATUS"='ERR'))

The double full scan of table t2 plus the FILTER operation clearly indicate that the OR subqueries have not been combined with the parent query. If you want to know what is behind filter predicate n°1 above, the "not so famous" explain plan for command will help in this case:

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |   975 | 15600 |   462   (0)| 00:00:01 |
|*  1 |  FILTER            |      |       |       |            |          |
|   2 |   TABLE ACCESS FULL| T1   | 10000 |   156K|   462   (0)| 00:00:01 |
|*  3 |   TABLE ACCESS FULL| T2   |     1 |     8 |    42   (0)| 00:00:01 |
|*  4 |   TABLE ACCESS FULL| T2   |     1 |     7 |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter( EXISTS (SELECT 0 FROM "T2" "B" WHERE "B"."ID"=:B1 AND
              "B"."STATUS"='COM') OR  EXISTS (SELECT 0 FROM "T2" "C" WHERE
              "C"."ID1"=:B2 AND "C"."STATUS"='ERR'))
   3 - filter("B"."ID"=:B1 AND "B"."STATUS"='COM')
   4 - filter("C"."ID1"=:B1 AND "C"."STATUS"='ERR')

Notice how the subquery has been executed as a FILTER operation, which sometimes (if not often) represents a real performance threat.
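
For this particular query there is also a simple manual rewrite that avoids the FILTER altogether: since both disjuncts are IN subqueries against the same column, they can be combined into a single IN over a UNION. A minimal sketch, equivalent to the original query:

select
    a.id1
   ,a.n1
   ,a.start_date
from t1 a
where a.id1 in (select b.id  from t2 b where b.status = 'COM'
                union
                select c.id1 from t2 c where c.status = 'ERR');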

However, when I executed the same query with optimizer_features_enable set to 11.2.0.3 I got the following interesting execution plan:

SQL> alter session set optimizer_features_enable='11.2.0.3';

--------------------------------------------------------------------------------------------
| Id  | Operation             | Name     | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |          |      1 |        |   9890 |00:00:00.03 |    1953 |<--
|*  1 |  HASH JOIN            |          |      1 |   5000 |   9890 |00:00:00.03 |    1953 |
|   2 |   VIEW                | VW_NSO_1 |      1 |   5000 |   9890 |00:00:00.01 |     282 |
|   3 |    HASH UNIQUE        |          |      1 |   5000 |   9890 |00:00:00.01 |     282 |
|   4 |     UNION-ALL         |          |      1 |        |   9900 |00:00:00.01 |     282 |
|*  5 |      TABLE ACCESS FULL| T2       |      1 |   2500 |     10 |00:00:00.01 |     141 |
|*  6 |      TABLE ACCESS FULL| T2       |      1 |   2500 |   9890 |00:00:00.01 |     141 |
|   7 |   TABLE ACCESS FULL   | T1       |      1 |  10000 |  10000 |00:00:00.01 |    1671 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("A"."ID1"="ID1")
   5 - filter("C"."STATUS"='ERR')
   6 - filter("B"."STATUS"='COM')

Notice now how the new plan shows a HASH JOIN between an internal view (VW_NSO_1) and table t1 coming from the parent query block. Notice as well the join condition (access("A"."ID1"="ID1")) that appears in predicate n°1. The optimizer has done a double transformation:

  • created an internal view VW_NSO_1 representing a UNION-ALL of the two subqueries present in the where clause
  • joined this newly created inline view with table t1 from the parent query block

Looking at the corresponding 10053 trace file I have found how the CBO has transformed the initial query:

select a.id1 id1,
  a.n1 n1,
  a.start_date start_date
from (
  (select c.id1 id1 from c##mhouri.t2 c where c.status='ERR')
union
  (select b.id id from c##mhouri.t2 b where b.status='COM')
     ) vw_nso_1,
  c##mhouri.t1 a
where a.id1= vw_nso_1.id1;

In fact the optimizer has first combined the two subqueries into a view and then unnested it into the parent query. This is a transformation the Oracle optimizer seems to name: SUBQ INTO VIEW FOR COMPLEX UNNEST.

In the same 10053 trace file we can spot the following lines:

*****************************
Cost-Based Subquery Unnesting
*****************************
Query after disj subq unnesting:******* UNPARSED QUERY IS *******

SU:   Transform an ANY subquery to semi-join or distinct.
Registered qb: SET$7FD77EFD 0x15b5d4d0 (SUBQ INTO VIEW FOR COMPLEX UNNEST SET$E74BECDC)

SU: Will unnest subquery SEL$3 (#2)
SU: Will unnest subquery SEL$2 (#3)
SU: Reconstructing original query from best state.
SU: Considering subquery unnest on query block SEL$1 (#1).
SU:   Checking validity of unnesting subquery SEL$2 (#3)
SU:   Checking validity of unnesting subquery SEL$3 (#2)
Query after disj subq unnesting:******* UNPARSED QUERY IS *******

SU:   Checking validity of unnesting subquery SET$E74BECDC (#6)
SU:   Passed validity checks.

This is a clear enhancement to the optimizer's query transformations that helps improve the performance of disjunctive subqueries automatically, without any external intervention.

I was about to end this article when I realized that, although I was testing this case on a 12.1.0.1 database release, I still had not executed the same query under the 12c optimizer features:

SQL> alter session set optimizer_features_enable='12.1.0.1.1';
SQL > execute query
-------------------------------------------------------------------------------------
| Id  | Operation          | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |      1 |        |   9890 |00:00:03.84 |     716K|
|*  1 |  FILTER            |      |      1 |        |   9890 |00:00:03.84 |     716K|
|   2 |   TABLE ACCESS FULL| T1   |      1 |  10000 |  10000 |00:00:00.01 |    1686 |
|*  3 |   TABLE ACCESS FULL| T2   |  10000 |      2 |   9890 |00:00:03.81 |     715K|
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter( IS NOT NULL)
   3 - filter((("B"."ID1"=:B1 AND "B"."STATUS"='ERR') OR ("B"."ID"=:B2 AND
              "B"."STATUS"='COM')))

The automatic unnesting of the disjunctive subquery is no longer applied under the 12.1.0.1.1 optimizer model.
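
If need be, the 11.2.0.3 behaviour can still be requested on a per-statement basis; a minimal sketch using the OPTIMIZER_FEATURES_ENABLE hint (not something I have verified on this model):

select /*+ optimizer_features_enable('11.2.0.3') */
    a.id1
   ,a.n1
   ,a.start_date
from t1 a
where (a.id1 in (select b.id  from t2 b where b.status = 'COM')
       or
       a.id1 in (select c.id1 from t2 c where c.status = 'ERR'));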

If you want to reproduce and test this case, the model is below (I would be interested to know whether the disjunctive subquery is unnested or not in the 12.1.0.1.2 release):

create table t1
   as select
    rownum                id1,
    trunc((rownum-1/3))   n1,
    date '2012-06-07' + mod((level-1)*2,5) start_date,
    lpad(rownum,10,'0')   small_vc,
    rpad('x',1000)        padding
from dual
connect by level <= 1e4;

create table t2
as select
    rownum id
    ,mod(rownum,5) + mod(rownum,10)* 10  as id1
    ,case
       when mod(rownum, 1000) = 7 then 'ERR'
       when rownum <= 9900 then 'COM'
       when mod(rownum,10) between 1 and 5 then 'PRP'
     else
       'UNK'
     end status
     ,lpad(rownum,10,'0')    as small_vc
     ,rpad('x',70)           as padding
from dual
connect by level <= 1e4;

alter table t1 add constraint t1_pk primary key (id1);


Why Dynamic Sampling has not been used?


Experienced tuning specialists are known for their pronounced sense of detail, noticing things others very often ignore. This is why I always pay attention to their answers on the OTN forums and the oracle-l list. Last week I was asked to look at a badly performing query, monitored via the following execution plan:

Global Information
------------------------------
 Status              :  EXECUTING
 Instance ID         :  1
 SQL ID              :  8114dqz1k5arj
 SQL Execution ID    :  16777217

Global Stats
=============================================================================================
| Elapsed |   Cpu   |    IO    | Concurrency | Cluster  |  Other   | Buffer | Read  | Read  |
| Time(s) | Time(s) | Waits(s) |  Waits(s)   | Waits(s) | Waits(s) |  Gets  | Reqs  | Bytes |
=============================================================================================
|  141842 |  140516 |       75 |        5.82 |       69 |     1176 |    21G | 26123 | 204MB |
=============================================================================================

SQL Plan Monitoring Details (Plan Hash Value=3787402507)
===========================================================================================
| Id   |             Operation             |      Name       |  Rows   | Execs |   Rows   |
|      |                                   |                 | (Estim) |       | (Actual) |
===========================================================================================
|    0 | SELECT STATEMENT                  |                 |         |     1 |          |
|    1 |   SORT ORDER BY                   |                 |       1 |     1 |          |
|    2 |    FILTER                         |                 |         |     1 |          |
|    3 |     NESTED LOOPS                  |                 |         |     1 |        0 |
| -> 4 |      NESTED LOOPS                 |                 |       1 |     1 |       4G |
| -> 5 |       TABLE ACCESS BY INDEX ROWID | TABLEXXX        |       1 |     1 |     214K |
| -> 6 |        INDEX RANGE SCAN           | IDX_MESS_RCV_ID |      2M |     1 |     233K |
| -> 7 |       INDEX RANGE SCAN            | VGY_TEST2       |       1 |  214K |       4G |->
|    8 |      TABLE ACCESS BY INDEX ROWID  | T_TABL_YXZ      |       1 |    4G |        0 |->
|      |                                   |                 |         |       |          |
===========================================================================================

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter(TO_DATE(:SYS_B_2,:SYS_B_3)<=TO_DATE(:SYS_B_4,:SYS_B_5))
   5 - filter(("TABLEXXX"."T_NAME"=:SYS_B_6 AND
              "TABLEXXX"."M_TYPE"=:SYS_B_0 AND
              "TABLEXXX"."A_METHOD"=:SYS_B_7 AND
              "TABLEXXX"."M_STATUS"<>:SYS_B_8))
   6 - access("TABLEXXX"."R_ID"=:SYS_B_1)
   7 - access("T_TABL_YXZ"."SX_DATE">=TO_DATE(:SYS_B_2,:SYS_B_3) AND
              "T_TABL_YXZ"."SX_DATE"<=TO_DATE(:SYS_B_4,:SYS_B_5))
   8 - filter("T_TABL_YXZ"."T_ID"="TABLEXXX"."T_ID")

Those 214K and 4G executions (Execs) of operations 7 and 8 respectively are the classic symptom of a wrong NESTED LOOPS join, which the CBO decided to use because of the wrong cardinality estimate at operation n°5 (the double NESTED LOOPS operation is the effect of the NLJ_BATCHING optimisation).
There was no historical plan_hash_value for this particular sql_id to compare with the current execution plan, yet the report had certainly been executed in the past without any complaint from the end user.
The outline_data section of the execution plan is where I usually look when trying to understand what the optimizer has done behind the scenes:

Outline Data
-------------
   /*+
      BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('11.2.0.3')
      DB_VERSION('11.2.0.3')
      OPT_PARAM('_b_tree_bitmap_plans' 'false')
      OPT_PARAM('optimizer_dynamic_sampling' 4) ---------------------------> spot this
      OPT_PARAM('optimizer_index_cost_adj' 20)
      ALL_ROWS
      OUTLINE_LEAF(@"SEL$1")
      INDEX_RS_ASC(@"SEL$1" "TABLEXXX"@"SEL$1" ("TABLEXXX"."R_ID"))
      INDEX(@"SEL$1" "T_TABL_YXZ"@"SEL$1" ("T_TABL_YXZ"."SX_DATE"
              "T_TABL_YXZ"."GL_ACCOUNT_ID" "T_TABL_YXZ"."CASH_ACCOUNT_ID"))
      LEADING(@"SEL$1" "TABLEXXX"@"SEL$1" "T_TABL_YXZ"@"SEL$1")
      USE_NL(@"SEL$1" "T_TABL_YXZ"@"SEL$1")
      NLJ_BATCHING(@"SEL$1" "T_TABL_YXZ"@"SEL$1")
      END_OUTLINE_DATA
  */

As you can see, apart from the optimizer_index_cost_adj parameter, whose default value we should never change, one thing caught my attention: optimizer_dynamic_sampling. Since the outline shows that dynamic sampling is set at level 4, why is there no Note about dynamic sampling at the bottom of the above execution plan?
I decided to run the same query in a clone database (cloned via RMAN). Below is the corresponding execution plan for the same set of input parameters:

Global Information
------------------------------
 Status              :  DONE (ALL ROWS)
 Instance ID         :  1
 SQL ID              :  8114dqz1k5arj
 SQL Execution ID    :  16777217
 Duration            :  904s

SQL Plan Monitoring Details (Plan Hash Value=2202725716)
========================================================================================
| Id |            Operation             |      Name       |  Rows   | Execs |   Rows   |
|    |                                  |                 | (Estim) |       | (Actual) |
========================================================================================
|  0 | SELECT STATEMENT                 |                 |         |     1 |      280 |
|  1 |   SORT ORDER BY                  |                 |    230K |     1 |      280 |
|  2 |    FILTER                        |                 |         |     1 |      280 |
|  3 |     HASH JOIN                    |                 |    230K |     1 |      280 |
|  4 |      TABLE ACCESS BY INDEX ROWID | T_TABL_YXZ      |    229K |     1 |     301K |
|  5 |       INDEX RANGE SCAN           | VGY_TEST2       |       1 |     1 |     301K |
|  6 |      TABLE ACCESS BY INDEX ROWID | TABLEXXX        |    263K |     1 |       2M |
|  7 |       INDEX RANGE SCAN           | IDX_MESS_RCV_ID |      2M |     1 |       2M |
========================================================================================

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter(TO_DATE(:SYS_B_2,:SYS_B_3)<=TO_DATE(:SYS_B_4,:SYS_B_5))
   3 - access("T_TABL_YXZ"."T_ID"="TABLEXXX"."T_ID")
   5 - access("T_TABL_YXZ"."SX_DATE">=TO_DATE(:SYS_B_2,:SYS_B_3) AND
              "T_TABL_YXZ"."SX_DATE"<=TO_DATE(:SYS_B_4,:SYS_B_5))
   6 - filter(("TABLEXXX"."T_NAME"=:SYS_B_6 AND
              "TABLEXXX"."M_TYPE"=:SYS_B_0 AND
              "TABLEXXX"."A_METHOD"=:SYS_B_7 AND
              "TABLEXXX"."M_STATUS"<>:SYS_B_8))
   7 - access("TABLEXXX"."R_ID"=:SYS_B_1)

Note
-----
   - dynamic sampling used for this statement (level=4)

In this cloned database, in contrast to the production database, the optimizer has used dynamic sampling at level 4 and has come up with different estimates when visiting the TABLEXXX (263K instead of 1) and T_TABL_YXZ (229K instead of 1) tables, so that it judiciously opted for a HASH JOIN instead of that dramatic production NESTED LOOPS, making the query complete in 904 seconds.

The fundamental question then turns from 'why is the report performing badly?' into 'why has the optimizer ignored dynamic sampling at level 4?'.

There are several ways to answer this question: (a) a 10053 trace file, (b) a 10046 trace file, or (c) tracing dynamic sampling directly, as suggested to me by Stefan Koehler:

SQL> alter session set events 'trace[RDBMS.SQL_DS] disk=high';

The corresponding 10053 optimizer trace shows the following lines related to dynamic sampling:

10053 of the COPY database

*** 2015-06-03 11:05:43.701
** Executed dynamic sampling query:
    level : 4
    sample pct. : 0.000489
    actual sample size : 837
    filtered sample card. : 1
    orig. card. : 220161278
    block cnt. table stat. : 6272290
    block cnt. for sampling: 6345946
    max. sample block cnt. : 32
sample block cnt. : 31
min. sel. est. : 0.00000000
** Using single table dynamic sel. est. : 0.00119474
  Table: TABLEXXX  Alias: TABLEXXX
    Card: Original: 220161278.000000  Rounded: 263036  Computed: 263036.17  Non Adjusted: 263036.17

In the copy database, the optimizer has used dynamic sampling at level 4 and came up with a cardinality estimate of 263K for TABLEXXX, which obviously led the CBO to opt for a reasonable HASH JOIN operation.

10053 of the PRODUCTION database

*** 2015-06-03 13:39:03.992
** Executed dynamic sampling query:
    level : 4
    sample pct. : 0.000482
    actual sample size : 1151
    filtered sample card. : 0  ------------------>  spot this information
    orig. card. : 220161278
    block cnt. table stat. : 6272290
    block cnt. for sampling: 6435970
    max. sample block cnt. : 32
sample block cnt. : 31
min. sel. est. : 0.00000000
** Not using dynamic sampling for single table sel. or cardinality.
DS Failed for : ----- Current SQL Statement for this session (sql_id=82x3mm8jqn5ah) -----
  Table: TABLEXXX  Alias: TABLEXXX
    Card: Original: 220161278.000000  Rounded: 1  Computed: 0.72  Non Adjusted: 0.72

In the production database, the CBO failed to use dynamic sampling at level 4, as clearly shown by the following lines taken from the above 10053 trace file:

** Not using dynamic sampling for single table sel. or cardinality.
DS Failed for : ----- Current SQL Statement for this session (sql_id=82x3mm8jqn5ah)

PS: the 10053 trace was taken against the critical part of the query only,
    which is why the sql_id differs from the one mentioned above.

Thanks to Randolf Geist I learned that the internal dynamic sampling code is written so that when the query predicates, applied to the sampled blocks of TABLEXXX, return 0 rows

filtered sample card. : 0

which is the reason why the optimizer ignored dynamic sampling at level 4 and fell back to the available object statistics, producing a 1-row cardinality estimate and hence that dramatically wrong NESTED LOOPS join. By the way, had this been a 12c database, the STATISTICS COLLECTOR placed above the first row source of the NESTED LOOPS join would have reached its inflection point and would, hopefully, have switched to a HASH JOIN at run time.

A quick solution for this critical report was to raise the dynamic sampling level. And since this query belongs to third-party software, I decided to use Kerry Osborne's script to inject a dynamic sampling hint via a SQL profile, as shown below:

SQL>@create_1_hint_sql_profile.sql
Enter value for sql_id: 8114dqz1k5arj
Enter value for profile_name (PROFILE_sqlid_MANUAL):
Enter value for category (DEFAULT):
Enter value for force_matching (false): true
Enter value for hint: dynamic_sampling(6)
Profile PROFILE_8114dqz1k5arj_MANUAL created.
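
Under the hood such a script essentially wraps a call to dbms_sqltune.import_sql_profile. A minimal sketch of the same idea (parameter values are illustrative, not the exact ones used by the script):

SQL> declare
        v_sql clob;
     begin
        select sql_fulltext into v_sql
        from   v$sqlarea
        where  sql_id = '8114dqz1k5arj';

        dbms_sqltune.import_sql_profile
         (sql_text    => v_sql
         ,profile     => sys.sqlprof_attr('DYNAMIC_SAMPLING(6)')
         ,name        => 'PROFILE_8114dqz1k5arj_MANUAL'
         ,force_match => true
         );
     end;
     /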

Once this was done, the end user re-launched the report, which completed within 303 seconds instead of those never-ending 141,842 seconds:


Global Information
------------------------------
 Status              :  DONE (ALL ROWS)
 Instance ID         :  1
 SQL ID              :  8114dqz1k5arj
 SQL Execution ID    :  16777216
 Execution Started   :  06/10/2015 11:40:39
 First Refresh Time  :  06/10/2015 11:40:45
 Last Refresh Time   :  06/10/2015 11:45:39
 Duration            :  300s

SQL Plan Monitoring Details (Plan Hash Value=2202725716)
========================================================================================
| Id |            Operation             |      Name       |  Rows   | Execs |   Rows   |
|    |                                  |                 | (Estim) |       | (Actual) |
========================================================================================
|  0 | SELECT STATEMENT                 |                 |         |     1 |     2989 |
|  1 |   SORT ORDER BY                  |                 |    234K |     1 |     2989 |
|  2 |    FILTER                        |                 |         |     1 |     2989 |
|  3 |     HASH JOIN                    |                 |    234K |     1 |     2989 |
|  4 |      TABLE ACCESS BY INDEX ROWID | T_TABL_YXZ      |    232K |     1 |     501K |
|  5 |       INDEX RANGE SCAN           | VGY_TEST2       |       1 |     1 |     501K |
|  6 |      TABLE ACCESS BY INDEX ROWID | TABLEXXX        |    725K |     1 |       2M |
|  7 |       INDEX RANGE SCAN           | IDX_MESS_RCV_ID |      2M |     1 |       2M |
========================================================================================

Note
-----
   - dynamic sampling used for this statement (level=6)
   - SQL profile PROFILE_8114dqz1k5arj_MANUAL used for this statement

Real Time SQL Monitoring oddity


This is a small note about a situation I encountered and thought worth sharing with you. An insert/select was executing at parallel DOP 16 on an 11.2.0.3 Oracle database, and the end user was complaining about the exceptional time it was taking without completing. Since the job was still running, I got its Real Time SQL Monitoring report:


Global Information
------------------------------
 Status              :  DONE (ERROR)       
 Instance ID         :  1                  
 Session             :  XXXXX (392:229)   
 SQL ID              :  bbccngk0nn2z2      
 SQL Execution ID    :  16777216           
 Execution Started   :  06/22/2015 11:57:06
 First Refresh Time  :  06/22/2015 11:57:06
 Last Refresh Time   :  06/22/2015 11:57:46
 Duration            :  40s                

Global Stats
=================================================================================================
| Elapsed |   Cpu   |    IO    | Concurrency |  Other   | Buffer | Read | Read  | Write | Write |
| Time(s) | Time(s) | Waits(s) |  Waits(s)   | Waits(s) |  Gets  | Reqs | Bytes | Reqs  | Bytes |
=================================================================================================
|   15315 |   15220 |       54 |        0.38 |       40 |     2G | 8601 |   2GB |  5485 |   1GB |
=================================================================================================

According to the above report summary, the insert/select is DONE (ERROR).
So why is the end user still complaining about a never-ending batch job? And why did he not receive an error?

After having ruled out the resumable timeout hypothesis, I came back to v$sql_monitor and issued the following two selects:

SQL> SELECT
  2    sql_id,
  3    process_name,
  4    status
  5  FROM v$sql_monitor
  6  WHERE sql_id = 'bbccngk0nn2z2'
  7  AND status   ='EXECUTING'
  8  ORDER BY process_name ;

SQL_ID        PROCE STATUS
------------- ----- ------------
bbccngk0nn2z2 p000  EXECUTING
bbccngk0nn2z2 p001  EXECUTING
bbccngk0nn2z2 p002  EXECUTING
bbccngk0nn2z2 p003  EXECUTING
bbccngk0nn2z2 p004  EXECUTING
bbccngk0nn2z2 p005  EXECUTING
bbccngk0nn2z2 p006  EXECUTING
bbccngk0nn2z2 p007  EXECUTING
bbccngk0nn2z2 p008  EXECUTING
bbccngk0nn2z2 p009  EXECUTING
bbccngk0nn2z2 p010  EXECUTING
bbccngk0nn2z2 p011  EXECUTING
bbccngk0nn2z2 p012  EXECUTING
bbccngk0nn2z2 p013  EXECUTING
bbccngk0nn2z2 p014  EXECUTING
bbccngk0nn2z2 p015  EXECUTING
bbccngk0nn2z2 p019  EXECUTING
bbccngk0nn2z2 p031  EXECUTING

SQL> SELECT
  2    sql_id,
  3    process_name,
  4    status
  5  FROM v$sql_monitor
  6  WHERE sql_id = 'bbccngk0nn2z2'
  7  AND status   ='DONE (ERROR)'
  8  ORDER BY process_name ;

SQL_ID        PROCE STATUS
------------- ----- -------------------
bbccngk0nn2z2 ora   DONE (ERROR)
bbccngk0nn2z2 p016  DONE (ERROR)
bbccngk0nn2z2 p017  DONE (ERROR)
bbccngk0nn2z2 p018  DONE (ERROR)
bbccngk0nn2z2 p020  DONE (ERROR)
bbccngk0nn2z2 p021  DONE (ERROR)
bbccngk0nn2z2 p022  DONE (ERROR)
bbccngk0nn2z2 p023  DONE (ERROR)
bbccngk0nn2z2 p024  DONE (ERROR)
bbccngk0nn2z2 p025  DONE (ERROR)
bbccngk0nn2z2 p026  DONE (ERROR)
bbccngk0nn2z2 p027  DONE (ERROR)
bbccngk0nn2z2 p028  DONE (ERROR)
bbccngk0nn2z2 p029  DONE (ERROR)
bbccngk0nn2z2 p030  DONE (ERROR)

Among the 32 parallel servers, half are executing and half are in error! How could this be possible? In my experience a parallel statement ends in its entirety as soon as a single parallel server hits an error. For example, I have encountered several times the following error, caused by a broadcast distribution of a large row source blowing up the TEMP tablespace:

ERROR at line 1:
ORA-12801: error signaled in parallel query server P013
ORA-01652: unable to extend temp segment by 128 in tablespace TEMP

A simple select against gv$active_session_history confirmed that the insert/select was still running and consuming CPU:

SQL> select sql_id, count(1)
  2  from gv$active_session_history
  3  where sample_time between to_date('22062015 12:30:00', 'ddmmyyyy hh24:mi:ss')
  4                    and     to_date('22062015 13:00:00', 'ddmmyyyy hh24:mi:ss')
  5  group by  sql_id
  6  order by 2 desc;

SQL_ID          COUNT(1)
------------- ----------
bbccngk0nn2z2       2545
                       4
0uuczutvk6jqj          1
8f1sjvfxuup9w          1

SQL> select decode(event,null, 'on cpu', event), count(1)
  2  from gv$active_session_history
  3  where sample_time between to_date('22062015 12:30:00', 'ddmmyyyy hh24:mi:ss')
  4                    and     to_date('22062015 13:00:00', 'ddmmyyyy hh24:mi:ss')
  5  and sql_id = 'bbccngk0nn2z2'
  6  group by  event
  7  order by 2 desc;

DECODE(EVENT,NULL,'ONCPU',EVENT)  COUNT(1)
--------------------------------  ---------
on cpu                            5439
db file sequential read           3

SQL> /

DECODE(EVENT,NULL,'ONCPU',EVENT)  COUNT(1)
--------------------------------- ---------
on cpu                            5460
db file sequential read           3

SQL> /

DECODE(EVENT,NULL,'ONCPU',EVENT)  COUNT(1)
--------------------------------  ---------
on cpu                            5470
db file sequential read           3

And after a while


SQL> /

DECODE(EVENT,NULL,'ONCPU',EVENT)   COUNT(1)
---------------------------------- ---------
on cpu                             15152
db file sequential read            9

While the parallel insert was still running I took several SQL Monitoring reports, among which the two following ones:

Parallel Execution Details (DOP=16 , Servers Allocated=32)
============================================================================================
|      Name      | Type  | Server# | Elapsed |Buffer | Read  |         Wait Events         |
|                |       |         | Time(s) | Gets  | Bytes |         (sample #)          |
============================================================================================
| PX Coordinator | QC    |         |    0.48 |  2531 | 16384 |                             |
| p000           | Set 1 |       1 |    1049 |  128M |  63MB | direct path read (1)        |
| p001           | Set 1 |       2 |    1518 |  222M |  61MB |                             |
| p002           | Set 1 |       3 |     893 |  109M |  59MB |                             |
| p003           | Set 1 |       4 |    1411 |  194M |  62MB | direct path read (1)        |
| p004           | Set 1 |       5 |     460 |   64M |  62MB | direct path read (1)        |
| p005           | Set 1 |       6 |     771 |   87M | 322MB | direct path read (1)        |
|                |       |         |         |       |       | direct path read temp (5)   |
| p006           | Set 1 |       7 |     654 |   67M |  62MB | direct path read (1)        |
| p007           | Set 1 |       8 |     179 |   24M |  55MB | direct path read (1)        |
| p008           | Set 1 |       9 |    1638 |  235M |  70MB |                             |
| p009           | Set 1 |      10 |     360 |   46M |  54MB | direct path read (1)        |
| p010           | Set 1 |      11 |    1920 |  294M | 337MB | direct path read temp (6)   | --> 1920s
| p011           | Set 1 |      12 |     289 |   30M |  69MB |                             |
| p012           | Set 1 |      13 |     839 |   98M |  66MB | direct path read (1)        |
| p013           | Set 1 |      14 |     524 |   63M |  55MB |                             |
| p014           | Set 1 |      15 |    1776 |  263M |  69MB |                             |
| p015           | Set 1 |      16 |    1016 |  130M |  61MB | direct path read (1)        |
| p016           | Set 2 |       1 |    0.22 |  1166 |   3MB |                             |
| p017           | Set 2 |       2 |    1.36 |  6867 |  51MB |                             |
| p018           | Set 2 |       3 |    1.02 |  1298 |  36MB |                             |
| p019           | Set 2 |       4 |    6.71 |  2313 | 129MB | direct path read temp (2)   |
| p020           | Set 2 |       5 |    0.40 |   978 |  16MB |                             |
| p021           | Set 2 |       6 |    1.32 |  8639 |  41MB | direct path read temp (1)   |
| p022           | Set 2 |       7 |    0.18 |   896 |   2MB |                             |
| p023           | Set 2 |       8 |    0.23 |   469 |   9MB |                             | --> 0.23s
| p024           | Set 2 |       9 |    0.52 |  3635 |  19MB |                             | --> 0.52s
| p025           | Set 2 |      10 |    0.33 |  1163 |   3MB |                             |
| p026           | Set 2 |      11 |    0.65 |   260 |  31MB | db file sequential read (1) |
| p027           | Set 2 |      12 |    0.21 |  1099 |   6MB |                             |
| p028           | Set 2 |      13 |    0.58 |   497 |  20MB |                             |
| p029           | Set 2 |      14 |    1.43 |  4278 |  54MB |                             |
| p030           | Set 2 |      15 |    0.30 |  3481 |   8MB |                             |
| p031           | Set 2 |      16 |    2.86 |   517 |  91MB |                             |
============================================================================================


Parallel Execution Details (DOP=16 , Servers Allocated=32)
=============================================================================================
|      Name      | Type  | Server# | Elapsed | Buffer | Read  |         Wait Events         |
|                |       |         | Time(s) |  Gets  | Bytes |         (sample #)          |
=============================================================================================
| PX Coordinator | QC    |         |    0.48 |   2531 | 16384 |                             |
| p000           | Set 1 |       1 |    1730 |   202M |  63MB | direct path read (1)        |
| p001           | Set 1 |       2 |    2416 |   351M |  61MB |                             |
| p002           | Set 1 |       3 |    1094 |   133M |  59MB |                             |
| p003           | Set 1 |       4 |    2528 |   348M |  64MB | direct path read (1)        |
| p004           | Set 1 |       5 |     965 |   129M |  63MB | direct path read (1)        |
| p005           | Set 1 |       6 |    1089 |   129M | 322MB | direct path read (1)        |
|                |       |         |         |        |       | direct path read temp (5)   |
| p006           | Set 1 |       7 |    1459 |   165M |  62MB | direct path read (1)        |
| p007           | Set 1 |       8 |     221 |    30M |  55MB | direct path read (1)        |
| p008           | Set 1 |       9 |    2640 |   357M |  70MB |                             |
| p009           | Set 1 |      10 |     952 |   115M |  54MB | direct path read (1)        |
| p010           | Set 1 |      11 |    3117 |   471M | 337MB | direct path read temp (6)   | --> 3117s
| p011           | Set 1 |      12 |     400 |    42M |  69MB |                             |
| p012           | Set 1 |      13 |    1621 |   195M |  66MB | direct path read (1)        |
| p013           | Set 1 |      14 |    1126 |   132M |  55MB |                             |
| p014           | Set 1 |      15 |    2662 |   370M |  72MB |                             |
| p015           | Set 1 |      16 |    1194 |   147M |  61MB | direct path read (1)        |
| p016           | Set 2 |       1 |    0.22 |   1166 |   3MB |                             |
| p017           | Set 2 |       2 |    1.36 |   6867 |  51MB |                             |
| p018           | Set 2 |       3 |    1.02 |   1298 |  36MB |                             |
| p019           | Set 2 |       4 |    6.72 |   2313 | 131MB | direct path read temp (2)   |
| p020           | Set 2 |       5 |    0.40 |    978 |  16MB |                             |
| p021           | Set 2 |       6 |    1.32 |   8639 |  41MB | direct path read temp (1)   |
| p022           | Set 2 |       7 |    0.18 |    896 |   2MB |                             |
| p023           | Set 2 |       8 |    0.23 |    469 |   9MB |                             | --> 0.23s
| p024           | Set 2 |       9 |    0.52 |   3635 |  19MB |                             | --> 0.52s
| p025           | Set 2 |      10 |    0.33 |   1163 |   3MB |                             |
| p026           | Set 2 |      11 |    0.65 |    260 |  31MB | db file sequential read (1) |
| p027           | Set 2 |      12 |    0.21 |   1099 |   6MB |                             |
| p028           | Set 2 |      13 |    0.58 |    497 |  20MB |                             |
| p029           | Set 2 |      14 |    1.43 |   4278 |  54MB |                             |
| p030           | Set 2 |      15 |    0.30 |   3481 |   8MB |                             |
| p031           | Set 2 |      16 |    2.89 |    517 |  92MB |                             |
=============================================================================================

If you look carefully at the above reports you will notice that the elapsed time of the parallel servers reported in ERROR (p016-p030) is not increasing, in contrast to the elapsed time of the parallel servers reported as EXECUTING (p000-p015), which keeps increasing.

Thanks to Randolf Geist (again) I learned that there is a bug in Real Time SQL Monitoring which shows up when a parallel server does no work for more than 30 minutes. In such a case Real Time SQL Monitoring starts reporting those parallel servers as DONE (ERROR), confusing the situation.
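
Beyond v$active_session_history, a quick cross-check (a sketch, not part of the original investigation) is to verify that the parallel servers are still registered against their query coordinator, which is the case as long as the statement is really executing:

SQL> select qcsid
           ,sid
           ,server_group
           ,server_set
           ,server#
     from   v$px_session
     order  by qcsid, server_group, server_set, server#;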

Since I was able to reproduce the issue, I started the process again at 16h03 and kept executing the following select from time to time, getting no rows for each execution:

SELECT
  sql_id,
  process_name,
  status
FROM v$sql_monitor
WHERE sql_id = '5np4u0m0h69jx' -- the sql_id has been slightly changed
AND status   ='DONE (ERROR)'
ORDER BY process_name ;

no rows selected

Until, at around 16h37, i.e. after a little more than 30 minutes of execution, the above select started showing processes in error:

SQL> SELECT
  2    sql_id,
  3    process_name,
  4    status
  5  FROM v$sql_monitor
  6  WHERE sql_id = '5np4u0m0h69jx'
  7  AND status   ='DONE (ERROR)'
  8  ORDER BY process_name ;

SQL_ID        PROCE STATUS
------------- ----- ---------------
5np4u0m0h69jx ora   DONE (ERROR)
5np4u0m0h69jx p016  DONE (ERROR)
5np4u0m0h69jx p017  DONE (ERROR)
5np4u0m0h69jx p018  DONE (ERROR)
5np4u0m0h69jx p020  DONE (ERROR)
5np4u0m0h69jx p021  DONE (ERROR)
5np4u0m0h69jx p022  DONE (ERROR)
5np4u0m0h69jx p023  DONE (ERROR)
5np4u0m0h69jx p024  DONE (ERROR)
5np4u0m0h69jx p025  DONE (ERROR)
5np4u0m0h69jx p026  DONE (ERROR)
5np4u0m0h69jx p027  DONE (ERROR)
5np4u0m0h69jx p028  DONE (ERROR)
5np4u0m0h69jx p029  DONE (ERROR)
5np4u0m0h69jx p030  DONE (ERROR)

At the very beginning of the process several parallel servers were not running while several others were busy. And when the first parallel server (p10 in this case) exceeded 1800 seconds of elapsed time (1861 seconds in this case), Real Time SQL Monitoring started showing the idle parallel servers in ERROR.

Bottom line: don’t be confused (as I have been) by that DONE (ERROR) status; your SQL statement might still be running, consuming time and energy, despite this wrong Real Time SQL Monitoring status.
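
One way to double-check whether the statement is really still alive despite the DONE (ERROR) status is to look at the parallel server sessions themselves instead of the monitoring report. Here is a minimal sketch (assuming the statement is still executing and its sql_id is known); if the p0xx slaves are still listed there, they are alive whatever v$sql_monitor reports:

SQL> select
       pxp.inst_id
      ,pxp.server_name     -- p000, p001, ...
      ,s.status
      ,s.event
    from gv$px_process pxp
       , gv$session    s
    where s.inst_id = pxp.inst_id
    and   s.sid     = pxp.sid
    and   s.serial# = pxp.serial#
    and   s.sql_id  = '5np4u0m0h69jx'
    order by pxp.server_name;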


Don’t pre-empt the CBO from doing its work


This is the last part of the parallel insert/select saga. As a reminder, here are the two preceding episodes:

  •  Part 1: where I explained why I was unable to get the corresponding SQL monitoring report because of the _sqlmon_max_planlines parameter.
  •  Part 2: where I explained the oddity shown by the SQL monitoring report when a parallel server stays inactive for more than 30 minutes.

In this Part 3 I will share with you how I succeeded in solving the issue and convinced people not to pre-empt the Oracle optimizer from doing its work.

Thanks to the monitoring of this insert/select I managed to isolate the part of the execution plan that absolutely needed to be tuned:

Error: ORA-12805
------------------------------
ORA-12805: parallel query server died unexpectedly

Global Information
------------------------------
 Status                                 :  DONE (ERROR)
 Instance ID                            :  2
 SQL ID                                 :  bg7h7s8sb5mnt
 SQL Execution ID                       :  33554432
 Execution Started                      :  06/24/2015 05:06:14
 First Refresh Time                     :  06/24/2015 05:06:21
 Last Refresh Time                      :  06/24/2015 09:05:10
 Duration                               :  14336s
 DOP Downgrade                          :  50%

Global Stats
============================================================================================
| Elapsed |   Cpu   |    IO    | Concurrency | Cluster  |  Other   | Buffer | Read | Read  |
| Time(s) | Time(s) | Waits(s) |  Waits(s)   | Waits(s) | Waits(s) |  Gets  | Reqs | Bytes |
============================================================================================
|   38403 |   35816 |     0.42 |        2581 |     0.16 |     6.09 |     7G |  103 | 824KB |
============================================================================================

SQL Plan Monitoring Details (Plan Hash Value=3668294770)
======================================================================================================
| Id  |                Operation         |             Name  |  Rows   | Execs |   Rows   | Activity |
|     |                                  |                   | (Estim) |       | (Actual) |   (%)    |
======================================================================================================
| 357 |VIEW PUSHED PREDICATE             | NAEHCE            |      59 | 23570 |    23541 |          |
| 358 | NESTED LOOPS                     |                   |      2M | 23570 |    23541 |     0.05 |
| 359 |  INDEX FAST FULL SCAN            | TABLEIND1         |   27077 | 23570 |     667M |     0.19 |
| 360 |  VIEW                            | VW_JF_SET$E6DCA8A3|       1 |  667M |    23541 |     0.10 |
| 361 |   UNION ALL PUSHED PREDICATE     |                   |         |  667M |    23541 |    30.59 |
| 362 |    NESTED LOOPS                  |                   |       1 |  667M |     1140 |     0.12 |
| 363 |     TABLE ACCESS BY INDEX ROWID  | TABLE2            |       1 |  667M |    23566 |     1.25 |
| 364 |      INDEX UNIQUE SCAN           | IDX_TABLE2        |       1 |  667M |     667M |    17.81 |
| 365 |     TABLE ACCESS BY INDEX ROWID  | TABLE3            |       1 | 23566 |     1140 |          |
| 366 |      INDEX RANGE SCAN            | IDX_TABLE3        |      40 | 23566 |     174K |          |
| 367 |    NESTED LOOPS                  |                   |       1 |  667M |    22401 |     0.11 |
| 368 |     TABLE ACCESS BY INDEX ROWID  | TABLE2            |       1 |  667M |    23566 |     1.27 |
| 369 |      INDEX UNIQUE SCAN           | IDX_TABLE2        |       1 |  667M |     667M |    17.72 |
| 370 |     TABLE ACCESS BY INDEX ROWID  | TABLE3            |       1 | 23566 |    22401 |     0.01 |
| 371 |      INDEX RANGE SCAN            | TABLE31           |      36 | 23566 |       4M |          |

The NESTED LOOPS operation at line 358 has an INDEX FAST FULL SCAN (TABLEIND1) as an outer data source driving an inner row source represented by an internal view (VW_JF_SET$E6DCA8A3) built by Oracle on the fly. Reduced to the bare minimum it resembles this:

SQL Plan Monitoring Details (Plan Hash Value=3668294770)
=====================================================================================
| Id  |                 Operation |             Name   |  Rows   | Execs |   Rows   |
|     |                           |                    | (Estim) |       | (Actual) |
=====================================================================================
| 358 |  NESTED LOOPS             |                    |      2M | 23570 |    23541 |
| 359 |   INDEX FAST FULL SCAN    | TABLEIND1          |   27077 | 23570 |     667M |
| 360 |   VIEW                    | VW_JF_SET$E6DCA8A3 |       1 |  667M |    23541 |

Observe carefully the operation at line 359, which is the one upon which Oracle bases its join method choice. Very often a NESTED LOOPS operation is wrongly chosen by the optimizer because of inaccurate estimations made at the first operation of the NESTED LOOPS join. Let's check the accuracy of the estimation done by Oracle in this case for the operation at line 359:

   Rows(Estim) * Execs = 27077 * 23570 = 638,204,890 ~ 638M
   Rows(Actual)        = 667M

Estimations done by the optimizer at this step are good. So why on earth would Oracle decide to opt for a NESTED LOOPS operation when it knows prior to the execution that the outer row source will produce 667M rows, forcing the inner operations to be executed 667M times? There is no way Oracle would opt for this solution unless it is instructed to do so. And indeed, looking at the huge insert/select statement I found, among a tremendous amount of hints, a use_nl (o h) hint which dictates that the optimizer join the TABLEIND1 index with the rest of the view using a NESTED LOOPS operation. It was then a battle to convince the client to get rid of that hint. What made the client hesitant is that very often the same insert/select statement (including the use_nl hint) completes in an acceptable time. I was therefore obliged to explain why, despite the presence of the use_nl hint (which I was suggesting to be the cause of the performance degradation), the insert/select very often completes in an acceptable execution time. To explain this situation it suffices to get the execution plan of an acceptable execution (reduced to the bare minimum) and spot the obvious:

SQL Plan Monitoring Details (Plan Hash Value=367892000)
====================================================================================
| Id  |                Operation |             Name   |  Rows   | Execs |   Rows   |
|     |                          |                    | (Estim) |       | (Actual) |
====================================================================================
| 168 |VIEW PUSHED PREDICATE     | NAEHCE             |       1 | 35118 |    35105 |
| 169 | NESTED LOOPS             |                    |       2 | 35118 |    35105 |
| 170 |  VIEW                    | VW_JF_SET$86BE946E |       2 | 35118 |    35105 |
| 182 |  INDEX UNIQUE SCAN       | TABLEIND1          |       1 | 35105 |    35105 |

The join order switched from (TABLEIND1, VW_JF_SET$E6DCA8A3) to (VW_JF_SET$86BE946E, TABLEIND1). Since the use_nl (o h) hint is not completed by a leading (h o) hint indicating in what order Oracle has to join these two objects, the choice of the all-important outer operation is left to Oracle. When the index is chosen as the outer operation, the insert/select statement performs very poorly. However, when the same index is used as the inner operation of the join, the insert/select statement performs in an acceptable time.
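
To make the point clearer, here is a minimal hypothetical sketch (tables t1 and t2 are illustrative only and are not taken from the real statement) showing the difference between hinting the join method alone and hinting both the join method and the join order:

-- join method fixed, join order still left to the optimizer
select /*+ use_nl(t2) */ t1.id, t2.padding
from   t1, t2
where  t1.id = t2.id;

-- join order and join method both fixed: t1 drives, t2 is probed once per driving row
select /*+ leading(t1 t2) use_nl(t2) */ t1.id, t2.padding
from   t1, t2
where  t1.id = t2.id;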

With that explained, the client was convinced, the hints were disabled and the insert/select was re-launched; it completed within a few seconds thanks to the appropriate HASH JOIN operation used by the optimizer:

Global Information
------------------------------
 Status                                 :  DONE
 Instance ID                            :  2
 SQL ID                                 :  9g2a3gstkr7dv
 SQL Execution ID                       :  33554432
 Execution Started                      :  06/24/2015 12:53:49
 First Refresh Time                     :  06/24/2015 12:53:52
 Last Refresh Time                      :  06/24/2015 12:54:05
 Duration                               :  16s

Global Stats
============================================================================================
| Elapsed |   Cpu   |    IO    | Concurrency | Cluster  |  Other   | Buffer | Read | Read  |
| Time(s) | Time(s) | Waits(s) |  Waits(s)   | Waits(s) | Waits(s) |  Gets  | Reqs | Bytes |
============================================================================================
|      23 |      21 |     0.91 |        0.03 |     0.22 |     0.31 |     1M |  187 |   1MB |
============================================================================================

SQL Plan Monitoring Details (Plan Hash Value=3871743977)
=================================================================================================
| Id  |                           Operation   |             Name   |  Rows   | Execs |   Rows   |
|     |                                       |                    | (Estim) |       | (Actual) |
=================================================================================================
| 153 |       VIEW                            | NAEHCE             |      2M |     1 |       2M |
| 154 |        HASH JOIN                      |                    |      2M |     1 |       2M |
| 155 |         INDEX FAST FULL SCAN          | TABLEIND1          |   27077 |     1 |    28320 |
| 156 |         VIEW                          | VW_JF_SET$86BE946E |      2M |     1 |       2M |

Spot as well that when the optimizer opted for a HASH JOIN operation, the VIEW PUSHED PREDICATE operation and the underlying JPPD (JOIN PREDICATE PUSH DOWN) transformation ceased to be used, because join predicate pushdown occurs only with NESTED LOOPS joins.

Bottom line: always try to supply Oracle with fresh and representative statistics and let it do its job. Don't pre-empt it from doing its normal work by systematically hinting it when confronted with a performance issue. And when you decide to use hints, make sure to hint correctly, particularly for the outer (build) table and the inner (probe) table in the case of a hinted NESTED LOOPS (HASH JOIN) operation.


Stressed ASH


It is well known that any record found in dba_hist_active_sess_history has inevitably been routed there from v$active_session_history. If so, then how could we interpret the following cut & paste from a running production system?

ASH first

SQL> select event, count(1)
    from gv$active_session_history
    where sample_time between to_date('06072015 18:30:00', 'ddmmyyyy hh24:mi:ss')
                      and     to_date('06072015 19:30:00', 'ddmmyyyy hh24:mi:ss')
    group by event
    order by 2 desc;

EVENT                                                              COUNT(1)
---------------------------------------------------------------- ----------
                                                                        372
direct path read                                                        185
log file parallel write                                                  94
Disk file Mirror Read                                                    22
control file sequential read                                             20
control file parallel write                                              18
direct path write temp                                                   16
Streams AQ: qmn coordinator waiting for slave to start                   12
db file parallel read                                                    11
gc cr multi block request                                                 6
enq: KO - fast object checkpoint                                          4
db file sequential read                                                   3
ges inquiry response                                                      3
os thread startup                                                         2
PX Deq: Signal ACK RSG                                                    2
enq: CF - contention                                                      1
PX Deq: Slave Session Stats                                               1
Disk file operations I/O                                                  1
IPC send completion sync                                                  1
reliable message                                                          1
null event                                                                1
enq: CO - master slave det                                                1
db file parallel write                                                    1
gc current block 2-way                                                    1

AWR next

SQL> select event, count(1)
    from dba_hist_active_sess_history
    where sample_time between to_date('06072015 18:30:00', 'ddmmyyyy hh24:mi:ss')
                      and     to_date('06072015 19:30:00', 'ddmmyyyy hh24:mi:ss')
    group by event
    order by 2 desc;

EVENT                                                              COUNT(1)
---------------------------------------------------------------- ----------
SQL*Net break/reset to client                                         12950
enq: TM - contention                                                  12712
                                                                        624
db file sequential read                                                 386
enq: TX - row lock contention                                           259
SQL*Net message from dblink                                              74
direct path read                                                         62
SQL*Net more data from dblink                                            27
log file parallel write                                                  26
log file sync                                                            15
SQL*Net more data from client                                             9
control file sequential read                                              7
Disk file Mirror Read                                                     6
gc cr grant 2-way                                                         5
db file parallel write                                                    4
read by other session                                                     3
control file parallel write                                               3
Streams AQ: qmn coordinator waiting for slave to start                    3
log file sequential read                                                  2
direct path read temp                                                     2
enq: KO - fast object checkpoint                                          2
gc cr multi block request                                                 1
CSS initialization                                                        1
gc current block 2-way                                                    1
reliable message                                                          1
db file parallel read                                                     1
gc buffer busy acquire                                                    1
ges inquiry response                                                      1
direct path write temp                                                    1
rdbms ipc message                                                         1
os thread startup                                                         1

12,950 samples of SQL*Net break/reset to client and 12,712 samples of the enq: TM - contention wait event are present in AWR but are not found in ASH. How can we interpret this situation?

This 11.2.0.4.0 database runs on a RAC infrastructure with 2 instances. Let's look at the ASH of the two instances separately.

Instance 1 first

SQL> select event, count(1)
    from v$active_session_history
    where sample_time between to_date('06072015 18:30:00', 'ddmmyyyy hh24:mi:ss')
                      and     to_date('06072015 19:30:00', 'ddmmyyyy hh24:mi:ss')
    group by event
    order by 2 desc;

 no rows selected

Instance 2 next

SQL> select event, count(1)
    from v$active_session_history
    where sample_time between to_date('06072015 18:30:00', 'ddmmyyyy hh24:mi:ss')
                      and     to_date('06072015 19:30:00', 'ddmmyyyy hh24:mi:ss')
    group by event
    order by 2 desc;

EVENT                                                              COUNT(1)
---------------------------------------------------------------- ----------
                                                                        372
direct path read                                                        185
log file parallel write                                                  94
Disk file Mirror Read                                                    22
control file sequential read                                             20
control file parallel write                                              18
direct path write temp                                                   16
Streams AQ: qmn coordinator waiting for slave to start                   12
db file parallel read                                                    11
gc cr multi block request                                                 6
enq: KO - fast object checkpoint                                          4
db file sequential read                                                   3
ges inquiry response                                                      3
os thread startup                                                         2
PX Deq: Signal ACK RSG                                                    2
enq: CF - contention                                                      1
PX Deq: Slave Session Stats                                               1
Disk file operations I/O                                                  1
IPC send completion sync                                                  1
reliable message                                                          1
null event                                                                1
enq: CO - master slave det                                                1
db file parallel write                                                    1
gc current block 2-way                                                    1

Everything sampled in ASH during that specific time interval comes from the second instance, while the first instance doesn't report any record for the corresponding interval. This inevitably questions either the ASH size of instance 1 or an imbalanced workload between the two instances:

ASH size first

SQL> select
  2        inst_id
  3        ,total_size
  4      from gv$ash_info;

   INST_ID TOTAL_SIZE
---------- ----------
         1  100663296
         2  100663296

ASH Activity next

SQL> select
        inst_id
       ,total_size
       ,awr_flush_emergency_count
     from gv$ash_info;

   INST_ID TOTAL_SIZE AWR_FLUSH_EMERGENCY_COUNT
---------- ---------- -------------------------
         1  100663296                       136
         2  100663296                         0

Typically the activity is mainly oriented towards instance 1, and the abnormal and unusual 12,950 SQL*Net break/reset to client and 12,712 enq: TM - contention wait events have exacerbated the rate of inserts into the ASH buffer of instance 1, generating the 136 awr_flush_emergency_count and, as such, the discrepancies between ASH and AWR.

This is also confirmed by the difference in the ASH retention period between the two instances:

Instance 1 first where only 3 hours of ASH data are kept

SQL> select min(sample_time), max(sample_time)
  2  from v$active_session_history;

MIN(SAMPLE_TIME)                         MAX(SAMPLE_TIME)
---------------------------------------  -------------------------
08-JUL-15 05.51.20.502 AM                08-JUL-15 08.35.48.233 AM

Instance 2 next where several days worth of ASH data are still present

SQL> select min(sample_time), max(sample_time)
  2  from v$active_session_history;

MIN(SAMPLE_TIME)                         MAX(SAMPLE_TIME)
---------------------------------------  -------------------------
25-JUN-15 20.01.43                       08-JUL-15 08.37.17.233 AM

The solution would be one of the following points (listed, I think, in order of priority):

  • Solve the SQL*Net break/reset to client issue which is dramatically filling up the ASH buffer, causing an unexpectedly rapid flush of important and more precise data (a starting point is sketched right after this list)
  • Balance the workload between the two instances
  • Increase the ASH size of instance 1 by means of alter system set “_ash_size”=25165824;
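
Regarding the first point, a hedged starting point (simply reusing the time window of the AWR query above) is to list the statements that accumulated those SQL*Net break/reset to client samples:

SQL> select sql_id, count(1)
    from dba_hist_active_sess_history
    where event = 'SQL*Net break/reset to client'
    and sample_time between to_date('06072015 18:30:00', 'ddmmyyyy hh24:mi:ss')
                    and     to_date('06072015 19:30:00', 'ddmmyyyy hh24:mi:ss')
    group by sql_id
    order by 2 desc;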

In the next article I will explain how I identified what is causing these unusual SQL*Net break/reset to client wait events.


Degree of Parallelism is 16 because of table property


I have been pleasantly surprised by the following Note at the bottom of an execution plan coming from a 12.1.0.2.0 Oracle instance


SQL> select * from v$version;

BANNER                                                                               CON_ID
-------------------------------------------------------------------------------- ----------
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production              0
PL/SQL Release 12.1.0.2.0 - Production                                                    0
CORE    12.1.0.2.0      Production                                                        0
TNS for Linux: Version 12.1.0.2.0 - Production                                            0
NLSRTL Version 12.1.0.2.0 - Production                                                    0


SQL> create table t_par as select rownum n1, trunc((rownum-1)/3) n2, mod(rownum, 5) n3
    from dual
    connect by level<=1e6;

SQL> create index t_part_idx on t_par(n1);

Index created.

SQL> alter table t_par parallel 16;

Table altered.

SQL> select count(1) from t_par where n1> 1;

  COUNT(1)
----------
    999999

SQL> select * from table(dbms_xplan.display_cursor);
----------------------------------------------------------------------------------------------------------------
| Id  | Operation              | Name     | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |          |       |       |    48 (100)|          |        |      |            |
|   1 |  SORT AGGREGATE        |          |     1 |     5 |            |          |        |      |            |
|   2 |   PX COORDINATOR       |          |       |       |            |          |        |      |            |
|   3 |    PX SEND QC (RANDOM) | :TQ10000 |     1 |     5 |            |          |  Q1,00 | P->S | QC (RAND)  |
|   4 |     SORT AGGREGATE     |          |     1 |     5 |            |          |  Q1,00 | PCWP |            |
|   5 |      PX BLOCK ITERATOR |          |   999K|  4882K|    48   (3)| 00:00:01 |  Q1,00 | PCWC |            |
|*  6 |       TABLE ACCESS FULL| T_PAR    |   999K|  4882K|    48   (3)| 00:00:01 |  Q1,00 | PCWP |            |
----------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   6 - access(:Z>=:Z AND :Z<=:Z)
       filter("N1">1)

Note
-----
   - Degree of Parallelism is 16 because of table property

As you can see from the above Note, we can immediately tell that the optimizer decided to run the query in parallel because the T_PAR table has been decorated with a DOP of 16:

SQL> select table_name,degree
  2  from user_tables
  3  where table_name = 'T_PAR';

TABLE_NAME    DEGREE
------------ -------
T_PAR        16

A nice 12c add.

A couple of months ago a query running on 11.2.0.3, which used to run very quickly, suddenly started deviating dangerously from its habitual execution time. The end user told me that they hadn't changed anything and asked me to investigate the root cause of this performance degradation. The corresponding SQL real time monitoring report looked like this:

Global Stats
======================================================================================
| Elapsed |   Cpu   |    IO    | Concurrency | Buffer | Read | Read  | Write | Write |
| Time(s) | Time(s) | Waits(s) |  Waits(s)   |  Gets  | Reqs | Bytes | Reqs  | Bytes |
======================================================================================
|     799 |     443 |      356 |        0.01 |     3M | 398K |  11GB |  122K |  24GB |
======================================================================================

Parallel Execution Details (DOP=4 , Servers Allocated=8)
SQL Plan Monitoring Details (Plan Hash Value=637438362)
========================================================================================================================
| Id    |                 Operation                  |       Name        |  Rows   |Execs |   Rows   | Temp | Activity |
|       |                                            |                   | (Estim) |      | (Actual) |      |   (%)    |
========================================================================================================================
|     0 | SELECT STATEMENT                           |                   |         |    9 |        0 |      |     0.13 |
|     1 |   PX COORDINATOR                           |                   |         |    9 |          |      |          |
|     2 |    PX SEND QC (RANDOM)                     | :TQ10003          |     19M |    4 |          |      |          |
|     3 |     HASH JOIN RIGHT SEMI                   |                   |     19M |    4 |        0 |      |          |
|     4 |      PX RECEIVE                            |                   |      3M |    4 |     1853 |      |          |
|     5 |       PX SEND HASH                         | :TQ10002          |      3M |    4 |     1853 |      |          |
|     6 |        VIEW                                | VW_NSO_1          |      3M |    4 |     1853 |      |          |
|     7 |         FILTER                             |                   |         |    4 |     1853 |      |          |
|     8 |          NESTED LOOPS                      |                   |      3M |    4 |     1853 |      |          |
|     9 |           BUFFER SORT                      |                   |         |    4 |       38 |      |          |
|    10 |            PX RECEIVE                      |                   |         |    4 |       38 |      |          |
|    11 |             PX SEND ROUND-ROBIN            | :TQ10000          |         |    1 |       38 |      |          |
|    12 |              HASH JOIN                     |                   |   69556 |    1 |       38 |      |          |
|    13 |               INLIST ITERATOR              |                   |         |    1 |     6258 |      |          |
|    14 |                TABLE ACCESS BY INDEX ROWID | TAB_001X          |   69556 |  840 |     6258 |      |          |
|    15 |                 INDEX RANGE SCAN           | IDX_TAB_001X25    |   69556 |  840 |     6258 |      |          |
|    16 |               INDEX FAST FULL SCAN         | PK_TAB_00X13      |     18M |    1 |      19M |      |     0.27 |
|    17 |           INDEX RANGE SCAN                 | PK_IDX_MAIN_TAB   |      36 |   38 |     1853 |      |          |
| -> 18 |      BUFFER SORT                           |                   |         |    4 |        0 |  26G |    34.18 |
| -> 19 |       PX RECEIVE                           |                   |    648M |    4 |     566M |      |     4.14 |
| -> 20 |        PX SEND HASH                        | :TQ10001          |    648M |    1 |     566M |      |    13.89 |
| -> 21 |         TABLE ACCESS FULL                  | MAIN_TABLE_001    |    648M |    1 |     566M |      |    47.40 |
========================================================================================================================

The BUFFER SORT operation at line 18 was killing the performance of this query since it was buffering 566M rows.

Looking back at the previous execution plans shows that they were serial plans!!! What made this new plan run in parallel? I was practically sure where this was coming from. I know that this application rebuilds indexes from time to time. And I know that very often a parallel rebuild is used to accelerate the operation. But I also know that very often DBAs forget to set the indexes back to their default degree at the end of the rebuild process. Indeed the primary key index PK_IDX_MAIN_TAB was at a DOP of 4 while it shouldn't have been. Putting this index back to degree 1 restored the serial execution plan the underlying query used to follow in the past:

Global Stats
=================================================
| Elapsed |   Cpu   |  Other   | Fetch | Buffer |
| Time(s) | Time(s) | Waits(s) | Calls |  Gets  |
=================================================
|      43 |      43 |     0.02 |    11 |     4M |
=================================================

SQL Plan Monitoring Details (Plan Hash Value=1734192894)
============================================================================================
| Id |              Operation              |       Name        |  Rows   |Execs |   Rows   |
|    |                                     |                   | (Estim) |      | (Actual) |
============================================================================================
|  0 | SELECT STATEMENT                    |                   |         |    1 |      108 |
|  1 |   HASH JOIN RIGHT SEMI              |                   |     19M |    1 |      108 |
|  2 |    VIEW                             | VW_NSO_1          |    701K |    1 |      108 |
|  3 |     FILTER                          |                   |         |    1 |      108 |
|  4 |      NESTED LOOPS                   |                   |    701K |    1 |      108 |
|  5 |       HASH JOIN                     |                   |   19387 |    1 |        3 |
|  6 |        INLIST ITERATOR              |                   |         |    1 |        3 |
|  7 |         TABLE ACCESS BY INDEX ROWID | TAB_001X          |   19387 |  168 |        3 |
|  8 |          INDEX RANGE SCAN           | IDX_TAB_001X25    |   19387 |  168 |        3 |
|  9 |        INDEX FAST FULL SCAN         | PK_TAB_00X13      |     18M |    1 |      19M |
| 10 |       INDEX RANGE SCAN              | PK_IDX_MAIN_TAB   |      36 |    3 |      108 |
| 11 |    TABLE ACCESS FULL                | MAIN_TABLE_001    |    648M |    1 |     677M |
============================================================================================
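
As a side note to this index rebuild story, here is a small housekeeping sketch (the query is generic, the alter index statement reuses the index name from this case) to spot and reset indexes left with a non-default parallel degree after a rebuild:

SQL> select owner, index_name, degree
    from dba_indexes
    where trim(degree) not in ('1', 'DEFAULT');

SQL> alter index pk_idx_main_tab noparallel;  -- equivalent to setting it back to degree 1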

In this context of rebuilt indexes left at a DOP > 1 and this nice 12c Note about the reason for which Oracle decided to use a parallel run, I was curious to know whether the 12c Note would show the same information if the parallel plan was due to an index having a DOP > 1:

SQL> alter table t_par noparallel;

SQL> alter index T_PART_IDX parallel 16;

SQL> select count(1) from t_par where n1> 1;

  COUNT(1)
----------
    999999

SQL> select * from table(dbms_xplan.display_cursor);

SQL_ID  4s7n5z52gun33, child number 0
-------------------------------------
---------------------------------------------------------------------------------------------------------------------
| Id  | Operation                 | Name       | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
---------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |            |       |       |   610 (100)|          |        |      |            |
|   1 |  SORT AGGREGATE           |            |     1 |     5 |            |          |        |      |            |
|   2 |   PX COORDINATOR          |            |       |       |            |          |        |      |            |
|   3 |    PX SEND QC (RANDOM)    | :TQ10000   |     1 |     5 |            |          |  Q1,00 | P->S | QC (RAND)  |
|   4 |     SORT AGGREGATE        |            |     1 |     5 |            |          |  Q1,00 | PCWP |            |
|   5 |      PX BLOCK ITERATOR    |            |   999K|  4882K|   610   (1)| 00:00:01 |  Q1,00 | PCWC |            |
|*  6 |       INDEX FAST FULL SCAN| T_PART_IDX |   999K|  4882K|   610   (1)| 00:00:01 |  Q1,00 | PCWP |            |
---------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   6 - access(:Z>=:Z AND :Z<=:Z)
       filter("N1">1)

Unfortunately there is no Note indicating that the above parallel execution plan is due to the parallel degree of the index T_PART_IDX.


Flash back causing library cache: mutex X


Recently one of our applications suffered from a severe performance issue. It is an application running on an 11.2.0.4.0 database used to validate a pre-production release. This performance issue delayed the test campaign and the validation process for more than 3 days. The ASH data taken during the degraded performance period shows this:

SQL> select event, count(1)
    from gv$active_session_history
    where sample_time between to_date('15072015 16:00:00', 'ddmmyyyy hh24:mi:ss')
                      and     to_date('15072015 16:30:00', 'ddmmyyyy hh24:mi:ss')
    group by event
    order by 2 desc;

EVENT                                COUNT(1)
----------------------------------- ----------
library cache: mutex X                    3928
kksfbc child completion                    655
cursor: pin S wait on X                    580
PX Deq: Slave Session Stats                278
                                           136
db file sequential read                    112
cursor: pin S                               35
null event                                  26
latch: shared pool                          15
cursor: mutex S                             13
library cache lock                          11
read by other session                       10
log file parallel write                      5
PX Deq: Signal ACK EXT                       3
os thread startup                            3
log file sync                                2
latch free                                   1
db file parallel write                       1
SQL*Net more data from client                1
enq: PS - contention                         1
cursor: mutex X                              1
direct path read                             1
control file sequential read                 1
CSS operation: action                        1

As you can notice the dominant wait event is:

EVENT                          COUNT(1)
------------------------------ -------
library cache: mutex X         3928

A library cache: mutex X wait event is a concurrency wait event that belongs to a family of 6 mutex-related wait events:

  • cursor: pin S
  • cursor: pin X
  • cursor: pin S wait on X
  • cursor: mutex S
  • cursor: mutex X
  • library cache: mutex X

Mutexes are similar to locks except that they protect objects in shared memory rather than rows in tables and indexes. Whenever a session wants to read or modify a piece of library cache shared memory it needs to pin the corresponding object (generally a cursor) and acquire a mutex on it. If another session simultaneously wants to touch the same piece of memory it will also try to acquire a mutex on it. That session might then wait on one of the above library or cursor mutex wait events because another session preceded it and has not yet released the mutex.
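
For the library cache: mutex X event in particular, the first wait parameter (p1) holds the hash value (the so-called idn) of the library cache object being contended for, so it can be mapped back to the object itself. A minimal sketch (the bind in the second query being one of the p1 values returned by the first one):

SQL> select p1, count(1)
    from gv$active_session_history
    where event = 'library cache: mutex X'
    group by p1
    order by 2 desc;

SQL> select namespace, type, name
    from v$db_object_cache
    where hash_value = :p1;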

So, coming back to my actual case, what went wrong to the point that library cache: mutex X waits made the database unusable?

SQL> select
       sql_id
      ,sql_child_number
      ,session_id
      ,in_parse
      ,in_sql_execution
    from
      gv$active_session_history
    where sample_time between to_date('15072015 16:00:00', 'ddmmyyyy hh24:mi:ss')
                      and     to_date('15072015 16:30:00', 'ddmmyyyy hh24:mi:ss')
    and event = 'library cache: mutex X'
    order by sql_id;

SQL_ID        SQL_CHILD_NUMBER SESSION_ID IN_PARSE   IN_SQL_EXECUTION
------------- ---------------- ---------- ---------- ----------------
20f2kut7fg4g0               -1          8 Y          N
20f2kut7fg4g0               -1         20 Y          N
20f2kut7fg4g0                1         24 N          Y
20f2kut7fg4g0               -1         40 Y          N
20f2kut7fg4g0               -1         60 Y          N
20f2kut7fg4g0               -1         88 Y          N
20f2kut7fg4g0                1         89 N          Y
20f2kut7fg4g0                1         92 N          Y
20f2kut7fg4g0               -1        105 Y          N
20f2kut7fg4g0               -1        106 Y          N
20f2kut7fg4g0                1        109 N          Y
20f2kut7fg4g0               -1        124 Y          N
20f2kut7fg4g0               -1        128 Y          N
20f2kut7fg4g0               -1        143 Y          N
20f2kut7fg4g0               -1        157 Y          N
20f2kut7fg4g0                1        159 N          Y
20f2kut7fg4g0               -1        160 Y          N
20f2kut7fg4g0               -1        161 Y          N
20f2kut7fg4g0                1        172 N          Y
20f2kut7fg4g0                1        178 N          Y
20f2kut7fg4g0               -1        191 Y          N
20f2kut7fg4g0               -1        192 Y          N
20f2kut7fg4g0               -1        194 Y          N
20f2kut7fg4g0                1        209 N          Y
20f2kut7fg4g0               -1        223 Y          N
20f2kut7fg4g0                1        229 N          Y
20f2kut7fg4g0               -1        241 Y          N
20f2kut7fg4g0               -1        246 Y          N
20f2kut7fg4g0               -1        258 Y          N
20f2kut7fg4g0                1        259 N          Y
20f2kut7fg4g0                1        280 N          Y
20f2kut7fg4g0                1        294 N          Y
20f2kut7fg4g0               -1        309 Y          N
20f2kut7fg4g0               -1        310 Y          N
20f2kut7fg4g0               -1        328 Y          N
20f2kut7fg4g0                1        348 N          Y
20f2kut7fg4g0               -1        382 Y          N
20f2kut7fg4g0               -1        413 Y          N
20f2kut7fg4g0               -1        415 Y          N
20f2kut7fg4g0                1        428 N          Y
20f2kut7fg4g0               -1        449 Y          N
20f2kut7fg4g0               -1        450 Y          N
20f2kut7fg4g0               -1        462 Y          N
20f2kut7fg4g0                1        467 N          Y
20f2kut7fg4g0               -1        480 Y          N
20f2kut7fg4g0               -1        484 Y          N
20f2kut7fg4g0                1        516 N          Y
20f2kut7fg4g0               -1        533 Y          N
20f2kut7fg4g0                1        535 N          Y
20f2kut7fg4g0               -1        546 Y          N
20f2kut7fg4g0               -1        565 Y          N
20f2kut7fg4g0                1        568 N          Y
20f2kut7fg4g0               -1        584 Y          N
20f2kut7fg4g0               -1        585 Y          N
20f2kut7fg4g0               -1        601 Y          N
20f2kut7fg4g0               -1        602 Y          N
20f2kut7fg4g0               -1        615 Y          N
20f2kut7fg4g0               -1        619 Y          N
20f2kut7fg4g0               -1        635 Y          N
20f2kut7fg4g0               -1        652 Y          N
20f2kut7fg4g0               -1        667 Y          N
20f2kut7fg4g0               -1        668 Y          N
20f2kut7fg4g0               -1        687 Y          N
20f2kut7fg4g0               -1        705 Y          N
20f2kut7fg4g0               -1        717 Y          N
20f2kut7fg4g0               -1        721 Y          N
20f2kut7fg4g0               -1        733 Y          N
20f2kut7fg4g0               -1        735 Y          N
20f2kut7fg4g0               -1        753 Y          N
20f2kut7fg4g0               -1        754 Y          N
20f2kut7fg4g0                1        770 N          Y
20f2kut7fg4g0               -1        773 Y          N
20f2kut7fg4g0               -1        785 Y          N
20f2kut7fg4g0               -1        786 Y          N
20f2kut7fg4g0               -1        804 Y          N

75 rows selected.

I have limited the output to just one sql_id (20f2kut7fg4g0) in order to keep the explanation clear and simple.

What does this particular sql_id, executed by 75 different sessions that are sometimes parsing and sometimes executing, represent?

SQL> with got_my_sql_id
    as ( select sql_id, count(1)
    from gv$active_session_history
    where sample_time between to_date('16072015 09:30:00', 'ddmmyyyy hh24:mi:ss')
                      and     to_date('16072015 10:30:00', 'ddmmyyyy hh24:mi:ss')
    and event  = 'library cache: mutex X'
    group by sql_id)
    select distinct sql_id, sql_text
    from v$sql b
   where exists (select null
                 from got_my_sql_id a
                 where a.sql_id = b.sql_id)
   order by sql_id;

SQL_ID        SQL_TEXT
------------ ----------------------------------------------------------------------
20f2kut7fg4g0 /* Flashback Table */ INSERT /*+ PARALLEL(S, DEFAULT) PARALLEL(T,
                 DEFAULT) */ INTO "DEV_ZXX"."CLOSED_DAY" SELECT
              /*+ USE_NL(S) ORDERED PARALLEL(S, DEFAULT) PARALLEL(T, DEFAULT) */
              S.* FROM SYS_TEMP_FBT T , "DEV_ZXX"."CLOSED_DAY"
              as of SCN :1 S WHERE T.rid = S.
              rowid and T.action = 'I' and T.object# = :2

The above piece of SQL code is generated by Oracle behind the scenes when flashing back the content of a given table. And this is exactly what this client was doing: at the end of their pre-production test campaign they flash back a certain number of tables to the data they contained at the beginning of the test. And since the generated code uses a parallel run with the default degree, it produced this kind of monitored execution plan:


Parallel Execution Details (DOP=96 , Servers Allocated=96)

SQL Plan Monitoring Details (Plan Hash Value=4258977226)
===============================================================================
| Id |        Operation        |      Name       |  Rows   | Execs |   Rows   |
|    |                         |                 | (Estim) |       | (Actual) |
===============================================================================
|  0 | INSERT STATEMENT        |                 |         |     1 |          |
|  1 |   LOAD AS SELECT        |                 |         |     1 |          |
|  2 |    PX COORDINATOR       |                 |         |    82 |          |
|  3 |     PX SEND QC (RANDOM) | :TQ10000        |     322 |    81 |          |
|  4 |      PX BLOCK ITERATOR  |                 |     322 |    81 |          |
|  5 |       TABLE ACCESS FULL | TABLE_RULE_SUP  |     322 |     4 |          |
===============================================================================
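
For reference, here is a hedged reconstruction of the kind of command that generates such recursive SQL (the table name is taken from the recursive statement above; the SCN value is purely illustrative):

SQL> alter table dev_zxx.closed_day enable row movement;

SQL> flashback table dev_zxx.closed_day to scn 123456789;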

For every flashed back table Oracle started 96 parallel servers (96 sessions) in order to do a simple insert statement, causing the observed library cache: mutex X wait events. The DOP of 96 is the maximum DOP, which in fact represents the default DOP determined by the following simplified formula:

DOP = PARALLEL_THREADS_PER_CPU x CPU_COUNT
DOP = 2 x 48 = 96
SQL> show parameter parallel

NAME                               TYPE        VALUE
---------------------------------- ----------- ----------
fast_start_parallel_rollback       string      LOW
parallel_adaptive_multi_user       boolean     TRUE
parallel_automatic_tuning          boolean     FALSE
parallel_degree_limit              string      CPU
parallel_degree_policy             string      MANUAL
parallel_execution_message_size    integer     16384
parallel_force_local               boolean     FALSE
parallel_instance_group            string
parallel_io_cap_enabled            boolean     FALSE
parallel_max_servers               integer     100
parallel_min_percent               integer     0
parallel_min_servers               integer     0
parallel_min_time_threshold        string      AUTO
parallel_server                    boolean     FALSE
parallel_server_instances          integer     1
parallel_servers_target            integer     100
_parallel_syspls_obey_force        boolean     TRUE
parallel_threads_per_cpu           integer     2
recovery_parallelism               integer     0
SQL> show parameter cpu

NAME                           TYPE        VALUE
------------------------------ ----------- -----
cpu_count                       integer     48
parallel_threads_per_cpu        integer     2
resource_manager_cpu_allocation integer     48

Having no possibility to hint the internal flashback code generated by Oracle so that it would not execute in parallel, all I was left with was to pre-empt Oracle from starting a huge number of parallel processes by limiting the parallel_max_servers parameter to 8, so that the maximum DOP would be limited to 8 whatever the cpu_count is.
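
In practice this boiled down to the following simple command (shown here as a sketch; on a RAC system it has to be applied to every instance or with sid='*' in the spfile):

SQL> alter system set parallel_max_servers = 8;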

Once this was done I observed the following new situation for one flashed back sql_id (a5u912v53t11t):


Global Information
------------------------------
 Status              :  DONE
 Instance ID         :  1
 Session             :  XXXXX (172:40851)
 SQL ID              :  a5u912v53t11t
 SQL Execution ID    :  16777236
 Execution Started   :  07/16/2015 11:21:55
 First Refresh Time  :  07/16/2015 11:21:55
 Last Refresh Time   :  07/16/2015 11:21:55
 Duration            :  .011388s
 Module/Action       :  JDBC Thin Client/-
 Service             :  SYS$USERS
 Program             :  JDBC Thin Client
 DOP Downgrade       :  92%

Global Stats
=======================================================
| Elapsed |   Cpu   | Concurrency |  Other   | Buffer |
| Time(s) | Time(s) |  Waits(s)   | Waits(s) |  Gets  |
=======================================================
|    0.05 |    0.00 |        0.04 |     0.00 |     19 |
=======================================================

Parallel Execution Details (DOP=8 , Servers Requested=96 , Servers Allocated=8)
==============================================================================
|      Name      | Type  | Server# | Elapsed |   Cpu   | Concurrency |Buffer |
|                |       |         | Time(s) | Time(s) |  Waits(s)   | Gets  |
==============================================================================
| PX Coordinator | QC    |         |    0.00 |    0.00 |             |     4 |
| p000           | Set 1 |       1 |    0.01 |         |        0.01 |     3 |
| p001           | Set 1 |       2 |    0.01 |         |        0.01 |     3 |
| p002           | Set 1 |       3 |    0.00 |         |        0.00 |     3 |
| p003           | Set 1 |       4 |    0.00 |         |        0.00 |     3 |
| p004           | Set 1 |       5 |    0.00 |    0.00 |        0.00 |     3 |
| p005           | Set 1 |       6 |    0.01 |         |        0.01 |       |
| p006           | Set 1 |       7 |    0.00 |         |        0.00 |       |
| p007           | Set 1 |       8 |    0.01 |         |        0.01 |       |
==============================================================================

SQL Plan Monitoring Details (Plan Hash Value=96405358)
============================================================
| Id |        Operation        |   Name   |  Rows   | Cost |
|    |                         |          | (Estim) |      |
============================================================
|  0 | INSERT STATEMENT        |          |         |      |
|  1 |   LOAD AS SELECT        |          |         |      |
|  2 |    PX COORDINATOR       |          |         |      |
|  3 |     PX SEND QC (RANDOM) | :TQ10000 |     409 |    2 |
|  4 |      PX BLOCK ITERATOR  |          |     409 |    2 |
|  5 |       TABLE ACCESS FULL | TABLE_CLS|     409 |    2 |
============================================================

Notice how Oracle serviced the insert statement with 8 parallel servers instead of the 96 requested. This is a clear demonstration of how to bound the default DOP:

Parallel Execution Details (DOP=8, Servers Requested=96, Servers Allocated=8)

Unfortunately, despite this implicit parallel run limitation, the application was still suffering from the same library cache symptoms (less than before though), as shown below:

SQL> select event, count(1)
    from gv$active_session_history
    where sample_time between to_date('16072015 10:57:00', 'ddmmyyyy hh24:mi:ss')
                      and     to_date('16072015 11:30:00', 'ddmmyyyy hh24:mi:ss')
    group by event
    order by 2 desc;

EVENT                                  COUNT(1)
------------------------------------ ----------
library cache: mutex X                      518
                                            382
db file sequential read                     269
read by other session                        42
kksfbc child completion                      37
null event                                   31
log file parallel write                      18
cursor: pin S wait on X                      12
latch: shared pool                            7
cursor: pin S                                 7
log file sync                                 5
latch free                                    5
enq: RO - fast object reuse                   3
SQL*Net more data from client                 2
db file parallel write                        2
enq: CR - block range reuse ckpt              1
os thread startup                             1

SQL> select sql_id, session_id, in_parse, in_sql_execution
    from gv$active_session_history
    where sample_time between to_date('16072015 10:57:00', 'ddmmyyyy hh24:mi:ss')
                      and     to_date('16072015 11:30:00', 'ddmmyyyy hh24:mi:ss')
    and event = 'library cache: mutex X'
    order by sql_id;

SQL_ID        SESSION_ID IN_PARSE IN_SQL_EXEC
------------- ---------- -------- -----------
a5u912v53t11t        516 Y 	 N
a5u912v53t11t        494 Y 	 N
a5u912v53t11t        343 Y 	 N
a5u912v53t11t        482 Y 	 N

Finally we agreed with the client to disable parallelism (by setting the parallel_max_servers parameter to 1) so that the flashback treatment would run serially:

SQL> show parameter parallel_max_servers

NAME                           TYPE        VALUE
------------------------------ ----------- -----
parallel_max_servers           integer     1

Once this had been done, the test campaign finally started to perform very quickly, with the following picture from ASH:

SQL> select event, count(1)
    from gv$active_session_history
    where sample_time between to_date('16072015 14:15:00', 'ddmmyyyy hh24:mi:ss')
                      and     to_date('16072015 15:30:00', 'ddmmyyyy hh24:mi:ss')
    group by event
    order by 2 desc;

EVENT                                  COUNT(1)
------------------------------------- ----------
                                             966
db file sequential read                      375
db file scattered read                        49
log file parallel write                       46
log file sync                                 22
db file parallel write                        13
null event                                     8
local write wait                               7
SQL*Net more data from client                  5
os thread startup                              3
reliable message                               3
enq: PS - contention                           3
enq: RO - fast object reuse                    3
cursor: pin S wait on X                        1
direct path read                               1
Disk file operations I/O                       1
enq: CR - block range reuse ckpt               1
enq: TX - row lock contention                  1

The flashback treatment completely ceased to run in parallel and the test campaign started to perform quickly again.

This is not an invitation to go for a drastic and brutal workaround to reduce the effect of many sessions woken up by a very high degree of parallelism, itself due to the default maximum DOP. It is rather a demonstration of:

  • how a high degree of parallelism can affect the locking in the library cache
  • how the parallel_max_servers parameter can bound the DOP of your query

Adaptive Cursor Sharing triggering mechanism


Inspired by Dominic Brooks’ recent post on SQL Plan Management choices, I decided to do the same kind of work and share my thoughts on the Adaptive and Extended Cursor Sharing triggering mechanism:

ACS triggering diagram

Once a cursor is bind aware and thereby subject to an eventual plan re-optimization at each execution, keep a careful eye on the number of child cursors the Extended Cursor Sharing layer is going to produce.
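
A minimal sketch of the kind of monitoring I have in mind (the sql_id is a placeholder for the bind aware parent cursor you are watching): count how many child cursors that parent keeps accumulating and how many of them are still shareable:

SQL> select
       sql_id
      ,count(*)                              child_cursors
      ,sum(decode(is_bind_aware, 'Y', 1, 0)) bind_aware
      ,sum(decode(is_shareable , 'Y', 1, 0)) shareable
    from v$sql
    where sql_id = '&sql_id'
    group by sql_id;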



Cardinality Feedback: a practical case


Here is an interesting case of cardinality feedback collected from a running 11.2.0.3 system. A simple query against a single table has a perfect first-execution response time with, to the human eye, a quite acceptable difference between Oracle's cardinality estimates and the actual rows, as shown below:

SELECT
   tr_id
FROM
    t1 t1
WHERE
     t1.t1_col_name= 'GroupID'
AND  t1.t1_col_value= '6276931'
AND EXISTS(SELECT
               1
            FROM
                t1 t2
            WHERE t1.tr_id   = t2.tr_id
            AND   t2.t1_col_name= 'TrRangeOrder'
            AND   t2.t1_col_value= 'TrOrderPlace'
           );

SQL_ID  8b3tv5uh8ckfb, child number 0
-------------------------------------

Plan hash value: 1066392926
--------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name                    | Starts | E-Rows | A-Rows |   A-Time   |
--------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                         |      1 |        |      1 |00:00:00.14 |
|   1 |  NESTED LOOPS SEMI           |                         |      1 |      1 |      1 |00:00:00.14 |
|   2 |   TABLE ACCESS BY INDEX ROWID| T1                      |      1 |      1 |      6 |00:00:00.07 |
|*  3 |    INDEX RANGE SCAN          | IDX_T1_NAME_VALUE       |      1 |      1 |      6 |00:00:00.03 |
|*  4 |   TABLE ACCESS BY INDEX ROWID| T1                      |      6 |      1 |      1 |00:00:00.07 |
|*  5 |    INDEX UNIQUE SCAN         | T1_PK                   |      6 |      1 |      6 |00:00:00.07 |
--------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("T1"."T1_COL_NAME"='GroupID' AND "T1"."T1_COL_VALUE"='6276931')
   4 - filter("T2"."T1_COL_VALUE"='TrOrderPlace')
   5 - access("T1"."TR_ID"="T2"."TR_ID" AND
               "T2"."T1_COL_NAME"='TrRangeOrder')

And here is the second, dramatic, execution plan and response time, due this time to the cardinality feedback re-optimisation:


SQL_ID  8b3tv5uh8ckfb, child number 1
-------------------------------------
Plan hash value: 3786385867
----------------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name                    | Starts | E-Rows | A-Rows |   A-Time   |
----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |                         |      1 |        |      1 |00:10:40.14 |
|   1 |  NESTED LOOPS                  |                         |      1 |        |      1 |00:10:40.14 |
|   2 |   NESTED LOOPS                 |                         |      1 |      1 |    787K|00:09:31.00 |
|   3 |    SORT UNIQUE                 |                         |      1 |      1 |    787K|00:02:44.83 |
|   4 |     TABLE ACCESS BY INDEX ROWID| T1                      |      1 |      1 |    787K|00:02:41.58 |
|*  5 |      INDEX RANGE SCAN          | IDX_T1_NAME_VALUE       |      1 |      1 |    787K|00:00:36.46 |
|*  6 |    INDEX UNIQUE SCAN           | T1_PK                   |    787K|      1 |    787K|00:06:45.25 |
|*  7 |   TABLE ACCESS BY INDEX ROWID  | T1                      |    787K|      1 |      1 |00:05:00.24 |
----------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   5 - access("T2"."T1_COL_NAME"='TrRangeOrder' AND "T2"."T1_COL_VALUE"='TrOrderPlace')
   6 - access("T1"."TR_ID"="T2"."TR_ID" AND "T1"."T1_COL_NAME"='GroupID')
   7 - filter("T1"."T1_COL_VALUE"='6276931')

Note
-----
   - cardinality feedback used for this statement

There is no really noticeable difference between the estimated and actual rows in the first run of the query (E-Rows = 1 versus A-Rows = 6) that would imply a need for re-optimisation. Yet Oracle did re-optimize and marked child cursor n°0 as a candidate for cardinality feedback:

SQL> select
       sql_id
      ,child_number
      ,use_feedback_stats
    from
      v$sql_shared_cursor
    where
      sql_id = '8b3tv5uh8ckfb';

SQL_ID        CHILD_NUMBER U
------------- ------------ -
8b3tv5uh8ckfb            0 Y

The bad news with this Oracle decision, however, is that we went from a quasi-instantaneous response time to a catastrophic 10 minutes. In the first plan the always suspicious estimated cardinality of ‘’1’’ is not significantly far from the actual rows (6), so why has Oracle decided to re-optimize the first cursor? It might be that when Oracle rounds its cardinality estimation up to 1 for a cursor being monitored for cardinality feedback, it flags that cursor somewhere as subject to re-optimization at its next execution, whatever the actual rows turn out to be (close to 1 or not).

Fortunately, this second execution has also been marked for re-optimisation:

SQL> select
       sql_id
      ,child_number
      ,use_feedback_stats
    from
      v$sql_shared_cursor
    where
      sql_id = '8b3tv5uh8ckfb';

SQL_ID        CHILD_NUMBER U
------------- ------------ -
8b3tv5uh8ckfb            0 Y
8b3tv5uh8ckfb            1 Y

And the third execution of the query produces the following interesting execution plan

SQL_ID  8b3tv5uh8ckfb, child number 2
-------------------------------------

Plan hash value: 1066392926
--------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name                    | Starts | E-Rows | A-Rows |   A-Time   |
--------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                         |      1 |        |      1 |00:00:00.01 |
|   1 |  NESTED LOOPS SEMI           |                         |      1 |      1 |      1 |00:00:00.01 |
|   2 |   TABLE ACCESS BY INDEX ROWID| T1                      |      1 |      6 |      6 |00:00:00.01 |
|*  3 |    INDEX RANGE SCAN          | IDX_T1_NAME_VALUE       |      1 |      6 |      6 |00:00:00.01 |
|*  4 |   TABLE ACCESS BY INDEX ROWID| T1                      |      6 |      1 |      1 |00:00:00.01 |
|*  5 |    INDEX UNIQUE SCAN         | T1_PK                   |      6 |      1 |      6 |00:00:00.01 |
--------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("T1"."T1_COL_NAME"='GroupID' AND "T1"."T1_COL_VALUE"='6276931')
   4 - filter("T2"."T1_COL_VALUE"='TrOrderPlace')
   5 - access("T1"."TR_ID"="T2"."TR_ID" AND
              "T2"."T1_COL_NAME"='TrRangeOrder')

Note
-----
   - cardinality feedback used for this statement

Oracle is back to its first execution plan. The new estimations coincide with the actuals, so Oracle decided to stop monitoring this cursor with cardinality feedback, as shown below:

SQL> select
       sql_id
      ,child_number
      ,use_feedback_stats
    from
      v$sql_shared_cursor
    where
      sql_id = '8b3tv5uh8ckfb';

SQL_ID        CHILD_NUMBER U
------------- ------------ -
8b3tv5uh8ckfb            0 Y
8b3tv5uh8ckfb            1 Y
8b3tv5uh8ckfb            2 N

Several questions come to my mind at this stage of the investigation:

  1.  Under which circumstances does Oracle mark a cursor for cardinality feedback optimisation?
  2. How does Oracle decide that E-Rows are significantly different from A-Rows so that a cursor re-optimization is done? In other words, is E-Rows = 1 significantly different from A-Rows = 6? Or does that suspicious cardinality of 1 play a part in Oracle's decision to re-optimize a cursor monitored with cardinality feedback?

Let’s try to answer the first question. There is only one table involved in this query, with two conjunctive predicates. The two predicate columns have the following statistics:

SQL> select
        column_name
       ,num_distinct
       ,density
       ,histogram
     from
	    all_tab_col_statistics
     where
        table_name = 'T1'
     and
       column_name in ('T1_COL_NAME','T1_COL_VALUE');

COLUMN_NAME     NUM_DISTINCT    DENSITY HISTOGRAM
--------------- ------------ ---------- ---------------
T1_COL_NAME           103     4,9781E-09 FREQUENCY
T1_COL_VALUE      14833664   ,000993049  HEIGHT BALANCED

The presence of histograms on these two columns, particularly the HEIGHT BALANCED one, contributes strongly to Oracle's decision to monitor the cursor for cardinality feedback. To be sure of it I decided to get rid of the histograms on both columns and run the query again:
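
Here is a minimal sketch of how those histograms might be removed (assuming default statistics preferences; the exact command run on the client system may have differed):

SQL> begin
       dbms_stats.gather_table_stats
            (ownname       => user
            ,tabname       => 'T1'
            ,method_opt    => 'for columns T1_COL_NAME size 1, T1_COL_VALUE size 1'
            ,cascade       => true
            ,no_invalidate => false
            );
     end;
     /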

SQL> select
        column_name
       ,num_distinct
       ,density
       ,histogram
     from
	    all_tab_col_statistics
     where
        table_name = 'T1'
     and
       column_name in ('T1_COL_NAME','T1_COL_VALUE');

COLUMN_NAME     NUM_DISTINCT    DENSITY HISTOGRAM
--------------- ------------ ---------- ---------
T1_COL_NAME           103    ,009708738 NONE
T1_COL_VALUE      15477760   6,4609E-08 NONE

The new cursor is no longer monitored by cardinality feedback, as shown below:

SQL_ID  fakc7vfbu1mam, child number 0
-------------------------------------

Plan hash value: 739349168
--------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name                    | Starts | E-Rows | A-Rows |   A-Time   |
--------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                         |      1 |        |      1 |00:02:00.68 |
|*  1 |  HASH JOIN SEMI              |                         |      1 |      6 |      1 |00:02:00.68 |
|   2 |   TABLE ACCESS BY INDEX ROWID| T1                      |      1 |      6 |      6 |00:00:00.01 |
|*  3 |    INDEX RANGE SCAN          | IDX_T1_NAME_VALUE       |      1 |      6 |      6 |00:00:00.01 |
|   4 |   TABLE ACCESS BY INDEX ROWID| T1                      |      1 |      6 |    787K|00:02:00.14 |
|*  5 |    INDEX RANGE SCAN          | IDX_T1_NAME_VALUE       |      1 |      6 |    787K|00:00:12.36 |
--------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T1"."TR_ID"="T2"."TR_ID")
   3 - access("T1"."T1_COL_NAME"='GroupID' AND "T1"."T1_COL_VALUE"='6276931')
   5 - access("T2"."T1_COL_NAME"='TrRangeOrder' AND "T2"."T1_COL_VALUE"='TrOrderPlace')

   SQL> select
         sql_id
        ,child_number
        ,use_feedback_stats
    from
       v$sql_shared_cursor
    where
        sql_id = 'fakc7vfbu1mam';

SQL_ID        CHILD_NUMBER U
------------- ------------ -
fakc7vfbu1mam            0 N --> cursor not re-optimisable

Without histograms on the two columns Oracle did not monitor the query for cardinality feedback. Unfortunately, getting rid of the histograms was not an option accepted by the client, nor was changing this packaged query to stop the optimizer from unnesting the EXISTS subquery, since the parent query always generates only a couple of rows that would not hurt performance when filtered by the EXISTS subquery. Attaching a SQL Profile was also discarded because several copies of the same query exist in the packaged application, which would have necessitated a couple of extra SQL Profiles.

The last option left in my hands was to collect extended statistics (a column group) so that Oracle could compute accurate estimations and henceforth stop using cardinality feedback:

SQL> SELECT
       dbms_stats.create_extended_stats
       (ownname   => user
       ,tabname   => 'T1'
       ,extension => '(T1_COL_NAME,T1_COL_VALUE)'
      )
  FROM dual;

DBMS_STATS.CREATE_EXTENDED_STATS(
---------------------------------
SYS_STUE3EBVNLB6M1SYS3A07$LD52

SQL> begin
      dbms_stats.gather_table_stats
            (user
           ,'T1'
           ,method_opt    => 'for columns SYS_STUE3EBVNLB6M1SYS3A07$LD52 size skewonly'
           ,cascade       => true
           ,no_invalidate => false
            );
    end;
    /

SQL> select
       column_name
      ,num_distinct
      ,density
      ,histogram
    from all_tab_col_statistics
    where
        table_name = 'T1'
    and column_name in ('T1_COL_NAME','T1_COL_VALUE', 'SYS_STUE3EBVNLB6M1SYS3A07$LD52');

COLUMN_NAME                    NUM_DISTINCT    DENSITY HISTOGRAM
------------------------------ ------------ ---------- ---------------
SYS_STUE3EBVNLB6M1SYS3A07$LD52     18057216 ,000778816 HEIGHT BALANCED
T1_COL_NAME                          103    4,9781E-09 FREQUENCY
T1_COL_VALUE                     14833664   ,000993049 HEIGHT BALANCED


SQL_ID  dn6p58b9b6348, child number 0
-------------------------------------
Plan hash value: 1066392926
--------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name                    | Starts | E-Rows | A-Rows |   A-Time   |
--------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                         |      1 |        |      1 |00:00:00.01 |
|   1 |  NESTED LOOPS SEMI           |                         |      1 |      3 |      1 |00:00:00.01 |
|   2 |   TABLE ACCESS BY INDEX ROWID| T1                      |      1 |      3 |      6 |00:00:00.01 |
|*  3 |    INDEX RANGE SCAN          | IDX_T1_NAME_VALUE       |      1 |      3 |      6 |00:00:00.01 |
|*  4 |   TABLE ACCESS BY INDEX ROWID| T1                      |      6 |    832K|      1 |00:00:00.01 |
|*  5 |    INDEX UNIQUE SCAN         | T1_PK                   |      6 |      1 |      6 |00:00:00.01 |
--------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("T1"."T1_COL_NAME"='GroupID' AND "T1"."T1_COL_VALUE"='6276931')
   4 - filter("T2"."T1_COL_VALUE"='TrOrderPlace')
   5 - access("T1"."TR_ID"="T2"."TR_ID" AND
              "T2"."T1_COL_NAME"='TrRangeOrder')

SQL> select
       sql_id
      ,child_number
      ,use_feedback_stats
    from
      v$sql_shared_cursor
    where
      sql_id = 'dn6p58b9b6348';

SQL_ID        CHILD_NUMBER U
------------- ------------ -
dn6p58b9b6348            0 N

This time, with E-Rows = 3 and A-Rows = 6, Oracle decided that there is no significant difference between the cardinality estimates and the actual rows, so the cursor is no longer subject to cardinality feedback optimization.

You might have noticed that I forced the extended statistics column to have a histogram; otherwise cardinality feedback would kick in again. In fact I have conducted several experiments to see when cardinality feedback occurs and when it does not, depending on the existence or absence of the column group extension, the type of statistics on it, and the statistics gathered on the two underlying predicate columns; the results are summarised in the matrix below:
(cardinality feedback experiments matrix)

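As a side note, the column group extensions already present on a table can be listed from the data dictionary, which is handy when repeating this kind of experiment (a simple sketch):

SQL> select
        extension_name
       ,extension
     from
        user_stat_extensions
     where
        table_name = 'T1';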

CBO decision: unique or non-unique index?


I have been asked to look at one of those particularly frustrating situations that only a running system can produce. It is an update of a single table using the complete set of its primary key columns to locate and update a unique row. The update looks like this:

UPDATE T1
SET
  {list of columns}
WHERE
    T1_DATE    = :B9
AND T1_I_E_ID  = :B8
AND T1_TYPE    = :B7
AND DATE_TYPE  = :B6
AND T1_AG_ID   = :B5
AND T1_ACC_ID  = :B4
AND T1_SEC_ID  = :B3
AND T1_B_ID    = :B2
AND T1_FG_ID   = :B1;

The 9 columns in the above where clause represent the primary key of the T1 table. You might be surprised to learn that this update didn't use the primary key index and preferred instead a range scan of an existing 3-column index plus a table access by index rowid to locate and update a unique row:

----------------------------------------------------------------------------
| Id  | Operation                    | Name   | Rows  | Bytes | Cost (%CPU)|
----------------------------------------------------------------------------
|   0 | UPDATE STATEMENT             |        |       |       |     1 (100)|
|   1 |  UPDATE                      | T1     |       |       |            |
|   2 |   TABLE ACCESS BY INDEX ROWID| T1     |     1 |   122 |     1   (0)|
|   3 |    INDEX RANGE SCAN          | IDX_T1 |     1 |       |     1   (0)|
----------------------------------------------------------------------------
   2 - filter("T1_TYPE"=:B7 AND "DATE_TYPE"=:B6 AND "T1_I_E_ID"=:B8 AND
              "T1_AG_ID"=TO_NUMBER(:B5) AND "T1_DATE"=TO_TIMESTAMP(:B9) AND
              "T1_FG_ID"=TO_NUMBER(:B1))
   3 - access("T1_SEC_ID"=TO_NUMBER(:B3) AND "T1_B_ID"=TO_NUMBER(:B2) AND
              "T1_ACC_ID"=TO_NUMBER(:B4))

The same update, when hinted with the primary key index, uses the following much more desirable execution plan:

------------------------------------------------------------------
| Id  | Operation          | Name   | Rows  | Bytes | Cost (%CPU)|
------------------------------------------------------------------
|   0 | UPDATE STATEMENT   |        |     1 |   126 |     1   (0)|
|   1 |  UPDATE            | T1     |       |       |            |
|*  2 |   INDEX UNIQUE SCAN| PK_T19 |     1 |   126 |     1   (0)|
------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("T1_I_E_ID"=:B8 AND "T1_TYPE"=:B7 AND "DATE_TYPE"=:B6
              AND "T1_AG_ID"=TO_NUMBER(:B5) AND "T1_ACC_ID"=TO_NUMBER(:B4) AND
              "T1_SEC_ID"=TO_NUMBER(:B3) AND "T1_B_ID"=TO_NUMBER(:B2) AND
              "T1_FG_ID"=TO_NUMBER(:B1) AND "T1_DATE"=TO_TIMESTAMP(:B9))

How does Oracle manage to get the same cost (Cost = 1) for two completely different indexes, one with 9 columns and one (a subset of the first) with only 3 columns (not necessarily starting with the same columns)?

So why has Oracle not selected the unique primary key index?

First, here are the available statistics on the primary key columns:

SQL> select
      column_name
     ,num_distinct
     ,num_nulls
     ,histogram
    from
      all_tab_col_statistics
    where
     table_name = 'T1'
   and
    column_name in ('T1_DATE'
                   ,'T1_I_E_ID'
                   ,'T1_TYPE'
                   ,'DATE_TYPE'
                   ,'T1_AG_ID'
                   ,'T1_ACC_ID'
                   ,'T1_SEC_ID'
                   ,'T1_B_ID'
                   ,'T1_FG_ID' );

COLUMN_NAME         NUM_DISTINCT  NUM_NULLS HISTOGRAM
------------------- ------------ ---------- ---------------
T1_I_E_ID              2          0 		FREQUENCY
T1_TYPE                5          0 		FREQUENCY
DATE_TYPE              5          0 		FREQUENCY
T1_AG_ID               106        0 		FREQUENCY
T1_ACC_ID             182         0 		FREQUENCY
T1_DATE               2861        0 		HEIGHT BALANCED
T1_SEC_ID             3092480     0 		NONE
T1_B_ID               1452        0 		HEIGHT BALANCED
T1_FG_ID              1           0 		FREQUENCY

And here is the corresponding 10053 trace file of the update, restricted to the part relevant to my investigation.
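
For those not used to generating such traces, the classical way is to set the 10053 event around a hard parse of the statement (a generic sketch, not necessarily the exact commands used here):

SQL> alter session set tracefile_identifier = 'upd_t1_10053';
SQL> alter session set events '10053 trace name context forever, level 1';
SQL> -- hard parse the update here, for example via explain plan
SQL> alter session set events '10053 trace name context off';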

***************************************
BASE STATISTICAL INFORMATION
***********************
Table Stats::
  Table: T1  Alias: T1
    #Rows: 387027527  #Blks:  6778908  AvgRowLen:  126.00  ChainCnt:  0.00

Index Stats::
Index: IDX_T1  Col#: 7 8 5
 LVLS: 3  #LB: 1568007  #DK: 3314835  LB/K: 1.00   DB/K: 25.00  CLUF: 84758374.00

Index: PK_T19  Col#: 1 2 3 4 5 7 8 37 6
 LVLS: 4  #LB: 4137117  #DK: 377310281  LB/K: 1.00  DB/K: 1.00  CLUF: 375821219.00

And the part of the same trace file where the index choice is made:

Access Path: index (UniqueScan)
    Index: PK_T19
    resc_io: 5.00  resc_cpu: 37647 ----------------------> spot this
    ix_sel: 0.000000  ix_sel_with_filters: 0.000000
    Cost: 1.00  Resp: 1.00  Degree: 1
    ColGroup Usage:: PredCnt: 3  Matches Full:  Partial:
    ColGroup Usage:: PredCnt: 3  Matches Full:  Partial:

Access Path: index (AllEqRange)
    Index: IDX_T1
    resc_io: 5.00  resc_cpu: 36797 ----------------------> spot this
    ix_sel: 0.000000  ix_sel_with_filters: 0.000000
    Cost: 1.00  Resp: 1.00  Degree: 1

Best:: AccessPath: IndexRange
  Index: IDX_T1
    Cost: 1.00  Degree: 1  Resp: 1.00  Card: 0.00  Bytes: 0

Looking closely at the above trace file, I didn't find any difference in the index costing information (ix_sel_with_filters, resc_io, cost) that would favour the non-unique IDX_T1 index over the PK_T19 unique primary key index, except the resc_cpu value, which equals 36797 for the former and 37647 for the latter. I didn't consider the clustering factor because the index I wanted the CBO to use is a unique index. The two indexes have the same cost in this case, so what extra information is the CBO using to prefer the non-unique index over the unique one?

This issue reminded me of an old OTN thread in which the original poster says that, under the default CPU costing model, when two indexes have the same cost, Oracle will consider using the less CPU-expensive index.

Since I had a practical case of two different indexes with the same cost, I decided to check this assumption by changing the costing model from CPU to I/O:

SQL> alter session set "_optimizer_cost_model"=io;

SQL> explain plan for
UPDATE T1
SET
  {list of columns}
WHERE
    T1_DATE   = :B9
AND T1_I_E_ID = :B8
AND T1_TYPE   = :B7
AND DATE_TYPE = :B6
AND T1_AG_ID  = :B5
AND T1_ACC_ID = :B4
AND T1_SEC_ID = :B3
AND T1_B_ID   = :B2
AND T1_FG_ID  = :B1;

SQL> select * from table(dbms_xplan.display);

Plan hash value: 704748203
-------------------------------------------------------------
| Id  | Operation          | Name   | Rows  | Bytes | Cost  |
-------------------------------------------------------------
|   0 | UPDATE STATEMENT   |        |     1 |   126 |     1 |
|   1 |  UPDATE            | T1     |       |       |       |
|*  2 |   INDEX UNIQUE SCAN| PK_T19 |     1 |   126 |     1 |
-------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("T1_I_E_ID"=:B8 AND "T1_TYPE"=:B7 AND
              "DATE_TYPE"=:B6 AND "T1_AG_ID"=TO_NUMBER(:B5) AND
              "T1_ACC_ID"=TO_NUMBER(:B4) AND "T1_SEC_ID"=TO_NUMBER(:B3) AND
              "T1_B_ID"=TO_NUMBER(:B2) AND "T1_FG_ID"=TO_NUMBER(:B1) AND
              "T1_DATE"=TO_TIMESTAMP(:B9))

Note
-----
   - cpu costing is off (consider enabling it)

Spot on. We get the desired primary key index without any help.

However, changing the default costing model is not acceptable in the client's PRODUCTION database. Continuing my root cause investigation, I had the feeling that histograms were messing up this CBO index choice. This is why I decided to give it a try, get rid of the histograms and analyze the corresponding 10053 trace file:

SQL> exec dbms_stats.gather_table_stats (user, 'T1', method_opt => 'for all columns size 1');


***************************************
BASE STATISTICAL INFORMATION
***********************
Table Stats::
  Table: T1  Alias: T1
    #Rows: 387029216  #Blks:  6778908  AvgRowLen:  126.00  ChainCnt:  0.00

Index Stats::
  Index: IDX_T1  Col#: 7 8 5
  LVLS: 3  #LB: 1625338  #DK: 3270443  LB/K: 1.00   DB/K: 26.00  CLUF: 87831324.00

  Index: PK_T19  Col#: 1 2 3 4 5 7 8 37 6
  LVLS: 4  #LB: 4335908  #DK: 395902926  LB/K: 1.00  DB/K: 1.00  CLUF: 394578898.00

Access Path: index (UniqueScan)
    Index: PK_T19
    resc_io: 5.00  resc_cpu: 37647
    ix_sel: 0.000000  ix_sel_with_filters: 0.000000
    Cost: 1.00  Resp: 1.00  Degree: 1
  ColGroup Usage:: PredCnt: 3  Matches Full: #1  Partial:  Sel: 0.0000
  ColGroup Usage:: PredCnt: 3  Matches Full: #1  Partial:  Sel: 0.0000

  Access Path: index (AllEqRange)
    Index: IDX_T1
    resc_io: 31.00  resc_cpu: 370081
    ix_sel: 0.000000  ix_sel_with_filters: 0.000000
    Cost: 6.20  Resp: 6.20  Degree: 1 ------> spot the cost

Access Path: index (AllEqUnique)
    Index: PK_T19
    resc_io: 5.00  resc_cpu: 37647
    ix_sel: 0.000000  ix_sel_with_filters: 0.000000
    Cost: 1.00  Resp: 1.00  Degree: 1
 One row Card: 1.000000

  Best:: AccessPath: IndexUnique
  Index: PK_T19
    Cost: 1.00  Degree: 1  Resp: 1.00  Card: 1.00  Bytes: 0

And now that the cost of accessing the non-unique index (Cost = 6.2) is about 6 times greater than that of the unique index (Cost = 1), Oracle preferred the primary key index without any help:

============
Plan Table
============
-------------------------------------------------+----------------------+
| Id  | Operation           | Name  | Rows  | Bytes | Cost  | Time      |
-------------------------------------------------+----------------------+
| 0   | UPDATE STATEMENT    |       |       |       |     1 |           |
| 1   |  UPDATE             | T1    |       |       |       |           |
| 2   |   INDEX UNIQUE SCAN | PK_T19|     1 |   126 |     1 |  00:00:01 |
-------------------------------------------------+-----------------------+

The final acceptable decision to solve this issue was to hint an instance of the same query to use the primary key index and attach an SQL profile to the original packaged query using the plan_hash_value of the primary key index execution plan.

Bottom Line: under the default CPU costing model, when two (or more) indexes have the same cost, Oracle will prefer the index that is going to consume the smaller amount of CPU (resc_cpu). And, before jumping on collecting histograms (particularly HEIGHT BALANCED ones) by default, be aware that they do participate in Oracle's perception of the amount of CPU the different indexes will need and, ultimately, in the CBO's index desirability.


Bind aware secret sauce (again)


I am sure many of you have already become bored by my posts on adaptive cursor sharing. I hope this article will be the last one :-). In part III of the installment I was unable to figure out the secret sauce Oracle uses to mark a cursor bind aware when the counts of the 3 buckets of a bind sensitive cursor are all greater than 0. Those who want to know what bucket and count represent can start by reading part I and part II.

Thanks to a comment dropped by an anonymous reader, coupled with my understanding of the adaptive cursor sharing mechanism, I think I have succeeded in figuring out that secret sauce. My goal is to publish it and let readers try to break and invalidate it.

To make things simple I created the following function

------------------------------------------------------------------------------
-- File name:   fv_will_cs_be_bind_aware
-- Author   :   Mohamed Houri (Mohamed.Houri@gmail.com)
-- Date     :   29/08/2015
-- Purpose  :   When supplied with 3 parameters
--                   pin_cnt_bucket_0 : count of bucket_id n°0
--                   pin_cnt_bucket_1 : count of bucket_id n°1
--                   pin_cnt_bucket_2 : count of bucket_id n°2
--
--              this function will return a status:
--
--              'Y' if the next execution at any bucket_id will mark the cursor bind aware
--
--              'N' if the next execution at any bucket_id will NOT mark the cursor bind aware
--
--------------------------------------------------------------------------------
create or replace function fv_will_cs_be_bind_aware
  (pin_cnt_bucket_0 in number
  ,pin_cnt_bucket_1 in number
  ,pin_cnt_bucket_2 in number)
return
   varchar2
is
  lv_will_be_bind_aware
                 varchar2(1) := 'N';
  ln_least_0_2   number      := least(pin_cnt_bucket_0,pin_cnt_bucket_2);
  ln_great_0_2   number      := greatest(pin_cnt_bucket_0,pin_cnt_bucket_2);

begin
  if ln_least_0_2 >= ceil ((ln_great_0_2-pin_cnt_bucket_1)/3)
  then
    return 'Y';
  else
    return 'N';
  end if;
end fv_will_cs_be_bind_aware;

If you have a cursor with a combination of 3 buckets having a count > 0 and you want to know whether the next execution will mark the cursor bind aware or not, then you just have to do this:

SQL> select fv_will_cs_be_bind_aware(10,1,3) acs from dual;

ACS
---
Y

Or this

SQL> select fv_will_cs_be_bind_aware(10,1,2) acs from dual;

ACS
---
N

In its first call the function is indicating that the next cursor execution will compile a new optimal plan while the second call indicates that the existing child cursor will still be shared.

It’s now time to practice:

SQL> alter session set cursor_sharing=FORCE;

SQL> select count(1) from t_acs where n2 = 100;

  COUNT(1)
----------
       100

SQL> select
  2      child_number
  3     ,bucket_id
  4     ,count
  5  from
  6     v$sql_cs_histogram
  7  where sql_id = '7ck8k47bnqpnv';

CHILD_NUMBER  BUCKET_ID      COUNT
------------ ---------- ----------
           0          0          1
           0          1          0
           0          2          0

SQL> select count(1) from t_acs where n2 = 100;

  COUNT(1)
----------
       100
SQL> /

-- repeat this 7 times

SQL> select
  2      child_number
  3     ,bucket_id
  4     ,count
  5  from
  6     v$sql_cs_histogram
  7  where sql_id = '7ck8k47bnqpnv';

CHILD_NUMBER  BUCKET_ID      COUNT
------------ ---------- ----------
           0          0         10 --> 10 executions at bucket_id 0
           0          1          0
           0          2          0

Now change the bind variable value so that bucket_id n°1 is incremented:

SQL> select count(1) from t_acs where n2 = 10000;

  COUNT(1)
----------
    100000

SQL> select
  2      child_number
  3     ,bucket_id
  4     ,count
  5  from
  6     v$sql_cs_histogram
  7  where sql_id = '7ck8k47bnqpnv';

CHILD_NUMBER  BUCKET_ID      COUNT
------------ ---------- ----------
           0          0         10
           0          1          1 --> 1 executions at bucket_id 1
           0          2          0

Now change the bind variable value again so that bucket_id n°2 is incremented:

SQL> select count(1) from t_acs where n2 = 1000000;

  COUNT(1)
----------
   1099049

SQL> select count(1) from t_acs where n2 = 1000000;

  COUNT(1)
----------
   1099049

SQL> select
  2      child_number
  3     ,bucket_id
  4     ,count
  5 from
  6     v$sql_cs_histogram
  7  where sql_id = '7ck8k47bnqpnv';

CHILD_NUMBER  BUCKET_ID      COUNT
------------ ---------- ----------
           0          0         10
           0          1          1
           0          2          2 --> 2 executions at bucket_id 2

If, at this stage, you want to know whether the next execution at bucket id n°2 will mark the cursor bind aware or not then make a call to the function:

SQL> select fv_will_cs_be_bind_aware(10,1,2) acs from dual;

ACS
----
N

No, it will not, and here is the proof:

SQL> select count(1) from t_acs where n2 = 1000000;

  COUNT(1)
----------
   1099049

SQL> select
  2      child_number
  3     ,bucket_id
  4     ,count
  5  from
  6     v$sql_cs_histogram
  7  where sql_id = '7ck8k47bnqpnv';

CHILD_NUMBER  BUCKET_ID      COUNT
------------ ---------- ----------
           0          0         10
           0          1          1
           0          2          3

And what about the next execution, say at bucket_id n° 0?

SQL> select fv_will_cs_be_bind_aware(10,1,3) acs from dual;

ACS
----
Y

The function is indicating that the next execution will compile a new child cursor; let’s check:

SQL> select count(1) from t_acs where n2 = 100;

  COUNT(1)
----------
       100

SQL> select
  2      child_number
  3     ,bucket_id
  4     ,count
  5  from
  6     v$sql_cs_histogram
  7  where sql_id = '7ck8k47bnqpnv';

CHILD_NUMBER  BUCKET_ID      COUNT
------------ ---------- ----------
           1          0          1
           1          1          0
           1          2          0
           0          0         10
           0          1          1
           0          2          3

Yes it did.

Can we assume from a single test that this function is reliable? No.

You want another example? Here it is:

SQL> -- run the following sql 19 times at bucket_id n°0
SQL> select count(1) from t_acs where n2 = 100;

SQL> -- run the same sql 6 times at bucket_id n°1
SQL> select count(1) from t_acs where n2 = 10000;

SQL> -- run the same sql 2 times at bucket_id n°2
SQL> select count(1) from t_acs where n2 = 1000000;

And here’s the resulting cursor sharing picture:

SQL> select
  2      child_number
  3     ,bucket_id
  4     ,count
  5 from
  6     v$sql_cs_histogram
  7 where  sql_id = '7ck8k47bnqpnv';

CHILD_NUMBER  BUCKET_ID      COUNT
------------ ---------- ----------
           0          0         19
           0          1          6
           0          2          2

Will the next execution compile a new execution plan?

SQL> select fv_will_cs_be_bind_aware(19,6,2) acs from dual;

ACS
---
N

No, it will not, as proven below:

SQL> select count(1) from t_acs where n2 = 1000000;

  COUNT(1)
----------
   1099049

SQL> select
  2      child_number
  3     ,bucket_id
  4     ,count
  5  from
  6     v$sql_cs_histogram
  7  where  sql_id = '7ck8k47bnqpnv';

CHILD_NUMBER  BUCKET_ID      COUNT
------------ ---------- ----------
           0          0         19
           0          1          6
           0          2          3

And what about the next execution at the same bucket_id n°2? And the one after that?

SQL> select fv_will_cs_be_bind_aware(19,6,3) acs from dual;

ACS
---
N

SQL> select fv_will_cs_be_bind_aware(19,6,4) acs from dual;

ACS
---
N

And the next execution?

SQL> select fv_will_cs_be_bind_aware(19,6,5) acs from dual;

ACS
----
Y

With the (bucket_id, count) situation shown below, the function indicates that the next execution will mark the cursor bind aware and compile a new execution plan:

SQL> select
  2     child_number
  3    ,bucket_id
  4    ,count
  5  from
  6    v$sql_cs_histogram
  7  where  sql_id = '7ck8k47bnqpnv';

CHILD_NUMBER  BUCKET_ID      COUNT
------------ ---------- ----------
           0          0         19
           0          1          6
           0          2          5

Let's check:

SQL> select count(1) from t_acs where n2 = 1000000;

  COUNT(1)
----------
   1099049

SQL> select
  2      child_number
  3     ,bucket_id
  4     ,count
  5  from
  6     v$sql_cs_histogram
  7  where  sql_id = '7ck8k47bnqpnv';

CHILD_NUMBER  BUCKET_ID      COUNT
------------ ---------- ----------
           1          0          0
           1          1          0
           1          2          1
           0          0         19
           0          1          6
           0          2          5

Yes it did, as you can see from the appearance of the new child cursor n°1.

Want another example? Here is one:

SQL> select
        child_number
       ,bucket_id
       ,count
    from
        v$sql_cs_histogram
    where sql_id = '7ck8k47bnqpnv';

CHILD_NUMBER  BUCKET_ID      COUNT
------------ ---------- ----------
           0          0          3
           0          1          1
           0          2         11

SQL> select fv_will_cs_be_bind_aware(3,1,11) acs from dual;

ACS
---
N

SQL> select count(1) from t_acs where n2 = 10000;

  COUNT(1)
----------
    100000

SQL> select
        child_number
       ,bucket_id
       ,count
    from
        v$sql_cs_histogram
    where sql_id = '7ck8k47bnqpnv';

CHILD_NUMBER  BUCKET_ID      COUNT
------------ ---------- ----------
           0          0          3
           0          1          2
           0          2         11

SQL> select fv_will_cs_be_bind_aware(3,2,11) acs from dual;

ACS
---
Y

SQL> select count(1) from t_acs where n2 = 1000000;

  COUNT(1)
----------
   1099049

SQL> select
        child_number
       ,bucket_id
       ,count
    from
        v$sql_cs_histogram
    where sql_id = '7ck8k47bnqpnv';


CHILD_NUMBER  BUCKET_ID      COUNT
------------ ---------- ----------
           1          0          0
           1          1          0
           1          2          1
           0          0          3
           0          1          2
           0          2         11

I am sure that someone will come up with a simple situation where the function returns a wrong result. Bear in mind that this is exactly what I want, so you're welcome to try.

Footnote

If you want to break this function then here is the model you can use (you need to have a histogram on the n2 column; a sketch of gathering it follows the model):

create table t_acs(n1  number, n2 number);

BEGIN
     for j in 1..1200150 loop
      if j = 1 then
       insert into t_acs values (j, 1);
      elsif j>1 and j<=101 then
       insert into t_acs values(j, 100);
      elsif j>101 and j<=1101 then
       insert into t_acs values (j, 1000);
      elsif j>10001 and j<= 110001 then
      insert into t_acs values(j,10000);
     else
      insert into t_acs values(j, 1000000);
     end if;
    end loop;
   commit;
END;
/
create index t_acs_i1 on t_acs(n2);
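
And here is a minimal sketch of how the histogram on n2 could be gathered (the skewonly option is my choice; any method_opt producing a histogram on n2 will do):

SQL> begin
       dbms_stats.gather_table_stats
            (ownname    => user
            ,tabname    => 'T_ACS'
            ,method_opt => 'for all columns size skewonly'
            ,cascade    => true
            );
     end;
     /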

Basic versus OLTP compression


If you are going to archive a table holding a very large amount of data from a running system into a historical table, and you expect this archived table to be read-only, then you had better use the free BASIC compression mode instead of the paid-for OLTP mode. This article aims to show the reason for this preference based on a real life example.

In order to take advantage of the free BASIC compression mode you first need to insert data using a direct path load (this is why this compression mode used to be known as compression for direct path load, or bulk load, operations) and ensure that there is no trigger or integrity constraint on the target table that would make Oracle silently ignore your requested direct path load. In passing, in contrast to preceding Oracle releases, a 12.1.0.2 database will tell you when a direct path load is ignored, via the Note at the bottom of the corresponding insert execution plan:

Note
-----
   - PDML disabled because triggers are defined
   - Direct Load disabled because triggers are defined
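
A quick pre-check along those lines could look like this (a sketch; the table name is the archived table used in the example below):

SQL> select trigger_name, status
     from   user_triggers
     where  table_name = 'T1_HIST';

SQL> select constraint_name, constraint_type, status
     from   user_constraints
     where  table_name = 'T1_HIST';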

Here's an example of BASIC compression I've executed on a 12.1.0.2 database:

SELECT
    t.owner,
    t.table_name,
    t.compress_for,
    s.bytes/1024/1024/1024 GB
FROM
    dba_tables t ,
    dba_segments s
WHERE
    s.segment_name = t.table_name
AND s.owner        = 'XXX'
AND t.table_name   = 'T1'
ORDER BY 4;

OWNER      TABLE_NAME   COMPRESS_FOR    GB
---------- ------------ ------------- ----------
XXX        T1                         263.114136

263 GB worth of data of which we need to keep only 2 years and archive the rest.

The first step in this process was to create an empty table cloned from the production table and define it to accept the BASIC mode compression:

-- create a table compressed with BASIC mode
CREATE TABLE T1_HIST
      COMPRESS BASIC
      TABLESPACE HIST_TABSPACE
         AS
      SELECT * FROM T1
      WHERE 1 = 2;

Since I was going to send a huge amount of data into this archived table I decided to use a parallel insert, which needs to be preceded by enabling parallel DML, either by altering the session or by using the appropriate hint (in 12c):

-- enable parallel dml
SQL> alter session enable parallel dml;

The way is now paved to start sending historical data into their new destination:

-- direct path load data older than 2 years into the archived table
INSERT /*+ append parallel(bsic,4) */
  INTO T1_HIST bsic
  SELECT
          /*+ parallel(oltp,4) */
              *
    FROM  T1 oltp
       WHERE HIST_DAT > add_months(sysdate,-24);

2,234,898,367 rows created.

More than 2 billion rows direct path loaded.

And the pleasant surprise is:

SQL> SELECT
    t.owner,
    t.table_name,
    t.compress_for,
    s.bytes/1024/1024/1024 GB
FROM
    dba_tables t ,
    dba_segments s
WHERE
    t.compress_for ='BASIC'
AND
    s.segment_name   = t.table_name
AND s.owner       = 'XXX'
ORDER BY 4;

OWNER      TABLE_NAME   COMPRESS_FOR   GB
---------- ------------ ------------- ----------
XXX         T1_HIST     BASIC         53.2504272

We went from a source table holding 263GB worth of data to a cloned table compressed with the BASIC mode, ending up with only 53GB of historical data.

If you are wondering how many rows I have not sent into the archived table, here is the count:

SQL> select /*+ parallel(8) */ count(1)
     from T1
     where HIST_DAT <= add_months(sysdate,-24);

7,571,098

This means that using the free BASIC compression mode I have archived almost all the rows of the production table and managed to pack the initial 263GB into only 53GB. That is an excellent compression ratio of about 5 (the initial size divided by 5), even though Oracle says that the compression ratio depends on the nature of your data and typically ranges between a factor of 2x and 4x.

Should you have used the paid-for OLTP compression mode you would have got an archived table approximately 10% bigger (~60GB). This is because Oracle considers a table compressed in BASIC mode to be read-only and not subject to any update, and silently sets a default PCT_FREE of 0 behind the scenes:

SQL> select table_name, pct_free
    from dba_tables
    where table_name = 'T1_HIST';

TABLE_NAME    PCT_FREE
----------- ----------
T1_HIST          0

As you can see, if you intend to archive a huge amount of data into a read-only table, want to gain disk space with this operation and don't want to pay for OLTP compression, then you can opt for the free BASIC compression mode.

There are a few interesting things that came up along with this archiving process, particularly the new 12c way (HYBRID TSM/HWMB) Oracle uses to keep down the number of new extents and to avoid the HV enqueue wait event during a parallel direct path load with a high DOP across several instances:

SQL Plan Monitoring Details (Plan Hash Value=3423860299)
====================================================================================
| Id |              Operation               |   Name   |  Rows   |Execs |   Rows   |
|    |                                      |          | (Estim) |      | (Actual) |
====================================================================================
|  0 | INSERT STATEMENT                     |          |         |    5 |        8 |
|  1 |   PX COORDINATOR                     |          |         |    5 |        8 |
|  2 |    PX SEND QC (RANDOM)               | :TQ10000 |    819M |    4 |        8 |
|  3 |     LOAD AS SELECT (HYBRID TSM/HWMB) |          |         |    4 |        8 |
|  4 |      OPTIMIZER STATISTICS GATHERING  |          |    819M |    4 |       2G |
|  5 |       PX BLOCK ITERATOR              |          |    819M |    4 |       2G |
|  6 |        TABLE ACCESS FULL             | T1       |    819M |  442 |       2G |
====================================================================================

Predicate Information (identified by operation id):
---------------------------------------------------

   6 - access(:Z>=:Z AND :Z<=:Z) filter("HIST_DAT">ADD_MONTHS(SYSDATE@!,-24))

Note
-----
   - Degree of Parallelism is 4 because of table property

But this will be detailed in a separate blog article.


Oracle Optimizer and SPM plan interaction


Continuing with the inspiration instilled in me by Dominic Brooks' post on SQL Plan Management choices, I decided to picture the Oracle CBO behavior in the presence of enabled and accepted SPM baseline plans:

(CBO-SPM interaction diagram)

The right part of the picture, when triggered, shows the parsing penalty you will have to pay before running your SQL query. In particular, when there are multiple accepted and enabled SPM plans, the CBO has to try reproducing and costing all of them before making its final decision. The picture also shows that under all circumstances the CBO starts by compiling its execution plan as if it were not constrained by any SPM plan. This clearly demonstrates that if your query suffers from a long hard parse time (when plan generation itself takes a lot of time) then SPM will not help you. This is where the mantra “When you can hint it then Baseline it” ceases to be accurate.
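
As a reminder, the set of enabled and accepted plans the CBO will have to try reproducing for a given statement can be listed along these lines (a simple sketch; &sql_id is a placeholder):

SQL> select
        b.sql_handle
       ,b.plan_name
       ,b.enabled
       ,b.accepted
       ,b.fixed
     from
        dba_sql_plan_baselines b
     where
        b.signature = (select exact_matching_signature
                       from   v$sql
                       where  sql_id = '&sql_id'
                       and    rownum = 1);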

