Mohamed Houri’s Oracle Notes

ORA-06502: PL/SQL: numeric or value error: Bulk Bind: Truncated Bind


Have you ever faced the above error? I mean this bizarre Bulk Bind: Truncated Bind? It is a strange and unclear error message. Let me put you in the context that produced it: I am trying to load data into an 11gR2 (11.2.0.3.0) database with data coming over a database link from a 10gR2 (10.2.0.5.0) database. The case, simplified to the maximum, is shown here below

11gR2> begin
2   for x in (
3    select v2_col
4    from  distant_table
5    )
6   loop
7     null;
8  end loop;
9  end;
10  /

begin
*
ERROR at line 1:
ORA-06502: PL/SQL: numeric or value error: Bulk Bind: Truncated Bind
ORA-06512: at line 2

And if instead I issue a straightforward select from a SQL window

11gR2>
select v2_col
from  distant_table;

I get no errors.

I was struggling with this issue until one of my smart colleagues put me on the right track: there is a difference in character set between the 11g database (multi-byte) and the 10g database (single byte), as shown below

11gR2> select * from nls_database_parameters;
PARAMETER                      VALUE
------------------------------ ----------------------------
NLS_CHARACTERSET               AL32UTF8    --> here multibyte
NLS_LENGTH_SEMANTICS           CHAR        --> here
NLS_NCHAR_CHARACTERSET         AL16UTF16
NLS_RDBMS_VERSION              11.2.0.3.0

10gR2> select * from nls_database_parameters;

PARAMETER                      VALUE
------------------------------ ------------------------------------
NLS_CHARACTERSET               WE8ISO8859P1  --> here single byte
NLS_LENGTH_SEMANTICS           BYTE          --> here
NLS_NCHAR_CHARACTERSET         AL16UTF16
NLS_RDBMS_VERSION              10.2.0.5.0

I was thinking that the 11gR2 PL/SQL engine (because I am not fetching anything explicitly, I am just selecting) would recognize this character set difference between the local and the remote database and implicitly declare its local variable with enough space to hold the incoming column values when they contain special characters. Presumably the implicit variable is sized using the remote column’s byte length, so a value that expands during conversion to the multi-byte character set no longer fits, hence the truncated bind.

Having smart colleagues is always a lucky situation: not only did my colleague point me in the right direction, he also gave me a workaround that I am pleased to reproduce here below:

11gR2> begin
2   for x in (
3    select v2_col ||''
4    from  distant_table
5    )
6   loop
7     null;
8  end loop;
9  end;
10  /

PL/SQL procedure successfully completed.

The workaround consists of concatenating the remote column v2_col with a null string. This magically overcomes the Bulk Bind: Truncated Bind error (don’t ask me why the PL/SQL engine can overcome this error when such a concatenation is used).

If you have already faced this error, I would be pleased to know how you managed to solve it.
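A variation that might also work (a minimal sketch I have not verified against the original setup, assuming v2_col fits within 4000 characters) is to make the selected item an expression via an explicit CAST, in the same spirit as the concatenation trick:

11gR2> begin
2   for x in (
3    select cast(v2_col as varchar2(4000)) v2_col
4    from  distant_table
5    )
6   loop
7     null;
8  end loop;
9  end;
10  /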



Intelligent CBO


Just a small note to show you a situation I recently encountered which illustrates how intelligent the CBO can be. I have the following two pieces of SQL

UPDATE t1 a
SET
a.padding = 'yyyyyyyy'
WHERE
a.id1 in
(SELECT
     b.id2
FROM t2  b
WHERE a.id1 = a.n1   ---> spot this
);

And the second one

UPDATE t1 a
SET
a.padding = 'yyyyyyyy'
WHERE
a.id1 in
(SELECT
 b.id2
 FROM t2  b
)
AND a.id1 = a.n1;       ---> spot this

I would not have written the first SQL statement to restrict the update to the records of t1 having identical id1 and n1; I would logically have issued the second one instead.

But to my surprise the CBO recognized that the where clause inside the subquery (WHERE a.id1 = a.n1) references only columns of the outer table and therefore applied it to the main update, exactly as if it had been written as the AND clause outside the brackets. Here below are the corresponding execution plans

First query

Plan hash value: 1788758844
----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | UPDATE STATEMENT    |      |     1 |    91 |   899   (2)| 00:00:03 |
|   1 |  UPDATE             | T1   |       |       |            |          |
|*  2 |   HASH JOIN SEMI    |      |     1 |    91 |   899   (2)| 00:00:03 |
|*  3 |    TABLE ACCESS FULL| T1   |     1 |    78 |   447   (2)| 00:00:02 |
|   4 |    TABLE ACCESS FULL| T2   |   104K|  1320K|   449   (1)| 00:00:02 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("A"."ID1"="B"."ID2")
3 - filter("A"."ID1"="A"."N1")

Note
-----
- dynamic sampling used for this statement (level=2)

Second query

Plan hash value: 1788758844
----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | UPDATE STATEMENT    |      |     1 |    91 |   899   (2)| 00:00:03 |
|   1 |  UPDATE             | T1   |       |       |            |          |
|*  2 |   HASH JOIN SEMI    |      |     1 |    91 |   899   (2)| 00:00:03 |
|*  3 |    TABLE ACCESS FULL| T1   |     1 |    78 |   447   (2)| 00:00:02 |
|   4 |    TABLE ACCESS FULL| T2   |   104K|  1320K|   449   (1)| 00:00:02 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("A"."ID1"="B"."ID2")
3 - filter("A"."ID1"="A"."N1")

Note
-----
- dynamic sampling used for this statement (level=2)

The same plan hash value and the same predicate section: the two statements were optimized to the same thing. Funny enough.
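If you want to verify which statement the CBO actually optimized, a 10053 (CBO) trace is one way to do it. Here is a minimal sketch, assuming you are privileged to set events in your session; the generated trace file contains, among other things, the final query after transformations:

SQL> alter session set events '10053 trace name context forever, level 1';

SQL> explain plan for
     update t1 a
     set    a.padding = 'yyyyyyyy'
     where  a.id1 in (select b.id2 from t2 b where a.id1 = a.n1);

SQL> alter session set events '10053 trace name context off';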

If you want to play with the test case, here is the model (borrowed from Jonathan Lewis)

create table t1
as
with generator as (
select  --+ materialize
rownum id
from dual
connect by
level <= 10000)
select
   rownum id1,
   trunc(dbms_random.value(1,1000))    n1,
   lpad(rownum,10,'0') small_vc,
   rpad('x',100)       padding
from
 generator   v1,
 generator   v2
where
rownum <= 100000;
create table t2
as
with generator as (
select  --+ materialize
rownum id
from dual
connect by
level <= 10000)
select
   rownum                                id2,
   trunc(dbms_random.value(10001,20001)) x1,
   lpad(rownum,10,'0') small_vc,
   rpad('x',100)       padding
from
 generator   v1,
 generator   v2
where
rownum <= 100000;

On how to enforce uniqueness when a unique constraint is not possible


The following select gives a list of existing tables that do not possess a unique constraint

select table_name
from user_tables u1
where not exists (select null
from  user_constraints u2
where u2.table_name      = u1.table_name
and   u2.constraint_type = 'U');

However, the above select doesn’t mean that there is no other uniqueness enforcement implemented without a dedicated unique constraint.
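For example, a unique index created without any declared constraint enforces uniqueness while staying invisible to the query above. A companion check based on index uniqueness (a minimal sketch along the same lines) could be:

select table_name
from user_tables u1
where not exists (select null
from  user_indexes u2
where u2.table_name = u1.table_name
and   u2.uniqueness = 'UNIQUE');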

Recently, I was confronted with a business rule to be implemented in a Data Vault model, specifying that each Satellite or Satellite-of-Link table should have a unique key enforcement satisfying the following rule

For each couple (col1, col2) there can be at most one record with a null dv_load_end_date column.

A picture being worth a thousand words, let’s see this through a real example


SQL> create table t1
(t1_sk            number
,dv_load_date     date
,seq_nbr          number
,n1               number
,n2               number
,status           varchar2(10)
,identification   varchar2(30)
,dv_load_end_date date);

SQL> alter table t1 add constraint t1_pk primary key (t1_sk,dv_load_date, seq_nbr);

The unique business rule would be described as follows: for each couple (t1_sk, n2) there can be at most one record with a null dv_load_end_date column

Is this doable via a unique constraint? No, it is not.
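A plain unique constraint on the couple (t1_sk, n2) would be too strict: it would also reject duplicates whose dv_load_end_date is not null, which the business rule explicitly allows. A minimal sketch of the problem (hypothetical constraint name):

SQL> alter table t1 add constraint t1_uk unique (t1_sk, n2);

With this constraint in place, the second insert shown below (same t1_sk and n2 but a non-null dv_load_end_date) would be rejected with ORA-00001, although the business rule permits it.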

And here is where a function-based index comes to the rescue

SQL> create unique index ind_t1_uk on t1
(case when dv_load_end_date is null then t1_sk else null end
,case when dv_load_end_date is null then n2 else null end);

And here is how the inserts go

SQL> insert into t1 (t1_sk ,dv_load_date ,seq_nbr,n1,n2,dv_load_end_date)
              values(1, sysdate, 1, 1, 100, null);

1 row created.

SQL> insert into t1 (t1_sk ,dv_load_date ,seq_nbr,n1,n2,dv_load_end_date)
              values(1, sysdate, 1, 1, 100, sysdate);

1 row created.

SQL> insert into t1 (t1_sk ,dv_load_date ,seq_nbr,n1,n2,dv_load_end_date)
              values(1, sysdate, 1, 1, 100, null);

insert into t1 (t1_sk ,dv_load_date ,seq_nbr,n1,n2,dv_load_end_date)
*
ERROR at line 1:
ORA-00001: unique constraint (XXX.IND_T1_UK) violated

SQL> insert into t1 (t1_sk ,dv_load_date ,seq_nbr,n1,n2,dv_load_end_date)
              values(1, sysdate, 1, 1, 200, null);
1 row created.

This post is the second in a set of small scripts (blog articles) I have decided to collect on my blog for documentation and reuse purposes


You want a bridge from single instance to RAC? This book is for you



When Syed Jaffar Hussain, Oracle ACE Director and one of the world’s RAC experts, asked me to review the new book Expert Oracle RAC 12c that he coauthored with three other Oracle ACE Directors, my first thought was: how could a non-experienced RAC developer seriously review this book?

In my last decade as an Oracle single-instance DBA-developer, I have got the feeling that Jonathan Lewis’s books (plus Tom Kyte’s and Christian Antognini’s) have left no place for other books, at least in the subjects they cover. In addition, Jonathan has a style of writing and narrating technical features that makes you feel as if you are reading an agreeable story. I am still waiting to read books as attractive and as elegant as those written by Jonathan Lewis.

Although I might not have any RAC experience, I finally decided to review this book as a typical reader who knows the single-instance fundamentals and wants to learn the basics of RAC. I know that I am very severe in judging the writing style of books (even in English, which isn’t my mother tongue) and this is where I was going to concentrate my attention.

The first thing I did was to go through the table of contents and isolate the chapters closest to my background. The chapters I selected for review were 6, 10, 11 and 12. Unintentionally, those chapters all belong to Riyaj Shamsudeen. Then I started reading Chapter 8, Backup and Recovery in RAC, written by Syed. I couldn’t resist the temptation to write the current review before finishing the complete book.

Below you can find my review of what I have already read. The review I sent to the authors contains several pages of clarifications and questions, which the authors themselves found crucial and very important:

Chapter 6: Application Design Issues: I am very pleased to say that I have been agreeably surprised by Riyaj’s style of writing. He goes to the essential, using concise words that touch the heart of what he wants to explain. Although the design issues he presents can be found in several other trusted documents and blog articles, it is nevertheless worth having them all listed in a dedicated chapter and in an elegant writing style; and throughout, Riyaj presents these issues with a comparison between the effect they have on a single instance and the magnified effect they can have in RAC. We had a discussion about localized inserts and about globally hash partitioning the index to reduce contention on it. He also covers excessive commits, sequence cache, index design, excessive parallel scans and full table scans, which, as you will have guessed, are the main application design issues you should absolutely be aware of before embarking on a RAC project. This chapter is definitely a must-have for a RAC (and even single-instance) designer.

Chapter 10: RAC Database Optimization: A very interesting chapter where several important RAC-specific features are clearly identified and explained, such as Cache Fusion, the Global Resource Directory (GRD), several RAC-specific wait events, etc.

When you read this (emphasis mine): “In a single instance, a database block can reside only in the local buffer cache. Therefore, if the block is not in the local buffer cache, it is read from the disk. In RAC, the block can be resident in any buffer cache. So, a special type of processing takes place that deviates sharply from single-instance processing”, and you know the fundamental work Oracle has to do to guarantee ACID properties in a single instance, then you may realize how complicated the task becomes in RAC. Riyaj “simplifies” this for you. He clearly explains this concept and shows how a consistent read is fabricated and how a current block read is requested and obtained in RAC.

If you read this chapter and Chapter 2 (Redo and Undo) of the latest Jonathan Lewis book, then you can be sure you have made your Redo and Undo internals “Giro” or “Tour de France”.

Chapter 11: Locks and Deadlocks: In the process of reviewing this book I was very impatient to start reading this chapter, because I am a big fan of locks and deadlocks and have read everything I could find about these two subjects. Although I can easily read and interpret a single-instance deadlock graph, I am unable to correctly interpret a deadlock graph from a RAC machine. Riyaj, as far as I am aware, is the first author to have published something about deadlocks in RAC. I told him that I would have highly appreciated it if he had walked through one or two real-life RAC deadlock graphs and explained them in detail in this chapter, because practical cases are what readers want to see and what they most appreciate.

Chapter 12: Parallel Query in RAC: Having no real experience in this particular subject, I read it as someone who wants to learn how parallel query is handled on a RAC machine. I will certainly keep this chapter very close to me and will couple it with the work (blog articles and webinars) done by Randolf Geist and recent articles written by Jonathan Lewis in order to definitely master parallel query in both single instance and RAC. Experience has taught me that sometimes I read a chapter (for example Chapter 2, Redo and Undo, of Jonathan Lewis) that I find very difficult to understand; but once I have worked hard to understand it (and it took me several months), I magically discover how well written, and strictly speaking wonderful, that chapter is. I have a feeling that it will be the same story with Riyaj’s Chapter 12.

Chapter 8: Backup and Recovery in RAC: this chapter explains step by step how backup and recovery of an instance are done in RAC. It includes very interesting pictures which confirm the mantra “a picture is worth a thousand words”. The concepts are well presented and explained in detail. I have no real experience in this field to make objective judgments. However, if I start a new RAC job where backup and recovery are among my tasks, then this chapter will stay close to me and I will base my comprehension effort on it. Nice to have.

Conclusion: The book contains 14 chapters and more than 431 pages, of which I have reviewed 5 chapters. If you come from single instance as I do, you will find this book an excellent bridge. If you are an experienced RAC developer and want to learn how to avoid the main application design mistakes in RAC (you shouldn’t be making them, as you are already an experienced RAC person), then Chapter 6 is for you and should absolutely be read and understood. If you want to learn the internals of redo, undo, consistent and current reads, then Chapter 10 is for you; you will also learn and understand in this chapter the main RAC wait events. If you want to learn how locks and deadlocks are handled (did you know that the internal process that kills the first session which started waiting in a deadlock runs every 10 seconds, not every 3 seconds?), Chapter 11 is for you; and if, as I intend to do, you want to definitely understand and master parallel query in general and in RAC in particular, then print out Chapter 12 and start exploring it. I am sure that after some hard work (it depends on your knowledge) you will certainly end up savoring it.


CBO and unusable unique index


Very recently a question came up on oracle-list where the original poster was wondering about two things: (a) how could a materialized view refresh allow duplicate keys in the presence of a unique index, and (b) why a particular select was giving wrong results.

The answer to the first question is easy and I have already blogged about it. He was refreshing the materialized view using a FALSE value for the atomic_refresh parameter. With this particular refresh parameter, the materialized view is refreshed using a rapid truncate table followed by a direct path insert. Direct path load, as shown in my blog, will silently disable the unique index, allowing duplicate keys to be accepted. Though this seems to be true (in this context of materialized view refresh) only in 11gR2: the preceding release (10gR2) does not allow duplicate keys during this kind of refresh, as I have shown in my answer on the oracle-list forum. Does this mean that 10gR2 is not direct path loading when atomic_refresh is set to FALSE? I have to check.
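For reference, such a non-atomic refresh is typically launched as follows (a minimal sketch with a hypothetical materialized view name; atomic_refresh is a documented parameter of dbms_mview.refresh):

SQL> exec dbms_mview.refresh(list => 'MY_MV', atomic_refresh => false);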

But what motivated the current blog article is the second question. Follow along with me

SQL> create table a(id int,val number);
Table created.

SQL> insert into a select 1, 1 from dual;
1 row created.

SQL> create table b(id int);
Table created.

SQL> create unique index uq_b on b(id);
Index created.

Then I will use SQL*Loader to load data into table b using a direct path load in order to silently disable the unique index. The control file (c.ctl) I will be using looks like this:

LOAD DATA
INFILE *
REPLACE
INTO TABLE B
(id position(1:1) TERMINATED BY ",")
BEGINDATA
1;
1;

And now I will launch SQL*Loader

C:\>sqlldr user/paswd@database control=c.ctl direct=true

SQL*Loader: Release 10.2.0.3.0 - Production on Tue Oct 22 16:46:06 2013
Copyright (c) 1982, 2005, Oracle.  All rights reserved.

Load completed - logical record count 3.

What do you think happens to the unique index and to table b after this direct path load?


SQL> select index_name, status from user_indexes where index_name ='UQ_B';

INDEX_NAME                     STATUS
------------------------------ --------
UQ_B                           UNUSABLE

SQL> select count(1) from b;

COUNT(1)
----------
2

The unique index has been disabled and there are duplicate keys in table b.

So far so good.

Let’s now start exploring the original poster’s query problem

SQL> select a.*
from a, b
where a.id = b.id(+);

ID        VAL
---------- ----------
1          1
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |       |       |    15 (100)|          |
|   1 |  TABLE ACCESS FULL| A    |     1 |    26 |    15   (0)| 00:00:01 |
--------------------------------------------------------------------------

Note
-----
- dynamic sampling used for this statement (level=2) 


The CBO knows that there is a unique index on b(id). And since each id in table a can therefore match at most one row in table b, and no column of b is selected from this outer join, the CBO can eliminate table b from the plan altogether. This is why table b is not present in the execution plan. Unfortunately the unique index has been disabled by the direct path load, which permitted duplicate records in table b. This is the reason why the query produces a wrong result.

If we force the CBO to access table b, however, the result is correct

SQL>  select a.*,b.id
from a, b
where a.id = b.id(+);

ID        VAL         ID
---------- ---------- ----------
1          1          1
1          1          1

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |    36 (100)|          |
|*  1 |  HASH JOIN OUTER   |      |     2 |    78 |    36   (3)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| A    |     1 |    26 |    15   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| B    |     2 |    26 |    20   (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("A"."ID"="B"."ID")

Note
-----
- dynamic sampling used for this statement (level=2)

It is clear that the CBO does not look at the unique index status during the optimization (plan generation) phase.

But, what if instead of the unusable unique index, we have a disabled unique constraint? Will the CBO consider the status of the unique constraint in this case?

SQL> alter table b add constraint b_uk unique (id) disable;
Table altered.

SQL> select a.*
from a, b
where a.id = b.id(+);

ID        VAL
---------- ----------
1          1

Oops. The CBO is still wrong. What if we drop the culprit index?

SQL> drop index uq_b;
Index dropped.

SQL> select a.*
from a, b
where a.id = b.id(+);

ID        VAL
---------- ----------
1          1
1          1

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |    36 (100)|          |
|*  1 |  HASH JOIN OUTER   |      |     2 |    18 |    36   (3)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| A    |     1 |     6 |    15   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| B    |     2 |     6 |    20   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("A"."ID"="B"."ID")

Which finally gives a correct result.

Bottom line: always make sure your unique indexes are usable.
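A quick check I would run in the schema that owns the data (a minimal sketch) to spot such silently disabled indexes:

SQL> select index_name, status
     from   user_indexes
     where  status = 'UNUSABLE';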


Index on a virtual column: would it help others?


Is it possible to create an index on a column a and use it for column b? In certain situations yes.

Let’s first build the model and explore those situations (in 11.2.0.3)

  create table t1
        as   select
             rownum n1
            ,to_date(to_char(to_date('01/01/1985','dd/mm/yyyy'),'J') + trunc(dbms_random.value(1,11280)),'J') ord_date
            ,dbms_random.string('L',dbms_random.value(1,30))||rownum text_comment
        from dual
        connect by level <= 10000;

  alter table t1 add virt_date date generated always as (trunc(ord_date)) virtual;

  begin
        dbms_stats.gather_table_stats
           (ownname          => user,
            tabname          =>'t1',
            method_opt       => 'for all columns size 1'
          );
      end;
     /
 

I have created a simple table t1 to which I have attached a virtual column using the trunc function (column 4 below):

  describe t1
 Name                 Type
 -------------------- --------------------
 1      N1            NUMBER
 2      ORD_DATE      DATE
 3      TEXT_COMMENT  VARCHAR2(4000 CHAR)
 4      VIRT_DATE     DATE
 

I want to explore the following select:

 select * from t1 where ord_date =to_date('02/08/2011','dd/mm/yyyy');
 

I have a predicate on ord_date, and I have a virtual column based on ord_date. What do you think: if I create an index on the virtual column, would it help my above query?

 create index ind_virt_date on t1(virt_date);
 select * from t1 where ord_date =to_date('02/08/2011','dd/mm/yyyy');

 ---------------------------------------------------------------------------------------------
 | Id  | Operation                   | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
 ---------------------------------------------------------------------------------------------
 |   0 | SELECT STATEMENT            |               |       |       |     3 (100)|          |
 |*  1 |  TABLE ACCESS BY INDEX ROWID| T1            |     1 |    40 |     3   (0)| 00:00:01 |
 |*  2 |   INDEX RANGE SCAN          | IND_VIRT_DATE |     1 |       |     1   (0)| 00:00:01 |
 ---------------------------------------------------------------------------------------------

 Predicate Information (identified by operation id):
 ---------------------------------------------------
 1 - filter("ORD_DATE"=TO_DATE(' 2011-08-02 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
 2 - access("T1"."VIRT_DATE"=TRUNC(TO_DATE(' 2011-08-02 00:00:00', 'syyyy-mm-dd hh24:mi:ss')))
 

Yes it did. The CBO uses the index on the virtual column (ind_virt_date), which is based on a different column (ord_date), to cover the predicate where ord_date = to_date('02/08/2011','dd/mm/yyyy').

What I have shown up to now is: if you create an index on a virtual column, and you use a where clause on the column the virtual column is based on, the index can be selected by the CBO.
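In effect the CBO derived the extra access predicate on the virtual column by itself. The plan above is what you would logically get from the following equivalent statement with the derived predicate spelled out explicitly (just a sketch of what the transformation amounts to, not something you need to write):

select *
from   t1
where  ord_date  = to_date('02/08/2011','dd/mm/yyyy')
and    virt_date = trunc(to_date('02/08/2011','dd/mm/yyyy'));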

But the question is: would this be extensible to all types of virtual columns?

And the answer is: I don’t think so

Ok, let’s test with another virtual column using the trunc function again

 alter table t1 add virt1_n1 number generated always as(trunc(n1)) virtual;
 create index ind_virt_n1 on t1 (virt1_n1);

 select * from t1 where n1 = 1;

 -------------------------------------------------------------------------------------------
 | Id  | Operation                   | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
 -------------------------------------------------------------------------------------------
 |   0 | SELECT STATEMENT            |             |       |       |     2 (100)|          |
 |*  1 |  TABLE ACCESS BY INDEX ROWID| T1          |     1 |    41 |     2   (0)| 00:00:01 |
 |*  2 |   INDEX RANGE SCAN          | IND_VIRT_N1 |    40 |       |     1   (0)| 00:00:01 |
 -------------------------------------------------------------------------------------------

 Predicate Information (identified by operation id):
 ---------------------------------------------------
 1 - filter("N1"=1)
 2 - access("T1"."VIRT_N1"=TRUNC(1))
 

The index on the virtual column has been again used to cover a predicate on its underlying column.

Let’s test another case with ceil function this time.


 drop index ind_virt_n1;
 drop index ind_virt_date;
 alter table t1 drop column virt1_n1;
 alter table t1 add virt1_n1 generated always as (ceil(n1)) virtual;

 create index ind_virt_n1 on t1(virt1_n1);

 select * from t1 where n1 = 1;

 --------------------------------------------------------------------------
 | Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
 --------------------------------------------------------------------------
 |   0 | SELECT STATEMENT  |      |       |       |    17 (100)|          |
 |*  1 |  TABLE ACCESS FULL| T1   |     1 |    41 |    17   (0)| 00:00:01 |
 --------------------------------------------------------------------------

 Predicate Information (identified by operation id):
 ---------------------------------------------------
 1 - filter("N1"=1)
 

The index on the virtual column has, however, not been used by the CBO in this case

Another case with abs function this time.


 alter table t1 drop column virt1_n1;
 alter table t1 add virt1_n1 generated always as (abs(n1)) virtual;
 create index ind_virt_n1 on t1(virt1_n1);

 select * from t1 where n1 = 1;

 --------------------------------------------------------------------------
 | Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
 --------------------------------------------------------------------------
 |   0 | SELECT STATEMENT  |      |       |       |    17 (100)|          |
 |*  1 |  TABLE ACCESS FULL| T1   |     1 |    41 |    17   (0)| 00:00:01 |
 --------------------------------------------------------------------------

 Predicate Information (identified by operation id):
 ---------------------------------------------------
 1 - filter("N1"=1)
 

Another case with nvl function this time:

alter table t1 drop column virt1_n1;
alter table t1 add virt1_n1 generated always as (nvl(n1,0)) virtual;

create index ind_virt_n1 on t1(virt1_n1);

select * from t1 where n1 = 1;
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |       |       |    17 (100)|          |
|*  1 |  TABLE ACCESS FULL| T1   |     1 |    40 |    17   (0)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("N1"=1)

The index on these virtual columns has not been used by the CBO.

Summary: of all the functions I used to generate virtual columns, the only one that allowed an index created on its virtual column to be selected by the CBO (when the virtual column itself is not in the predicate part) is the trunc function.

My first thought was that the index would be used whenever the internal definition of the virtual column matches the expression I used to create it. But I was wrong, because trunc and ceil (and abs and nvl) all show the same match between their internal definition and the expression used to create them:

 select column_name col_name, data_default def
 from dba_tab_columns
 where table_name = 'T1'
 and column_name in ('VIRT_DATE','VIRT_N1', 'VIRT1_N1', 'VIRT2_N1','VIRT3_N1');

 COL_NAME       DEF
 ------------  -----------------
 VIRT_DATE     TRUNC("ORD_DATE")
 VIRT_N1       TRUNC("N1",2)
 VIRT2_N1      ABS("N1")
 VIRT3_N1      NVL("N1",0)
 VIRT1_N1      CEIL("N1")
 

Do you know why the trunc function works differently here?

Ah! By the way, I have also tested the trunc function after setting (in 12c) the hidden parameter _truncate_optimization_enabled from its default value TRUE to FALSE, and this has not changed anything


Partitioned Index: Global or local?


Recently I was investigating a performance problem using the SQL Monitoring feature, looking for SQL statements taking more than 5 seconds, when one query retained my attention. I drilled down to its corresponding SQL Monitoring report and started looking carefully at its execution plan. A particular index range scan operation retained my attention because it was consuming a huge amount of logical I/O (buffers). A careful examination of the index and its underlying table revealed that the latter is range partitioned while the former is a local non-prefixed index (a locally partitioned index which doesn’t include the partition key in its definition).

My curiosity was such that I issued a query to see how many non-prefixed indexes exist in the whole set of partitioned tables this application owns. The query, of course, returned several rows. When I asked the developer about the reason, he said that this is a “standard” they have adopted because, every couple of months, they truncate old partitions; and having local indexes (even non-prefixed ones) is a good idea in this case because they don’t have to rebuild any global indexes (by the way, there is the UPDATE GLOBAL INDEXES clause for that).

And here is where the problem resides: ignoring the technology. Partitioning is a nice feature which can damage performance when it is wrongly designed. Creating a locally non-prefixed index without knowing the collateral effects it can produce when partition pruning is not guaranteed is something I have very often seen in my daily consultancy work. In order to explain this particular client situation I have engineered the following partitioned table with 1493 partitions (you should open the file in a new page, copy the content into a .sql file and execute it), to which I have attached a locally non-prefixed index (LC_NON_PREFIXED_TYP_I). Here below are the observations I can make when selecting from this table:

SQL> desc partitioned_tab

Name               Null?     Type
------------------ --------- -------------------------------
MHO_ID             NOT NULL NUMBER(10)
MHO_DATE           NOT NULL DATE       ---> partition key
MHO_CODE           NOT NULL VARCHAR2(1 CHAR)
MHO_TYP_ID         NOT NULL NUMBER(10) ---> indexed column

The important question here is how the database would react to a query that doesn’t eliminate partitions (because it doesn’t include the partition key in its predicate) and which will be honored via a locally partitioned non-prefixed index. Something like this:

select * from partitioned_tab where mho_typ_id = 0;

In the presence of an index of this type:

CREATE INDEX LC_NON_PREFIXED_TYP_I ON partitioned_tab (MHO_TYP_ID) LOCAL;

Here are the results


-----------------------------------------------------------------------------------------------------------------------
| Id  | Operation                          | Name                  | Starts | E-Rows | A-Rows |Buffers |Pstart| Pstop |
-----------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                   |                       |      1 |        |   1493 |   2984 |      |       |
|   1 |  PARTITION RANGE ALL               |                       |      1 |   1493 |   1493 |   2984 |    1 |  1493 |
|   2 |   TABLE ACCESS BY LOCAL INDEX ROWID| PARTITIONED_TAB       |   1493 |   1493 |   1493 |   2984 |    1 |  1493 |
|*  3 |    INDEX RANGE SCAN                | LC_NON_PREFIXED_TYP_I |   1492 |   1493 |   1493 |   1492 |    1 |  1493 |
-----------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("MHO_TYP_ID"=0)

Statistics
----------------------------------------------------------
0     recursive calls
0     db block gets
2984  consistent gets
0     physical reads
0     redo size
28937 bytes sent via SQL*Net to client
372   bytes received via SQL*Net from client
4     SQL*Net roundtrips to/from client
0     sorts (memory)
0     sorts (disk)
1493  rows processed

Spot how many times the INDEX RANGE SCAN operation has been started: 1492 times. Compare this number to the number of table partitions (1493) and you will find that in this kind of situation you will do N-1 INDEX RANGE SCAN operations (where N is the number of partitions). That is an enormous waste of time and energy.

Why 1492 INDEX RANGE SCANs?

It is simply because a locally partitioned index consists of multiple segments, one per partition, in contrast to a non-partitioned b-tree index, which consists of a single segment (the count below returning 1492 rather than 1493 presumably means one partition has no segment yet, deferred segment creation being the default in this release).

SQL> select count(1) from dba_segments where segment_name = 'LC_NON_PREFIXED_TYP_I';

COUNT(1)
----------
1492

I am not saying that you must never create a locally non-prefixed index. What I am trying to emphasize is that when you decide to do so, be sure that your queries will eliminate partitions and will hence prune down to a single index partition, as shown here below when my query does partition pruning

SQL> select * from partitioned_tab
     where mho_typ_id = 0
     and  mho_date = to_date('01122012','ddmmyyyy');

MHO_ID MHO_DATE          M MHO_TYP_ID
---------- ----------------- - ----------
1 20121201 00:00:00 Z          0

-------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                          | Name                  | Starts | E-Rows | A-Rows | Buffers | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                   |                       |      1 |        |      1 |       2 |       |       |
|   1 |  PARTITION RANGE SINGLE            |                       |      1 |      1 |      1 |       2 |     2 |     2 |
|*  2 |   TABLE ACCESS BY LOCAL INDEX ROWID| PARTITIONED_TAB       |      1 |      1 |      1 |       2 |     2 |     2 |
|*  3 |    INDEX RANGE SCAN                | LC_NON_PREFIXED_TYP_I |      1 |      1 |      1 |       1 |     2 |     2 |
-------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("MHO_DATE"=TO_DATE(' 2012-12-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
3 - access("MHO_TYP_ID"=0)

Since Oracle has succeeded in eliminating the untouched partitions (PARTITION RANGE SINGLE), it has pruned down to a single index segment range scan, as shown by the Starts column which equals 1. In addition, the consistent gets (Buffers) have been drastically reduced from 2984 to only 2.

That is when your query is able to eliminate partitions. However, if you have a particular query that can’t eliminate partitions and that you want to cover via an appropriate index, then in this case you had better not partition the index locally. Let’s see this in action


SQL> alter index LC_NON_PREFIXED_TYP_I invisible;

Index altered.

SQL> create index gl_typ_i on partitioned_tab(mho_typ_id);

create index gl_typ_i on partitioned_tab(mho_typ_id)
*
ERROR at line 1:
ORA-01408: such column list already indexed

Damn!!! I can’t do it in 11.2.0.3.0: the column list is considered already indexed even though the existing index has been made invisible

SQL> drop index LC_NON_PREFIXED_TYP_I;

SQL> create index gl_typ_i on partitioned_tab(mho_typ_id);

 

SQL> select * from partitioned_tab where mho_typ_id = 0;

------------------------------------------------------------------------------------------------------------------
| Id  | Operation                          | Name            | Starts | E-Rows | A-Rows |Buffers | Pstart| Pstop |
------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                   |                 |      1 |        |   1493 |   1496 |       |       |
|   1 |  TABLE ACCESS BY GLOBAL INDEX ROWID| PARTITIONED_TAB |      1 |   1493 |   1493 |   1496 | ROWID | ROWID |
|*  2 |   INDEX RANGE SCAN                 | GL_TYP_I        |      1 |   1493 |   1493 |      4 |       |       |
------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("MHO_TYP_ID"=0)

Statistics
----------------------------------------------------------
1     recursive calls
0     db block gets
1496  consistent gets
1493  physical reads
0     redo size
28937 bytes sent via SQL*Net to client
372   bytes received via SQL*Net from client
4     SQL*Net roundtrips to/from client
0     sorts (memory)
0     sorts (disk)
1493  rows processed

And spot how many index range scans we did this time: only one, because there is only one segment for this type of index

SQL> select count(1) from dba_segments where segment_name = 'GL_TYP_I';

COUNT(1)
----------
1

You can also point out that, in contrast to the locally non-prefixed index, we did 50% less logical I/O: from 2984 down to 1496.

By the way, why do you think Oracle allows the creation of the non-prefixed index LC_NON_PREFIXED_TYP_I when it is declared non-unique, but refuses to obey when you want to create it as a unique index?

SQL> create unique index lc_non_prefixed_typ_i on partitioned_tab (mho_typ_id) local;

*
ERROR at line 1:
ORA-14039: partitioning columns must form a subset of key columns of a UNIQUE index

Simply because Oracle has already measured the impact this kind of index would have on insert performance if it were allowed to exist. In this case a given mho_typ_id value could go into any of the 1493 partitions. How would Oracle check that the inserted mho_typ_id value has not already been inserted (or is not being inserted) without blocking inserts across the whole set of the remaining 1492 partitions? Is this scalable and performant? Of course it is not.
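This is what ORA-14039 is telling us: the partition key must be part of the key of any unique local index. A minimal sketch of what Oracle does accept (hypothetical index name; note that this enforces uniqueness of the couple (mho_typ_id, mho_date), not of mho_typ_id alone):

SQL> create unique index lc_prefixed_uk on partitioned_tab (mho_typ_id, mho_date) local;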

Bottom Line: when you create a locally non-prefixed index (an index that doesn’t include the partition key in its definition), be sure that queries using this index will eliminate partitions. Otherwise, the more partitions you have, the more index partitions you will range scan and the more logical I/O you will do.


Cost-Based Oracle Fundamentals: a magic book


It was back in late 2005 or early 2006 when I was prompted by an Oracle magazine issue to buy the then best-selling Oracle book of the year: Cost-Based Oracle Fundamentals. When Amazon shipped me this book, I was really disappointed when I started browsing its content. The first question that came to my mind was: is this book speaking about an Oracle technology? You can imagine how much this question revealed my degree of Oracle ignorance at that time. In my defense, I was working for a customer as a PL/SQL developer under Oracle 8i. My task was to faithfully transform business requirements into technical requirements and then into a set of stored procedures. Oracle 8i was under the rule-based model, while the book I had bought explains the fundamentals of the Oracle cost-based optimizer. I was in such a situation that inevitably the content of this new book did not match my interests. So I put it on hold.

Several years later, I started a new job where troubleshooting performance issues was a crucial part. The application had been upgraded from 8i (rule based) to 10gR2 (cost based) and was suffering terrible performance problems. It was time for me to wake both the book and my head from their hibernation. The degustation of the book’s content began.

More than 4 years after I started troubleshooting performance problems, and particularly bad query execution times, I am still using and savoring the content of this unrivalled and unequalled book.

There are 14 chapters in this book; I am not going to tell you which chapter is a must-read or which has most retained my attention. This is not a book to simply read and re-read. This is a book to learn by heart. This is a book you should always have with you when troubleshooting bad query execution times. It is a Swiss Army knife allowing a dissection of the CBO.

Simply put, this book is such that if, when speaking with an experienced Oracle tuning DBA-developer, I come to realize that he still has not read it, I can immediately measure the gap he has to fill before he will start doing his job correctly (unless he has read the Performance Guide itself :-) ).



Interval Partitioning and PSTOP in execution plan


The latest issue of the Northern California Oracle Users Group Journal published a Tom Kyte article entitled “Advice for an Oracle Beginner?”, in which the author wrote the following phrase: “Participation in the Oracle community is what took me from being just another programmer to being ‘AskTom’. Without the act of participating, I do not think I would be where I am today.”

That is 100% correct. I try to participate in several Oracle forums including OTN, the French Oracle forum and oracle-list. If I can’t bring any help, I try to understand the original poster’s (OP) question and analyze the answers brought by other participants. Proceeding this way, I have learnt many things, among them what I learnt today via this otn thread.

For convenience (because it will be very easy for me to find it when needed) I decided to summarize what I learnt today in this very brief article. Observe carefully the following execution plan and spot particularly the PSTOP information (1048575) at lines 1 and 2

-------------------------------------------------------------------------------------------
| Id  | Operation                | Name                   | Rows  | Bytes | Pstart| Pstop |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |                        |       |       |       |       |
|   1 |  PARTITION RANGE ITERATOR|                        |     2 |    70 |   KEY |1048575|
|*  2 |   TABLE ACCESS FULL      | PARTITION_INTERVAL_TAB |     2 |    70 |   KEY |1048575|
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("TRADE_DATE">=TRUNC(SYSDATE@!))

Does this particular table contain 1,048,575 partitions? Of course it doesn’t.

The name of the table already gives a hint of what is happening here. The table is range partitioned and it is also using the interval partitioning feature. I will expose in a moment the second condition that causes this magic number to appear. Here below is the model

SQL> CREATE TABLE partition_interval_tab (
  n1 NUMBER
 ,trade_date DATE
 ,n2 number
 )
 PARTITION BY RANGE (trade_date)
 INTERVAL (NUMTOYMINTERVAL(1,'MONTH'))
 (
 PARTITION p_1 values LESS THAN (TO_DATE(' 2013-11-11 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN'))
 ,PARTITION p_2 values LESS THAN (TO_DATE(' 2013-12-11 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN'))
 );

SQL> insert into partition_interval_tab values (1, trunc(sysdate), 100);
SQL> insert into partition_interval_tab values (2, trunc(sysdate + 20), 200);
SQL> commit;
SQL> select * from partition_interval_tab where trade_date = trunc(sysdate);

N1 TRADE_DATE                N2
---------- ----------------- ----------
1 20131108 00:00:00        100

-----------------------------------------------------------------------------------------------------------------
| Id  | Operation              | Name                   | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |                        |       |       |    15 (100)|          |       |       |
|   1 |  PARTITION RANGE SINGLE|                        |     1 |    35 |    15   (0)| 00:00:01 |   KEY |   KEY |
|*  2 |   TABLE ACCESS FULL    | PARTITION_INTERVAL_TAB |     1 |    35 |    15   (0)| 00:00:01 |   KEY |   KEY |
-----------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("TRADE_DATE"=TRUNC(SYSDATE@!))

Nothing special to point out here when the predicate on the partition key is an equality. But spot what happens when I change the predicate to an inequality (>=)

SQL> select * from partition_interval_tab where trade_date >= trunc(sysdate);

N1 TRADE_DATE                N2
---------- ----------------- ----------
1 20131108 00:00:00        100
2 20131128 00:00:00        200

-------------------------------------------------------------------------------------------
| Id  | Operation                | Name                   | Rows  | Bytes | Pstart| Pstop |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |                        |       |       |       |       |
|   1 |  PARTITION RANGE ITERATOR|                        |     2 |    70 |   KEY |1048575|
|*  2 |   TABLE ACCESS FULL      | PARTITION_INTERVAL_TAB |     2 |    70 |   KEY |1048575|
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("TRADE_DATE">=TRUNC(SYSDATE@!))

And here you are: the magic number 1048575 appears. It seems to represent the upper bound of the partition range, 1048575 (that is 1024*1024 - 1) being the maximum number of partitions a partitioned table can have; with interval partitioning, the optimizer cannot know at parse time how many partitions will ultimately exist above the KEY starting point, so it shows this theoretical maximum as Pstop.
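To confirm that the table itself contains only a handful of real partitions (a minimal sketch; interval partitions materialize only as data arrives):

SQL> select count(*)
     from   user_tab_partitions
     where  table_name = 'PARTITION_INTERVAL_TAB';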

Thanks again to Jonathan Lewis who showed this information in the above-mentioned otn thread.

I also include a few references I found when searching for the number 1048575 on the internet

https://forums.oracle.com/thread/2230426?start=15&tstart=0

http://docs.oracle.com/cd/B28359_01/server.111/b32024/part_avail.htm

And this number also seems to be used by Oracle when auto-tuning the db_file_multiblock_read_count parameter value


Log file sync and user commits


This is a brief note to demonstrate where a difference between the number of log file sync wait events and user commits might come from. Very recently it was pointed out to me in an otn thread that my AWR report contained inconsistent numbers of log file sync waits and user commits, as shown below:

Top 5 Timed Foreground Events

Event                        Waits      Time(s)  Avg wait (ms)  % DB time  Wait Class
db file sequential read      1,234,292  6,736    5              44.97      User I/O
DB CPU                                  5,251                   35.05
log file sync                83,846     1,594    19             10.64      Commit
log file switch completion   1,256      372      296            2.48       Configuration
enq: TX - index contention   19,327     310      16             2.07       Concurrency

In this 60-minute AWR report I have 83,846 log file sync waits, suggesting about 23 commits per second (83,846/3600)

Instance Activity Stats

Statistic      Total    per Second  per Trans
user commits   701,112  193.72      0.8

But why is the Instance Activity Stats section of the same report showing 193 user commits per second? Where does this discrepancy come from?

Well, I was also given a clue in the same otn thread, which I have tested and which I am reporting here in this article

SQL> create table t (n1 number);

SQL> start c:\commits#

NAME                        VALUE
--------------------------- -----
user commits                 5

SQL> start c:\events#

EVENT                      TOTAL_WAITS
-------------------------- -----------
log file sync               5

In the next simple PL/SQL anonymous block I am inserting inside a for loop and committing only once, outside the loop

 SQL>  BEGIN
          FOR j in 1..10 LOOP
             insert into t values(j);
          END LOOP;
        commit;
      END;
     /

 PL/SQL procedure successfully completed.
 

If I now check the log file sync and user commits numbers, I see that both figures have been incremented once and hence are in perfect synchronisation

SQL> start c:\commits#

NAME                VALUE
------------------- ------
user commits        6

SQL> start c:\events#

EVENT                TOTAL_WAITS
-------------------- -----------
log file sync        6

However, if I commit inside the loop (you shouldn’t commit across fetch, by the way; this is not a recommended PL/SQL habit), the figures start to deviate.

 SQL> BEGIN
      FOR j in 1..10 LOOP
       INSERT INTO T VALUES(J);
       commit;
      END LOOP;
 END;
 /

PL/SQL procedure successfully completed.

 

SQL> start c:\commits#

NAME                    VALUE
----------------------- ----------
user commits             16   --> old value was 6

SQL> start c:\events#

EVENT                    TOTAL_WAITS
------------------------ -----------
log file sync             7  --> old value was 6

And this is where the discrepancy starts. I have an extra 10 user commits while my log file sync wait event underwent a single wait increment. This is a PL/SQL feature which seems to increment log file sync per PL/SQL block call and not per commit call: inside the block the commits apparently do not wait for the log writer, and the session seems to wait for log file sync only once, when control returns to the client.

Bottom line: you had better worry about your log file sync figures instead of your user commits value when these two values do not match
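If you want to experiment further, forcing a synchronous commit inside the loop should presumably bring the two figures back in sync (a sketch using the COMMIT WRITE syntax available since 10gR2; I have not run this variant against the counters above):

SQL> BEGIN
2      FOR j in 1..10 LOOP
3       INSERT INTO T VALUES(J);
4       COMMIT WRITE WAIT; -- wait for the log writer on each commit
5      END LOOP;
6  END;
7  /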

In case you want to play with this example, here below are the commits# and events# sql files respectively

select b.name, a.value
from
v$mystat a,v$statname b
where
a.statistic# = b.statistic#
and b.name = 'user commits';

--

select a.event, a.total_waits
from
 v$session_event a
,v$event_name b
where
a.event = b.name
and  b.name = 'log file sync'
and  a.sid = 578 -- adapt with your sid
;

Expert Oracle Database Architecture: buy your Oracle job insurance


In the process of reviewing books I bought almost a decade ago (first edition), today it is the turn of Tom Kyte’s Expert Oracle Database Architecture.

The review of this book, and of a couple of others that I hope will follow in the near future, will not be a classical review in which I explain what I learnt in Chapter 1 and what I most appreciated in Chapter 4 and so on. What I prefer to emphasize is what this book brought to my daily Oracle consultancy work. It is as if I was shooting in the dark before reading this book, and the light came on after I started investigating its content.

I remember a performance crisis meeting to which I had been invited as one of the suffering Oracle application developers. The application was inadequately range partitioned, using a partition key that was never invoked in the client business queries, so there was practically no partition elimination. All partitioned tables had been given a composite primary key (id, partition key) policed via a unique locally partitioned index. A few days before the crisis meeting, I was religiously reading Chapter 13 about partitioning in which, among other interesting things, Tom Kyte explains the relationship between local indexes and unique constraints, which must absolutely include the partition key in their definition to be allowed to exist: Oracle enforces uniqueness only within an index partition, never across partitions.

To troubleshoot this performance issue, the newly recruited DB architect suggested with authority (a) transforming all global (b-tree) indexes into locally partitioned ones and (b) getting rid of the partition key from the local primary key index.

I am against advice that says “transform all”. In Oracle there are always “it depends” situations that make the “change all” advice very often, if not always, a bad one. But Tom Kyte’s words about local indexes and unique constraints were still ringing in my ears, so I couldn’t resist the temptation to stop by and say to the architect: “You need to review your partitioning skills before suggesting such an impossible unique index change”. I would never have said that if I hadn’t been in touch with this book.

This book gave me the self-confidence I was lacking to develop performant, scalable and available Oracle applications. It gave me a good picture of how Oracle works. I learnt via its content how to model and test a situation before jumping to a hurried conclusion.

I need no effort to persuade you to keep this book with you. Just go to the Ask Tom web site, see the enormous work the author has done there, and you will realize how smart you would be to buy this book.


SQLTXPLAIN under Oracle 12c


I like Tanel Poder’s snapper and Carlos Sierra’s SQLTXPLAIN very much. They are valuable performance diagnostic tools. Unfortunately I am still waiting to find a customer site where I will be allowed, or granted the necessary privileges, to install and use them. There are client sites where I have been asked to tune queries without even the possibility of executing dbms_xplan.display_cursor, let alone installing SQLTXPLAIN under the SYS user or having select granted on x$ tables.

This is why I have installed them on my personal laptop and use them very often in my personal Oracle research and development (R&D). Although personal work is peanuts compared with onsite Oracle consultancy work, I didn’t renounce using them “at home”.

I have already successfully installed SQLTXPLAIN on Oracle 11.2.0.1. My first work with SQLTXPLAIN was to go back to personally engineered performance problems “traditionally” solved and ask myself: “What would I have pointed out using SQLTXPLAIN in such a performance issue?”. Up to now this is my sole strategy for using this tool, as in what I have published here and here.

My second step in deepening my SQLTXPLAIN R&D was to buy Stelios Charalambides’ book Oracle SQL Tuning with Oracle SQLTXPLAIN. In the meantime Oracle 12c was released and naturally I installed this release after having uninstalled the previous one.

I am still reviewing this book. Last week I finished reviewing Chapter 8 and thought that it was time to devote some time to this tool again, because it makes no sense to review this book without having at least the main html report produced by the SQLTXTRACT module of the SQLT tool. So I decided to install it on my personal Oracle 12c database.

C:\sqlt\install>sqlplus sys/sys@orcl as sysdba

SQL> select DBID, name, CDB, CON_ID, CON_DBID from v$database;

DBID        NAME      CDB     CON_ID   CON_DBID
---------- --------- --- ---------- ----------
1352104669 ORCL      YES          0 1352104669

As you can see from the above select I am going to install SQLTXPLAIN on the container DB

SQL> start sqcreate

…

SQUTLTEST completed.
adding: 131117093318_10_squtltest.log (160 bytes security) (deflated 61%)
no rows selected

Disconnected from Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

Oops!! I had been abruptly disconnected; something went wrong for sure. After having tried several times without success I decided to contact Mauro Pagano from Oracle support. Thanks to the generated installation log file I sent him, he immediately answered that I was trying to install SQLT on a container DB, which is impossible for the moment. He kindly suggested that I install it on a pluggable DB and let him know the results of this new installation. So I embarked again on a new installation.

First I figured out how to connect to the pluggable database

C:\sqlt\install>sqlplus sys/sys@localhost:1521/pdborcl as sysdba;

SQL> select name, con_id from v$active_services;

NAME               CON_ID
------------------ --------
pdborcl              3
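
As an aside, when you are already connected to the root container as a suitably privileged common user, you can also switch to the pluggable database without reconnecting (a sketch):

SQL> alter session set container = pdborcl;

Session altered.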

If the pluggable database is not already open, then open it


SQL> alter database pdborcl open;

alter database pdborcl open
*
ERROR at line 1:
ORA-65019: pluggable database PDBORCL already open

And finally I launched sqcreate, which I preceded by sqdrop for a clean starting situation


SQL> start sqcreate

………

SQLT users must be granted SQLT_USER_ROLE before using this tool.

SQCREATE completed. Installation completed successfully.

Fortunately, this time the installation finished successfully, ending with the instruction above, which I religiously followed

SQL> create user mohamed identified by mohamed;

User created.

SQL> grant SQLT_USER_ROLE to mohamed;

Grant succeeded.

Am I now ready to use this tool against a 12c pluggable DB? Let's test


C:\sqlplus mohamed/mohamed@localhost:1521/pdborcl

SQL> create table t1 as select rownum n1 from dual connect by level<=10;

SQL> create index i1 on t1(n1);

SQL> select * from t1 where rownum<=1;

N1
----------
1

SQL> select * from table(dbms_xplan.display_cursor);

SQL_ID  7yzrbhp4b6vhr, child number 0
Plan hash value: 3836375644

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |     2 (100)|          |
|*  1 |  COUNT STOPKEY     |      |       |       |            |          |
|   2 |   TABLE ACCESS FULL| T1   |     1 |     3 |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(ROWNUM<=1)

And what if I apply SQLTXTRACT to the above simple SQL query?


SQL> start c:\sqlt\run\sqltxtract 7yzrbhp4b6vhr My_Password

…/…

SQLDX files have been created.
Archive:  sqlt_s86941_sqldx.zip
Length  Date       Time    Name
------- ---------- ----- ----
4631   17/11/2013 10:20  sqlt_s86941_sqldx_7yzrbhp4b6vhr_csv.zip
28048  17/11/2013 10:20  sqlt_s86941_sqldx_global_csv.zip
4363   17/11/2013 10:20  sqlt_s86941_sqldx_7yzrbhp4b6vhr_log.zip
---------                -------
37042                     3 files

adding: sqlt_s86941_sqldx.zip (160 bytes security) (stored 0%)

SQLTXTRACT completed.

And finally it works as shown via this pdf file sqlt_s86941_main

Bottom line

  1. The SQLTXPLAIN tool can currently be installed only on a pluggable DB in Oracle 12c
  2. In Mauro Pagano you have a very modest person, always ready to help you troubleshoot SQLT installation or the use of SQLT's different modules
  3. You need to start exploring this tool. It is worth the investigation, believe me

On how important is collecting statistics adequately


Very recently I was handed the following sub-optimal execution plan of a classical MERGE statement between two tables, t1 and t2

--------------------------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name      | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------------
|   0 | MERGE STATEMENT                        |           |   379 | 24635 |  1107  (19)| 00:00:05 |       |       |
|   1 |  MERGE                                 | T1        |       |       |            |          |       |       |
|   2 |   VIEW                                 |           |       |       |            |          |       |       |
|   3 |    SEQUENCE                            | SEQ_SEQ   |       |       |            |          |       |       |
|*  4 |     HASH JOIN OUTER                    |           |   379 | 23119 |  1107  (19)| 00:00:05 |       |       |
|   5 |      TABLE ACCESS BY GLOBAL INDEX ROWID| T2        |   372 |  8184 |     5   (0)| 00:00:01 | ROWID | ROWID |
|*  6 |       INDEX RANGE SCAN                 | AUDIT_IND |   383 |       |     3   (0)| 00:00:01 |       |       |
|   7 |      PARTITION RANGE ALL               |           |  5637K|   209M|  1046  (15)| 00:00:05 |     1 |     2 |
|   8 |       TABLE ACCESS FULL                | T1        |  5637K|   209M|  1046  (15)| 00:00:05 |     1 |     2 |
--------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("L"."COL_SK"(+)="COL_SK")
6 - access("AUDIT_ID"=246395)

According to this execution plan there are only 379 rows to be merged. But the client was complaining about the 12 seconds this merge statement was taking and asked for an improvement. Well, it is quite clear from the supplied information that the CBO is wrongly hash joining 5,637,000 rows from the probed table t1 (operation 8) with 372 rows from the build table t2 (operation 5 and predicate 6), burning a lot of CPU at operation 4 and finally throwing away the majority of the 5,637,000 rows from table t1 that do not fulfill the join condition (predicate 4).

This is the MERGE statement:

MERGE INTO T1 L USING
(SELECT COL_SK,LOAD_DTS
FROM T2
WHERE AUDIT_ID = 246395) H
ON (L.COL_SK = H.COL_SK)
WHEN NOT MATCHED THEN
INSERT
(COL1,COL_SK,COL2,LOAD_DTS ,ORIG,AUDIT_ID)
VALUES
(SEQ_SEQ.NEXTVAL,H.COL_SK,-1,H.LOAD_DTS,'MHO',246395);

Looking carefully at the above statement I realized that the 372 rows are coming from this part:

(SELECT
   COL_SK,
   LOAD_DTS
FROM T2
WHERE AUDIT_ID = 246395
) H

There are 372 values of the join column, COL_SK, to be merged. So why has the CBO not started by getting those 372 values from table t2 and then looked up the second table t1 on the join column to satisfy the join condition

 ON (L.COL_SK = H.COL_SK)

This kind of execution path becomes even more adequate when you know that there is, additionally, a unique index on table t1 starting with the join column

create unique index IND_T1_UK on t1 (col_sk, other_column);

When I hinted the MERGE statement to use this unique index

MERGE /*+ index(L IND_T1_UK) */ INTO T1 L USING
(SELECT COL_SK,LOAD_DTS
FROM T2
WHERE AUDIT_ID = 246395) H
ON (L.COL_SK = H.COL_SK)
WHEN NOT MATCHED THEN
INSERT
(COL1,COL_SK,COL2,LOAD_DTS ,ORIG,AUDIT_ID)
VALUES
(SEQ_SEQ.NEXTVAL,H.COL_SK,-1,H.LOAD_DTS,'MHO',246395);

I got the optimal plan: the plan I wanted the CBO to use without any hint.

----------------------------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name        | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------------------------------
|   0 | MERGE STATEMENT                        |             |   379 | 24635 |  1116   (1)| 00:00:05 |       |       |
|   1 |  MERGE                                 | T1          |       |       |            |          |       |       |
|   2 |   VIEW                                 |             |       |       |            |          |       |       |
|   3 |    SEQUENCE                            | SEQ_SEQ     |       |       |            |          |       |       |
|   4 |     NESTED LOOPS OUTER                 |             |   379 | 23119 |  1116   (1)| 00:00:05 |       |       |
|   5 |      TABLE ACCESS BY GLOBAL INDEX ROWID| T2          |   372 |  8184 |     5   (0)| 00:00:01 | ROWID | ROWID |
|*  6 |       INDEX RANGE SCAN                 | AUDIT_IND   |   383 |       |     3   (0)| 00:00:01 |       |       |
|   7 |      TABLE ACCESS BY GLOBAL INDEX ROWID| T1          |     1 |    39 |     3   (0)| 00:00:01 | ROWID | ROWID |
|*  8 |       INDEX RANGE SCAN                 | IND_T1_UK   |     1 |       |     2   (0)| 00:00:01 |       |       |
----------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
6 - access("AUDIT_ID"=246395)
8 - access("L"."COL_SK"(+)="COL_SK")

You can easily point out from the above two execution plans that the cost of the NESTED LOOP (1116) is greater than the cost of the HASH JOIN (1107), and this is the main reason why the CBO prefers the sub-optimal HASH JOIN plan. A HASH JOIN being unable to perform index lookups on the probed table t1 using the join column col_sk, the CBO traumatized the merge statement with a costly full table scan.

Finally, solving my problem resides in understanding why the cost of the NESTED LOOP (NL) is greater than the cost of the HASH JOIN.

The cost of a nested loop join is given by

Cost of acquiring data from first table +
               Cardinality of result from first table * Cost of single visit to second table
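
Plugging the figures from the two plans above into this formula gives a rough sanity check (a back-of-the-envelope sketch; the CBO's real arithmetic includes rounding and adjustments not reproduced here):

NL cost ≈ 5 + 372 * 3 = 1121  --> close to the reported 1116

where 5 is the cost of acquiring the 372 rows from t2 and 3 is the cost of a single indexed visit to t1.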

So, in your opinion, what piece of statistics information could play an important role in the decision the CBO makes about the NL join method? Think a little bit before scrolling down the page




















It is the density of the join column (col_sk).

SQL> select
          num_distinct
         ,density
         ,num_nulls
         ,last_analyzed
         ,sample_size
         ,global_stats
         ,histogram
  from   all_tab_col_statistics
  where table_name ='T1'
  and   column_name='COL_SK';

NUM_DISTINCT    DENSITY  NUM_NULLS LAST_ANALYZED       SAMPLE_SIZE  GLO HISTOGRAM
------------ ---------- ---------- -------------------   -----------  --- -------
5,546,458       0.000000180     0   2013/11/07 22:00:31   5,546,458  YES NONE     --> spot the join column last_analyzed date

select
    table_name
    ,last_analyzed
from all_tables
where table_name = 'T1';

TABLE_NAME    LAST_ANALYZED
------------- ---------------------
T1            2013/11/21 05:00:31     --> spot the table last_analyzed date

select count(1) from (select distinct col_sk from t1)---> 5,447,251

Two important points are to be emphasized here:

1. The table t1 has been last analyzed on 21/11/2013 while the join column has been last analyzed on 07/11/2013
2. There is about 100K difference between the real count of distinct(col_sk) and num_distinct of col_sk as taken from the all_tab_col_statistics table

A deeper look at the parameters used to compute the table statistics shows that they were collected using two inadequate parameter values:

  method_opt        => null
  estimate_percent  => a given value

The first setting, when used, translates to: collect stats on the table and ignore stats on the columns. This explains the difference between the last analyzed date of the table and that of its join column. It also explains the discrepancy noticed between num_distinct of the join column and its real count when taken directly from table t1. My preferred value for the method_opt parameter is:

 method_opt        => 'for all columns size 1'

which collects stats for all columns without collecting histograms.
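
A quick way to verify the effect afterwards is to re-query the same dictionary view used above (a sketch):

select column_name, num_distinct, last_analyzed, histogram
from   all_tab_col_statistics
where  table_name = 'T1';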

The second parameter (estimate_percent) indicates the sample size (precision) at which statistics should be gathered. From 11g onwards, my preferred value for this parameter is

estimate_percent  => dbms_stats.auto_sample_size

Particularly when approximate_ndv is set to true

SELECT DBMS_STATS.get_prefs('approximate_ndv') FROM dual;

DBMS_STATS.GET_PREFS('APPROXIMATE_NDV')
--------------------------------------------------------------------------------
TRUE
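
Had this preference not been set, it could be switched on as follows (a sketch; changing global dbms_stats preferences requires the appropriate privileges):

SQL> exec dbms_stats.set_global_prefs('APPROXIMATE_NDV','TRUE')

PL/SQL procedure successfully completed.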

Back to my MERGE problem. When the statistics were gathered again on table t1 using the above adequate dbms_stats parameters

BEGIN
   dbms_stats.gather_table_stats  (ownname          => user,
                                   tabname          => 'T1',
                                   estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE,
                                   cascade          => true,
                                   method_opt       => 'FOR ALL COLUMNS SIZE 1'                          
                                  );
END;
/

the CBO automatically started selecting the optimal plan, using the NESTED LOOP JOIN, as shown via the following execution plan taken from memory


----------------------------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name        | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------------------------------
|   0 | MERGE STATEMENT                        |             |       |       |  1078 (100)|          |       |       |
|   1 |  MERGE                                 | T1          |       |       |            |          |       |       |
|   2 |   VIEW                                 |             |       |       |            |          |       |       |
|   3 |    SEQUENCE                            | SEQ_SEQ     |       |       |            |          |       |       |
|   4 |     NESTED LOOPS OUTER                 |             |   357 | 21777 |  1078   (1)| 00:00:05 |       |       |
|   5 |      TABLE ACCESS BY GLOBAL INDEX ROWID| T2          |   357 |  7854 |     5   (0)| 00:00:01 | ROWID | ROWID |
|*  6 |       INDEX RANGE SCAN                 | AUDIT_IND   |   367 |       |     3   (0)| 00:00:01 |       |       |
|   7 |      TABLE ACCESS BY GLOBAL INDEX ROWID| T1          |     1 |    39 |     3   (0)| 00:00:01 | ROWID | ROWID |
|*  8 |       INDEX RANGE SCAN                 | IND_T1_UK   |     1 |       |     2   (0)| 00:00:01 |       |       |
----------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
6 - access("AUDIT_ID"=:B1)
8 - access("L"."COL_SK"="COL_SK")

Now, before ending this article I would like to show two things:

The predicate of the sub-optimal plan

Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("L"."COL_SK"(+)="COL_SK")
6 - access("AUDIT_ID"=:B1)

And the predicate part of the optimal plan

Predicate Information (identified by operation id):
---------------------------------------------------
6 - access("AUDIT_ID"=:B1)
8 - access("L"."COL_SK"="COL_SK")

In the optimal plan we start by taking the 357 (or 379) rows to be merged from the outer table using predicate 6, and we then scan the inner table with these 357 join column values via an index range scan. The majority of the rows are eliminated much earlier.

In the sub-optimal plan it is the last operation, the HASH JOIN OUTER, that eliminates the majority of the rows. Which means in this case: we started big and finally finished small. But to have a performant query we need to start small and stay small.

Bottom line: when you meet a sub-optimal execution plan using a HASH JOIN driven by a costly full table scan of the probed table, while you know that the probed table could very quickly be transformed into the inner table of a NESTED LOOP join scanned via a precise index access on the join column, then verify the statistics of the join column, because they play an important role in the cost of the NESTED LOOP join.


Why?


Why? This is a fundamental question. And in the context of my work (or actually, I should say, my hobby) I often ask "why". Yesterday I was handed a query to tune. This query was honored with the following sub-optimal execution plan

---------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name                    | Starts | E-Rows | A-Rows |   A-Time   |
---------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                         |      1 |        |     10 |00:02:13.09 |
|   1 |  SORT ORDER BY                      |                         |      1 |      8 |     10 |00:02:13.09 |
|   2 |   NESTED LOOPS                      |                         |      1 |        |     10 |00:02:13.06 |
|   3 |    NESTED LOOPS                     |                         |      1 |      8 |     10 |00:02:13.06 |
|   4 |     NESTED LOOPS                    |                         |      1 |      8 |     10 |00:02:13.06 |
|*  5 |      HASH JOIN SEMI                 |                         |      1 |      8 |     10 |00:02:13.06 |
|   6 |       JOIN FILTER CREATE            | :BF0000                 |      1 |        |   1469 |00:00:02.06 |
|   7 |        NESTED LOOPS                 |                         |      1 |        |   1469 |00:00:00.17 |
|   8 |         NESTED LOOPS                |                         |      1 |    307 |   5522 |00:00:00.11 |
|*  9 |          TABLE ACCESS BY INDEX ROWID| T2                      |      1 |    316 |   5522 |00:00:00.07 |
|* 10 |           INDEX RANGE SCAN          | T2_OOST_START_DATE_I    |      1 |   1033 |   8543 |00:00:00.03 |
|* 11 |          INDEX RANGE SCAN           | T1_OBST_OOST_DK_I       |   5522 |      1 |   5522 |00:00:00.08 |
|* 12 |         TABLE ACCESS BY INDEX ROWID | T1                      |   5522 |      1 |   1469 |00:00:00.13 |
|  13 |       VIEW                          | VW_SQ_1                 |      1 |  64027 |   1405 |00:00:07.82 |
|* 14 |        FILTER                       |                         |      1 |        |   1405 |00:00:07.82 |
|  15 |         JOIN FILTER USE             | :BF0000                 |      1 |  64027 |   1405 |00:00:07.82 |
|  16 |          PARTITION REFERENCE ALL    |                         |      1 |  64027 |  64027 |00:01:48.22 |
|* 17 |           HASH JOIN                 |                         |     52 |  64027 |  64027 |00:02:03.37 | --> spot this
|  18 |            TABLE ACCESS FULL        | T4                      |     52 |  64027 |  64027 |00:00:00.34 |
|* 19 |            TABLE ACCESS FULL        | T5                      |     41 |    569K|   5555K|00:02:08.32 | --> spot this
|  20 |      TABLE ACCESS BY INDEX ROWID    | T3                      |     10 |      1 |     10 |00:00:00.01 |
|* 21 |       INDEX UNIQUE SCAN             | T3_CHP_PK               |     10 |      1 |     10 |00:00:00.01 |
|* 22 |     INDEX UNIQUE SCAN               | T3_CHP_PK               |     10 |      1 |     10 |00:00:00.01 |
|  23 |    TABLE ACCESS BY INDEX ROWID      | T3                      |     10 |      1 |     10 |00:00:00.01 |
---------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   5 - access("ITEM_1"="OBST"."OBST_ID")
       filter("ITEM_2">"OBST"."START_DATE")
   9 - filter(("OOST"."CHP_ID_START" IS NOT NULL AND "OOST"."CHP_ID_END" IS NOT NULL))
  10 - access("OOST"."START_DATE">=:LD_FROM_DATE AND "OOST"."START_DATE"<=:LD_TO_DATE)
  11 - access("OBST"."OOST_ID"="OOST"."OOST_ID")
  12 - filter(("OBST"."OBS_STAT"=2 AND ADD_MONTHS(INTERNAL_FUNCTION("OBST"."MOD_DATE"),13)>=:LD_CURR_DATE))
  14 - filter((:LN_MIN_ac_number<=:ln_max_number AND TO_DATE(:LD_FROM_DATE)<=TO_DATE(:LD_TO_DATE)))
  17 - access("OBSV"."VEH_ID"="VEH"."VEH_ID")
  19 - filter(((:LN_OPR_ID IS NULL OR "VEH"."OPR_ID"=:LN_OPR_ID) AND "VEH"."ac_number">=:LN_MIN_ac_number AND
              "VEH"."ac_number"<=:ln_max_number))
  21 - access("OOST"."CHP_ID_START"="CHP1"."CHP_ID")
  22 - access("OOST"."CHP_ID_END"="CHP2"."CHP_ID")

This query takes 2 minutes and 13 seconds to complete, whereas a single operation (operation 19) takes 2 minutes and 8 seconds before feeding its parent operation 17. At least I know where to focus my attention.

But why? Why this full table scan?

Just before getting this execution plan I had asked for statistics to be re-gathered on the tables and columns, without histograms. The statistics seem to be okay. And, by the way, it is normal for the CBO to choose a full table scan to generate 5.5 million rows. Isn't it?

So where is the problem?

An expert eye would say: "Look, you start by manipulating 5.5 million rows only to end up with 10 rows; you need to start small and stay small."
So it is now time to investigate the query

SELECT  
                obst.obst_id obstructionId
               ,oost.comment_clob CommentClob
               ,chp1.ptcar_no StartPtcar
               ,chp2.ptcar_no EndPtcar
               ,oost.track_code Track
               ,oost.start_date StartPeriod
               ,oost.end_date EndPeriod
               ,oost.doc_no RelaasId
               ,obst.status_code Status
        FROM   T1 obst
             , T2 oost
             , T3 chp1
             , T3 chp2
          where obst.oost_id      = oost.oost_id
          and oost.chp_id_start = chp1.chp_id
          and oost.chp_id_end   = chp2.chp_id
          and obst.obs_stat     = 2 
		  and add_months(obst.mod_date,13) >= :ld_curr_date
		  and oost.start_date between :ld_from_date and :ld_to_date          
          and exists (select 1
                        from T4  obsv
                           , T5  veh
                        where  obsv.veh_id = veh.veh_id
		       and (:ln_opr_id is null
                               OR veh.opr_id = :ln_opr_id
                            )
                          and  obsv.obst_id = obst.obst_id
                          and  veh.ac_number between :ln_min_number and :ln_max_number
                          and  obsv.start_date > obst.start_date
                      )          
         order by obst.obst_id;		

Since the most time consuming operation (the full access of table T5) is in the EXISTS part, I commented it out and ran the query which, unsurprisingly, came back instantaneously with about a thousand rows (1,469). All in all, the commented part, when uncommented, would normally keep from those 1,469 rows only the ones fulfilling the EXISTS clause and throw away the rest. We then need to manipulate at most 1,469 records. So where do these enormous 5.5 million rows come from?
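
Concretely, the stripped-down test looked like this (a sketch: the EXISTS branch is simply removed, everything else left untouched):

SELECT obst.obst_id obstructionId
      ,oost.comment_clob CommentClob
      ,chp1.ptcar_no StartPtcar
      ,chp2.ptcar_no EndPtcar
      ,oost.track_code Track
      ,oost.start_date StartPeriod
      ,oost.end_date EndPeriod
      ,oost.doc_no RelaasId
      ,obst.status_code Status
FROM   T1 obst, T2 oost, T3 chp1, T3 chp2
WHERE  obst.oost_id      = oost.oost_id
AND    oost.chp_id_start = chp1.chp_id
AND    oost.chp_id_end   = chp2.chp_id
AND    obst.obs_stat     = 2
AND    add_months(obst.mod_date,13) >= :ld_curr_date
AND    oost.start_date between :ld_from_date and :ld_to_date
-- and exists (...) commented out for the test
ORDER BY obst.obst_id;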

Back to the execution plan, where I decided to focus my attention only on the following operations:

---------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name                    | Starts | E-Rows | A-Rows |   A-Time   |
---------------------------------------------------------------------------------------------------------------
|* 17 |           HASH JOIN                 |                         |     52 |  64027 |  64027 |00:02:03.37 | 
|  18 |            TABLE ACCESS FULL        | T4                      |     52 |  64027 |  64027 |00:00:00.34 |
|* 19 |            TABLE ACCESS FULL        | T5                      |     41 |    569K|   5555K|00:02:08.32 | 
---------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
  17 - access("OBSV"."VEH_ID"="VEH"."VEH_ID")
  19 - filter(((:LN_OPR_ID IS NULL OR "VEH"."OPR_ID"=:LN_OPR_ID) AND "VEH"."ac_number">=:LN_MIN_ac_number AND
              "VEH"."ac_number"<=:ln_max_number))
 

The CBO started by generating 5555K rows using filter predicate 19 and passed this enormous amount of data to the HASH JOIN at operation 17, which threw away the majority of those rows: only 64,027 survived the join condition indicated by predicate 17. It is clear that the ideal situation would have been to (a) join first and (b) filter in a second step. So why (again) has the CBO opted for the reverse: (a) filter first and (b) join afterwards? The filtering bind variables (:LN_MIN_AC_NUMBER, :LN_MAX_NUMBER) are not filtering anything because they represent the real min(veh.ac_number) and max(veh.ac_number)

Why has the CBO opted for this sub-optimal execution plan? What can I do to make the CBO join first and filter afterwards? Statistics seem okay, as I said above. I spent a couple of hours searching for a way to make this possible without changing the query, but failed. This is why I asked a question on the OTN forum, where Jonathan Lewis gave me a hand and suggested using the following hints in the subquery

          and exists (select /*+ no_unnest push_subq */ 1
                        from T4  obsv
                           , T5  veh
                        where  obsv.veh_id = veh.veh_id
		       and (:ln_opr_id is null
                               OR veh.opr_id = :ln_opr_id
                            )
                          and  obsv.obst_id = obst.obst_id
                          and  veh.ac_number between :ln_min_number and :ln_max_number
                          and  obsv.start_date > obst.start_date
                      )                   	

Which effectively gave the following optimal execution plan:

---------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                 | Name                    | Starts | E-Rows | A-Rows |   A-Time   |
---------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                          |                         |      1 |        |      6 |00:00:00.56 |
|   1 |  SORT ORDER BY                            |                         |      1 |    254 |      6 |00:00:00.56 |
|*  2 |   HASH JOIN                               |                         |      1 |    254 |      6 |00:00:00.11 |
|   3 |    TABLE ACCESS FULL                      | T3                      |      1 |   2849 |   2849 |00:00:00.01 |
|*  4 |    HASH JOIN                              |                         |      1 |    254 |      6 |00:00:00.11 |
|   5 |     TABLE ACCESS FULL                     | T3                      |      1 |   2849 |   2849 |00:00:00.01 |
|   6 |     NESTED LOOPS                          |                         |      1 |        |      6 |00:00:00.10 |
|   7 |      NESTED LOOPS                         |                         |      1 |    254 |   5012 |00:00:00.09 |
|*  8 |       TABLE ACCESS BY INDEX ROWID         | T2                      |      1 |    262 |   5012 |00:00:00.06 |
|*  9 |        INDEX RANGE SCAN                   | T2_OOST_START_DATE_I    |      1 |    857 |   7722 |00:00:00.01 |
|* 10 |       INDEX RANGE SCAN                    | T1_OBST_OOST_DK_I       |   5012 |      1 |   5012 |00:00:00.03 |
|* 11 |      TABLE ACCESS BY INDEX ROWID          | T1                      |   5012 |      1 |      6 |00:00:00.48 |
|  12 |       NESTED LOOPS                        |                         |   1277 |        |      6 |00:00:00.46 |
|  13 |        NESTED LOOPS                       |                         |   1277 |      2 |      6 |00:00:00.46 |
|  14 |         PARTITION REFERENCE ALL           |                         |   1277 |      4 |      6 |00:00:00.46 |
|* 15 |          TABLE ACCESS BY LOCAL INDEX ROWID| T4                      |  66380 |      4 |      6 |00:00:00.43 |
|* 16 |           INDEX RANGE SCAN                | T4_OBSV_OBST_FK_I       |  66380 |     86 |      6 |00:00:00.28 |
|* 17 |         INDEX UNIQUE SCAN                 | T5_VEH_PK               |      6 |      1 |      6 |00:00:00.01 |
|* 18 |        TABLE ACCESS BY GLOBAL INDEX ROWID | T5                      |      6 |      1 |      6 |00:00:00.01 |
---------------------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("OOST"."CHP_ID_END"="CHP2"."CHP_ID")
   4 - access("OOST"."CHP_ID_START"="CHP1"."CHP_ID")
   8 - filter(("OOST"."CHP_ID_START" IS NOT NULL AND "OOST"."CHP_ID_END" IS NOT NULL))
   9 - access("OOST"."START_DATE">=TO_DATE(' 2013-11-20 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND "OOST"."START_DATE"
      <=TO_DATE(' 2013-11-27 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
  10 - access("OBST"."OOST_ID"="OOST"."OOST_ID")
  11 - filter(("OBST"."OBS_STAT"=2 AND ADD_MONTHS(INTERNAL_FUNCTION("OBST"."MOD_DATE"),13)>=SYSDATE@! AND  IS NOT NULL))
  15 - filter("OBSV"."START_DATE">:B1)
  16 - access("OBSV"."OBST_ID"=:B1)
  17 - access("OBSV"."VEH_ID"="VEH"."VEH_ID")
  18 - filter(("VEH"."ac_number">=1 AND "VEH"."ac_number"<=99999))

This plan is doing exactly what I wanted it to do

---------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                 | Name                    | Starts | E-Rows | A-Rows |   A-Time   |
---------------------------------------------------------------------------------------------------------------------
|* 17 |         INDEX UNIQUE SCAN                 | T5_VEH_PK               |      6 |      1 |      6 |00:00:00.01 |
|* 18 |        TABLE ACCESS BY GLOBAL INDEX ROWID | T5                      |      6 |      1 |      6 |00:00:00.01 |
---------------------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
  17 - access("OBSV"."VEH_ID"="VEH"."VEH_ID")
  18 - filter(("VEH"."ac_number">=1 AND "VEH"."ac_number"<=99999))

Joining first using the T5 primary key index and filtering in a second step. The Bloom filter and the Oracle internal view (VW_SQ_1) present in the first plan have disappeared from the new one, which shows you what the no_unnest hint does and the effect the push_subq hint has had on the query.

I will be very happy if someone can let me know how to obtain the desired execution plan without any hint and without rewriting the query


Dynamic Partition Pruning


A recent question on the OTN forum prompted me to write a few words about dynamic pruning. Dynamic pruning occurs when Oracle knows that it will accomplish a partition pruning (elimination) but is unable to identify, at parse time, the partitions it has to eliminate. It has to wait until the query executes to find out which partitions it can prune. This is represented in the execution plan by the word KEY in the PSTART (first scanned partition) and PSTOP (last scanned partition) columns. So every time you see the word (KEY) in an execution plan, this is the CBO telling you that it is going to eliminate partitions, but only during execution. In this article I will show you a few examples where dynamic pruning occurs.

The first and most evident situation of dynamic pruning is when you compare the partition key to a function which, in the CBO's perception, might return different results when called at parse time and at execution time. Here below is a simple example

 create table t_range
    (
      id           number              not null,
      x            varchar2(30 char)   not null
    )
    partition by range (id)
    (
     partition p_10000 values less than (10000) ,
     partition p_20000 values less than (20000) ,
     partition p_30000 values less than (30000) ,
     partition p_40000 values less than (40000) ,
     partition p_50000 values less than (50000) ,
     partition p_60000 values less than (60000)
   );

insert into t_range values (150, 'First Part');
insert into t_range values (11500, 'Second Part');
insert into t_range values (25000, 'Third Part');
insert into t_range values (34000, 'Fourth Part');
insert into t_range values (44000, 'Fifth Part');
insert into t_range values (53000, 'Sixth Part');
commit;
exec dbms_stats.gather_table_stats (user, 't_range');

In the first of the two queries below I am going to compare the partition key (id) to a known and fixed value (150), while in the second I will compare the same partition key to a value the CBO perceives as variable (trunc(dbms_random.value(150,150))).


SQL> explain plan for
    select *
    from t_range
    where id = 150 ;

--------------------------------------------------------------------------
| Id  | Operation              | Name    | Rows  | Bytes | Pstart| Pstop |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |         |     1 |    15 |       |       |
|   1 |  PARTITION RANGE SINGLE|         |     1 |    15 |     1 |     1 | --> first partition
|*  2 |   TABLE ACCESS FULL    | T_RANGE |     1 |    15 |     1 |     1 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("ID"=150)

SQL> explain plan for
    select *
    from t_range
    where id = trunc(dbms_random.value(150,150)) ;

--------------------------------------------------------------------------
| Id  | Operation              | Name    | Rows  | Bytes | Pstart| Pstop |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |         |     1 |    15 |       |       |
|   1 |  PARTITION RANGE SINGLE|         |     1 |    15 |   KEY |   KEY | --> dynamic partition pruning
|*  2 |   TABLE ACCESS FULL    | T_RANGE |     1 |    15 |   KEY |   KEY |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("ID"=TRUNC("DBMS_RANDOM"."VALUE"(150,150)))

When I compared the partition key to a value unknown at parse time, the CBO delayed the partition pruning until run time and materialized this dynamic pruning with the word KEY in the execution plan.

A more indicative and clear example uses a date range partitioned table whose partition key is compared to the SYSDATE value.


drop table t_range;

create table t_range (id number, create_date date, rec_type varchar2(2))
partition by range (create_date)
(
 partition p1 values less than (to_date('01/01/2012','DD/MM/YYYY'))
,partition p2 values less than (to_date('01/02/2012','DD/MM/YYYY'))
,partition p3 values less than (to_date('01/03/2012','DD/MM/YYYY'))
,partition p4 values less than (to_date('01/04/2012','DD/MM/YYYY'))
,partition p5 values less than (to_date('31/12/2013','DD/MM/YYYY'))
);

insert into t_range values (1,to_date('01/01/2012', 'DD/MM/YYYY'),'RR');
insert into t_range values (2,to_date('05/03/2012', 'DD/MM/YYYY'),'RR');
insert into t_range values (3,to_date('03/02/2012', 'DD/MM/YYYY'),'RR');
insert into t_range values (4,to_date('03/02/2012', 'DD/MM/YYYY'),'ZZ');
insert into t_range values (5,to_date('06/03/2012', 'DD/MM/YYYY'),'ZZ');
insert into t_range values (6,to_date('18/03/2012', 'DD/MM/YYYY'),'WE');
insert into t_range values (7,to_date('15/01/2012', 'DD/MM/YYYY'),'ZZ');
insert into t_range values (8,to_date('24/12/2013', 'DD/MM/YYYY'),'ER');

commit;

exec dbms_stats.gather_table_stats (user, 't_range');

explain plan for
select * from t_range
where create_date = sysdate;

--------------------------------------------------------------------------
| Id  | Operation              | Name    | Rows  | Bytes | Pstart| Pstop |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |         |     1 |    14 |       |       |
|   1 |  PARTITION RANGE SINGLE|         |     1 |    14 |   KEY |   KEY | --> dynamic pruning with sysdate
|*  2 |   TABLE ACCESS FULL    | T_RANGE |     1 |    14 |   KEY |   KEY | --> because sysdate is a function
-------------------------------------------------------------------------- --> its value changes continuously with time

Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("CREATE_DATE"=SYSDATE@!)

explain plan for
select * from t_range
where create_date = to_date('24-12-2013','dd-mm-yyyy');

--------------------------------------------------------------------------
| Id  | Operation              | Name    | Rows  | Bytes | Pstart| Pstop |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |         |     1 |    14 |       |       |
|   1 |  PARTITION RANGE SINGLE|         |     1 |    14 |     5 |     5 |--> partition elimination occured
|*  2 |   TABLE ACCESS FULL    | T_RANGE |     1 |    14 |     5 |     5 |--  when sysdate has not been used
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("CREATE_DATE"=TO_DATE(' 2013-12-24 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

SYSDATE, in contrast to the literal current day "24-12-2013", changes continuously over time. Even though we are interested only in the date part of the SYSDATE value, the CBO still has to wait until query execution time to find and eliminate the untouched partitions.
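
Bind variables, by the way, behave like SYSDATE in this respect: the value is known only at execution time, so the plan shows KEY again. Here is a quick sketch reusing the t_range table above (plan abbreviated; rows and bytes columns omitted):

explain plan for
select * from t_range
where create_date = :bind_date;

----------------------------------------------------------
| Id  | Operation              | Name    | Pstart| Pstop |
----------------------------------------------------------
|   0 | SELECT STATEMENT       |         |       |       |
|   1 |  PARTITION RANGE SINGLE|         |   KEY |   KEY |
|*  2 |   TABLE ACCESS FULL    | T_RANGE |   KEY |   KEY |
----------------------------------------------------------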

There is another situation where dynamic pruning might occur: when the partitioned table is joined to another table on the partition key and the CBO detects that there will be partition pruning because the joined table will not hit all the partitions. In the next example I will create a t_join table into which I will insert all the t_range "partition keys".

create table t_join (id number, date_rec date, vc varchar2(30));

insert into t_join values (1,to_date('01/01/2012', 'DD/MM/YYYY'),'RR');
insert into t_join values (2,to_date('05/03/2012', 'DD/MM/YYYY'),'RR');
insert into t_join values (3,to_date('03/02/2012', 'DD/MM/YYYY'),'RR');
insert into t_join values (4,to_date('03/02/2012', 'DD/MM/YYYY'),'ZZ');
insert into t_join values (5,to_date('06/03/2012', 'DD/MM/YYYY'),'ZZ');
insert into t_join values (6,to_date('18/03/2012', 'DD/MM/YYYY'),'WE');
insert into t_join values (7,to_date('15/01/2012', 'DD/MM/YYYY'),'ZZ');
insert into t_join values (8,to_date('24/12/2013', 'DD/MM/YYYY'),'ER');
commit;

explain plan for
select *
from t_range a, t_join b
where
a.create_date = b.date_rec;

------------------------------------------------------------------------
| Id  | Operation            | Name    | Rows  | Bytes | Pstart| Pstop |
------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |         |     9 |   882 |       |       |
|*  1 |  HASH JOIN           |         |     9 |   882 |       |       |
|   2 |   PARTITION RANGE ALL|         |     8 |   112 |     1 |     5 |
|   3 |    TABLE ACCESS FULL | T_RANGE |     8 |   112 |     1 |     5 |
|   4 |   TABLE ACCESS FULL  | T_JOIN  |     8 |   672 |       |       |
------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("A"."CREATE_DATE"="B"."DATE_REC")

In this case the CBO determines, at parse time, that all partitions will be scanned and hence decides to scan them all (operation 2). But what if I add a filter predicate that eliminates a few records?

explain plan for
select *
from t_range a, t_join b
where
a.create_date = b.date_rec
and b.id = 7;

-----------------------------------------------------------------------------
| Id  | Operation                 | Name    | Rows  | Bytes | Pstart| Pstop |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |         |     1 |    98 |       |       |
|   1 |  NESTED LOOPS             |         |     1 |    98 |       |       |
|*  2 |   TABLE ACCESS FULL       | T_JOIN  |     1 |    84 |       |       |
|   3 |   PARTITION RANGE ITERATOR|         |     1 |    14 |   KEY |   KEY |
|*  4 |    TABLE ACCESS FULL      | T_RANGE |     1 |    14 |   KEY |   KEY |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("B"."ID"=7)
4 - filter("A"."CREATE_DATE"="B"."DATE_REC")

Thanks to this additional filter on the t_join table, the CBO knows at parse time that it will be able to eliminate partitions. But since it has to wait until run time to apply the filter (predicate 2), it is only then that it can isolate the accessed partitions and eliminate the rest: this is dynamic pruning.

The next article will deal with partition pruning against hash partitioned tables.



Partition pruning with hash partitioned tables


This is a simple reminder of how partition pruning with hash partitioned tables can differ from range partitioned ones. If you have a range partitioned table, a query using the partition key in the where clause can eliminate partitions either dynamically (dynamic pruning) or at hard parse time. This partition pruning might occur even if the predicate on the partition key is not an equality (>= or <=). This is, however, not the case for a hash partitioned table. For this partitioning method you have to apply an equality or an in-list predicate on the partition key to see the CBO pruning your hash partitions.

drop table t1;
create table t1
(n1             number not null,
creation_date  date   not null,
comments       varchar2(500))
partition by hash (n1)
partitions 16;

insert into t1 select rownum, sysdate + (rownum-1), lpad('x',10)
from dual
connect by level <= 1000;

commit;

I am going to execute 3 queries against the above engineered hash partitioned table. The first one uses an equality predicate on the partition key (n1), the second an in-list operator on the same partition key, and in the last query I compare the partition key to a value using a greater-than operator.


select * from t1
where n1 = 42;

select * from t1
where n1 in(42,100);

select * from t1
where n1 > 42;

The corresponding execution plans are given below respectively

----------------------------------------------------------------------
| Id  | Operation             | Name | Rows  | Bytes | Pstart| Pstop |
----------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |     1 |  1024 |       |       |
|   1 |  PARTITION HASH SINGLE|      |     1 |  1024 |    14 |    14 |
|*  2 |   TABLE ACCESS FULL   | T1   |     1 |  1024 |    14 |    14 |
----------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("N1"=42)

----------------------------------------------------------------------
| Id  | Operation             | Name | Rows  | Bytes | Pstart| Pstop |
----------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |     2 |  2048 |       |       |
|   1 |  PARTITION HASH INLIST|      |     2 |  2048 |KEY(I) |KEY(I) |
|*  2 |   TABLE ACCESS FULL   | T1   |     2 |  2048 |KEY(I) |KEY(I) |
----------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("N1"=42 OR "N1"=100)

-------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Pstart| Pstop |
-------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |   737 |   737K|       |       |
|   1 |  PARTITION HASH ALL|      |   737 |   737K|     1 |    16 |
|*  2 |   TABLE ACCESS FULL| T1   |   737 |   737K|     1 |    16 |
-------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("N1">42)

Partition pruning occurs, at hard parse time, for the first query (equality predicate on the hash partition key). Dynamic partition pruning occurs for the second query (in-list predicate on the partition key), and there is no partition pruning at all for the third query (inequality predicate on the partition key).

Simply put, when selecting from a hash partitioned table, if you want to prune partitions and accelerate your query response time, you need to apply an equality (or in-list) predicate on the partition key.

Spot the resemblance (or coincidence) with the hash join method that works only when the join condition is an equality.

Up to now we have been using a single-column hash partition key. What if I hash partition by a composite key?

create table t1
(n1             number not null,
 n2             number not null,
 creation_date  date   not null,
 comments       varchar2(500))
partition by hash (n1,n2) --> composite partition key
partitions 16;

insert into t1 select rownum,rownum*2, sysdate + (rownum-1), lpad('x',10)
from dual
connect by level <= 1000;

commit;

In the following query I will use a single equality predicate on the first column of the composite partition key, as shown below:

explain plan for
select * from t1
where n1 = 42;

-------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Pstart| Pstop |
-------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     8 |  8296 |       |       |
|   1 |  PARTITION HASH ALL|      |     8 |  8296 |     1 |    16 |
|*  2 |   TABLE ACCESS FULL| T1   |     8 |  8296 |     1 |    16 |
-------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("N1"=42)

Despite the existence of an equality predicate on the first partition key column, partition pruning didn't occur. However, if I add an extra predicate on the second partition key column to the above query, partition pruning does occur this time, as shown below:

explain plan for
select * from t1
where n1 = 42
and   n2 = 84;

----------------------------------------------------------------------
| Id  | Operation             | Name | Rows  | Bytes | Pstart| Pstop |
----------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |     1 |  1037 |       |       |
|   1 |  PARTITION HASH SINGLE|      |     1 |  1037 |    13 |    13 |
|*  2 |   TABLE ACCESS FULL   | T1   |     1 |  1037 |    13 |    13 |
----------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("N1"=42 AND "N2"=84)

Put simply, when you have a table hash partitioned via a composite partition key, to eliminate partitions you need an equality (or in-list) predicate on the complete set of partition key columns (in any order).

I will not finish this hash partition pruning tour without extending the above cases to a composite range-hash partitioned table

drop table t1;

create table t1
(n1             number not null,
creation_date  date   not null,
comments       varchar2(500)
 )
partition by range (creation_date)
subpartition by hash(n1)
subpartitions 16
(
partition p1 values less than (to_date('01/01/2011','DD/MM/YYYY'))
,partition p2 values less than (to_date('01/01/2012','DD/MM/YYYY'))
,partition p3 values less than (to_date('01/01/2013','DD/MM/YYYY'))
,partition p4 values less than (to_date('31/12/2014','DD/MM/YYYY'))
);

insert into t1 values (1, to_date('29032010','ddmmyyyy'), '2010_Part');
insert into t1 values (42, to_date('17022011','ddmmyyyy'), '2011_Part');
insert into t1 values (17, to_date('15022012','ddmmyyyy'), '2012_Part');
insert into t1 values (25, to_date('13/02/2013','dd/mm/yyyy'), '2013_Part');

commit;

And the usual selects with their corresponding execution plans

explain plan for
select * from t1
where creation_date = to_date('29032010','ddmmyyyy')
and  n1 = 42;

-----------------------------------------------------------------------
| Id  | Operation              | Name | Rows  | Bytes | Pstart| Pstop |
-----------------------------------------------------------------------
|   0 | SELECT STATEMENT       |      |     1 |   274 |       |       |
|   1 |  PARTITION RANGE SINGLE|      |     1 |   274 |     1 |     1 | --> range partition pruning
|   2 |   PARTITION HASH SINGLE|      |     1 |   274 |    14 |    14 | --> hash partition pruning
|*  3 |    TABLE ACCESS FULL   | T1   |     1 |   274 |    14 |    14 |
-----------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter("CREATION_DATE"=TO_DATE(' 2010-03-29 00:00:00', 'syyyy-mm-dd
hh24:mi:ss') AND "N1"=42)

explain plan for
select * from t1
where creation_date = to_date('29032010','ddmmyyyy')
and  n1 >= 42;

-----------------------------------------------------------------------
| Id  | Operation              | Name | Rows  | Bytes | Pstart| Pstop |
-----------------------------------------------------------------------
|   0 | SELECT STATEMENT       |      |     1 |   274 |       |       |
|   1 |  PARTITION RANGE SINGLE|      |     1 |   274 |     1 |     1 | --> range partition pruning
|   2 |   PARTITION HASH ALL   |      |     1 |   274 |     1 |    16 | --> no hash partition pruning due to n1>42
|*  3 |    TABLE ACCESS FULL   | T1   |     1 |   274 |     1 |    16 |
-----------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter("CREATION_DATE"=TO_DATE(' 2010-03-29 00:00:00', 'syyyy-mm-dd
hh24:mi:ss') AND "N1">=42)

Bottom line: be it a simple hash partitioned table or a composite range-hash one, to obtain partition pruning within the hash partitioned part of the table you need to apply an equality (or in-list) predicate on the complete partition key (single or composite).


Tuning via row source execution plan


Browsing through draft posts I had not finished and forgot to come back to, I found one about the sub-optimal execution plan of a query suffering a terrible performance problem. If my memory still serves me well, this execution plan was sent to me by an ex-colleague with whom I had several e-mail exchanges and to whom I supplied a proposition. I was waiting for his answer before publishing this article; it seems he has never answered since :-). Anyway, I thought it nevertheless worth sharing with you at least my initial investigation of this query performance problem.

This is the execution plan with row source statistics, showing the estimations made by the CBO based on the available statistics

 ----------------------------------------------------------------------------------------------------------------------
 | Id  | Operation                            | Name                          | Starts | E-Rows | A-Rows |   A-Time   |
 ----------------------------------------------------------------------------------------------------------------------
 |   1 |  SORT UNIQUE                         |                               |      1 |      1 |    109 |00:00:35.22 | --> 35sec
 |   2 |   WINDOW BUFFER                      |                               |      1 |      1 |    109 |00:00:35.22 |
 |*  3 |    FILTER                            |                               |      1 |        |    109 |00:00:35.21 |
 |   4 |     SORT GROUP BY                    |                               |      1 |      1 |    760 |00:00:35.21 |
 |*  5 |      FILTER                          |                               |      1 |        |   9938 |00:00:35.00 |
 |   6 |       NESTED LOOPS OUTER             |                               |      1 |      1 |  14546 |00:00:34.93 |
 |   7 |        NESTED LOOPS                  |                               |      1 |      1 |  14546 |00:00:33.58 |
 |   8 |         NESTED LOOPS OUTER           |                               |      1 |      1 |    952 |00:00:00.52 |
 |   9 |          NESTED LOOPS OUTER          |                               |      1 |      1 |    952 |00:00:00.31 |
 |* 10 |           TABLE ACCESS BY INDEX ROWID| XXX_TABLE1                    |      1 |      2 |    760 |00:00:00.09 |
 |* 11 |            INDEX RANGE SCAN          | XXX_TAB1_IND_1                |      1 |      2 |   8766 |00:00:00.02 |
 |  12 |           TABLE ACCESS BY INDEX ROWID| XXX_TABLE2                    |    760 |      1 |    248 |00:00:00.22 |
 |* 13 |            INDEX RANGE SCAN          | XXX_TAB2_IND_FK               |    760 |      4 |    248 |00:00:00.19 |
 |* 14 |          TABLE ACCESS BY INDEX ROWID | XXX_TABLE3                    |    952 |      1 |    952 |00:00:00.20 |
 |* 15 |           INDEX RANGE SCAN           | XXX_TABLE3_PK                 |    952 |      1 |   4833 |00:00:00.12 |
 |* 16 |         TABLE ACCESS BY INDEX ROWID  | XXX_TABLE1                    |    952 |      1 |  14546 |00:00:33.04 |
 |* 17 |          INDEX RANGE SCAN            | XXX_TAB1_IND_1                |    952 |      1 |   7980K|00:00:00.05 |
 |* 18 |        TABLE ACCESS BY INDEX ROWID   | XXX_TABLE5                    |  14546 |      1 |     15 |00:00:01.29 |
 |* 19 |         INDEX RANGE SCAN             | XXX_TAB5_IND_1                |  14546 |      1 |     15 |00:00:01.21 |
 |  20 |       SORT AGGREGATE                 |                               |      3 |      1 |      3 |00:00:00.01 |
 |* 21 |        TABLE ACCESS BY INDEX ROWID   | XXX_TABLE2                    |      3 |      1 |     11 |00:00:00.01 |
 |* 22 |         INDEX RANGE SCAN             | XXX_TAB2_IND_FK               |      3 |      4 |     11 |00:00:00.01 |
 ----------------------------------------------------------------------------------------------------------------------
 

where I have deliberately omitted the predicate part for clarity (I will show it in a moment).

Have you already spotted the most time consuming operation?

If not yet, look at the different operations (from 1 to 22) and isolate the most consuming child/parent pair by scanning the A-Time column.

Have you already found it?

Here it is together with its predicate part:

 ----------------------------------------------------------------------------------------------------------------------
 | Id  | Operation                            | Name                          | Starts | E-Rows | A-Rows |   A-Time   |
 ----------------------------------------------------------------------------------------------------------------------
 |* 16 |         TABLE ACCESS BY INDEX ROWID  | XXX_TABLE1                    |    952 |      1 |  14546 |00:00:33.04 |
 |* 17 |          INDEX RANGE SCAN            | XXX_TAB1_IND_1                |    952 |      1 |   7980K|00:00:00.05 |
 ----------------------------------------------------------------------------------------------------------------------
 16 - filter(("TAB1_2"."N1">=20000 AND "TAB1"."XXX_ID"="TAB1_2"."XXX_ID"))
 17 - access("TAB1_2"."DATH">SYSDATE@!-.041666 AND "TAB1"."DAT_TRD"="TAB1_2"."DAT_TRD" AND "TAB1_2"."DATH" IS NOT NULL)
      filter("TAB1"."DAT_TRD"="TAB1_2"."DAT_TRD")
 

This is, generally speaking, how to identify the most consuming operation in an execution plan. By the way, this is also my method for tuning a query that does not perform within the client's acceptable response time: (a) I get the row source execution plan, including the estimates and the actuals, and (b) I scan this plan, looking down the A-Time column for the most consuming operation. Thanks to this method, I end up in the majority of cases (there are of course exceptions) isolating the operation on which attention should be concentrated.
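
As a quick illustration of step (a), here is a minimal sketch, assuming you can re-run the offending statement (the table name and predicate below are purely illustrative):

-- Run the statement with rowsource statistics collection enabled
select /*+ gather_plan_statistics */ count(*)
  from some_table      -- illustrative table name
 where some_col > 0;   -- illustrative predicate

-- Then fetch its plan from memory, with estimates (E-Rows) and actuals (A-Rows)
select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));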

So back to my execution plan. The most consuming operation having been found, what observations can be made? Well, without bothering to decipher the above filter operation, you can obviously point out three major facts:

  1. Looking at the huge number of rowids (7,980,000) that the index range scan at operation 17 supplied to its parent operation 16, and at the very small number of rows (14,546) that survived the filter at operation 16, I realize that an enormous amount of time and resource is wasted discarding rows that ideally would never have been sent to the table access operation at all.
  2. Looking at the estimations done by the CBO versus the actual rows generated, I realize that there is obviously a statistics problem my colleague should look at very urgently.
  3. Almost all the E-Rows values in the above execution plan show a cardinality equal to one. This particular cardinality is typically suspicious: it is a clear indication of the absence of fresh statistics.

Whether the most consuming operation is due to stale statistics or to an imprecise index (XXX_TAB1_IND_1) depends on my colleague's answer, which will probably never come.

Bottom line: when you have to tune a query, you can proceed using the following steps:

  1. get the row source execution plan from memory using the dbms_xplan package, including estimates and actuals, then track down the most consuming operation and check the accuracy of the statistics (table and column statistics)
  2. get a SQL Monitoring report and analyze it, if you are licensed for it (see the sketch after this list)
  3. use Tanel Poder's Snapper
  4. use Carlos Sierra's SQLTXPLAIN
  5. etc…
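
For step 2, here is a minimal sketch, assuming the statement was captured by Real-Time SQL Monitoring and that you are licensed for the Tuning Pack (the sql_id below is purely illustrative):

-- Fetch a text SQL Monitoring report for a given sql_id
select dbms_sqltune.report_sql_monitor(
         sql_id => 'an1s9f8xvff0w',  -- illustrative sql_id
         type   => 'TEXT') as report
from dual;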

Partition range operation: how many times it has been started


It is well known that, when reading a row source execution plan in order to assess the accuracy of the estimations done by the CBO based on the available table and index statistics, we should use the following comparison

Starts * E-Rows = A-Rows

The closer the two sides of this comparison are to each other, the more accurate the statistics are, and the better the chance of seeing the CBO produce an optimal execution plan.
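
For instance, in the plan shown earlier, operation 16 has Starts = 952 and E-Rows = 1: the formula therefore predicts 952 rows in total, while A-Rows reports 14,546 — an underestimate by a factor of more than 15, which is exactly the kind of gap that points to a statistics problem.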

Fine, but very recently I was reminded by one of my smart friends, Ahmed Aangour, that this formula should be read differently when a PARTITION RANGE xxx operation is followed by a full table scan.

As always, an example being worth a thousand words, let's see this in action

drop table t_range;

CREATE TABLE t_range
(
ID           NUMBER              NOT NULL,
X            VARCHAR2(30 CHAR)   NOT NULL
)
PARTITION BY RANGE (ID)
(
PARTITION P_10000 VALUES LESS THAN (10000) ,
PARTITION P_20000 VALUES LESS THAN (20000) ,
PARTITION P_30000 VALUES LESS THAN (30000) ,
PARTITION P_40000 VALUES LESS THAN (40000) ,
PARTITION P_50000 VALUES LESS THAN (50000) ,
PARTITION P_60000 VALUES LESS THAN (60000)
);

INSERT INTO t_range VALUES (150, 'First Part');
INSERT INTO t_range VALUES (11500, 'Second Part');
INSERT INTO t_range VALUES (25000, 'Third Part');
INSERT INTO t_range VALUES (34000, 'Fourth Part');
INSERT INTO t_range VALUES (44000, 'Fifth Part');
INSERT INTO t_range VALUES (53000, 'Sixth Part');

commit;

exec dbms_stats.gather_table_stats(user, 't_range');

select /*+ gather_plan_statistics */ count(1) from t_range;

select * from table (dbms_xplan.display_cursor(null,null,'ALLSTATS LAST +PARTITION'));

-----------------------------------------------------------------------------------
| Id  | Operation            | Name    | Starts | E-Rows | Pstart| Pstop | A-Rows |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |         |      1 |        |       |       |      1 |
|   1 |  SORT AGGREGATE      |         |      1 |      1 |       |       |      1 |
|   2 |   PARTITION RANGE ALL|         |      1 |      6 |     1 |     6 |      6 |
|   3 |    TABLE ACCESS FULL | T_RANGE |      6 |      6 |     1 |     6 |      6 |
-----------------------------------------------------------------------------------

I have engineered a table with 6 partitions. Each partition possesses one row, so that a full scan of this table generates 6 rows.

The Starts value of operation 3 suggests that the table t_range has been scanned 6 times, so that Starts * E-Rows = 36, which is 6 times greater than A-Rows. Yet A-Rows is correct, and E-Rows is absolutely correct too.

I believe that, in such a situation of a PARTITION RANGE operation followed by a TABLE ACCESS FULL, we should read Starts as the number of scanned partitions and not as the number of times the partitioned table has been fully scanned. So one has to be prudent when comparing estimations against actuals in this kind of situation.
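
Read that way, the plan above becomes consistent: Starts = 6 means six partitions were scanned, and E-Rows = 6 is the estimate for the table as a whole, so the meaningful comparison is E-Rows (6) against A-Rows (6) — a perfect match — rather than Starts * E-Rows = 36 against A-Rows.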

Browsing Ahmed Aangour's blog post mentioned above, I saw a particular operation which attracted my attention


--------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name                 | Starts | E-Rows | A-Rows |
--------------------------------------------------------------------------------------------------
|   7 |      PARTITION RANGE AND               |                      |     86 |      6 |   7949K|--> spot this
|*  8 |       TABLE ACCESS BY LOCAL INDEX ROWID| ODD_ASSURE           |     86 |      6 |   7949K|
|*  9 |        INDEX RANGE SCAN                | PK_ASS_ASSURE        |     86 |     56 |   7949K|
|  10 |     PARTITION RANGE AND                |                      |   7949K|      1 |     11M|--> spot this
|* 11 |      INDEX RANGE SCAN                  | PK_ADH_INFO_ADHESION |   7949K|      1 |     11M|
|  12 |    TABLE ACCESS BY LOCAL INDEX ROWID   | ODD_INFO_ADHESION    |     11M|      2 |     11M|
--------------------------------------------------------------------------------------------------

This was the first time I had come across such a partition operation:

 PARTITION RANGE AND

Not only does this operation exist, but there is also another one:

 PARTITION RANGE OR

which I can simulate with my current t_range table:

SQL> select /*+ gather_plan_statistics */ count(1) from t_range
where id = 142
or id between 5000 and 15000;

-----------------------------------------------------------------------------------------
| Id  | Operation           | Name    | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |         |      1 |        |      1 |00:00:00.01 |      44 |
|   1 |  SORT AGGREGATE     |         |      1 |      1 |      1 |00:00:00.01 |      44 |
|   2 |   PARTITION RANGE OR|         |      1 |      1 |      1 |00:00:00.01 |      44 |
|*  3 |    TABLE ACCESS FULL| T_RANGE |      2 |      1 |      1 |00:00:00.01 |      44 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter((("ID"<=15000 AND "ID">=5000) OR "ID"=142))

Note
-----
- statistics feedback used for this statement

PARTITION RANGE OR (and its AND counterpart) is a partition-pruning feature that seems to have been introduced in 11g; it allows partition elimination when the partition key appears several times in the where clause, combined with OR (respectively AND) operations.
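
Here is a hedged sketch against the same t_range table showing the predicate shape behind the AND variant. Whether the optimizer actually displays a PARTITION RANGE AND operation (rather than merging the two ranges into a single iterator) depends on the version and on the predicates, so treat this purely as an illustration; :low1, :high1, :low2 and :high2 are binds to be defined in your session:

-- Two AND-combined ranges on the partition key
select /*+ gather_plan_statistics */ count(1)
  from t_range
 where id between :low1 and :high1
   and id between :low2 and :high2;

select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST +PARTITION'));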

PS: The Note about statistics feedback at the bottom of the plan is an excellent reminder to investigate this new 12c optimizer feature more deeply.


2013 in review


I would like to share with you a summary of my year in blogging. My 2013 blogging activity was more sustained than last year's: I published 33 new posts, growing the total archive to 96 posts (97 including the current one). Friends working closely with me know that I promised them I would reach 100 posts by the end of 2013. Unfortunately, I will not achieve this goal, even though I could have, since I still have a couple of drafts almost ready to publish. But, being an adept of a careful writing style, it always takes me time to publish even a simple article: I need to test, re-test, read and re-read before approving a post.

For this coming new year I want to set myself a new challenge: publish 100 posts.

Will it be possible? Wait and see.

Click here to see the complete report.

Best wishes


SQL Plan Management: what’s new in 12c


Don’t be disappointed by the title. This is a simple note to show one difference I have noticed in how a stored SPM execution plan is managed when a dependent object is dropped or renamed. To keep it simple, let me start with an existing SPM baseline, as shown below

 select sql_text, plan_name
 from dba_sql_plan_baselines
 where signature = '1292784087274697613';

 SQL_TEXT                                            PLAN_NAME
 --------------------------------------------------- ------------------------------
 select count(*), max(col2) from t1 where flag = :n  SQL_PLAN_13w748wknkcwd8576eb1f
 

If I want to display the execution plan of this stored SPM baseline, I proceed as follows:

 select * from table(dbms_xplan.display_sql_plan_baseline(plan_name =>'SQL_PLAN_13w748wknkcwd8576eb1f'));

 --------------------------------------------------------------------------------
 SQL handle: SQL_11f0e4472549338d
 SQL text: select count(*), max(col2) from t1 where flag = :n
 --------------------------------------------------------------------------------
 --------------------------------------------------------------------------------
 Plan name: SQL_PLAN_13w748wknkcwd8576eb1f         Plan id: 2239163167
 Enabled: YES     Fixed: NO      Accepted: YES     Origin: AUTO-CAPTURE
 --------------------------------------------------------------------------------

 Plan hash value: 3625400295
 -------------------------------------------------------------------------------------
 | Id  | Operation                    | Name | Rows  | Bytes | Cost (%CPU)| Time     |
 -------------------------------------------------------------------------------------
 |   0 | SELECT STATEMENT             |      |     1 |    59 |   124   (2)| 00:00:01 |
 |   1 |  SORT AGGREGATE              |      |     1 |    59 |            |          |
 |   2 |   TABLE ACCESS BY INDEX ROWID| T1   | 25000 |  1440K|   124   (2)| 00:00:01 |
 |*  3 |    INDEX RANGE SCAN          | I1   | 25000 |       |    13   (8)| 00:00:01 |
 -------------------------------------------------------------------------------------

 Predicate Information (identified by operation id):
 ---------------------------------------------------
 3 - access("FLAG"=:N)
 

It is a baseline using an index range scan. What happens to that stored execution plan when I drop or rename that index?

 drop index i1;

 select * from table(dbms_xplan.display_sql_plan_baseline(plan_name =>'SQL_PLAN_13w748wknkcwd8576eb1f'));
 --------------------------------------------------------------------------------
 SQL handle: SQL_11f0e4472549338d
 SQL text: select count(*), max(col2) from t1 where flag = :n

 --------------------------------------------------------------------------------
 --------------------------------------------------------------------------------
 Plan name: SQL_PLAN_13w748wknkcwd8576eb1f         Plan id: 2239163167
 Enabled: YES     Fixed: NO      Accepted: YES     Origin: AUTO-CAPTURE
 --------------------------------------------------------------------------------

 Plan hash value: 3724264953

 ---------------------------------------------------------------------------
 | Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
 ---------------------------------------------------------------------------
 |   0 | SELECT STATEMENT   |      |     1 |    59 |   241   (3)| 00:00:01 |
 |   1 |  SORT AGGREGATE    |      |     1 |    59 |            |          |
 |*  2 |   TABLE ACCESS FULL| T1   | 25000 |  1440K|   241   (3)| 00:00:01 |
 ---------------------------------------------------------------------------

 Predicate Information (identified by operation id):
 ---------------------------------------------------
 2 - filter("FLAG"=:N)
 

The stored baseline execution plan has been automatically updated to reflect the dropped index. But wait: this is the 11g behavior. Do the same thing in 12c and you will observe something different.

 SQL> select * from table(dbms_xplan.display_sql_plan_baseline(plan_name => 'SQL_PLAN_13w748wknkcwd7823646b'));

 --------------------------------------------------------------------------------
 SQL handle: SQL_11f0e4472549338d
 SQL text: select count(*), max(col2) from t1 where flag = :n
 --------------------------------------------------------------------------------

 --------------------------------------------------------------------------------
 Plan name: SQL_PLAN_13w748wknkcwd7823646b         Plan id: 2015585387
 Enabled: YES     Fixed: NO      Accepted: YES     Origin: MANUAL-LOAD
 Plan rows: From dictionary
 --------------------------------------------------------------------------------

 Plan hash value: 497086120

 ---------------------------------------------------------------------------------------------
 | Id  | Operation                            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
 ---------------------------------------------------------------------------------------------
 |   0 | SELECT STATEMENT                     |      |       |       |     2 (100)|          |
 |   1 |  SORT AGGREGATE                      |      |     1 |    30 |     0   (0)|          |
 |   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| T1   |     1 |    30 |     2   (0)| 00:00:01 |
 |*  3 |    INDEX RANGE SCAN                  | I1   |     1 |       |     1   (0)| 00:00:01 |
 ---------------------------------------------------------------------------------------------

 Predicate Information (identified by operation id):
 ---------------------------------------------------
 3 - access("FLAG"=:N)
 

This time, dropping the index in a 12c release will not alter the stored SPM execution plan, as shown below:


 SQL> drop index I1;

 SQL> select * from table(dbms_xplan.display_sql_plan_baseline(plan_name => 'SQL_PLAN_13w748wknkcwd7823646b'));

 --------------------------------------------------------------------------------
 SQL handle: SQL_11f0e4472549338d
 SQL text: select count(*), max(col2) from t1 where flag = :n
 --------------------------------------------------------------------------------

 --------------------------------------------------------------------------------
 Plan name: SQL_PLAN_13w748wknkcwd7823646b         Plan id: 2015585387
 Enabled: YES     Fixed: NO      Accepted: YES     Origin: MANUAL-LOAD

 Plan rows: From dictionary
 --------------------------------------------------------------------------------
 Plan hash value: 497086120

 ---------------------------------------------------------------------------------------------
 | Id  | Operation                            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
 ---------------------------------------------------------------------------------------------
 |   0 | SELECT STATEMENT                     |      |       |       |     2 (100)|          |
 |   1 |  SORT AGGREGATE                      |      |     1 |    30 |     0   (0)|          |
 |   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| T1   |     1 |    30 |     2   (0)| 00:00:01 |
 |*  3 |    INDEX RANGE SCAN                  | I1   |     1 |       |     1   (0)| 00:00:01 |
 ---------------------------------------------------------------------------------------------

 Predicate Information (identified by operation id):
 ---------------------------------------------------
 3 - access("FLAG"=:N)
 

The interesting question which motivated this post is: is this change good or bad for our use and debugging of SPM baselines?

When I was deeply investigating, in the previous release, how a baseline is chosen or discarded following the drop of a dependent object, I had already noticed that kind of automatic update of the stored execution plan. I was having two stored baselines, one using a full table scan and the other using an index range scan. In that situation, dropping the index left me with two different stored baselines, but this time both showing an identical full table scan plan. I knew why, because I had created the two baselines and provoked that change myself. But a new developer coming after me would then wonder why two different baselines have the same execution plan. Would he be able to understand that this situation arose because someone in the past had dropped an index? Not sure.

So, permanently saving the execution plan of the stored SPM baseline as it was at creation time is good news in my honest opinion. Why? Because if you see your query not using the baseline you want it to use, you will dig into its stored plan and notice that this stored plan uses an index named I1. It is then easy to go to that table and verify whether the I1 index still exists or has been dropped, which might immediately explain why your stored baseline is no longer selected.
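
A minimal sketch of that last verification, assuming the objects belong to the current schema (adjust the table name to your case):

-- Does the index named in the stored baseline plan still exist?
select index_name, status
  from user_indexes
 where table_name = 'T1';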

Bottom line: I think the most important thing to remember when looking at a stored SPM execution plan is that, starting from 12c, the displayed plan is the one that was available at the time the baseline was captured. It may still be reproducible today, or it may no longer be.

