Quantcast
Channel: Mohamed Houri’s Oracle Notes
Viewing all articles
Browse latest Browse all 224

Statistics: On the importance of the Notes column

$
0
0

Notice the output of the following query:

SQL> getPartCol

col partition_name format a20
col num_distinct   format a20
col last_anal      format a20
col column_name    format a12
col notes          format a35

select
       partition_name
      ,column_name
      ,num_distinct
      ,density
      ,to_char(last_analyzed,'dd/mm/yyyy hh24:mi:ss') last_anal
      ,histogram
      ,notes
from
      all_part_col_statistics
where owner     = upper('&owner')
and table_name  = upper('&table_name');

PARTITION_NAME       COLUMN_NAME  LAST_ANAL            HISTOGRAM       NOTES
-------------------- ------------ -------------------- --------------- -----------------------------------
P_A                  PKEY         30/06/2021 17:39:50  FREQUENCY       HYPERLOGLOG
P_A                  N1           30/06/2021 17:39:50  FREQUENCY       HYPERLOGLOG
P_A                  N2           30/06/2021 17:39:50  HYBRID          HYPERLOGLOG
P_A                  N_50         30/06/2021 17:39:50  FREQUENCY       HYPERLOGLOG
P_A                  N_800        30/06/2021 17:39:50  HYBRID          HIST_FOR_INCREM_STATS HYPERLOGLOG
P_B                  PKEY         30/06/2021 17:39:51  FREQUENCY       HYPERLOGLOG
P_B                  N1           30/06/2021 17:39:51  FREQUENCY       HYPERLOGLOG
P_B                  N2           30/06/2021 17:39:51  HYBRID          HYPERLOGLOG
P_B                  N_50         30/06/2021 17:39:51  FREQUENCY       HYPERLOGLOG
P_B                  N_800        30/06/2021 17:39:51  HYBRID          HIST_FOR_INCREM_STATS HYPERLOGLOG
P_C                  PKEY         30/06/2021 17:39:52  FREQUENCY       HYPERLOGLOG
P_C                  N1           30/06/2021 17:39:52  FREQUENCY       HYPERLOGLOG
P_C                  N2           30/06/2021 17:39:52  HYBRID          HYPERLOGLOG
P_C                  N_50         30/06/2021 17:39:52  FREQUENCY       HYPERLOGLOG
P_C                  N_800        30/06/2021 17:39:52  HYBRID          HIST_FOR_INCREM_STATS HYPERLOGLOG
P_D                  PKEY         30/06/2021 17:39:52  FREQUENCY       HYPERLOGLOG
P_D                  N1           30/06/2021 17:39:52  FREQUENCY       HYPERLOGLOG
P_D                  N2           30/06/2021 17:39:52  HYBRID          HIST_FOR_INCREM_STATS HYPERLOGLOG
P_D                  N_50         30/06/2021 17:39:52  FREQUENCY       HYPERLOGLOG
P_D                  N_800        30/06/2021 17:39:52  HYBRID          HIST_FOR_INCREM_STATS HYPERLOGLOG

20 rows selected.


Do you know what the impact of the Notes HIST_FOR_INCREM_STATS HYPERLOGLOG is?

Impact n°1

The first impact I have recently discovered concerns the bind sensitive property of cursors using predicates on columns flagged with the above Notes. Here’s a simple demonstration. I will first start with a “normal column” N_50 and then with an “abnormal” N_800 one

SQL> select count(1) from t1_part where n_50=14;

  COUNT(1)
----------
     31984

SQL> @xpsimp

SQL_ID  18fhcmjy68zv6, child number 0
-------------------------------------
-----------------------------------------------------------------------
| Id  | Operation           | Name    | Rows  | Bytes | Pstart| Pstop |
-----------------------------------------------------------------------
|   0 | SELECT STATEMENT    |         |       |       |       |       |
|   1 |  SORT AGGREGATE     |         |     1 |     3 |       |       |
|   2 |   PARTITION LIST ALL|         | 31984 | 95952 |     1 |     4 |
|*  3 |    TABLE ACCESS FULL| T1_PART | 31984 | 95952 |     1 |     4 | -- very good Estimation
-----------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("N_50"=:SYS_B_1)


SQL> @gvsql
Enter value for sql_id: 18fhcmjy68zv6

SQL_ID        PLAN_HASH_VALUE CHILD_NUMBER I EXECUTIONS
------------- --------------- ------------ - ----------
18fhcmjy68zv6      2324535125            0 Y          1 -- cursor is bind sensitive
SQL> select count(1) from t1_part where n_800=14;

  COUNT(1)
----------
      2072

SQL> @xpsimp

SQL_ID  b2b4nm8tqqukg, child number 0
-------------------------------------
-----------------------------------------------------------------------
| Id  | Operation           | Name    | Rows  | Bytes | Pstart| Pstop |
-----------------------------------------------------------------------
|   0 | SELECT STATEMENT    |         |       |       |       |       |
|   1 |  SORT AGGREGATE     |         |     1 |     4 |       |       |
|   2 |   PARTITION LIST ALL|         |  1958 |  7832 |     1 |     4 |
|*  3 |    TABLE ACCESS FULL| T1_PART |  1958 |  7832 |     1 |     4 | -- not a very good Estimation
-----------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("N_800"=:SYS_B_1)


SQL> @gvsql
Enter value for sql_id: b2b4nm8tqqukg

SQL_ID        PLAN_HASH_VALUE CHILD_NUMBER I EXECUTIONS
------------- --------------- ------------ - ----------
b2b4nm8tqqukg      2324535125            0 N          1  -- cursor is not bind sensitive

Thus, once again, this needs to be brought to someone’s attention:  a cursor will not be marked bind sensitive when it uses a predicate column having histogram but signaled by the Notes HIST_FOR_INCREM_STATS HYPERLOGLOG.

And what about a column that has a mix of different not null Notes values?

PARTITION_NAME       COLUMN_NAME  LAST_ANAL            HISTOGRAM       NOTES
-------------------- ------------ -------------------- --------------- -----------------------------------
P_A                  N2           30/06/2021 17:39:50  HYBRID          HYPERLOGLOG
P_B                  N2           30/06/2021 17:39:51  HYBRID          HYPERLOGLOG
P_C                  N2           30/06/2021 17:39:52  HYBRID          HYPERLOGLOG
P_D                  N2           30/06/2021 17:39:52  HYBRID          HIST_FOR_INCREM_STATS HYPERLOGLOG

In order to answer this question we need also to see the corresponding Notes of this column(N2) at the global level

select       
       column_name
      ,num_distinct
      ,density
      ,to_char(last_analyzed,'dd/mm/yyyy hh24:mi:ss') last_anal
      ,histogram
      ,notes
from
      all_tab_col_statistics
where owner     = 'C##MHOURI'
and table_name  = 'T1_PART'
and column_name ='N2';


COLUMN_NAME  NUM_DISTINCT    DENSITY LAST_ANAL            HISTOGRAM       NOTES
------------ ------------ ---------- -------------------- --------------- ------------
N2                   1031    .000814 30/06/2021 17:39:55  HYBRID          INCREMENTAL

When Oracle uses statistics at global level, the cursor will be marked bind sensitive as the following proves:

SQL> select count(1) from t1_part where n2=100;

  COUNT(1)
----------
      1725

SQL_ID  a7car6agnw2xd, child number 0
-------------------------------------
-----------------------------------------------------------------------
| Id  | Operation           | Name    | Rows  | Bytes | Pstart| Pstop |
-----------------------------------------------------------------------
|   0 | SELECT STATEMENT    |         |       |       |       |       |
|   1 |  SORT AGGREGATE     |         |     1 |     4 |       |       |
|   2 |   PARTITION LIST ALL|         |   571 |  2284 |     1 |     4 | -- global level
|*  3 |    TABLE ACCESS FULL| T1_PART |   571 |  2284 |     1 |     4 |
-----------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("N2"=:SYS_B_1)


SQL> @gvsql
Enter value for sql_id: a7car6agnw2xd

SQL_ID        PLAN_HASH_VALUE CHILD_NUMBER I EXECUTIONS
------------- --------------- ------------ - ----------
a7car6agnw2xd      2324535125            0 Y          2 -- cursor is bind sensitive

When Oracle decides to visit any of the first 3 partitions, the cursor will be also marked bind sensitive as shown below:

SQL> select count(1) from t1_part partition(p_a) where n2=100;

  COUNT(1)
----------
       420

SQL_ID  cz3vd6pyh783s, child number 0
-------------------------------------
--------------------------------------------------------------------------
| Id  | Operation              | Name    | Rows  | Bytes | Pstart| Pstop |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |         |       |       |       |       |
|   1 |  SORT AGGREGATE        |         |     1 |     4 |       |       |
|   2 |   PARTITION LIST SINGLE|         |   724 |  2896 |     1 |     1 |
|*  3 |    TABLE ACCESS FULL   | T1_PART |   724 |  2896 |     1 |     1 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("N2"=:SYS_B_1)


SQL> @gvsql
Enter value for sql_id: cz3vd6pyh783s

SQL_ID        PLAN_HASH_VALUE CHILD_NUMBER I EXECUTIONS
------------- --------------- ------------ - ----------
cz3vd6pyh783s      2382952772            0 Y          1 -- cursor is bind sensitive

However, when Oracle visits the p_d partition using this particular N2 column, the underlying cursor will not be marked bind sensitive in this case:

SQL> select count(1) from t1_part partition(p_d) where n2=100;

  COUNT(1)
----------
       465

SQL> @xpsimp

SQL_ID  2aju26ccnxrzp, child number 0
-------------------------------------
-------------------------------------------------------------------------
| Id  | Operation              | Name    | Rows  | Bytes |Pstart| Pstop |
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |         |       |       |      |       |
|   1 |  SORT AGGREGATE        |         |     1 |     4 |      |       |
|   2 |   PARTITION LIST SINGLE|         |   490 |  1960 |    4 |     4 |
|*  3 |    TABLE ACCESS FULL   | T1_PART |   490 |  1960 |    4 |     4 |
-------------------------------------------------------------------------
														  
Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("N2"=:SYS_B_1)


SQL> @gvsql
Enter value for sql_id: 2aju26ccnxrzp

SQL_ID        PLAN_HASH_VALUE CHILD_NUMBER I EXECUTIONS
------------- --------------- ------------ - ----------
2aju26ccnxrzp      2382952772            0 N          2 -- cursor is not bind sensitive

Spot, in passing, how I have managed to force a single partition scan without using the partition key column (pkey) in the predicate part. I proceeded as such because, whenever the partition key is present in the predicate part, the underlying cursor will be marked bind sensitive regardless of the Notes column that goes along with the N2 predicate or even with the partition key as the following proves:

SQL> select count(1) from t1_part where n2=100 and pkey='D';

  COUNT(1)
----------
       465

SQL> @xpsimp

SQL_ID  dfr7cz8640wav, child number 0
-------------------------------------

--------------------------------------------------------------------------
| Id  | Operation              | Name    | Rows  | Bytes | Pstart| Pstop |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |         |       |       |       |       |
|   1 |  SORT AGGREGATE        |         |     1 |     6 |       |       |
|   2 |   PARTITION LIST SINGLE|         |   490 |  2940 |   KEY |   KEY |
|*  3 |    TABLE ACCESS FULL   | T1_PART |   490 |  2940 |   KEY |   KEY |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("N2"=:SYS_B_1)


SQL> @gvsql
Enter value for sql_id: dfr7cz8640wav

SQL_ID        PLAN_HASH_VALUE CHILD_NUMBER I EXECUTIONS
------------- --------------- ------------ - ----------
dfr7cz8640wav       573263709            0 Y          2 --> cursor is bind sensitive

Impact n°2

The second impact is that the histogram will be ignored when computing cardinality estimation. This is why the cardinality estimation of column N_800 is not very precise. Here’s the corresponding 10053 trace file:

SQL> select count(1) from t1_part where n_800=14;

  COUNT(1)
----------
      2072

SQL> @xpsimp

SQL_ID  b2b4nm8tqqukg, child number 0
-------------------------------------
-----------------------------------------------------------------------
| Id  | Operation           | Name    | Rows  | Bytes | Pstart| Pstop |
-----------------------------------------------------------------------
|   0 | SELECT STATEMENT    |         |       |       |       |       |
|   1 |  SORT AGGREGATE     |         |     1 |     4 |       |       |
|   2 |   PARTITION LIST ALL|         |  1958 |  7832 |     1 |     4 |
|*  3 |    TABLE ACCESS FULL| T1_PART |  1958 |  7832 |     1 |     4 | -- not a very good Estimation
-----------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("N_800"=:SYS_B_1)

kkecdn: Single Table Predicate:"T1_PART"."N_800"=:B1
NOTE: ignoring histogram of column (T1_PART.N_800) -------> it ignores the Hybrid Histogram
      used only for incremental stats maintenance  -------> spot this
  Column (#5): N_800(NUMBER)
    AvgLen: 4 NDV: 817 Nulls: 0 Density: 0.001224 Min: 0.000000 Max: 799.000000
  Estimated selectivity: 0.001224 , col: #5
  Table: T1_PART  Alias: T1_PART
    Card: Original: 1600000.000000  Rounded: 1958  Computed: 1958.384333  Non Adjusted: 1958.384333
  Scan IO  Cost (Disk) =   2183.000000
  Scan CPU Cost (Disk) =   425370320.640000

Instead of using this formula to get the cardinality estimation of a predicate having HYBRID histogram, Oracle simply has ignored this histogram and used the following classical cardinality estimation:

E-Rows = sample_size/num_distinct 

SELECT
    sample_size / num_distinct e_rows
FROM
    user_tab_col_statistics
WHERE
        table_name = 'T1_PART'
    AND column_name = 'N_800';

    E_ROWS
----------
1958.38433

Conclusion

I think that from now and on, the Notes column should always go along with the histogram column (both in tab_col_statistics and part_col_statistics) as this can clearly remove the confusion of why a cursor is not bind sensitive. It can also explain why Oracle is ignoring histograms during its cardinality estimation when it looks like it shouldn’t ignore them.

PS1-here is below the model I built to reproduce the above case and for further incoming blogs about this subject

create table t1
as
select
     rownum n1	
	,trunc(dbms_random.value(0,50))  n_50
    ,trunc(dbms_random.value(0,800)) n_800    
	,trunc(50* dbms_random.normal)   n2
from
    (select /*+ materialize */ rownum from dba_objects where rownum <= 3000),
    (select /*+ materialize */ rownum from dba_objects where rownum <= 3000)
where
    rownum <= 4e5
;

create table t1_part
  (pkey varchar2(11)
  ,n1    number 
  ,n_50  number
  ,n_800 number
  ,n2    number
  )
 partition by list(pkey)
                (partition p_a values ('A')
				,partition p_b values ('B')
				,partition p_c values ('C')
				,partition p_d values ('D')
				);

insert into t1_part select 'A', t1.* from t1;
insert into t1_part select 'B', t1.* from t1;
insert into t1_part select 'C', t1.* from t1;
insert into t1_part select 'D', t1.* from t1;

-- create skew into n2 column
update t1_part set n2=n_800 where pkey='D';

--set incremental mode
execute dbms_stats.set_table_prefs(user, 't1_part', 'incremental', 'true');

-- gather statistics
exec dbms_stats.gather_table_stats(user, 't1_part', partname =>'p_a', method_opt => 'for all columns size skewonly', granularity => 'PARTITION');
exec dbms_stats.gather_table_stats(user, 't1_part', partname =>'p_b', method_opt => 'for all columns size skewonly', granularity => 'PARTITION');
exec dbms_stats.gather_table_stats(user, 't1_part', partname =>'p_c', method_opt => 'for all columns size skewonly', granularity => 'PARTITION');
exec dbms_stats.gather_table_stats(user, 't1_part', partname =>'p_d', method_opt => 'for all columns size skewonly', granularity => 'PARTITION');
exec dbms_stats.gather_table_stats(user, 't1_part', method_opt => 'for all columns size skewonly', granularity => 'GLOBAL');

PS2-scripts used in this blog post can be found here


Viewing all articles
Browse latest Browse all 224

Trending Articles