Quantcast
Channel: Mohamed Houri’s Oracle Notes
Viewing all articles
Browse latest Browse all 224

INDEX FULL SCAN , NLS_SORT and ORDER BY

$
0
0

If I was to ask you how an INDEX FULL SCAN is operated by the SQL engine you would certainly answer that it will go to the first leaf block of the index and walk down up to the last leaf block in the index key order using typically a db file sequential read.

I have strengthened the words  index key order because it is specifically that operation which interests me here in this article. Each time I see an INDEX FULL SCAN operation in an execution plan, I immediately try to know if the CBO did took an advantage of this typical access to avoid an eventual supplementary ORDER BY operation. There is a bug, under the FIRST_ROWS mode, where an INDEX FULL SCAN is preferred, whatever its cost is, in order to avoid an ORDER BY operation. Hence, in the presence of such index operation, I also try to verify the CBO mode and/or the presence of a where clause such as where rownum=1 which makes, behind the scene, the CBO behaving as if it was running under FIRST_ROWS mode.

Recently an excellent question comes up in a French forum where the Original Poster (OP) was wondering why the CBO was making a wrong decision. Several very good interventions by very nice peoples motivated me to write two articles, the first one related to the relationship that might exist between an INDEX FULL SCAN and an ORDER BY operation while the second article will look on the effect the optimizer_index_cost_adj parameter might have on the choice of a good or wrong execution path.

The  OP query and execution plan  are shown below:

SELECT
     colonne1
FROM matable
GROUP BY colonne1
ORDER BY colonne1 ASC NULLS LAST;
Plan hash value: 2815412565

------------------------------------------------------------------------------------------
| Id  | Operation         | Name                 | Starts | E-Rows | A-Rows |   A-Time   |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |                      |      1 |        |      3 |00:00:21.60 |
|   1 |  SORT ORDER BY    |                      |      1 |      3 |      3 |00:00:21.60 |
|   2 |   HASH GROUP BY   |                      |      1 |      3 |      3 |00:00:21.60 |
|   3 |    INDEX FULL SCAN| MATABLE_PK           |      1 |   2923K|   2928K|00:00:21.99 |
------------------------------------------------------------------------------------------

The query takes more than 20 seconds to complete. And when he instructs the CBO to use a FULL table scans the response time fells down to about 4 seconds

SELECT /*+ NO_INDEX(matable matable_pk) */
colonne1
FROM matable
GROUP BY colonne1
ORDER BY colonne1 ASC NULLS LAST;

-----------------------------------------------------------------------------------------
| Id  | Operation           | Name              | Starts | E-Rows | A-Rows |   A-Time   |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |                   |      1 |        |      3 |00:00:04.03 |
|   1 |  SORT ORDER BY      |                   |      1 |      3 |      3 |00:00:04.03 |
|   2 |   HASH GROUP BY     |                   |      1 |      3 |      3 |00:00:04.03 |
|   3 |    TABLE ACCESS FULL| MATABLE           |      1 |   2923K|   2928K|00:00:03.19 |
-----------------------------------------------------------------------------------------

The next blog article will discuss the reason of this wrong execution plan choice.  For this moment, let me just spot with you the duplicate sort operation the OP has got.

------------------------------------------------------------------------------------------
| Id  | Operation         | Name                 | Starts | E-Rows | A-Rows |   A-Time   |
------------------------------------------------------------------------------------------
|   1 |  SORT ORDER BY    |                      |      1 |      3 |      3 |00:00:21.60 |
|   2 |   HASH GROUP BY   |                      |      1 |      3 |      3 |00:00:21.60 |
|   3 |    INDEX FULL SCAN| MATABLE_PK           |      1 |   2923K|   2928K|00:00:21.99 |
------------------------------------------------------------------------------------------

An ordered INDEX FULL SCAN (on the leading PK column) access followed by a SORT ORDER BY of this PK column.

Why?

This is the aim of the current blog article.

First let me present the model

CREATE TABLE t
(c1 VARCHAR2(64), c2 CHAR(15), d1 DATE);

INSERT INTO t
  SELECT
       mod(ABS(dbms_random.random),3)+ 1||chr(ascii('Y')) ,
       dbms_random.string('L',dbms_random.value(1,5))||rownum ,
       to_date(TO_CHAR(to_date('01/01/1980','dd/mm/yyyy'),'J') + TRUNC(dbms_random.value(1,11280)),'J')
FROM dual
CONNECT BY level <= 2e6;

ALTER TABLE t ADD CONSTRAINT t_pk PRIMARY KEY (c1,c2) USING INDEX;

EXEC dbms_stats.gather_table_stats (USER, 't', CASCADE => true, method_opt => 'FOR ALL COLUMNS SIZE 1');

And now the query on 11.2.0.3.0 – 64bit Production

SQL > SELECT  c1
 2    FROM t
 3    GROUP BY c1
 4   ORDER BY c1 ASC NULLS LAST;

C1
 --------------------------
 1Y
 2Y
 3Y

--------------------------------------
 SQL_ID  0nfhzk4r58zuw, child number 1
 -------------------------------------
 SELECT  c1   FROM t   GROUP BY c1  ORDER BY c1 ASC NULLS LAST

Plan hash value: 2111031280
 -----------------------------------------------------------------------------
 | Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
 -----------------------------------------------------------------------------
 |   0 | SELECT STATEMENT     |      |       |       |  2069 (100)|          |
 |   1 |  SORT GROUP BY NOSORT|      |     3 |     9 |  2069   (5)| 00:00:06 |
 |   2 |   INDEX FULL SCAN    | T_PK |  2000K|  5859K|  2069   (5)| 00:00:06 |
 -----------------------------------------------------------------------------

As I have expected, an ordered INDEX FULL SCAN on the leading primary key column which allows the CBO to avoid the ORDER BY c1 operation as clearly shown by the operation 1 SORT GROUP BY NOSORT

So what is the difference between my model and the OP one? Or more precisely what is the difference between my environment and the OP one? It should exist something that makes the difference. Fortunately the thread was under good hands and someone cleverly asked to get the execution plan with the advanced option thought that his intention was to see the cost. Nevertheless, the advanced option shows that the OP was using a French NLS_SORT parameter.

Hmmmm…

Let me then change my nls_sort to FRENCH and see what happens to my engineered query


SQL> show parameter nls_sort

NAME                                 TYPE        VALUE
------------------------------------ ----------- ----------
nls_sort                             string      BINARY

SQL> alter session set nls_sort=FRENCH;

Session altered.

SQL> SELECT  c1
2    FROM t
3    GROUP BY c1
4   ORDER BY c1 ASC NULLS LAST;

C1
------------------------
1Y
2Y
3Y

SQL_ID  0nfhzk4r58zuw, child number 3
-------------------------------------
SELECT  c1   FROM t   GROUP BY c1  ORDER BY c1 ASC NULLS LAST

Plan hash value: 1760210272

------------------------------------------------------------------------------
| Id  | Operation             | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |       |       |  2451 (100)|          |
|   1 |  SORT ORDER BY        |      |     3 |     9 |  2451  (20)| 00:00:07 |
|   2 |   SORT GROUP BY NOSORT|      |     3 |     9 |  2451  (20)| 00:00:07 |
|   3 |    INDEX FULL SCAN    | T_PK |  2000K|  5859K|  2069   (5)| 00:00:06 |
------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
1 - SEL$1
3 - SEL$1 / T@SEL$1

Outline Data
-------------
/*+
BEGIN_OUTLINE_DATA
IGNORE_OPTIM_EMBEDDED_HINTS
OPTIMIZER_FEATURES_ENABLE('11.2.0.3')
DB_VERSION('11.2.0.3')
ALL_ROWS
OUTLINE_LEAF(@"SEL$1")
INDEX(@"SEL$1" "T"@"SEL$1" ("T"."C1" "T"."C2"))
END_OUTLINE_DATA
*/

Column Projection Information (identified by operation id):
-----------------------------------------------------------
1 - (#keys=1) NLSSORT("C1",'nls_sort=''FRENCH''')[2000],"C1"[VARCHAR2,256]
2 - (#keys=1) "C1"[VARCHAR2,256]
3 - "C1"[VARCHAR2,256] 

The column projection gives an interesting information on what’s going on here (nls_sort= french)


Column Projection Information (identified by operation id):
-----------------------------------------------------------
1 - (#keys=1) NLSSORT("C1",'nls_sort=''FRENCH''')[2000],"C1"[VARCHAR2,256]

On contrast to the situation where my column c1 would have been declared as of a NUMBER data type, the nls_sort parameter value would not have played any effect as shown below:


SQL> describe t1

Name                            Null?    Type
------------------------------- -------- ----------
1      C1                              NOT NULL NUMBER
2      C2                              NOT NULL CHAR(15)
3      D1                                       DATE

SQL> show parameter nls_sort

NAME                                 TYPE        VALUE
------------------------------------ ----------- -----------
nls_sort                             string      FRENCH

SQL>SELECT  c1
2    FROM t1
3    GROUP BY c1
4   ORDER BY c1 ASC NULLS LAST;

C1
----------
1
2
3

------------------------------------------------------------------------------
| Id  | Operation            | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |       |       |       |  2105 (100)|          |
|   1 |  SORT GROUP BY NOSORT|       |  1754K|    21M|  2105   (5)| 00:00:06 |
|   2 |   INDEX FULL SCAN    | T1_PK |  1754K|    21M|  2105   (5)| 00:00:06 |
------------------------------------------------------------------------------

SQL> alter session set nls_sort = BINARY;

Session altered.

SQL> SELECT  c1
2    FROM t1
3    GROUP BY c1
4   ORDER BY c1 ASC NULLS LAST;

C1
----------
1
2
3

------------------------------------------------------------------------------
| Id  | Operation            | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |       |       |       |  2105 (100)|          |
|   1 |  SORT GROUP BY NOSORT|       |  1754K|    21M|  2105   (5)| 00:00:06 |
|   2 |   INDEX FULL SCAN    | T1_PK |  1754K|    21M|  2105   (5)| 00:00:06 |
------------------------------------------------------------------------------

Footnote: When you see in your execution plan two ordered operations like and INDEX FULL SCAN followed by an ORDER BY on the leading index column then check the nls_sort parameter. It might be due to the difference of the session nls_sort parameter and the sort parameter used internally by Oracle when reading the INDEX FULL SCAN keys.



Viewing all articles
Browse latest Browse all 224

Trending Articles