If I were to ask you how the SQL engine performs an INDEX FULL SCAN, you would certainly answer that it goes to the first leaf block of the index and walks through to the last leaf block in index key order, typically using db file sequential reads.
I have emphasized the words index key order because that is precisely what interests me in this article. Each time I see an INDEX FULL SCAN operation in an execution plan, I immediately try to find out whether the CBO took advantage of this ordered access to avoid an additional ORDER BY operation. There is a bug, under FIRST_ROWS mode, where an INDEX FULL SCAN is preferred, whatever its cost, in order to avoid an ORDER BY operation. Hence, in the presence of such an index operation, I also check the CBO mode and/or the presence of a where clause such as where rownum=1, which makes the CBO behave, behind the scenes, as if it were running under FIRST_ROWS mode.
Recently an excellent question came up in a French forum, where the Original Poster (OP) was wondering why the CBO was making a wrong decision. Several very good contributions by very nice people motivated me to write two articles: the first on the relationship that might exist between an INDEX FULL SCAN and an ORDER BY operation, and the second on the effect the optimizer_index_cost_adj parameter might have on the choice of a good or bad execution path.
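As a minimal sketch of how the optimizer mode check described above could be done, the queries below read the mode in effect for the session and for one cached cursor. They assume SELECT privilege on the v$ views, and the sql_id and child number are SQL*Plus substitution variables that you must supply yourself:

```sql
-- Optimizer mode currently in effect for this session
SELECT name, value
FROM   v$parameter
WHERE  name = 'optimizer_mode';

-- Optimizer mode a given child cursor was optimized with
-- (&sql_id and &child_no are placeholders, not values from this article)
SELECT name, value, isdefault
FROM   v$sql_optimizer_env
WHERE  sql_id       = '&sql_id'
AND    child_number = &child_no
AND    name         = 'optimizer_mode';
```

The second query is the more reliable of the two when the statement was parsed by another session, since a session-level setting may differ from the instance-wide one.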
The OP query and execution plan are shown below:
SELECT colonne1
FROM matable
GROUP BY colonne1
ORDER BY colonne1 ASC NULLS LAST;

Plan hash value: 2815412565

---------------------------------------------------------------------------------
| Id  | Operation          | Name       | Starts | E-Rows | A-Rows |   A-Time   |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |            |      1 |        |      3 |00:00:21.60 |
|   1 |  SORT ORDER BY     |            |      1 |      3 |      3 |00:00:21.60 |
|   2 |   HASH GROUP BY    |            |      1 |      3 |      3 |00:00:21.60 |
|   3 |    INDEX FULL SCAN | MATABLE_PK |      1 |   2923K|   2928K|00:00:21.99 |
---------------------------------------------------------------------------------

The query takes more than 20 seconds to complete. When he instructs the CBO not to use the index, so that it falls back to a full table scan, the response time drops to about 4 seconds:

SELECT /*+ NO_INDEX(matable matable_pk) */ colonne1
FROM matable
GROUP BY colonne1
ORDER BY colonne1 ASC NULLS LAST;

--------------------------------------------------------------------------------
| Id  | Operation            | Name    | Starts | E-Rows | A-Rows |   A-Time   |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |         |      1 |        |      3 |00:00:04.03 |
|   1 |  SORT ORDER BY       |         |      1 |      3 |      3 |00:00:04.03 |
|   2 |   HASH GROUP BY      |         |      1 |      3 |      3 |00:00:04.03 |
|   3 |    TABLE ACCESS FULL | MATABLE |      1 |   2923K|   2928K|00:00:03.19 |
--------------------------------------------------------------------------------

The next blog article will discuss the reason for this wrong execution plan choice. For the moment, let me just point out the duplicate ordering work in the OP's plan:

---------------------------------------------------------------------------------
| Id  | Operation          | Name       | Starts | E-Rows | A-Rows |   A-Time   |
---------------------------------------------------------------------------------
|   1 |  SORT ORDER BY     |            |      1 |      3 |      3 |00:00:21.60 |
|   2 |   HASH GROUP BY    |            |      1 |      3 |      3 |00:00:21.60 |
|   3 |    INDEX FULL SCAN | MATABLE_PK |      1 |   2923K|   2928K|00:00:21.99 |
---------------------------------------------------------------------------------

An ordered INDEX FULL SCAN (on the leading PK column) followed by a SORT ORDER BY on that same PK column.
Why?
Answering that question is the aim of this blog article.
First let me present the model:

CREATE TABLE t (c1 VARCHAR2(64), c2 CHAR(15), d1 DATE);

INSERT INTO t
SELECT mod(ABS(dbms_random.random),3)+ 1||chr(ascii('Y'))
     , dbms_random.string('L',dbms_random.value(1,5))||rownum
     , to_date(TO_CHAR(to_date('01/01/1980','dd/mm/yyyy'),'J') + TRUNC(dbms_random.value(1,11280)),'J')
FROM dual
CONNECT BY level <= 2e6;

ALTER TABLE t ADD CONSTRAINT t_pk PRIMARY KEY (c1,c2) USING INDEX;

EXEC dbms_stats.gather_table_stats (USER, 't', CASCADE => true, method_opt => 'FOR ALL COLUMNS SIZE 1');

And now the query on 11.2.0.3.0 (64bit Production):

SQL> SELECT c1
     FROM t
     GROUP BY c1
     ORDER BY c1 ASC NULLS LAST;

C1
--------------------------
1Y
2Y
3Y

--------------------------------------
SQL_ID 0nfhzk4r58zuw, child number 1
-------------------------------------
SELECT c1 FROM t GROUP BY c1 ORDER BY c1 ASC NULLS LAST

Plan hash value: 2111031280

-----------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |       |       |  2069 (100)|          |
|   1 |  SORT GROUP BY NOSORT|      |     3 |     9 |  2069   (5)| 00:00:06 |
|   2 |   INDEX FULL SCAN    | T_PK |  2000K|  5859K|  2069   (5)| 00:00:06 |
-----------------------------------------------------------------------------
As I expected: an ordered INDEX FULL SCAN on the leading primary key column allows the CBO to avoid the ORDER BY c1 operation, as clearly shown by operation 1, SORT GROUP BY NOSORT.
So what is the difference between my model and the OP's? Or, more precisely, what is the difference between my environment and the OP's? There must be something that makes the difference. Fortunately the thread was in good hands, and someone cleverly asked for the execution plan with the advanced option, though I think his intention was to see the cost. Nevertheless, the advanced option showed that the OP was using a French NLS_SORT parameter.
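For readers who have not used it, the advanced option mentioned above can be requested through DBMS_XPLAN. The following is a sketch rather than the exact call used in the forum thread, which the article does not show. It assumes it is run in the same session, immediately after the query of interest, with serveroutput switched off:

```sql
-- Plan of the last statement executed by this session, with the
-- extra sections (outline data, column projection, and so on)
SELECT *
FROM   TABLE(dbms_xplan.display_cursor(NULL, NULL, 'ADVANCED'));
```

Passing NULL for the sql_id and child number makes DBMS_XPLAN pick the session's last executed cursor; to inspect another cursor, supply its sql_id and child number explicitly.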
Hmmmm…
Let me then change my nls_sort to FRENCH and see what happens to my engineered query:

SQL> show parameter nls_sort

NAME                                 TYPE        VALUE
------------------------------------ ----------- ----------
nls_sort                             string      BINARY

SQL> alter session set nls_sort=FRENCH;

Session altered.

SQL> SELECT c1
     FROM t
     GROUP BY c1
     ORDER BY c1 ASC NULLS LAST;

C1
------------------------
1Y
2Y
3Y

SQL_ID 0nfhzk4r58zuw, child number 3
-------------------------------------
SELECT c1 FROM t GROUP BY c1 ORDER BY c1 ASC NULLS LAST

Plan hash value: 1760210272

------------------------------------------------------------------------------
| Id  | Operation             | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |       |       |  2451 (100)|          |
|   1 |  SORT ORDER BY        |      |     3 |     9 |  2451  (20)| 00:00:07 |
|   2 |   SORT GROUP BY NOSORT|      |     3 |     9 |  2451  (20)| 00:00:07 |
|   3 |    INDEX FULL SCAN    | T_PK |  2000K|  5859K|  2069   (5)| 00:00:06 |
------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
   1 - SEL$1
   3 - SEL$1 / T@SEL$1

Outline Data
-------------
  /*+
      BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('11.2.0.3')
      DB_VERSION('11.2.0.3')
      ALL_ROWS
      OUTLINE_LEAF(@"SEL$1")
      INDEX(@"SEL$1" "T"@"SEL$1" ("T"."C1" "T"."C2"))
      END_OUTLINE_DATA
  */

Column Projection Information (identified by operation id):
-----------------------------------------------------------
   1 - (#keys=1) NLSSORT("C1",'nls_sort=''FRENCH''')[2000],"C1"[VARCHAR2,256]
   2 - (#keys=1) "C1"[VARCHAR2,256]
   3 - "C1"[VARCHAR2,256]

The column projection gives interesting information about what is going on here (nls_sort = FRENCH): operation 1 sorts on the NLSSORT key of c1, not on the c1 value that the index delivers.

Column Projection Information (identified by operation id):
-----------------------------------------------------------
   1 - (#keys=1) NLSSORT("C1",'nls_sort=''FRENCH''')[2000],"C1"[VARCHAR2,256]

In contrast, had my column c1 been declared with a NUMBER data type, the nls_sort parameter value would have had no effect, as shown below:

SQL> describe t1

     Name                            Null?    Type
     ------------------------------- -------- ----------
   1 C1                              NOT NULL NUMBER
   2 C2                              NOT NULL CHAR(15)
   3 D1                                       DATE

SQL> show parameter nls_sort

NAME                                 TYPE        VALUE
------------------------------------ ----------- -----------
nls_sort                             string      FRENCH

SQL> SELECT c1
     FROM t1
     GROUP BY c1
     ORDER BY c1 ASC NULLS LAST;

C1
----------
1
2
3

------------------------------------------------------------------------------
| Id  | Operation            | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |       |       |       |  2105 (100)|          |
|   1 |  SORT GROUP BY NOSORT|       |  1754K|    21M|  2105   (5)| 00:00:06 |
|   2 |   INDEX FULL SCAN    | T1_PK |  1754K|    21M|  2105   (5)| 00:00:06 |
------------------------------------------------------------------------------

SQL> alter session set nls_sort = BINARY;

Session altered.

SQL> SELECT c1
     FROM t1
     GROUP BY c1
     ORDER BY c1 ASC NULLS LAST;

C1
----------
1
2
3

------------------------------------------------------------------------------
| Id  | Operation            | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |       |       |       |  2105 (100)|          |
|   1 |  SORT GROUP BY NOSORT|       |  1754K|    21M|  2105   (5)| 00:00:06 |
|   2 |   INDEX FULL SCAN    | T1_PK |  1754K|    21M|  2105   (5)| 00:00:06 |
------------------------------------------------------------------------------
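To make the NLSSORT expression in that projection line more concrete, here is a small sketch against the model table t. It displays, for each distinct c1 value, the sort key Oracle derives under a binary sort and under a French linguistic sort. The column aliases are mine, and no output is shown because the key bytes depend on the database character set:

```sql
-- NLSSORT returns the RAW key that the sort operation actually compares
SELECT c1,
       NLSSORT(c1, 'NLS_SORT=BINARY') AS binary_key,
       NLSSORT(c1, 'NLS_SORT=FRENCH') AS french_key
FROM   (SELECT DISTINCT c1 FROM t);
```

When the two keys differ, rows ordered by one key are not guaranteed to be ordered by the other, which is why the ordered index walk cannot stand in for the SORT ORDER BY in this plan.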
Footnote: When you see two ordered operations in your execution plan, such as an INDEX FULL SCAN followed by a SORT ORDER BY on the leading index column, check the nls_sort parameter. The extra sort might be due to a difference between the session nls_sort setting and the binary order in which Oracle stores and reads the index keys during the INDEX FULL SCAN.
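A quick way to perform the check suggested in the footnote is to query the session's NLS settings. This is a sketch only; NLS_COMP and NLS_LANGUAGE are included as related settings worth looking at, not because the forum thread discussed them:

```sql
-- Current linguistic settings of this session
SELECT parameter, value
FROM   nls_session_parameters
WHERE  parameter IN ('NLS_SORT', 'NLS_COMP', 'NLS_LANGUAGE');
```

Keep in mind that when NLS_SORT is not set explicitly, its default is derived from NLS_LANGUAGE, so a client configured for French can end up with a linguistic sort without anyone having asked for it.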
