Oracle CBO优化模式中的5种索引访问方法浅析

本文主要讨论以下几种索引访问方法:

1.索引唯一扫描(INDEX UNIQUE SCAN)
2.索引范围扫描(INDEX RANGE SCAN)
3.索引全扫描(INDEX FULL SCAN)
4.索引跳跃扫描(INDEX SKIP SCAN)
5.索引快速全扫描(INDEX FAST FULL SCAN)

索引唯一扫描(INDEX UNIQUE SCAN)

通过这种索引访问数据的特点是对于某个特定的值只返回一行数据,通常如果在查询谓语中使用UNIQE和PRIMARY KEY索引的列作为条件的时候会选用这种扫描;访问的高度总是索引的高度加一,除了某些特殊的情况,如另外存储的LOB对象。

代码如下:
SQL> set autotrace traceonly explain
SQL> select * from hr.employees where employee_id = 100;

Execution Plan
----------------------------------------------------------
Plan hash value: 1833546154

---------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 69 | 1 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| EMPLOYEES | 1 | 69 | 1 (0)| 00:00:01 |
|* 2 | INDEX UNIQUE SCAN | EMP_EMP_ID_PK | 1 | | 0 (0)| 00:00:01 |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("EMPLOYEE_ID"=100)

索引范围扫描(INDEX RANGE SCAN)

谓语中包含将会返回一定范围数据的条件时就会选用索引范围扫描,索引可以是唯一的亦可以是不唯一的;所指定的条件可以是(<,>,LIKE,BETWEEN,=)等运算符,不过使用LIKE的时候,如果使用了通配符%,极有可能就不会使用范围扫描,因为条件过于的宽泛了,下面是一个示例:

代码如下:
SQL> select * from hr.employees where DEPARTMENT_ID = 30;

6 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 2056577954

-------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 6 | 414 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| EMPLOYEES | 6 | 414 | 2 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | EMP_DEPARTMENT_IX | 6 | | 1 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("DEPARTMENT_ID"=30)

Statistics
----------------------------------------------------------
8 recursive calls
0 db block gets
7 consistent gets
1 physical reads
0 redo size
1716 bytes sent via SQL*Net to client
523 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
6 rows processed

范围扫描的条件需要准确的分析返回数据的数目,范围越大就越可能执行全表扫描;

代码如下:
SQL> select department_id,count(*) from hr.employees group by department_id order by count(*);

DEPARTMENT_ID COUNT(*)
------------- ----------
10 1
40 1
1
70 1
20 2
110 2
90 3
60 5
30 6
100 6
80 34
50 45

12 rows selected.

-- 这里使用数值最多的50来执行范围扫描
SQL> set autotrace traceonly explain
SQL> select * from hr.employees where DEPARTMENT_ID = 50;

45 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 1445457117

-------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 45 | 3105 | 3 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| EMPLOYEES | 45 | 3105 | 3 (0)| 00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("DEPARTMENT_ID"=50)

Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
10 consistent gets
0 physical reads
0 redo size
4733 bytes sent via SQL*Net to client
545 bytes received via SQL*Net from client
4 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
45 rows processed

可以看到在获取范围数据较大的时候,优化器还是执行了全表扫描方法。

一种对于索引范围扫描的优化方法是使用升序排列的索引来获得降序排列的数据行,这种情况多发生在查询中包含有索引列上的ORDER BY子句的时候,这样就可避免一次排序操作了,如下:
代码如下:
SQL> set autotrace traceonly explain
SQL> select * from hr.employees
2 where department_id in (90,100)
3 order by department_id desc;

Execution Plan
----------------------------------------------------------
Plan hash value: 3707994525

---------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 9 | 621 | 2 (0)| 00:00:01 |
| 1 | INLIST ITERATOR | | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID | EMPLOYEES | 9 | 621 | 2 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN DESCENDING| EMP_DEPARTMENT_IX | 9 | | 1 (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - access("DEPARTMENT_ID"=90 OR "DEPARTMENT_ID"=100)

上例中,索引条目被相反的顺序读取,避免了排序操作。

索引全扫描(INDEX FULL SCAN)

索引全扫描的操作将会扫描索引结构的每一个叶子块,读取每个条目的的行编号,并取出数据行,既然是访问每一个索引叶子块,那么它相对的全表扫描的优势在哪里呢?实际上在索引块中因为包含的信息列数较少,通常都是索引键和ROWID,所以对于同一个数据块和索引块,包含的索引键的条目数通常都是索引块中居多,因此如果查询字段列表中所有字段都是索引的一部分的时候,就可以完全跳过对表数据的访问了,这种情况索引全扫描的方法会获得更高的效率。

发生索引全扫描的情况有很多,几种典型的场景:

1,查询总缺少谓语,但获取的列可以通过索引直接获得
代码如下:
SQL> select email from hr.employees;

Execution Plan
----------------------------------------------------------
Plan hash value: 2196514524

---------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 107 | 856 | 1 (0)| 00:00:01 |
| 1 | INDEX FULL SCAN | EMP_EMAIL_UK | 107 | 856 | 1 (0)| 00:00:01 |
---------------------------------------------------------------------------------

2,查询谓语中包含一个位于索引中非引导列上的条件(其实也取决于引导列值的基数大小,如果引导列的唯一值较少,也可能出现跳跃扫描的情况)
代码如下:
SQL> select first_name,last_name from hr.employees
2 where first_name like 'A%' ;

Execution Plan
----------------------------------------------------------
Plan hash value: 2228653197

--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3 | 45 | 1 (0)| 00:00:01 |
|* 1 | INDEX FULL SCAN | EMP_NAME_IX | 3 | 45 | 1 (0)| 00:00:01 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - access("FIRST_NAME" LIKE 'A%')
filter("FIRST_NAME" LIKE 'A%')

SQL> SET LONG 2000000
SQL> select dbms_metadata.get_ddl('INDEX','EMP_NAME_IX','HR') from dual;

DBMS_METADATA.GET_DDL('INDEX','HR')
--------------------------------------------------------------------------------

dawei

【声明】:丽水站长网内容转载自互联网,其相关言论仅代表作者个人观点绝非权威,不代表本站立场。如您发现内容存在版权问题,请提交相关链接至邮箱:bqsm@foxmail.com,我们将及时予以处理。