SQL 同时查询最大值、最小值的优化方法
4,538 total views, 1 views today
在实际工作中,我们经常需要将表中的最大值和最小值同时查询出来,用于分析。
类似于这样的SQL,select min(object_id),max(object_id) from borpt.t1;
但是,表很大的时候,用最容易想到的方式效率很低,那么有没有更高效的方式呢?下面展开讲解。
初始化测试环境
1 2 3 4 5 6 7 8 9 |
create table borpt.t1 as select * from dba_objects; select count(*) from borpt.t1; COUNT(*) ---------- 80588 create index borpt.ind_t1 on borpt.t1(object_id); |
查询最大值时的执行计划,走的是INDEX FUll SCAN,可以看到cost只有2,一致性读88,还是很高效的。
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 |
set autot trace select max(OBJECT_ID) from borpt.t1; Execution Plan ---------------------------------------------------------- Plan hash value: 1333790886 ------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 13 | 2 (0)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 | 13 | | | | 2 | INDEX FULL SCAN (MIN/MAX)| IND_T1 | 1 | 13 | 2 (0)| 00:00:01 | ------------------------------------------------------------------------------------- Note ----- - dynamic sampling used for this statement (level=2) Statistics ---------------------------------------------------------- 5 recursive calls 0 db block gets 88 consistent gets 1 physical reads 0 redo size 534 bytes sent via SQL*Net to client 520 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed |
查询最小值的执行计划,和查询最大值时的执行计划完全一样。
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 |
select min(OBJECT_ID) from borpt.t1; Execution Plan ---------------------------------------------------------- Plan hash value: 1333790886 ------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 13 | 2 (0)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 | 13 | | | | 2 | INDEX FULL SCAN (MIN/MAX)| IND_T1 | 1 | 13 | 2 (0)| 00:00:01 | ------------------------------------------------------------------------------------- Note ----- - dynamic sampling used for this statement (level=2) Statistics ---------------------------------------------------------- 4 recursive calls 0 db block gets 88 consistent gets 1 physical reads 0 redo size 532 bytes sent via SQL*Net to client 520 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed |
问题来了,同时查询最大值和最小值,SQL则变得很慢,cost则高达325,比上面的2高出一百多倍,并且执行计划居然走了全表扫描(TABLE ACCESS FULL)!!
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 |
select min(object_id),max(object_id) from borpt.t1; Execution Plan ---------------------------------------------------------- Plan hash value: 3724264953 --------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 13 | 325 (1)| 00:00:04 | | 1 | SORT AGGREGATE | | 1 | 13 | | | | 2 | TABLE ACCESS FULL| T1 | 103K| 1318K| 325 (1)| 00:00:04 | --------------------------------------------------------------------------- Note ----- - dynamic sampling used for this statement (level=2) Statistics ---------------------------------------------------------- 4 recursive calls 0 db block gets 1254 consistent gets 0 physical reads 0 redo size 613 bytes sent via SQL*Net to client 520 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed |
优化办法:
查询最大值和查询最小值分开做,都作为查询dual表的标量子查询,保持了INDEX FULL SCAN的高效。整个SQL也高效了很多很多。
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 |
--最优办法 select (select min(object_id) from borpt.t1),(select max(object_id) from borpt.t1) from dual; Execution Plan ---------------------------------------------------------- Plan hash value: 2617139519 ------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | | 2 (0)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 | 13 | | | | 2 | INDEX FULL SCAN (MIN/MAX)| IND_T1 | 1 | 13 | 2 (0)| 00:00:01 | | 3 | SORT AGGREGATE | | 1 | 13 | | | | 4 | INDEX FULL SCAN (MIN/MAX)| IND_T1 | 1 | 13 | 2 (0)| 00:00:01 | | 5 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 | ------------------------------------------------------------------------------------- Note ----- - dynamic sampling used for this statement (level=2) Statistics ---------------------------------------------------------- 7 recursive calls 0 db block gets 176 consistent gets 0 physical reads 0 redo size 653 bytes sent via SQL*Net to client 520 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed |
其他:
也有人提出用这条SQL,我要说的是这种写法效率更低,因为要进行全表扫描并且排序。
1 |
select object_id from (select object_id from borpt.t1 order by object_id) where rownum<2; |