数据库源代码里隐式类型转换导致的性能问题
今天在处理一个现网问题时,发现平台的一个SQL里一对单引号的缺失导致业务性能很差。
问题背景
客户现场电视墙每天凌晨11点50分左右会卡顿一会儿,通过搜集数据库的statspack以及业务侧的平台日志检查,发现以下SQL执行时间需要9-10秒:
SELECT ACTLOGID,ALARMLOGID,ALARMCODE,ACTDATE,RECSTARTTIME,RECSTOPTIME FROM ALARM_ACT_LOG WHERE RESPID=001200003680101 AND RECSTOPTIME>20151113235012;
因为该表数据从未清理过(有500万+条数据),而且其中大部分数据都是过期数据,所以为了提高性能,每天凌晨2点对该表实施了清理动作,清理之后,每次执行还是需要3秒多。
处理过程
查看该SQL的执行计划:
SQL> SELECT ACTLOGID,ALARMLOGID,ALARMCODE,ACTDATE,RECSTARTTIME,RECSTOPTIME FROM ALARM_ACT_LOG WHERE RESPID=001200003680101 AND RECSTOPTIME>20151113235012;
29 rows selected.
Elapsed: 00:00:03.46
Execution Plan
----------------------------------------------------------
Plan hash value: 4149311700
-----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 23 | 1771 | 17678 (1)| 00:03:33 |
|* 1 | TABLE ACCESS FULL| ALARM_ACT_LOG | 23 | 1771 | 17678 (1)| 00:03:33 |
-----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(TO_NUMBER("RESPID")=001200003680101 AND
TO_NUMBER("RECSTOPTIME")>20151113235012)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
65195 consistent gets
735 physical reads
0 redo size
2900 bytes sent via SQL*Net to client
531 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
29 rows processed
这个SQL走了全表扫描,而且物理读和逻辑读都较大。
查看该表的记录数以及在RESPID上的不同记录数:
SQL> select count(*) from alarm_act_log;
COUNT(*)
----------
994176
SQL> select count(*) from (select distinct respid from alarm_act_log);
COUNT(*)
----------
371
理论上讲,这个SQL的查询走respid这个字段上的索引的INDEX RANGE SCAN应该能减少物理读和逻辑读。于是,在alarm_act_log的respid上创建了索引:
SQL> create index nvs.ALARM_ACT_LOG_RESPID_IDX on nvs.alarm_act_log(respid);
Index created.
再次查看执行计划,还是走的全表扫描:
SQL> SELECT ACTLOGID,ALARMLOGID,ALARMCODE,ACTDATE,RECSTARTTIME,RECSTOPTIME FROM ALARM_ACT_LOG WHERE RESPID=001200003680101 AND RECSTOPTIME>20151113235012;
29 rows selected.
Elapsed: 00:00:03.46
Execution Plan
----------------------------------------------------------
Plan hash value: 4149311700
-----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 23 | 1771 | 17678 (1)| 00:03:33 |
|* 1 | TABLE ACCESS FULL| ALARM_ACT_LOG | 23 | 1771 | 17678 (1)| 00:03:33 |
-----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(TO_NUMBER("RESPID")=001200003680101 AND
TO_NUMBER("RECSTOPTIME")>20151113235012)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
65195 consistent gets
735 physical reads
0 redo size
2900 bytes sent via SQL*Net to client
531 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
29 rows processed
检查表的定义:
SQL> desc alarm_act_log
Name Null? Type
----------------------------------------- -------- ----------------------------
ACTLOGID NOT NULL NUMBER(20)
ALARMLOGID NUMBER(20)
ALARMCODE NUMBER(4)
CTLCODE NUMBER(5)
RESPIDTYPE NOT NULL VARCHAR2(5)
RESPID VARCHAR2(32)
PARAM VARCHAR2(128)
ACTDATE VARCHAR2(14)
ACTENDDATE VARCHAR2(14)
RECSTARTTIME VARCHAR2(14)
RECSTOPTIME VARCHAR2(14)
ALARMDAY VARCHAR2(8)
RESPID和RECSTOPTIME都是VARCHAR2数据类型,但是在SQL里与这两个字段对应的值都是数值类型:RESPID=001200003680101 AND RECSTOPTIME>20151113235012
所以数据库要做一个隐式类型转换,也就是说它需要执行TO_NUMBER(RESPID)和TO_NUMBER(RECSTOPTIME)之后才会分别与001200003680101和20151113235012进行比较。如果创建基于TO_NUMBER的函数索引,它应该会走INDEX RANGE SCAN。
SQL> DROP INDEX NVS.ALARM_ACT_LOG_RESPID_IDX;
Index dropped.
SQL> create index nvs.ALARM_ACT_LOG_RESPID_IDX on nvs.alarm_act_log(to_number(respid));
Index created.
再次查看执行计划:
SQL> SELECT ACTLOGID,ALARMLOGID,ALARMCODE,ACTDATE,RECSTARTTIME,RECSTOPTIME FROM ALARM_ACT_LOG WHERE RESPID=001200003680101 AND RECSTOPTIME>'20151113235012';
29 rows selected.
Elapsed: 00:00:00.01
Execution Plan
----------------------------------------------------------
Plan hash value: 2486932655
--------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 496 | 41168 | 2099 (0)| 00:00:26 |
|* 1 | TABLE ACCESS BY INDEX ROWID| ALARM_ACT_LOG | 496 | 41168 | 2099 (0)| 00:00:26 |
|* 2 | INDEX RANGE SCAN | ALARM_ACT_LOG_RESPID_IDX | 3975 | | 9 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(TO_NUMBER("RECSTOPTIME")>20151113235012)
2 - access(TO_NUMBER("RESPID")=001200003680101)
Statistics
----------------------------------------------------------
24 recursive calls
0 db block gets
620 consistent gets
4 physical reads
0 redo size
2900 bytes sent via SQL*Net to client
531 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
29 rows processed
可以看到逻辑读和物理读都大幅减少,执行时间也从3秒多降为0.01秒。当然也可以在RESPID和RECSTOPTIME上建立复合函数索引: create index nvs.ALARM_ACT_LOG_RESPID_STOPTIME_IDX on nvs.alarm_act_log(to_number(respid),to_number(recstoptime));
因为是平台SQL,不能直接修改,只能创建函数索引规避这个问题。其实函数索引对于增删改频繁的表有一定的副作用,因为对于增删改的数据的索引维护都要做一个函数操作,好在这个表增删改并不频繁。最好的办法就是将SQL改为:SELECT ACTLOGID,ALARMLOGID,ALARMCODE,ACTDATE,RECSTARTTIME,RECSTOPTIME FROM ALARM_ACT_LOG WHERE RESPID='001200003680101' AND RECSTOPTIME>'20151113235012';然后在RESPID上建立索引或在RESPID和RECSTOPTIME上建立复合索引。
而实际上,这个表上有一个复合索引,这个索引上的一个字段就是RESPID。如果平台的这个SQL不做隐式类型转换,写成:SELECT ACTLOGID,ALARMLOGID,ALARMCODE,ACTDATE,RECSTARTTIME,RECSTOPTIME FROM ALARM_ACT_LOG WHERE RESPID='001200003680101' AND RECSTOPTIME>20151113235012;
那么它的执行计划还是可以做INDEX SKIP SCAN:
SQL> SELECT ACTLOGID,ALARMLOGID,ALARMCODE,ACTDATE,RECSTARTTIME,RECSTOPTIME FROM ALARM_ACT_LOG WHERE RESPID='001200003680101' AND RECSTOPTIME>'20151113235012';
29 rows selected.
Elapsed: 00:00:00.01
Execution Plan
----------------------------------------------------------
Plan hash value: 3344032797
-------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 49 | 3773 | 621 (0)| 00:00:08 |
|* 1 | TABLE ACCESS BY INDEX ROWID| ALARM_ACT_LOG | 49 | 3773 | 621 (0)| 00:00:08 |
|* 2 | INDEX SKIP SCAN | ALARM_ACT_LOG_IDX | 1018 | | 62 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(TO_NUMBER("RECSTOPTIME")>20151113235012)
2 - access("RESPID"='001200003680101')
filter("RESPID"='001200003680101')
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
632 consistent gets
0 physical reads
0 redo size
2888 bytes sent via SQL*Net to client
531 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
29 rows processed
*************************
可以看到在INDEX SKIP SCAN时,SQL的性能也很好。那么这个因隐式数据类型转换引起的性能问题完全可以避免。
- 点赞
- 收藏
- 关注作者
评论(0)