数据库源代码里隐式类型转换导致的性能问题

举报
GaussDB数据库 发表于 2019/06/12 15:40:59 2019/06/12
【摘要】 今天在处理一个现网问题时,发现平台的一个SQL里一对单引号的缺失导致业务性能很差。 问题背景 客户现场电视墙每天凌晨11点50分左右会卡顿一会儿,通过搜集数据库的statspack以及业务侧的平台日志检查,发现以下SQL执行时间需要9-10秒: SELECT ACTLOGID,ALARMLOGID,ALARMCODE,ACTDATE,RECSTARTTIM...

今天在处理一个现网问题时,发现平台的一个SQL里一对单引号的缺失导致业务性能很差。


问题背景

 

客户现场电视墙每天凌晨11点50分左右会卡顿一会儿,通过搜集数据库的statspack以及业务侧的平台日志检查,发现以下SQL执行时间需要9-10秒:
SELECT ACTLOGID,ALARMLOGID,ALARMCODE,ACTDATE,RECSTARTTIME,RECSTOPTIME FROM ALARM_ACT_LOG WHERE RESPID=001200003680101 AND RECSTOPTIME>20151113235012;

因为该表数据从未清理过(有500万+条数据),而且其中大部分数据都是过期数据,所以为了提高性能,每天凌晨2点对该表实施了清理动作,清理之后,每次执行还是需要3秒多。

 

处理过程

查看该SQL的执行计划:
SQL> SELECT ACTLOGID,ALARMLOGID,ALARMCODE,ACTDATE,RECSTARTTIME,RECSTOPTIME FROM ALARM_ACT_LOG WHERE RESPID=001200003680101 AND RECSTOPTIME>20151113235012;

29 rows selected.

Elapsed: 00:00:03.46

Execution Plan
----------------------------------------------------------
Plan hash value: 4149311700

-----------------------------------------------------------------------------------
| Id  | Operation         | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |               |    23 |  1771 | 17678   (1)| 00:03:33 |
|*  1 |  
TABLE ACCESS FULL| ALARM_ACT_LOG |    23 |  1771 | 17678   (1)| 00:03:33 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(TO_NUMBER("RESPID")=001200003680101 AND
              TO_NUMBER("RECSTOPTIME")>20151113235012)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
      
65195  consistent gets
        735  physical reads

          0  redo size
       2900  bytes sent via SQL*Net to client
        531  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         29  rows processed

这个SQL走了全表扫描,而且物理读和逻辑读都较大。

 

查看该表的记录数以及在RESPID上的不同记录数:
SQL> select count(*) from alarm_act_log;

  COUNT(*)
----------
    994176

SQL> select count(*) from (select distinct respid from alarm_act_log);

  COUNT(*)
----------
       371

 

理论上讲,这个SQL的查询走respid这个字段上的索引的INDEX RANGE SCAN应该能减少物理读和逻辑读。于是,在alarm_act_log的respid上创建了索引:
SQL> create index nvs.ALARM_ACT_LOG_RESPID_IDX on nvs.alarm_act_log(respid);

Index created.

 

再次查看执行计划,还是走的全表扫描:
SQL> SELECT ACTLOGID,ALARMLOGID,ALARMCODE,ACTDATE,RECSTARTTIME,RECSTOPTIME FROM ALARM_ACT_LOG WHERE RESPID=001200003680101 AND RECSTOPTIME>20151113235012;

29 rows selected.

Elapsed: 00:00:03.46

Execution Plan
----------------------------------------------------------
Plan hash value: 4149311700

-----------------------------------------------------------------------------------
| Id  | Operation         | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |               |    23 |  1771 | 17678   (1)| 00:03:33 |
|*  1 |  
TABLE ACCESS FULL| ALARM_ACT_LOG |    23 |  1771 | 17678   (1)| 00:03:33 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(TO_NUMBER("RESPID")=001200003680101 AND
              TO_NUMBER("RECSTOPTIME")>20151113235012)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
    
  65195  consistent gets
        735  physical reads

          0  redo size
       2900  bytes sent via SQL*Net to client
        531  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         29  rows processed

检查表的定义:
SQL> desc alarm_act_log
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 ACTLOGID                                  NOT NULL NUMBER(20)
 ALARMLOGID                                         NUMBER(20)
 ALARMCODE                                          NUMBER(4)
 CTLCODE                                            NUMBER(5)
 RESPIDTYPE                                NOT NULL VARCHAR2(5)
 
RESPID                                             VARCHAR2(32)
 PARAM                                              VARCHAR2(128)
 ACTDATE                                            VARCHAR2(14)
 ACTENDDATE                                         VARCHAR2(14)
 RECSTARTTIME                                       VARCHAR2(14)
 
RECSTOPTIME                                        VARCHAR2(14)
 ALARMDAY                                           VARCHAR2(8)

RESPID和RECSTOPTIME都是VARCHAR2数据类型,但是在SQL里与这两个字段对应的值都是数值类型:RESPID=001200003680101 AND RECSTOPTIME>20151113235012
所以数据库要做一个隐式类型转换,也就是说它需要执行TO_NUMBER(RESPID)和TO_NUMBER(RECSTOPTIME)之后才会分别与001200003680101和20151113235012进行比较。如果创建基于TO_NUMBER的函数索引,它应该会走INDEX RANGE SCAN。

SQL> DROP INDEX NVS.ALARM_ACT_LOG_RESPID_IDX;

Index dropped.

SQL> create index nvs.ALARM_ACT_LOG_RESPID_IDX on nvs.alarm_act_log(to_number(respid));

Index created.

 

再次查看执行计划:
SQL>  SELECT ACTLOGID,ALARMLOGID,ALARMCODE,ACTDATE,RECSTARTTIME,RECSTOPTIME FROM ALARM_ACT_LOG WHERE RESPID=001200003680101 AND RECSTOPTIME>'20151113235012';

29 rows selected.

Elapsed: 00:00:00.01

Execution Plan
----------------------------------------------------------
Plan hash value: 2486932655

--------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name                     | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                          |   496 | 41168 |  2099   (0)| 00:00:26 |
|*  1 |  TABLE ACCESS BY INDEX ROWID| ALARM_ACT_LOG            |   496 | 41168 |  2099   (0)| 00:00:26 |
|*  2 |   
INDEX RANGE SCAN          | ALARM_ACT_LOG_RESPID_IDX |  3975 |       |     9   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(TO_NUMBER("RECSTOPTIME")>20151113235012)
   2 - access(TO_NUMBER("RESPID")=001200003680101)


Statistics
----------------------------------------------------------
         24  recursive calls
          0  db block gets
       
 620  consistent gets
          4  physical reads

          0  redo size
       2900  bytes sent via SQL*Net to client
        531  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         29  rows processed

可以看到逻辑读和物理读都大幅减少,执行时间也从3秒多降为0.01秒。当然也可以在RESPID和RECSTOPTIME上建立复合函数索引: create index nvs.ALARM_ACT_LOG_RESPID_STOPTIME_IDX on nvs.alarm_act_log(to_number(respid),to_number(recstoptime));

 

因为是平台SQL,不能直接修改,只能创建函数索引规避这个问题。其实函数索引对于增删改频繁的表有一定的副作用,因为对于增删改的数据的索引维护都要做一个函数操作,好在这个表增删改并不频繁。最好的办法就是将SQL改为:SELECT ACTLOGID,ALARMLOGID,ALARMCODE,ACTDATE,RECSTARTTIME,RECSTOPTIME FROM ALARM_ACT_LOG WHERE RESPID='001200003680101' AND RECSTOPTIME>'20151113235012';然后在RESPID上建立索引或在RESPID和RECSTOPTIME上建立复合索引。

 

而实际上,这个表上有一个复合索引,这个索引上的一个字段就是RESPID。如果平台的这个SQL不做隐式类型转换,写成:SELECT ACTLOGID,ALARMLOGID,ALARMCODE,ACTDATE,RECSTARTTIME,RECSTOPTIME FROM ALARM_ACT_LOG WHERE RESPID='001200003680101' AND RECSTOPTIME>20151113235012;

 

那么它的执行计划还是可以做INDEX SKIP SCAN:

SQL> SELECT ACTLOGID,ALARMLOGID,ALARMCODE,ACTDATE,RECSTARTTIME,RECSTOPTIME FROM ALARM_ACT_LOG WHERE RESPID='001200003680101' AND RECSTOPTIME>'20151113235012';

29 rows selected.

Elapsed: 00:00:00.01

Execution Plan
----------------------------------------------------------
Plan hash value: 3344032797

-------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name              | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                   |    49 |  3773 |   621   (0)| 00:00:08 |
|*  1 |  TABLE ACCESS BY INDEX ROWID| ALARM_ACT_LOG     |    49 |  3773 |   621   (0)| 00:00:08 |
|*  2 |   
INDEX SKIP SCAN           | ALARM_ACT_LOG_IDX |  1018 |       |    62   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(TO_NUMBER("RECSTOPTIME")>20151113235012)
   2 - access("RESPID"='001200003680101')
       filter("RESPID"='001200003680101')


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
     
   632  consistent gets
          0  physical reads

          0  redo size
       2888  bytes sent via SQL*Net to client
        531  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         29  rows processed

*************************

可以看到在INDEX SKIP SCAN时,SQL的性能也很好。那么这个因隐式数据类型转换引起的性能问题完全可以避免。

 

 

【版权声明】本文为华为云社区用户原创内容,转载时必须标注文章的来源(华为云社区)、文章链接、文章作者等基本信息, 否则作者和本社区有权追究责任。如果您发现本社区中有涉嫌抄袭的内容,欢迎发送邮件进行举报,并提供相关证据,一经查实,本社区将立刻删除涉嫌侵权内容,举报邮箱: cloudbbs@huaweicloud.com
  • 点赞
  • 收藏
  • 关注作者

评论(0

0/1000
抱歉,系统识别当前为高风险访问,暂不支持该操作

全部回复

上滑加载中

设置昵称

在此一键设置昵称,即可参与社区互动!

*长度不超过10个汉字或20个英文字符,设置后3个月内不可修改。

*长度不超过10个汉字或20个英文字符,设置后3个月内不可修改。