GaussDB(DWS)性能调优:多表关联情景下复杂过滤条件导致性能瓶颈案例分析

举报
Zawami 发表于 2023/12/13 18:49:09 2023/12/13
【摘要】 本篇通过一个简单案例,展示了过滤和关联在不同先后关系下的性能,分析了两者性能差异的根因,总结了一类性能调优的本质。

1、【问题描述】


多张行存表相关联,结果集相对最大表较小,但是慢。从执行计划看是表扫描慢。执行计划如下:

 id |                                          operation                                           |  A-time  | A-rows | E-rows | E-distinct | Peak Memory | A-width | E-width |  E-costs  
----+----------------------------------------------------------------------------------------------+----------+--------+--------+------------+-------------+---------+---------+-----------
  1 | ->  HashAggregate                                                                            | 4079.479 |     14 |     47 |            | 40KB        |         |      64 | 173342.44 
  2 |    ->  CTE Scan on p                                                                         | 4079.433 |    107 |     47 |            | 64KB        |         |      64 | 1.41      
  3 |    ->  Nested Loop (4,12)  [2, CTE p]                                                        | 4079.332 |    107 |     47 |            | 163KB       |         |     132 | 173336.96 
  4 |       ->  Nested Loop (5,11)                                                                 | 4077.058 |     14 |      5 |            | 32KB        |         |     132 | 172877.77 
  5 |          ->  Hash Join (6,7)                                                                 | 4077.004 |     14 |    198 |            | 32KB        |         |     104 | 171358.52 
  6 |             ->  Seq Scan on pg_catalog.pg_class c                                            | 4074.122 |   2692 | 523580 |            | 24667KB     |         |      72 | 169389.86 
  7 |             ->  Hash                                                                         | 0.050    |     14 |    100 |            | 312KB       |         |      64 | 2.00      
  8 |                ->  CTE Scan on t1                                                            | 0.040    |     14 |    100 |            | 56KB        |         |      64 | 2.00      
  9 |    ->  CTE Scan on t  [8, CTE t1]                                                            | 0.034    |     14 |    100 |            | 64KB        |         |      32 | 2.50      
 10 |    ->  Result  [9, CTE t]                                                                    | 0.011    |     14 |    100 |            | 33KB        |         |       0 | 0.51      
 11 |          ->  Index Scan using pg_namespace_oid_index on pg_catalog.pg_namespace n            | 0.033    |     14 |      1 |            | 40KB        |         |      68 | 7.66      
 12 |       ->  Index Only Scan using pg_attribute_relid_attnum_index on pg_catalog.pg_attribute a | 1.167    |    107 |    100 |            | 163KB       |         |       4 | 88.39     

2、【原始SQL】


explain performance
with
t as(
select unnest(array['bi_dashboard.dm_mss_cn_app_rep_ir_st_dw_v1_f',
'pg_toast.pg_toast_2964_index',
'pg_toast.pg_toast_2964',
'pg_toast.pg_toast_2396',
'pg_catalog.pgxc_node_oid_index',
'pg_catalog.pg_enum_typid_label_index',
'pg_catalog.pg_largeobject_metadata_oid_index',
'pg_catalog.pg_collation_name_enc_nsp_index',
'cbg_retail.rpt_tml_retail_complc_week_f',
'pg_catalog.pg_enum_oid_index',
'pg_catalog.pg_enum_typid_sortorder_index',
'pg_catalog.pg_collation_oid_index',
'dwrisc.dwd_plan_item_grossdemd_src_f_ext',
'pg_toast.pg_toast_2396_index']) as complemented -- 这些viewname即使用脚本获取的视图名称
),

t1 as (
select
split_part(complemented, '.', 1) as schemaname,
split_part(complemented, '.', 2) as viewname
from t
),

p as(
select obj_description(a.attrelid) as comment, (n.nspname||'.'||c.relname) as fullname
from 
t1
inner join pg_class c
on t1.viewname = c.relname
inner join pg_namespace n
on t1.schemaname = n.nspname
and c.relnamespace = n.oid
inner join pg_attribute a
on a.attrelid = c.oid
where a.attnum > 0 and comment is not null
)

SELECT DISTINCT ('COMMENT ON VIEW '||fullname||' IS '''||comment||''';') FROM p;

3、【性能分析】


从执行计划中可以看出,SQL的瓶颈在于第六层算子Seq Scan慢,而该算子慢的原因在于对整张表做了复杂过滤:

从计划中不难发现,在数据库中,SQL的执行顺序是先过滤再关联。obj_description(c.oid) IS NOT NULL这个过滤条件较复杂,同时注意到关联后的结果集非常小,如果先通过关联减少数据量,再来做这个过滤,是否可以快一些?

4、【改写SQL】


explain performance
with
t as(
select unnest(array['bi_dashboard.dm_mss_cn_app_rep_ir_st_dw_v1_f',
'pg_toast.pg_toast_2964_index',
'pg_toast.pg_toast_2964',
'pg_toast.pg_toast_2396',
'pg_catalog.pgxc_node_oid_index',
'pg_catalog.pg_enum_typid_label_index',
'pg_catalog.pg_largeobject_metadata_oid_index',
'pg_catalog.pg_collation_name_enc_nsp_index',
'cbg_retail.rpt_tml_retail_complc_week_f',
'pg_catalog.pg_enum_oid_index',
'pg_catalog.pg_enum_typid_sortorder_index',
'pg_catalog.pg_collation_oid_index',
'dwrisc.dwd_plan_item_grossdemd_src_f_ext',
'pg_toast.pg_toast_2396_index']) as complemented
),

t1 as (
select
split_part(complemented, '.', 1) as schemaname,
split_part(complemented, '.', 2) as viewname
from t
),

p as(
select obj_description(a.attrelid) as comment, (n.nspname||'.'||c.relname) as fullname
from 
t1
inner join pg_class c
on t1.viewname = c.relname
inner join pg_namespace n
on t1.schemaname = n.nspname
and c.relnamespace = n.oid
inner join pg_attribute a
on a.attrelid = c.oid
where a.attnum > 0
)

SELECT DISTINCT ('COMMENT ON VIEW '||fullname||' IS '''||comment||''';') FROM p where comment is not null;

改写后的SQL把comment is not null的过滤条件移到了外层查询。对应执行计划如下:

 id |                                          operation                                           | A-time  | A-rows | E-rows | E-distinct | Peak Memory | A-width | E-width | E-costs  
----+----------------------------------------------------------------------------------------------+---------+--------+--------+------------+-------------+---------+---------+----------
  1 | ->  HashAggregate                                                                            | 267.136 |     14 |    102 |            | 40KB        |         |      64 | 40796.81 
  2 |    ->  CTE Scan on p                                                                         | 267.054 |    107 |    102 |            | 64KB        |         |      64 | 3.08     
  3 |    ->  Nested Loop (4,13)  [2, CTE p]                                                        | 266.911 |    107 |    103 |            | 195KB       |         |     132 | 40788.43 
  4 |       ->  Hash Join (5,11)                                                                   | 265.063 |     14 |     11 |            | 32KB        |         |     132 | 39979.20 
  5 |          ->  Hash Join (6,7)                                                                 | 263.493 |     14 |    198 |            | 32KB        |         |     104 | 39820.62 
  6 |             ->  Seq Scan on pg_catalog.pg_class c                                            | 149.743 | 525067 | 525483 |            | 32KB        |         |      72 | 37844.83 
  7 |             ->  Hash                                                                         | 0.056   |     14 |    100 |            | 312KB       |         |      64 | 2.00     
  8 |                ->  CTE Scan on t1                                                            | 0.046   |     14 |    100 |            | 56KB        |         |      64 | 2.00     
  9 |    ->  CTE Scan on t  [8, CTE t1]                                                            | 0.038   |     14 |    100 |            | 64KB        |         |      32 | 2.50     
 10 |    ->  Result  [9, CTE t]                                                                    | 0.014   |     14 |    100 |            | 33KB        |         |       0 | 0.51     
 11 |          ->  Hash                                                                            | 1.376   |     35 |     33 |            | 320KB       |         |      68 | 156.33   
 12 |             ->  Seq Scan on pg_catalog.pg_namespace n                                        | 1.351   |     35 |     33 |            | 32KB        |         |      68 | 156.33   
 13 |       ->  Index Only Scan using pg_attribute_relid_attnum_index on pg_catalog.pg_attribute a | 0.132   |    107 |    101 |            | 40KB        |         |       4 | 70.12    

从计划中可以发现,Seq Scan算子的用时大幅减少了,而各层关联算子的执行时间并没有显著变慢。comment is not null的过滤移到了第2层算子里面,由于行数很少,耗时并不多。

5、【调优总结】


在本案例中,原SQL因为把复杂过滤条件放到where子句里,执行时在表扫描时做复杂过滤,导致性能瓶颈。把该复杂过滤条件移到外层查询中后,由于复杂过滤涉及行数较少,性能有了很大改善。

如果情况相反,存在非等值关联条件无法做hash join,而过滤条件简单呢?那就需要把过滤向子查询移动,提前过滤了。究竟要把过滤提前还是移后,取决于怎样才能快速地降低流经算子的元组个数。这同提前聚合的思想是异曲同工的。

【版权声明】本文为华为云社区用户原创内容,转载时必须标注文章的来源(华为云社区)、文章链接、文章作者等基本信息, 否则作者和本社区有权追究责任。如果您发现本社区中有涉嫌抄袭的内容,欢迎发送邮件进行举报,并提供相关证据,一经查实,本社区将立刻删除涉嫌侵权内容,举报邮箱: cloudbbs@huaweicloud.com
  • 点赞
  • 收藏
  • 关注作者

评论(0

0/1000
抱歉,系统识别当前为高风险访问,暂不支持该操作

全部回复

上滑加载中

设置昵称

在此一键设置昵称,即可参与社区互动!

*长度不超过10个汉字或20个英文字符,设置后3个月内不可修改。

*长度不超过10个汉字或20个英文字符,设置后3个月内不可修改。