GaussDB(DWS)性能调优:实时区数十亿级数据量关联下盘过大优化案例
1、【问题描述】
财经实时区环境月结期间集群压力过大,大量业务出现数十亿级表关联的情况,下盘较为严重,影响业务调度性能。
后续解决方案是:采用调整下盘限制参数的方法,设置下盘量超过30G就杀掉会话。
引起的后果:下盘大的SQL会一直失败。报错内容:canceling statement due to workload manager exception.
DETAIL: except rule [s_878ec2a0_spillsize] is meet condition:rule [spillsize] is over limit, current value is:30086.
2、【原始SQL】
SELECT
aa.tpye
,aa.s3
,aa.sa
--,aa.sa2 废弃
,aa.cont
,substr(aa.cont,5,14) contract
,aa.trx_number
,aa.trx_date
,aa.cmmt
--,aa.dis 废弃
,aa.inv_type
,aa.sour
,aa.invoice_currency_code
,sum(aa.amount) amount
,sum(aa.fun_amount) fun_amount
,aa.customer_name
,aa.comp
,aa.reg
,aa.acct
,aa.prd
,aa.ic
--,aa.org_con 废弃
,aa.gl_date
--,aa.ship_date_actual 废弃
--,aa.orde 废弃
,aa.cust
,aa.creation_date
FROM
(
SELECT
/*+ rows (rctl # 500000000) */
'MAR' as tpye
,'' as s3
,rct.salesperson_name as sa
,rct.contract_number as cont
,rct.invoice_number as trx_number
,rct.invoice_date as trx_date
,rct.comments as cmmt
,tt.invoice_type_name as inv_type
,rct.source_type as sour
,rct.currency_code as invoice_currency_code
,rctlgd.distribution_amount as amount
,rctlgd.functional_distribution_amount as fun_amount
,hp.bp_name as customer_name
,rct.unit_code as comp
,mah.coa_geo_pc_code as reg
,mah.coa_account_code as acct
,mah.coa_product_code as prd
,mah.coa_ic_code as ic
,to_char(rct.gl_date,'DD-MON-YY') as gl_date
,hca.account_number as cust
,to_char(rct.creation_date,'DD-MON-YY') as creation_date
FROM SDIFIN.OGG_HAR_INVOICE_HEADER_8660_VI rct
,SDIFIN.OGG_HAR_INVOICE_LINE_8660_VI rctl
,SDIFIN.OGG_HAR_INVOICE_DIS_8660_VI rctlgd
,SDIFIN.OGG_HAH_AE_LINE_LINK_8663_VI lk -- SDI.ogg_hah_ae_line_link_t_8663 lk
,SDIFIN.OGG_HAH_AE_LINE_8663_VI mah -- SDI.ogg_hah_ae_line_t_8663 l
,SDIIT.OGG_HAC_MD_ENTITY_MANAGET_7050 hca
,SDICOMMON.OGG_HAC_MD_LEGAL_ENTITY_T_7050 hp
,SDIFIN.OGG_HAR_INVOICE_TYPE_T_8660 tt
WHERE 1 = 1
--AND rct.delete_flag<>'Y'
and NVL(rct.DELETE_FLAG,'N') = 'N'
--and ltrim(rtrim(rct.delete_flag)) <>'Y'
--AND rctl.delete_flag<>'Y'
and NVL(rctl.DELETE_FLAG,'N') = 'N'
--and ltrim(rtrim(rctl.delete_flag)) <>'Y'
AND rct.invoice_header_id = rctl.invoice_header_id
AND rctl.invoice_line_id = rctlgd.source_reference_id
AND rct.account_number = hca.account_number
AND hca.bp_id = hp.bp_id
AND rct.invoice_type_id = tt.invoice_type_id
AND rct.unit_code = tt.unit_code
AND tt.invoice_type_name NOT LIKE '%ACC'
AND tt.invoice_type_name NOT LIKE '%ACC(ACI)'
AND to_char(rctlgd.INVOICE_DISTRIBUTION_ID) = lk.source_distribution_id_char_1
AND mah.ae_line_num = lk.ae_line_num
AND mah.ae_header_id = lk.ae_header_id
AND ( rct.UNIT_CODE IN ('0021') ) --rct.UNIT_CODE
AND rct.gl_date between '2024-03-01 00:00:00' and '2024-03-31 23:59:59' --rct.gl_date
AND 1=1 --rct.contract_number
AND ( mah.coa_account_code IN ('2610000','2620000','2680100','4010000','4040000','4070000','9990455') ) --mah.coa_account_code
UNION ALL
SELECT
'HRC' as tpye
,h.attribute4 as s3
,h.salesperson_name as sa
,h.contract_number as cont
,'' as trx_number
,l.transaction_date as trx_date
,'' as cmmt
,l.account_business_type_code as inv_type
,'' as sour
,dis.transaction_currency_code as invoice_currency_code
,dis.transaction_amount as amount
,dis.functional_amount as fun_amount
,hp.bp_name as customer_name
,h.company_code as comp
,mah.coa_geo_pc_code as reg
,mah.coa_account_code as acct
,mah.coa_product_code as prd
,mah.coa_ic_code as ic
,to_char(dis.gl_date,'DD-MON-YY') as gl_date
,hca.account_number as cust
,to_char(h.creation_date,'DD-MON-YY') as creation_date
FROM SDIFIN.OGG_RCR_REV_COST_ACC_H_8659_VI h/*ar.ra_customer_trx_all*/
,SDIFIN.OGG_RCR_REV_COST_ACC_L_8659_VI l/*ar.ra_customer_trx_lines_all*/
,SDIFIN.OGG_RCR_REV_COST_DIS_8659_VI dis/*ar.ra_cust_trx_line_gl_dist_all*/
-- ,SDI.OGG_RCR_REV_COST_EVENT_8659_VI event/*增加*/
,SDIIT.OGG_HAC_MD_ENTITY_MANAGET_7050 hca/*ar.hz_cust_accounts*/
,SDICOMMON.OGG_HAC_MD_LEGAL_ENTITY_T_7050 hp/*ar.hz_parties*/
,SDIFIN.OGG_HAH_AE_LINE_LINK_8663_VI lk -- SDI.ogg_hah_ae_line_link_t_8663 lk
,SDIFIN.OGG_HAH_AE_LINE_8663_VI mah -- SDI.ogg_hah_ae_line_t_8663 l
WHERE 1 = 1
and h.rev_cost_header_id = l.rev_cost_header_id
and l.rev_cost_line_id = dis.rev_cost_line_id
-- and dis.rev_cost_event_id = event.rev_cost_event_id
-- and event.accounting_status = '6'
-- and event.event_status_code = '5'
AND h.account_number = hca.account_number
AND hca.bp_id = hp.bp_id
AND lk.source_distribution_id_char_1 = to_char(dis.rev_cost_distribution_id)
AND mah.ae_line_num = lk.ae_line_num
AND mah.ae_header_id = lk.ae_header_id
AND ( h.COMPANY_CODE IN ('0021') ) --h.COMPANY_CODE
AND dis.gl_date between '2024-03-01 00:00:00' and '2024-03-31 23:59:59' --dis.gl_date
AND 1=1 --h.contract_number
AND ( mah.coa_account_code IN ('2610000','2620000','2680100','4010000','4040000','4070000','9990455') ) --mah.coa_account_code
) aa
GROUP BY aa.tpye
,aa.s3
,aa.sa
,aa.cont
,aa.trx_number
,aa.trx_date
,aa.cmmt
,aa.inv_type
,aa.sour
,aa.invoice_currency_code
,aa.customer_name
,aa.comp
,aa.reg
,aa.acct
,aa.prd
,aa.ic
,aa.gl_date
,aa.cust
,aa.creation_date
HAVING (sum(aa.amount) = 0 and sum(aa.fun_amount) <> 0)
or sum(aa.amount) <> 0
3、【性能分析】
上图中可以看出,SQL执行时间达到了1200s+,lk子查询数据量达到了118亿,表扫描时间达到了300s+,同时存在非常严重的下盘。但是关联后数据量只有300万,与业务证实后确认的确不需要扫描所有数据(即可以利用某个字段进行提前过滤,降低关联数据量,减少下盘)。
例如本案例中,与业务求证后利用了lk中的application_code字段提前过滤,优化后SQL如下所示(取重点部分):
with tmp_mah as (
select mah.coa_geo_pc_code
,mah.coa_account_code
,mah.coa_product_code
,mah.coa_ic_code
,mah.ae_line_num
,mah.ae_header_id
,lk.source_distribution_id_char_1
,LK.application_code
from SDIFIN.OGG_HAH_AE_LINE_8663_VI mah,
SDIFIN.OGG_HAH_AE_LINE_LINK_8663_VI lk
where ( mah.coa_account_code IN ('2610000','2620000','2680100','4010000','4040000','4070000','9990455') )
AND mah.ae_line_num = lk.ae_line_num
AND mah.ae_header_id = lk.ae_header_id
AND mah.coa_company_code = '0021'
AND LK.application_code IN ('AR','RMC')
)
SELECT ...
FROM SDIFIN.OGG_HAR_INVOICE_HEADER_8660_VI rct
,SDIFIN.OGG_HAR_INVOICE_LINE_8660_VI rctl
,SDIFIN.OGG_HAR_INVOICE_DIS_8660_VI rctlgd
,tmp_mah mah
,SDIIT.OGG_HAC_MD_ENTITY_MANAGET_7050 hca
,SDICOMMON.OGG_HAC_MD_LEGAL_ENTITY_T_7050 hp
,SDIFIN.OGG_HAR_INVOICE_TYPE_T_8660 tt
WHERE 1 = 1
...
AND mah.application_code='AR'
UNION ALL
SELECT ...
WHERE 1 = 1
...
AND mah.application_code='RMC'
优化后,SQL执行时间降低为375s+,相比优化前性能提升了接近三倍,从执行计划中可以看出,利用lk字段提前过滤,减少了表扫描以及关联数据量,提高了SQL的执行效率。
4、【总结】
当前实时区存在较多大数据量的表或者视图,业务在使用时如果不进行提前过滤,就会导致无效数据过多,影响调度效率;而且,在月结或者年结这种高负载时间段,大量此类SQL并发执行,也会严重影响集群性能。因此,需要业务针对这种表或者视图加以提前过滤进行优化。
- 点赞
- 收藏
- 关注作者
评论(0)