- 微信
- 微博
  
  分享文章到微博
- 复制链接
  
  复制链接到剪贴板

实践系列-GaussDB(DWS)不下推语句总结

爱加班的叶凡发表于 2020/11/26 15:14:23 2020/11/26

【摘要】 MPPDB不支持下推的用法总结

说明：

文中customer表定义如下：

CREATE TABLE CUSTOMER

(C_CUSTKEY BIGINT NOT NULL

, C_NAME VARCHAR(25) NOT NULL

, C_ADDRESS VARCHAR(40) NOT NULL

, C_NATIONKEY INT NOT NULL

, C_PHONE CHAR(15) NOT NULL

, C_ACCTBAL DECIMAL(15,2) NOT NULL

, C_MKTSEGMENT CHAR(10) NOT NULL

, C_COMMENT VARCHAR(117) NOT NULL

) distribute by hash(C_CUSTKEY);

1 语法上不支持下推的查询：

1.1 Distinct不支持下推的场景

1）如果count(distinct expr)中的字段不支持重分布，则不支持下推

示例：

create table test_stream(a int,b float); //float不支持重分布

postgres=# explain select count(distinct b) from test_stream;

QUERY PLAN

-------------------------------------------------------------------------------------------------

Aggregate (cost=2.50..2.51 rows=1 width=8)

-> Data Node Scan on test_stream "_REMOTE_TABLE_QUERY_" (cost=0.00..0.00 rows=1000 width=8)

Node/s: All datanodes

(3 rows)

1.2 Count(distinct) + group by 如果group by的字段不支持重分布，则不支持下推

示例：

create table test_stream(a int,b float);

postgres=# explain select count(distinct(a)) from test_stream group by b order by b;

QUERY PLAN

--------------------------------------------------------------------------------------------------------

GroupAggregate (cost=49.83..57.43 rows=10 width=12)

Group By Key: test_stream.b

-> Sort (cost=49.83..52.33 rows=1000 width=12)

Sort Key: test_stream.b

-> Data Node Scan on test_stream "_REMOTE_TABLE_QUERY_" (cost=0.00..0.00 rows=1000 width=12)

Node/s: All datanodes

(6 rows)

IsTypeDistributable:不支持的重分布类型

1.3 不支持distinct on用法下推：

示例：

postgres=# explain select distinct on (c_custkey) c_custkey from customer;

QUERY PLAN

----------------------------------------------------------------------------------------------------

Unique (cost=49.83..54.83 rows=10 width=8)

-> Sort (cost=49.83..52.33 rows=1000 width=8)

Sort Key: customer.c_custkey

-> Data Node Scan on customer "_REMOTE_TABLE_QUERY_" (cost=0.00..0.00 rows=1000 width=8)

Node/s: All datanodes

(5 rows)

1.4 不支持聚集函数中使用order by语句的下推：

示例：

postgres=# explain select count(c_custkey order by c_custkey) from customer;

QUERY PLAN

----------------------------------------------------------------------------------------------

Aggregate (cost=2.50..2.51 rows=1 width=4)

-> Data Node Scan on customer "_REMOTE_TABLE_QUERY_" (cost=0.00..0.00 rows=1000 width=4)

Node/s: All datanodes

(3 rows)

1.5 不支持范围表查询的下推,即使用 WITH RECURSIVE

示例：

postgres=# explain WITH RECURSIVE customer_recursive(c_name, c_nationkey) AS (

SELECT 1, c_nationkey FROM customer WHERE c_custkey = 1)

SELECT * FROM customer_recursive;

QUERY PLAN

------------------------------------------------------------------------------------------------

CTE Scan on customer_recursive (cost=0.00..20.00 rows=1000 width=8)

CTE customer_recursive

-> Data Node Scan on customer "_REMOTE_TABLE_QUERY_" (cost=0.00..0.00 rows=1000 width=4)

Node/s: datanode1

(4 rows)

1.6 不支持returning语句下推:

示例：

postgres=# explain update customer set c_name = 'a' returning c_name;

QUERY PLAN

------------------------------------------------------------------------------------------------

Update on customer (cost=0.00..0.00 rows=1000 width=498)

Node/s: All datanodes

Node expr: c_custkey

-> Data Node Scan on customer "_REMOTE_TABLE_QUERY_" (cost=0.00..0.00 rows=1000 width=498)

Node/s: All datanodes

(5 rows)

1.7 无from list也不支持下推

示例：

select cd_education_status, min(cd_gender), cd_dep_employed_count,grouping(cd_dep_employed_count)

from customer_demographics

group by 1 ,cube(1,3),cd_dep_employed_count >all (select distinct length(cd_demo_sk)

where cd_purchase_estimate = 500 and cd_education_status like '%da%')

having(case when length(trim(both 'e'

from cd_education_status)) > 8 then 'YES' when length(trim(both 'e'

from cd_education_status)) < 8 then 'NO' end) = 'YES'

and cd_dep_employed_count not in (2,4,5)

order by 1, 2, 3,4;

1.8 with recursive不下推场景

（1） enable_stream_recursive为off，with recursive语法不下推

（2） With-Recursive中涉及的基表在多nodegroup中不下推

（3） with recursive语法其他不支持下推的场景

序号	场景	不下推原因
1	WITH recursive t_result AS ( SELECT concat(test_rec.dm,test_rec.sj_dm),* FROM test_rec WHERE dm = '3' UNION ALL SELECT concat(t1.dm,t1.sj_dm),t2.* FROM t_result t1 JOIN test_rec t2 ON t2.dm = t1.sj_dm ) SELECT * FROM t_result;	带有不下推函数
2	explain (costs off, nodes off) with recursive rq as ( select id, name from chinamap where id = 11 union select origin.id, rq.name \|\| ' > ' \|\| origin.name from rq join chinamap origin on origin.pid = rq.id ) select id, name from rq order by 1; QUERY PLAN ----------------------------------------------------------------------- Sort Sort Key: rq.id CTE rq -> Recursive Union -> Data Node Scan on chinamap "_REMOTE_TABLE_QUERY_" -> Hash Join Hash Cond: (origin.pid = rq.id) -> Data Node Scan on chinamap "_REMOTE_TABLE_QUERY_" -> Hash -> WorkTable Scan on rq -> CTE Scan on rq (11 rows)	With-Recursive不包含ALL
2	with recursive all_path(id, neighbor, path) as ( SELECT id, neighbor,ARRAY[id] FROM graph WHERE id = 'A' union all SELECT all_path.id, graph.neighbor,ARRAY[graph.id] FROM graph JOIN all_path ON all_path.neighbor = graph.id ) select * from all_path;	数组操作
3	with recursive cte as (select area_code, area_code as chain, 1 as level /chain为路径,例如>101>201>203;level为层级，从1开始/ from gcms.gcm_mag_area_h where area_code='100000' union all select h.area_code,cast(cte.chain\|\|'>'\|\|h.area_code as varchar2(30)),cte.level+1 from cte join gcms.gcm_mag_area_h h on h.belong_area_code=cte.area_code and cte.level <3 and h.area_code not in (select regexp_split_to_table(cte.chain,'>') )/如果是循环的结构，可以用chain来退出循环；此处cte.level<6表示到第6层停止/ ) select * from cte;	1.subquery里面包含不带fromlist 2.包含相关子查询
4	select dm, ( with recursive cte as ( select distinct f.dm, f.sj_dm, f.name from test_rec_1 f where f.sj_dm=t1.sj_dm union all select distinct f.dm, f.sj_dm, f.name from test_rec_1 f join cte on f.dm=cte.sj_dm ) select distinct dm from cte where sj_dm=7 ) dm_2, sj_dm, name from test_rec t1;	相关子查询
5	WITH recursive t_result AS ( SELECT dm,sj_dm,name,1 as level FROM test_rec_part WHERE sj_dm < 10 UNION all SELECT t2.dm,t2.sj_dm,t2.name\|\|' > '\|\|t1.name,t1.level+1 FROM t_result t1 JOIN test_rec_part t2 ON 1=2 ) SELECT * FROM t_result t;	join条件恒不成立
6	WITH recursive t_result AS ( select * from( SELECT dm,sj_dm,name,1 as level FROM test_rec_part WHERE sj_dm < 10 limit 6 offset 2) UNION all SELECT t2.dm,t2.sj_dm,t2.name\|\|' > '\|\|t1.name,t1.level+1 FROM t_result t1 JOIN test_rec_part t2 ON t2.sj_dm = t1.dm ) SELECT * FROM t_result t;	non recursive term包含limit offset
7	WITH recursive t_result AS ( SELECT dm,sj_dm,name,1 as level FROM test_rec_part WHERE sj_dm < 10 UNION all SELECT t2.dm,t2.sj_dm,t2.name\|\|' > '\|\|t1.name,t1.level+1 FROM t_result t1 JOIN test_rec_part t2 ON t2.sj_dm = t1.dm and t1.sj_dm>t2.dm ) SELECT * FROM t_result t where dm < 5 union all SELECT * FROM t_result t where dm > 5;	output阶段多次调用cte
8	with recursive t_result1 as ( select id,pid,name as name,1 as level from test_date where pid='2018-12-31 23:59:54' union all select t2.id,t2.pid,t2.name\|\|' > '\|\|t1.name, t1.level+1 from t_result1 t1 join test_date t2 on t1.id=t2.pid ),t_result2 as ( select id,pid,name,1 as level from test_date where id='2018-12-31 23:59:54' union all select t2.id,t2.pid,t1.name\|\|' > '\|\|t2.name, t1.level+1 from t_result2 t1 join t_result1 t2 on t1.pid=t2.id ) select * from t_result1 join t_result2 on t_result1.id<t_result2.id;	并列多个递归cte，并且有相互引用（即：存在CTE嵌套）
9	WITH recursive t_result AS ( SELECT max(dm) as dm,max(sj_dm) as sj_dm,max(name) as name FROM test_rec WHERE dm = '3' UNION ALL SELECT t2.dm,t2.sj_dm,t2.name FROM t_result t1 JOIN test_rec_part t2 ON t2.dm = t1.sj_dm ) SELECT * FROM t_result;	agg结果做递归条件
10	WITH RECURSIVE migora_cte AS( --non recursive term with b as ( select * FROM dams_branch b WHERE b.stru_sname LIKE '%' \|\| 'a' \|\| '%' AND b.stru_state IN ('1', '2') AND b.stru_sign <> '7') SELECT stru_id ,stru_sname ,stru_lv ,stru_sort ,b.stru_id::text AS MIG_ROW_ALIAS1 ,b.sort AS MIG_ROW_ALIAS2 FROM dams_branch b WHERE b.stru_id = 100 AND b.stru_state IN ( '1' ,'2' ) UNION ALL --recursive term SELECT b.stru_id ,b.stru_sname ,b.stru_lv ,b.stru_sort ,mig_ora_cte_tab_alias.MIG_ROW_ALIAS1 \|\| '*' \|\| b.stru_id AS MIG_ROW_ALIAS1 ,b.sort AS MIG_ROW_ALIAS2 FROM migora_cte mig_ora_cte_tab_alias INNER JOIN dams_branch b ON mig_ora_cte_tab_alias.stru_id = b.sup_stru AND b.stru_state IN ( '1' ,'2' ) ) SELECT stru_id ,stru_sname FROM migora_cte b ORDER BY sign( instr( b.MIG_ROW_ALIAS1 ,'a' ) ) DESC ,b.stru_lv ,b.stru_sort DESC NULLS LAST ,MIG_ROW_ALIAS2 DESC , nlssort(stru_sname, 'NLS_SORT=SCHINESE_PINYIN_M') ;	order by nlssort

1.9 表在目标列中不支持下推

postgres=# explain select customer from customer;

QUERY PLAN

----------------------------------------------------------------------------------------

Data Node Scan on customer "_REMOTE_TABLE_QUERY_" (cost=0.00..0.00 rows=40 width=580)

Node/s: All datanodes

(2 rows)

2 表对象不支持下推：

Roundrobin、modulo

2.1 对于普通表，不支持RROBIN 、MODULO两种分布的查询下推

示例：

postgres=# create by table test(a int) distribute roundrobin;

CREATE TABLE

postgres=# explain select * from test;

QUERY PLAN

------------------------------------------------------------------------------------

Data Node Scan on test "_REMOTE_TABLE_QUERY_" (cost=0.00..0.00 rows=1000 width=4)

Node/s: All datanodes

(2 rows)

postgres=# create table test(a int) distribute by modulo(a);

CREATE TABLE

postgres=# explain select * from test;

QUERY PLAN

------------------------------------------------------------------------------------

Data Node Scan on test "_REMOTE_TABLE_QUERY_" (cost=0.00..0.00 rows=1000 width=4)

Node/s: All datanodes

3 函数不支持下推：

3.1 不支持下推的聚集函数：

array_agg、string_agg、xmlagg、every、curval、setval、lastval、nextval、pg_backend_pid、pg_stat_get_backend_pid

3.2 不支持下推的普通函数：

注：查询函数属于那种类型使用：select provolatile from pg_proc where proname='funcname';

示例：

postgres=# select provolatile from pg_proc where proname='lastval';

provolatile

-------------

(1 row)

其中：v 代表易变的，即VOLATILE；s代表稳定的，即stable；i代表不变的，即IMMUTABLE

3.2.1 对于IMMUTABLE的函数，都是支持下推的，除了every 函数

示例：

postgres=# select provolatile from pg_proc where proname='int8um';

provolatile

-------------

(1 row)

postgres=# explain select int8um(C_CUSTKEY) from CUSTOMER;

QUERY PLAN

----------------------------------------------------------------

Streaming (type: GATHER) (cost=0.00..40.50 rows=40 width=8)

Node/s: All datanodes

-> Seq Scan on customer (cost=0.00..10.12 rows=10 width=8)

(3 rows)

3.2.2 对于易变的（VOLATILE）函数,都是不支持下推的，函数的返回值为RECORD类型的不支持下推

示例：

函数的返回值为RECORD类型的不支持下推

CREATE OR REPLACE FUNCTION test_proc( i integer,j out integer)

RETURN record

result int;

BEGIN

RETURN NULL;

END;

postgres=# explain select test_proc(1) from customer;

QUERY PLAN

------------------------------------------------------------------------------------------

Data Node Scan on customer "_REMOTE_TABLE_QUERY_" (cost=0.00..250.00 rows=1000 width=0)

Node/s: All datanodes

(2 rows)

易变函数不支持下推：

示例：

postgres=# select provolatile from pg_proc where proname='random';

provolatile

-------------

(1 row)

postgres=# explain select random() from CUSTOMER;

QUERY PLAN

----------------------------------------------------------------------------------------

Data Node Scan on customer "_REMOTE_TABLE_QUERY_" (cost=0.00..2.50 rows=1000 width=0)

Node/s: All datanodes

(2 rows)

3.2.3 对于稳定的（stable）的函数，只有部分是支持下推的；

支持的函数列表如下：

to_date（两个参数）

to_date(一个参数)

pg_column_size

abstimein,

abstimeout

reltimein,

reltimeout,

tintervalin,

tintervalout,

cash_in,

cash_out,

numeric(cash),

money(numeric),

money(int2),

money(int8),

date_in,

date_out,

time_in,

time_out,

timestamp_in,

timestamp_out,

smalldatetime_in,

timestamp_out,

timestamptz_out,

smalldatetime_out,

interval_in,

date_part(text,timestamptz),

timestamptz(date),

timestamptz(date,time),

date(timestamptz),

date(abstime),

timestamptz_pl_interval,

timestamptz_mi_interval,

date_trunc(text,timestamptz),

timetz_in,

time(abstime),

date_part(text,abstime),

date_part(text,reltime),

timetz(timestamptz),

to_timestamp(text,text),

time(timestamptz),

timestamp(abstime),

timestamp(timestamptz),

timestamptz(timestamp),

abstime(timestamp),

timetz(time),

date_lt_timestamptz,

date_le_timestamptz,

date_eq_timestamptz,

date_gt_timestamptz,

date_ge_timestamptz,

date_ne_timestamptz,

date_cmp_timestamptz,

timestamptz_lt_date,

timestamptz_le_date,

timestamptz_eq_date,

timestamptz_gt_date,

timestamptz_ge_date,

timestamptz_ne_date,

timestamptz_cmp_date,

timestamp_lt_timestamptz,

timestamp_le_timestamptz,

timestamp_eq_timestamptz,

timestamp_gt_timestamptz,

timestamp_ge_timestamptz,

timestamp_ne_timestamptz,

timestamp_cmp_timestamptz,

timestamptz_lt_timestamp,

timestamptz_le_timestamp,

timestamptz_eq_timestamp,

timestamptz_gt_timestamp,

timestamptz_ge_timestamp,

timestamptz_ne_timestamp,

timestamptz_cmp_timestamp,

interval_pl_timestamptz

3.3 not shippable关键字不下推

CREATE FUNCTION shipping_func_time_02() RETURNS timestamptz

AS 'select clock_timestamp();'

LANGUAGE SQL

stable not shippable

RETURNS NULL ON NULL INPUT;

explain (costs off) select shipping_func_time_02(),t5.c from t4, t5 where t4.b=t5.a order by 1 limit 3;

QUERY PLAN

---------------------------------------------------------------

Limit

-> Hash Join

Hash Cond: (t4.b = t5.a)

-> Data Node Scan on t4 "_REMOTE_TABLE_QUERY_"

Node/s: All datanodes

-> Hash

-> Data Node Scan on t5 "_REMOTE_TABLE_QUERY_"

Node/s: All datanodes

(8 rows)

/* 修改函数为下推函数 */

alter function shipping_func_time_02() shippable;

/* 查看函数定义是否下推 */

select proname,proshippable from pg_proc where proname='shipping_func_time_02' or proname='shipping_func_time_01' or proname='shipping_func_time_03' order by 1;

3.4 current_user / user不下推

postgres=# explain select * from customer where c_name=current_user;

QUERY PLAN

---------------------------------------------------------------------------------------

Data Node Scan on customer "_REMOTE_TABLE_QUERY_" (cost=0.00..0.00 rows=1 width=556)

Node/s: All datanodes

(2 rows)

SQL语句中存在current_user或user写法时不能下推，需要替换为可下推语法。方法如下：

在库中创建如下函数current_user_select()，并将业务中使用的current_user替换为current_user_select即可实现下推。

CREATE OR REPLACE FUNCTION public.current_user_select()

RETURNS name

LANGUAGE internal

immutable STRICT

AS $function$current_user$function$;

示例如下：

原语句	select current_user from emp;
原语句	select current_user_select() from emp;

3.5 sysdate / current_date / SYSTIMESTAMP不下推

SQL语句中存在sysdate或current_date写法时不能下推，同时，mpp不支持SYSTIMESTAMP，需要使用current_timetamp(0)替换。方法如下：

原语句	select sysdate from emp;
新语句	select current_timestamp(0) from emp;

原语句	select current_date from emp;
新语句	select current_timestamp(0) from emp;

原语句	select SYSTIMESTAMP from emp;
新语句	select current_timestamp(0) from emp;

注：必须带有参数才能下推，即使用current_timestamp(0)。current_timestamp带有时间及时区信息，如果不需要，可以使用substr函数截取。

点赞
收藏
关注作者

0/1000

抱歉，系统识别当前为高风险访问，暂不支持该操作

全部回复

上滑加载中

设置昵称

在此一键设置昵称，即可参与社区互动！

*长度不超过10个汉字或20个英文字符，设置后3个月内不可修改。

确认取消

实践系列-GaussDB(DWS)不下推语句总结

1 语法上不支持下推的查询：

1.1 Distinct不支持下推的场景

1.2 Count(distinct) + group by 如果group by的字段不支持重分布，则不支持下推

1.3 不支持distinct on用法下推：

1.4 不支持聚集函数中使用order by语句的下推：

1.5 不支持范围表查询的下推,即使用 WITH RECURSIVE

1.6 不支持returning语句下推:

1.7 无from list也不支持下推

1.8 with recursive不下推场景

1.9 表在目标列中不支持下推

2 表对象不支持下推：

Roundrobin、modulo

2.1 对于普通表，不支持RROBIN 、MODULO两种分布的查询下推

3 函数不支持下推：

3.1 不支持下推的聚集函数：

3.2 不支持下推的普通函数：

3.2.1 对于IMMUTABLE的函数，都是支持下推的，除了every 函数

3.2.2 对于易变的（VOLATILE）函数,都是不支持下推的，函数的返回值为RECORD类型的不支持下推

3.2.3 对于稳定的（stable）的函数，只有部分是支持下推的；

3.3 not shippable关键字不下推

3.4 current_user / user不下推

3.5 sysdate / current_date / SYSTIMESTAMP不下推

全部回复

设置昵称

关于作者

目录

加入云驻计划，成为创作者

实践系列-GaussDB(DWS)不下推语句总结

1 语法上不支持下推的查询：

1.1 Distinct不支持下推的场景

1.2 Count(distinct) + group by 如果group by的字段不支持重分布，则不支持下推

1.3 不支持distinct on用法下推：

1.4 不支持聚集函数中使用order by语句的下推：

1.5 不支持范围表查询的下推,即使用 WITH RECURSIVE

1.6 不支持returning语句下推:

1.7 无from list也不支持下推

1.8 with recursive不下推场景

1.9 表在目标列中不支持下推

2 表对象不支持下推：

Roundrobin、modulo

2.1 对于普通表，不支持RROBIN 、MODULO两种分布的查询下推

3 函数不支持下推：

3.1 不支持下推的聚集函数：

3.2 不支持下推的普通函数：

3.2.1 对于IMMUTABLE的函数，都是支持下推的，除了every 函数

3.2.2 对于易变的（VOLATILE）函数,都是不支持下推的，函数的返回值为RECORD类型的不支持下推

3.2.3 对于稳定的（stable）的函数，只有部分是支持下推的；

3.3 not shippable关键字不下推

3.4 current_user / user不下推

3.5 sysdate / current_date / SYSTIMESTAMP不下推

全部回复

设置昵称

关于作者

目录

加入云驻计划，成为创作者

推荐阅读

相关产品