GaussDB(DWS)迁移 - oracle兼容 -- CONNECT BY迁移

举报
譡里个檔 发表于 2023/05/17 15:24:08 2023/05/17
【摘要】 提供一种DWS迁移Oracle的CONNECT BY语法的方案

CONNECT BY是Oracle实现递归处理的SQL写法,DWS不支持CONNECT BY语法,但是有类似的WITH RECURSIVE语法可以实现递归逻辑,下面通过几个case描述CONNECT BY迁移到DWS的改写语法。

注意:

  1. case4中涉及connect_by_root、sys_connect_by_path、伪劣level相关逻辑
  2. case4中涉及CONNECT BY前有WHERE条件的场景,此场景及其特殊,请详细查看解析信息


0. 预置对象定义

DROP TABLE test_cb;

CREATE TABLE test_cb(id int,parent_id int);
INSERT INTO test_cb VALUES(0,'');
INSERT INTO test_cb VALUES(1,0);
INSERT INTO test_cb VALUES(11,1);
INSERT INTO test_cb VALUES(12,1);
INSERT INTO test_cb VALUES(111,11);
INSERT INTO test_cb VALUES(2,0);
INSERT INTO test_cb VALUES(21,2);
INSERT INTO test_cb VALUES(22,2);
INSERT INTO test_cb VALUES(222,22);
INSERT INTO test_cb VALUES(211,21);
INSERT INTO test_cb VALUES(212,21);


1. CONNECT BY PRIOR

PRIOR在等值递归条件左侧,等值条件左侧的作为驱动递归的条件

1.1 Oracle原始语句

SELECT
    id, parent_id,level lvl 
FROM test_cb 
CONNECT BY PRIOR id=parent_id 
START WITH parent_id IS NULL 
ORDER BY id;


1.2 DWS改写语句

WITH RECURSIVE tmp_cb AS(
    SELECT 
        id,parent_id,1 AS level 
    FROM test_cb
    WHERE parent_id IS NULL

    UNION ALL

    SELECT 
        a.id, a.parent_id, b.level+1 AS level 
    FROM test_cb a,tmp_cb b 
    WHERE a.parent_id=b.id
)
SELECT * 
FROM tmp_cb
ORDER BY id;


2. CONNECT BY .. PRIOR

PRIOR在等值递归条件右侧,等值条件右侧的作为驱动递归的条件。

2.1 oracle原始语句

SELECT
    id, parent_id,level lvl 
FROM test_cb 
CONNECT BY id = PRIOR parent_id 
START WITH parent_id =22 
ORDER BY id;

2.2 DWS改写语句

WITH RECURSIVE tmp_cb AS(
    SELECT 
        id,parent_id,1 AS level 
    FROM test_cb
    WHERE parent_id = 22

    UNION ALL

    SELECT 
        a.id, a.parent_id, b.level+1 AS level 
    FROM test_cb a,tmp_cb b 
    WHERE a.id = b.parent_id
)
SELECT * 
FROM tmp_cb
ORDER BY id;



3. PRIOR DBMS_RANDOM.VALUE

3.1 oracle原始语句

-- 因为自身与自身递归,CONNECT BY PRIOR ID=ID会报循环错误,因此为了欺骗ORACLE,我每次递归的条件是
-- 没有循环的,增加PRIOR DBMS_RANDOM.VALUE IS NOT NULL,如下:
WITH t AS (
  SELECT 1 id,5 times FROM dual 
  UNION ALL
  SELECT 2,3 FROM dual
 )
SELECT id, level FROM t
CONNECT BY PRIOR id=id AND LEVEL<=times AND PRIOR DBMS_RANDOM.VALUE IS NOT NULL
ORDER BY id, level;

3.2 DWS改写语句

WITH RECURSIVE t_recur AS(
    WITH t AS (
        SELECT 1 id,5 times FROM dual 
        UNION ALL
        SELECT 2,3 FROM dual
    )

    SELECT id, times, 1 AS level FROM t
    UNION ALL
    SELECT
       r.id,
       r.times,
       (r.level + 1) AS level
    FROM t_recur r
    INNER JOIN t ON r.id = t.id AND r.level + 1 <= r.times
)

SELECT id, level from t_recur
ORDER BY id, level;

4. CONNECT BY NOCYCLE

在递归语法中,单递归条件成环时,执行会陷入死循环,导致验证的执行问题。针对这种场景,在Oracle语法中当指定了NOCYCLE关键字时,可以自动识别这种场景打破这种循环。与此相关的还有一个伪劣CONNECT_BY_ISCYCLE,当伪劣CONNECT_BY_ISCYCLE值为1时,表示当前节点再往下递归就会触发死循环。

INSERT INTO test_cb VALUES(0, 211);
INSERT INTO test_cb VALUES(2, 212);

4.1 oracle原始语句

SELECT
    connect_by_root(id) id_root,
    id,parent_id,
    sys_connect_by_path(i.id, '->') as id_path,
    sys_connect_by_path(i.parent_id, '->') as parent_id_path,  
    level AS instrument_level
FROM test_cb i
WHERE id >= 0 AND parent_id >= 0
CONNECT BY nocycle PRIOR id = parent_id
START WITH id = 2 AND parent_id = 0
ORDER BY instrument_level, parent_id, id 
;

oracle中的执行计划

关键点:

1)语句特征:SQL语句中WHERE字句包含非关联条件

2)执行特征:WHERE字句中的非关联条件计算的优先级低于CONNECT BY,即CONNECT BY执行之后才会进行这些条件的过滤,如上id=2的FILTER条件

4.2 DWS等价改写逻辑

WITH RECURSIVE t_recur AS (
    SELECT
        id AS id_root,
        id,parent_id,
        '->' || i.id AS id_path,
        '->' || i.parent_id AS parent_id_path,
        1 AS instrument_level,
        '#' || id || '#' AS check_val 
    FROM test_cb i
    WHERE id = 2 AND parent_id = 0

    UNION ALL

    SELECT
        r.id_root,
        i.id, i.parent_id,
        r.id_path || '->' || i.id AS id_path,
        r.parent_id_path || '->' || i.parent_id AS parent_id_path,
        (r.instrument_level + 1) AS instrument_level,
        check_val || i.id || '#'  AS check_val
    FROM t_recur r
    INNER JOIN test_cb i ON r.id = i.parent_id AND position('#' || i.id || '#' in check_val) = 0
)

SELECT 
    t.id_root, t.id, t.parent_id, t.id_path, t.parent_id_path, t.instrument_level
FROM t_recur t
WHERE t.id >= 0 AND t.parent_id >= 0
ORDER BY t.instrument_level, t.parent_id, t.id;

5. 伪劣CONNECT_BY_ISLEAF

5.1 oracle原始语句

SELECT
    connect_by_root(id) id_root,
    id,parent_id,
    sys_connect_by_path(i.id, '->') as id_path,
    sys_connect_by_path(i.parent_id, '->') as parent_id_path,  
    level AS instrument_level,
    connect_by_isleaf isleaf
FROM test_cb i
WHERE id >= 0 AND parent_id >= 0
CONNECT BY nocycle PRIOR id = parent_id
START WITH id = 2 AND parent_id = 0
ORDER BY instrument_level, parent_id, id 
;

5.2 DWS等价改写逻辑

WITH RECURSIVE t_recur AS (
    SELECT
        id AS id_root,
        id,parent_id,
        '->' || i.id AS id_path,
        '->' || i.parent_id AS parent_id_path,
        1 AS instrument_level,
        '#' || id || '#' AS check_val
    FROM test_cb i
    WHERE id = 2 AND parent_id = 0

    UNION ALL

    SELECT
        r.id_root,
        i.id, i.parent_id,
        r.id_path || '->' || i.id AS id_path,
        r.parent_id_path || '->' || i.parent_id AS parent_id_path,
        (r.instrument_level + 1) AS instrument_level,
        check_val || i.id || '#'  AS check_val
    FROM t_recur r
    INNER JOIN test_cb i ON r.id = i.parent_id AND position('#' || i.id || '#' in check_val) = 0
)

SELECT 
    t.id_root, t.id, t.parent_id, t.id_path, t.parent_id_path, t.instrument_level,
    (position(reverse('#' || id || '#') in reverse(check_val)) = 1 -- 当前只出现在check_val的尾部
      AND NOT EXISTS(SELECT 1 FROM test_cb i WHERE t.id = i.parent_id)  -- 没有父节点
    )::int AS isleaf
FROM t_recur t
WHERE t.id >= 0 AND t.parent_id >= 0
ORDER BY t.instrument_level, t.parent_id, t.id;
【版权声明】本文为华为云社区用户原创内容,转载时必须标注文章的来源(华为云社区)、文章链接、文章作者等基本信息, 否则作者和本社区有权追究责任。如果您发现本社区中有涉嫌抄袭的内容,欢迎发送邮件进行举报,并提供相关证据,一经查实,本社区将立刻删除涉嫌侵权内容,举报邮箱: cloudbbs@huaweicloud.com
  • 点赞
  • 收藏
  • 关注作者

评论(0

0/1000
抱歉,系统识别当前为高风险访问,暂不支持该操作

全部回复

上滑加载中

设置昵称

在此一键设置昵称,即可参与社区互动!

*长度不超过10个汉字或20个英文字符,设置后3个月内不可修改。

*长度不超过10个汉字或20个英文字符,设置后3个月内不可修改。