GaussDB(DWS)视图解耦与自动重建功能介绍
1. 背景简介
GaussDB(DWS)数仓产品内部使用对象标识符(oid)来保存对象之间的引用关系,这使得视图在定义的时候就绑定了其依赖的数据库对象的oid,而不管其名称怎么改变,都不会改变这层依赖关系。如果要对基表进行一些字段修改,会因为与视图字段存在强绑定而报错,如果要删除某个表字段或整个表,就需要连同其关联的视图一起使用cascade关键字删除,表字段删除完成或表重建后再顺序重建各级视图,这就给用户的使用增加了很大的工作量,相对于某些市场主流数仓产品存在明显的易用性差异。为了解决这一问题,GaussDB(DWS) 在8.1.0版本实现了视图的解耦,使得存在视图依赖的基表或其他数据库对象(视图、同义词、函数、表字段)可以单独删除,而其上关联的依赖视图依然存在,而在基表重建后,可以通过ALTER VIEW view_name REBUILD命令重建依赖关系。而8.1.1版本在此基础上又实现了自动重建,可以无感知自动重建依赖关系。
2. 场景介绍
如上图所示,以drop table table_1为例,支持其单独解耦删除,在重建该表后,对其上关联的视图view_1做一次ALTER VIEW view_1 REBUILD操作,可以自动将其上所有的视图重建依赖关系,其重建顺序为:view_1 -> view_2 -> view_3 -> view_4 -> view_5 -> view_6 -> view_7,或者为:view_1 -> view_4 -> view_5 -> view_7 -> view_2 -> view_3 -> view_6,对于存在多重依赖的view_6,不会重复重建。即使各级视图可能归属不同的用户所有,也会由于重建视图所具有的较高权限保证其顺利执行。完成该操作后各级视图可以正常查询,对同样依赖于table_1的其他路径的视图,也需要做同样的操作。而在即将发布的GaussDB(DWS) 8.1.1版本,在table_1重建后,查询任意节点的视图,都可以自动刷新由其通往该表路径上的所有视图,从而得到正确的结果。
3. 使用方法
为了查询视图的依赖关系,在8.1.0版本上实现了两个系统视图:GS_VIEW_DEPENDENCY和GS_VIEW_INVALID,前者可以查询当前用户可见的所有视图的依赖关系,其实现机制参考本部门博文:递归查询视图依赖。后者用来查询当前用户可见的所有的不可用视图。其效果如下:
dws=# SELECT * FROM GS_VIEW_DEPENDENCY order by 1;
objname | refobjname | path
---------+------------+---------------------------------------
view_1 | table_1 | table_1
view_1 | table_2 | table_2
view_2 | view_1 | table_1 -> view_1
view_2 | view_1 | table_2 -> view_1
view_3 | view_2 | table_2 -> view_1 -> view_2
view_3 | view_2 | table_1 -> view_1 -> view_2
view_4 | view_1 | table_1 -> view_1
view_4 | view_1 | table_2 -> view_1
view_5 | view_4 | table_1 -> view_1 -> view_4
view_5 | view_4 | table_2 -> view_1 -> view_4
view_6 | view_3 | table_2 -> view_1 -> view_2 -> view_3
view_6 | view_5 | table_2 -> view_1 -> view_4 -> view_5
view_6 | view_5 | table_1 -> view_1 -> view_4 -> view_5
view_6 | view_3 | table_1 -> view_1 -> view_2 -> view_3
view_7 | view_5 | table_1 -> view_1 -> view_4 -> view_5
view_7 | view_5 | table_2 -> view_1 -> view_4 -> view_5
(16 rows)
如图1所示对table_1做删除操作,其关联的所有视图都变为不可用,8.1.1版本通过查询GS_VIEW_INVALID视图可以查看其定义,便于查找失效原因。
dws=# DROP TABLE table_1;
DROP TABLE
dws=# SELECT * FROM GS_VIEW_INVALID order by 1;
objid | objname | def
-------+---------+-------------------------------------------------------------------------------------------
16463 | view_1 | Invalid view: VIEW "public"."view_1" +
| | AS SELECT * FROM public.table_1, public.table_2;
16467 | view_2 | Invalid view: VIEW "public"."view_2" +
| | AS SELECT * FROM public.view_1;
16471 | view_3 | Invalid view: VIEW "public"."view_3" +
| | AS SELECT * FROM public.view_2;
16475 | view_4 | Invalid view: VIEW "public"."view_4" +
| | AS SELECT * FROM public.view_1;
16479 | view_5 | Invalid view: VIEW "public"."view_5" +
| | AS SELECT * FROM public.view_4;
16483 | view_6 | Invalid view: VIEW "public"."view_6" +
| | ("col1","col2") AS SELECT a.a AS col1, b.b AS col2 FROM public.view_5 a, public.view_3 b;
16487 | view_7 | Invalid view: VIEW "public"."view_7" +
| | AS SELECT * FROM public.view_5;
(7 rows)
3.1 自动与手动重建
重建table_1,尽量保证其包含视图所引用的全部字段,名称必须匹配,数据类型可做符合一定约束的修改,之后查询任意视图,可以正常出结果。如果要使其他视图在查询前也恢复可用,可使用REBUILD命令在合适位置(因为view_1已经可用,这里是view_4最合适)执行,将其他视图恢复为可用。
字段数据类型修改的约束主要为仅可做大类范围内的适当转换(如数值之间,字符串之间,时间之间等),跨类型转换如integer转varchar会因为内部强制转换而产生存储格式不匹配的风险。
dws=# create table table_1(a int, b int);
NOTICE: The 'DISTRIBUTE BY' clause is not specified. Using 'a' as the distribution column by default.
HINT: Please use 'DISTRIBUTE BY' clause to specify suitable data distribution column.
CREATE TABLE
dws=# select * from view_5;
a | b | c | d
---+---+---+---
(0 rows)
dws=# SELECT * FROM GS_VIEW_INVALID ORDER BY 1;
objid | objname | def
-------+---------+-------------------------------------------------------------------------------------------
16483 | view_4 | Invalid view: VIEW "public"."view_4" +
| | AS SELECT * FROM public.view_1;
16487 | view_5 | Invalid view: VIEW "public"."view_5" +
| | AS SELECT * FROM public.view_4;
16491 | view_6 | Invalid view: VIEW "public"."view_6" +
| | ("col1","col2") AS SELECT a.a AS col1, b.b AS col2 FROM public.view_5 a, public.view_3 b;
16495 | view_7 | Invalid view: VIEW "public"."view_7" +
| | AS SELECT * FROM public.view_5;
(4 rows)
dws=# ALTER VIEW view_4 REBUILD;
ALTER VIEW
dws=# SELECT * FROM GS_VIEW_INVALID ORDER BY 1;
objid | objname | def
-------+---------+-----
(0 rows)
3.2 导入导出
在使用gs_dump进行数据库导出操作时,如果存在不可用视图,会给出提示,并在导出文件中以注释的形式保存不可用视图的定义,便于恢复基表时提取视图定义并重建。
./gs_dump -p 30001 dws -f ../dump1.sql
gs_dump[port='30001'][dws][2021-01-20 11:22:31]: The total objects number is 368.
gs_dump[port='30001'][dws][2021-01-20 11:22:31]: NOTICE: VIEW "public.view_1" is invalid and will be dumped as comment
gs_dump[port='30001'][dws][2021-01-20 11:22:31]: NOTICE: VIEW "public.view_2" is invalid and will be dumped as comment
gs_dump[port='30001'][dws][2021-01-20 11:22:31]: NOTICE: VIEW "public.view_3" is invalid and will be dumped as comment
gs_dump[port='30001'][dws][2021-01-20 11:22:31]: NOTICE: VIEW "public.view_4" is invalid and will be dumped as comment
gs_dump[port='30001'][dws][2021-01-20 11:22:31]: NOTICE: VIEW "public.view_5" is invalid and will be dumped as comment
gs_dump[port='30001'][dws][2021-01-20 11:22:31]: NOTICE: VIEW "public.view_6" is invalid and will be dumped as comment
gs_dump[port='30001'][dws][2021-01-20 11:22:31]: NOTICE: VIEW "public.view_7" is invalid and will be dumped as comment
gs_dump[port='30001'][dws][2021-01-20 11:22:31]: [100.00%] 368 objects have been dumped.
gs_dump[port='30001'][dws][2021-01-20 11:22:31]: dump database dws successfully
gs_dump[port='30001'][dws][2021-01-20 11:22:31]: total time: 427 ms
--
-- Name: view_1; Type: VIEW; Schema: public; Owner: postgresql
--
/* CREATE VIEW "public"."view_1"
AS SELECT * FROM public.table_1, public.table_2; */
--
-- Name: view_2; Type: VIEW; Schema: public; Owner: postgresql
--
/* CREATE VIEW "public"."view_2"
AS SELECT * FROM public.view_1; */
--
-- Name: view_3; Type: VIEW; Schema: public; Owner: postgresql
--
/* CREATE VIEW "public"."view_3"
AS SELECT * FROM public.view_2; */
4. 总结
这一功能使用VIEW_INDEPENDENT参数控制行为,在参数打开的情况下,可以单独修改删除任意节点的表、函数或视图,关闭时依然保持视图的强关联模式。
最后,欢迎大家使用该功能,并对其提出各种评测意见和建议。
- 点赞
- 收藏
- 关注作者
评论(0)