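GaussDB(DWS) cluster: abnormal backend UDF process
[Abstract] This post uses a simple example to show how to troubleshoot and fix an abnormal UDF process.

Background: a UDF call returns an error, and checking the cluster from the backend shows that the UDF process is abnormal. The data, path configuration, host names and other details in the example all come from a test environment.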
First, check the state of the cluster's UDF processes with the following command:
cm_ctl query -CvF
The output is as follows:
[  Fenced UDF State   ]
node         state
--------------------
1  ASG003    Down
2  host17967 Down
3  host17995 Down
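On a larger cluster it can help to pull out only the affected hosts. The following is a minimal sketch, not part of the original procedure: `udf_abnormal_nodes` is a hypothetical helper name, and it assumes that each section of the output starts with a bracketed title and that the Fenced UDF rows have exactly the three columns shown above.

```shell
# List hosts whose Fenced UDF state is anything other than Normal.
# Reads `cm_ctl query -CvF` output on stdin (layout assumed to match the sample above).
udf_abnormal_nodes() {
    awk '
        /^\[/ { in_udf = ($0 ~ /Fenced UDF State/); next }  # a bracketed title starts a new section
        in_udf && NF == 3 && $3 != "Normal" { print $2 }     # data rows: id, host, state
    '
}
```

Usage: `cm_ctl query -CvF | udf_abnormal_nodes`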
The UDF processes are in an abnormal state. The cm_agent log (under $GAUSSLOG/cm/cm_agent) contains the following entry:
StartAndStop LOG: process (secbox) is not running, path is xxxx, have_found is 0
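To locate entries like this without opening each log file, the directory can be searched for the fixed message text. This is a sketch under assumptions: `find_secbox_errors` is a hypothetical helper, the log file names are not given in the post (so the whole directory is searched), and GNU grep is assumed for the `-r` option.

```shell
# Print cm_agent log lines reporting that secbox is not running.
# $1: log directory (defaults to $GAUSSLOG/cm/cm_agent, the location named in the post).
find_secbox_errors() {
    log_dir="${1:-$GAUSSLOG/cm/cm_agent}"
    # -r: recurse into the directory, -h: omit file names, -F: match the text literally
    grep -rhF 'process (secbox) is not running' "$log_dir"
}
```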
This entry points to a problem with the secbox.conf configuration. Go to $GAUSSHOME/secbox and inspect secbox.conf:
# read/write src_path [dst_path]
[mount_path]            read    /dev
[mount_path]            read    /sys
[mount_path]            read    /bin
[mount_path]            read    /sbin
[mount_path]            read    /usr/bin
[mount_path]            read    /lib
[mount_path]            read    /lib64
[mount_path]            read    /usr/lib
[mount_path]            read    /usr/lib64
[mount_path]            read    /usr/local
[mount_path]            read    /usr/share
[mount_path]            read    /etc
[mount_path]            read    /var
[mount_path]            read    /var/log
[mount_path]            read    /var/lkp
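One way to spot an invalid entry in a file like this is to test each listed source path on the node. The sketch below assumes the three-column layout shown above (tag, mode, source path); `missing_mount_paths` is a hypothetical helper, not a GaussDB tool.

```shell
# Print every [mount_path] source path in secbox.conf that does not exist on this node.
# $1: path to secbox.conf (defaults to $GAUSSHOME/secbox/secbox.conf).
missing_mount_paths() {
    conf="${1:-$GAUSSHOME/secbox/secbox.conf}"
    # Field 1 is the [mount_path] tag, field 2 the mode, field 3 the source path.
    # Commented lines start with "#", so their first field never equals the tag.
    awk '$1 == "[mount_path]" && NF >= 3 { print $3 }' "$conf" |
    while IFS= read -r path; do
        [ -e "$path" ] || printf '%s\n' "$path"
    done
}
```

An empty result means every listed path exists; any path printed is a candidate for the kind of startup failure described here.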
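The /var/lkp path listed in this configuration does not exist on the node, which causes the UDF process to fail to start. Comment out or delete this entry; after a few seconds the UDF process on this node returns to the Normal state: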
[  Fenced UDF State   ]
node         state
--------------------
1  ASG003    Down
2  host17967 Normal
3  host17995 Down
This shows that the change has taken effect. Edit secbox.conf on the other nodes in the same way, and the problem is resolved:
[  Fenced UDF State   ]
node         state
--------------------
1  ASG003    Normal
2  host17967 Normal
3  host17995 Normal
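The per-node edit can be scripted so that it is applied identically everywhere. This is a minimal sketch, assuming that a leading "#" disables an entry (as the header comment in secbox.conf suggests); `disable_mount_path` is a hypothetical helper, and it keeps a .bak copy so the change can be reverted.

```shell
# Comment out the [mount_path] entry for one source path, keeping a .bak copy.
# $1: path to secbox.conf, $2: source path to disable (for example /var/lkp).
disable_mount_path() {
    conf="$1"
    target="$2"
    cp -p "$conf" "$conf.bak" || return 1
    awk -v target="$target" '
        $1 == "[mount_path]" && $3 == target { print "# " $0; next }  # disable the matching entry
        { print }                                                       # keep every other line as is
    ' "$conf.bak" > "$conf"
}
```

Usage on each affected node: `disable_mount_path "$GAUSSHOME/secbox/secbox.conf" /var/lkp`, then re-run `cm_ctl query -CvF` to confirm the state.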
        