GaussDB(DWS) 日常维护命令

举报
GreatPeter 发表于 2022/01/27 15:55:24 2022/01/27
【摘要】 在日常使用GaussDB(DWS) 过程中,会遇到各种各样的问题,通过熟练的掌握常用的维护命令和问题定位方法,可以使我们提高问题定位效率,快速解决问题。根据以往的经验,将常用的操作命令分成了以下三个部分。在实际使用的过程中可能还需要掌握其它更多的命令,本文仅列举了部分命令。 1、日常维护DB命令会话查杀select pg_terminate_backend(procpid); --杀掉会话s...

在日常使用GaussDB(DWS) 过程中,会遇到各种各样的问题,通过熟练的掌握常用的维护命令和问题定位方法,可以使我们提高问题定位效率,快速解决问题。根据以往的经验,将常用的操作命令分成了以下三个部分。在实际使用的过程中可能还需要掌握其它更多的命令,本文仅列举了部分命令。

1、日常维护DB命令

  • 会话查杀
    select pg_terminate_backend(procpid); --杀掉会话
    select pg_cancel_backend(procpid); --取消正在执行的语句

  • 主备切换命令
    将DN备实例切换为主实例。假设备实例所在主机plat1,路径为“/gaussdb/data/data_dnS1”。
    gs_om -t switch -h plat1 -D /gaussdb/data/data_dnS1
    参数q表示快速切换,nodeid为需要升主的备实例所在节点ID,/srv/BigData/mppdb/data2为备DN或GTM的数据目录。
    cm_ctl switchover -n nodeid -D /srv/BigData/mppdb/data2 -q

  • 查看表分布情情况
    select getdistributekey(‘item’);
    SELECT n.nspname ,c.relname ,getdistributekey(c.oid) FROM pg_catalog.pg_class c LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace WHERE n.nspname <> ‘pg_catalog’ AND n.nspname <> ‘information_schema’ AND n.nspname <> ‘cstore’ AND c.relkind = ‘r’ ORDER BY 1,2; --查找多个表的分布列信息
    select pg_size_pretty(pg_table_size(‘public.item’));
    select table_skewness(‘inventory’);

  • 查询审计日志
    select * from pgxc_query_audit(‘2020-07-16 10:36:05’,‘2020-07-16 12:36:05’) where username!=‘omm’;
    SELECT * FROM pg_catalog.pgxc_query_audit_ext (‘2021-09-01 19:15:00’,current_timestamp) where username = ‘xxx’ and audit_type = ‘user_login’; --查看审计日志

  • 查询pooler池
    select node_name, in_use, count() from pg_pooler_status group by node_name, in_use order by 3;
    select database,user_name,in_use,count(
    ) from pg_pooler_status group by 1, 2, 3 order by 4;
    clean connection to all for database xxx;

  • 根据filenode 查找对应的物理表
    select oid, * from pg_class where reltoastrelid = (select oid from pg_class where relfilenode = 103892072);

  • 查看内存使用情况
    select * from pgxc_total_memory_detail where memorytype = ‘dynamic_used_memory’ order by 3 desc;
    select split_part(pv_session_memory_detail.sessid,’.’,2),sum(totalsize),count(*) from pv_session_memory_detail group by split_part(pv_session_memory_detail.sessid,’.’,2) order by sum(totalsize) desc;
    select sessid, contextname, level,parent, pg_size_pretty(totalsize) as total ,pg_size_pretty(freesize) as freesize, pg_size_pretty(usedsize) as usedsize, datname,query_id, query from pv_session_memory_detail a , pg_stat_activity b where split_part(a.sessid,’.’,2) = b.pid and query_id = ‘76561193666355359’ order by totalsize desc limit 100;

  • 查询等待视图
    select query_start, state_change, waiting, enqueue, state, a.query_id, substr(replace(query, chr(10), ’ '), 0, 10), node_name,thread_name,tid,lwtid,ptid,tlevel,smpid,wait_status,wait_event from pgxc_stat_activity a, pgxc_thread_wait_status b where state = ‘active’ and a.query_id = b.query_id and a.query_id <> 0;
    select node_name, wait_status, count(*) from pgxc_thread_wait_status group by node_name, wait_status order by 3 desc;
    select nodename,username,application_name, start_time, max_peak_memory , queryid, substr(query,1, 10), substring(warning from ‘Statistic Not Collect’) as warning from wlm_session where warning like ‘%Statistic Not Collect%’ and application_name = ‘Data Studio’ order by max_peak_memory desc;

  • 查看活跃会话信息
    select coorname, usename, datname, enqueue , count(*) from pgxc_stat_activity where usename <> ‘omm’ and state = ‘active’ group by coorname, usename, datname, enqueue ;
    select coorname, usename, client_addr, sysdate - query_start as dur, enqueue, query_id, replace(query, chr(10), ’ ') from pgxc_stat_activity where usename!= ‘omm’ and state = ‘active’ order by coorname, dur desc;
    SELECT coorname, usename ,client_addr ,sysdate - query_start AS dur ,query_id ,substr(replace(query, chr(10), ’ '), 0, 100) FROM pgxc_stat_activity WHERE usename != ‘omm’ AND STATE = ‘active’ ORDER BY dur DESC;

  • 集群负载管理相关视图查询
    select usename,enqueue,datname,status,attribute,count(),sum(statement_mem) from pg_session_wlmstat group by 3,1,2,4,5 order by 1,3,4,5 ;
    select usename,processid,threadid,priority,attribute,lane,enqueue,status,block_time,elapsed_time,statement_mem from pg_session_wlmstat where usename=‘usr1’;
    select * from pg_stat_get_workload_struct_info();
    select count(
    ) from pg_stat_get_wlm_realtime_session_info(NULL);

  • 查找删除复制槽
    select * from pg_get_replication_slots(); – 查找复制槽
    select pg_drop_replication_slot(‘dn_6004’); --删除复制槽信息

  • 查看集群事务信息
    select * from pg_prepared_xacts; --查看残留事务
    select * from pgxc_prepared_xacts; --查看全局残留事务视图
    select * from pg_running_xacts; --查看运行时事务情况
    select * from pgxc_running_xacts; --查看集群运行事务情况

  • 集群启停
    checkpoint;
    cm_ctl stop -mi
    cm_ctl start -mi

  • 数据文件和日志解析
    pg_xlogdump 000000010000000000000002 -z
    pg_xlogdump 000000010000000000000004 -n
    pagehack -f pg_filenode.map -t filenode_map
    pagehack -f 16502 -t heap

2 常用GUC参数设置

  • 检查active sql配置
    show use_workload_manager;
    show enable_control_group;
    show enable_resource_record;
    show enable_resource_track;
    show resource_track_level;
    show resource_track_duration;
    show resource_track_cost;

  • 打开 TOPSQL功能
    set use_workload_manager = on;
    set enable_control_group = on;
    set enable_resource_record = on;
    set resource_track_level = query;

  • 修改收集统计信息的比例
    set default_statistics_target = -10;
    analzye public.customer;

  • 打开集群DEBUG2 级别的日志
    set log_min_messages=debug2;
    set logging_module=‘on(ALL)’;

  • 查看页面上所有元组的事务信息
    start transaction read only;
    set enable_show_any_tuples = true;
    set enable_indexscan = off;
    set enable_bitmapscan = off;
    select xmin,xmax,pgxc_is_committed(xmin),pgxc_is_committed(xmax),oid,* from pg_class where relname=‘表名’ ;

  • 优化器相关参数,通过调整参数干预估算模型
    cost_model_version;
    cost_param
    join_num_distinct
    qual_num_distinct

  • 取消集群只读设置只读级别
    gs_guc reload -Z coordinator -Z datanode -N all -I all -c “default_transaction_read_only=off”
    gs_guc reload -Z coordinator -Z datanode -N all -I all -c “datastorage_threshold_value_check=95”

3 OS相关命令

  • 查看进程相关信息
    ps -eo pid,lstart,etime,cmd | grep gaussdb;date
    strace -p 47148 -r -T -o strace.log
  • 查看透明大页信息
    cat /sys/kernel/mm/transparent_hugepage/enabled
    [never]表示未打开
  • gsql进程查杀
    ps -ef | grep gsql | grep -v grep | awk ‘{print $2}’ | xargs kill -9
  • 查看CPU占用高的线程
    top -b -p 39450 -H -n 1|head -100
  • sar命令
    sar -r 5 4 输出物理内存和虚拟内存的统计信息
    sar -B 5 5 分页统计
    sar -u 3 5 显示CPU使用信息
    sar -b 3 5 磁盘IO信息
    sar -n DEV 2 3 网络流量信息
  • 网络问题定位常用命令
    netstat –anop|grep “on (”| sort –rnk 3|head -50
    netstat -naop | grep 54321 --查看端口被占用。
    netstat -anop | awk ‘{print $4}’ | grep ip|sort|uniq -c|grep " 1 "|wc -l --随机端口不足
    ping -s 8192 -I eth0 dest_ip --对端IP是否可达
  • 查看raid 卡缓存策略 Write through,IO性能比WriteBack 要慢。
    /opt/MegaRAID/MegaCli/MegaCli64 -LDinfo -Lall –aAll
【版权声明】本文为华为云社区用户原创内容,转载时必须标注文章的来源(华为云社区)、文章链接、文章作者等基本信息, 否则作者和本社区有权追究责任。如果您发现本社区中有涉嫌抄袭的内容,欢迎发送邮件进行举报,并提供相关证据,一经查实,本社区将立刻删除涉嫌侵权内容,举报邮箱: cloudbbs@huaweicloud.com
  • 点赞
  • 收藏
  • 关注作者

评论(0

0/1000
抱歉,系统识别当前为高风险访问,暂不支持该操作

全部回复

上滑加载中

设置昵称

在此一键设置昵称,即可参与社区互动!

*长度不超过10个汉字或20个英文字符,设置后3个月内不可修改。

*长度不超过10个汉字或20个英文字符,设置后3个月内不可修改。