pg 流复制

举报
snowofsummer 发表于 2022/01/24 12:58:32 2022/01/24
【摘要】 PostgreSQL 13.4listen_addresses = '*'min_wal_size=800G#默认80M,导致wal日志备删除,主备不同步。wal_log_hints = on#pg_rewind需要使用,否则失败。max_wal_size=1000G#max_wal_senders = 10 #wal_level = replica 1,备份pg_basebackup -D...
PostgreSQL 13.4

listen_addresses = '*'

min_wal_size=800G

#默认80M,导致wal日志备删除,主备不同步。

wal_log_hints = on

#pg_rewind需要使用,否则失败。

max_wal_size=1000G

#max_wal_senders = 10

#wal_level = replica



1,备份

pg_basebackup -D s001/ -U postgres -Fp -X stream -R -Pv -p5432 -h 127.0.0.1

#备份过程完成后,(-R)会在data目录下创建了一个standby.signal,并将primary_conninfo写入postgresql.auto.conf

#如果postgresql.conf中的hot_standby参数设置为on(默认值),对外提供只读,并且数据目录中存在Standby.signal文件,则replication slave将在“热备”模式下运行。


`-bash-4.2$ pg_basebackup -D s01/ -U postgres  -Fp -X stream -R  -Pv -p5432 -h 127.0.0.1
pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 0/20000028 on timeline 1
pg_basebackup: starting background WAL receiver
pg_basebackup: created temporary replication slot "pg_basebackup_8199"
 pg_basebackup -D s01/ -U postgres  -Fp -X stream -R  -Pv -p5432 -h 127.0.0.1
pg_basebackup: initiating base backup, waiting for checkpoint to complete pg_basebackup -D s01/ -U postgres  -Fp -X stream -R  -Pv -p5432 -h 127              .0.0.1
pg_basebackup: initiating base backup, waiting for checkpoint to complete368910/563739 kB (65%), 0/1 tablespace (s01//arc/0000000100000000000000              563748/563748 kB (100%), 1/1 tablespace
pg_basebackup: write-ahead log end point: 0/20000138
pg_basebackup: waiting for background process to finish streaming ...
pg_basebackup: syncing data to disk ...
pg_basebackup: renaming backup_manifest.tmp to backup_manifest
pg_basebackup: base backup completed`


pg_basebackup -h 10.20.20.1 -D /xxxx/data -U replicator -P -v  -R -X stream -C -S pgstandby1
exit
  • -h –指定作为主服务器的主机。

  • -D –指定数据目录。

  • -U –指定连接用户。

  • -P –启用进度报告。

  • -v –启用详细模式。

  • -R–启用恢复配置的创建:创建一个standby.signal文件,并将连接设置附加到数据目录下的postgresql.auto.conf

  • -X–用于在备份中包括所需的预写日志文件(WAL文件)。流的值表示在创建备份时流式传输WAL。

  • -C –在开始备份之前,允许创建由-S选项命名的复制插槽。

  • -S –指定复制插槽名称。

  • -c, --checkpoint=fast|spread

    pg_basebackup会发起一次checkpoint,可以选择checkpoint的模式,fast或者spread。如果没有指定fast,备份任务可能需要一段时间等待checkpoint set fast or spread checkpointin

2,端口修改

echo "port=5433">> /data/s01/postgresql.conf

`-bash-4.2$ pg_ctl -D /data/s01/ start
waiting for server to start....2021-10-19 09:51:15.679 CST [12427] LOG:  starting PostgreSQL 13.4 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39), 64-bit
2021-10-19 09:51:15.684 CST [12427] LOG:  listening on IPv6 address "::1", port 5433
2021-10-19 09:51:15.684 CST [12427] LOG:  listening on IPv4 address "127.0.0.1", port 5433
2021-10-19 09:51:15.696 CST [12427] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5433"
2021-10-19 09:51:15.711 CST [12428] LOG:  database system was interrupted; last known up at 2021-10-19 09:41:31 CST
...2021-10-19 09:51:19.068 CST [12428] LOG:  entering standby mode
2021-10-19 09:51:19.111 CST [12428] LOG:  redo starts at 0/20000028
2021-10-19 09:51:19.128 CST [12428] LOG:  consistent recovery state reached at 0/20000138
2021-10-19 09:51:19.128 CST [12427] LOG:  database system is ready to accept read only connections
 done
server started
2021-10-19 09:51:19.174 CST [12451] LOG:  started streaming WAL from primary at 0/21000000 on timeline 1`


3,查询复制状态(默认为异步 async)

postgres=# \x 
Expanded display is on. 
postgres=# 
postgres=# select * from pg_stat_replication; 
-[ RECORD 1 ]----+------------------------------ pid | 12452 usesysid | 10 usename | postgres application_name | walreceiver client_addr | 127.0.0.1 client_hostname | client_port | 37766 backend_start | 2021-10-19 09:51:19.171164+08 backend_xmin | state | streaming sent_lsn | 0/21000148 write_lsn | 0/21000148 flush_lsn | 0/21000148 replay_lsn | 0/21000148 write_lag | flush_lag | replay_lag | sync_priority | 0 sync_state | async reply_time | 2021-10-19 09:52:29.222453+08

4 , 同步复制

show synchronous_standby_names;

show synchronous_commit;


备数据库修改: postgresql.auto.conf

添加application_name=standby01

primary_conninfo = 'application_name=standby01 user=postgres passfile=''/var/lib/pgsql/.pgpass'' channel_binding=disable host=127.0.0.1 port=5432 sslmode=disable sslcompression=0 ssl_min_protocol_version=TLSv1.2 gssencmode=disable krbsrvname=postgres target_session_attrs=any'

#pg_ctl -D /data/s001 restart


主数据库:

synchronous_commit = on

synchronous_standby_names = 'standby01'

alter system set synchronous_commit = on;

alter system set synchronous_standby_names = 'standby01';

// 重新加载配置文件即可:pg_ctl -D /data/master reload




<!--synchronous_standby_names设置的'standby01,standby02'就是在Standby数据库中配置连接参数“application_name”。-->

重新启动主:

`-bash-4.2$ pg_ctl -D /data/db01/ start
waiting for server to start....2021-10-19 10:12:40.383 CST [21812] LOG:  starting PostgreSQL 13.4 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39), 64-bit
2021-10-19 10:12:40.385 CST [21812] LOG:  listening on IPv6 address "::1", port 5432
2021-10-19 10:12:40.385 CST [21812] LOG:  listening on IPv4 address "127.0.0.1", port 5432
2021-10-19 10:12:40.398 CST [21812] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"
2021-10-19 10:12:40.413 CST [21813] LOG:  database system was shut down at 2021-10-19 10:12:36 CST
2021-10-19 10:12:40.425 CST [21812] LOG:  database system is ready to accept connections
 done
server started
-bash-4.2$ 2021-10-19 10:12:42.173 CST [21849] LOG:  started streaming WAL from primary at 0/22000000 on timeline 1
2021-10-19 10:12:50.309 CST [12427] LOG:  received fast shutdown request
2021-10-19 10:12:50.338 CST [12427] LOG:  aborting any active transactions
2021-10-19 10:12:50.345 CST [21849] FATAL:  terminating walreceiver process due to administrator command
2021-10-19 10:12:50.391 CST [12445] LOG:  shutting down
2021-10-19 10:12:50.453 CST [12427] LOG:  database system is shut down
2021-10-19 10:12:53.108 CST [21951] LOG:  standby "standby01" is now a synchronous standby with priority 1
2021-10-19 10:12:53.108 CST [21951] STATEMENT:  START_REPLICATION 0/22000000 TIMELINE 1`

重新启动备:

`-bash-4.2$ pg_ctl -D /data/s01/ start
waiting for server to start....2021-10-19 10:12:52.986 CST [21945] LOG:  starting PostgreSQL 13.4 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39), 64-bit
2021-10-19 10:12:52.986 CST [21945] LOG:  listening on IPv6 address "::1", port 5433
2021-10-19 10:12:52.986 CST [21945] LOG:  listening on IPv4 address "127.0.0.1", port 5433
2021-10-19 10:12:53.004 CST [21945] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5433"
2021-10-19 10:12:53.045 CST [21946] LOG:  database system was shut down in recovery at 2021-10-19 10:12:50 CST
2021-10-19 10:12:53.049 CST [21946] LOG:  entering standby mode
2021-10-19 10:12:53.077 CST [21946] LOG:  consistent recovery state reached at 0/220000A0
2021-10-19 10:12:53.077 CST [21946] LOG:  invalid record length at 0/220000A0: wanted 24, got 0
2021-10-19 10:12:53.078 CST [21945] LOG:  database system is ready to accept read only connections
 done
server started
-bash-4.2$ 2021-10-19 10:12:53.108 CST [21950] LOG:  started streaming WAL from primary at 0/22000000 on timeline 1
2021-10-19 10:12:55.555 CST [21946] LOG:  redo starts at 0/220000A0`


5,同步状态检查(sync)

select pg_is_in_recovery();--主库是false 备库是true

`postgres=# \x
Expanded display is on.
postgres=#  select * from pg_stat_replication;
-[ RECORD 1 ]----+------------------------------
pid              | 21951
usesysid         | 10
usename          | postgres
application_name | standby01
client_addr      | 127.0.0.1
client_hostname  |
client_port      | 39472
backend_start    | 2021-10-19 10:12:53.101559+08
backend_xmin     |
state            | streaming
sent_lsn         | 0/220000D8
write_lsn        | 0/220000D8
flush_lsn        | 0/220000D8
replay_lsn       | 0/220000D8
write_lag        |
flush_lag        |
replay_lag       |
sync_priority    | 1
sync_state       | sync
reply_time       | 2021-10-19 10:14:35.761522+08`


<!--sync: 表示为 同步流复制模式 potential:表示是一个潜在的同步 standby sync_priority: 优先级,该优先级的值是由 primary 上的 postgresql.conf 配置文件中的 synchronous_standby_names参数配置的顺序决定的。 -->

6,主备切换(模拟fail over)

主:

pg_ctl -D /data/db01/ stop


备:

01,手动激活备机


pg_ctl promote -D /data/s01/

`2021-10-24 17:29:33.933 CST [28938] LOG:  received promote request
2021-10-24 17:29:33.933 CST [28938] LOG:  redo done at 0/4014148
2021-10-24 17:29:33.933 CST [28938] LOG:  last completed transaction was at log time 2021-10-24 17:27:26.280448+08
2021-10-24 17:29:33.954 CST [28938] LOG:  selected new timeline ID: 2
2021-10-24 17:29:34.134 CST [28938] LOG:  archive recovery complete
2021-10-24 17:29:34.166 CST [28936] LOG:  database system is ready to accept connections`

pg_controldata /data/s01/|grep 'Database cluster state' Database cluster state: in production

02, trigger_file 参数,之后在备库主机上创建触发器文件.

#promote_trigger_file = '' # file name whose presence ends recovery

#检查到该文件,自动激活备机。

03,重新初始(增量同步,只同步变化的块,同sync)

#设置wal_log_hints这个参数使得能够记录特定提示位(hint-bit)的变化。

-- # pg_rewind -D /data/master --source-server='host=127.0.0.1 -p 5432 user=postgres‘ -P

备注:数据库在 initdb 时需要开启 checksums 或者设置 "wal_log_hints = on", 接着设置主,备节点的 wal_log_hints 参数并重启数据库。*initdb --data-checksums*

#更新了 postgresql.conf, postgresql.auto.conf,需要收到修改部分参数。

-- 重新同步命令

`主库上执行

bash-4.2$ pg_rewind -D /data/master --source-server='host=127.0.0.1 port=5433 user=postgres' -P
pg_rewind: connected to server
pg_rewind: servers diverged at WAL location 0/60025E8 on timeline 1
pg_rewind: rewinding from last common checkpoint at 0/5000060 on timeline 1
pg_rewind: reading source file list
pg_rewind: reading target file list
pg_rewind: reading WAL in target
pg_rewind: need to copy 67 MB (total source directory size is 87 MB)
69284/69284 kB (100%) copied
pg_rewind: creating backup label and updating control file
pg_rewind: syncing target data directory
pg_rewind: Done!

pg_rewind 成功。

7,复制槽(replication_slot)

因为pg在归档模式下,对于已经完成归档的wal日志会自动清理,所以提供了复制槽来避免主库在所有的备库收到 WAL 日志之前不会移除它们,并且主库也不会移除可能导致恢复冲突的行,即使备库断开也是如此。


#备机要配置后备机使用这个槽,在备库的recovery.conf中应该配置 primary_slot_name,例如: primary_conninfo = 'host=192.168.7.180 port=1921 user=bill password='xxx' primary_slot_name = 'node_a_slot'


常用参数

PostgreSQL 12对主从复制实现和配置做了重大改进,如废弃了recovery.conf,并将参数转换为普通的PostgreSQL配置参数,使得配置群集和复制更加简单。


ALTER SYSTEM SET会将配置保存在一个postgresql.conf.auto中,与postgresql.conf并存,系统会优先使用.auto配置。



hot_standby = on 备库是否可读。

ps -ef | grep wal wal 日志发送进程"wal sender process",说明是主库。 wal 日志接收进程"wal receiver process" ,说明是备库。


pg_controldata 输出数据库服务的当前状态,可以根据 "Database cluster state: " 的信息来判断,

如果值为 "in production" 说明是主库。

如果值为 "in archive recovery" 说明是备库。

`-bash-4.2$ pg_controldata -D /data/db01/ |grep "cluster state"
Database cluster state:               in production
-bash-4.2$ pg_controldata -D /data/s01/ |grep "cluster state"
Database cluster state:               in archive recovery`


--操作系统命令,判断发送进程还是接收进程
ps -ef | grep ``"wal"` `| grep -v ``"grep"
--数据查看
select` `pid,usename,application_name,client_addr,state,sync_state ``from` `pg_stat_replication;
--or 备库
select` `pid,status,last_msg_send_time,last_msg_receipt_time,conninfo ``from` `pg_stat_wal_receiver;
--系统函数查看
select pg_is_in_recovery();


standby.signal

编辑standby.signal文件(就在数据文件夹内,以此标识从节点,当从节点提升为主节点后会自动删除)

# 添加standby_mode = 'on'


建表命令:

create table t1 (id int);

insert into t1 select from generate_series(1,10);

【版权声明】本文为华为云社区用户原创内容,转载时必须标注文章的来源(华为云社区)、文章链接、文章作者等基本信息, 否则作者和本社区有权追究责任。如果您发现本社区中有涉嫌抄袭的内容,欢迎发送邮件进行举报,并提供相关证据,一经查实,本社区将立刻删除涉嫌侵权内容,举报邮箱: cloudbbs@huaweicloud.com
  • 点赞
  • 收藏
  • 关注作者

评论(0

0/1000
抱歉,系统识别当前为高风险访问,暂不支持该操作

全部回复

上滑加载中

设置昵称

在此一键设置昵称,即可参与社区互动!

*长度不超过10个汉字或20个英文字符,设置后3个月内不可修改。

*长度不超过10个汉字或20个英文字符,设置后3个月内不可修改。