pg 流复制
PostgreSQL 13.4
listen_addresses = '*'
min_wal_size=800G
#默认80M,导致wal日志备删除,主备不同步。
wal_log_hints = on
#pg_rewind需要使用,否则失败。
max_wal_size=1000G
#max_wal_senders = 10
#wal_level = replica
1,备份
pg_basebackup -D s001/ -U postgres -Fp -X stream -R -Pv -p5432 -h 127.0.0.1
#备份过程完成后,(-R)会在data目录下创建了一个standby.signal
,并将primary_conninfo
写入postgresql.auto.conf
。
#如果postgresql.conf中的hot_standby参数设置为on(默认值),对外提供只读,并且数据目录中存在Standby.signal文件,则replication slave
将在“热备”模式下运行。
`-bash-4.2$ pg_basebackup -D s01/ -U postgres -Fp -X stream -R -Pv -p5432 -h 127.0.0.1
pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 0/20000028 on timeline 1
pg_basebackup: starting background WAL receiver
pg_basebackup: created temporary replication slot "pg_basebackup_8199"
pg_basebackup -D s01/ -U postgres -Fp -X stream -R -Pv -p5432 -h 127.0.0.1
pg_basebackup: initiating base backup, waiting for checkpoint to complete pg_basebackup -D s01/ -U postgres -Fp -X stream -R -Pv -p5432 -h 127 .0.0.1
pg_basebackup: initiating base backup, waiting for checkpoint to complete368910/563739 kB (65%), 0/1 tablespace (s01//arc/0000000100000000000000 563748/563748 kB (100%), 1/1 tablespace
pg_basebackup: write-ahead log end point: 0/20000138
pg_basebackup: waiting for background process to finish streaming ...
pg_basebackup: syncing data to disk ...
pg_basebackup: renaming backup_manifest.tmp to backup_manifest
pg_basebackup: base backup completed`
pg_basebackup -h 10.20.20.1 -D /xxxx/data -U replicator -P -v -R -X stream -C -S pgstandby1
exit
-
-h
–指定作为主服务器的主机。 -
-D
–指定数据目录。 -
-U
–指定连接用户。 -
-P
–启用进度报告。 -
-v
–启用详细模式。 -
-R
–启用恢复配置的创建:创建一个standby.signal文件,并将连接设置附加到数据目录下的postgresql.auto.conf。 -
-X
–用于在备份中包括所需的预写日志文件(WAL文件)。流的值表示在创建备份时流式传输WAL。 -
-C
–在开始备份之前,允许创建由-S选项命名的复制插槽。 -
-S
–指定复制插槽名称。 -
-c, --checkpoint=fast|spread
pg_basebackup会发起一次checkpoint,可以选择checkpoint的模式,fast或者spread。如果没有指定fast,备份任务可能需要一段时间等待checkpoint set fast or spread checkpointin
2,端口修改
echo "port=5433">> /data/s01/postgresql.conf
`-bash-4.2$ pg_ctl -D /data/s01/ start
waiting for server to start....2021-10-19 09:51:15.679 CST [12427] LOG: starting PostgreSQL 13.4 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39), 64-bit
2021-10-19 09:51:15.684 CST [12427] LOG: listening on IPv6 address "::1", port 5433
2021-10-19 09:51:15.684 CST [12427] LOG: listening on IPv4 address "127.0.0.1", port 5433
2021-10-19 09:51:15.696 CST [12427] LOG: listening on Unix socket "/tmp/.s.PGSQL.5433"
2021-10-19 09:51:15.711 CST [12428] LOG: database system was interrupted; last known up at 2021-10-19 09:41:31 CST
...2021-10-19 09:51:19.068 CST [12428] LOG: entering standby mode
2021-10-19 09:51:19.111 CST [12428] LOG: redo starts at 0/20000028
2021-10-19 09:51:19.128 CST [12428] LOG: consistent recovery state reached at 0/20000138
2021-10-19 09:51:19.128 CST [12427] LOG: database system is ready to accept read only connections
done
server started
2021-10-19 09:51:19.174 CST [12451] LOG: started streaming WAL from primary at 0/21000000 on timeline 1`
3,查询复制状态(默认为异步 async)
postgres=# \x
Expanded display is on.
postgres=#
postgres=# select * from pg_stat_replication;
-[ RECORD 1 ]----+------------------------------ pid | 12452 usesysid | 10 usename | postgres application_name | walreceiver client_addr | 127.0.0.1 client_hostname | client_port | 37766 backend_start | 2021-10-19 09:51:19.171164+08 backend_xmin | state | streaming sent_lsn | 0/21000148 write_lsn | 0/21000148 flush_lsn | 0/21000148 replay_lsn | 0/21000148 write_lag | flush_lag | replay_lag | sync_priority | 0 sync_state | async reply_time | 2021-10-19 09:52:29.222453+08
4 , 同步复制
show synchronous_standby_names;
show synchronous_commit;
备数据库修改: postgresql.auto.conf
添加application_name=standby01
primary_conninfo = 'application_name=standby01 user=postgres passfile=''/var/lib/pgsql/.pgpass'' channel_binding=disable host=127.0.0.1 port=5432 sslmode=disable sslcompression=0 ssl_min_protocol_version=TLSv1.2 gssencmode=disable krbsrvname=postgres target_session_attrs=any'
#pg_ctl -D /data/s001 restart
主数据库:
synchronous_commit = on
synchronous_standby_names = 'standby01'
alter system set synchronous_commit = on;
alter system set synchronous_standby_names = 'standby01';
// 重新加载配置文件即可:pg_ctl -D /data/master reload
<!--synchronous_standby_names设置的'standby01,standby02'就是在Standby数据库中配置连接参数“application_name”。-->
重新启动主:
`-bash-4.2$ pg_ctl -D /data/db01/ start
waiting for server to start....2021-10-19 10:12:40.383 CST [21812] LOG: starting PostgreSQL 13.4 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39), 64-bit
2021-10-19 10:12:40.385 CST [21812] LOG: listening on IPv6 address "::1", port 5432
2021-10-19 10:12:40.385 CST [21812] LOG: listening on IPv4 address "127.0.0.1", port 5432
2021-10-19 10:12:40.398 CST [21812] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
2021-10-19 10:12:40.413 CST [21813] LOG: database system was shut down at 2021-10-19 10:12:36 CST
2021-10-19 10:12:40.425 CST [21812] LOG: database system is ready to accept connections
done
server started
-bash-4.2$ 2021-10-19 10:12:42.173 CST [21849] LOG: started streaming WAL from primary at 0/22000000 on timeline 1
2021-10-19 10:12:50.309 CST [12427] LOG: received fast shutdown request
2021-10-19 10:12:50.338 CST [12427] LOG: aborting any active transactions
2021-10-19 10:12:50.345 CST [21849] FATAL: terminating walreceiver process due to administrator command
2021-10-19 10:12:50.391 CST [12445] LOG: shutting down
2021-10-19 10:12:50.453 CST [12427] LOG: database system is shut down
2021-10-19 10:12:53.108 CST [21951] LOG: standby "standby01" is now a synchronous standby with priority 1
2021-10-19 10:12:53.108 CST [21951] STATEMENT: START_REPLICATION 0/22000000 TIMELINE 1`
重新启动备:
`-bash-4.2$ pg_ctl -D /data/s01/ start
waiting for server to start....2021-10-19 10:12:52.986 CST [21945] LOG: starting PostgreSQL 13.4 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39), 64-bit
2021-10-19 10:12:52.986 CST [21945] LOG: listening on IPv6 address "::1", port 5433
2021-10-19 10:12:52.986 CST [21945] LOG: listening on IPv4 address "127.0.0.1", port 5433
2021-10-19 10:12:53.004 CST [21945] LOG: listening on Unix socket "/tmp/.s.PGSQL.5433"
2021-10-19 10:12:53.045 CST [21946] LOG: database system was shut down in recovery at 2021-10-19 10:12:50 CST
2021-10-19 10:12:53.049 CST [21946] LOG: entering standby mode
2021-10-19 10:12:53.077 CST [21946] LOG: consistent recovery state reached at 0/220000A0
2021-10-19 10:12:53.077 CST [21946] LOG: invalid record length at 0/220000A0: wanted 24, got 0
2021-10-19 10:12:53.078 CST [21945] LOG: database system is ready to accept read only connections
done
server started
-bash-4.2$ 2021-10-19 10:12:53.108 CST [21950] LOG: started streaming WAL from primary at 0/22000000 on timeline 1
2021-10-19 10:12:55.555 CST [21946] LOG: redo starts at 0/220000A0`
5,同步状态检查(sync)
select pg_is_in_recovery();--主库是false 备库是true
`postgres=# \x
Expanded display is on.
postgres=# select * from pg_stat_replication;
-[ RECORD 1 ]----+------------------------------
pid | 21951
usesysid | 10
usename | postgres
application_name | standby01
client_addr | 127.0.0.1
client_hostname |
client_port | 39472
backend_start | 2021-10-19 10:12:53.101559+08
backend_xmin |
state | streaming
sent_lsn | 0/220000D8
write_lsn | 0/220000D8
flush_lsn | 0/220000D8
replay_lsn | 0/220000D8
write_lag |
flush_lag |
replay_lag |
sync_priority | 1
sync_state | sync
reply_time | 2021-10-19 10:14:35.761522+08`
<!--sync: 表示为 同步流复制模式 potential:表示是一个潜在的同步 standby sync_priority: 优先级,该优先级的值是由 primary 上的 postgresql.conf 配置文件中的 synchronous_standby_names参数配置的顺序决定的。 -->
6,主备切换(模拟fail over)
主:
pg_ctl -D /data/db01/ stop
备:
01,手动激活备机
pg_ctl promote -D /data/s01/
`2021-10-24 17:29:33.933 CST [28938] LOG: received promote request
2021-10-24 17:29:33.933 CST [28938] LOG: redo done at 0/4014148
2021-10-24 17:29:33.933 CST [28938] LOG: last completed transaction was at log time 2021-10-24 17:27:26.280448+08
2021-10-24 17:29:33.954 CST [28938] LOG: selected new timeline ID: 2
2021-10-24 17:29:34.134 CST [28938] LOG: archive recovery complete
2021-10-24 17:29:34.166 CST [28936] LOG: database system is ready to accept connections`
pg_controldata /data/s01/|grep 'Database cluster state' Database cluster state: in production
02, trigger_file 参数,之后在备库主机上创建触发器文件.
#promote_trigger_file = '' # file name whose presence ends recovery
#检查到该文件,自动激活备机。
03,重新初始(增量同步,只同步变化的块,同sync)
#设置wal_log_hints这个参数使得能够记录特定提示位(hint-bit)的变化。
-- # pg_rewind -D /data/master --source-server='host=127.0.0.1 -p 5432 user=postgres‘ -P
备注:数据库在 initdb 时需要开启 checksums 或者设置 "wal_log_hints = on", 接着设置主,备节点的 wal_log_hints 参数并重启数据库。*initdb --data-checksums*
#更新了 postgresql.conf, postgresql.auto.conf,需要收到修改部分参数。
-- 重新同步命令
`主库上执行
bash-4.2$ pg_rewind -D /data/master --source-server='host=127.0.0.1 port=5433 user=postgres' -P
pg_rewind: connected to server
pg_rewind: servers diverged at WAL location 0/60025E8 on timeline 1
pg_rewind: rewinding from last common checkpoint at 0/5000060 on timeline 1
pg_rewind: reading source file list
pg_rewind: reading target file list
pg_rewind: reading WAL in target
pg_rewind: need to copy 67 MB (total source directory size is 87 MB)
69284/69284 kB (100%) copied
pg_rewind: creating backup label and updating control file
pg_rewind: syncing target data directory
pg_rewind: Done!
pg_rewind 成功。
7,复制槽(replication_slot)
因为pg在归档模式下,对于已经完成归档的wal日志会自动清理,所以提供了复制槽来避免主库在所有的备库收到 WAL 日志之前不会移除它们,并且主库也不会移除可能导致恢复冲突的行,即使备库断开也是如此。
#备机要配置后备机使用这个槽,在备库的recovery.conf中应该配置 primary_slot_name,例如: primary_conninfo = 'host=192.168.7.180 port=1921 user=bill password='xxx' primary_slot_name = 'node_a_slot'
常用参数
PostgreSQL 12对主从复制实现和配置做了重大改进,如废弃了recovery.conf
,并将参数转换为普通的PostgreSQL
配置参数,使得配置群集和复制更加简单。
ALTER SYSTEM SET
会将配置保存在一个postgresql.conf.auto
中,与postgresql.conf
并存,系统会优先使用.auto
配置。
hot_standby = on 备库是否可读。
ps -ef | grep wal wal 日志发送进程"wal sender process",说明是主库。 wal 日志接收进程"wal receiver process" ,说明是备库。
pg_controldata 输出数据库服务的当前状态,可以根据 "Database cluster state: " 的信息来判断,
如果值为 "in production" 说明是主库。
如果值为 "in archive recovery" 说明是备库。
`-bash-4.2$ pg_controldata -D /data/db01/ |grep "cluster state"
Database cluster state: in production
-bash-4.2$ pg_controldata -D /data/s01/ |grep "cluster state"
Database cluster state: in archive recovery`
--操作系统命令,判断发送进程还是接收进程
ps -ef | grep ``"wal"` `| grep -v ``"grep"
--数据查看
select` `pid,usename,application_name,client_addr,state,sync_state ``from` `pg_stat_replication;
--or 备库
select` `pid,status,last_msg_send_time,last_msg_receipt_time,conninfo ``from` `pg_stat_wal_receiver;
--系统函数查看
select pg_is_in_recovery();
standby.signal
编辑standby.signal文件(就在数据文件夹内,以此标识从节点,当从节点提升为主节点后会自动删除)
#
建表命令:
create table t1 (id int);
insert into t1 select from generate_series(1,10);
- 点赞
- 收藏
- 关注作者
评论(0)