PostgreSQL pg_resetxlog整理及测试

文章来源:https://yq.aliyun.com/articles/55691

pg_resetxlog说明

pg_resetxlog,用来重置/清空一个数据库集群的预写日志以及其它控制内容,其中控制内容由命令pg_controldata可以查看,而内容的来源则是位于$PGDATA/global目录下名为pg_control的控制文件

可选参数有:

 

yunbodeMacBook-Pro:~ postgres$ pg_resetxlog –help
pg_resetxlog resets the PostgreSQL transaction log.

Usage:
pg_resetxlog [OPTION]… DATADIR

Options:
-e XIDEPOCH set next transaction ID epoch
-f force update to be done
-l XLOGFILE force minimum WAL starting location for new transaction log
-m MXID,MXID set next and oldest multitransaction ID
-n no update, just show what would be done (for testing)
-o OID set next OID
-O OFFSET set next multitransaction offset
-V, –version output version information, then exit
-x XID set next transaction ID
-?, –help show this help, then exit

具体含义:

 

-l timelineid,fileid,seg

 

 

为新的事务日志指定最小的WAL起始位置,应该比当前存在于pg_xlog中任何一个WAL日志文件名都要大。
名字以十六进制表示且分为三个部分,第一部分是时间线,一般保持该部分值不变。第三部分值不能超过255,即0xFF,如果是该值,则将第二部分加1,第三部分变为0。举例来说,如果00000001000000320000004A是最大的文件名,则-l后跟的参数为0x1,0x32,0x4B;但是,如果最大的文件名为000000010000003A000000FF,则选择-l 0x1,0x3B,0x0
The WAL starting address (-l) should be larger than any WAL segment file name currently existing in the
directory pg_xlog under the data directory. These names are also in hexadecimal and have three parts. The
first part is the “timeline ID” and should usually be kept the same. Do not choose a value larger than 255
(0xFF) for the third part; instead increment the second part and reset the third part to 0. For example, if
00000001000000320000004A is the largest entry in pg_xlog, -l 0x1,0x32,0x4B will work; but if the largest
entry is 000000010000003A000000FF, choose -l 0x1,0x3B,0x0 or more.

 

Note

pg_resetxlog itself looks at the files in pg_xlog and chooses a default -l setting beyond the last
existing file name. Therefore, manual adjustment of -l should only be needed if you are aware of WAL
segment files that are not currently present in pg_xlog, such as entries in an offline archive; or if
the contents of pg_xlog have been lost entirely.

-e XIDEPOCH

 

 

关于transaction ID epoch的信息并不会存储在数据库的任何地方,除非要在pg_resetxlog中指明。所以,可以赋予任意值,但对于复制系统,例如Slony-I,则可以从从节点中获取,以使复制能够正常工作。
The transaction ID epoch is not actually stored anywhere in the database except in the field that is set by
pg_resetxlog, so any value will work so far as the database itself is concerned. You might need to adjust
this value to ensure that replication systems such as Slony-I work correctly – if so, an appropriate value
should be obtainable from the state of the downstream replicated database.

 

-x XID

 

一个安全的设置值应该是由pg_clog下最大的文件名,然后加1,再乘以1048576得出。注意,文件名和参数都应是十六进制的格式。例如0011是最大的文件名,则-x后应该跟上0x1200000,即-x 0x1200000

 

A safe value for the next transaction ID (-x) can be determined by looking for the numerically largest file
name in the directory pg_clog under the data directory, adding one, and then multiplying by 1048576. Note
that the file names are in hexadecimal. It is usually easiest to specify the option value in hexadecimal

For example, if 0011 is the largest entry in pg_clog, -x 0x1200000 will work (five trailing zeroes
the proper multiplier).

 

-m XID

 

 

一个安全的设置值应该是由pg_multixact/offsets下最大的文件名,然后加1,再乘以65536得出。文件名和参数都应是十六进制的格式。如果pg_multixact/offsets目录下不存在任何数据,

 

则得出-m 0x10000

 

A safe value for the next multitransaction ID (-m) can be determined by looking for the numerically largest
file name in the directory pg_multixact/offsets under the data directory, adding one, and then multiplying
by 65536. As above, the file names are in hexadecimal, so the easiest way to do this is to specify the
option value in hexadecimal and add four zeroes.

 

-O OFFSET

 

一个安全的设置值应该是由pg_multixact/members下最大的文件名,然后加1,再乘以65536得出。文件名和参数都应是十六进制的格式。如果pg_multixact/members-m 0x10000

 

A safe value for the next multitransaction offset (-O) can be determined by looking for the numerically
largest file name in the directory pg_multixact/members under the data directory, adding one, and then
multiplying by 65536. As above, the file names are in hexadecimal, so the easiest way to do this is to
specify the option value in hexadecimal and add four zeroes.

 

-o OID

 

一般情况下,很难获得该值,但该值的正确性与否并不是十分的总要。

 

There is no comparably easy way to determine a next OID that’s beyond the largest one in the database, but
fortunately it is not critical to get the next-OID setting right.

-n

不做更新,只是显示抽取出来的信息,做测试用。
no update, just show extracted control values (for testing)

-f
force
强制重置WAL日志和控制信息

pg_resetxlog使用
模拟在执行大的插入数据的动作的同时,使用kill -9终止postmaster和连接进程,同时清空pg_xlog目录下的所有文件。
试验前,先查看pg_control中的内容信息:
pg_control version number: 942
Catalog version number: 201409291
Database system identifier: 6264417882840803310
Database cluster state: in production
pg_control last modified: Wed May 18 14:58:31 2016
Latest checkpoint location: 2/4EBD3538
Prior checkpoint location: 2/4BB504A8
Latest checkpoint’s REDO location: 2/4D3C6D30
Latest checkpoint’s REDO WAL file: 00000001000000020000004D
Latest checkpoint’s TimeLineID: 1
Latest checkpoint’s PrevTimeLineID: 1
Latest checkpoint’s full_page_writes: on
Latest checkpoint’s NextXID: 0/676247
Latest checkpoint’s NextOID: 68691
Latest checkpoint’s NextMultiXactId: 1
Latest checkpoint’s NextMultiOffset: 0
Latest checkpoint’s oldestXID: 991
Latest checkpoint’s oldestXID’s DB: 1
Latest checkpoint’s oldestActiveXID: 0
Latest checkpoint’s oldestMultiXid: 1
Latest checkpoint’s oldestMulti’s DB: 1
Time of latest checkpoint: Wed May 18 14:58:30 2016
Fake LSN counter for unlogged rels: 0/1
Minimum recovery ending location: 0/0
Min recovery ending loc’s timeline: 0
Backup start location: 0/0
Backup end location: 0/0
End-of-backup record required: no
Current wal_level setting: minimal
Current wal_log_hints setting: off
Current max_connections setting: 100
Current max_worker_processes setting: 8
Current max_prepared_xacts setting: 0
Current max_locks_per_xact setting: 64
Maximum data alignment: 8
Database block size: 8192
Blocks per segment of large relation: 131072
WAL block size: 8192
Bytes per WAL segment: 16777216
Maximum length of identifiers: 64
Maximum columns in an index: 32
Maximum size of a TOAST chunk: 1996
Size of a large-object chunk: 2048
Date/time type storage: 64-bit integers
Float4 argument passing: by value
Float8 argument passing: by value
Data page checksum version: 0

 

窗口A:

 

test=# create table qt (id integer,vname text);
test=# insert into qt select generate_series(1,10000000), generate_series(1,10000000)||’insert_test’;

 

该命令会执行一段时间,在插入操作执行的同时,在窗口B执行:

yunbodeMacBook-Pro:pg_xlog postgres$ ps -ef|grep postgres
502 26703 26702 0 2:48PM ?? 0:00.00 postgres: logger process
502 26705 26702 0 2:48PM ?? 0:00.37 postgres: checkpointer process
502 26706 26702 0 2:48PM ?? 0:00.04 postgres: writer process
502 26707 26702 0 2:48PM ?? 0:00.01 postgres: wal writer process
502 26708 26702 0 2:48PM ?? 0:00.01 postgres: autovacuum launcher process
502 26709 26702 0 2:48PM ?? 0:00.08 postgres: stats collector process
502 26717 26702 0 2:49PM ?? 0:00.46 postgres: postgres test [local] idle

0 21731 21334 0 Mon05PM ttys002 0:00.02 sudo su – postgres
0 21732 21731 0 Mon05PM ttys002 0:00.01 su – postgres
502 26716 21733 0 2:49PM ttys002 0:00.01 psql -U postgres -d test

0 24267 24253 0 5:55PM ttys004 0:00.02 sudo su – postgres
0 24268 24267 0 5:55PM ttys004 0:00.01 su – postgres
502 26702 1 0 2:48PM ttys004 0:00.05 /Library/PostgreSQL/9.4/bin/postgres -D /Library/PostgreSQL/9.4/data
502 26746 24269 0 2:58PM ttys004 0:00.00 grep postgres
yunbodeMacBook-Pro:pg_xlog postgres$ kill -9 26702 26717

 

 

此时在窗口A会出现如下错误:

server closed the connection unexpectedly

This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.

回到窗口B:
yunbodeMacBook-Pro:pg_xlog postgres$ ll
total 294912
-rw——- 1 postgres daemon 16777216 May 18 14:58 00000001000000020000004F
-rw——- 1 postgres daemon 16777216 May 18 14:58 000000010000000200000050
-rw——- 1 postgres daemon 16777216 May 18 14:58 00000001000000020000004D
-rw——- 1 postgres daemon 16777216 May 18 14:58 00000001000000020000004E
-rw——- 1 postgres daemon 16777216 May 18 14:58 00000001000000020000004B
-rw——- 1 postgres daemon 16777216 May 18 14:58 00000001000000020000004C
-rw——- 1 postgres daemon 16777216 May 18 14:58 00000001000000020000004A
-rw——- 1 postgres daemon 16777216 May 18 14:58 000000010000000200000052
-rw——- 1 postgres daemon 16777216 May 18 14:58 000000010000000200000051
drwx—— 2 postgres daemon 68 May 17 17:58 archive_status

 

 

yunbodeMacBook-Pro:pg_xlog postgres$ mv 0000000100000002000000* /tmp
yunbodeMacBook-Pro:pg_xlog postgres$ ll
total 0
drwx—— 2 postgres daemon 68 May 17 17:58 archive_status

xlog已经被清空,此时尝试重启服务:

 

yunbodeMacBook-Pro:pg_xlog postgres$ pg_ctl start -D /Library/PostgreSQL/9.4/data/
pg_ctl: another server might be running; trying to start server anyway
server starting
yunbodeMacBook-Pro:pg_xlog postgres$ % FATAL: pre-existing shared memory block (key 5432001, ID 3473409) is still in use
% HINT: If you’re sure there are no old server processes still running, remove the shared memory block or just delete the file “postmaster.pid”.

 

 

因为非正常关闭PostgreSQL,所以进程文件没有被清理,手工清理pid文件,再重新尝试启动:

 

yunbodeMacBook-Pro:pg_xlog postgres$ rm ../postmaster.pid
yunbodeMacBook-Pro:pg_xlog postgres$ pg_ctl start -D /Library/PostgreSQL/9.4/data/
server starting
yunbodeMacBook-Pro:pg_xlog postgres$ % LOG: redirecting log output to logging collector process
% HINT: Future log output will appear in directory “pg_log”.

 

 

查看日志文件,发现如下报错:

 

yunbodeMacBook-Pro:pg_xlog postgres$ cat ../pg_log/postgresql-2016-05-18_145935.csv
2016-05-18 14:59:35.491 CST,,,26761,,573c12d7.6889,1,,2016-05-18 14:59:35 CST,,0,LOG,00000,”ending log output to stderr”,,”Future log output will go to log destination “”csvlog””.”,,,,,,,””
2016-05-18 14:59:35.493 CST,,,26763,,573c12d7.688b,1,,2016-05-18 14:59:35 CST,,0,LOG,00000,”database system was interrupted; last known up at 2016-05-18 14:58:31 CST”,,,,,,,,,””
2016-05-18 14:59:35.509 CST,,,26763,,573c12d7.688b,2,,2016-05-18 14:59:35 CST,,0,LOG,00000,”invalid primary checkpoint record”,,,,,,,,,””
2016-05-18 14:59:35.509 CST,,,26763,,573c12d7.688b,3,,2016-05-18 14:59:35 CST,,0,LOG,00000,”invalid secondary checkpoint record”,,,,,,,,,””
2016-05-18 14:59:35.509 CST,,,26763,,573c12d7.688b,4,,2016-05-18 14:59:35 CST,,0,PANIC,XX000,”could not locate a valid checkpoint record”,,,,,,,,,””
2016-05-18 14:59:37.282 CST,,,26761,,573c12d7.6889,2,,2016-05-18 14:59:35 CST,,0,LOG,00000,”startup process (PID 26763) was terminated by signal 6: Abort trap”,,,,,,,,,””
2016-05-18 14:59:37.282 CST,,,26761,,573c12d7.6889,3,,2016-05-18 14:59:35 CST,,0,LOG,00000,”aborting startup due to startup process failure”,,,,,,,,,””

 

 

此时没有办法定位到有效的检查点,而且WAL日志完全丢失,且没有任何归档,此时只能通过重置xlog的方式来恢复数据库:

 

yunbodeMacBook-Pro:pg_xlog postgres$ pg_resetxlog /Library/PostgreSQL/9.4/data/
The database server was not shut down cleanly.
Resetting the transaction log might cause data to be lost.
If you want to proceed anyway, use -f to force reset.
yunbodeMacBook-Pro:pg_xlog postgres$ pg_resetxlog -f /Library/PostgreSQL/9.4/data/
Transaction log reset

 

接下来重新尝试启动数据库:

 

yunbodeMacBook-Pro:pg_xlog postgres$ pg_ctl start -D /Library/PostgreSQL/9.4/data/
server starting
yunbodeMacBook-Pro:pg_xlog postgres$ % LOG: redirecting log output to logging collector process
% HINT: Future log output will appear in directory “pg_log”.

 

 

查看日志,确认是否已经启动成功:

 

yunbodeMacBook-Pro:pg_xlog postgres$ cat ../pg_log/postgresql-2016-05-18_150116.csv
2016-05-18 15:01:16.588 CST,,,26777,,573c133c.6899,1,,2016-05-18 15:01:16 CST,,0,LOG,00000,”ending log output to stderr”,,”Future log output will go to log destination “”csvlog””.”,,,,,,,””
2016-05-18 15:01:16.589 CST,,,26779,,573c133c.689b,1,,2016-05-18 15:01:16 CST,,0,LOG,00000,”database system was shut down at 2016-05-18 15:00:26 CST”,,,,,,,,,””
2016-05-18 15:01:16.590 CST,,,26779,,573c133c.689b,2,,2016-05-18 15:01:16 CST,,0,LOG,00000,”MultiXact member wraparound protections are now enabled”,,,,,,,,,””
2016-05-18 15:01:16.592 CST,,,26777,,573c133c.6899,2,,2016-05-18 15:01:16 CST,,0,LOG,00000,”database system is ready to accept connections”,,,,,,,,,””
2016-05-18 15:01:16.592 CST,,,26783,,573c133c.689f,1,,2016-05-18 15:01:16 CST,,0,LOG,00000,”autovacuum launcher started”,,,,,,,,,””

 

 

数据库已经启动成功,可以对比前后pg_controldata输出:

 

pg_control version number: 942
Catalog version number: 201409291
Database system identifier: 6264417882840803310
Database cluster state: in production
pg_control last modified: Wed May 18 15:01:16 2016
Latest checkpoint location: 2/4E000028
Prior checkpoint location: 0/0
Latest checkpoint’s REDO location: 2/4E000028
Latest checkpoint’s REDO WAL file: 00000001000000020000004E
Latest checkpoint’s TimeLineID: 1
Latest checkpoint’s PrevTimeLineID: 1
Latest checkpoint’s full_page_writes: on
Latest checkpoint’s NextXID: 0/676247
Latest checkpoint’s NextOID: 68691
Latest checkpoint’s NextMultiXactId: 1
Latest checkpoint’s NextMultiOffset: 0
Latest checkpoint’s oldestXID: 991
Latest checkpoint’s oldestXID’s DB: 1
Latest checkpoint’s oldestActiveXID: 0
Latest checkpoint’s oldestMultiXid: 1
Latest checkpoint’s oldestMulti’s DB: 1
Time of latest checkpoint: Wed May 18 15:00:26 2016
Fake LSN counter for unlogged rels: 0/1
Minimum recovery ending location: 0/0
Min recovery ending loc’s timeline: 0
Backup start location: 0/0
Backup end location: 0/0
End-of-backup record required: no
Current wal_level setting: minimal
Current wal_log_hints setting: off
Current max_connections setting: 100
Current max_worker_processes setting: 8
Current max_prepared_xacts setting: 0
Current max_locks_per_xact setting: 64
Maximum data alignment: 8
Database block size: 8192
Blocks per segment of large relation: 131072
WAL block size: 8192
Bytes per WAL segment: 16777216
Maximum length of identifiers: 64
Maximum columns in an index: 32
Maximum size of a TOAST chunk: 1996
Size of a large-object chunk: 2048
Date/time type storage: 64-bit integers
Float4 argument passing: by value
Float8 argument passing: by value
Data page checksum version: 0

 

判断数据丢失量

根据pg_controldata输出结果中Latest checkpoint location的输出,可以查看重置前后,WAL位置的变化,通过函数pg_xlog_location_diff()可以确认数据的丢失情况。
查看两个事务日志位置之间相差字节数:

 

 

select pg_xlog_location_diff(‘1/911974A8′,’1/91197440’);
pg_xlog_location_diff
———————–
104

 

对于正在运行对PG实例,查看当前正在写入的xlog文件名:

select pg_xlogfile_name(pg_current_xlog_location()); –pg_current_xlog_location()函数显示当前事务日志的写位置


Posted

in

by

Tags:

Comments

Leave a Reply

Your email address will not be published. Required fields are marked *