PostgreSQL pg_resetxlog整理及测试

文章来源：https://yq.aliyun.com/articles/55691 pg_resetxlog说明 pg_resetxlog，用来重置/清空一个数据库集群的预写日志以及其它控制内容，其中控制内容由命令pg_controldata可以查看，而内容的来源则是位于$PGDATA/global目录下名为pg_control的控制文件可选参数有： yunbodeMacBook-Pro:~ postgres$ pg_resetxlog --help pg_resetxlog resets the PostgreSQL transaction log. Usage: pg_resetxlog [OPTION]... DATADIR Options: -e XIDEPOCH set next transaction ID epoch -f force update to be done -l XLOGFILE force minimum WAL starting location for new transaction log -m MXID,MXID set next and oldest multitransaction ID -n no update, just show what would be done (for testing) -o OID set next OID -O OFFSET set next multitransaction offset -V, --version output version information, then exit -x XID set next transaction ID -?, --help show this help, then exit 具体含义： -l timelineid,fileid,seg 为新的事务日志指定最小的WAL起始位置，应该比当前存在于pg_xlog中任何一个WAL日志文件名都要大。名字以十六进制表示且分为三个部分，第一部分是时间线，一般保持该部分值不变。第三部分值不能超过255，即0xFF，如果是该值，则将第二部分加1，第三部分变为0。举例来说，如果00000001000000320000004A是最大的文件名，则-l后跟的参数为0x1,0x32,0x4B；但是，如果最大的文件名为000000010000003A000000FF，则选择-l 0x1,0x3B,0x0 The WAL starting address (-l) should be larger than any WAL segment file name currently existing in the directory pg_xlog under the data directory. These names are also in hexadecimal and have three parts. The first part is the “timeline ID” and should usually be kept the same. Do not choose a value larger than 255 (0xFF) for the third part; instead increment the second part and reset the third part to 0. For example, if 00000001000000320000004A is the largest entry in pg_xlog, -l 0x1,0x32,0x4B will work; but if the largest entry is 000000010000003A000000FF, choose -l 0x1,0x3B,0x0 or more. Note pg_resetxlog itself looks at the files in pg_xlog and chooses a default -l setting beyond the last existing file name. Therefore, manual adjustment of -l should only be needed if you are aware of WAL segment files that are not currently present in pg_xlog, such as entries in an offline archive; or if the contents of pg_xlog have been lost entirely. -e XIDEPOCH 关于transaction ID epoch的信息并不会存储在数据库的任何地方，除非要在pg_resetxlog中指明。所以，可以赋予任意值，但对于复制系统，例如Slony-I，则可以从从节点中获取，以使复制能够正常工作。 The transaction ID epoch is not actually stored anywhere in the database except in the field that is set by pg_resetxlog, so any value will work so far as the database itself is concerned. You might need to adjust this value to ensure that replication systems such as Slony-I work correctly - if so, an appropriate value should be obtainable from the state of the downstream replicated database. -x XID 一个安全的设置值应该是由pg_clog下最大的文件名，然后加1，再乘以1048576得出。注意，文件名和参数都应是十六进制的格式。例如0011是最大的文件名，则-x后应该跟上0x1200000，即-x 0x1200000 A safe value for the next transaction ID (-x) can be determined by looking for the numerically largest file name in the directory pg_clog under the data directory, adding one, and then multiplying by 1048576. Note that the file names are in hexadecimal. It is usually easiest to specify the option value in hexadecimal For example, if 0011 is the largest entry in pg_clog, -x 0x1200000 will work (five trailing zeroes the proper multiplier). -m XID 一个安全的设置值应该是由pg_multixact/offsets下最大的文件名，然后加1，再乘以65536得出。文件名和参数都应是十六进制的格式。如果pg_multixact/offsets目录下不存在任何数据，则得出-m 0x10000 A safe value for the next multitransaction ID (-m) can be determined by looking for the numerically largest file name in the directory pg_multixact/offsets under the data directory, adding one, and then multiplying by 65536. As above, the file names are in hexadecimal, so the easiest way to do this is to specify the option value in hexadecimal and add four zeroes. -O OFFSET 一个安全的设置值应该是由pg_multixact/members下最大的文件名，然后加1，再乘以65536得出。文件名和参数都应是十六进制的格式。如果pg_multixact/members-m 0x10000 A safe value for the next multitransaction offset (-O) can be determined by looking for the numerically largest file name in the directory pg_multixact/members under the data directory, adding one, and then multiplying by 65536. As above, the file names are in hexadecimal, so the easiest way to do this is to specify the option value in hexadecimal and add four zeroes. -o OID 一般情况下，很难获得该值，但该值的正确性与否并不是十分的总要。 There is no comparably easy way to determine a next OID that's beyond the largest one in the database, but fortunately it is not critical to get the next-OID setting right. -n 不做更新，只是显示抽取出来的信息，做测试用。 no update, just show extracted control values (for testing) -f force 强制重置WAL日志和控制信息 pg_resetxlog使用模拟在执行大的插入数据的动作的同时，使用kill -9终止postmaster和连接进程，同时清空pg_xlog目录下的所有文件。试验前，先查看pg_control中的内容信息： pg_control version number: 942 Catalog version number: 201409291 Database system identifier: 6264417882840803310 Database cluster state: in production pg_control last modified: Wed May 18 14:58:31 2016 Latest checkpoint location: 2/4EBD3538 Prior checkpoint location: 2/4BB504A8 Latest checkpoint's REDO location: 2/4D3C6D30 Latest checkpoint's REDO WAL file: 00000001000000020000004D Latest checkpoint's TimeLineID: 1 Latest checkpoint's PrevTimeLineID: 1 Latest checkpoint's full_page_writes: on Latest checkpoint's NextXID: 0/676247 Latest checkpoint's NextOID: 68691 Latest checkpoint's NextMultiXactId: 1 Latest checkpoint's NextMultiOffset: 0 Latest checkpoint's oldestXID: 991 Latest checkpoint's oldestXID's DB: 1 Latest checkpoint's oldestActiveXID: 0 Latest checkpoint's oldestMultiXid: 1 Latest checkpoint's oldestMulti's DB: 1 Time of latest checkpoint: Wed May 18 14:58:30 2016 Fake LSN counter for unlogged rels: 0/1 Minimum recovery ending location: 0/0 Min recovery ending loc's timeline: 0 Backup start location: 0/0 Backup end location: 0/0 End-of-backup record required: no Current wal_level setting: minimal Current wal_log_hints setting: off Current max_connections setting: 100 Current max_worker_processes setting: 8 Current max_prepared_xacts setting: 0 Current max_locks_per_xact setting: 64 Maximum data alignment: 8 Database block size: 8192 Blocks per segment of large relation: 131072 WAL block size: 8192 Bytes per WAL segment: 16777216 Maximum length of identifiers: 64 Maximum columns in an index: 32 Maximum size of a TOAST chunk: 1996 Size of a large-object chunk: 2048 Date/time type storage: 64-bit integers Float4 argument passing: by value Float8 argument passing: by value Data page checksum version: 0 窗口A： test=# create table qt (id integer,vname text); test=# insert into qt select generate_series(1,10000000), generate_series(1,10000000)||'insert_test'; 该命令会执行一段时间，在插入操作执行的同时，在窗口B执行： yunbodeMacBook-Pro:pg_xlog postgres$ ps -ef|grep postgres 502 26703 26702 0 2:48PM ?? 0:00.00 postgres: logger process 502 26705 26702 0 2:48PM ?? 0:00.37 postgres: checkpointer process 502 26706 26702 0 2:48PM ?? 0:00.04 postgres: writer process 502 26707 26702 0 2:48PM ?? 0:00.01 postgres: wal writer process 502 26708 26702 0 2:48PM ?? 0:00.01 postgres: autovacuum launcher process 502 26709 26702 0 2:48PM ?? 0:00.08 postgres: stats collector process 502 26717 26702 0 2:49PM ?? 0:00.46 postgres: postgres test [local] idle 0 21731 21334 0 Mon05PM ttys002 0:00.02 sudo su - postgres 0 21732 21731 0 Mon05PM ttys002 0:00.01 su - postgres 502 26716 21733 0 2:49PM ttys002 0:00.01 psql -U postgres -d test 0 24267 24253 0 5:55PM ttys004 0:00.02 sudo su - postgres 0 24268 24267 0 5:55PM ttys004 0:00.01 su - postgres 502 26702 1 0 2:48PM ttys004 0:00.05 /Library/PostgreSQL/9.4/bin/postgres -D /Library/PostgreSQL/9.4/data 502 26746 24269 0 2:58PM ttys004 0:00.00 grep postgres yunbodeMacBook-Pro:pg_xlog postgres$ kill -9 26702 26717 此时在窗口A会出现如下错误： server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. The connection to the server was lost. Attempting reset: Failed. 回到窗口B： yunbodeMacBook-Pro:pg_xlog postgres$ ll total 294912 -rw------- 1 postgres daemon 16777216 May 18 14:58 00000001000000020000004F -rw------- 1 postgres daemon 16777216 May 18 14:58 000000010000000200000050 -rw------- 1 postgres daemon 16777216 May 18 14:58 00000001000000020000004D -rw------- 1 postgres daemon 16777216 May 18 14:58 00000001000000020000004E -rw------- 1 postgres daemon 16777216 May 18 14:58 00000001000000020000004B -rw------- 1 postgres daemon 16777216 May 18 14:58 00000001000000020000004C -rw------- 1 postgres daemon 16777216 May 18 14:58 00000001000000020000004A -rw------- 1 postgres daemon 16777216 May 18 14:58 000000010000000200000052 -rw------- 1 postgres daemon 16777216 May 18 14:58 000000010000000200000051 drwx------ 2 postgres daemon 68 May 17 17:58 archive_status yunbodeMacBook-Pro:pg_xlog postgres$ mv 0000000100000002000000* /tmp yunbodeMacBook-Pro:pg_xlog postgres$ ll total 0 drwx------ 2 postgres daemon 68 May 17 17:58 archive_status xlog已经被清空，此时尝试重启服务： yunbodeMacBook-Pro:pg_xlog postgres$ pg_ctl start -D /Library/PostgreSQL/9.4/data/ pg_ctl: another server might be running; trying to start server anyway server starting yunbodeMacBook-Pro:pg_xlog postgres$ % FATAL: pre-existing shared memory block (key 5432001, ID 3473409) is still in use % HINT: If you're sure there are no old server processes still running, remove the shared memory block or just delete the file "postmaster.pid". 因为非正常关闭PostgreSQL，所以进程文件没有被清理，手工清理pid文件，再重新尝试启动： yunbodeMacBook-Pro:pg_xlog postgres$ rm ../postmaster.pid yunbodeMacBook-Pro:pg_xlog postgres$ pg_ctl start -D /Library/PostgreSQL/9.4/data/ server starting yunbodeMacBook-Pro:pg_xlog postgres$ % LOG: redirecting log output to logging collector process % HINT: Future log output will appear in directory "pg_log". 查看日志文件，发现如下报错： yunbodeMacBook-Pro:pg_xlog postgres$ cat ../pg_log/postgresql-2016-05-18_145935.csv 2016-05-18 14:59:35.491 CST,,,26761,,573c12d7.6889,1,,2016-05-18 14:59:35 CST,,0,LOG,00000,"ending log output to stderr",,"Future log output will go to log destination ""csvlog"".",,,,,,,"" 2016-05-18 14:59:35.493 CST,,,26763,,573c12d7.688b,1,,2016-05-18 14:59:35 CST,,0,LOG,00000,"database system was interrupted; last known up at 2016-05-18 14:58:31 CST",,,,,,,,,"" 2016-05-18 14:59:35.509 CST,,,26763,,573c12d7.688b,2,,2016-05-18 14:59:35 CST,,0,LOG,00000,"invalid primary checkpoint record",,,,,,,,,"" 2016-05-18 14:59:35.509 CST,,,26763,,573c12d7.688b,3,,2016-05-18 14:59:35 CST,,0,LOG,00000,"invalid secondary checkpoint record",,,,,,,,,"" 2016-05-18 14:59:35.509 CST,,,26763,,573c12d7.688b,4,,2016-05-18 14:59:35 CST,,0,PANIC,XX000,"could not locate a valid checkpoint record",,,,,,,,,"" 2016-05-18 14:59:37.282 CST,,,26761,,573c12d7.6889,2,,2016-05-18 14:59:35 CST,,0,LOG,00000,"startup process (PID 26763) was terminated by signal 6: Abort trap",,,,,,,,,"" 2016-05-18 14:59:37.282 CST,,,26761,,573c12d7.6889,3,,2016-05-18 14:59:35 CST,,0,LOG,00000,"aborting startup due to startup process failure",,,,,,,,,"" 此时没有办法定位到有效的检查点，而且WAL日志完全丢失，且没有任何归档，此时只能通过重置xlog的方式来恢复数据库： yunbodeMacBook-Pro:pg_xlog postgres$ pg_resetxlog /Library/PostgreSQL/9.4/data/ The database server was not shut down cleanly. Resetting the transaction log might cause data to be lost. If you want to proceed anyway, use -f to force reset. yunbodeMacBook-Pro:pg_xlog postgres$ pg_resetxlog -f /Library/PostgreSQL/9.4/data/ Transaction log reset 接下来重新尝试启动数据库： yunbodeMacBook-Pro:pg_xlog postgres$ pg_ctl start -D /Library/PostgreSQL/9.4/data/ server starting yunbodeMacBook-Pro:pg_xlog postgres$ % LOG: redirecting log output to logging collector process % HINT: Future log output will appear in directory "pg_log". 查看日志，确认是否已经启动成功： yunbodeMacBook-Pro:pg_xlog postgres$ cat ../pg_log/postgresql-2016-05-18_150116.csv 2016-05-18 15:01:16.588 CST,,,26777,,573c133c.6899,1,,2016-05-18 15:01:16 CST,,0,LOG,00000,"ending log output to stderr",,"Future log output will go to log destination ""csvlog"".",,,,,,,"" 2016-05-18 15:01:16.589 CST,,,26779,,573c133c.689b,1,,2016-05-18 15:01:16 CST,,0,LOG,00000,"database system was shut down at 2016-05-18 15:00:26 CST",,,,,,,,,"" 2016-05-18 15:01:16.590 CST,,,26779,,573c133c.689b,2,,2016-05-18 15:01:16 CST,,0,LOG,00000,"MultiXact member wraparound protections are now enabled",,,,,,,,,"" 2016-05-18 15:01:16.592 CST,,,26777,,573c133c.6899,2,,2016-05-18 15:01:16 CST,,0,LOG,00000,"database system is ready to accept connections",,,,,,,,,"" 2016-05-18 15:01:16.592 CST,,,26783,,573c133c.689f,1,,2016-05-18 15:01:16 CST,,0,LOG,00000,"autovacuum launcher started",,,,,,,,,"" 数据库已经启动成功，可以对比前后pg_controldata输出： pg_control version number: 942 Catalog version number: 201409291 Database system identifier: 6264417882840803310 Database cluster state: in production pg_control last modified: Wed May 18 15:01:16 2016 Latest checkpoint location: 2/4E000028 Prior checkpoint location: 0/0 Latest checkpoint's REDO location: 2/4E000028 Latest checkpoint's REDO WAL file: 00000001000000020000004E Latest checkpoint's TimeLineID: 1 Latest checkpoint's PrevTimeLineID: 1 Latest checkpoint's full_page_writes: on Latest checkpoint's NextXID: 0/676247 Latest checkpoint's NextOID: 68691 Latest checkpoint's NextMultiXactId: 1 Latest checkpoint's NextMultiOffset: 0 Latest checkpoint's oldestXID: 991 Latest checkpoint's oldestXID's DB: 1 Latest checkpoint's oldestActiveXID: 0 Latest checkpoint's oldestMultiXid: 1 Latest checkpoint's oldestMulti's DB: 1 Time of latest checkpoint: Wed May 18 15:00:26 2016 Fake LSN counter for unlogged rels: 0/1 Minimum recovery ending location: 0/0 Min recovery ending loc's timeline: 0 Backup start location: 0/0 Backup end location: 0/0 End-of-backup record required: no Current wal_level setting: minimal Current wal_log_hints setting: off Current max_connections setting: 100 Current max_worker_processes setting: 8 Current max_prepared_xacts setting: 0 Current max_locks_per_xact setting: 64 Maximum data alignment: 8 Database block size: 8192 Blocks per segment of large relation: 131072 WAL block size: 8192 Bytes per WAL segment: 16777216 Maximum length of identifiers: 64 Maximum columns in an index: 32 Maximum size of a TOAST chunk: 1996 Size of a large-object chunk: 2048 Date/time type storage: 64-bit integers Float4 argument passing: by value Float8 argument passing: by value Data page checksum version: 0 判断数据丢失量根据pg_controldata输出结果中Latest checkpoint location的输出，可以查看重置前后，WAL位置的变化，通过函数pg_xlog_location_diff()可以确认数据的丢失情况。查看两个事务日志位置之间相差字节数： select pg_xlog_location_diff('1/911974A8','1/91197440'); pg_xlog_location_diff ----------------------- 104 对于正在运行对PG实例，查看当前正在写入的xlog文件名： select pg_xlogfile_name(pg_current_xlog_location()); --pg_current_xlog_location()函数显示当前事务日志的写位置