WAL file naming sequence definition - Mailing list pgsql-hackers
From | Andrew Hammond |
---|---|
Subject | WAL file naming sequence definition |
Msg-id | 5a0a9d6f0805141425h6ffff039j414dbeb3c77ef0b1@mail.gmail.com |
List | pgsql-hackers |
I'd like confirmation of how WAL files are named. I'm trying to write a tool which can tell me when we are missing a WAL file from the sequence. I initially thought that the file names were monotonically incrementing hexadecimal numbers. This doesn't appear to be the case.

00000001000001B7000000FD
00000001000001B7000000FE
(there seem to be a whole bunch of missing filenames in the sequence here)
00000001000001B800000000
00000001000001B800000001

This pattern repeats. I hunted through the code and discovered the following in src/include/access/xlog_internal.h.

#define XLogFilePath(path, tli, log, seg) \
    snprintf(path, MAXPGPATH, XLOGDIR "/%08X%08X%08X", tli, log, seg)

So, the names are not a single hexadecimal number, but instead three of them concatenated together. This macro is used eight times in src/backend/access/xlog.c. It seems clear that the first number, tli, is a TimeLineID. I wasn't completely clear on the behavior of log and seg until I found the following, also in xlog_internal.h.

#define NextLogSeg(logId, logSeg) \
    do { \
        if ((logSeg) >= XLogSegsPerFile-1) \
        { \
            (logId)++; \
            (logSeg) = 0; \
        } \
        else \
            (logSeg)++; \
    } while (0)

So, clearly seg increments until it reaches XLogSegsPerFile - 1, at which point log increments and seg resets to 0. Again, xlog_internal.h knows what XLogSegsPerFile is.

/*
 * We break each logical log file (xlogid value) into segment files of the
 * size indicated by XLOG_SEG_SIZE. One possible segment at the end of each
 * log file is wasted, to ensure that we don't have problems representing
 * last-byte-position-plus-1.
 */
#define XLogSegSize ((uint32) XLOG_SEG_SIZE)
#define XLogSegsPerFile (((uint32) 0xffffffff) / XLogSegSize)

In src/include/pg_config.h.in, I see

/* XLOG_SEG_SIZE is the size of a single WAL file. This must be a power of 2
   and larger than XLOG_BLCKSZ (preferably, a great deal larger than
   XLOG_BLCKSZ). Changing XLOG_SEG_SIZE requires an initdb. */
#undef XLOG_SEG_SIZE

Then configure tells me the following

# Check whether --with-wal-segsize was given.
if test "${with_wal_segsize+set}" = set; then
  withval=$with_wal_segsize;
  case $withval in
    yes)
      { { echo "$as_me:$LINENO: error: argument required for --with-wal-segsize
echo "$as_me: error: argument required for --with-wal-segsize option" >&2;}
   { (exit 1); exit 1; }; }
      ;;
    no)
      { { echo "$as_me:$LINENO: error: argument required for --with-wal-segsize
echo "$as_me: error: argument required for --with-wal-segsize option" >&2;}
   { (exit 1); exit 1; }; }
      ;;
    *)
      wal_segsize=$withval
      ;;
  esac

else
  wal_segsize=16
fi


case ${wal_segsize} in
  1) ;;
  2) ;;
  4) ;;
  8) ;;
 16) ;;
 32) ;;
 64) ;;
  *) { { echo "$as_me:$LINENO: error: Invalid WAL segment size. Allowed values a
echo "$as_me: error: Invalid WAL segment size. Allowed values are 1,2,4,8,16,32,
   { (exit 1); exit 1; }; }
esac
{ echo "$as_me:$LINENO: result: ${wal_segsize}MB" >&5
echo "${ECHO_T}${wal_segsize}MB" >&6; }

cat >>confdefs.h <<_ACEOF
#define XLOG_SEG_SIZE (${wal_segsize} * 1024 * 1024)
_ACEOF

Since I didn't specify a wal_segsize at compile time, it seems that my XLogSegsPerFile should be

0xffffffff / (16 * 1024 * 1024) = 255 (integer division)

so seg should run from 0x00 through 0xFE, which matches nicely with what I'm observing.

So, and this is where I want the double-check, a tool which verifies there are no missing WAL files (based on names alone) in a series of WAL files needs to know the following.

1) Timeline history (although perhaps not, it could simply verify all existing timelines)
2) What, if any, wal_segsize was specified for the database which is generating the WAL files

Am I missing anything? The format of .backup files seems pretty simple to me. So I intend to do the following.

1) find the most recent .backup file
2) verify that all the files required for that .backup exist
3) see if there are any newer files, and
4) if there are newer files, warn if any are missing from the sequence

Would this be reasonable, and is there any community interest in open-sourcing the tool that I'm building?

Andrew
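To make the naming scheme above concrete, here is a minimal Python sketch of the name-based gap check. It is an illustration under stated assumptions, not the tool itself. The function names (segs_per_file, parse_name, format_name, next_segment, missing_segments) are hypothetical. The 16 MB default is taken from the configure fragment. Each timeline is checked independently, and neither the timeline history nor the .backup files are consulted.

```python
import re

# A WAL segment file name is three 8-digit uppercase hex fields: tli, log, seg.
WAL_NAME = re.compile(r"[0-9A-F]{24}")


def segs_per_file(wal_segsize_mb=16):
    """Mirror XLogSegsPerFile: 0xffffffff / XLOG_SEG_SIZE, truncated."""
    return 0xFFFFFFFF // (wal_segsize_mb * 1024 * 1024)


def parse_name(name):
    """Split a 24-hex-digit name into (tli, log, seg) integers."""
    if not WAL_NAME.fullmatch(name):
        raise ValueError(f"not a WAL segment file name: {name!r}")
    return int(name[0:8], 16), int(name[8:16], 16), int(name[16:24], 16)


def format_name(tli, log, seg):
    """Same layout as the "%08X%08X%08X" format used by XLogFilePath."""
    return f"{tli:08X}{log:08X}{seg:08X}"


def next_segment(log, seg, per_file):
    """Mirror NextLogSeg: bump log and reset seg after the last usable segment."""
    if seg >= per_file - 1:
        return log + 1, 0
    return log, seg + 1


def missing_segments(names, wal_segsize_mb=16):
    """List names absent between the lowest and highest segment per timeline.

    Names that are not plain segment names (.backup, .history, ...) are skipped.
    """
    per_file = segs_per_file(wal_segsize_mb)
    by_tli = {}
    for name in names:
        if WAL_NAME.fullmatch(name):
            tli, log, seg = parse_name(name)
            by_tli.setdefault(tli, set()).add((log, seg))

    missing = []
    for tli in sorted(by_tli):
        present = by_tli[tli]
        log, seg = min(present)
        last = max(present)
        # Walk the expected sequence; "<" guarantees termination even if an
        # out-of-range segment number shows up in the input.
        while (log, seg) < last:
            if (log, seg) not in present:
                missing.append(format_name(tli, log, seg))
            log, seg = next_segment(log, seg, per_file)
    return missing
```

Passing the four file names listed at the top of this message to missing_segments should report no gaps, because with 16 MB segments the name after ...B7000000FE is ...B800000000. A cluster built with a different --with-wal-segsize would need that value passed as wal_segsize_mb.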