Re: WAL format and API changes (9.5) - Mailing list pgsql-hackers
From | Michael Paquier |
---|---|
Subject | Re: WAL format and API changes (9.5) |
Date | |
Msg-id | CAB7nPqR-TUdK+-Uc5-Aodjpdxw3=-hjeV7y-1nKj+qo+O1WQ+A@mail.gmail.com Whole thread Raw |
In response to | Re: WAL format and API changes (9.5) (Heikki Linnakangas <hlinnakangas@vmware.com>) |
Responses |
Re: WAL format and API changes (9.5)
Re: WAL format and API changes (9.5) |
List | pgsql-hackers |
On Thu, Nov 13, 2014 at 10:33 PM, Heikki Linnakangas <hlinnakangas@vmware.com> wrote: > In quick testing, this new WAL format is somewhat more compact than the 9.4 > format. That also seems to have more than bought back the performance > regression I saw earlier. Here are results from my laptop, using the > wal-update-testsuite.sh script: > master: > [results] > (27 rows) > wal-format-and-api-changes-9.patch: > [results] > (27 rows) So based on your series of tests, you are saving 6% up to 10%. That's... Really cool! > Aside from the WAL record format changes, this patch adds the "decoded WAL > record" infrastructure that we talked about with Andres. XLogReader now has > a new function, DecodeXLogRecord, which parses the block headers etc. from > the WAL record, and copies the data chunks to aligned buffers. The redo > routines are passed a pointer to the XLogReaderState, instead of the plain > XLogRecord, and the redo routines can use macros and functions defined > xlogreader.h to access the already-decoded WAL record. The new WAL record > format is difficult to parse in a piece-meal fashion, so it really needs > this separate decoding pass to be efficient. > > Thoughts on this new WAL record format? I've attached the xlogrecord.h file > here separately for easy reading, if you want to take a quick look at just > that without applying the whole patch. The new format is neat, and that's a lot of code.. Grouping together the blocks of data is a good thing for the FPW compression patch as well. Note that this patch conflicts with the recent commits 81c4508 and 34402ae. installcheck-world is passing, and standbys are able to replay records, at least without crashes. Here are some more comments: 1) with assertions enabled this does not compile because of a small typo in xlogreader.h here: +#define XLogRecHasAnyBlockRefs(decoder) ((decoder)->max_block id >= 0) 2) In xlogreader.c, XLogRecGetBlockData should return char * but a boolean is returned here: +XLogRecGetBlockData(XLogReaderState *record, uint8 block_id, Size *len) +{ + DecodedBkpBlock *bkpb; + + if (!record->blocks[block_id].in_use) + return false; As no blocks are in use in this case this should be NULL. 3) pg_xlogdump does not seem to work: $ pg_xlogdump 00000001000000000000000D pg_xlogdump: FATAL: could not find a valid record after 0/D000000 4) A couple of NOT_USED blocks could be removed, no? +#ifdef NOT_USED BlockNumber leftChildBlkno = InvalidBlockNumber; +#endif 5) Here why not using the 2nd block instead of the 3rd (@_bt_getroot)? + XLogBeginInsert(); + XLogRegisterBuffer(0, rootbuf, REGBUF_WILL_INIT); + XLogRegisterBuffer(2, metabuf, REGBUF_WILL_INIT); 6) There are FIXME blocks: + // FIXME + //if ((bkpb->fork_flags & BKPBLOCK_WILL_INIT) != 0 && mode != RBM_ZERO) + // elog(PANIC, "block with WILL_INIT flag in WAL record must be zeroed by redo routine");] And that: + /* FIXME: translation? Although this shouldn't happen.. */ + ereport(ERROR, + (errmsg("error decoding WAL record"), + errdetail("%s", errormsg))); Regards, -- Michael
pgsql-hackers by date: