Re: WAL format and API changes (9.5) - Mailing list pgsql-hackers
| From | Michael Paquier |
|---|---|
| Subject | Re: WAL format and API changes (9.5) |
| Date | |
| Msg-id | CAB7nPqR-TUdK+-Uc5-Aodjpdxw3=-hjeV7y-1nKj+qo+O1WQ+A@mail.gmail.com Whole thread Raw |
| In response to | Re: WAL format and API changes (9.5) (Heikki Linnakangas <hlinnakangas@vmware.com>) |
| Responses |
Re: WAL format and API changes (9.5)
Re: WAL format and API changes (9.5) |
| List | pgsql-hackers |
On Thu, Nov 13, 2014 at 10:33 PM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:
> In quick testing, this new WAL format is somewhat more compact than the 9.4
> format. That also seems to have more than bought back the performance
> regression I saw earlier. Here are results from my laptop, using the
> wal-update-testsuite.sh script:
> master:
> [results]
> (27 rows)
> wal-format-and-api-changes-9.patch:
> [results]
> (27 rows)
So based on your series of tests, you are saving 6% up to 10%.
That's... Really cool!
> Aside from the WAL record format changes, this patch adds the "decoded WAL
> record" infrastructure that we talked about with Andres. XLogReader now has
> a new function, DecodeXLogRecord, which parses the block headers etc. from
> the WAL record, and copies the data chunks to aligned buffers. The redo
> routines are passed a pointer to the XLogReaderState, instead of the plain
> XLogRecord, and the redo routines can use macros and functions defined
> xlogreader.h to access the already-decoded WAL record. The new WAL record
> format is difficult to parse in a piece-meal fashion, so it really needs
> this separate decoding pass to be efficient.
>
> Thoughts on this new WAL record format? I've attached the xlogrecord.h file
> here separately for easy reading, if you want to take a quick look at just
> that without applying the whole patch.
The new format is neat, and that's a lot of code.. Grouping together
the blocks of data is a good thing for the FPW compression patch as
well.
Note that this patch conflicts with the recent commits 81c4508 and
34402ae. installcheck-world is passing, and standbys are able to
replay records, at least without crashes.
Here are some more comments:
1) with assertions enabled this does not compile because of a small
typo in xlogreader.h here:
+#define XLogRecHasAnyBlockRefs(decoder) ((decoder)->max_block id >= 0)
2) In xlogreader.c, XLogRecGetBlockData should return char * but a
boolean is returned here:
+XLogRecGetBlockData(XLogReaderState *record, uint8 block_id, Size *len)
+{
+ DecodedBkpBlock *bkpb;
+
+ if (!record->blocks[block_id].in_use)
+ return false;
As no blocks are in use in this case this should be NULL.
3) pg_xlogdump does not seem to work:
$ pg_xlogdump 00000001000000000000000D
pg_xlogdump: FATAL: could not find a valid record after 0/D000000
4) A couple of NOT_USED blocks could be removed, no?
+#ifdef NOT_USED BlockNumber leftChildBlkno = InvalidBlockNumber;
+#endif
5) Here why not using the 2nd block instead of the 3rd (@_bt_getroot)?
+ XLogBeginInsert();
+ XLogRegisterBuffer(0, rootbuf, REGBUF_WILL_INIT);
+ XLogRegisterBuffer(2, metabuf, REGBUF_WILL_INIT);
6) There are FIXME blocks:
+ // FIXME
+ //if ((bkpb->fork_flags & BKPBLOCK_WILL_INIT) != 0 &&
mode != RBM_ZERO)
+ // elog(PANIC, "block with WILL_INIT flag in WAL
record must be zeroed by redo routine");]
And that:
+ /* FIXME: translation? Although this shouldn't happen.. */
+ ereport(ERROR,
+ (errmsg("error decoding WAL record"),
+ errdetail("%s", errormsg)));
Regards,
--
Michael
pgsql-hackers by date: