Re: Reviewing freeze map code - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: Reviewing freeze map code |
Date | |
Msg-id | 20160714060607.klwgq2qr7egt3zrr@alap3.anarazel.de |
In response to | Re: Reviewing freeze map code (Masahiko Sawada <sawada.mshk@gmail.com>) |
Responses |
Re: Reviewing freeze map code
Re: Reviewing freeze map code Re: Reviewing freeze map code |
List | pgsql-hackers |
Hi,

So I'm generally happy with 0001, barring some relatively minor
adjustments. I am however wondering about one thing:

On 2016-07-11 23:51:05 +0900, Masahiko Sawada wrote:
> diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
> index 57da57a..e7cb8ca 100644
> --- a/src/backend/access/heap/heapam.c
> +++ b/src/backend/access/heap/heapam.c
> @@ -3923,6 +3923,16 @@ l2:
>
>      if (need_toast || newtupsize > pagefree)
>      {
> +        /*
> +         * For crash safety, we need to emit that xmax of old tuple is set
> +         * and clear only the all-frozen bit on visibility map if needed
> +         * before releasing the buffer. We can reuse xl_heap_lock for this
> +         * purpose. It should be fine even if we crash midway from this
> +         * section and the actual updating one later, since the xmax will
> +         * appear to come from an aborted xid.
> +         */
> +        START_CRIT_SECTION();
> +
>          /* Clear obsolete visibility flags ... */
>          oldtup.t_data->t_infomask &= ~(HEAP_XMAX_BITS | HEAP_MOVED);
>          oldtup.t_data->t_infomask2 &= ~HEAP_KEYS_UPDATED;
> @@ -3936,6 +3946,28 @@ l2:
>          /* temporarily make it look not-updated */
>          oldtup.t_data->t_ctid = oldtup.t_self;
>          already_marked = true;
> +
> +        MarkBufferDirty(buffer);
> +
> +        if (RelationNeedsWAL(relation))
> +        {
> +            xl_heap_lock xlrec;
> +            XLogRecPtr recptr;
> +
> +            XLogBeginInsert();
> +            XLogRegisterBuffer(0, buffer, REGBUF_STANDARD);
> +
> +            xlrec.offnum = ItemPointerGetOffsetNumber(&oldtup.t_self);
> +            xlrec.locking_xid = xmax_old_tuple;
> +            xlrec.infobits_set = compute_infobits(oldtup.t_data->t_infomask,
> +                                                  oldtup.t_data->t_infomask2);
> +            XLogRegisterData((char *) &xlrec, SizeOfHeapLock);
> +            recptr = XLogInsert(RM_HEAP_ID, XLOG_HEAP_LOCK);
> +            PageSetLSN(page, recptr);
> +        }

Master does

    /* temporarily make it look not-updated */
    oldtup.t_data->t_ctid = oldtup.t_self;

here, and as is, the WAL record won't reflect that, because:

static void
heap_xlog_lock(XLogReaderState *record)
{
...
        /*
         * Clear relevant update flags, but only if the modified infomask says
         * there's no update.
         */
        if (HEAP_XMAX_IS_LOCKED_ONLY(htup->t_infomask))
        {
            HeapTupleHeaderClearHotUpdated(htup);
            /* Make sure there is no forward chain link in t_ctid */
            ItemPointerSet(&htup->t_ctid,
                           BufferGetBlockNumber(buffer),
                           offnum);
        }

won't enter the branch, because HEAP_XMAX_LOCK_ONLY won't be set. That
will leave t_ctid and HEAP_HOT_UPDATED set differently on the master and
on the standby / after crash recovery. I'm failing to see any harmful
consequences right now, but differences between master and standby are a
bad thing. Pre 9.3 that's not a problem; we reset ctid and HOT_UPDATED
unconditionally there. I think I'm more comfortable with setting
HEAP_XMAX_LOCK_ONLY until the tuple is finally updated - that also
coincides more closely with the actual meaning.

Any arguments against?

>
> +        /* Clear only the all-frozen bit on visibility map if needed */
> +        if (PageIsAllVisible(BufferGetPage(buffer)) &&
> +            VM_ALL_FROZEN(relation, block, &vmbuffer))
> +        {
> +            visibilitymap_clear_extended(relation, block, vmbuffer,
> +                                         VISIBILITYMAP_ALL_FROZEN);
> +        }
> +

FWIW, I don't think it's worth introducing visibilitymap_clear_extended.
As this is a 9.6 only patch, I think it's better to change
visibilitymap_clear's API.

Unless somebody protests I'm planning to commit with those adjustments
tomorrow.

Greetings,

Andres Freund
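The master/standby divergence described in this message can be illustrated with a small toy model. This is only a sketch and not PostgreSQL code: `ToyTuple`, the `TOY_*` flag values, and all `toy_*` functions are hypothetical stand-ins, and the redo step is reduced to the one conditional quoted from heap_xlog_lock. The starting tuple is assumed to carry a stale forward ctid link and the hot-updated flag (for example, left behind by an earlier aborted update).

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-ins for the infomask bits; NOT the real PostgreSQL headers. */
#define TOY_XMAX_LOCK_ONLY 0x0080
#define TOY_HOT_UPDATED    0x4000

typedef struct ToyTuple
{
    uint16_t infomask;    /* carries the lock-only bit in this model */
    uint16_t infomask2;   /* carries the hot-updated bit in this model */
    uint32_t ctid_block;  /* forward chain link: block number */
    uint16_t ctid_offset; /* forward chain link: offset number */
} ToyTuple;

/*
 * What the primary does to the old tuple before releasing the buffer,
 * as described in the mail: clear update flags and point t_ctid back at
 * the tuple itself. Optionally also set the lock-only bit (the proposal).
 */
static void
toy_primary_mark(ToyTuple *tup, uint32_t self_block, uint16_t self_off,
                 bool set_lock_only)
{
    tup->infomask = set_lock_only ? TOY_XMAX_LOCK_ONLY : 0;
    tup->infomask2 &= (uint16_t) ~TOY_HOT_UPDATED;
    tup->ctid_block = self_block;
    tup->ctid_offset = self_off;
}

/*
 * Simplified replay of the lock record: install the logged infomask,
 * then reset update state only if the infomask says "locked only".
 */
static void
toy_redo_lock(ToyTuple *tup, uint16_t logged_infomask,
              uint32_t self_block, uint16_t self_off)
{
    tup->infomask = logged_infomask;
    if (tup->infomask & TOY_XMAX_LOCK_ONLY)
    {
        tup->infomask2 &= (uint16_t) ~TOY_HOT_UPDATED;
        tup->ctid_block = self_block;
        tup->ctid_offset = self_off;
    }
}

/* Field-by-field comparison, to avoid comparing struct padding. */
static bool
toy_same(const ToyTuple *a, const ToyTuple *b)
{
    return a->infomask == b->infomask &&
           a->infomask2 == b->infomask2 &&
           a->ctid_block == b->ctid_block &&
           a->ctid_offset == b->ctid_offset;
}

/* Returns true if the replayed tuple ends up identical to the primary's. */
bool
toy_replay_matches(bool set_lock_only)
{
    /* Tuple lives at (7,1) but has a stale forward link to (7,3). */
    ToyTuple primary = {0, TOY_HOT_UPDATED, 7, 3};
    ToyTuple standby = primary;

    toy_primary_mark(&primary, 7, 1, set_lock_only);
    toy_redo_lock(&standby, primary.infomask, 7, 1);

    return toy_same(&primary, &standby);
}
```

In this model, `toy_replay_matches(false)` corresponds to the patch as posted: replay skips the branch, so the standby keeps the stale link and flag. `toy_replay_matches(true)` corresponds to setting the lock-only bit until the tuple is finally updated, in which case both copies agree.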