Home > mailing lists
Re: Combine Prune and Freeze records emitted by vacuum - Mailing list pgsql-hackers

From	Heikki Linnakangas
Subject	Re: Combine Prune and Freeze records emitted by vacuum
Date	March 25, 2024 13:01:58
Msg-id	923484dc-07f2-403b-aece-b01e94b4a9d2@iki.fi Whole thread Raw
In response to	Re: Combine Prune and Freeze records emitted by vacuum (Melanie Plageman <melanieplageman@gmail.com>)
List	pgsql-hackers
Tree view
On 24/03/2024 21:55, Melanie Plageman wrote:
> On Sat, Mar 23, 2024 at 01:09:30AM +0200, Heikki Linnakangas wrote:
>> On 20/03/2024 21:17, Melanie Plageman wrote:
>> There is another patch in the commitfest that touches this area: "Recording
>> whether Heap2/PRUNE records are from VACUUM or from opportunistic pruning"
>> [1]. That actually goes in the opposite direction than this patch. That
>> patch wants to add more information, to show whether a record was emitted by
>> VACUUM or on-access pruning, while this patch makes the freezing and
>> VACUUM's 2nd phase records also look the same. We could easily add more
>> flags to xl_heap_prune to distinguish them. Or assign different xl_info code
>> for them, like that other patch proposed. But I don't think that needs to
>> block this patch, that can be added as a separate patch.
>>
>> [1] https://www.postgresql.org/message-id/CAH2-Wzmsevhox+HJpFmQgCxWWDgNzP0R9F+VBnpOK6TgxMPrRw@mail.gmail.com
> 
> I think it would be very helpful to distinguish amongst vacuum pass 1,
> 2, and on-access pruning. I often want to know what did most of the
> pruning -- and I could see also wanting to know if the first or second
> vacuum pass was responsible for removing the tuples. I agree it could be
> done separately, but it would be very helpful to have as soon as
> possible now that the record type will be the same for all three.

Ok, I used separate 'info' codes for records generated on on-access 
pruning and vacuum's 1st and 2nd pass. Similar to Peter's patch on that 
other thread, except that I didn't reserve the whole high bit for this, 
but used three different 'info' codes. Freezing uses the same 
XLOG_HEAP2_PRUNE_VACUUM_SCAN as the pruning in vacuum's 1st pass. You 
can distinguish them based on whether the record has nfrozen > 0, and 
with the rest of the patches, they will be merged anyway.

>>  From 042185d3de14dcb7088bbe50e9c64e365ac42c2a Mon Sep 17 00:00:00 2001
>> From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
>> Date: Fri, 22 Mar 2024 23:10:22 +0200
>> Subject: [PATCH v6] Merge prune, freeze and vacuum WAL record formats
>>   
>>   /*
>> - * Handles XLOG_HEAP2_PRUNE record type.
>> - *
>> - * Acquires a full cleanup lock.
>> + * Replay XLOG_HEAP2_PRUNE_FREEZE record.
>>    */
>>   static void
>> -heap_xlog_prune(XLogReaderState *record)
>> +heap_xlog_prune_freeze(XLogReaderState *record)
>>   {
>>       XLogRecPtr    lsn = record->EndRecPtr;
>> -    xl_heap_prune *xlrec = (xl_heap_prune *) XLogRecGetData(record);
>> +    char       *ptr;
>> +    xl_heap_prune *xlrec;
>>       Buffer        buffer;
>>       RelFileLocator rlocator;
>>       BlockNumber blkno;
>>       XLogRedoAction action;
>>   
>>       XLogRecGetBlockTag(record, 0, &rlocator, NULL, &blkno);
>> +    ptr = XLogRecGetData(record);
> 
> I don't love having two different pointers that alias each other and we
> don't know which one is for what. Perhaps we could memcpy xlrec like in
> my attached diff (log_updates.diff). It also might perform slightly
> better than accessing flags through a xl_heap_prune
> * -- since it wouldn't be doing pointer dereferencing.

Ok.

>>       /*
>> -     * We're about to remove tuples. In Hot Standby mode, ensure that there's
>> -     * no queries running for which the removed tuples are still visible.
>> +     * We will take an ordinary exclusive lock or a cleanup lock depending on
>> +     * whether the XLHP_CLEANUP_LOCK flag is set.  With an ordinary exclusive
>> +     * lock, we better not be doing anything that requires moving existing
>> +     * tuple data.
>>        */
>> -    if (InHotStandby)
>> -        ResolveRecoveryConflictWithSnapshot(xlrec->snapshotConflictHorizon,
>> -                                            xlrec->isCatalogRel,
>> +    Assert((xlrec->flags & XLHP_CLEANUP_LOCK) != 0 ||
>> +           (xlrec->flags & (XLHP_HAS_REDIRECTIONS | XLHP_HAS_DEAD_ITEMS)) == 0);
>> +
>> +    /*
>> +     * We are about to remove and/or freeze tuples.  In Hot Standby mode,
>> +     * ensure that there are no queries running for which the removed tuples
>> +     * are still visible or which still consider the frozen xids as running.
>> +     * The conflict horizon XID comes after xl_heap_prune.
>> +     */
>> +    if (InHotStandby && (xlrec->flags & XLHP_HAS_CONFLICT_HORIZON) != 0)
>> +    {
> 
> My attached patch has a TODO here for the comment. It sticks out that
> the serialization and deserialization conditions are different for the
> snapshot conflict horizon. We don't deserialize if InHotStandby is
> false. That's still correct now because we don't write anything else to
> the main data chunk afterward. But, if someone were to add another
> member after snapshot_conflict_horizon, they would want to know to
> deserialize snapshot_conflict_horizon first even if InHotStandby is
> false.

Good point. Fixed so that 'snapshot_conflict_horizon' is deserialized 
even if !InHotStandby. A memcpy is cheap, and is probably optimized away 
by the compiler anyway.

>> +        TransactionId snapshot_conflict_horizon;
>> +
> 
> I would throw a comment in about the memcpy being required because the
> snapshot_conflict_horizon is in the buffer unaligned.

Added.

>> +/*
>> + * Given a MAXALIGNed buffer returned by XLogRecGetBlockData() and pointed to
>> + * by cursor and any xl_heap_prune flags, deserialize the arrays of
>> + * OffsetNumbers contained in an XLOG_HEAP2_PRUNE_FREEZE record.
>> + *
>> + * This is in heapdesc.c so it can be shared between heap2_redo and heap2_desc
>> + * code, the latter of which is used in frontend (pg_waldump) code.
>> + */
>> +void
>> +heap_xlog_deserialize_prune_and_freeze(char *cursor, uint8 flags,
>> +                                       int *nredirected, OffsetNumber **redirected,
>> +                                       int *ndead, OffsetNumber **nowdead,
>> +                                       int *nunused, OffsetNumber **nowunused,
>> +                                       int *nplans, xlhp_freeze_plan **plans,
>> +                                       OffsetNumber **frz_offsets)
>> +{
>> +    if (flags & XLHP_HAS_FREEZE_PLANS)
>> +    {
>> +        xlhp_freeze_plans *freeze_plans = (xlhp_freeze_plans *) cursor;
>> +
>> +        *nplans = freeze_plans->nplans;
>> +        Assert(*nplans > 0);
>> +        *plans = freeze_plans->plans;
>> +
>> +        cursor += offsetof(xlhp_freeze_plans, plans);
>> +        cursor += sizeof(xlhp_freeze_plan) * *nplans;
>> +    }
> 
> I noticed you decided to set these in the else statements. Is that to
> emphasize that it is important to correctness that they be properly
> initialized?

The point was to always initialize *nplans et al in the function. You 
required the caller to initialize them to zero, but that seems error-prone.

I made one more last minute change: I changed the order of the array 
arguments in heap_xlog_deserialize_prune_and_freeze() to match the order 
in log_heap_prune_and_freeze().

Committed with the above little changes. Thank you! Now, back to the 
rest of the patches :-).

-- 
Heikki Linnakangas
Neon (https://neon.tech)
pgsql-hackers by date:
From: Peter Eisentraut
Date: 25 March 2024, 12:55:24
Subject: Re: speed up a logical replica setup
From: "Euler Taveira"
Date: 25 March 2024, 13:02:01
Subject: Re: speed up a logical replica setup
Re: Combine Prune and Freeze records emitted by vacuum - Mailing list pgsql-hackers

Previous

Next