Re: LISTEN/NOTIFY bug: VACUUM sets frozenxid past a xid in async queue - Mailing list pgsql-hackers

From Arseniy Mukhin
Subject Re: LISTEN/NOTIFY bug: VACUUM sets frozenxid past a xid in async queue
Date
Msg-id CAE7r3MK6e_WM0m4gbcDe7sHfKORtG=-Lv4a-77mG=mzZYK9zAQ@mail.gmail.com
Whole thread Raw
In response to Re: LISTEN/NOTIFY bug: VACUUM sets frozenxid past a xid in async queue  ("Matheus Alcantara" <matheusssilv97@gmail.com>)
List pgsql-hackers
On Fri, Sep 19, 2025 at 5:36 PM Matheus Alcantara
<matheusssilv97@gmail.com> wrote:
>
> On Fri Sep 19, 2025 at 9:34 AM -03, Arseniy Mukhin wrote:
> > Hi,
> >
> Thanks for taking a look at this!
>
> > I like this approach. We got rid of dependency on clog and don't limit
> > vacuum. Several points about the fix:
> >
> > Is it correct to remember and reuse slru slots here? IIUC we can't do
> > it if we don't hold SLRU bank lock, because by the time we get in
> > AtAbort_Notify() the queue page could be already evicted. Probably we
> > need to use SimpleLruReadPage after we acquire the lock in
> > AtAbort_Notify()?
> >
> Yeah, I think it make sense. Maybe we could just store the page number
> and offset and call SimpleLruReadPage() inside AtAbort_Notify() to get
> the slotno?

Yeah, I think it would work.

>
> > I think adding a boolean 'committed' is a good approach, but what do
> > you think about setting the queue head back to the position where
> > aborted transaction notifications start? We can do such a reset in
> > AtAbort_Notify(). So instead of marking notifications as
> > 'commited=false' we completely erase them from the queue by moving the
> > queue head back. From listeners perspective if there is a notification
> > of completed transaction in the queue - it's always a committed
> > transaction, so again get rid of TransactionIdDidCommit() call. It
> > seems like a simpler approach because we don't need to remember all
> > notifications positions in the queue and don't need the additional
> > field 'committed'. All we need is to remember the head position before
> > we write anything to the queue, and reset it back if there is an
> > abort. IIUC Listeners will never send such erased notifications:
> > - while the aborted transaction is looking like 'in progress',
> > listeners can't send its notifications.
> > - by the time the aborted transaction is completed, the head is
> > already set back so erased notifications are located after the queue
> > head and listeners can't read it.
> >
> I think that with this approach we may have a scenario where when we
> enter the AtAbort_Notify() the queue head may not be the notification
> from the transaction being aborted but it can be a notification from
> another committed transaction, so resetting the queue head back to the
> position before the aborted transaction would make us lose the
> notification from the committed transaction. Is that make sense? What do
> you think?

I think it's impossible: before pushing anything to the queue we
acquire global lock in PreCommit_Notify():

LockSharedObject(DatabaseRelationId, InvalidOid, 0, AccessExclusiveLock);

While we are holding the lock, no writers can add anything to the
queue. Then we save head position and add pending notifications to the
queue. The moment we get in AtAbort_Notify(), we still hold the global
lock (maybe it is worth adding Assert about it if we start relying on
it), so we can be sure there are no notifications in the queue after
the saved head position except ours. So it seems safe but maybe I
missed something.


Best regards,
Arseniy Mukhin



pgsql-hackers by date:

Previous
From: "Jonathan S. Katz"
Date:
Subject: Re: Having postgresql.org link to cgit instead of gitweb
Next
From: Yugo Nagata
Date:
Subject: Re: Suggestion to add --continue-client-on-abort option to pgbench