Re: Changing shared_buffers without restart - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: Changing shared_buffers without restart |
Date | |
Msg-id | y2sjrhyylmuc7h77cb5x2b3jhdhsws4stkxiumhde2tq7ewswh@ovfeglsvkihd |
In response to | Re: Changing shared_buffers without restart (Andres Freund <andres@anarazel.de>) |
List | pgsql-hackers |
Hi,

On 2025-09-18 09:52:03 -0400, Andres Freund wrote:
> On 2025-09-18 10:25:29 +0530, Ashutosh Bapat wrote:
> > From 0a55bc15dc3a724f03e674048109dac1f248c406 Mon Sep 17 00:00:00 2001
> > From: Dmitrii Dolgov <9erthalion6@gmail.com>
> > Date: Fri, 4 Apr 2025 21:46:14 +0200
> > Subject: [PATCH 04/16] Introduce pss_barrierReceivedGeneration
> >
> > Currently WaitForProcSignalBarrier allows to make sure the message sent
> > via EmitProcSignalBarrier was processed by all ProcSignal mechanism
> > participants.
> >
> > Add pss_barrierReceivedGeneration alongside with pss_barrierGeneration,
> > which will be updated when a process has received the message, but not
> > processed it yet. This makes it possible to support a new mode of
> > waiting, when ProcSignal participants want to synchronize message
> > processing. To do that, a participant can wait via
> > WaitForProcSignalBarrierReceived when processing a message, effectively
> > making sure that all processes are going to start processing
> > ProcSignalBarrier simultaneously.
>
> I doubt "online resizing" that requires synchronously processing the same
> event, can really be called "online". There can be significant delays in
> processing a barrier, stalling the entire server until that is reached seems
> like a complete no-go for production systems?
>
> [...]
> > From 78bc0a49f8ebe17927abd66164764745ecc6d563 Mon Sep 17 00:00:00 2001
> > From: Dmitrii Dolgov <9erthalion6@gmail.com>
> > Date: Tue, 17 Jun 2025 14:16:55 +0200
> > Subject: [PATCH 11/16] Allow to resize shared memory without restart
> >
> > Add assing hook for shared_buffers to resize shared memory using space,
> > introduced in the previous commits without requiring PostgreSQL restart.
> > Essentially the implementation is based on two mechanisms: a
> > ProcSignalBarrier is used to make sure all processes are starting the
> > resize procedure simultaneously, and a global Barrier is used to
> > coordinate after that and make sure all finished processes are waiting
> > for others that are in progress.
> >
> > The resize process looks like this:
> >
> > * The GUC assign hook sets a flag to let the Postmaster know that resize
> >   was requested.
> >
> > * Postmaster verifies the flag in the event loop, and starts the resize
> >   by emitting a ProcSignal barrier.
> >
> > * All processes, that participate in ProcSignal mechanism, begin to
> >   process ProcSignal barrier. First a process waits until all processes
> >   have confirmed they received the message and can start simultaneously.
>
> As mentioned above, this basically makes the entire feature not really
> online. Besides the latency of some processes not getting to the barrier
> immediately, there's also the issue that actually reserving large amounts of
> memory can take a long time - during which all processes would be unavailable.
>
> I really don't see that being viable. It'd be one thing if that were a
> "temporary" restriction, but the whole design seems to be fairly centered
> around that.

Besides not really being online, isn't this a recipe for endless undetected
deadlocks? What if process A waits for a lock held by process B and process B
arrives at the barrier? Process A won't ever get there, because process B
can't make progress, because A is not making progress.

Greetings,

Andres Freund
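To make the A/B scenario above concrete, here is a toy sketch in Python. It is not PostgreSQL code and does not use the patch's API: threads stand in for backends, a `threading.Lock` stands in for whatever lock A is waiting on, and a `threading.Barrier` stands in for the "wait until everyone has received the barrier" step. The function name `simulate` and both timeout parameters are hypothetical; the timeouts exist only so the demo terminates, whereas in the design as described neither wait would ever give up.

```python
import threading


def simulate(lock_timeout=0.2, barrier_timeout=0.5):
    """Toy model: B holds a lock and waits at a barrier; A needs that lock."""
    lock = threading.Lock()          # stand-in for the lock A is waiting on
    barrier = threading.Barrier(2)   # "all participants must arrive first"
    b_has_lock = threading.Event()   # forces the problematic ordering
    outcome = {}

    def backend_b():
        # B takes the lock, then reaches the synchronization point while
        # still holding it.
        with lock:
            b_has_lock.set()
            try:
                barrier.wait(timeout=barrier_timeout)
                outcome["B"] = "passed barrier"
            except threading.BrokenBarrierError:
                # Raised on timeout: A never arrived.
                outcome["B"] = "stuck at barrier"

    def backend_a():
        # A cannot reach the barrier until it gets the lock B is holding.
        b_has_lock.wait()
        if not lock.acquire(timeout=lock_timeout):
            outcome["A"] = "stuck on lock"
            return
        lock.release()
        outcome["A"] = "got lock"

    threads = [threading.Thread(target=backend_b),
               threading.Thread(target=backend_a)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return outcome


if __name__ == "__main__":
    print(simulate())
```

With the default timeouts, A gives up on the lock while B is still parked at the barrier, and B then gives up on the barrier because A never arrived. Remove the timeouts and neither thread returns, and a lock-only deadlock detector would see nothing wrong, since B is not waiting on a lock at all.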