Thread: A concurrent VACUUM FULL?
Hi hackers,
I've been looking at the code for CLUSTER/VACUUM FULL, and whether it is possible to do a concurrent version of it using a multi-transactional approach similar to concurrent reindexing and partition detach.
The idea would be to hold weaker locks in TX1 when doing the heap rewrite (essentially allow reads but prevent writes), and then do the actual heap swap in a second TX2 transaction.
Having experimented a bit with this approach, I found that reindexing is an issue because that happens after the new heap has been swapped in. The current reindex during a heap swap effectively blocks reads so if one starts a new transaction after swapping heaps, it will block reads for a long time.
This made me think about two ways to handle this:
1. Rebuild indexes on the temporary heap in TX1 and then swap in the new indexes along with the new heap in TX2.
2. Do a concurrent index rebuild after the heap swap.
Of the two approaches above, (2) seems easiest to implement, but the downside is that indexes would be invalid while indexes are rebuilt. Therefore, (1) seems to be the more desirable one because all the heavy lifting would be done in TX1 on the temporary heap.
Does anyone have a sense of whether approach (1) is feasible or whether there are any major blockers?
Is this worth pursuing at all or am I missing something?
Best regards,
Erik
-- Database Architect, Timescale
Erik Nordström <erik@timescale.com> wrote: > Hi hackers, > > I've been looking at the code for CLUSTER/VACUUM FULL, and whether it is possible to do a concurrent version of it usinga > multi-transactional approach similar to concurrent reindexing and partition detach. > > The idea would be to hold weaker locks in TX1 when doing the heap rewrite (essentially allow reads but prevent writes),and then do the > actual heap swap in a second TX2 transaction. Patch [1] is in the queue that allows both reads and writes. (An exclusive lock is acquired here for the swaps, but that should be held for very short time.) -- Antonin Houska Web: https://www.cybertec-postgresql.com [1] https://commitfest.postgresql.org/patch/5117/
On Mon, Jun 30, 2025 at 12:03 PM Antonin Houska <ah@cybertec.at> wrote:
Erik Nordström <erik@timescale.com> wrote:
> Hi hackers,
>
> I've been looking at the code for CLUSTER/VACUUM FULL, and whether it is possible to do a concurrent version of it using a
> multi-transactional approach similar to concurrent reindexing and partition detach.
>
> The idea would be to hold weaker locks in TX1 when doing the heap rewrite (essentially allow reads but prevent writes), and then do the
> actual heap swap in a second TX2 transaction.
Patch [1] is in the queue that allows both reads and writes. (An exclusive
lock is acquired here for the swaps, but that should be held for very short
time.)
That sounds great. Do you know if there's anything I can do to help?
- Erik
--
Antonin Houska
Web: https://www.cybertec-postgresql.com
[1] https://commitfest.postgresql.org/patch/5117/
On 2025-Jun-30, Erik Nordström wrote: > On Mon, Jun 30, 2025 at 12:03 PM Antonin Houska <ah@cybertec.at> wrote: > > Patch [1] is in the queue that allows both reads and writes. (An exclusive > > lock is acquired here for the swaps, but that should be held for very short > > time.) > > That sounds great. Do you know if there's anything I can do to help? It would be very valuable if you can review the code, test it under the weirdest conditions you can imagine or just under normal conditions, proof-read the documentation, try to see if anything is missing that should be there, and so on. Everything that you would expect from a new feature released as part of the next Postgres release. Any problems/crashes/ abnormalities that you report before the patch is included in Postgres, is one less issue that we'll have to deal with after the release. -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/ "El miedo atento y previsor es la madre de la seguridad" (E. Burke)
On Mon, Jun 30, 2025 at 1:46 PM Álvaro Herrera <alvherre@kurilemu.de> wrote:
On 2025-Jun-30, Erik Nordström wrote:
> On Mon, Jun 30, 2025 at 12:03 PM Antonin Houska <ah@cybertec.at> wrote:
> > Patch [1] is in the queue that allows both reads and writes. (An exclusive
> > lock is acquired here for the swaps, but that should be held for very short
> > time.)
>
> That sounds great. Do you know if there's anything I can do to help?
It would be very valuable if you can review the code, test it under the
weirdest conditions you can imagine or just under normal conditions,
proof-read the documentation, try to see if anything is missing that
should be there, and so on. Everything that you would expect from a new
feature released as part of the next Postgres release. Any
problems/crashes/ abnormalities that you report before the patch is
included in Postgres, is one less issue that we'll have to deal with
after the release.
I'll do my best to test the feature. One question I have, though, is why not start with supporting concurrent reads but not writes? That would already be a win and make the patch simpler.
Best,
- Erik
--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
"El miedo atento y previsor es la madre de la seguridad" (E. Burke)
Erik Nordström <erik@timescale.com> wrote: > On Mon, Jun 30, 2025 at 1:46 PM Álvaro Herrera <alvherre@kurilemu.de> wrote: > > On 2025-Jun-30, Erik Nordström wrote: > > > On Mon, Jun 30, 2025 at 12:03 PM Antonin Houska <ah@cybertec.at> wrote: > > > > Patch [1] is in the queue that allows both reads and writes. (An exclusive > > > lock is acquired here for the swaps, but that should be held for very short > > > time.) > > > > That sounds great. Do you know if there's anything I can do to help? > > It would be very valuable if you can review the code, test it under the > weirdest conditions you can imagine or just under normal conditions, > proof-read the documentation, try to see if anything is missing that > should be there, and so on. Everything that you would expect from a new > feature released as part of the next Postgres release. Any > problems/crashes/ abnormalities that you report before the patch is > included in Postgres, is one less issue that we'll have to deal with > after the release. > > I'll do my best to test the feature. Thanks. I've noticed that the patch set needs rebase. I'll try to prepare a new version today. > One question I have, though, is why not start with supporting concurrent reads but not writes? That would > already be a win and make the patch simpler. It occurred to me at some point too, but I think it would be rather a different implementation. So if we were to support both read-only and read-write modes, the amount of code would be even higher. -- Antonin Houska Web: https://www.cybertec-postgresql.com
HI Erik Nordström
In online production environments, blocking writes is generally unacceptable in most cases. The only acceptable approach is to allow concurrent read/write operations, with brief locks permitted only during the final steps of the process. We can see pg-osc's implementation (https://github.com/shayonj/pg-osc) for a non-blocking approach to VACUUM FULL operations."
In online production environments, blocking writes is generally unacceptable in most cases. The only acceptable approach is to allow concurrent read/write operations, with brief locks permitted only during the final steps of the process. We can see pg-osc's implementation (https://github.com/shayonj/pg-osc) for a non-blocking approach to VACUUM FULL operations."
On Mon, Jun 30, 2025 at 8:03 PM Antonin Houska <ah@cybertec.at> wrote:
Erik Nordström <erik@timescale.com> wrote:
> On Mon, Jun 30, 2025 at 1:46 PM Álvaro Herrera <alvherre@kurilemu.de> wrote:
>
> On 2025-Jun-30, Erik Nordström wrote:
>
> > On Mon, Jun 30, 2025 at 12:03 PM Antonin Houska <ah@cybertec.at> wrote:
>
> > > Patch [1] is in the queue that allows both reads and writes. (An exclusive
> > > lock is acquired here for the swaps, but that should be held for very short
> > > time.)
> >
> > That sounds great. Do you know if there's anything I can do to help?
>
> It would be very valuable if you can review the code, test it under the
> weirdest conditions you can imagine or just under normal conditions,
> proof-read the documentation, try to see if anything is missing that
> should be there, and so on. Everything that you would expect from a new
> feature released as part of the next Postgres release. Any
> problems/crashes/ abnormalities that you report before the patch is
> included in Postgres, is one less issue that we'll have to deal with
> after the release.
>
> I'll do my best to test the feature.
Thanks. I've noticed that the patch set needs rebase. I'll try to prepare a
new version today.
> One question I have, though, is why not start with supporting concurrent reads but not writes? That would
> already be a win and make the patch simpler.
It occurred to me at some point too, but I think it would be rather a
different implementation. So if we were to support both read-only and
read-write modes, the amount of code would be even higher.
--
Antonin Houska
Web: https://www.cybertec-postgresql.com
Hi Eric,
Currently the first suggested approach "Rebuild indexes on the temporary heap in TX1 and then swap in the new indexes along with the new heap in TX2." sound good.
It would be great if we are able to perform concurrent reads and writes.
In OLTP environments will it lead to slowing of the queries or query performance issues !!!!
Thanks
Dinesh Nair
From: Erik Nordström <erik@timescale.com>
Sent: Monday, June 30, 2025 3:19 PM
To: PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>
Subject: A concurrent VACUUM FULL?
Sent: Monday, June 30, 2025 3:19 PM
To: PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>
Subject: A concurrent VACUUM FULL?
You don't often get email from erik@timescale.com. Learn why this is important |
Caution: This email was sent from an external source. Please verify the sender’s identity before clicking links or opening attachments.
Hi hackers,
I've been looking at the code for CLUSTER/VACUUM FULL, and whether it is possible to do a concurrent version of it using a multi-transactional approach similar to concurrent reindexing and partition detach.
The idea would be to hold weaker locks in TX1 when doing the heap rewrite (essentially allow reads but prevent writes), and then do the actual heap swap in a second TX2 transaction.
Having experimented a bit with this approach, I found that reindexing is an issue because that happens after the new heap has been swapped in. The current reindex during a heap swap effectively blocks reads so if one starts a new transaction after swapping heaps, it will block reads for a long time.
This made me think about two ways to handle this:
1. Rebuild indexes on the temporary heap in TX1 and then swap in the new indexes along with the new heap in TX2.
2. Do a concurrent index rebuild after the heap swap.
Of the two approaches above, (2) seems easiest to implement, but the downside is that indexes would be invalid while indexes are rebuilt. Therefore, (1) seems to be the more desirable one because all the heavy lifting would be done in TX1 on the temporary heap.
Does anyone have a sense of whether approach (1) is feasible or whether there are any major blockers?
Is this worth pursuing at all or am I missing something?
Best regards,
Erik
-- Database Architect, Timescale
On 2025-Jun-30, DINESH NAIR wrote: > In OLTP environments will it lead to slowing of the queries or query > performance issues !!!! Sure, to some extent, but ideally you wouldn't use it in a recurring fashion but only as an emergency solution out of a really serious bloat problem (so it's not something you should have impacting your production in a recurring fashion); also, performance should improve for the system overall, comparing to the state before compacting the table. I suggest you try pg_squeeze (a single run of it in a table, not scheduled runs) and report back how the system performs for you in the period when it is executing. I expect that the impact of REPACK is going to be largely the same as that of pg_squeeze, because the implementation is very similar. -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/ Y una voz del caos me habló y me dijo "Sonríe y sé feliz, podría ser peor". Y sonreí. Y fui feliz. Y fue peor.