Home > mailing lists

Re: [HACKERS] [Proposal] Allow users to specify multiple tables inVACUUM commands - Mailing list pgsql-hackers

From	Michael Paquier
Subject	Re: [HACKERS] [Proposal] Allow users to specify multiple tables inVACUUM commands
Date	September 25, 2017 08:41:53
Msg-id	CAB7nPqS7bBVCqHXfRAx9Ztc5ppfgK1jKHZm10vDfd8hhoSKSRw@mail.gmail.com Whole thread Raw
In response to	Re: [HACKERS] [Proposal] Allow users to specify multiple tables inVACUUM commands ("Bossart, Nathan" <bossartn@amazon.com>)
Responses	Re: [HACKERS] [Proposal] Allow users to specify multiple tables inVACUUM commands
List	pgsql-hackers

Tree view

On Sat, Sep 23, 2017 at 12:56 AM, Bossart, Nathan <bossartn@amazon.com> wrote:
> On 9/21/17, 9:55 PM, "Michael Paquier" <michael.paquier@gmail.com> wrote:
>> I still think that ExecVacuum() should pass a list of VacuumRelation
>> objects to vacuum(), and get_rel_oids() should take in input a list,
>> and return a completed lists. This way the decision-making of doing
>> everything in the same transaction should happens once in vacuum().
>> And actually, if several relations are defined with VACUUM, your patch
>> would not use one transaction per table as use_own_xacts would be set
>> to false. I think that Tom meant that relations whose processed has
>> finished have to be committed immediately. Per your patch, the commit
>> happens once all relations are committed.
>
> Sorry, I must have misunderstood.  I've attached an updated patch that
> looks more like what you described.  I also cleaned up the test cases
> a bit.
>
> IIUC the only time use_own_xacts will be false is when we are only
> doing ANALYZE and at least one of the following is true:
>
>         1. We are in a transaction block.
>         2. We are processing only one relation.

Yes.

> From the code, it appears that vacuum_rel() always starts and commits a
> new transaction for each relation:
>
>          * vacuum_rel expects to be entered with no transaction active; it will
>          * start and commit its own transaction.  But we are called by an SQL

Yes.

> So, by ensuring that get_rel_oids() returns a list whenever multiple
> tables are specified, we are making sure that commands like
>
>         ANALYZE table1, table2, table3;
>
> create transactions for each processed relation (as long as they are
> not inside a transaction block).

Yes.

> I suppose the alternative would be
> to call vacuum() for each relation and to remove the restriction that
> we must be processing more than one relation for use_own_xacts to be
> true.

The main point of my comment is that like ExecVacuum(), vacuum()
should be a high-level function where is decided if multiple
transactions should be run or not. By calling vacuum() multiple times
you break this promise. vacuum_rel should be the one working with
individual transactions.

Here is the diff between v19 and v21 that matters here:   /* Now go through the common routine */
-   if (vacstmt->rels == NIL)
-       vacuum(vacstmt->options, NULL, ¶ms, NULL, isTopLevel);
-   else
-   {
-       ListCell *lc;
-       foreach(lc, vacstmt->rels)
-           vacuum(vacstmt->options, lfirst_node(VacuumRelation, lc),
-                  ¶ms, NULL, isTopLevel);
-   }
+   vacuum(vacstmt->options, vacstmt->rels, ¶ms, NULL, isTopLevel);
If you do so, for an ANALYZE with multiple relations you would finish
by using the same transaction for all relations. I think that we had
better be consistent with VACUUM when not using an outer transaction
so as tables are analyzed and committed one by one. This does not
happen here: a unique transaction is used when using a list of
non-partitioned tables.

On Sun, Sep 24, 2017 at 4:37 AM, Bossart, Nathan <bossartn@amazon.com> wrote:
> Here is a version of the patch without the switch to AutovacMemCxt in
> autovacuum_do_vac_analyze(), which should no longer be necessary after
> 335f3d04.

Thanks for the new version.

+       if (!IsAutoVacuumWorkerProcess())
+           ereport(WARNING,
+                 (errmsg("skipping \"%s\" --- relation no longer exists",
+                         relation->relname)));
I like the use of WARNING here, but we could use as well a LOG to be
consistent when a lock obtention is skipped.

+            * going to commit this transaction and begin a new one between now
+            * and then.
+            */
+           relid = RangeVarGetRelid(relinfo->relation, NoLock, false);
We will likely have to wait that the matters discussed in
https://www.postgresql.org/message-id/25023.1506107590@sss.pgh.pa.us
are settled.

+VACUUM FULL vactst, vactst, vactst, vactst;
This is actually a waste of cycles.

I think I don't have much other comments about this patch.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

pgsql-hackers by date:

From: Amit Kapila
Date: 25 September 2017, 08:26:07
Subject: Re: [HACKERS] Setting pd_lower in GIN metapage

From: Michael Paquier
Date: 25 September 2017, 08:42:52
Subject: Re: [HACKERS] Setting pd_lower in GIN metapage

Re: [HACKERS] [Proposal] Allow users to specify multiple tables inVACUUM commands - Mailing list pgsql-hackers

Previous

Next