Re: [HACKERS] Block level parallel vacuum - Mailing list pgsql-hackers
From | Mahendra Singh Thalor |
---|---|
Subject | Re: [HACKERS] Block level parallel vacuum |
Date | |
Msg-id | CAKYtNAp56Q_A5-v-B3PjXM33nWfUpH4muqVDVO1TOoh4JeFQBA@mail.gmail.com |
In response to | Re: [HACKERS] Block level parallel vacuum (Masahiko Sawada <masahiko.sawada@2ndquadrant.com>) |
List | pgsql-hackers |
On Sat, 11 Jan 2020 at 19:48, Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote:
>
> On Sat, 11 Jan 2020 at 13:18, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Sat, Jan 11, 2020 at 9:23 AM Masahiko Sawada
> > <masahiko.sawada@2ndquadrant.com> wrote:
> > >
> > > On Fri, 10 Jan 2020 at 20:54, Mahendra Singh Thalor <mahi6run@gmail.com> wrote:
> > > >
> > > > On Fri, 10 Jan 2020 at 15:51, Sergei Kornilov <sk@zsrv.org> wrote:
> > > > >
> > > > > Hi
> > > > > Thank you for update! I looked again
> > > > >
> > > > > (vacuum_indexes_leader)
> > > > > + /* Skip the indexes that can be processed by parallel workers */
> > > > > + if (!skip_index)
> > > > > + continue;
> > > > >
> > > > > Does the variable name skip_index not confuse here? Maybe rename to something like can_parallel?
> > > >
> > > > I also agree with your point.
> > >
> > > I don't think the change is a good idea.
> > >
> > > - bool skip_index = (get_indstats(lps->lvshared, i) == NULL ||
> > > - skip_parallel_vacuum_index(Irel[i], lps->lvshared));
> > > + bool can_parallel = (get_indstats(lps->lvshared, i) == NULL ||
> > > + skip_parallel_vacuum_index(Irel[i],
> > > + lps->lvshared));
> > >
> > > The above condition is true when the index can *not* do parallel index vacuum. How about changing it to skipped_index and change the comment to something like “We are interested in only index skipped parallel vacuum”?
> > >
> >
> > Hmm, I find the current code and comment better than what you or
> > Sergei are proposing. I am not sure what is the point of confusion in
> > the current code?
>
> Yeah the current code is also good. I just thought they were concerned
> that the variable name skip_index might be confusing because we skip
> if skip_index is NOT true.
>
> >
> > > >
> > > > >
> > > > > Another question about behavior on temporary tables. Use case: the user commands just "vacuum;" to vacuum entire database (and has enough maintenance workers). Vacuum starts fine in parallel, but on first temporary table we hit:
> > > > >
> > > > > + if (RelationUsesLocalBuffers(onerel) && params->nworkers >= 0)
> > > > > + {
> > > > > + ereport(WARNING,
> > > > > + (errmsg("disabling parallel option of vacuum on \"%s\" --- cannot vacuum temporary tables in parallel",
> > > > > + RelationGetRelationName(onerel))));
> > > > > + params->nworkers = -1;
> > > > > + }
> > > > >
> > > > > And therefore we turn off the parallel vacuum for the remaining tables... Can we improve this case?
> > > >
> > > > Good point.
> > > > Yes, we should improve this. I tried to fix this.
> > >
> > > +1
> > >
> >
> > Yeah, we can improve the situation here. I think we don't need to
> > change the value of params->nworkers at first place if allow
> > lazy_scan_heap to take care of this. Also, I think we shouldn't
> > display warning unless the user has explicitly asked for parallel
> > option. See the fix in the attached patch.
>
> Agreed. But with the updated patch the PARALLEL option without the
> parallel degree doesn't display warning because params->nworkers = 0
> in that case. So how about restoring params->nworkers at the end of
> vacuum_rel()?
>
> + /*
> + * Give warning only if the user explicitly tries to perform a
> + * parallel vacuum on the temporary table.
> + */
> + if (params->nworkers > 0)
> + ereport(WARNING,
> + (errmsg("disabling parallel option of vacuum on \"%s\" --- cannot vacuum temporary tables in parallel",
> + RelationGetRelationName(onerel))));
Hi,
I have some doubts about the warning for temporary tables. Below are some examples.
Suppose we have one temporary table named "temp_table".
Case 1:
vacuum;
I think that in this case we should not give any warning for the temp table. We should do a parallel vacuum (treating zero as the parallel degree) for all tables except temporary tables.
Case 2:
vacuum (parallel);
Case 3:
vacuum(parallel 5);
Case 4:
vacuum(parallel) temp_table;
Case 5:
vacuum(parallel 4) temp_table;
I think that for cases 2 and 4, as per the new design, we should raise an error (ERROR: Parallel degree should be specified between 0 to 1024): parallel vacuum is ON by default, so if the user gives the parallel option without a degree, we can raise an error.
If we raise an error for cases 2 and 4, then we can give a warning for cases 3 and 5.
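To make the five cases concrete, here is a rough sketch in C of the behaviour proposed above. It is only an illustration, not code from the patch: the names VacuumMessage, proposed_parallel_message and MAX_PARALLEL_DEGREE are hypothetical, and treating an explicit degree of 0 as "no warning" is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcome codes for illustration; not PostgreSQL identifiers. */
typedef enum
{
    MSG_NONE,       /* vacuum the relation silently */
    MSG_WARNING,    /* "cannot vacuum temporary tables in parallel" */
    MSG_ERROR       /* "Parallel degree should be specified between 0 to 1024" */
} VacuumMessage;

/* Upper bound taken from the proposed error text above. */
#define MAX_PARALLEL_DEGREE 1024

/*
 * Decide which message one relation would produce under the proposal.
 *
 * parallel_given: the user wrote the PARALLEL option
 * degree_given:   the user wrote a degree after PARALLEL
 * degree:         that degree (ignored when degree_given is false)
 * rel_is_temp:    the relation being vacuumed is a temporary table
 */
VacuumMessage
proposed_parallel_message(bool parallel_given, bool degree_given,
                          int degree, bool rel_is_temp)
{
    /* A degree cannot be supplied without the PARALLEL option itself. */
    assert(parallel_given || !degree_given);

    /* Case 1: plain "vacuum;" never complains, even for temp tables. */
    if (!parallel_given)
        return MSG_NONE;

    /* Cases 2 and 4: PARALLEL without a degree is rejected. */
    if (!degree_given)
        return MSG_ERROR;

    /* An out-of-range degree is rejected as well. */
    if (degree < 0 || degree > MAX_PARALLEL_DEGREE)
        return MSG_ERROR;

    /* Cases 3 and 5: explicit degree, warn only for temporary tables. */
    if (rel_is_temp && degree > 0)
        return MSG_WARNING;

    return MSG_NONE;
}
```

With this sketch, "vacuum(parallel 5);" would warn once per temporary table and stay silent for every other table, while the shared options are never modified.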
Thoughts?
Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com