Re: [HACKERS] VACUUM and ANALYZE disagreeing on what reltuples means - Mailing list pgsql-hackers
From | Haribabu Kommi |
---|---|
Subject | Re: [HACKERS] VACUUM and ANALYZE disagreeing on what reltuples means |
Date | |
Msg-id | CAJrrPGfTQ3w=5BdprWgQghuM7ujtBy2Sthr6zRwY0QeRBQd2RQ@mail.gmail.com Whole thread Raw |
In response to | Re: [HACKERS] VACUUM and ANALYZE disagreeing on what reltuples means (Tomas Vondra <tomas.vondra@2ndquadrant.com>) |
Responses |
Re: [HACKERS] VACUUM and ANALYZE disagreeing on what reltuples means
|
List | pgsql-hackers |
On Mon, Sep 25, 2017 at 4:39 AM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
On 09/06/2017 09:45 AM, Haribabu Kommi wrote:
>
>
> On Tue, Jul 25, 2017 at 9:33 PM, Tomas Vondra
> <tomas.vondra@2ndquadrant.com <mailto:tomas.vondra@2ndquadrant.com>> wrote:
>
> On 7/25/17 12:55 AM, Tom Lane wrote:
>
> Tomas Vondra <tomas.vondra@2ndquadrant.com> +num_tuples - nkeep);> <mailto:tomas.vondra@2ndquadrant.com>> writes:
>
> It seems to me that VACUUM and ANALYZE somewhat disagree on what
> exactly reltuples means. VACUUM seems to be thinking that
> reltuples
> = live + dead while ANALYZE apparently believes that reltuples =
> live
>
>
> The question is - which of the reltuples definitions is the
> right
> one? I've always assumed that "reltuples = live + dead" but
> perhaps
> not?
>
>
> I think the planner basically assumes that reltuples is the live
> tuple count, so maybe we'd better change VACUUM to get in step.
>
>
> Attached is a patch that (I think) does just that. The disagreement
> was caused by VACUUM treating recently dead tuples as live, while
> ANALYZE treats both of those as dead.
>
> At first I was worried that this will negatively affect plans in the
> long-running transaction, as it will get underestimates (due to
> reltuples not including rows it can see). But that's a problem we
> already have anyway, you just need to run ANALYZE in the other session.
>
>
> Thanks for the patch.
> From the mail, I understand that this patch tries to improve the
> reltuples value update in the catalog table by the vacuum command
> to consider the proper visible tuples similar like analyze command.
>
> -num_tuples);
>
> With the above correction, there is a problem in reporting the number
> of live tuples to the stats.
>
> postgres=# select reltuples, n_live_tup, n_dead_tup
> from pg_stat_user_tables join pg_class using (relname)
> where relname = 't';
> reltuples | n_live_tup | n_dead_tup
> -----------+------------+------------
> 899818 | 799636 | 100182
> (1 row)
>
>
> The live tuples data value is again decremented with dead tuples
> value before sending them to stats in function lazy_vacuum_rel(),
>
> /* report results to the stats collector, too */
> new_live_tuples = new_rel_tuples - vacrelstats->new_dead_tuples;
>
> The fix needs a correction here also. Or change the correction in
> lazy_vacuum_rel() function itself before updating catalog table similar
> like stats.
>
Ah, haven't noticed that for some reason - you're right, we estimate the
reltuples based on (num_tuples - nkeep), so it doesn't make much sense
to subtract nkeep again. Attached is v2 of the fix.
I've removed the subtraction from lazy_vacuum_rel(), leaving just
new_live_tuples = new_rel_tuples;
and now it behaves as expected (no second subtraction). That means we
can get rid of new_live_tuples altogether (and the protection against
negative values), and use new_rel_tuples directly.
Which pretty much means that in this case
(pg_class.reltuples == pg_stat_user_tables.n_live_tup)
but I guess that's fine, based on the initial discussion in this thread.
The changes are fine and now it reports proper live tuples in both
pg_class and stats. The other issue of continuous vacuum operation
leading to decrease of number of live tuples is not related to this
patch and that can be handled separately.
I changed the patch status as ready for committer.
Regards,
Hari Babu
Fujitsu Australia
pgsql-hackers by date: