On Wed, 10 Dec 2025 at 23:33, Sergey Naumov <sknaumov@gmail.com> wrote:
>
> > so I think if the index did contain the 74962 pages when the query
> > was planned, then the costs should have known about it.
> Yes, here we are facing a corner case: data is generated in a long
> transaction on a just-cleaned-up DB => when autovacuum kicks in,
> autoanalyze wipes out the old stats and has no data in the tables to
> properly assess relation cardinality.
>
> > And for me, I've not really seen enough evidence that there's any bug here.
> So the question is whether the query planner should take into account
> the WHERE clause that hints that it is enough to join just a single
> row instead of the whole table.
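For anyone following along, that corner case can be sketched roughly
like this (table and column names are made up). An (auto)ANALYZE that
runs while the loading transaction is still open cannot see its
uncommitted rows, so the stored stats describe an empty table:

    CREATE TABLE t (id int, payload text);
    ANALYZE t;    -- stats now say: t is empty
    BEGIN;
    INSERT INTO t
      SELECT g, md5(g::text)
      FROM generate_series(1, 1000000) g;
    -- an autoanalyze firing in another session at this point cannot
    -- count the uncommitted rows, so reltuples stays near zero
    COMMIT;
    -- until the next (auto)analyze, plans for t are built from stats
    -- that describe an empty table
    EXPLAIN SELECT * FROM t WHERE id = 1;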
Just for the record here, the planner has no preference for join order
based on the existence of a WHERE clause. The primary driver is the
estimated number of rows; from there, the exact costs for a specific
join method depend on a few other things too. It sounds like you're
claiming the planner should assume the scan with the WHERE clause is
somehow better. To me, that sounds like something a rules-based
optimiser might do. We have a cost-based optimiser, which uses table
statistics as inputs to the cost calculations.
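A minimal illustration of that, using made-up tables; the WHERE clause
influences the plan only through the row estimate it produces, and the
join order and method are then chosen by comparing costs computed from
the gathered statistics:

    CREATE TABLE small (id int PRIMARY KEY);
    CREATE TABLE big (id int, small_id int);
    INSERT INTO small SELECT generate_series(1, 10);
    INSERT INTO big
      SELECT g, (g % 10) + 1 FROM generate_series(1, 100000) g;
    ANALYZE small;
    ANALYZE big;
    -- with fresh stats, the single-row estimate on "small" drives
    -- the choice of join order and join method
    EXPLAIN SELECT *
    FROM big JOIN small ON big.small_id = small.id
    WHERE small.id = 1;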
I did try to recreate the issue you've reported, but I'm unable to. I
find it a bit suspicious that your planner opted for a Hash Join when
the hash table was estimated to contain a single row. I'd expect the
planner to use a Nested Loop there unless it's a FULL JOIN, which it
isn't in this case. See attached.
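If you want to see what the planner thought of the alternatives on
your side, a session-local toggle like this (reusing the hypothetical
tables above) exposes the cost of the next-best plan:

    EXPLAIN SELECT *
    FROM big JOIN small ON big.small_id = small.id
    WHERE small.id = 1;          -- expect a Nested Loop here
    SET enable_nestloop = off;   -- heavily penalise nested loops
    EXPLAIN SELECT *
    FROM big JOIN small ON big.small_id = small.id
    WHERE small.id = 1;          -- now shows the Hash Join and its cost
    RESET enable_nestloop;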
I expect you'll have more luck with bug reports if you work with the
person who's trying to help you and gather the information they've
requested, rather than ignoring that and reiterating what you think
the problem is.
David