Re: Import Statistics in postgres_fdw before resorting to sampling. - Mailing list pgsql-hackers

From Ashutosh Bapat
Subject Re: Import Statistics in postgres_fdw before resorting to sampling.
Date
Msg-id CAExHW5uEicRpM0cxEgX_0PL41V6OcHUa0v3BOdBzsQOQcBuh=Q@mail.gmail.com
In response to Re: Import Statistics in postgres_fdw before resorting to sampling.  (Corey Huinker <corey.huinker@gmail.com>)
Responses Re: Import Statistics in postgres_fdw before resorting to sampling.
List pgsql-hackers
On Fri, Jan 9, 2026 at 9:52 PM Corey Huinker <corey.huinker@gmail.com> wrote:
>
> On Fri, Jan 9, 2026 at 4:35 AM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote:
>>
>> On Wed, Jan 7, 2026 at 11:34 AM Corey Huinker <corey.huinker@gmail.com> wrote:
>>
>> >
>> > Anyway, here's v8, incorporating the documentation feedback and Matheus's notes.
>>
>> I went through the patches. I have one question: The column names of
>> the foreign table on the local server are sorted using qsort(), which
>> uses C sorting. The column names in the result obtained from the foreign
>> server are sorted using an ORDER BY clause. If the default collation on
>> the foreign server is different from the one used by qsort(), the
>> merge sort in import_fetched_statistics() may fail. Shouldn't these
>> two sorts use the same collation?
>
>
> I wondered about that, but somehow thought that because they're of type
> pg_catalog.name rather than text/varchar they weren't subject to collation
> settings. We definitely should explicitly sort by C collation regardless.
>
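
For the archives, this is roughly the shape I had in mind - a minimal
sketch with invented names, not the patch's actual code: compare the
local list byte-wise, and pin the remote ORDER BY to the matching
collation so both inputs to the merge arrive in the same order.

    /*
     * Local side: byte-wise ordering via strcmp(), which is the same
     * ordering COLLATE "C" gives on the remote side.
     */
    static int
    compare_attnames(const void *a, const void *b)
    {
        return strcmp(*(const char *const *) a, *(const char *const *) b);
    }

        /* local column names, e.g. collected from the relation */
        qsort(attnames, nattnames, sizeof(char *), compare_attnames);

        /* remote side: don't rely on the server's default collation */
        appendStringInfoString(&sql, " ORDER BY attname COLLATE \"C\"");
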
>>
>>
>> Some minor comments
>>
>> /* avoid including explain_state.h here */
>> typedef struct ExplainState ExplainState;
>> -
>>
>> unintended line deletion?
>
>
> Yeah, must have been.
>
>
>>
>>
>> fetch_remote_statistics() fetches the statistics twice if the first
>> attempt fails. I think we need the same checks after the second attempt
>> as well. The second attempt should not fail, but I think we need some
>> safety checks and an Assert at least, in case the foreign server
>> misbehaves.
>
>
> I think the only test *not* done on re-fetch is checking the relkind of
> the relation; are you speaking of another check? It's really no big deal
> to do that one again, but I made the distinction early on in the coding,
> and in retrospect there are relatively few checks we'd consider skipping
> on the second pass, so maybe it's not worth the distinction.
>
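Right - relkind aside, re-running the sanity checks after the second
fetch seems cheap. Something like this is what I meant (function and
variable names invented here, just to show the shape):

    res = fetch_remote_statistics(conn, rel);
    if (!stats_are_sane(res) && remote_analyze)
    {
        run_remote_analyze(conn, rel);
        res = fetch_remote_statistics(conn, rel);

        /*
         * The second attempt "should" succeed, but don't trust the
         * remote server: repeat the sanity checks and bail out if
         * they still fail.
         */
        Assert(res != NULL);
        if (!stats_are_sane(res))
            ereport(WARNING,
                    (errmsg("could not import statistics for \"%s\", falling back to sampling",
                            RelationGetRelationName(rel))));
    }
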
> Since you're joining the thread, we have an outstanding debate about what
> the desired basic workflow should be, and I think we should get some
> consensus before we paint ourselves into a corner.
>
> 1. The Simplest Possible Model
>
> * There is no remote_analyze functionality
> * fetch_stats defaults to false
> * Failure to fetch stats results in a failure, no failover to sampling.
>

Instead of fetch_stats, I would prefer has_stats. But if the majority
prefers fetch_stats, I am OK with it.

> 2. Simplest Model, but with Failover
>
> * Same as #1, but if we aren't satisfied with the stats we get from the
> remote, we issue a WARNING, then fall back to sampling, trusting that the
> user will eventually turn off fetch_stats on tables where it isn't working.
>
> 3. Analyze and Retry
>
> * Same as #2, but we add remote_analyze option (default false).
> * If the first attempt fails AND remote_analyze is set on, then we send the
> remote analyze, then retry. Only if that fails do we fall back to sampling.
>
> 4. Analyze and Retry, Optimistic
>
> * Same as #3, but fetch_stats defaults to ON, because the worst case
> scenario is that we issue a few queries that return 0-1 rows before giving
> up and just sampling.
> * This is the option that Nathan advocated for in our initial conversation
> about the topic, and I found it quite persuasive at the time, but he's been
> slammed with other stuff and hasn't been able to add to this thread.
>
> 5. Fetch With Retry Or Sample, Optimistic
>
> * If fetch_stats is on, AND the remote table is seemingly capable of
> holding stats, attempt to fetch them, possibly retrying after ANALYZE
> depending on remote_analyze.
> * If fetching stats failed, just error, as a way to prime the user into
> changing the table's setting.
> * This is what's currently implemented, and it's not quite what anyone
> wants. Defaulting fetch_stats to true doesn't seem great, but not
> defaulting it to true will reduce adoption of this feature.
>
> 6. Fetch With Retry Or Sample, Pessimistic
>
> * Same as #5, but with fetch_stats = false.

I have no strong opinion about whether fetch_stats should be on or off
by default. In my simple view, the remote relation that a foreign
table points to is usually a regular table or a partitioned table, so
defaulting to ON makes sense to me. But it's possible that in an
OLAP system most of the foreign objects are views exposing only some
of the data, not all of it. Defaulting to off is backward compatible
and makes this a cautious opt-in; otherwise every ANALYZE involving a
foreign server suddenly becomes costlier, since it runs a couple of
queries before falling back to sampling.

Why do we need remote_analyze as a flag? I might have missed the
conversation which led to including it. We fetch relkind, so we know
whether the object on the foreign server supports statistics or not.
If it supports statistics, we can run ANALYZE automatically: the user
wants us to ANALYZE the foreign table and has also indicated that the
remote object is capable of storing stats. We should document the fact
that we analyze if we do not find existing statistics. Why would users
want to disable running ANALYZE? One possibility is that they don't
want to disturb query plans because of suddenly changing statistics,
but then they would need to switch auto-analyze off as well, so maybe
that already gives us a way to know when it's not acceptable to run
ANALYZE on the remote server. I feel this flag will put another burden
on a user who already has to set so many flags in postgres_fdw. That
is especially true when the flag is only useful the first time, while
the remote object doesn't have stats yet - usually there will be
stats. If we have to introduce remote_analyze at all, let it default
to the value of fetch_stats.

If both importing statistics and the fallback ANALYZE fail, I think
we should resort to sampling - the user has asked for an ANALYZE of
the table, and sampling will still achieve that. As you said, a
warning at every failure will nudge the user to set their options
correctly.
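
In code terms, the overall flow I'm proposing would look roughly like
this (all names invented; just the control flow, not a patch):

    if (fetch_stats)
    {
        stats = fetch_remote_statistics(conn, rel);

        if (stats == NULL && remote_relkind_can_have_stats(relkind))
        {
            /*
             * No extra flag: the remote object can store stats, so
             * just ANALYZE it there and fetch again.
             */
            run_remote_analyze(conn, rel);
            stats = fetch_remote_statistics(conn, rel);
        }
    }

    if (stats != NULL)
        import_fetched_statistics(rel, stats);
    else
    {
        /*
         * Both import and the fallback ANALYZE failed: warn, then
         * sample locally, since the user asked for an ANALYZE and
         * sampling still gives them one.
         */
        ereport(WARNING,
                (errmsg("could not import statistics for \"%s\", sampling instead",
                        RelationGetRelationName(rel))));
        do_local_sampling(rel);
    }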

Does that make sense?

--
Best Wishes,
Ashutosh Bapat


