Re: Import Statistics in postgres_fdw before resorting to sampling. - Mailing list pgsql-hackers

From Corey Huinker
Subject Re: Import Statistics in postgres_fdw before resorting to sampling.
Msg-id CADkLM=fsef+NHPjCR4FXF=9wu6Bsf=0E7MOQKOs4AfHJYuF31w@mail.gmail.com
In response to Re: Import Statistics in postgres_fdw before resorting to sampling.  (Corey Huinker <corey.huinker@gmail.com>)
List pgsql-hackers
Since you're joining the thread, we have an outstanding debate about what the desired basic workflow should be, and I think we should get some consensus before we paint ourselves into a corner.

1. The Simplest Possible Model

* There is no remote_analyze functionality
* fetch_stats defaults to false
* Failure to fetch stats raises an error; there is no fallback to sampling.

2. Simplest Model, but with Failover

* Same as #1, but if we aren't satisfied with the stats we get from the remote, we issue a WARNING, then fall back to sampling, trusting that the user will eventually turn off fetch_stats on tables where it isn't working.

3. Analyze and Retry

* Same as #2, but we add remote_analyze option (default false).
* If the first attempt fails AND remote_analyze is on, we run ANALYZE on the remote table, then retry. Only if the retry also fails do we fall back to sampling.
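
To make the control flow of #3 concrete, here is a minimal Python sketch, not the patch's actual C code. The function name acquire_stats and the three callables (fetch, analyze_remote, sample) are hypothetical stand-ins for whatever postgres_fdw would do internally; only the option names fetch_stats and remote_analyze come from the discussion above. It assumes that "not satisfied with the stats" can be modelled as fetch returning None.

```python
import logging
from typing import Callable, Optional

log = logging.getLogger("fdw_stats_sketch")  # hypothetical logger name


def acquire_stats(
    fetch_stats: bool,
    remote_analyze: bool,
    fetch: Callable[[], Optional[dict]],
    analyze_remote: Callable[[], None],
    sample: Callable[[], dict],
) -> dict:
    """Model #3: fetch, optionally ANALYZE remotely and retry, then sample."""
    if not fetch_stats:
        # Option is off: behave exactly as today and sample the remote table.
        return sample()

    stats = fetch()
    if stats is None and remote_analyze:
        # First attempt failed: ask the remote to ANALYZE, then retry once.
        analyze_remote()
        stats = fetch()

    if stats is None:
        # Still nothing usable: warn and fall back to sampling.
        log.warning("could not import remote statistics; falling back to sampling")
        return sample()
    return stats
```

Model #2 is the same function with remote_analyze always false, and model #1 would raise an error at the point where this sketch logs a warning and samples.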

4. Analyze and Retry, Optimistic

* Same as #3, but fetch_stats defaults to ON, because the worst case scenario is that we issue a few queries that return 0-1 rows before giving up and just sampling.
* This is the option that Nathan advocated for in our initial conversation about the topic, and I found it quite persuasive at the time, but he's been slammed with other stuff and hasn't been able to add to this thread.

5. Fetch With Retry Or Sample, Optimistic

* If fetch_stats is on, AND the remote table is seemingly capable of holding stats, attempt to fetch them, possibly retrying after ANALYZE depending on remote_analyze.
* If fetching stats fails, just raise an error, as a way to prompt the user to change the table's setting.
* This is what's currently implemented, and it's not quite what anyone wants. Defaulting fetch_stats to true doesn't seem great, but not defaulting it to true will reduce adoption of this feature.

6. Fetch With Retry Or Sample, Pessimistic

* Same as #5, but with fetch_stats defaulting to false.
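
For side-by-side comparison, the six models can be reduced to three knobs: the default of fetch_stats, whether a remote_analyze option exists, and what happens when fetching ultimately fails. The Python sketch below encodes one reading of the list above; the Model and OnFailure types are purely illustrative, and the failure behaviour recorded for #3 and #4 assumes they inherit the WARNING-then-sample behaviour of #2.

```python
from dataclasses import dataclass
from enum import Enum


class OnFailure(Enum):
    ERROR = "error"
    WARN_AND_SAMPLE = "warn, then sample"


@dataclass(frozen=True)
class Model:
    name: str
    fetch_stats_default: bool   # default value of the fetch_stats option
    has_remote_analyze: bool    # whether a remote_analyze option exists
    on_failure: OnFailure       # what happens when stats cannot be fetched


MODELS = {
    1: Model("Simplest Possible Model", False, False, OnFailure.ERROR),
    2: Model("Simplest Model, but with Failover", False, False, OnFailure.WARN_AND_SAMPLE),
    3: Model("Analyze and Retry", False, True, OnFailure.WARN_AND_SAMPLE),
    4: Model("Analyze and Retry, Optimistic", True, True, OnFailure.WARN_AND_SAMPLE),
    5: Model("Fetch With Retry Or Sample, Optimistic", True, True, OnFailure.ERROR),
    6: Model("Fetch With Retry Or Sample, Pessimistic", False, True, OnFailure.ERROR),
}

if __name__ == "__main__":
    for number, m in MODELS.items():
        print(
            f"{number}. {m.name}: fetch_stats default={m.fetch_stats_default}, "
            f"remote_analyze option={m.has_remote_analyze}, "
            f"on failure={m.on_failure.value}"
        )
```

Laid out this way, #3 vs #4 and #6 vs #5 differ only in the fetch_stats default, so the debate splits into two mostly independent questions: the default, and error versus fallback.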


Rebased after adding the COLLATE argument to the ORDER-BY statements.
