Since you're joining the thread, we have an outstanding debate about what the desired basic workflow should be, and I think we should get some consensus before we paint ourselves into a corner.
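To make the differences concrete, there's a rough sketch of the decision flow the models share after the list.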
1. The Simplest Possible Model
* There is no remote_analyze functionality
* fetch_stats defaults to false
* Failure to fetch stats results in an error; there is no failover to sampling.
2. Simplest Model, but with Failover
* Same as #1, but if we aren't satisfied with the stats we get from the remote, we issue a WARNING, then fall back to sampling, trusting that the user will eventually turn off fetch_stats on tables where it isn't working.
3. Analyze and Retry
* Same as #2, but we add remote_analyze option (default false).
* If the first attempt fails AND remote_analyze is set, we issue ANALYZE on the remote and retry the fetch. Only if that also fails do we fall back to sampling.
4. Analyze and Retry, Optimistic
* Same as #3, but fetch_stats defaults to true, because the worst-case scenario is that we issue a few queries that return 0-1 rows before giving up and just sampling.
* This is the option that Nathan advocated for in our initial conversation about the topic, and I found it quite persuasive at the time, but he's been slammed with other stuff and hasn't been able to add to this thread.
5. Fetch With Retry Or Sample, Optimistic
* If fetch_stats is on, AND the remote table is seemingly capable of holding stats, attempt to fetch them, possibly retrying after ANALYZE depending on remote_analyze.
* If fetching stats fails, just error, as a way to prime the user into changing the table's settings.
* This is what's currently implemented, and it's not quite what anyone wants. Defaulting fetch_stats to true doesn't seem great, but not defaulting it to true will reduce adoption of this feature.
6. Fetch With Retry Or Sample, Pessimistic
* Same as #5, but with fetch_stats = false.
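To make the comparison concrete, here's a rough sketch of the decision flow the models share. This is purely illustrative Python, not postgres_fdw code, and it ignores #5's extra "remote table can hold stats" check; the helper names (fetch_remote_stats, run_remote_analyze, sample_rows) are hypothetical placeholders. The differences between models 1-6 then reduce to the default of fetch_stats, whether a failed fetch errors out or falls back to sampling, and whether remote_analyze enables an ANALYZE-and-retry step.

```python
# Illustrative sketch only; not the actual implementation. Helper functions
# are placeholders for "query remote catalogs", "run a remote ANALYZE", and
# "today's row-sampling behavior".

from dataclasses import dataclass
import warnings


@dataclass
class TableOptions:
    fetch_stats: bool      # per-table option; the debated default differs by model
    remote_analyze: bool   # per-table option; only consulted by models 3-6


def fetch_remote_stats(table):
    # Placeholder: fetch stats from the remote server's catalogs.
    # Returning None simulates "nothing usable came back".
    return None


def run_remote_analyze(table):
    # Placeholder: issue ANALYZE <table> on the remote server.
    pass


def sample_rows(table):
    # Placeholder: fall back to sampling remote rows and building stats locally.
    return "sampled-stats"


def analyze_foreign_table(table, opts, failover_to_sampling):
    """One ANALYZE pass for a foreign table under the proposed options."""
    if not opts.fetch_stats:
        # Models 1-3 and 6 land here by default; 4 and 5 only when the user
        # has turned fetch_stats off.
        return sample_rows(table)

    stats = fetch_remote_stats(table)
    if stats is None and opts.remote_analyze:
        # Models 3-6: ask the remote to ANALYZE, then retry the fetch once.
        run_remote_analyze(table)
        stats = fetch_remote_stats(table)

    if stats is not None:
        return stats

    if failover_to_sampling:
        # Models 2-4: WARNING plus sampling, trusting the user to eventually
        # disable fetch_stats on tables where it never works.
        warnings.warn(f"could not fetch remote stats for {table}; sampling instead")
        return sample_rows(table)

    # Models 1, 5, and 6: hard failure, nudging the user to change the
    # table's settings.
    raise RuntimeError(f"failed to fetch remote statistics for {table}")
```

Framed this way, the whole debate is really about which defaults we pick for fetch_stats and remote_analyze, and which value of failover_to_sampling we hard-code.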