On Thu, Oct 10, 2024 at 5:28 AM Craig Milhiser <craig@milhiser.com> wrote:
> Then the machine ran out of disk space: ERROR: could not write to file
> "base/pgsql_tmp/pgsql_tmp4942.1.fileset/o1859485of2097152.p0.0": No space left on device

For that, I have a patch in the queue to unlink temporary files incrementally:
https://www.postgresql.org/message-id/flat/CA+hUKG+RGdvhAdVu5_LH3Ksee+kW-XkTP_nMxBL+Rmgp3Tjb_w@mail.gmail.com

That's just treating a symptom, though. Things have already gone
quite wrong if we're repeatedly repartitioning our way up to 2 million
batches and only giving up there because of Andrei's patch.

I wonder if something could be wrong with Parallel Hash Right
Join, which we see in your plan. That's new-ish, and I vaguely recall
another case where that seemed to be on the scene in a plan with a
high number of batches... hmm. Definitely keen to see a reproducer
with synthetic data if you can come up with one...