Re: parallel index creation: maintenance_work_mem not honored? - Mailing list pgsql-admin

From Peter Geoghegan
Subject Re: parallel index creation: maintenance_work_mem not honored?
Date
Msg-id CAH2-WzkFJnjem0U0-4HeEwk4-LawXvrJbSaf790MLvGaspBh4w@mail.gmail.com
Whole thread Raw
In response to Re: parallel index creation: maintenance_work_mem not honored?  (MichaelDBA <MichaelDBA@sqlexec.com>)
List pgsql-admin
On Sat, Jun 25, 2022 at 9:37 AM MichaelDBA <MichaelDBA@sqlexec.com> wrote:
> If "External sorts are often faster than internal sorts", why bother with increasing work_mem to avoid sorting on
disk?

They're faster despite spilling to disk -- not because of it. This
effect is quite visible at times, though usually only with large
sorts. It's noticeable, but not huge. And it depends on things like
the datatypes being sorted, to a degree. (My original point was that
external sorts shouldn't be assumed to be significantly slower or
significantly less efficient than equivalent large internal sorts,
where CPU costs tend to dominate either way.)

Only an external sort can do a final on-the-fly merge, where tuplesort
outputs tuples in their final sorted order before the entire input has
been fully sorted. It's not uncommon for that to more than make up for
any latency from writing out temp files. The added latency is usually
not noticeable because we also need to read from random locations in
memory to gather tuples to output to temp files. That can totally hide
any I/O latency in many important cases.

In principle you could get the same benefit with an internal sort,
without any I/O for temp files -- nothing makes it fundamentally
impossible to teach tuplesort how to do a similar final on-the-fly
strategy. But that's tricky for a variety of reasons, mostly related
to the fact that the final size of the input cannot easily be
predicted. That's why it hasn't been implemented.

-- 
Peter Geoghegan



pgsql-admin by date:

Previous
From: MichaelDBA
Date:
Subject: Re: parallel index creation: maintenance_work_mem not honored?
Next
From: Sergey Aleynikov
Date:
Subject: Re: Xmax precedes relation freeze threshold errors