Re: Correct the documentation for work_mem - Mailing list pgsql-hackers

From Gurjeet Singh
Subject Re: Correct the documentation for work_mem
Date
Msg-id CABwTF4XAHt7efd=8bhsgsh-vjEtSXBnSLrOUe6TxAcRrPFqHbQ@mail.gmail.com
In response to Re: Correct the documentation for work_mem  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Correct the documentation for work_mem
List pgsql-hackers
On Fri, Apr 21, 2023 at 10:15 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:
> > On 21.04.23 16:28, Imseih (AWS), Sami wrote:
> >> I suggest a small doc fix:
> >> “Note that for a complex query, several sort or hash operations might be
> >> running simultaneously;”
>
> > Here is a discussion of these terms:
> > https://takuti.me/note/parallel-vs-concurrent/
>
> > I think "concurrently" is the correct word here.
>
> Probably, but it'd do little to remove the confusion Sami is on about,

+1.

When discussing this internally, Sami's proposal was in fact to use
the word 'concurrently'. But in computing and programming it's common
for readers not to understand the subtle difference between the two
terms, so we thought it best to use neither of them, and instead use a
word not usually associated with programming and algorithms.

Aside: Another pair of words I see regularly used interchangeably,
when in fact they mean different things: precise vs. accurate.

> especially since the next sentence uses "concurrently" to describe the
> other case.  I think we need a more thorough rewording, perhaps like
>
> -       Note that for a complex query, several sort or hash operations might be
> -       running in parallel; each operation will generally be allowed
> +       Note that a complex query may include several sort or hash
> +       operations; each such operation will generally be allowed

This wording doesn't seem to bring out the fact that there could be
more than one work_mem consumer running (in progress) at the same
time. The reader could mistake it to mean that the hashes and sorts in
a complex query happen one after the other.

+ Note that a complex query may include several sort and hash operations, and
+ more than one of these operations may be in progress simultaneously at any
+ given time;  each such operation will generally be allowed

I believe the phrase "several sort _and_ hash" describes the possible
composition of a complex query better than "several sort _or_ hash"
does.
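To illustrate why the distinction matters to a reader sizing work_mem,
here is a minimal sketch in Python. It is only arithmetic, not anything
PostgreSQL runs: the function names and the three-operation example are
made up for illustration, and it assumes each operation uses its full
allowance.

```python
# Hypothetical helpers: compare the two ways a reader might interpret
# "several sort and hash operations" in one query.

def peak_if_sequential(limits_kb):
    """Peak memory if the operations ran strictly one after the other."""
    return max(limits_kb, default=0)

def peak_if_simultaneous(limits_kb):
    """Peak memory if all the operations are in progress at the same time."""
    return sum(limits_kb)

# Three operations, each allowed 4096 kB (an illustrative value).
limits_kb = [4096, 4096, 4096]
print(peak_if_sequential(limits_kb))    # 4096
print(peak_if_simultaneous(limits_kb))  # 12288
```

Under the "one after the other" misreading the budget looks like a
single allowance; with several operations in progress at once it is the
sum of the allowances.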

> I also find this wording a bit further down to be poor:
>
>         Hash-based operations are generally more sensitive to memory
>         availability than equivalent sort-based operations.  The
>         memory available for hash tables is computed by multiplying
>         <varname>work_mem</varname> by
>         <varname>hash_mem_multiplier</varname>.  This makes it
>
> I think "available" is not le mot juste, and it's also unclear from
> this whether we're speaking of the per-hash-table limit or some
> (nonexistent) overall limit.  How about
>
> -       memory available for hash tables is computed by multiplying
> +       memory limit for a hash table is computed by multiplying

+1
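For what it's worth, the per-hash-table reading can be spelled out as a
small Python sketch. The function name and the sample values are mine,
chosen only for illustration; the one thing taken from the docs is that
the limit is work_mem multiplied by hash_mem_multiplier.

```python
# Hypothetical helper: the limit applies to each hash table separately;
# there is no overall cap across the hash tables in a query.

def hash_table_limit_kb(work_mem_kb: int, hash_mem_multiplier: float) -> int:
    """Memory limit for ONE hash table, in kB."""
    return int(work_mem_kb * hash_mem_multiplier)

# Illustrative settings: work_mem = 4096 kB, hash_mem_multiplier = 2.0.
per_table_kb = hash_table_limit_kb(4096, 2.0)
print(per_table_kb)      # 8192

# Two hash tables in the same query each get that limit.
print(2 * per_table_kb)  # 16384
```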

Best regards,
Gurjeet https://Gurje.et
Postgres Contributors Team, http://aws.amazon.com


