Re: [PATCH] Better Performance for PostgreSQL with large INSERTs - Mailing list pgsql-hackers

From	Filip Janus
Subject	Re: [PATCH] Better Performance for PostgreSQL with large INSERTs
Date	November 26 17:02:58
Msg-id	CAFjYY+JTULmdQUJcBK-hDGRSWZy0e+RFHrCy3vp9J1pCQ18+Ew@mail.gmail.com
In response to	Re: [PATCH] Better Performance for PostgreSQL with large INSERTs (Andres Freund <andres@anarazel.de>)
List	pgsql-hackers

-Filip-

On Tue, Oct 7, 2025 at 16:54, Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2025-10-07 15:03:29 +0200, Philipp Marek wrote:
> > Have you tried to verify that this doesn't cause performance regressions
> > in
> > other workloads? pq_recvbuf() has this code:
> >
> ...
> >
> > I do seem to recall that just increasing the buffer size substantially
> > led to
> > more time being spent inside that memmove() (likely due to exceeding
> > L1/L2).
>
>
> Do you have any pointers to discussions or other data about that?
>
>
> My (quick) analysis was that clients that send one request,
> wait for an answer, then send the next request wouldn't run that code
> as there's nothing behind the individual requests that could be moved.
>
>
> But yes, Pipeline Mode[1] might/would be affected.
>
> The interesting question is how much data can userspace copy before
> that means more load than doing a userspace-kernel-userspace round trip.
> (I guess that moving 64kB or 128kB should be quicker, especially given
> the various CPU mitigations.)

I unfortunately don't remember the details of where I saw it
happening. Unfortunately I suspect it'll depend a lot on hardware and
operating system details (like the security mitigations you mention) when it
matters too.

> As long as there are complete requests in the buffer the memmove()
> could be avoided; only the initial part of the first incomplete request
> might need moving to the beginning.

Right. I'd be inclined to say that this ought to be addressed as part of this
patch; that way we can be pretty sure it's not going to cause
regressions.

I tried to benchmark the memmove() path, but I was not able to reach that part of the code at all. This led me to a deeper investigation, and I realized that the memmove() call is probably dead code.
pq_recvbuf() is called only when PqRecvPointer >= PqRecvLength, while the memmove() inside it is executed only if PqRecvLength > PqRecvPointer.
The two conditions contradict each other, so the branch can never be taken.
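
To make the contradiction concrete, here is a small standalone C model of the buffer logic described above. It is only a sketch and not the PostgreSQL source: RecvModel, model_recvbuf(), model_getbyte() and the other names are hypothetical, and the "receive" step just copies from a dummy array instead of reading a socket. The only things taken from the discussion are the caller-side condition (pointer >= length) and the memmove() condition (length > pointer).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_BUF_SIZE 8192     /* arbitrary size chosen for this model */

/* Hypothetical stand-in for the receive buffer state. */
typedef struct
{
    char   buf[MODEL_BUF_SIZE];
    size_t pointer;             /* next unread byte, models PqRecvPointer */
    size_t length;              /* end of valid data, models PqRecvLength */
    long   memmove_hits;        /* how often the memmove() branch ran */
} RecvModel;

/* Models the refill step: left-justify unread data, then append n bytes. */
static void
model_recvbuf(RecvModel *m, const char *src, size_t n)
{
    if (m->pointer > 0)
    {
        if (m->length > m->pointer)
        {
            /* unread data remains: move it to the start of the buffer */
            memmove(m->buf, m->buf + m->pointer, m->length - m->pointer);
            m->length -= m->pointer;
            m->pointer = 0;
            m->memmove_hits++;
        }
        else
            m->length = m->pointer = 0;
    }
    assert(n <= MODEL_BUF_SIZE - m->length);
    memcpy(m->buf + m->length, src, n);     /* pretend this came from recv() */
    m->length += n;
}

/* Models a caller: it refills only once the buffer is fully consumed. */
static int
model_getbyte(RecvModel *m, const char *src, size_t chunk)
{
    while (m->pointer >= m->length)
        model_recvbuf(m, src, chunk);
    return (unsigned char) m->buf[m->pointer++];
}

/* Read total_bytes through the caller and report the memmove() count. */
static long
count_memmove_hits(size_t total_bytes, size_t chunk)
{
    static char src[MODEL_BUF_SIZE];
    RecvModel   m = {0};

    assert(chunk > 0 && chunk <= MODEL_BUF_SIZE);
    memset(src, 'x', sizeof src);
    for (size_t i = 0; i < total_bytes; i++)
        (void) model_getbyte(&m, src, chunk);
    return m.memmove_hits;
}

/* Bypass the caller-side check on purpose to show the branch does work. */
static long
direct_call_hits(void)
{
    RecvModel m = {0};

    model_recvbuf(&m, "abcd", 4);   /* pointer = 0, length = 4 */
    m.pointer = 2;                  /* consume two bytes */
    model_recvbuf(&m, "ef", 2);     /* length > pointer > 0: memmove runs */
    return m.memmove_hits;
}
```

In this model count_memmove_hits() returns 0 regardless of the chunk size, because the caller never refills while unread bytes remain. direct_call_hits(), which deliberately breaks the caller-side condition, returns 1, so the branch itself works and is only unreachable through this kind of caller.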

> The documentation says
>
> > Pipelining is less useful, and more complex,
> > when a single pipeline contains multiple transactions
> > (see Section 32.5.1.3).
>
> are there any benchmarks/usage statistics for pipeline mode?

You can write benchmarks for it using pgbench's pipeline support, with a
custom script.

Greetings,

Andres Freund

I am also proposing two patches: the first introduces a new GUC variable for setting PQ_RECV_BUFFER_SIZE, and the second removes the dead code.

Filip

Attachment
