Thread: slow pg_dump with bytea
I don't know if the PG developers are aware of this:

https://serverfault.com/questions/1081642/postgresql-13-speed-up-pg-dump-to-5-minutes-instead-of-70-minutes

This question is quite popular, and many users are happy with the solution given there.
So maybe you could fix this by changing pg_dump so that it does not compress bytea data.

Regards
Janning
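For context, the workaround described in that answer boils down to turning off pg_dump's
built-in compression and compressing the stream externally. A minimal sketch, assuming a
database called "mydb" and "pigz" as the external parallel compressor (both are
placeholders; any compressor that reads from stdin would do):

    # custom-format dump with pg_dump's own compression turned off,
    # piped through an external parallel compressor
    pg_dump -Fc -Z0 mydb | pigz > mydb.dump.gz

    # restore: decompress on the fly and feed the archive to pg_restore
    pigz -dc mydb.dump.gz | pg_restore -d mydb

Moving the compression out of pg_dump is what makes this so much faster on
already-compressed bytea data.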
On Fri, 2023-10-20 at 12:26 +0200, Janning Vygen wrote:
> I don't know if the PG developers are aware of this:
>
> https://serverfault.com/questions/1081642/postgresql-13-speed-up-pg-dump-to-5-minutes-instead-of-70-minutes
>
> This question is quite popular, and many users are happy with the solution given there.
> So maybe you could fix this by changing pg_dump so that it does not compress bytea data.

That doesn't sound like a bug to me.

Compression is determined when "pg_dump" starts. How should it guess that
there is a binary column with compressed data in some table? Even if it did,
I wouldn't feel comfortable with a "pg_dump" smart enough to do this
automatically for me (and get it wrong occasionally).

In addition, I don't think this problem is limited to compressed binary data.
In my experience, compressed dumps are always slower than uncompressed ones.
It is a speed vs. size trade-off.

By the way, PostgreSQL v16 introduced "lz4" and "zstd" compression in "pg_dump",
which is much faster.

Yours,
Laurenz Albe
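For the archives, with v16 that looks roughly like this (a sketch; the database name,
output file and compression level are placeholders):

    # PostgreSQL 16: choose a faster compression method directly in pg_dump
    pg_dump -Fc --compress=zstd:3 -f mydb.dump mydb

    # or lz4, usually even faster at the cost of a somewhat larger dump
    pg_dump -Fc --compress=lz4 -f mydb.dump mydb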
Thanks for the quick response. I don't think it is a bug either, so this is actually
the wrong list, but I didn't find any "feature request" list. Sorry for that.

Anyhow, it seems odd to me that you can dump in a fraction of the time compared to the
default settings. With the solution in the link you get both speed and compression,
which is what the pg_dump -Z option seems to promise as well, depending on the level
you set. But it doesn't deliver that.

Regards
Janning

On 20.10.23 at 13:52, Laurenz Albe wrote:
> On Fri, 2023-10-20 at 12:26 +0200, Janning Vygen wrote:
>> I don't know if the PG developers are aware of this:
>>
>> https://serverfault.com/questions/1081642/postgresql-13-speed-up-pg-dump-to-5-minutes-instead-of-70-minutes
>>
>> This question is quite popular, and many users are happy with the solution given there.
>> So maybe you could fix this by changing pg_dump so that it does not compress bytea data.
>
> That doesn't sound like a bug to me.
>
> Compression is determined when "pg_dump" starts. How should it guess that
> there is a binary column with compressed data in some table? Even if it did,
> I wouldn't feel comfortable with a "pg_dump" smart enough to do this
> automatically for me (and get it wrong occasionally).
>
> In addition, I don't think this problem is limited to compressed binary data.
> In my experience, compressed dumps are always slower than uncompressed ones.
> It is a speed vs. size trade-off.
>
> By the way, PostgreSQL v16 introduced "lz4" and "zstd" compression in "pg_dump",
> which is much faster.
>
> Yours,
> Laurenz Albe
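One way to check this on your own data is to time the same dump at different -Z levels
(the database and file names below are placeholders):

    # compare dump time and size with compression disabled vs. the default level
    time pg_dump -Fc -Z0 -f mydb_z0.dump mydb   # no compression
    time pg_dump -Fc -Z6 -f mydb_z6.dump mydb   # roughly the built-in default
    ls -lh mydb_z0.dump mydb_z6.dump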
On Fri, 2023-10-20 at 14:08 +0200, Janning Vygen wrote:
> Anyhow, it seems odd to me that you can dump in a fraction of the time compared to the
> default settings. With the solution in the link you get both speed and compression,
> which is what the pg_dump -Z option seems to promise as well, depending on the level
> you set. But it doesn't deliver that.

Who says that the only goal of pg_dump is speed?

Like I said, specialized compression software is way better than PostgreSQL's
built-in "pglz".

Yours,
Laurenz Albe
On 2023-Oct-20, Janning Vygen wrote:

> So maybe you could fix this by changing pg_dump so that it does not compress bytea data.

The problem with this idea is that the compression applies to the whole data stream.
The backup format doesn't allow compressing some columns and not others, and doing
that would require a lot of changes to pg_dump.

Maybe a workable idea would be to allow excluding individual tables from being
compressed.

--
Álvaro Herrera               Breisgau, Deutschland — https://www.EnterpriseDB.com/
"A man never knows what he is capable of until he tries" (C. Dickens)
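Until something like that exists in pg_dump itself, a manual split along the same
lines is already possible with existing options. A sketch, assuming the large bytea
data lives in a table called "public.blobs" (that name and "mydb" are placeholders):

    # compressed dump of everything except the bytea-heavy table's data
    pg_dump -Fc --exclude-table-data=public.blobs -f rest.dump mydb

    # separate, uncompressed dump of just that table's data
    pg_dump -Fc -Z0 --data-only --table=public.blobs -f blobs.dump mydb

On restore, the first archive has to go in before the second, so that the table
already exists when its data is loaded.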