Thread: slow pg_dump with bytea
I don't know if the PG developers are aware of this:

https://serverfault.com/questions/1081642/postgresql-13-speed-up-pg-dump-to-5-minutes-instead-of-70-minutes

This question is quite popular, and many users are happy with the solution given there.
So maybe you could fix this by changing pg_dump so that it does not compress bytea data.

Regards
Janning
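For context, the workaround described in that answer boils down to turning off pg_dump's
built-in compression and compressing the stream externally. A minimal sketch, assuming a
database called "mydb" and "pigz" as the external parallel compressor (both are
placeholders; any compressor that reads from stdin would do):

    # custom-format dump with pg_dump's own compression turned off,
    # piped through an external parallel compressor
    pg_dump -Fc -Z0 mydb | pigz > mydb.dump.gz

    # restore: decompress on the fly and feed the archive to pg_restore
    pigz -dc mydb.dump.gz | pg_restore -d mydb

Moving the compression out of pg_dump is what makes this so much faster on
already-compressed bytea data.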
On Fri, 2023-10-20 at 12:26 +0200, Janning Vygen wrote:
> I don't know if the PG developers are aware of this:
>
> https://serverfault.com/questions/1081642/postgresql-13-speed-up-pg-dump-to-5-minutes-instead-of-70-minutes
>
> This question is quite popular, and many users are happy with the solution given there.
> So maybe you could fix this by changing pg_dump so that it does not compress bytea data.

That doesn't sound like a bug to me.

Compression is determined when "pg_dump" starts. How should it guess that
there is a binary column with compressed data in some table? Even if it did,
I wouldn't feel comfortable with a "pg_dump" smart enough to do this
automatically for me (and get it wrong occasionally).

In addition, I don't think this problem is limited to compressed binary data.
In my experience, compressed dumps are always slower than uncompressed ones.
It is a speed vs. size trade-off.

By the way, PostgreSQL v16 introduced "lz4" and "zstd" compression in "pg_dump",
which is much faster.

Yours,
Laurenz Albe
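For the archives, with v16 that looks roughly like this (a sketch; the database name,
output file and compression level are placeholders):

    # PostgreSQL 16: choose a faster compression method directly in pg_dump
    pg_dump -Fc --compress=zstd:3 -f mydb.dump mydb

    # or lz4, usually even faster at the cost of a somewhat larger dump
    pg_dump -Fc --compress=lz4 -f mydb.dump mydb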
Thanks for the quick response. I don't think it is a bug either, so this is actually
the wrong list, but I didn't find any "feature request" list. Sorry for that.

Anyhow, it seems odd to me that you can dump in a fraction of the time compared to the
default settings. With the solution in the link you get both speed and compression,
which is what the pg_dump -Z option seems to promise as well, depending on the level
you set. But it doesn't deliver that.

Regards
Janning

On 20.10.23 at 13:52, Laurenz Albe wrote:
> On Fri, 2023-10-20 at 12:26 +0200, Janning Vygen wrote:
>> I don't know if the PG developers are aware of this:
>>
>> https://serverfault.com/questions/1081642/postgresql-13-speed-up-pg-dump-to-5-minutes-instead-of-70-minutes
>>
>> This question is quite popular, and many users are happy with the solution given there.
>> So maybe you could fix this by changing pg_dump so that it does not compress bytea data.
>
> That doesn't sound like a bug to me.
>
> Compression is determined when "pg_dump" starts. How should it guess that
> there is a binary column with compressed data in some table? Even if it did,
> I wouldn't feel comfortable with a "pg_dump" smart enough to do this
> automatically for me (and get it wrong occasionally).
>
> In addition, I don't think this problem is limited to compressed binary data.
> In my experience, compressed dumps are always slower than uncompressed ones.
> It is a speed vs. size trade-off.
>
> By the way, PostgreSQL v16 introduced "lz4" and "zstd" compression in "pg_dump",
> which is much faster.
>
> Yours,
> Laurenz Albe
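One way to check this on your own data is to time the same dump at different -Z levels
(the database and file names below are placeholders):

    # compare dump time and size with compression disabled vs. the default level
    time pg_dump -Fc -Z0 -f mydb_z0.dump mydb   # no compression
    time pg_dump -Fc -Z6 -f mydb_z6.dump mydb   # roughly the built-in default
    ls -lh mydb_z0.dump mydb_z6.dump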
On Fri, 2023-10-20 at 14:08 +0200, Janning Vygen wrote:
> Anyhow, it seems odd to me that you can dump in a fraction of the time compared to the
> default settings. With the solution in the link you get both speed and compression,
> which is what the pg_dump -Z option seems to promise as well, depending on the level
> you set. But it doesn't deliver that.

Who says that the only goal of pg_dump is speed?

Like I said, specialized compression software is way better than PostgreSQL's
built-in "pglz".

Yours,
Laurenz Albe
On 2023-Oct-20, Janning Vygen wrote:

> So maybe you could fix this by changing pg_dump so that it does not compress bytea data.

The problem with this idea is that the compression applies to the whole data stream.
The backup format doesn't allow compressing some columns and not others, and doing
that would require a lot of changes to pg_dump.

Maybe a workable idea would be to allow excluding individual tables from being
compressed.

--
Álvaro Herrera               Breisgau, Deutschland — https://www.EnterpriseDB.com/
"A man never knows what he is capable of until he tries" (C. Dickens)
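Until something like that exists in pg_dump itself, a manual split along the same
lines is already possible with existing options. A sketch, assuming the large bytea
data lives in a table called "public.blobs" (that name and "mydb" are placeholders):

    # compressed dump of everything except the bytea-heavy table's data
    pg_dump -Fc --exclude-table-data=public.blobs -f rest.dump mydb

    # separate, uncompressed dump of just that table's data
    pg_dump -Fc -Z0 --data-only --table=public.blobs -f blobs.dump mydb

On restore, the first archive has to go in before the second, so that the table
already exists when its data is loaded.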