Re: Horribly slow pg_upgrade performance with many Large Objects - Mailing list pgsql-hackers

From Hannu Krosing
Subject Re: Horribly slow pg_upgrade performance with many Large Objects
Date
Msg-id CAMT0RQTXiqH7zdQEVSVd2L7_Cw4wQ1eHOD8hfZ+0vecMXJWc-w@mail.gmail.com
Whole thread Raw
In response to Horribly slow pg_upgrade performance with many Large Objects  (Hannu Krosing <hannuk@google.com>)
List pgsql-hackers
Hi Nathan,

Did a quick check of the patch and it seems to work ok.

What do you think of the idea of not dumping pg_shdepend here, but
instead adding the required entries after loading
pg_largeobject_metadata based on the contents of it ?

The query for this would be

WITH db AS (
    SELECT oid FROM pg_database WHERE datname = current_database()
)
INSERT INTO pg_shdepend
SELECT db.oid AS dbid,
       2613 AS classid,
       lm.oid AS objid,
       0 AS objsubid,
       1260 AS refclassid,
       COALESCE(acl.grantee, lm.lomowner) AS refobjid,
       CASE WHEN grantee IS NULL or grantee = lomowner
            THEN 'o'
            ELSE 'a'
       END as deptype
  FROM pg_largeobject_metadata as lm
  LEFT JOIN LATERAL (
    SELECT DISTINCT (aclexplode(lm.lomacl)).grantee
  ) AS acl ON true,
  db
  WHERE (lm.oid, COALESCE(acl.grantee, lm.lomowner)) NOT IN (SELECT
objid, refobjid FROM pg_shdepend WHERE dbid = db.oid)
;

(I had hoped to use ON CONFLICT DO NOTHING but this is not supported
for system tables.)


---|
Hannu










On Wed, May 7, 2025 at 4:51 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> On Mon, May 05, 2025 at 02:23:25PM -0500, Nathan Bossart wrote:
> > That leaves pg_shdepend.  For now, I've just instructed pg_upgrade to COPY
> > the relevant pg_shdepend rows as an independent step, but perhaps there's a
> > reasonably straightforward way to put that in pg_dump, too.
>
> It turns out there is.  TableDataInfo's filtercond field can be used to
> easily add a WHERE clause to the data dumping command.  On my laptop,
> upgrading with --jobs=8 with 10M large objects evenly distributed across 10
> databases (each with a non-bootstrap-superuser owner and another role with
> select rights) takes ~100 seconds without this patch and ~30 seconds with
> it.
>
> I've also added dependency tracking, version checks (this only works for
> upgrades from >=v12 for now), a hack to ensure the columns for
> pg_largeobject_metadata/pg_shdepend are collected, and comments.  I'm sure
> there's something I've missed, but this patch has worked well in my tests
> thus far.
>
> Taking a step back, I'm a little disappointed in the gains here.  A 3-9x
> speedup is nice, but I guess I was hoping to find another order of
> magnitude somewhere.  To do any better, I think we'd need to copy the files
> for pg_largeobject_metadata directly for upgrades from >= v16, but that
> would have to fit somewhere between when pg_restore creates the database
> and when it restores any large object comments/seclabels.  I'm not wild
> about the amount of hackery required to get that working.
>
> --
> nathan



pgsql-hackers by date:

Previous
From: Etsuro Fujita
Date:
Subject: Re: Proposal to allow DELETE/UPDATE on partitioned tables with unsupported foreign partitions
Next
From: Dmitry Dolgov
Date:
Subject: Re: Changing shared_buffers without restart