Thread: pg_dump sort priority mismatch for large objects

pg_dump sort priority mismatch for large objects

From
Nathan Bossart
Date:
(moving to a new thread)

On Thu, Jul 10, 2025 at 06:05:26PM +0530, Nitin Motiani wrote:
>> There might be an existing issue here, because dbObjectTypePriorities
>> has the following comment:
>>
>> * NOTE: object-type priorities must match the section assignments made in
>> * pg_dump.c; that is, PRE_DATA objects must sort before DO_PRE_DATA_BOUNDARY,
>> * POST_DATA objects must sort after DO_POST_DATA_BOUNDARY, and DATA objects
>> * must sort between them.
>>
>> But dumpLO() puts large objects in SECTION_DATA, and PRIO_LARGE_OBJECT
>> is before PRIO_PRE_DATA_BOUNDARY. I admittedly haven't spent too much
>> time investigating this, though.
> 
> I looked through the history of this to see how this happened and if it
> could be an existing issue. Prior to a45c78e3284b, dumpLO used to put large
> objects in SECTION_PRE_DATA. That commit changed dumpLO and also changed
> addBoundaryDependencies to move DO_LARGE_OBJECT from pre-data to data
> section. Seems like since then this has been inconsistent with
> pg_dump_sort.c. I think the change in pg_dump_sort.c should be backported
> to PG17 & 18 independent of the state of the larger patch.

Thanks for doing the archaeology!

> But even with the inconsistency, it doesn't look like there is an existing
> issue. As the dependencies were changed in addBoundaryDependencies, that
> should take precedence over the order in pg_dump_sort.c. The
> dbObjectTypePriorities are used by sortDumpableObjectsByTypeName. But right
> after that sortDumpableObjects sorts the objects based on dependencies
> therefore the change in boundary dependencies should ensure that this is
> working as intended. Still I think dbObjectTypePriorities should be made
> consistent with the rest.

+1, if for no other reason than we'll need it to be below PRIO_TABLE_DATA
for the speed-up-pg_upgrade-with-many-LOs patch [0].  Does anyone see any
problems with applying something like the following down to v17?

diff --git a/src/bin/pg_dump/pg_dump_sort.c b/src/bin/pg_dump/pg_dump_sort.c
index 0b0977788f1..538e7dcb493 100644
--- a/src/bin/pg_dump/pg_dump_sort.c
+++ b/src/bin/pg_dump/pg_dump_sort.c
@@ -76,10 +76,10 @@ enum dbObjectTypePriorities
     PRIO_TABLE_ATTACH,
     PRIO_DUMMY_TYPE,
     PRIO_ATTRDEF,
-    PRIO_LARGE_OBJECT,
     PRIO_PRE_DATA_BOUNDARY,     /* boundary! */
     PRIO_TABLE_DATA,
     PRIO_SEQUENCE_SET,
+    PRIO_LARGE_OBJECT,
     PRIO_LARGE_OBJECT_DATA,
     PRIO_STATISTICS_DATA_DATA,
     PRIO_POST_DATA_BOUNDARY,    /* boundary! */

[0] https://postgr.es/m/aBkQLSkx1zUJ-LwJ%40nathan

-- 
nathan



Re: pg_dump sort priority mismatch for large objects

From
Tom Lane
Date:
Nathan Bossart <nathandbossart@gmail.com> writes:
> On Thu, Jul 10, 2025 at 06:05:26PM +0530, Nitin Motiani wrote:
>> I looked through the history of this to see how this happened and if it
>> could be an existing issue. Prior to a45c78e3284b, dumpLO used to put large
>> objects in SECTION_PRE_DATA. That commit changed dumpLO and also changed
>> addBoundaryDependencies to move DO_LARGE_OBJECT from pre-data to data
>> section. Seems like since then this has been inconsistent with
>> pg_dump_sort.c. I think the change in pg_dump_sort.c should be backported
>> to PG17 & 18 independent of the state of the larger patch.

> +1, if for no other reason than we'll need it to be below PRIO_TABLE_DATA
> for the speed-up-pg_upgrade-with-many-LOs patch [0].  Does anyone see any
> problems with applying something like the following down to v17?

That's clearly an oversight in a45c78e3284b.  I agree that fixing
pg_dump_sort.c to match shouldn't create any functional difficulties.
It might make the topological sort step marginally faster by
reducing the number of ordering violations that have to be fixed.

            regards, tom lane



Re: pg_dump sort priority mismatch for large objects

From
Nathan Bossart
Date:
Committed/back-patched.

-- 
nathan