Thread: pg_dump: sortDumpableObjectsByTypeName() doesn't always do that

pg_dump: sortDumpableObjectsByTypeName() doesn't always do that

From

Jacob Champion

Date:

06 August 2018, 22:02:14

Hi all,

We recently ran into an issue in pg_dump that caused the initial
sort-by-name pass to return incorrect results. It doesn't seem to
affect overall correctness, since the later toposort pass takes care
of dependencies, but it does occasionally cause a spurious diff in
dump output before and after a pg_upgrade run.

The key appears to be in this comment, in pg_dump_sort.c:

/*
 * Sort by namespace. Note that all objects of the same type should
 * either have or not have a namespace link, so we needn't be fancy about
 * cases where one link is null and the other not.
 */

This doesn't appear to be correct anymore. From scanning the code, it
looks like the DO_DEFAULT_ACL type can optionally have a NULL
namespace. Even if it were correct, we can get to this part of the
code with objects of different types, as long as they share the same
sort priority (see DO_COLLATION and DO_TRANSFORM). We only ran into
this because of a bug in Greenplum that caused two types to share a
sort priority where they previously did not.

A quick and dirty patch is attached, which simply defines an ordering
between NULL and non-NULL namespaces so that quicksort behaves
rationally.

WDYT?

--Jacob

Attachment

pg-dump-sort.patch

Re: pg_dump: sortDumpableObjectsByTypeName() doesn't always do that

From

Tom Lane

Date:

06 August 2018, 22:13:06

Jacob Champion <pchampion@pivotal.io> writes:
> We recently ran into an issue in pg_dump that caused the initial
> sort-by-name pass to return incorrect results. It doesn't seem to
> affect overall correctness, since the later toposort pass takes care
> of dependencies, but it does occasionally cause a spurious diff in
> dump output before and after a pg_upgrade run.

Do you mean "incorrect results", or just "unstable results"?
If the former, what's incorrect about it?

            regards, tom lane

Re: pg_dump: sortDumpableObjectsByTypeName() doesn't always do that

From

Andrew Dunstan

Date:

06 August 2018, 22:16:56


On 08/06/2018 03:13 PM, Tom Lane wrote:
> Jacob Champion <pchampion@pivotal.io> writes:
>> We recently ran into an issue in pg_dump that caused the initial
>> sort-by-name pass to return incorrect results. It doesn't seem to
>> affect overall correctness, since the later toposort pass takes care
>> of dependencies, but it does occasionally cause a spurious diff in
>> dump output before and after a pg_upgrade run.
> Do you mean "incorrect results", or just "unstable results"?
> If the former, what's incorrect about it?
>
>             


I'd also like to see a test case.

We should perhaps have a more representative set of data for the upgrade 
testing, both same version and cross-version, which judge success by 
comparing pre and post dumps, but I don't recall ever seeing this.

cheers

andrew

-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: pg_dump: sortDumpableObjectsByTypeName() doesn't always do that

From

Jacob Champion

Date:

06 August 2018, 22:23:59

On Mon, Aug 6, 2018 at 12:13 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Do you mean "incorrect results", or just "unstable results"?
> If the former, what's incorrect about it?

Incorrect, as in "the results are not sorted by type name." Here's an
example ordering that we saw -- but note that you won't be able to
repro since it relies on the Greenplum bug I mentioned.

...
pg_catalog.xlogloc_ops
public._tmp_table
public.a
public.a_star
<null>.abstime date
<null>.abstime int4
<null>.abstime time
<null>.abstime timestamp
<null>.abstime timestamptz
gporca_faults.foo
...

You can see the inversion between public._tmp_table (which is TABLE
DATA) and gporca_faults.foo (which is also TABLE DATA). I can try to
work on a Postgres-specific test case if you'd like, but since the
root cause is that we're not defining a valid ordering, quicksort may
or may not behave consistently for test purposes. We got "lucky" here;
otherwise we'd never have noticed.

--Jacob

Re: pg_dump: sortDumpableObjectsByTypeName() doesn't always do that

From

Jacob Champion

Date:

06 August 2018, 22:34:47

On Mon, Aug 6, 2018 at 12:23 PM Jacob Champion <pchampion@pivotal.io> wrote:
> since the
> root cause is that we're not defining a valid ordering, quicksort may
> or may not behave consistently for test purposes.

To expand on this, consider three objects, of the same type, compared
with the current comparator function:

namespace_a.object_3 < namespace_b.object_1 (because a < b)
NULL.object_2 < namespace_a.object_3 (because 2 < 3)
namespace_b.object_1 < NULL.object_2 (because 1 < 2)

This is rock-paper-scissors instead of a partial ordering.

--Jacob

Re: pg_dump: sortDumpableObjectsByTypeName() doesn't always do that

From

Tom Lane

Date:

06 August 2018, 22:45:49

Jacob Champion <pchampion@pivotal.io> writes:
> ... since the
> root cause is that we're not defining a valid ordering, quicksort may
> or may not behave consistently for test purposes.

Ah, gotcha.  But whether the behavior is sane or not, it'd be reproducible
for any specific input dataset on any specific platform (unless you've got
a quicksort that actually uses randomized pivots; but ours doesn't, and
I think that pg_dump does use src/port/qsort.c).  So that partially
answers Andrew's question as to why we've not seen instability in the
buildfarm's results.

It also seems entirely possible that we simply don't have any cases in the
existing test data that provoke the indeterminate behavior --- to judge by
your concrete example, it might take specifically-chosen object names to
get into a situation where the comparator delivers inconsistent results.

            regards, tom lane

Re: pg_dump: sortDumpableObjectsByTypeName() doesn't always do that

From

Jacob Champion

Date:

06 August 2018, 23:38:27

On Mon, Aug 6, 2018 at 12:45 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Ah, gotcha.  But whether the behavior is sane or not, it'd be reproducible
> for any specific input dataset on any specific platform (unless you've got
> a quicksort that actually uses randomized pivots; but ours doesn't, and
> I think that pg_dump does use src/port/qsort.c).  So that partially
> answers Andrew's question as to why we've not seen instability in the
> buildfarm's results.

I completely missed that the qsort in use was part of libpgport; that
should make it much easier to repro. We'll give it a shot.

Thanks,
--Jacob

Re: pg_dump: sortDumpableObjectsByTypeName() doesn't always do that

From

Tom Lane

Date:

07 August 2018, 20:17:12

Jacob Champion <pchampion@pivotal.io> writes:
> On Mon, Aug 6, 2018 at 12:45 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Ah, gotcha.  But whether the behavior is sane or not, it'd be reproducible
>> for any specific input dataset on any specific platform (unless you've got
>> a quicksort that actually uses randomized pivots; but ours doesn't, and
>> I think that pg_dump does use src/port/qsort.c).  So that partially
>> answers Andrew's question as to why we've not seen instability in the
>> buildfarm's results.

> I completely missed that the qsort in use was part of libpgport; that
> should make it much easier to repro. We'll give it a shot.

I don't see any reason to insist on a test case before pushing this
fix, so I did that.  (As I expected, the fix doesn't change any existing
regression test results.)

            regards, tom lane

Re: pg_dump: sortDumpableObjectsByTypeName() doesn't always do that

From

Jacob Champion

Date:

07 August 2018, 21:41:30

On Tue, Aug 7, 2018 at 10:24 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I don't see any reason to insist on a test case before pushing this
> fix, so I did that.  (As I expected, the fix doesn't change any existing
> regression test results.)

Thanks! We have made some progress towards a repro but we're having
problems putting it into the pg_dump suite. Here's a simple case for
you:

    CREATE USER me;
    CREATE SCHEMA a;
    ALTER DEFAULT PRIVILEGES IN SCHEMA a GRANT UPDATE ON TABLES TO me;
    ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON SEQUENCES TO me;
    ALTER DEFAULT PRIVILEGES GRANT SELECT ON SEQUENCES TO me;

We dumped this from a newly created database (commit e0ee9305 from
master) and got the following order:

    ALTER DEFAULT PRIVILEGES ... IN SCHEMA public REVOKE ALL ON SEQUENCES...
    ALTER DEFAULT PRIVILEGES ... IN SCHEMA public GRANT SELECT ON SEQUENCES...
    ...
    ALTER DEFAULT PRIVILEGES ... GRANT SELECT ON SEQUENCES...
    ...
    ALTER DEFAULT PRIVILEGES ... IN SCHEMA a REVOKE ALL ON TABLES...
    ALTER DEFAULT PRIVILEGES ... IN SCHEMA a GRANT UPDATE ON TABLES...

In this case, schema a should dump before public. If you switch the
first two ALTER DEFAULT PRIVILEGES lines in the reproduction SQL, you
should get a different (correct) ordering.

--Jacob

Re: pg_dump: sortDumpableObjectsByTypeName() doesn't always do that

From

Tom Lane

Date:

07 August 2018, 21:48:52

Jacob Champion <pchampion@pivotal.io> writes:
> Thanks! We have made some progress towards a repro but we're having
> problems putting it into the pg_dump suite.

Yeah, I find the pg_dump test suite to be pretty much unreadable too.
Don't worry about it --- I don't think we need to memorialize a test
case for this.

> Here's a simple case for you:

It is nice to have that in case anyone wants to verify the fix manually,
though.  Thanks!

            regards, tom lane

Re: pg_dump: sortDumpableObjectsByTypeName() doesn't always do that

From

Stephen Frost

Date:

07 August 2018, 21:54:19

Greetings,

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Jacob Champion <pchampion@pivotal.io> writes:
> > Thanks! We have made some progress towards a repro but we're having
> > problems putting it into the pg_dump suite.
>
> Yeah, I find the pg_dump test suite to be pretty much unreadable too.
> Don't worry about it --- I don't think we need to memorialize a test
> case for this.

This would be a similar case to the test:

COPY fk_reference_test_table second

and really wouldn't be very difficult to add, you just have to realize
that the regexp has be set up to match the output of pg_dump and that's
it's a multi-line one, as that test case is.

For my 2c, this seems like a pretty reasonable test to add and wouldn't
be very hard to do.  If you'd like help, please reach out.

Thanks!

Stephen

Attachment

signature.asc