Re: Non-text mode for pg_dumpall - Mailing list pgsql-hackers

From Vaibhav Dalvi
Subject Re: Non-text mode for pg_dumpall
Date
Msg-id CA+vB=AGsn4eUxsbLk_oy=iKzd8D_1Ne375XH-2u6Zncu72Q01Q@mail.gmail.com
Whole thread Raw
In response to Re: Non-text mode for pg_dumpall  (Vaibhav Dalvi <vaibhav.dalvi@enterprisedb.com>)
Responses Re: Non-text mode for pg_dumpall
List pgsql-hackers
Hi Mahendra,

Here are a few more comments following my review of the patch:

### 1\. Incorrect Comment for `-g` (globals-only) Option

The comment for the `-g` case in the code states that it restores the
`global.dat` file. However, in the non-text dump output, I only see the
following files: `databases`, `map.dat`, and `toc.dat`.

```c
+ case 'g':
+ /* restore only global.dat file from directory */
+ globals_only = true;
+ break;
```

Please update this comment to accurately reflect the file being restored
(e.g., `toc.dat` or the global objects within the archive).

### 2\. Incorrect Order of `case` Statements in `pg_restore.c`

The new `case 7` statement in `pg_restore.c` appears to be
inserted before `case 6`, disrupting the numerical order.

```c
+ case 7: /* database patterns to skip */
+ simple_string_list_append(&db_exclude_patterns, optarg);
+ break;

case 6:
opts->restrict_key = pg_strdup(optarg);
```

Please re-order the `case` statements so they follow ascending
numerical order.

### 3\. Missing Example in SGML Documentation

The SGML documentation for `pg_dumpall` is missing an explicit
example demonstrating its use with non-text formats (e.g., directory format).
It would be beneficial to include a clear example for this new feature.

### 4\. Cosmetic Issues

Please address the following minor stylistic points:

Please ensure the function signatures
adhere to standard coding style, particularly for line wrapping.
The following lines seem to have inconsistent indentation:

```c
static int restore_global_objects(const char *inputFileSpec, RestoreOptions *opts,
int numWorkers, bool append_data, int num, bool globals_only);
static int restore_all_databases(const char *inputFileSpec,
SimpleStringList db_exclude_patterns, RestoreOptions *opts, int numWorkers);
```

Please fix instances where the 80-character line limit is
crossed, such as in the example below:

```c
n_errors = restore_one_database(subdirpath, opts, numWorkers, true, 1, false);
```

I believe this concludes my formal review.

Thanks,
Vaibhav Dalvi

On Wed, Nov 5, 2025 at 12:29 PM Vaibhav Dalvi <vaibhav.dalvi@enterprisedb.com> wrote:
Hi Mahendra,

Thank you for the fix. Please find my further review comments below.

### Restrict-Key Option

The `--restrict-key` option is currently being accepted by
`pg_dumpall` even when non-plain formats are specified,
which contradicts its intended use only with the plain format.

For example:

```
$ ./db/bin/pg_dump --format=d -f testdump_dir --restrict-key=RESTRICT_KEY
pg_dump: error: option --restrict-key can only be used with --format=plain
$ ./db/bin/pg_dumpall --format=d -f testdump_dir --restrict-key=RESTRICT_KEY
pg_dumpall: error: invalid restrict key
```

I have attached a delta patch that addresses the issue with the
`--restrict-key` option. It would be beneficial to include a dedicated
test case for this check.

### Use of Dump Options Structure (dopt)

Please ensure consistency by utilizing the main dump options
structure (`dopt`) instead of declaring and using individual variables
where the structure already provides fields. For example, the
`output_clean` variable seems redundant here:

```c
case 'c':
output_clean = true;
dopt.outputClean = 1;
break;
```

In my attached delta file, I have replaced the unnecessary
`restrict_key` variable with `dopt.restrict_key`.

### Cosmetic Issues

1. Please review the spacing around the pointer:
```c
+ ((ArchiveHandle * )fout) ->connection = conn;
+ ((ArchiveHandle * ) fout) -> public.numWorkers = 1;
```
2. Please be consistent with the punctuation of single-line comments; 
    some end with a full stop (`.`) and others do not.
3. In the SGML documentation changes, some new statements start
    with one space, and others start with two. Please adhere to a single
    standard for indentation across the patch.

Regards,
Vaibhav
EnterpriseDB

On Mon, Nov 3, 2025 at 5:24 PM Mahendra Singh Thalor <mahi6run@gmail.com> wrote:
On Mon, 3 Nov 2025 at 12:06, Vaibhav Dalvi <vaibhav.dalvi@enterprisedb.com> wrote:
>
> Hi Mahendra,
>
> Thank you for your work on this feature.
> I have just begun reviewing the latest patch and
> encountered the following errors during the initial setup:
>
> ```
> $ ./db/bin/pg_restore testdump_dir -C -d postgres -F d -p 5556
> pg_restore: error: could not execute query: ERROR: syntax error at or near "\\"
> LINE 1: \restrict aO9K1gzVZTlafidF5fWx8ADGzUnIiAcguFz5qskGaFDygTCjCj...
> ^
> Command was: \restrict aO9K1gzVZTlafidF5fWx8ADGzUnIiAcguFz5qskGaFDygTCjCj9vg3Xxys1b3hb
>
> pg_restore: error: could not execute query: ERROR: syntax error at or near "\\"
> LINE 1: \unrestrict aO9K1gzVZTlafidF5fWx8ADGzUnIiAcguFz5qskGaFDygTCj...
> ^
> Command was: \unrestrict aO9K1gzVZTlafidF5fWx8ADGzUnIiAcguFz5qskGaFDygTCjCj9vg3Xxys1b3hb
>
> pg_restore: error: could not execute query: ERROR: syntax error at or near "\\"
> LINE 1: \connect template1
> ^
> Command was: \connect template1
>
> pg_restore: error: could not execute query: ERROR: syntax error at or near "\\"
> LINE 1: \connect postgres
> ^
> Command was: \connect postgres
> ```
> To cross-check tried with plain dump(with pg_dumpall) and
>  restored(SQL file restore) without patch and didn't get above
> connection errors.
>
> It appears there might be an issue with the dump file itself.
> Please note that this is my first observation as I have just
> started the review. I will continue with my assessment.
>
> Regards,
> Vaibhav Dalvi
> EnterpriseDB

Thanks Vaibhav for the review.
This change was added by me in v04. Only in the case of a file, we should restore these commands. Attached patch is fixing the same.

If we dump and restore the same file with the same user, then we will get an error of ROLE CREATE as the same role is already created. I think, either we can ignore this error, or we can keep it as a restore can be done with different users.
mst@localhost bin]$ ./pg_restore d1  -C -d postgres
pg_restore: error: could not execute query: ERROR:  role "mst" already exists
Command was: CREATE ROLE mst;
ALTER ROLE mst WITH SUPERUSER INHERIT CREATEROLE CREATEDB LOGIN REPLICATION BYPASSRLS;


pg_restore: warning: errors ignored on restore: 1


>
> On Fri, Oct 31, 2025 at 2:51 PM Mahendra Singh Thalor <mahi6run@gmail.com> wrote:
>>
>> On Tue, 28 Oct 2025 at 11:32, Mahendra Singh Thalor <mahi6run@gmail.com> wrote:
>> >
>> > On Thu, 16 Oct 2025 at 16:24, Mahendra Singh Thalor <mahi6run@gmail.com> wrote:
>> > >
>> > > On Wed, 15 Oct 2025 at 23:05, Mahendra Singh Thalor <mahi6run@gmail.com> wrote:
>> > > >
>> > > > On Sun, 24 Aug 2025 at 22:12, Andrew Dunstan <andrew@dunslane.net> wrote:
>> > > > >
>> > > > >
>> > > > > On 2025-08-23 Sa 9:08 PM, Noah Misch wrote:
>> > > > >
>> > > > > On Wed, Jul 30, 2025 at 02:51:59PM -0400, Andrew Dunstan wrote:
>> > > > >
>> > > > > OK, now that's reverted we should discuss how to proceed. I had two thoughts
>> > > > > - we could use invent a JSON format for the globals, or we could just use
>> > > > > the existing archive format. I think the archive format is pretty flexible,
>> > > > > and should be able to accommodate this. The downside is it's not humanly
>> > > > > readable. The upside is that we don't need to do anything special either to
>> > > > > write it or parse it.
>> > > > >
>> > > > > I would first try to use the existing archiver API, because that makes it
>> > > > > harder to miss bugs.  Any tension between that API and pg_dumpall is likely to
>> > > > > have corresponding tension on the pg_restore side.  Resolving that tension
>> > > > > will reveal much of the project's scope that remained hidden during the v18
>> > > > > attempt.  Perhaps more important than that, using the archiver API means
>> > > > > future pg_dump and pg_restore options are more likely to cooperate properly
>> > > > > with $SUBJECT.  In other words, I want it to be hard to add pg_dump/pg_restore
>> > > > > features that malfunction only for $SUBJECT archives.  The strength of the
>> > > > > archiver architecture shows in how rarely new features need format-specific
>> > > > > logic and how rarely format-specific bugs get reported.  We've had little or
>> > > > > no trouble with e.g. bugs that appear in -Fd but not in -Fc.
>> > > > >
>> > > > >
>> > > > > Yeah, that's what we're going to try.
>> > > > >
>> > > > >
>> > > > > cheers
>> > > > >
>> > > > >
>> > > > > andrew
>> > > > >
>> > > > > --
>> > > > > Andrew Dunstan
>> > > > > EDB: https://www.enterprisedb.com
>> > > >
>> > > > Thanks Andrew, Noah and all others for feedback.
>> > > >
>> > > > Based on the above suggestions and discussions, I removed sql commands
>> > > > from the global.dat file. For global commands, now we are making
>> > > > toc.dat/toc.dmp/toc.tar file based on format specified and based on
>> > > > format specified, we are making archive entries for these global
>> > > > commands. By this approach, we removed the hard-coded parsing part of
>> > > > the global.dat file and we are able to skip DROP DATABASE with the
>> > > > globals-only option.
>> > > >
>> > > > Here, I am attaching a patch for review, testing and feedback. This is
>> > > > a WIP patch. I will do some more code cleanup and will add some more
>> > > > comments also. Please review this and let me know design level
>> > > > feedback. Thanks Tushar Ahuja for some internal testing and feedback.
>> > > >
>> > >
>> > > Hi,
>> > > Here, I am attaching an updated patch. In offline discussion, Andrew
>> > > reported some test-case failures(Thanks Andrew). I fixed those.
>> > > Please let me know feedback for the patch.
>> > >
>> >
>> > Hi,
>> > Here I am attaching a re-based patch as v02 was failing on head.
>> > Thanks Tushar for the testing.
>> > Please review this and let me know feedback.
>> >
>>
>> Hi all,
>> Here I am attaching an updated patch for review and testing. Based on
>> some offline comments by Andrew, I did some code cleanup.
>> Please consider this patch for feedback.
>>
>> --
>> Thanks and Regards
>> Mahendra Singh Thalor
>> EnterpriseDB: http://www.enterprisedb.com



--
Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com

pgsql-hackers by date:

Previous
From: Heikki Linnakangas
Date:
Subject: Re: LISTEN/NOTIFY bug: VACUUM sets frozenxid past a xid in async queue
Next
From: Bryan Green
Date:
Subject: Re: [PATCH] Fix fragile walreceiver test.