Re: Non-text mode for pg_dumpall - Mailing list pgsql-hackers
| From | Mahendra Singh Thalor |
|---|---|
| Subject | Re: Non-text mode for pg_dumpall |
| Date | |
| Msg-id | CAKYtNAqQ5LHWNWyaeVOJaS=2xDJovfv9GShTBzf8_5s=jH7wsg@mail.gmail.com Whole thread Raw |
| In response to | Re: Non-text mode for pg_dumpall (Vaibhav Dalvi <vaibhav.dalvi@enterprisedb.com>) |
| List | pgsql-hackers |
Thanks Vaibhav, Tushar and Andrew for the review and testing.
On Mon, 3 Nov 2025 at 17:30, Vaibhav Dalvi
<vaibhav.dalvi@enterprisedb.com> wrote:
>
> Hi Mahendra,
>
> I have a few more review comments regarding the patch:
>
> 1. Is the following change in `src/bin/pg_dump/connectdb.c` intentional?
>
> ```
> --- a/src/bin/pg_dump/connectdb.c
> +++ b/src/bin/pg_dump/connectdb.c
Yes, we need this. If there is any error, then we were trying to
disconnect the database in 2 places so we were getting a crash. I will
try to reproduce crashe without this patch and will respond.
On Tue, 4 Nov 2025 at 18:23, tushar <tushar.ahuja@enterprisedb.com> wrote:
> Thanks Mahendra, I am getting a segmentation fault against v05 patch.
>
> [edb@1a1c15437e7c bin]$ ./pg_dumpall -Ft --file a.3 -v
> pg_dumpall: executing SELECT pg_catalog.set_config('search_path', '', false);
> Segmentation fault
>
> Issue is coming with all output file formats -F[t/c/d] except plain
>
> regards,
Thanks for the report. Fixed,
On Tue, 4 Nov 2025 at 22:25, Andrew Dunstan <andrew@dunslane.net> wrote:
> Yeah, I don't think we need to dump the timestamp in non-text modes. This fix worked for me:
>
>
> diff --git a/src/bin/pg_dump/pg_dumpall.c b/src/bin/pg_dump/pg_dumpall.c
> index 601b9f9738e..f66cc26d9a2 100644
> --- a/src/bin/pg_dump/pg_dumpall.c
> +++ b/src/bin/pg_dump/pg_dumpall.c
> @@ -638,7 +638,7 @@ main(int argc, char *argv[])
> if (quote_all_identifiers)
> executeCommand(conn, "SET quote_all_identifiers = true");
>
> - if (verbose)
> + if (verbose && archDumpFormat == archNull)
> dumpTimestamp("Started on");
Thanks Andrew. Yes, we should not dump timestamp in non-text modes.
On Wed, 5 Nov 2025 at 18:47, Vaibhav Dalvi
<vaibhav.dalvi@enterprisedb.com> wrote:
>
> Hi Mahendra,
>
> Here are a few more comments following my review of the patch:
>
> ### 1\. Incorrect Comment for `-g` (globals-only) Option
>
> The comment for the `-g` case in the code states that it restores the
> `global.dat` file. However, in the non-text dump output, I only see the
> following files: `databases`, `map.dat`, and `toc.dat`.
Fixed.
>
> ```c
> + case 'g':
> + /* restore only global.dat file from directory */
> + globals_only = true;
> + break;
Fixed.
> ```
>
> Please update this comment to accurately reflect the file being restored
> (e.g., `toc.dat` or the global objects within the archive).
Fixed.
>
> ### 2\. Incorrect Order of `case` Statements in `pg_restore.c`
>
> The new `case 7` statement in `pg_restore.c` appears to be
> inserted before `case 6`, disrupting the numerical order.
>
> ```c
> + case 7: /* database patterns to skip */
> + simple_string_list_append(&db_exclude_patterns, optarg);
> + break;
>
> case 6:
> opts->restrict_key = pg_strdup(optarg);
> ```
>
> Please re-order the `case` statements so they follow ascending
> numerical order.
Fixed.
>
> ### 3\. Missing Example in SGML Documentation
>
> The SGML documentation for `pg_dumpall` is missing an explicit
> example demonstrating its use with non-text formats (e.g., directory format).
> It would be beneficial to include a clear example for this new feature.
I think we don't add such cases in doc. We already added test cases in
code. If others also feel that we should add a test case in SGML, then
I will update the doc with the test case.
>
> ### 4\. Cosmetic Issues
>
> Please address the following minor stylistic points:
>
> Please ensure the function signatures
> adhere to standard coding style, particularly for line wrapping.
> The following lines seem to have inconsistent indentation:
>
> ```c
> static int restore_global_objects(const char *inputFileSpec, RestoreOptions *opts,
> int numWorkers, bool append_data, int num, bool globals_only);
> static int restore_all_databases(const char *inputFileSpec,
> SimpleStringList db_exclude_patterns, RestoreOptions *opts, int numWorkers);
> ```
>
> Please fix instances where the 80-character line limit is
> crossed, such as in the example below:
Fixed.
>
> ```c
> n_errors = restore_one_database(subdirpath, opts, numWorkers, true, 1, false);
> ```
>
> I believe this concludes my formal review.
>
> Thanks,
> Vaibhav Dalvi
>
> On Wed, Nov 5, 2025 at 12:29 PM Vaibhav Dalvi <vaibhav.dalvi@enterprisedb.com> wrote:
>>
>> Hi Mahendra,
>>
>> Thank you for the fix. Please find my further review comments below.
>>
>> ### Restrict-Key Option
>>
>> The `--restrict-key` option is currently being accepted by
>> `pg_dumpall` even when non-plain formats are specified,
>> which contradicts its intended use only with the plain format.
>>
>> For example:
>>
>> ```
>> $ ./db/bin/pg_dump --format=d -f testdump_dir --restrict-key=RESTRICT_KEY
>> pg_dump: error: option --restrict-key can only be used with --format=plain
>> $ ./db/bin/pg_dumpall --format=d -f testdump_dir --restrict-key=RESTRICT_KEY
>> pg_dumpall: error: invalid restrict key
>> ```
>>
>> I have attached a delta patch that addresses the issue with the
>> `--restrict-key` option. It would be beneficial to include a dedicated
>> test case for this check.
We should dump restrict-key with all modes as we need to restore with
the "-f file" option in text mode.
Ex: pg_dumpall --format=d -f testdump_dir
and restore::: pg_restore testdump_dir -d dabasename -C -f testdumpfile
(In testdumpfile, we will generate commands from archive dump)
So I am not merging this delat patch.
>>
>> ### Use of Dump Options Structure (dopt)
>>
>> Please ensure consistency by utilizing the main dump options
>> structure (`dopt`) instead of declaring and using individual variables
>> where the structure already provides fields. For example, the
>> `output_clean` variable seems redundant here:
>>
>> ```c
>> case 'c':
>> output_clean = true;
>> dopt.outputClean = 1;
>> break;
output_clean is not added by this patch. I will analyse this comment
and will respond in the next update.
>> ```
>>
>> In my attached delta file, I have replaced the unnecessary
>> `restrict_key` variable with `dopt.restrict_key`.
This is also not part of this patch. If you feel to add this in DOPT,
please suggest in separate thread.
>>
>> ### Cosmetic Issues
>>
>> 1. Please review the spacing around the pointer:
>> ```c
>> + ((ArchiveHandle * )fout) ->connection = conn;
>> + ((ArchiveHandle * ) fout) -> public.numWorkers = 1;
Fixed.
>> ```
>> 2. Please be consistent with the punctuation of single-line comments;
>> some end with a full stop (`.`) and others do not.
Based on nearby code comments, I made changes. I will try to fix these
inconsistencies..
>> 3. In the SGML documentation changes, some new statements start
>> with one space, and others start with two. Please adhere to a single
>> standard for indentation across the patch.
Okay. I will fix these.
>>
>> Regards,
>> Vaibhav
>> EnterpriseDB
>>
>> On Mon, Nov 3, 2025 at 5:24 PM Mahendra Singh Thalor <mahi6run@gmail.com> wrote:
>>>
>>> On Mon, 3 Nov 2025 at 12:06, Vaibhav Dalvi <vaibhav.dalvi@enterprisedb.com> wrote:
>>> >
>>> > Hi Mahendra,
>>> >
>>> > Thank you for your work on this feature.
>>> > I have just begun reviewing the latest patch and
>>> > encountered the following errors during the initial setup:
>>> >
>>> > ```
>>> > $ ./db/bin/pg_restore testdump_dir -C -d postgres -F d -p 5556
>>> > pg_restore: error: could not execute query: ERROR: syntax error at or near "\\"
>>> > LINE 1: \restrict aO9K1gzVZTlafidF5fWx8ADGzUnIiAcguFz5qskGaFDygTCjCj...
>>> > ^
>>> > Command was: \restrict aO9K1gzVZTlafidF5fWx8ADGzUnIiAcguFz5qskGaFDygTCjCj9vg3Xxys1b3hb
>>> >
>>> > pg_restore: error: could not execute query: ERROR: syntax error at or near "\\"
>>> > LINE 1: \unrestrict aO9K1gzVZTlafidF5fWx8ADGzUnIiAcguFz5qskGaFDygTCj...
>>> > ^
>>> > Command was: \unrestrict aO9K1gzVZTlafidF5fWx8ADGzUnIiAcguFz5qskGaFDygTCjCj9vg3Xxys1b3hb
>>> >
>>> > pg_restore: error: could not execute query: ERROR: syntax error at or near "\\"
>>> > LINE 1: \connect template1
>>> > ^
>>> > Command was: \connect template1
>>> >
>>> > pg_restore: error: could not execute query: ERROR: syntax error at or near "\\"
>>> > LINE 1: \connect postgres
>>> > ^
>>> > Command was: \connect postgres
>>> > ```
>>> > To cross-check tried with plain dump(with pg_dumpall) and
>>> > restored(SQL file restore) without patch and didn't get above
>>> > connection errors.
>>> >
>>> > It appears there might be an issue with the dump file itself.
>>> > Please note that this is my first observation as I have just
>>> > started the review. I will continue with my assessment.
>>> >
>>> > Regards,
>>> > Vaibhav Dalvi
>>> > EnterpriseDB
>>>
>>> Thanks Vaibhav for the review.
>>> This change was added by me in v04. Only in the case of a file, we should restore these commands. Attached patch is
fixingthe same.
>>>
>>> If we dump and restore the same file with the same user, then we will get an error of ROLE CREATE as the same role
isalready created. I think, either we can ignore this error, or we can keep it as a restore can be done with different
users.
>>>>
>>>> mst@localhost bin]$ ./pg_restore d1 -C -d postgres
>>>> pg_restore: error: could not execute query: ERROR: role "mst" already exists
>>>> Command was: CREATE ROLE mst;
>>>> ALTER ROLE mst WITH SUPERUSER INHERIT CREATEROLE CREATEDB LOGIN REPLICATION BYPASSRLS;
>>>>
>>>>
>>>> pg_restore: warning: errors ignored on restore: 1
>>>
>>>
>>>
>>> >
>>> > On Fri, Oct 31, 2025 at 2:51 PM Mahendra Singh Thalor <mahi6run@gmail.com> wrote:
>>> >>
>>> >> On Tue, 28 Oct 2025 at 11:32, Mahendra Singh Thalor <mahi6run@gmail.com> wrote:
>>> >> >
>>> >> > On Thu, 16 Oct 2025 at 16:24, Mahendra Singh Thalor <mahi6run@gmail.com> wrote:
>>> >> > >
>>> >> > > On Wed, 15 Oct 2025 at 23:05, Mahendra Singh Thalor <mahi6run@gmail.com> wrote:
>>> >> > > >
>>> >> > > > On Sun, 24 Aug 2025 at 22:12, Andrew Dunstan <andrew@dunslane.net> wrote:
>>> >> > > > >
>>> >> > > > >
>>> >> > > > > On 2025-08-23 Sa 9:08 PM, Noah Misch wrote:
>>> >> > > > >
>>> >> > > > > On Wed, Jul 30, 2025 at 02:51:59PM -0400, Andrew Dunstan wrote:
>>> >> > > > >
>>> >> > > > > OK, now that's reverted we should discuss how to proceed. I had two thoughts
>>> >> > > > > - we could use invent a JSON format for the globals, or we could just use
>>> >> > > > > the existing archive format. I think the archive format is pretty flexible,
>>> >> > > > > and should be able to accommodate this. The downside is it's not humanly
>>> >> > > > > readable. The upside is that we don't need to do anything special either to
>>> >> > > > > write it or parse it.
>>> >> > > > >
>>> >> > > > > I would first try to use the existing archiver API, because that makes it
>>> >> > > > > harder to miss bugs. Any tension between that API and pg_dumpall is likely to
>>> >> > > > > have corresponding tension on the pg_restore side. Resolving that tension
>>> >> > > > > will reveal much of the project's scope that remained hidden during the v18
>>> >> > > > > attempt. Perhaps more important than that, using the archiver API means
>>> >> > > > > future pg_dump and pg_restore options are more likely to cooperate properly
>>> >> > > > > with $SUBJECT. In other words, I want it to be hard to add pg_dump/pg_restore
>>> >> > > > > features that malfunction only for $SUBJECT archives. The strength of the
>>> >> > > > > archiver architecture shows in how rarely new features need format-specific
>>> >> > > > > logic and how rarely format-specific bugs get reported. We've had little or
>>> >> > > > > no trouble with e.g. bugs that appear in -Fd but not in -Fc.
>>> >> > > > >
>>> >> > > > >
>>> >> > > > > Yeah, that's what we're going to try.
>>> >> > > > >
>>> >> > > > >
>>> >> > > > > cheers
>>> >> > > > >
>>> >> > > > >
>>> >> > > > > andrew
>>> >> > > > >
>>> >> > > > > --
>>> >> > > > > Andrew Dunstan
>>> >> > > > > EDB: https://www.enterprisedb.com
>>> >> > > >
>>> >> > > > Thanks Andrew, Noah and all others for feedback.
>>> >> > > >
>>> >> > > > Based on the above suggestions and discussions, I removed sql commands
>>> >> > > > from the global.dat file. For global commands, now we are making
>>> >> > > > toc.dat/toc.dmp/toc.tar file based on format specified and based on
>>> >> > > > format specified, we are making archive entries for these global
>>> >> > > > commands. By this approach, we removed the hard-coded parsing part of
>>> >> > > > the global.dat file and we are able to skip DROP DATABASE with the
>>> >> > > > globals-only option.
>>> >> > > >
>>> >> > > > Here, I am attaching a patch for review, testing and feedback. This is
>>> >> > > > a WIP patch. I will do some more code cleanup and will add some more
>>> >> > > > comments also. Please review this and let me know design level
>>> >> > > > feedback. Thanks Tushar Ahuja for some internal testing and feedback.
>>> >> > > >
>>> >> > >
>>> >> > > Hi,
>>> >> > > Here, I am attaching an updated patch. In offline discussion, Andrew
>>> >> > > reported some test-case failures(Thanks Andrew). I fixed those.
>>> >> > > Please let me know feedback for the patch.
>>> >> > >
>>> >> >
>>> >> > Hi,
>>> >> > Here I am attaching a re-based patch as v02 was failing on head.
>>> >> > Thanks Tushar for the testing.
>>> >> > Please review this and let me know feedback.
>>> >> >
>>> >>
>>> >> Hi all,
>>> >> Here I am attaching an updated patch for review and testing. Based on
>>> >> some offline comments by Andrew, I did some code cleanup.
>>> >> Please consider this patch for feedback.
>>> >>
>>> >> --
>>> >> Thanks and Regards
>>> >> Mahendra Singh Thalor
>>> >> EnterpriseDB: http://www.enterprisedb.com
>>>
>>>
>>>
>>> --
>>> Thanks and Regards
>>> Mahendra Singh Thalor
>>> EnterpriseDB: http://www.enterprisedb.com
Here, I am attaching an updated patch for the review and testing.
--
Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com
Attachment
pgsql-hackers by date: