Re: Non-text mode for pg_dumpall - Mailing list pgsql-hackers
From | Mahendra Singh Thalor |
---|---|
Subject | Re: Non-text mode for pg_dumpall |
Date | |
Msg-id | CAKYtNAr-6_CCz+tnJ0T2597Ar4iAZXkE1qPimsO9EY-Kn7LvzQ@mail.gmail.com Whole thread Raw |
In response to | Re: Non-text mode for pg_dumpall (Noah Misch <noah@leadboat.com>) |
List | pgsql-hackers |
Thanks Noah for the comments. On Wed, 9 Jul 2025 at 02:58, Noah Misch <noah@leadboat.com> wrote: > > On Fri, Apr 04, 2025 at 04:11:05PM -0400, Andrew Dunstan wrote: > > Thanks. I have pushed these now with a few further small tweaks. > > This drops all databases: > > pg_dumpall --clean -Fd -f /tmp/dump > pg_restore -d template1 --globals-only /tmp/dump > > That didn't match my expectations given this help text: > > $ pg_restore --help|grep global > -g, --globals-only restore only global objects, no databases Databases are global objects so due to --clean command, we are putting drop commands in global.dat for all the databases. While restoring, we used the "--globals-only" option so we are dropping all these databases by global.dat file. Please let us know your expectations for this specific case. > > This happens in dropDBs(). I found that by searching pg_dumpall.c for "OPF", > which finds all the content we can write to globals.dat. > > commit 1495eff wrote: > > --- a/src/bin/pg_dump/pg_dumpall.c > > +++ b/src/bin/pg_dump/pg_dumpall.c > > > @@ -1612,9 +1683,27 @@ dumpDatabases(PGconn *conn) > > continue; > > } > > > > + /* > > + * If this is not a plain format dump, then append dboid and dbname to > > + * the map.dat file. > > + */ > > + if (archDumpFormat != archNull) > > + { > > + if (archDumpFormat == archCustom) > > + snprintf(dbfilepath, MAXPGPATH, "\"%s\"/\"%s\".dmp", db_subdir, oid); > > + else if (archDumpFormat == archTar) > > + snprintf(dbfilepath, MAXPGPATH, "\"%s\"/\"%s\".tar", db_subdir, oid); > > + else > > + snprintf(dbfilepath, MAXPGPATH, "\"%s\"/\"%s\"", db_subdir, oid); > > Use appendShellString() instead. Plain mode already does that for the > "pg_dumpall -f" argument, which is part of db_subdir here. We don't want > weird filename characters to work out differently for plain vs. non-plain > mode. Also, it's easier to search for appendShellString() than to search for > open-coded shell quoting. Yes, we can use appendShellString also. We are using snprintf in the pg_dump.c file also. Ex: snprintf(tagbuf, sizeof(tagbuf), "LARGE OBJECTS %u..%u", loinfo->looids[0], loinfo->looids[loinfo->numlos - 1]); If we want to use appendShellString, I can write a patch for these. Please let me know your opinion. > > > @@ -1641,19 +1727,30 @@ dumpDatabases(PGconn *conn) > > if (filename) > > fclose(OPF); > > > > - ret = runPgDump(dbname, create_opts); > > + ret = runPgDump(dbname, create_opts, dbfilepath, archDumpFormat); > > if (ret != 0) > > pg_fatal("pg_dump failed on database \"%s\", exiting", dbname); > > > > if (filename) > > { > > - OPF = fopen(filename, PG_BINARY_A); > > + char global_path[MAXPGPATH]; > > + > > + if (archDumpFormat != archNull) > > + snprintf(global_path, MAXPGPATH, "%s/global.dat", filename); > > + else > > + snprintf(global_path, MAXPGPATH, "%s", filename); > > + > > + OPF = fopen(global_path, PG_BINARY_A); > > if (!OPF) > > pg_fatal("could not re-open the output file \"%s\": %m", > > - filename); > > + global_path); > > Minor item: plain mode benefits from reopening, because pg_dump appended to > the plain output file. There's no analogous need to reopen global.dat, since > just this one process writes to global.dat. yes, only once we need to open global.dat file but to keep simple code, we kept old code. > > > @@ -1672,17 +1770,36 @@ runPgDump(const char *dbname, const char *create_opts) > > initPQExpBuffer(&connstrbuf); > > initPQExpBuffer(&cmd); > > > > - printfPQExpBuffer(&cmd, "\"%s\" %s %s", pg_dump_bin, > > - pgdumpopts->data, create_opts); > > - > > /* > > - * If we have a filename, use the undocumented plain-append pg_dump > > - * format. > > + * If this is not a plain format dump, then append file name and dump > > + * format to the pg_dump command to get archive dump. > > */ > > - if (filename) > > - appendPQExpBufferStr(&cmd, " -Fa "); > > + if (archDumpFormat != archNull) > > + { > > + printfPQExpBuffer(&cmd, "\"%s\" -f %s %s", pg_dump_bin, > > + dbfile, create_opts); > > + > > + if (archDumpFormat == archDirectory) > > + appendPQExpBufferStr(&cmd, " --format=directory "); > > + else if (archDumpFormat == archCustom) > > + appendPQExpBufferStr(&cmd, " --format=custom "); > > + else if (archDumpFormat == archTar) > > + appendPQExpBufferStr(&cmd, " --format=tar "); > > + } > > else > > - appendPQExpBufferStr(&cmd, " -Fp "); > > + { > > + printfPQExpBuffer(&cmd, "\"%s\" %s %s", pg_dump_bin, > > + pgdumpopts->data, create_opts); > > This uses pgdumpopts for plain mode only, so many pg_dumpall options silently > have no effect in non-plain mode. Example: > > strace -f pg_dumpall --lock-wait-timeout=10 2>&1 >/dev/null | grep exec > strace -f pg_dumpall --lock-wait-timeout=10 -Fd -f /tmp/dump3 2>&1 >/dev/null | grep exec Agreed. We can add pgdumpopts->data to all the dump formats. > > > --- a/src/bin/pg_dump/pg_restore.c > > +++ b/src/bin/pg_dump/pg_restore.c > > > +/* > > + * read_one_statement > > + * > > + * This will start reading from passed file pointer using fgetc and read till > > + * semicolon(sql statement terminator for global.dat file) > > + * > > + * EOF is returned if end-of-file input is seen; time to shut down. > > What makes it okay to use this particular subset of SQL lexing? To support complex syntax, we used this code from another file. > > > +/* > > + * get_dbnames_list_to_restore > > + * > > + * This will mark for skipping any entries from dbname_oid_list that pattern match an > > + * entry in the db_exclude_patterns list. > > + * > > + * Returns the number of database to be restored. > > + * > > + */ > > +static int > > +get_dbnames_list_to_restore(PGconn *conn, > > + SimpleOidStringList *dbname_oid_list, > > + SimpleStringList db_exclude_patterns) > > +{ > > + int count_db = 0; > > + PQExpBuffer query; > > + PGresult *res; > > + > > + query = createPQExpBuffer(); > > + > > + if (!conn) > > + pg_log_info("considering PATTERN as NAME for --exclude-database option as no db connection while doingpg_restore."); > > When do we not have a connection here? We'd need to document this behavior > variation if it stays, but I'd prefer if we can just rely on having a > connection. Yes, we can document this behavior. > > > + /* If database is already created, then don't set createDB flag. */ > > + if (opts->cparams.dbname) > > + { > > + PGconn *test_conn; > > + > > + test_conn = ConnectDatabase(db_cell->str, NULL, opts->cparams.pghost, > > + opts->cparams.pgport, opts->cparams.username,TRI_DEFAULT, > > + false, progname, NULL, NULL, NULL, NULL); > > + if (test_conn) > > + { > > + PQfinish(test_conn); > > + > > + /* Use already created database for connection. */ > > + opts->createDB = 0; > > + opts->cparams.dbname = db_cell->str; > > + } > > + else > > + { > > + /* we'll have to create it */ > > + opts->createDB = 1; > > + opts->cparams.dbname = connected_db; > > + } > > In released versions, "pg_restore --create" fails if the database exists, and > pg_restore w/o --create fails unless the database exists. I think we should > continue that pattern in this new feature. If not, pg_restore should document > how it treats pg_dumpall-sourced dumps with the "create if not exists" > semantics appearing here. Yes, we can document this behavior also. I am working on all these review comments and I will post a patch in the coming days. Thanks and Regards Mahendra Singh Thalor EnterpriseDB: http://www.enterprisedb.com
pgsql-hackers by date: