Re: Bug in pg_dump/restore -o - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: Bug in pg_dump/restore -o |
Date | |
Msg-id | 3838.1011326894@sss.pgh.pa.us Whole thread Raw |
In response to | Re: Bug in pg_dump/restore -o (Philip Warner <pjw@rhyme.com.au>) |
Responses |
Re: Bug in pg_dump/restore -o
Re: Bug in pg_dump/restore -o |
List | pgsql-hackers |
Philip Warner <pjw@rhyme.com.au> writes: > I'm around now; do you still want me to look into this? Yup. >> pg_dump: [tar archiver] bad COPY statement - could not find "copy" in > string "cr >> eate temporary table pgdump_oid (dummy int4); >> copy pgdump_oid with oids from stdin; > I'm not sure, from your patch code, how it got the CREATE and COPY > statements in the one string - did you by any chance use an defective dump > file from a previous patch attempt? Sorry that I wasn't clearer. This message did not come from the given patch. What I had tried to do was issue a single ArchiveEntry call with the CREATE TABLE/COPY both in the copyStmt string of the one archive entry. I don't have that code handy anymore, but it was approximately ArchiveEntry(fout, "0", "Max OID", "<Init>", NULL, "", "", "CREATE TEMPORARY TABLE pgdump_oid(dummy int4);\n" "COPY pgdump_oid WITH OIDS FROM stdin;\n", "", setMaxOid_dumper,NULL); in setMaxOid, and the same code as given in the patch for setMaxOid_dumper. > This leads me to the question: when should the OID restoration be performed > - in the SCHEMA or DATA phase? ISTM that it is part of the data, and should > be performed after data restoration. But perhaps I misunderstand what it is > for. Well, that's an interesting question. I thought it was data too, when I was making these patches. But I just got off the horn from a long conversation with Bruce, who put in this max-oid-setting code to begin with, and what he says is that the original intention was to force up the OID counter *before* loading any schema information *or* data. The idea being that if you believe that OIDs are unique across your whole database, then you want the OIDs assigned to tables, sequences, rules, etc etc to be disjoint from those assigned to your user data rows. Since we can't directly restore the original OIDs of these constructs, the best we can do is ensure they will be assigned OIDs beyond the ones being loaded into data rows. Of course this whole concept collapses in the face of OID wraparound, not to mention the current idea that we should change to generating OIDs from per-table counters instead of globally. So one possible answer is to say "forget it, we don't guarantee that anymore" and just rip out the setMaxOid code entirely. But since we don't have per-table OID counters yet, that seems a bit premature to both me and Bruce. If we do want to try to maintain the original concept, it seems that the OID-setting code needs to be emitted in a way that would cause it to be executed *first* and in *all* restore modes: schema only, data only, or schema+data. Not sure if that's even possible given the current design of the archiver, but you'd know better than I. >>A potentially more serious problem is that if the archiving code chooses >>to issue other operations between the schema restore and data restore >>for the temp table, we might do a \connect and lose the temp table. >Why is this a problem - I presume I don't understand the OID allocation stuff. CREATE TEMP TABLE pgdump_oids (...); \connect - otheruser CREATE TABLE someothertable; ... \connect - postgres COPY pgdump_oids FROM stdin; ERROR: pgdump_oids doesn't exist Since \connect starts a fresh connection, the temp table will disappear. AFAICS, the only way to work around this is to be certain that no other archive TOC entries will be emitted between the schema and data for pgdump_oids (if these are separate TOC entries). I'd prefer a solution in which they are only one TOC entry, since that seems less likely to break in the future... regards, tom lane
pgsql-hackers by date: