Re: [PATCH] Largeobject access controls - Mailing list pgsql-hackers
From: KaiGai Kohei <kaigai@ak.jp.nec.com>
Subject: Re: [PATCH] Largeobject access controls
Msg-id: 4AD675A6.6050208@ak.jp.nec.com
In response to: Re: [PATCH] Largeobject access controls (Robert Haas <robertmhaas@gmail.com>)
Responses: Re: [PATCH] Largeobject access controls
List: pgsql-hackers

Robert Haas wrote:
> On Wed, Oct 14, 2009 at 12:02 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> KaiGai Kohei <kaigai@ak.jp.nec.com> writes:
>>> Tom Lane wrote:
>>>> The most serious problem is that you ripped out myLargeObjectExists(),
>>>> apparently because you didn't understand what it's there for. The reason
>>>> it's there is to ensure that pg_dump runs don't fail. pg_dump is expected
>>>> to produce a consistent dump of all large objects that existed at the
>>>> time of its transaction snapshot. With this code, pg_dump would get a
>>>> "large object doesn't exist" error on any LO that is deleted between the
>>>> time of the snapshot and the time pg_dump actually tries to dump it ---
>>>> which could be quite a large window in a large database.
>>> The origin of this matter is that the basis of existence was changed.
>>> Our current basis is whether pg_largeobject has one or more data chunks
>>> for the given loid under the correct snapshot, or not.
>>> The newer one is whether pg_largeobject_metadata has an entry for the
>>> given loid under SnapshotNow, or not.
>> Which is wrong. You can certainly switch your attention as to which
>> catalog to look in, but you can't change the snapshot that the test is
>> referenced to.
>>
>>> The newer basis is the same as for other database objects, such as
>>> functions. But why does pg_dump perform correctly for those database
>>> objects?
>> The reason pg_dump works reasonably well is that it takes locks on
>> tables, and the other objects it dumps (such as functions) have
>> indivisible one-row representations in the catalogs. They're either
>> there when pg_dump looks, or they're not. What you would have here
>> is that pg_dump would see LO data chunks when it looks into
>> pg_largeobject using its transaction snapshot, and then its attempts to
>> open those LO OIDs would fail because the metadata is not there anymore
>> according to SnapshotNow.
>>
>> There are certainly plenty of corner-case issues in pg_dump; I've
>> complained before that it's generally a bad idea to be migrating pg_dump
>> functionality into internal backend routines that tend to rely on
>> SnapshotNow. But if we change LO dumping like this, it's not going to
>> be a corner case --- it's going to be a common race-condition failure
>> with a window measured in many minutes if not hours. That's more than
>> sufficient reason to reject this patch, because the added functionality
>> isn't worth it. And pg_dump isn't even the only client that depends
>> on the assumption that a read-only open LO is static.
>>
>>>> Moving on to lesser but still significant problems, you probably
>>>> already guessed my next one: there's no pg_dump support.
>>> Hmm. I planned to add pg_dump support next, after the server-side
>>> enhancement.
>> We do not commit system changes that lack necessary pg_dump support.
>> There are some things you can leave till later, but pg_dump is not
>> negotiable. (Otherwise, testers would be left with databases they
>> can't dump/reload across the next forced initdb.)
>
> As part of closing out this CommitFest, I am marking this patch as
> Returned with Feedback. Because the amount of work that remains to be
> done is so substantial, that might have been a good idea anyway, but
> since the clock is expiring the point is moot.

Could you wait another 24 hours, please? It is unproductive to break off
the discussion for a month, even if the patch will not actually be
committed until December.

Thanks,
--
OSS Platform Development Division, NEC
KaiGai Kohei <kaigai@ak.jp.nec.com>
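For reference, the race Tom describes can be reproduced with two
concurrent sessions. A minimal sketch, assuming a hypothetical large
object with OID 16385 (262144 is the value of the INV_READ flag), and
assuming the patched behavior in which the existence test runs under
SnapshotNow:

    -- Session 1: simulate pg_dump's transaction snapshot
    BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
    SELECT DISTINCT loid FROM pg_largeobject;  -- snapshot sees the LO's data chunks

    -- Session 2: concurrently remove the large object
    SELECT lo_unlink(16385);

    -- Session 1: now try to open the LO it saw above
    SELECT lo_open(16385, 262144);             -- 262144 = INV_READ (x'40000')
    -- With the existence test referenced to SnapshotNow, this fails with
    -- "large object 16385 does not exist", even though the session's own
    -- snapshot still sees the object's data. myLargeObjectExists() checks
    -- existence under the transaction snapshot precisely to avoid this.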