Re: WIP patch for parallel pg_dump - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: WIP patch for parallel pg_dump |
Date | |
Msg-id | 27542.1291573727@sss.pgh.pa.us |
In response to | Re: WIP patch for parallel pg_dump (Greg Smith <greg@2ndquadrant.com>) |
Responses |
Re: WIP patch for parallel pg_dump
Re: WIP patch for parallel pg_dump |
List | pgsql-hackers |
Greg Smith <greg@2ndquadrant.com> writes:
> In addition, Joachim submitted a synchronized snapshot patch that looks
> to me like it slipped through the cracks without being fully explored.
> ...
> The way I read that thread, there were two objections:
> 1) This mechanism isn't general enough for all use-cases outside of
> pg_dump, which doesn't make it wrong when the question is how to get
> parallel pg_dump running
> 2) Running as superuser is excessive. Running as the database owner was
> suggested as likely to be good enough for pg_dump purposes.

IIRC, in old discussions of this problem we first considered allowing clients to pull down an explicit representation of their snapshot (which actually is an existing feature now, txid_current_snapshot()) and then upload that again to become the active snapshot in another connection. That was rejected on the grounds that you could cause all kinds of mischief by uploading a bad snapshot; so we decided to think about providing a server-side-only means to clone another backend's current snapshot. Which is essentially what Joachim's above-mentioned patch provides. However, as was discussed in that thread, that approach is far from being ideal either.

I'm wondering if we should reconsider the pass-it-through-the-client approach, because if we could make that work it would be more general and it wouldn't need any special privileges. The trick seems to be to apply sufficient sanity testing to the snapshot proposed to be installed in the subsidiary transaction.
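For reference, the text form returned by txid_current_snapshot() is xmin:xmax:xip_list, e.g. 10:20:10,14,15. The following is only a rough client-side sketch of pulling that representation apart and putting it back together before handing it to another connection. The names Snapshot, parse_snapshot and format_snapshot are made up for illustration and are not part of any PostgreSQL API.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical client-side holder for a txid_snapshot value."""
    xmin: int              # oldest XID still considered running
    xmax: int              # first XID not yet assigned when the snapshot was taken
    xip: Tuple[int, ...]   # XIDs in progress at snapshot time


def parse_snapshot(text: str) -> Snapshot:
    """Parse the 'xmin:xmax:xip_list' text form; xip_list may be empty."""
    xmin_s, xmax_s, xip_s = text.strip().split(":")
    xip = tuple(int(x) for x in xip_s.split(",") if x)
    return Snapshot(int(xmin_s), int(xmax_s), xip)


def format_snapshot(snap: Snapshot) -> str:
    """Serialize back to the same text form."""
    return f"{snap.xmin}:{snap.xmax}:{','.join(str(x) for x in snap.xip)}"
```

Nothing here validates the snapshot; it only shows what a client would be passing through, which is exactly why the server would have to sanity-check whatever comes back.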
I think the requirements would basically be
(1) xmin <= any listed XIDs < xmax
(2) xmin not so old as to cause GlobalXmin to decrease
(3) xmax not beyond current XID counter
(4) XID list includes all still-running XIDs in the given range

One tricky part would be ensuring GlobalXmin doesn't decrease when the snap is installed, but I think that could be made to work if we take ProcArrayLock exclusively and insist on observing some other running transaction with xmin <= proposed xmin. For the pg_dump case this would certainly hold since xmin would be the parent pg_dump's xmin.

Given the checks stated above, it would be possible for someone to install a snapshot that corresponds to no actual state of the database, eg it shows some T1 as running and T2 as committed when actually T1 committed before T2. I don't see any simple way for the installation function to detect that, but I'm not sure whether it matters. The user might see inconsistent data, but do we care? Perhaps as a safety measure we should only allow snapshot installation in read-only transactions, so that even if the xact does observe inconsistent data it can't possibly corrupt the database state thereby. This'd be no skin off pg_dump's nose, obviously. Or compromise on "only superusers can do it in non-read-only transactions".

Thoughts?

			regards, tom lane
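The four checks listed in the message could be sketched roughly as below. This is an illustration only, not server code: check_proposed_snapshot and all of its parameters are hypothetical names, the server state (next XID, running XIDs, other backends' xmins) is passed in as plain values, XID wraparound is ignored, and check (2) is approximated by the rule suggested in the message, i.e. some other running transaction must already have xmin <= the proposed xmin.

```python
def check_proposed_snapshot(xmin, xmax, xip, next_xid, running_xids, other_xmins):
    """Return the numbers of the requirements (1)-(4) that the proposal violates.

    xmin, xmax, xip : the snapshot proposed by the client
    next_xid        : current XID counter (assumed input)
    running_xids    : XIDs of transactions still running (assumed input)
    other_xmins     : xmin values advertised by other running transactions
    Plain integer comparison is used; real XIDs wrap around.
    """
    problems = []

    # (1) xmin <= any listed XID < xmax
    if not all(xmin <= x < xmax for x in xip):
        problems.append(1)

    # (2) installing the snapshot must not pull GlobalXmin backwards:
    #     require another running transaction with xmin <= proposed xmin
    if not any(o <= xmin for o in other_xmins):
        problems.append(2)

    # (3) xmax not beyond the current XID counter
    if xmax > next_xid:
        problems.append(3)

    # (4) every still-running XID inside [xmin, xmax) must be listed
    listed = set(xip)
    if any(xmin <= r < xmax and r not in listed for r in running_xids):
        problems.append(4)

    return problems
```

As the message notes, a snapshot can pass all four checks and still correspond to no state the database was ever in; the sketch does not attempt to detect that case.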