Re: pg_background (and more parallelism infrastructure patches) - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: pg_background (and more parallelism infrastructure patches) |
Date | |
Msg-id | CA+Tgmobr9u15pE9d6gTD0DLwnjLwSDyW72PeaKc9-kROB0=Duw@mail.gmail.com |
In response to | Re: pg_background (and more parallelism infrastructure patches) (Andres Freund <andres@2ndquadrant.com>) |
Responses | Re: pg_background (and more parallelism infrastructure patches) |
List | pgsql-hackers |
On Mon, Nov 10, 2014 at 1:29 PM, Andres Freund <andres@2ndquadrant.com> wrote:
>> That's an issue to which we may need to engineer some solution, but
>> not in this patch. Overall, the patch's architecture is modeled
>> closely on the way we synchronize GUCs to new backends when using
>> EXEC_BACKEND, and I don't think we're going to do any better than
>> starting with that and working to file down the rough edges as we go.
>> So I'd like to go ahead and commit this.
>
> Did you check out whether PGC_BACKEND GUCs work? Other than that I agree
> that we can just solve further issues as we notice them. This isn't
> particularly intrusive.

Is the scenario you're concerned about this one:

1. Start postmaster.
2. Start a user session.
3. Change a PGC_BACKEND GUC in postgresql.conf.
4. Reload.
5. Launch a background worker that uses this code.

The worker should end up with the same value as the active session, not the current one from postgresql.conf. But we want to verify that's the behavior we actually get. Right?

>> -- It doesn't handle types without send/receive functions. I thought
>> that was tolerable since the types without such functions seem like
>> mostly edge cases, but Andres doesn't like it. Apparently, BDR makes
>> provisions to either ship the tuple as one big blob - if all built-in
>> types - or else use binary format where supported and text format
>> otherwise. Since this issue is common to BDR, parallelism, and
>> pg_background, we might want to introduce some common infrastructure
>> for it, but it's not too clear to me right now what that should look
>> like.
>
> I think we definitely need to solve this - but I'm also not at all sure
> how.
>
> For something like pg_background it's pretty trivial to fall back to
> in/out when there's no send/recv. It's just a couple of lines, and the
> portal code has the necessary provisions. So solving it for
> pg_background itself is pretty trivial.

That's not really the interesting part of the problem, I think. Yeah, pg_background can be made to speak text format if we really care. But the real issue is that even with the binary format, converting a tuple in on-disk format into a DataRow message, so that the other side can reconstruct the tuple and shove it into an executor slot (or wherever it wants to shove it), seems like it might be expensive. You might have data on that; if it's not expensive, stop here. If it is expensive, what we really want to do is figure out some way to make it safe to copy the tuple into the shared memory queue, or send it out over the socket, and have the process on the other end use it without having to revalidate the whole thing column by column.

> But, as you say, pg_background itself isn't a primary goal (although a
> nice demonstration, with some real-world applications). There are enough
> differences between the parallelism and the replication cases that I'm
> not entirely sure how much common ground there is. Namely, replication
> has the additional concerns of version differences, trust (don't use
> blobs if you don't fully trust the other side), and differences in the
> allocated oids (especially relevant for arrays which, very annoyingly,
> embed the oid in the send/recv format).

Yeah. It would be nice to use the same code for deciding what to do in a particular case. It seems like it ought to be possible to have one rule for whether a tuple with a given set of data types is safe to ship in the on-disk format.
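To make that a bit more concrete, here's a rough, untested sketch of the sort of check I have in mind. The function name and the particular criteria (every non-dropped column is of a built-in type that has send/receive functions) are just placeholders for whatever rule we end up agreeing on:

#include "postgres.h"

#include "access/htup_details.h"
#include "access/transam.h"
#include "access/tupdesc.h"
#include "catalog/pg_type.h"
#include "utils/syscache.h"

/*
 * Return true if every non-dropped column of tupdesc is of a type that
 * we're willing to ship without converting each datum individually.
 * The criteria below are placeholders, not a final rule.
 */
static bool
tupledesc_safe_to_ship(TupleDesc tupdesc)
{
	int			i;

	for (i = 0; i < tupdesc->natts; i++)
	{
		Form_pg_attribute attr = tupdesc->attrs[i];
		HeapTuple	tup;
		Form_pg_type typform;
		bool		ok;

		if (attr->attisdropped)
			continue;

		/*
		 * Non-built-in OIDs can't be assumed to mean the same thing on
		 * another node; that's mostly the BDR concern, not parallelism.
		 */
		if (attr->atttypid >= FirstNormalObjectId)
			return false;

		/* Insist on send/receive functions for the type. */
		tup = SearchSysCache1(TYPEOID, ObjectIdGetDatum(attr->atttypid));
		if (!HeapTupleIsValid(tup))
			return false;
		typform = (Form_pg_type) GETSTRUCT(tup);
		ok = OidIsValid(typform->typsend) && OidIsValid(typform->typreceive);
		ReleaseSysCache(tup);

		if (!ok)
			return false;
	}

	return true;
}

Presumably we'd compute this once per tuple descriptor and remember the answer rather than redoing it for every tuple.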
For pg_background/parallelism, it's enough if that function returns true; for BDR, there might be additional criteria applied before deciding to do it that way. But they could use the same decision tree for part of it.

What would be even better is to find some way to MAKE IT SAFE to send the undecoded tuple. I'm not sure what that would look like. Locking the types against concurrent changes? Marking dropped types as dropped without actually dropping them, and garbage-collecting them later? Providing safer ways to deconstruct tuples?

It might help to start by enumerating what exactly can go wrong here. As far as I can see, the main risks for pg_background and parallelism are (1) that the type might get concurrently dropped, leaving us unable to interpret the received tuple, and (2) that the type might get concurrently modified in some way that leaves us confused about how to interpret the tuple, as by having the type size change. The chances of a problem in practice seem remote, since you can't do either of those things if a table column with that type exists, but I can't convince myself that there's no way for the type to be modified under us in such a way that two different backends have different ideas about whether it exists or how wide it is.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company