Re: fork/exec patch: pre-CreateProcess finalization - Mailing list pgsql-patches
From | Claudio Natoli |
---|---|
Subject | Re: fork/exec patch: pre-CreateProcess finalization |
Date | |
Msg-id | A02DEC4D1073D611BAE8525405FCCE2B0280AA@harris.memetrics.local Whole thread Raw |
In response to | fork/exec patch: pre-CreateProcess finalization (Claudio Natoli <claudio.natoli@memetrics.com>) |
Responses |
Re: fork/exec patch: pre-CreateProcess finalization
|
List | pgsql-patches |
Thanks for your comments Tom. Tom Lane writes: > Claudio Natoli <claudio.natoli@memetrics.com> writes: > > This has required some reworking of the existing code base, particularly to > > BackendFork (which now actually does the fork()). As such, I've been > > anticipating that this will be the most controversial of > the fork/exec patches, so critique away :-) > > You haven't explained why that's necessary. Given the fact that this > patch seems to hugely complicate the postmaster logic --- not so much > either path individually, but the messy #ifdef interleaving of two > radically different programs --- I am inclined to reject it on style > grounds alone. I have to agree, as I wasn't particularly happy with the BackendFork section of this patch. You didn't really seem to comment on the pgstat changes, which were much cleaner, so perhaps (like me) you were happier with these changes. So, could you give me an indication as to whether or not making changes to the BackendFork logic as done for the pgstat logic (specifically, replacing fork() call with a new, argument marshalling routine to do fork/exec) would be more acceptable? The reason I avoided this in the BackendFork case was, predominantly, code duplication. Creating a new "BackendForkExec" would, I think, result in a lot of similarity between the two routines. [yes, BackendFork is poorly named, although I'd still think of moving the fork() call into BackendFork anyway, thereby justifying its name. But, much more importantly, for symmetry with the code format that will inevitably arise in the fork/exec case; see below] > We should either find a way to make the fork/exec path more like the > existing code, or bite the bullet and change them both. Doing it the > way you have here will create an unmaintainable mess. I'm not prepared > to work on a postmaster in which a step as basic as fork() happens in > two different places depending on an #ifdef. ["unmaintainable mess": yeah, I had the exact same thought when I'd finished. "No one's ever going to be able to rework this properly".] I'm afraid that, short of removing all SubPostmaster actions from BackendFork, the need to do fork in two different places (at least, physically, if not logically different) will remain. After fork/exec'ing, the child process can't recover the context needed to create the argument list which it'll need to build up the PostgresMain arg list. So, in summary, I see two solutions: a) do something similar to BackendFork as done for pgstat, specifically: - move the fork call into BackendFork - create a fork/exec equivalent to BackendFork - #ifdef the call to the appropriate function in BackendStartup PRO: child is forked in the same logical place CON: possible duplication in code between BackendFork + BackendForkExec b) delay the fork() call to the end of BackendFork in both fork/exec and normal cases, turning it into solely an argument formatter for PostgresMain - move BackendInit, port->cmdline_options parsing + MemoryContext calls into PostgresMain PRO: forking in same (physical) location in code; no duplication CON: additional complexity at the start of PostgresMain to avoid these calls in stand-alone backend case Which is the lesser of two evils? FWIW, I prefer the latter, as it makes the normal + fork/exec cases effectively identical, for a little added complexity to PostgresMain. There's also the alternative you outlined, but I think it is less workable than the above options: > Now it seems to me that a lot of the mess in your latest patch comes > from trying to set up the backend command string before you've forked, > and therefore before you've done any of the sub-postmaster actions. > That's not gonna work. I agree that this is where the mess comes from, but I think this (setting up backend command string before fork) is *exactly* what needs to occur in the fork/exec case (leading into the Win32 CreateProcess calls). If we want this code to work in the Win32 case, we basically can't do any processing between the fork + exec calls, and I think recovering all the context to do the command string marshalling post fork/exec is untenable. > What I'd suggest is adding a new entry point from main.c, say > SubPostmasterMain(), with its own command line syntax. Perhaps > postmaster -fork ... additional args ... > (where -fork is what cues main.c where to dispatch to, and the > additional args are just those that need to be passed from the > postmaster to the sub-postmaster, without any of the additional > information that will be learned by the sub-postmaster during > client authentication.) > > In the Windows case the postmaster would fork/exec to this routine, > which would re-load as much of the postmaster context as was needed > and then call BackendFork (which we really oughta rename to something > else). BackendFork would execute the client auth process and then > simply call PostgresMain with a correctly constructed argument list. Here's the critical difference in our thinking: ISTM that we'll have to go to a lot of work to get the fork/exec case to re-load the context required to format up the argument list to PostgresMain. Certainly, it is a lot more work than allowing the postmaster itself (not the forked child) to create the argument list, albeit moving auth to PostgresMain. Ok. So, in conclusion we are in agreement on at least one thing: the current patch sucks. But, I suspect we may have an impasse on a sole crucial point: setting up the backend command string before forking. I'm all for it, and you seem pretty much against it. So, I guess I could try to bring you around with another, cleaner patch using one of the alternatives listed above. But, before doing that, I guess I should give you a chance to: a) try to convince me otherwise b) indicate which of the two alternatives you (even grudgingly :-) prefer (or another alternative in light of the above) Again, thanks for your comments, and looking forward to your reply (so that, hopefully, I can get cracking on a new patch). Cheers, Claudio --- Certain disclaimers and policies apply to all email sent from Memetrics. For the full text of these disclaimers and policies see <a href="http://www.memetrics.com/emailpolicy.html">http://www.memetrics.com/em ailpolicy.html</a>
pgsql-patches by date: