Re: GUC assign hooks (was Re: wal_buffers = -1 and SIGHUP) - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: GUC assign hooks (was Re: wal_buffers = -1 and SIGHUP) |
Date | |
Msg-id | BANLkTinT-0E_QuZYGcTuEq5c4bVVxZtS_w@mail.gmail.com Whole thread Raw |
In response to | Re: GUC assign hooks (was Re: wal_buffers = -1 and SIGHUP) (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: GUC assign hooks (was Re: wal_buffers = -1 and SIGHUP)
|
List | pgsql-hackers |
On Mon, Apr 4, 2011 at 2:41 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I wrote: >> IMO the real problem is essentially that GUC assign hooks have two >> functions, checking and canonicalization of the value-to-be-stored >> versus executing secondary actions when an assignment is made; and >> there's no way to get at just the first one. So we cannot canonicalize >> the value first and then see if it's equal to the current setting. >> I think the only real fix is to change the API for assign hooks. This >> is a bit annoying but it's not like we haven't ever done that before. >> I'm thinking about splitting assign hooks into two functions, along the >> lines of > >> bool check_hook (datatype *new_value, GucSource source) >> bool assign_hook (datatype new_value, GucSource source) > > After perusing the existing assign_hook functions, I have some further > thoughts about this proposal. > > * Many of the existing assign hooks do a nontrivial amount of > computation (such as catalog lookups) to derive internal values from the > presented string; examples include assign_temp_tablespaces and > assign_timezone. A naive implementation of the above design would > require the assign_hook to repeat this computation after the check_hook > already did it, which of course is undesirable. > > * Assign hooks that do catalog lookups need special pushups to avoid > having to do such lookups while restoring a previous GUC setting during > transaction abort (since lookups can't safely be performed in an aborted > transaction). Up to now we've used ad-hoc techniques for each such > variable, as seen for instance in assign_session_authorization. The > usual idea is to modify the original string to store additional data, > which then requires a custom show_hook to ensure only the original part > of the string is shown to users. The messiest aspect of this is that > it must be possible to reliably tell a modified string from original > user input. > > I think that we can avoid the first problem and clean up the second > problem if we do this: > > 1. Code guc.c so that the check_hook is only applied to original user > input. When restoring a previous setting during abort (which > necessarily will have been passed through the check_hook at an earlier > time), only call the assign_hook. > > 2. Guarantee that the string pointer passed to the assign_hook is > exactly what was returned by the check_hook, ie, guc.c is not allowed > to duplicate or copy that string. > > Given these rules, a check_hook and assign_hook could cooperate to store > additional data in what guc.c thinks is just a pointer to a string > value, ie, there can be more data after the terminating \0. The > assign_hook could work off just this additional data without ever doing > a catalog lookup. No special show_hook is needed. The only thing this proposal has to recommend it is that the current coding is even worse. > Of course this only works for string GUC variables, but I'm finding it > hard to visualize a case where a bool, int, float, or enum GUC could > need a catalog lookup to interpret. We could possibly legislate that > all of these are separately malloc'd to allow the same type of trick to > be applied across the board, but I think that's overkill. We can just > tell people they must use a string GUC if they need hidden data. > > This technique would need documentation of course, but at least it > *would* be documented rather than ad-hoc for each variable. > > Another variant would be to allow the check_hook to pass back a separate > "void *" value that could be passed on to the assign_hook, containing > any necessary derived data. This is logically a bit cleaner, and would > work for all types of GUC variables; but it would make things messier in > guc.c since there would be an additional value to pass around. I'm not > convinced it's worth that, but could be talked into it if anyone feels > strongly about it. > > Another thing I was reminded of while perusing the code is the comment > for GUC_complaint_elevel: > > * At some point it'd be nice to replace this with a mechanism that allows > * the custom message to become the DETAIL line of guc.c's generic message. > > The reason we invented GUC_complaint_elevel in the first place was to > avoid a change in the signatures of assign hooks. If we are making such > a change, now would be the time to fix it, because we're never gonna fix > it otherwise. I see a few ways we could do it: > > 1. Add a "char **errdetail" parameter to assign_hooks, which guc.c would > initialize to NULL before call. If the hook stores a non-null pointer > there, guc.c would include that string as errdetail. This is the least > effort from guc.c's viewpoint, but will be relatively painful to use > from the hook's standpoint, because it'd generally have to palloc some > space, or maybe even set up a StringInfo buffer to contain the generated > message. > > 2. Add a "char *errdetail" parameter to assign_hooks, which points at > a local-variable buffer in the calling routine, of a well-known size > (think GUC_ERRDETAIL_BUFSIZE macro in guc.h). Then hooks do something > like > snprintf(errdetail, GUC_ERRDETAIL_BUFSIZE, _("format"), ...); > to return an error detail string. > > 3. Create a function in guc.c for hooks to call, along the lines of > void GUC_assign_errdetail(const char *format, ...) > The simplest implementation of this would rely on the assumption that > assign-hook-calling isn't re-entrant, so it could just store the > formatted string in a static variable. That seems a bit ugly, but at > least the ugliness would be hidden in a well-defined place, where it > could be fixed locally if the assumption ever breaks down. > > At this point I'm leaning to #3 but wondered if anyone would see it > as either overkill or too ugly. With any of these variants, we could > forget about my previous suggestion of adding an explicit elevel > parameter to assign hook calls, since the hooks would no longer need > to know the elog level to use anyway. Definitely +1 for #3. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: