Re: [PATCH] lock_timeout and common SIGALRM framework - Mailing list pgsql-hackers
From | Boszormenyi Zoltan |
---|---|
Subject | Re: [PATCH] lock_timeout and common SIGALRM framework |
Date | |
Msg-id | 500085DA.5050908@cybertec.at Whole thread Raw |
In response to | Re: [PATCH] lock_timeout and common SIGALRM framework (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: [PATCH] lock_timeout and common SIGALRM framework
Re: [PATCH] lock_timeout and common SIGALRM framework |
List | pgsql-hackers |
2012-07-12 19:05 keltezéssel, Tom Lane írta: > Here is a revised version of the timeout-infrastructure patch. > I whacked it around quite a bit, notably: > > * I decided that the most convenient way to handle the initialization > issue was to combine establishment of the signal handler with resetting > of the per-process variables. So handle_sig_alarm is no longer global, > and InitializeTimeouts is called at the places where we used to do > "pqsignal(SIGALRM, handle_sig_alarm);". I believe this is correct > because any subprocess that was intending to use SIGALRM must have > called that before establishing any timeouts. OK. > * BTW, doing that exposed the fact that walsender processes were failing > to establish a SIGALRM signal handler, which is a pre-existing bug, > because they run a normal authentication transaction during InitPostgres > and hence need to be able to cope with deadlock and statement timeouts. > I will do something about back-patching a fix for that. Wow, my work uncovered a pre-existing bug in PostgreSQL. :-) > * I ended up putting the RegisterTimeout calls for DEADLOCK_TIMEOUT > and STATEMENT_TIMEOUT into InitPostgres, ensuring that they'd get > done in walsender and autovacuum processes. I'm not totally satisfied > with that, but for sure it didn't work to only establish them in > regular backends. > > * I didn't like the "TimeoutName" nomenclature, because to me "name" > suggests that the value is a string, not just an enum. So I renamed > that to TimeoutId. OK. > * I whacked around the logic in timeout.c a fair amount, because it > had race conditions if SIGALRM happened while enabling or disabling > a timeout. I believe the attached coding is safe, but I'm not totally > happy with it from a performance standpoint, because it will do two > setitimer calls (a disable and then a re-enable) in many cases where > the old code did only one. Disabling deadlock timeout, a.k.a. disable_sig_alarm(false) in the original code called setitimer() twice if statement timeout was still in effect, it was done in CheckStatementTimeout(). Considering this, I think there is no performance problem with the new code you came up with. > I think what we ought to do is go ahead and apply this, so that we > can have the API nailed down, and then we can revisit the internals > of timeout.c to see if we can't get the performance back up. > It's clearly a much cleaner design than the old spaghetti logic in > proc.c, so I think we ought to go ahead with this independently of > whether the second patch gets accepted. There is one tiny bit you might have broken. You wrote previously: > I am also underwhelmed by the "timeout_start" callback function concept. > In the first place, that's broken enable_timeout, which incorrectly > assumes that the value it gets must be "now" (see its schedule_alarm > call). In the second place, it seems fairly likely that callers of > get_timeout_start would likewise want the clock time at which the > timeout was enabled, not the timeout_start reference time. (If they > did want the latter, why couldn't they get it from wherever the callback > function had gotten it?) I'm inclined to propose that we drop the > timeout_start concept and instead provide two functions for scheduling > interrupts: > > enable_timeout_after(TimeoutName tn, int delay_ms); > enable_timeout_at(TimeoutName tn, TimestampTz fin_time); > > where you use the former if you want the standard GetCurrentTimestamp + > n msec calculation, but if you want the stop time calculated in some > other way, you calculate it yourself and use the second function. It's okay, but you haven't really followed it with STATEMENT_TIMEOUT: -----8<----------8<----------8<----------8<----------8<----- *** 2396,2404 **** /* Set statement timeout running, if any */ /* NB: this mustn't be enableduntil we are within an xact */ if (StatementTimeout > 0) ! enable_sig_alarm(StatementTimeout, true); else ! cancel_from_timeout = false; xact_started = true; } --- 2397,2405 ---- /* Set statement timeout running, if any */ /* NB: this mustn't be enableduntil we are within an xact */ if (StatementTimeout > 0) ! enable_timeout_after(STATEMENT_TIMEOUT, StatementTimeout); else ! disable_timeout(STATEMENT_TIMEOUT, false); xact_started = true; } -----8<----------8<----------8<----------8<----------8<----- It means that StatementTimeout losts its precision. It would trigger in the future counting from "now" instead of counting from GetCurrentStatementStartTimestamp(). It should be: enable_timeout_at(STATEMENT_TIMEOUT, TimestampTzPlusMilliseconds(GetCurrentStatementStartTimestamp(), StatementTimeout)); > I haven't really looked at the second patch yet, but at minimum that > will need some rebasing to match the API tweaks here. Yes, I will do that. Thanks for your review and work. Best regards, Zoltán Böszörményi -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
pgsql-hackers by date: