Re: [PATCH] lock_timeout and common SIGALRM framework - Mailing list pgsql-hackers
| From | Boszormenyi Zoltan |
|---|---|
| Subject | Re: [PATCH] lock_timeout and common SIGALRM framework |
| Date | |
| Msg-id | 500085DA.5050908@cybertec.at Whole thread Raw |
| In response to | Re: [PATCH] lock_timeout and common SIGALRM framework (Tom Lane <tgl@sss.pgh.pa.us>) |
| Responses |
Re: [PATCH] lock_timeout and common SIGALRM framework
Re: [PATCH] lock_timeout and common SIGALRM framework |
| List | pgsql-hackers |
2012-07-12 19:05 keltezéssel, Tom Lane írta:
> Here is a revised version of the timeout-infrastructure patch.
> I whacked it around quite a bit, notably:
>
> * I decided that the most convenient way to handle the initialization
> issue was to combine establishment of the signal handler with resetting
> of the per-process variables. So handle_sig_alarm is no longer global,
> and InitializeTimeouts is called at the places where we used to do
> "pqsignal(SIGALRM, handle_sig_alarm);". I believe this is correct
> because any subprocess that was intending to use SIGALRM must have
> called that before establishing any timeouts.
OK.
> * BTW, doing that exposed the fact that walsender processes were failing
> to establish a SIGALRM signal handler, which is a pre-existing bug,
> because they run a normal authentication transaction during InitPostgres
> and hence need to be able to cope with deadlock and statement timeouts.
> I will do something about back-patching a fix for that.
Wow, my work uncovered a pre-existing bug in PostgreSQL. :-)
> * I ended up putting the RegisterTimeout calls for DEADLOCK_TIMEOUT
> and STATEMENT_TIMEOUT into InitPostgres, ensuring that they'd get
> done in walsender and autovacuum processes. I'm not totally satisfied
> with that, but for sure it didn't work to only establish them in
> regular backends.
>
> * I didn't like the "TimeoutName" nomenclature, because to me "name"
> suggests that the value is a string, not just an enum. So I renamed
> that to TimeoutId.
OK.
> * I whacked around the logic in timeout.c a fair amount, because it
> had race conditions if SIGALRM happened while enabling or disabling
> a timeout. I believe the attached coding is safe, but I'm not totally
> happy with it from a performance standpoint, because it will do two
> setitimer calls (a disable and then a re-enable) in many cases where
> the old code did only one.
Disabling deadlock timeout, a.k.a. disable_sig_alarm(false) in
the original code called setitimer() twice if statement timeout
was still in effect, it was done in CheckStatementTimeout().
Considering this, I think there is no performance problem with
the new code you came up with.
> I think what we ought to do is go ahead and apply this, so that we
> can have the API nailed down, and then we can revisit the internals
> of timeout.c to see if we can't get the performance back up.
> It's clearly a much cleaner design than the old spaghetti logic in
> proc.c, so I think we ought to go ahead with this independently of
> whether the second patch gets accepted.
There is one tiny bit you might have broken. You wrote previously:
> I am also underwhelmed by the "timeout_start" callback function concept.
> In the first place, that's broken enable_timeout, which incorrectly
> assumes that the value it gets must be "now" (see its schedule_alarm
> call). In the second place, it seems fairly likely that callers of
> get_timeout_start would likewise want the clock time at which the
> timeout was enabled, not the timeout_start reference time. (If they
> did want the latter, why couldn't they get it from wherever the callback
> function had gotten it?) I'm inclined to propose that we drop the
> timeout_start concept and instead provide two functions for scheduling
> interrupts:
>
> enable_timeout_after(TimeoutName tn, int delay_ms);
> enable_timeout_at(TimeoutName tn, TimestampTz fin_time);
>
> where you use the former if you want the standard GetCurrentTimestamp +
> n msec calculation, but if you want the stop time calculated in some
> other way, you calculate it yourself and use the second function.
It's okay, but you haven't really followed it with STATEMENT_TIMEOUT:
-----8<----------8<----------8<----------8<----------8<-----
*** 2396,2404 **** /* Set statement timeout running, if any */ /* NB: this mustn't be
enableduntil we are within an xact */ if (StatementTimeout > 0)
! enable_sig_alarm(StatementTimeout, true); else
! cancel_from_timeout = false;
xact_started = true; }
--- 2397,2405 ---- /* Set statement timeout running, if any */ /* NB: this mustn't be
enableduntil we are within an xact */ if (StatementTimeout > 0)
! enable_timeout_after(STATEMENT_TIMEOUT, StatementTimeout); else
! disable_timeout(STATEMENT_TIMEOUT, false);
xact_started = true; }
-----8<----------8<----------8<----------8<----------8<-----
It means that StatementTimeout losts its precision. It would trigger
in the future counting from "now" instead of counting from
GetCurrentStatementStartTimestamp(). It should be:
enable_timeout_at(STATEMENT_TIMEOUT,
TimestampTzPlusMilliseconds(GetCurrentStatementStartTimestamp(), StatementTimeout));
> I haven't really looked at the second patch yet, but at minimum that
> will need some rebasing to match the API tweaks here.
Yes, I will do that.
Thanks for your review and work.
Best regards,
Zoltán Böszörményi
--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de http://www.postgresql.at/
pgsql-hackers by date: