Re: BUG #8532: postgres fails to start with timezone-data >=2013e - Mailing list pgsql-bugs
From | Tom Lane |
---|---|
Subject | Re: BUG #8532: postgres fails to start with timezone-data >=2013e |
Date | |
Msg-id | 10389.1384135035@sss.pgh.pa.us |
In response to | Re: BUG #8532: postgres fails to start with timezone-data >=2013e (Heikki Linnakangas <hlinnakangas@vmware.com>) |
Responses | Re: BUG #8532: postgres fails to start with timezone-data >=2013e |
List | pgsql-bugs |
Heikki Linnakangas <hlinnakangas@vmware.com> writes:
> On 16.10.2013 13:09, timo.gurr@gmail.com wrote:
>> # ls -la /usr/share/zoneinfo/
>> lrwxrwxrwx 1 root root 1 Oct 16 12:06 posix -> .

> That patch conflicts with the upstream Makefile change to create the
> "other" directory as a symlink. With the vanilla zoneinfo sources, the
> symlink is fine, but by putting 'posix' inside 'zoneinfo' directory, the
> Gentoo-specific patch creates that infinite recursion situation.

I agree, this is an egregious packaging bug. Programs should be able to
enumerate the timezone database without running into infinite recursion.

> That said, the error message you get from PostgreSQL isn't very
> user-friendly. There is a check on recursion depth in the timezone
> traversing code, but apparently it trips on another limit first, on the
> number of directory handles that can be open at a time.

> Also, I don't understand how this is preventing PostgreSQL from starting
> up. AFAICS the traversal of the timezones is only done when you query
> the pg_timezone_names system view. Not at startup.

Keep in mind that in 9.2 and later, we traverse the timezone tree in
initdb to set the timezone GUC. Before that, we would do it in
postmaster startup --- but only if we didn't find a TZ variable in the
environment nor a setting in postgresql.conf.

I tried to reproduce this in HEAD by inserting a bogus symlink into the
installation timezone tree and running initdb. Curiously, it did not
fail, though initdb took rather longer than expected. After debugging
I realized that scan_available_timezones() was in fact recursing deeper
and deeper into the posix/posix/posix/posix/... nest --- but eventually,
the constructed filename exceeds MAXPGPATH, and we truncate it, and fail
to open the truncated filename, so the recursion stops. (And you don't
get any error message, unless you compiled with DEBUG_IDENTIFY_TIMEZONE.)

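
[Illustration, not part of the original message.] The termination behaviour described above can be sketched with a toy model in Python. This is not PostgreSQL code: the function names `scan_zone_tree` and `looping_tree` are hypothetical, the directory tree is an in-memory dict rather than a real filesystem, and an over-long name is simply skipped, standing in for the "truncate, then fail to open" step. The value 1024 for `MAXPGPATH` is assumed to match a stock build.

```python
MAXPGPATH = 1024  # assumed buffer size; the cap is what ends the recursion


def scan_zone_tree(tree, max_path=MAXPGPATH):
    """Walk a dict-based directory tree (None marks a file).

    Returns (zone_names, deepest_level_reached). A name that would not
    fit in max_path is silently skipped, which is what stops a
    self-referencing 'posix -> .' entry from recursing forever.
    """
    names = []
    deepest = 0

    def walk(node, prefix, depth):
        nonlocal deepest
        deepest = max(deepest, depth)
        for name in sorted(node):
            full = f"{prefix}/{name}" if prefix else name
            if len(full) >= max_path:
                continue  # stand-in for the failed open of a truncated name
            child = node[name]
            if child is None:
                names.append(full)
            else:
                walk(child, full, depth + 1)

    walk(tree, "", 0)
    return names, deepest


def looping_tree():
    """Model a zoneinfo directory containing the symlink 'posix -> .'."""
    root = {"UTC": None}
    root["posix"] = root  # the directory contains itself
    return root
```

Calling `scan_zone_tree(looping_tree(), max_path=32)` stops five levels down; with the default cap the walk goes 170 levels deep before the names stop fitting, and no error is raised in either case.
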
Also, the implementation in initdb isn't vulnerable to running out of
descriptors because it sucks in an entire directory at a time with
pgfnames(), so it doesn't have a descriptor open when it recurses.
(Instead, it consumes a lot of memory --- but it looks like still only
about 10MB worth.)

In 9.1, the reason you see the maxAllocatedDescs complaint is that the
postmaster tries to set the timezone before it's increased max_safe_fds,
so it won't increase maxAllocatedDescs past 16. Enumerating the zones
in a regular backend would almost certainly report the timezone recursion
error instead. (I am kinda wondering why maxAllocatedDescs == 16 isn't
enough to get to a recursion error at depth 10, but maybe there are a few
other files open when this happens.)

Basically, I don't think we should do anything about this. Packaging
the TZ database like that is completely brain-dead, and Gentoo needs to
fix it, not tell us we're doing something wrong. The consequences of
their bug aren't too serious in modern PG releases anyway. (Given what
I know of their packaging policies, I have to wonder why they're still
shipping 9.1 rather than the bleeding edge...)

			regards, tom lane
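
[Illustration, not part of the original message.] The difference between the two traversal strategies mentioned above, holding a directory handle open while recursing versus reading the whole listing first, can be shown with a small counting model in Python. It is only a sketch: `peak_open_handles` and `nested` are hypothetical names, the tree is an in-memory dict, and "opening" a directory is just a counter increment.

```python
def peak_open_handles(tree, slurp):
    """Return the largest number of directory handles open at once.

    tree is a dict-based directory (None marks a file). With slurp=True
    the listing is read completely and the handle released before any
    recursion; with slurp=False the handle stays open across recursion.
    """
    open_now = 0
    peak = 0

    def walk(node):
        nonlocal open_now, peak
        open_now += 1                      # "opendir"
        peak = max(peak, open_now)
        if slurp:
            children = list(node.values())  # read everything first
            open_now -= 1                  # "closedir" before recursing
            for child in children:
                if child is not None:
                    walk(child)
        else:
            for child in node.values():
                if child is not None:
                    walk(child)            # handle still open here
            open_now -= 1                  # "closedir" after recursing

    walk(tree)
    return peak


def nested(levels):
    """Build a chain of 'posix' directories nested the given number of levels."""
    node = {"UTC": None}
    for _ in range(levels):
        node = {"UTC": None, "posix": node}
    return node
```

In this model the open-handle count of the streaming walk grows with nesting depth, so a low cap on open handles can be reached before a depth check fires if other files are already open, while the slurping walk never holds more than one handle and pays in memory for the listings instead.
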