Thread: funkiness with '1999-12-31 19:00:00-05'::timestamp with time zone
Hi all, I've come across a puzzling situation with a table having a timestamp with time zone column. This column is full of values displaying exactly as '1999-12-31 19:00:00-05', but for some reason Postgres is treating some of these identical-seeming timestamps as being different. If I update all these timestamps by adding an interval of '1 DAYS' to all rows, Postgres recognizes all the values as being the same. If I repeat this experiment using a timestamp without time zone type, Postgres recognizes all the timestamps as being the same. When I pg_dump the timestamps_test table, I see a normal-looking dump: COPY timestamps_test (ts) FROM stdin; 1999-12-31 19:00:00-05 1999-12-31 19:00:00-05 1999-12-31 19:00:00-05 … and when I reload this pg_dump file back into the same database, Postgres again recognizes that all the timestamps are the same (i.e. SELECT COUNT(DISTINCT(ts)) returns 1). I've attached a plain-text pg_dump of this table. Here's a log of how I created this timestamps_test table, from a source table full of these '1999-12-31 19:00:00-05' timestamps. Any ideas what might be causing this? test=# CREATE TABLE timestamps_test (ts timestamp with time zone NOT NULL); CREATE TABLE test=# INSERT INTO timestamps_test (ts) SELECT DISTINCT(updated) FROM myschema.strange_table; INSERT 0 119 test=# SELECT COUNT(DISTINCT(ts)) FROM timestamps_test; count ------- 119 (1 row) test=# SELECT DISTINCT(ts) FROM timestamps_test LIMIT 10; ts ------------------------ 1999-12-31 19:00:00-05 1999-12-31 19:00:00-05 1999-12-31 19:00:00-05 1999-12-31 19:00:00-05 1999-12-31 19:00:00-05 1999-12-31 19:00:00-05 1999-12-31 19:00:00-05 1999-12-31 19:00:00-05 1999-12-31 19:00:00-05 1999-12-31 19:00:00-05 (10 rows) test=# SELECT MAX(ts) = MIN(ts), MAX(ts) - MIN(ts) FROM timestamps_test; ?column? | ?column? ----------+---------- f | 00:00:00 (1 row) test=# UPDATE timestamps_test SET ts = ts + INTERVAL '1 DAYS'; UPDATE 119 test=# SELECT COUNT(DISTINCT(ts)) FROM timestamps_test; count ------- 1 (1 row) test=# SELECT DISTINCT(ts) FROM timestamps_test LIMIT 10; ts ------------------------ 2000-01-01 19:00:00-05 (1 row) test=# SELECT version(); version -------------------------------------------------------------------------------- ----------------------------------- PostgreSQL 8.4.4 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 2 0080704 (Red Hat 4.1.2-46), 64-bit (1 row) test=# SELECT name, setting FROM pg_settings WHERE name IN ('TimeZone', 'lc_collate', 'lc_ctype', 'lc_time', 'DateStyle'); name | setting ------------+------------ DateStyle | ISO, MDY lc_collate | C lc_ctype | C lc_time | C TimeZone | US/Eastern (5 rows) Thanks for any ideas, Josh
Attachment
Josh Kupershmidt <schmiddy@gmail.com> writes: > I've come across a puzzling situation with a table having a timestamp > with time zone column. This column is full of values displaying > exactly as '1999-12-31 19:00:00-05', but for some reason Postgres is > treating some of these identical-seeming timestamps as being > different. Is this installation using float or integer timestamps? If the former, it might be interesting to look at the subtraction result ts - '1999-12-31 19:00:00-05'::timestamptz I'm thinking some of them might be different by submicrosecond amounts. regards, tom lane
On Fri, Sep 3, 2010 at 2:58 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Is this installation using float or integer timestamps? If the former, > it might be interesting to look at the subtraction result > ts - '1999-12-31 19:00:00-05'::timestamptz > I'm thinking some of them might be different by submicrosecond amounts. Ah yes, this is likely why. pg_config says CONFIGURE = ... '--disable-integer-datetimes' ... But I'm having trouble seeing for sure whether there are submicrosecond parts of these timestamps. I just see a bunch of '00:00:00' values with your query: test=# SELECT ts - '1999-12-31 19:00:00-05'::timestamptz FROM timestamps_test LIMIT 5; ?column? ---------- 00:00:00 00:00:00 00:00:00 00:00:00 00:00:00 (5 rows) And SELECT EXTRACT(microseconds FROM ts) FROM timestamps_test also just gives me zeroes. Is there a way for me to see for sure? Josh
Josh Kupershmidt <schmiddy@gmail.com> writes: > On Fri, Sep 3, 2010 at 2:58 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I'm thinking some of them might be different by submicrosecond amounts. > Ah yes, this is likely why. pg_config says CONFIGURE = ... > '--disable-integer-datetimes' ... > But I'm having trouble seeing for sure whether there are > submicrosecond parts of these timestamps. Experimenting, I can do this: regression=# create table t1 (ts timestamptz); CREATE TABLE regression=# insert into t1 select '1999-12-31 19:00:00.0000001-05'::timestamptz; INSERT 0 1 regression=# insert into t1 select '1999-12-31 19:00:00.000000-05'::timestamptz; INSERT 0 1 regression=# select * from t1; ts ------------------------ 1999-12-31 19:00:00-05 1999-12-31 19:00:00-05 (2 rows) regression=# select extract(epoch from ts - '1999-12-31 19:00:00-05'::timestamptz) from t1; date_part ---------------------- 1.00000761449337e-07 0 (2 rows) This timestamp (2000-01-01 00:00 GMT) is actually the zero value internally for Postgres timestamps, so in principle a float timestamp has precision far smaller than microseconds for values near this. We don't make any great effort to expose that though. It looks like the closest value that timestamptzin makes different from zero is regression=# select extract(epoch from '1999-12-31 19:00:00.00000000001-05' - '1999-12-31 19:00:00-05'::timestamptz) ; date_part ---------------------- 1.45519152283669e-11 (1 row) regards, tom lane
On Fri, Sep 3, 2010 at 3:24 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > regression=# select extract(epoch from ts - '1999-12-31 19:00:00-05'::timestamptz) from t1; > date_part > ---------------------- > 1.00000761449337e-07 > 0 > (2 rows) > > This timestamp (2000-01-01 00:00 GMT) is actually the zero value > internally for Postgres timestamps, so in principle a float timestamp > has precision far smaller than microseconds for values near this. > We don't make any great effort to expose that though. It looks like > the closest value that timestamptzin makes different from zero is > > regression=# select extract(epoch from '1999-12-31 19:00:00.00000000001-05' - '1999-12-31 19:00:00-05'::timestamptz) ; > date_part > ---------------------- > 1.45519152283669e-11 > (1 row) EXTRACT(epoch ...) was what I was looking for: SELECT EXTRACT(epoch FROM ts - '1999-12-31 19:00:00-05'::timestamptz) FROM timestamps_test LIMIT 5; date_part ----------------------- 1.4120666068199e-309 1.4154982781624e-309 1.41550281692099e-309 1.41591466059161e-309 1.41591524669472e-309 (5 rows) Thanks for the help, Tom. Josh
[ trivia warning ] I wrote: > We don't make any great effort to expose that though. It looks like > the closest value that timestamptzin makes different from zero is > regression=# select extract(epoch from '1999-12-31 19:00:00.00000000001-05' - '1999-12-31 19:00:00-05'::timestamptz) ; > date_part > ---------------------- > 1.45519152283669e-11 > (1 row) Actually, it looks like the precision is being limited by the rotation from EST zone. In GMT zone I can do this: regression=# select extract(epoch from '2000-01-01 00:00:00.0000000000000000000000000000000000000000000000000000000000000000001'::timestamptz- '2000-01-01 00:00:00'); date_part ----------- 1e-67 (1 row) and it could go a lot smaller except there's an arbitrary limit on the length of input string that timestamptzin will take. If float timestamps weren't deprecated it might be worth trying to make this behave less surprisingly. regards, tom lane
Josh Kupershmidt <schmiddy@gmail.com> writes: > EXTRACT(epoch ...) was what I was looking for: > SELECT EXTRACT(epoch FROM ts - '1999-12-31 19:00:00-05'::timestamptz) > FROM timestamps_test LIMIT 5; > date_part > ----------------------- > 1.4120666068199e-309 > 1.4154982781624e-309 > 1.41550281692099e-309 > 1.41591466059161e-309 > 1.41591524669472e-309 > (5 rows) Wow. You must have gotten those with the help of some arithmetic, because timestamptzin would never have produced them. I found out I can do regression=# select extract(epoch from ('2000-01-01 00:00:00'::timestamptz + '0.00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001'::interval) -'2000-01-01 00:00:00'); date_part ----------- 1e-209 (1 row) but I wonder what it was you actually did. regards, tom lane
On Fri, Sep 3, 2010 at 3:40 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Wow. You must have gotten those with the help of some arithmetic, > because timestamptzin would never have produced them. I found out I can > do > > regression=# select extract(epoch from ('2000-01-01 00:00:00'::timestamptz + '0.00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001'::interval) -'2000-01-01 00:00:00'); > date_part > ----------- > 1e-209 > (1 row) > > but I wonder what it was you actually did. I wonder myself :-) I encountered these timestamps while going through some C code I inherited which uses libpq to load several tables (such as myschema.strange_table in the original example) using COPY FROM STDIN. I don't think any timestamp arithmetic was involved. The code was supposed to copy in legitimate timestamps, but instead loaded all these '1999-12-31 19:00:00-05' values, and I'm still trying to figure out how/why. Josh
Josh Kupershmidt <schmiddy@gmail.com> writes: > On Fri, Sep 3, 2010 at 3:40 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> but I wonder what it was you actually did. > I wonder myself :-) I encountered these timestamps while going through > some C code I inherited which uses libpq to load several tables (such > as myschema.strange_table in the original example) using COPY FROM > STDIN. I don't think any timestamp arithmetic was involved. The code > was supposed to copy in legitimate timestamps, but instead loaded all > these '1999-12-31 19:00:00-05' values, and I'm still trying to figure > out how/why. Interesting. I can't imagine how you could have produced these with plain COPY, since that would go through timestamptzin. Was it by any chance a binary COPY? If so I could believe that funny timestamps could get in. Maybe some confusion over endianness of the binary data, for instance. regards, tom lane
On Fri, Sep 3, 2010 at 4:01 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Interesting. I can't imagine how you could have produced these with > plain COPY, since that would go through timestamptzin. Was it by any > chance a binary COPY? If so I could believe that funny timestamps could > get in. Maybe some confusion over endianness of the binary data, for > instance. Exactly, the code is using COPY ... TO STDOUT WITH BINARY along with COPY ... FROM STDIN WITH BINARY. Josh
Josh Kupershmidt <schmiddy@gmail.com> writes: > On Fri, Sep 3, 2010 at 4:01 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Interesting. �I can't imagine how you could have produced these with >> plain COPY, since that would go through timestamptzin. �Was it by any >> chance a binary COPY? �If so I could believe that funny timestamps could >> get in. �Maybe some confusion over endianness of the binary data, for >> instance. > Exactly, the code is using COPY ... TO STDOUT WITH BINARY along with > COPY ... FROM STDIN WITH BINARY. OK; what you need to look at is how the client code is preparing the timestamp values. What they should be is floats representing seconds since 2000-01-01 00:00 GMT, sent in bigendian byte order. regards, tom lane