Re: TimestampTz->Text->TimestampTz casting fails with DateStyle 'Postgres' - Mailing list pgsql-bugs

From Tom Lane
Subject Re: TimestampTz->Text->TimestampTz casting fails with DateStyle 'Postgres'
Date
Msg-id 394372.1733884998@sss.pgh.pa.us
Whole thread Raw
In response to Re: TimestampTz->Text->TimestampTz casting fails with DateStyle 'Postgres'  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: TimestampTz->Text->TimestampTz casting fails with DateStyle 'Postgres'
List pgsql-bugs
I wrote:
> I agree.  We should be mapping it to whatever GMT offset "LMT" means
> in the selected zone (assuming there is an entry, which it seems like
> there is in most of them).  I've not gotten around to looking at what
> that would take, but it's probably not entirely trivial, since it
> would mean different GMT offsets in different zones.

Here's a draft patch for that.  It turns out to be simpler than I
thought, because we can mostly piggy-back on the existing DYNTZ
support.  The difference between LMT and one of our existing dynamic
abbreviations is that LMT means whatever it means in the prevailing
session timezone, while a dynamic abbreviation specifies which TZDB
timezone it refers to.

> One thing that might be interesting to look at is whether the same
> mechanism could be used for other TZ abbreviations defined by the tzdb
> data, instead of relying on our hard-wired lists.

As I set it up here, we first check the timezone abbreviation list,
then look into the session timezone to see if it has an entry.
I don't expect that this lookup will succeed for anything except LMT,
because every other abbreviation that TZDB knows about is already
listed in our standard abbreviation list.  But in the future we
could imagine removing entries from the abbreviation list so that
this code path takes more of the burden.  That'd be particularly
attractive for abbreviations that have conflicts, because then
our interpretation of the abbreviation would automatically adapt
based on the timezone setting.  That's something to pursue another
day though.  We might need to work a bit harder on optimizing
this code path before we let it take anything more common than
LMT, too.

This is not committable because I didn't think about documentation
yet.  We probably want to explain this in Appendix B, and also
there are assorted comments about what a "dynamic abbreviation"
is that will need some adjustment.

            regards, tom lane

diff --git a/src/backend/utils/adt/datetime.c b/src/backend/utils/adt/datetime.c
index 0b19cddf54..6a10488d9a 100644
--- a/src/backend/utils/adt/datetime.c
+++ b/src/backend/utils/adt/datetime.c
@@ -1812,9 +1812,35 @@ DetermineTimeZoneAbbrevOffsetTS(TimestampTz ts, const char *abbr,
 }


+/* TimeZoneAbbrevIsKnown()
+ *
+ * Detect whether the given string is a time zone abbreviation that's known
+ * in the specified TZDB timezone.  The match is not case-sensitive.
+ */
+static bool
+TimeZoneAbbrevIsKnown(const char *abbr, pg_tz *tzp)
+{
+    int            offset,
+                isdst;
+
+    /*
+     * DetermineTimeZoneAbbrevOffsetInternal does more than is really needed
+     * here, but it's not clear that it's worth optimizing further.
+     *
+     * We can use the epoch as the probe time, since we don't care here about
+     * exactly what the abbreviation resolves as.
+     */
+    return DetermineTimeZoneAbbrevOffsetInternal((pg_time_t) 0,
+                                                 abbr,
+                                                 tzp,
+                                                 &offset,
+                                                 &isdst);
+}
+
+
 /* DetermineTimeZoneAbbrevOffsetInternal()
  *
- * Workhorse for above two functions: work from a pg_time_t probe instant.
+ * Workhorse for above three functions: work from a pg_time_t probe instant.
  * On success, return GMT offset and DST status into *offset and *isdst.
  */
 static bool
@@ -3106,9 +3132,25 @@ DecodeTimezoneAbbrev(int field, const char *lowtoken,
     }
     if (tp == NULL)
     {
-        *ftype = UNKNOWN_FIELD;
-        *offset = 0;
-        *tz = NULL;
+        /*
+         * See if the current session_timezone recognizes it.  This is mainly
+         * meant to handle "LMT", whose meaning varies across timezones, so we
+         * don't try to cache the result.
+         */
+        if (session_timezone &&
+            TimeZoneAbbrevIsKnown(lowtoken, session_timezone))
+        {
+            /* we can treat it like a dynamic-offset abbreviation */
+            *ftype = DYNTZ;
+            *offset = 0;
+            *tz = session_timezone;
+        }
+        else
+        {
+            *ftype = UNKNOWN_FIELD;
+            *offset = 0;
+            *tz = NULL;
+        }
     }
     else
     {
@@ -3278,9 +3320,6 @@ DecodeTimezoneAbbrevPrefix(const char *str, int *offset, pg_tz **tz)
     *offset = 0;                /* avoid uninitialized vars on failure */
     *tz = NULL;

-    if (!zoneabbrevtbl)
-        return -1;                /* no abbrevs known, so fail immediately */
-
     /* Downcase as much of the string as we could need */
     for (len = 0; len < TOKMAXLEN; len++)
     {
@@ -3299,9 +3338,12 @@ DecodeTimezoneAbbrevPrefix(const char *str, int *offset, pg_tz **tz)
      */
     while (len > 0)
     {
-        const datetkn *tp = datebsearch(lowtoken, zoneabbrevtbl->abbrevs,
-                                        zoneabbrevtbl->numabbrevs);
+        const datetkn *tp = NULL;

+        /* Known in zoneabbrevtbl? */
+        if (zoneabbrevtbl)
+            tp = datebsearch(lowtoken, zoneabbrevtbl->abbrevs,
+                             zoneabbrevtbl->numabbrevs);
         if (tp != NULL)
         {
             if (tp->type == DYNTZ)
@@ -3324,6 +3366,17 @@ DecodeTimezoneAbbrevPrefix(const char *str, int *offset, pg_tz **tz)
                 return len;
             }
         }
+
+        /* See if the current session_timezone recognizes it. */
+        if (session_timezone &&
+            TimeZoneAbbrevIsKnown(lowtoken, session_timezone))
+        {
+            /* we can treat it like a dynamic-offset abbreviation */
+            *tz = session_timezone;
+            return len;
+        }
+
+        /* Nope, try the next shorter string. */
         lowtoken[--len] = '\0';
     }

diff --git a/src/test/regress/expected/horology.out b/src/test/regress/expected/horology.out
index cb28dfbaee..b90bfcd794 100644
--- a/src/test/regress/expected/horology.out
+++ b/src/test/regress/expected/horology.out
@@ -3332,6 +3332,12 @@ SELECT to_timestamp('2011-12-18 11:38 MSK', 'YYYY-MM-DD HH12:MI TZ');  -- dyntz
  Sat Dec 17 23:38:00 2011 PST
 (1 row)

+SELECT to_timestamp('2011-12-18 00:00 LMT', 'YYYY-MM-DD HH24:MI TZ');  -- dyntz
+         to_timestamp
+------------------------------
+ Sat Dec 17 23:52:58 2011 PST
+(1 row)
+
 SELECT to_timestamp('2011-12-18 11:38ESTFOO24', 'YYYY-MM-DD HH12:MITZFOOSS');
          to_timestamp
 ------------------------------
diff --git a/src/test/regress/expected/timestamptz.out b/src/test/regress/expected/timestamptz.out
index b437613ac8..b5f0e0b8db 100644
--- a/src/test/regress/expected/timestamptz.out
+++ b/src/test/regress/expected/timestamptz.out
@@ -176,6 +176,45 @@ SELECT '205000-01-10 17:32:01 Europe/Helsinki'::timestamptz; -- non-DST
  Fri Jan 10 07:32:01 205000 PST
 (1 row)

+-- Recognize "LMT" as whatever it means in the current zone
+SELECT 'Jan 01 00:00:00 1000'::timestamptz;
+         timestamptz
+------------------------------
+ Wed Jan 01 00:00:00 1000 LMT
+(1 row)
+
+SELECT 'Jan 01 00:00:00 1000 LMT'::timestamptz;
+         timestamptz
+------------------------------
+ Wed Jan 01 00:00:00 1000 LMT
+(1 row)
+
+SELECT 'Jan 01 00:00:00 2024 LMT'::timestamptz;
+         timestamptz
+------------------------------
+ Sun Dec 31 23:52:58 2023 PST
+(1 row)
+
+SET timezone = 'Europe/London';
+SELECT 'Jan 01 00:00:00 1000 LMT'::timestamptz;
+         timestamptz
+------------------------------
+ Wed Jan 01 00:00:00 1000 LMT
+(1 row)
+
+SELECT 'Jan 01 00:00:00 2024 LMT'::timestamptz;
+         timestamptz
+------------------------------
+ Mon Jan 01 00:01:15 2024 GMT
+(1 row)
+
+-- which might be nothing
+SET timezone = 'UTC';
+SELECT 'Jan 01 00:00:00 2024 LMT'::timestamptz;  -- fail
+ERROR:  invalid input syntax for type timestamp with time zone: "Jan 01 00:00:00 2024 LMT"
+LINE 1: SELECT 'Jan 01 00:00:00 2024 LMT'::timestamptz;
+               ^
+RESET timezone;
 -- Test non-error-throwing API
 SELECT pg_input_is_valid('now', 'timestamptz');
  pg_input_is_valid
diff --git a/src/test/regress/sql/horology.sql b/src/test/regress/sql/horology.sql
index 4aa88b4ba9..1310b43277 100644
--- a/src/test/regress/sql/horology.sql
+++ b/src/test/regress/sql/horology.sql
@@ -538,6 +538,7 @@ SELECT to_timestamp('2011-12-18 11:38 EST', 'YYYY-MM-DD HH12:MI TZ');
 SELECT to_timestamp('2011-12-18 11:38 -05', 'YYYY-MM-DD HH12:MI TZ');
 SELECT to_timestamp('2011-12-18 11:38 +01:30', 'YYYY-MM-DD HH12:MI TZ');
 SELECT to_timestamp('2011-12-18 11:38 MSK', 'YYYY-MM-DD HH12:MI TZ');  -- dyntz
+SELECT to_timestamp('2011-12-18 00:00 LMT', 'YYYY-MM-DD HH24:MI TZ');  -- dyntz
 SELECT to_timestamp('2011-12-18 11:38ESTFOO24', 'YYYY-MM-DD HH12:MITZFOOSS');
 SELECT to_timestamp('2011-12-18 11:38-05FOO24', 'YYYY-MM-DD HH12:MITZFOOSS');
 SELECT to_timestamp('2011-12-18 11:38 JUNK', 'YYYY-MM-DD HH12:MI TZ');  -- error
diff --git a/src/test/regress/sql/timestamptz.sql b/src/test/regress/sql/timestamptz.sql
index 6b91e7eddc..ea0ffca310 100644
--- a/src/test/regress/sql/timestamptz.sql
+++ b/src/test/regress/sql/timestamptz.sql
@@ -109,6 +109,18 @@ SELECT '20500110 173201 Europe/Helsinki'::timestamptz; -- non-DST
 SELECT '205000-07-10 17:32:01 Europe/Helsinki'::timestamptz; -- DST
 SELECT '205000-01-10 17:32:01 Europe/Helsinki'::timestamptz; -- non-DST

+-- Recognize "LMT" as whatever it means in the current zone
+SELECT 'Jan 01 00:00:00 1000'::timestamptz;
+SELECT 'Jan 01 00:00:00 1000 LMT'::timestamptz;
+SELECT 'Jan 01 00:00:00 2024 LMT'::timestamptz;
+SET timezone = 'Europe/London';
+SELECT 'Jan 01 00:00:00 1000 LMT'::timestamptz;
+SELECT 'Jan 01 00:00:00 2024 LMT'::timestamptz;
+-- which might be nothing
+SET timezone = 'UTC';
+SELECT 'Jan 01 00:00:00 2024 LMT'::timestamptz;  -- fail
+RESET timezone;
+
 -- Test non-error-throwing API
 SELECT pg_input_is_valid('now', 'timestamptz');
 SELECT pg_input_is_valid('garbage', 'timestamptz');

pgsql-bugs by date:

Previous
From: Tom Lane
Date:
Subject: Re: pg_dump crash on identity sequence with not loaded attributes
Next
From: Artur Zakirov
Date:
Subject: Re: pg_dump crash on identity sequence with not loaded attributes