Thread: BUG #18594: CASE WHEN ELSE failing to return the expected output when the same colum is used in WHEN and ELSE
BUG #18594: CASE WHEN ELSE failing to return the expected output when the same colum is used in WHEN and ELSE
From
PG Bug reporting form
Date:
The following bug has been logged on the website: Bug reference: 18594 Logged by: Francisco Javier Ossandon Email address: fco.j.ossandon@gmail.com PostgreSQL version: 15.5 Operating system: Linux Description: Dear developers: I have been using Postgres for some years now, and I just found what looks like a bug, or at least I did not see anything in the documentation that could explain it. * version(): PostgreSQL 15.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-12), 64-bit The main issue is that "CASE WHEN THEN ELSE" is giving an expected result in what seems like simple logic. I have a series of WHENs with conditions over the value of a column and an ELSE where I return the same column value if the CASEs above don't apply, and the query always returns the value of the ELSE condition like if the WHENs above would not exist. The following query exemplifies this, where the first CASE column gives an unexpected result when using the same column in the ELSE, while the other CASE columns provide the expected result when removing the ELSE or using a different column in the ELSE or a constant string: ``` SELECT oid, typname, typcategory, CASE WHEN typcategory = 'N' THEN 'Numeric types' WHEN typcategory = 'S' THEN 'String types' WHEN typcategory = 'E' THEN 'Enum types' ELSE typcategory END AS test_case_fails, CASE WHEN typcategory = 'N' THEN 'Numeric types' WHEN typcategory = 'S' THEN 'String types' WHEN typcategory = 'E' THEN 'Enum types' END AS test_case_works_no_else, CASE WHEN typcategory = 'N' THEN 'Numeric types' WHEN typcategory = 'S' THEN 'String types' WHEN typcategory = 'E' THEN 'Enum types' ELSE typname END AS test_case_works_other_col, CASE WHEN typcategory = 'N' THEN 'Numeric types' WHEN typcategory = 'S' THEN 'String types' WHEN typcategory = 'E' THEN 'Enum types' ELSE 'ELSE' END AS test_case_works_constant FROM pg_type ORDER BY oid LIMIT 10 ; ``` This returns the following: ``` oid|typname |typcategory|test_case_fails|test_case_works_no_else|test_case_works_other_col|test_case_works_constant| ---+----------+-----------+---------------+-----------------------+-------------------------+------------------------+ 16|bool |B |B | |bool |ELSE | 17|bytea |U |U | |bytea |ELSE | 18|char |Z |Z | |char |ELSE | 19|name |S |S |String types |String types |String types | 20|int8 |N |N |Numeric types |Numeric types |Numeric types | 21|int2 |N |N |Numeric types |Numeric types |Numeric types | 22|int2vector|A |A | |int2vector |ELSE | 23|int4 |N |N |Numeric types |Numeric types |Numeric types | 24|regproc |N |N |Numeric types |Numeric types |Numeric types | 25|text |S |S |String types |String types |String types | ``` It looks weird, so I'm reporting this to you for review. Best regards,
Re: BUG #18594: CASE WHEN ELSE failing to return the expected output when the same colum is used in WHEN and ELSE
From
Andrew Dunstan
Date:
On 2024-08-28 We 12:17 PM, PG Bug reporting form wrote: > The following bug has been logged on the website: > > Bug reference: 18594 > Logged by: Francisco Javier Ossandon > Email address: fco.j.ossandon@gmail.com > PostgreSQL version: 15.5 > Operating system: Linux > Description: > > Dear developers: > I have been using Postgres for some years now, and I just found what looks > like a bug, or at least I did not see anything in the documentation that > could explain it. > > * version(): > PostgreSQL 15.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 7.3.1 20180712 > (Red Hat 7.3.1-12), 64-bit > > The main issue is that "CASE WHEN THEN ELSE" is giving an expected result in > what seems like simple logic. I have a series of WHENs with conditions over > the value of a column and an ELSE where I return the same column value if > the CASEs above don't apply, and the query always returns the value of the > ELSE condition like if the WHENs above would not exist. The following query > exemplifies this, where the first CASE column gives an unexpected result > when using the same column in the ELSE, while the other CASE columns provide > the expected result when removing the ELSE or using a different column in > the ELSE or a constant string: > ``` > SELECT oid, > typname, > typcategory, > CASE > WHEN typcategory = 'N' THEN 'Numeric types' > WHEN typcategory = 'S' THEN 'String types' > WHEN typcategory = 'E' THEN 'Enum types' > ELSE typcategory > END AS test_case_fails, > CASE > WHEN typcategory = 'N' THEN 'Numeric types' > WHEN typcategory = 'S' THEN 'String types' > WHEN typcategory = 'E' THEN 'Enum types' > END AS test_case_works_no_else, > CASE > WHEN typcategory = 'N' THEN 'Numeric types' > WHEN typcategory = 'S' THEN 'String types' > WHEN typcategory = 'E' THEN 'Enum types' > ELSE typname > END AS test_case_works_other_col, > CASE > WHEN typcategory = 'N' THEN 'Numeric types' > WHEN typcategory = 'S' THEN 'String types' > WHEN typcategory = 'E' THEN 'Enum types' > ELSE 'ELSE' > END AS test_case_works_constant > FROM pg_type > ORDER BY oid > LIMIT 10 > ; > ``` > This returns the following: > ``` > oid|typname > |typcategory|test_case_fails|test_case_works_no_else|test_case_works_other_col|test_case_works_constant| > ---+----------+-----------+---------------+-----------------------+-------------------------+------------------------+ > 16|bool |B |B | |bool > |ELSE | > 17|bytea |U |U | |bytea > |ELSE | > 18|char |Z |Z | |char > |ELSE | > 19|name |S |S |String types |String > types |String types | > 20|int8 |N |N |Numeric types |Numeric > types |Numeric types | > 21|int2 |N |N |Numeric types |Numeric > types |Numeric types | > 22|int2vector|A |A | > |int2vector |ELSE | > 23|int4 |N |N |Numeric types |Numeric > types |Numeric types | > 24|regproc |N |N |Numeric types |Numeric > types |Numeric types | > 25|text |S |S |String types |String > types |String types | > ``` > It looks weird, so I'm reporting this to you for review. > The bug is in the query, not in Postgres. Try casting typcategory to text in your failing case to see the difference. As it is the literals are being cast to "char" because that's what typcategory is. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Re: BUG #18594: CASE WHEN ELSE failing to return the expected output when the same colum is used in WHEN and ELSE
From
Francisco J. Ossandón
Date:
Hello Andrew,
I thought about that before writing and tried casting to TEXT in the WHEN comparison bits but it still failed.
CASE
WHEN typcategory::TEXT = 'N' THEN 'Numeric types'
WHEN typcategory::TEXT = 'S' THEN 'String types'
WHEN typcategory::TEXT = 'E' THEN 'Enum types'
ELSE typcategory
END AS test_case_fails,
After your answer, I tried again but this time adding the casting to the ELSE too, and then it worked:
CASE
WHEN typcategory::TEXT = 'N' THEN 'Numeric types'
WHEN typcategory::TEXT = 'S' THEN 'String types'
WHEN typcategory::TEXT = 'E' THEN 'Enum types'
ELSE typcategory::TEXT
So is the ELSE column hijacking the data type of the whole expression?
I thought about that before writing and tried casting to TEXT in the WHEN comparison bits but it still failed.
CASE
WHEN typcategory::TEXT = 'N' THEN 'Numeric types'
WHEN typcategory::TEXT = 'S' THEN 'String types'
WHEN typcategory::TEXT = 'E' THEN 'Enum types'
ELSE typcategory
END AS test_case_fails,
After your answer, I tried again but this time adding the casting to the ELSE too, and then it worked:
CASE
WHEN typcategory::TEXT = 'N' THEN 'Numeric types'
WHEN typcategory::TEXT = 'S' THEN 'String types'
WHEN typcategory::TEXT = 'E' THEN 'Enum types'
ELSE typcategory::TEXT
So is the ELSE column hijacking the data type of the whole expression?
Cheers,
Francisco
El mié, 28 ago 2024 a las 13:57, Andrew Dunstan (<andrew@dunslane.net>) escribió:
On 2024-08-28 We 12:17 PM, PG Bug reporting form wrote:
> The following bug has been logged on the website:
>
> Bug reference: 18594
> Logged by: Francisco Javier Ossandon
> Email address: fco.j.ossandon@gmail.com
> PostgreSQL version: 15.5
> Operating system: Linux
> Description:
>
> Dear developers:
> I have been using Postgres for some years now, and I just found what looks
> like a bug, or at least I did not see anything in the documentation that
> could explain it.
>
> * version():
> PostgreSQL 15.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 7.3.1 20180712
> (Red Hat 7.3.1-12), 64-bit
>
> The main issue is that "CASE WHEN THEN ELSE" is giving an expected result in
> what seems like simple logic. I have a series of WHENs with conditions over
> the value of a column and an ELSE where I return the same column value if
> the CASEs above don't apply, and the query always returns the value of the
> ELSE condition like if the WHENs above would not exist. The following query
> exemplifies this, where the first CASE column gives an unexpected result
> when using the same column in the ELSE, while the other CASE columns provide
> the expected result when removing the ELSE or using a different column in
> the ELSE or a constant string:
> ```
> SELECT oid,
> typname,
> typcategory,
> CASE
> WHEN typcategory = 'N' THEN 'Numeric types'
> WHEN typcategory = 'S' THEN 'String types'
> WHEN typcategory = 'E' THEN 'Enum types'
> ELSE typcategory
> END AS test_case_fails,
> CASE
> WHEN typcategory = 'N' THEN 'Numeric types'
> WHEN typcategory = 'S' THEN 'String types'
> WHEN typcategory = 'E' THEN 'Enum types'
> END AS test_case_works_no_else,
> CASE
> WHEN typcategory = 'N' THEN 'Numeric types'
> WHEN typcategory = 'S' THEN 'String types'
> WHEN typcategory = 'E' THEN 'Enum types'
> ELSE typname
> END AS test_case_works_other_col,
> CASE
> WHEN typcategory = 'N' THEN 'Numeric types'
> WHEN typcategory = 'S' THEN 'String types'
> WHEN typcategory = 'E' THEN 'Enum types'
> ELSE 'ELSE'
> END AS test_case_works_constant
> FROM pg_type
> ORDER BY oid
> LIMIT 10
> ;
> ```
> This returns the following:
> ```
> oid|typname
> |typcategory|test_case_fails|test_case_works_no_else|test_case_works_other_col|test_case_works_constant|
> ---+----------+-----------+---------------+-----------------------+-------------------------+------------------------+
> 16|bool |B |B | |bool
> |ELSE |
> 17|bytea |U |U | |bytea
> |ELSE |
> 18|char |Z |Z | |char
> |ELSE |
> 19|name |S |S |String types |String
> types |String types |
> 20|int8 |N |N |Numeric types |Numeric
> types |Numeric types |
> 21|int2 |N |N |Numeric types |Numeric
> types |Numeric types |
> 22|int2vector|A |A |
> |int2vector |ELSE |
> 23|int4 |N |N |Numeric types |Numeric
> types |Numeric types |
> 24|regproc |N |N |Numeric types |Numeric
> types |Numeric types |
> 25|text |S |S |String types |String
> types |String types |
> ```
> It looks weird, so I'm reporting this to you for review.
>
The bug is in the query, not in Postgres. Try casting typcategory to
text in your failing case to see the difference. As it is the literals
are being cast to "char" because that's what typcategory is.
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
Francisco J. Ossandon
Bioinformatician
Ph.D. in Biotechnology
Bioinformatician
Ph.D. in Biotechnology
Re: BUG #18594: CASE WHEN ELSE failing to return the expected output when the same colum is used in WHEN and ELSE
From
Tom Lane
Date:
=?UTF-8?Q?Francisco_J=2E_Ossand=C3=B3n?= <fco.j.ossandon@gmail.com> writes: > So is the ELSE column hijacking the data type of the whole expression? It's the only CASE result that is supplying a definite type at all. But see https://www.postgresql.org/docs/current/typeconv-union-case.html particularly the footnote to the bit about "Select the first non-unknown input type as the candidate type, then consider each other non-unknown input type, left to right." The WHEN clauses have exactly nothing to do with the result type of the CASE: it's the THEN and ELSE clauses that supply the result. regards, tom lane
Re: BUG #18594: CASE WHEN ELSE failing to return the expected output when the same colum is used in WHEN and ELSE
From
Francisco J. Ossandón
Date:
Hello Tom,
Thanks for the explanation and the link to the documentation. I understand now what happened.
So it was a mistake on my side.
Thanks again for the patience and replies.
Best regards,
Francisco
El mié, 28 ago 2024 a las 20:11, Tom Lane (<tgl@sss.pgh.pa.us>) escribió:
Francisco J. Ossandón <fco.j.ossandon@gmail.com> writes:
> So is the ELSE column hijacking the data type of the whole expression?
It's the only CASE result that is supplying a definite type at all.
But see
https://www.postgresql.org/docs/current/typeconv-union-case.html
particularly the footnote to the bit about "Select the first
non-unknown input type as the candidate type, then consider
each other non-unknown input type, left to right."
The WHEN clauses have exactly nothing to do with the result type
of the CASE: it's the THEN and ELSE clauses that supply the result.
regards, tom lane
Francisco J. Ossandon
Bioinformatician
Ph.D. in Biotechnology
Bioinformatician
Ph.D. in Biotechnology
Re: BUG #18594: CASE WHEN ELSE failing to return the expected output when the same colum is used in WHEN and ELSE
From
"David G. Johnston"
Date:
On Wed, Jan 22, 2025 at 3:00 AM Chris BSomething <xpusostomos@gmail.com> wrote:
In the 6 rules for type resolution, rule 5 says "select the first non-unknown input type as candidate, and consider if other non-unknown types can be implicitly converted." ... should there not be a new rule, something like considering if the unknown types can be converted, then notice that a long string can't be converted to a char, because it won't fit? And then fail with an interesting error?
Do you have an example demonstrating this new rule? I'm doubting this resolution logic goes into that much depth regarding the semantics of each type. It is implied that even if an implicit cast exists from type A to type B that actual values of type A may fail during the casting process to make them type B. That would be the interesting error you'd end up seeing - but from the type casting/input system, not the type resolution one.
David J.
Re: BUG #18594: CASE WHEN ELSE failing to return the expected output when the same colum is used in WHEN and ELSE
From
Chris BSomething
Date:
On Tue, Feb 18, 2025, 6:31 AM David G. Johnston <david.g.johnston@gmail.com> wrote:
On Wed, Jan 22, 2025 at 3:00 AM Chris BSomething <xpusostomos@gmail.com> wrote:In the 6 rules for type resolution, rule 5 says "select the first non-unknown input type as candidate, and consider if other non-unknown types can be implicitly converted." ... should there not be a new rule, something like considering if the unknown types can be converted, then notice that a long string can't be converted to a char, because it won't fit? And then fail with an interesting error?Do you have an example demonstrating this new rule? I'm doubting this resolution logic goes into that much depth regarding the semantics of each type. It is implied that even if an implicit cast exists from type A to type B that actual values of type A may fail during the casting process to make them type B. That would be the interesting error you'd end up seeing - but from the type casting/input system, not the type resolution one.David J.
That sounds like a lot of words to justify an error that could be caught at parse time, that is actually never caught at all. It seems to me that whatever type is inferred from a long constant string, it won't fit into a char... At least not without casting it. And therefore an error should come out.
Re: BUG #18594: CASE WHEN ELSE failing to return the expected output when the same colum is used in WHEN and ELSE
From
"David G. Johnston"
Date:
On Mon, Feb 17, 2025 at 3:49 PM Chris BSomething <xpusostomos@gmail.com> wrote:
That sounds like a lot of words to justify an error that could be caught at parse time, that is actually never caught at all. It seems to me that whatever type is inferred from a long constant string, it won't fit into a char... At least not without casting it. And therefore an error should come out.
I thought you were trying to convince people to modify the documentation? This reads more like wanting to hack the parser to teach it forbidden knowledge.
In any case I think this email report is closed and burying any discussion about this material here just makes it harder for people to find.
David J.
Re: BUG #18594: CASE WHEN ELSE failing to return the expected output when the same colum is used in WHEN and ELSE
From
Chris BSomething
Date:
On Tue, Feb 18, 2025, 7:13 AM David G. Johnston <david.g.johnston@gmail.com> wrote:
On Mon, Feb 17, 2025 at 3:49 PM Chris BSomething <xpusostomos@gmail.com> wrote:That sounds like a lot of words to justify an error that could be caught at parse time, that is actually never caught at all. It seems to me that whatever type is inferred from a long constant string, it won't fit into a char... At least not without casting it. And therefore an error should come out.I thought you were trying to convince people to modify the documentation? This reads more like wanting to hack the parser to teach it forbidden knowledge.
Hack? Forbidden knowledge? What an odd comment.
Re: BUG #18594: CASE WHEN ELSE failing to return the expected output when the same colum is used in WHEN and ELSE
From
Tom Lane
Date:
"David G. Johnston" <david.g.johnston@gmail.com> writes: > On Wed, Jan 22, 2025 at 3:00 AM Chris BSomething <xpusostomos@gmail.com> > wrote: >> ... should there not be a new rule, >> something like considering if the unknown types can be converted, then >> notice that a long string can't be converted to a char, because it won't >> fit? And then fail with an interesting error? > Do you have an example demonstrating this new rule? I'm doubting this > resolution logic goes into that much depth regarding the semantics of each > type. It is implied that even if an implicit cast exists from type A to > type B that actual values of type A may fail during the casting process to > make them type B. That would be the interesting error you'd end up seeing > - but from the type casting/input system, not the type resolution one. Yeah. We document somewhere that the contents of a literal string are *not* considered while deciding what type it should be taken as; thus, the possibility of a casting failure is not part of the rules. I think such a rule would be pretty dangerous, even if we could implement it easily. As an example, if you entered something you meant as a timestamp, but you fat-fingered the punctuation or something, it'd likely fall back to being considered just "text", potentially silently changing the semantics of the expression. We had a lot of trouble with that sort of behavior back when the system had a lot of implicit casts to text. We got rid of (most of) those in 8.3 or thereabouts, and it made the semantics generally a lot safer. regards, tom lane
Re: BUG #18594: CASE WHEN ELSE failing to return the expected output when the same colum is used in WHEN and ELSE
From
Chris BSomething
Date:
On Tue, 18 Feb 2025 at 10:12, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Yeah. We document somewhere that the contents of a literal string are
*not* considered while deciding what type it should be taken as; thus,
the possibility of a casting failure is not part of the rules.
The length of a literal is not strictly speaking its "contents". One could for example, have its type as varchar(x) where x is its length. And then not have implicit typecasts from varchar(x) to varchar( < x).
I think such a rule would be pretty dangerous, even if we could
implement it easily. As an example, if you entered something you
meant as a timestamp, but you fat-fingered the punctuation or
something, it'd likely fall back to being considered just "text",
I don't get that, I thought timestamps have to be preceded by "TIMESTAMP".
potentially silently changing the semantics of the expression. We had
a lot of trouble with that sort of behavior back when the system had a
lot of implicit casts to text. We got rid of (most of) those in 8.3
or thereabouts, and it made the semantics generally a lot safer.
So... you solved that, it sounds like.
And yet this super dangerous behavior which could sit dormant and undiscovered in a view in a large computer system, doesn't bother anyone? OK, whatever.
Chris
Re: BUG #18594: CASE WHEN ELSE failing to return the expected output when the same colum is used in WHEN and ELSE
From
"David G. Johnston"
Date:
On Wednesday, February 19, 2025, Chris BSomething <xpusostomos@gmail.com> wrote:
And yet this super dangerous behavior which could sit dormant and undiscovered in a view in a large computer system, doesn't bother anyone? OK, whatever.
There a quite a few of those in SQL…testing queries is a requirement. And being explicit with types is strongly suggested.
To be clear, the complaint is that we allow input functions such as varchar(n) and char to truncate their inputs as opposed to erroring out if the input length is too long. Right? If not I’m still confused as to what exact behavior you believe is dangerous.
David J.
Re: BUG #18594: CASE WHEN ELSE failing to return the expected output when the same colum is used in WHEN and ELSE
From
Vik Fearing
Date:
On 19/02/2025 13:50, Chris BSomething wrote: > > On Tue, 18 Feb 2025 at 10:12, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > I think such a rule would be pretty dangerous, even if we could > implement it easily. As an example, if you entered something you > meant as a timestamp, but you fat-fingered the punctuation or > something, it'd likely fall back to being considered just "text", > > > I don't get that, I thought timestamps have to be preceded by "TIMESTAMP". The SQL standard requires you to do that, but PostgreSQL is a bit more lax about it (and in my humble opinion, thankfully so). If you did prefix your literals with their type, you wouldn't be having these problems in the first place. The problem stems from postgres treating your unmarked literal as "unknown" and then trying to figure out what the heck it is. -- Vik Fearing