BUG #17564: Planner bug in combination of generate_series(), unnest() and ORDER BY - Mailing list pgsql-bugs
From | PG Bug reporting form |
---|---|
Subject | BUG #17564: Planner bug in combination of generate_series(), unnest() and ORDER BY |
Date | |
Msg-id | 17564-c7472c2f90ef2da3@postgresql.org Whole thread Raw |
Responses |
Re: BUG #17564: Planner bug in combination of generate_series(), unnest() and ORDER BY
|
List | pgsql-bugs |
The following bug has been logged on the website: Bug reference: 17564 Logged by: Martijn van Oosterhout Email address: kleptog@gmail.com PostgreSQL version: 14.4 Operating system: Debian Linux (Bullseye) Description: We ran into a strange planner issue on our production system on Friday. Basically, a query would fail in the planning stage, depending one of the parameters. This shouldn't happen (AIUI). The query is as follows (simplified considerably from the original): db=# explain SELECT generate_subscripts(ARRAY[]::integer[], 1) AS id, unnest(ARRAY[]::jsonb[]) ->> 'timestamp' AS timestamp FROM results JOIN groups ON groups.id = results.group_id WHERE results.search_id = 3336 order by timestamp; (Yes, I know the combination of unnest() and generate_subscripts() in this way is evil, but it does work.) The error is: ERROR: set-valued function called in context that cannot accept a set LINE 2: unnest(ARRAY[]::jsonb[]) ->> 'timestamp' AS timestam... However, if you disable nested loops, it works fine: db=# set enable_nestloop =false; SET db=# explain SELECT generate_subscripts(ARRAY[]::integer[], 1) AS id, unnest(ARRAY[]::jsonb[]) ->> 'timestamp' AS timestamp FROM results JOIN groups ON groups.id = results.group_id WHERE results.search_id = 3336 order by timestamp; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------- Sort (cost=54523.19..55478.19 rows=382000 width=36) Sort Key: (((unnest('{}'::jsonb[])) ->> 'timestamp'::text)) -> Result (cost=19.75..8658.15 rows=382000 width=36) -> ProjectSet (cost=19.75..1973.15 rows=382000 width=36) -> Hash Join (cost=19.75..59.33 rows=382 width=0) Hash Cond: (groups.id = results.group_id) -> Seq Scan on groups (cost=0.00..36.54 rows=1154 width=4) -> Hash (cost=14.97..14.97 rows=382 width=4) -> Index Only Scan using results_pkey on results (cost=0.29..14.97 rows=382 width=4) Index Cond: (search_id = 3336) (10 rows) If you remove the ORDER BY, it works. If you remove the generate_series(), it works. If you remove the JOIN, it works. If you remove the "->> 'timestamp'", it works. If you wrap the query in a subquery without the ORDER BY, and then the put the ORDER BY on that, it works. Like so: explain SELECT * FROM (SELECT generate_subscripts(ARRAY[]::integer[], 1) AS id, unnest(ARRAY[]::jsonb[]) ->> 'timestamp' AS timestamp FROM results JOIN groups ON groups.id = results.group_id WHERE results.search_id = 3336) x order by timestamp; This gives the same query plan as above after disabling the nested loops. What appears to be happening is that the planner attempts a transformation and places the unnest() in the ORDER BY statement replacing the reference to the "timestamp" field with the actual expression. There unnest() is clearly not allowed. Perhaps the fact that the unnest() is hidden beneath the operator(->>) prevents the planner from noticing the transformation is not permitted. This is a pain to reproduce. Just dumping are restoring the tables elsewhere did not work. We noticed that the 'results' table in production was quite bloated and when we replicated that in the test environment, it finally triggered. At least at 300% bloat it triggered, that probably triggers the planner to try some other plans. This is not a critical bug, since it is easily worked around and the combination of conditions seems quite unusual. Noticed in 13.4, reproduced in 13.7 and 14.4. Have a nice day, Martijn
pgsql-bugs by date: