On Tue, 2025-12-16 at 21:25 -0300, Jhonathan Cruz wrote:
> I would like to report two query execution behaviors that may indicate
> performance issues or regressions. I am not certain whether these are
> known or expected behaviors, so I would appreciate guidance.
>
> ----------------------------------------------------------------------
> 1) DISTINCT ON combined with ORDER BY
> ----------------------------------------------------------------------
>
> I observed cases where queries using DISTINCT ON together with ORDER BY
> produce execution plans with explicit sorting steps, even when the ordering
> requirements are clearly defined.
>
> Query:
>
> EXPLAIN (ANALYZE, BUFFERS)
> SELECT DISTINCT ON (grp)
> grp, id, created_at
> FROM t
> ORDER BY grp, created_at DESC;
>
> Example EXPLAIN ANALYZE output:
>
> Unique (cost=45231.12..47731.12 rows=100 width=24)
> (actual time=182.413..198.721 rows=100 loops=1)
> Buffers: shared hit=12845
> -> Sort (cost=45231.12..46481.12 rows=500000 width=24)
> (actual time=182.410..191.256 rows=500000 loops=1)
> Sort Key: grp, created_at DESC
> Sort Method: quicksort Memory: 51200kB
> Buffers: shared hit=12845
> -> Seq Scan on t (cost=0.00..11231.00 rows=500000 width=24)
> (actual time=0.012..38.117 rows=500000 loops=1)
> Buffers: shared hit=12845
> Planning Time: 0.213 ms
> Execution Time: 202.981 ms
>
> Question:
> - Is this planner behavior expected for DISTINCT ON queries, or could
> this case potentially avoid a full sort?
That is normal and as expected. If you need all result rows, you have
to sort all the rows to find the unique results.
If you put a LIMIT clause on the query, PostgreSQL could perform a
top-N sort.
DISTINCT is easy to write, but sometimes the most expensive part of a
query. I don't think that can be avoided.
> ----------------------------------------------------------------------
> 2) Possible JIT performance regression with aggregation and light filtering
> ----------------------------------------------------------------------
>
> On PostgreSQL 16.x, I observed that queries performing aggregation with
> light filtering may run slower with JIT enabled compared to JIT disabled.
>
> Query:
>
> SET jit = on;
>
> EXPLAIN (ANALYZE, BUFFERS)
> SELECT grp, count(*)
> FROM t
> WHERE id > 100
> GROUP BY grp;
>
> Example EXPLAIN ANALYZE output (JIT enabled):
>
> HashAggregate (cost=12431.22..12441.22 rows=100 width=12)
> (actual time=156.234..156.941 rows=100 loops=1)
> Group Key: grp
> Buffers: shared hit=12483
> -> Seq Scan on t (cost=0.00..11231.00 rows=499900 width=4)
> (actual time=0.014..52.731 rows=499900 loops=1)
> Filter: (id > 100)
> Rows Removed by Filter: 100
> Buffers: shared hit=12483
> JIT:
> Functions: 5
> Options: Inlining true, Optimization true, Expressions true, Deforming true
> Timing: Generation 2.312 ms, Inlining 18.942 ms,
> Optimization 94.117 ms, Emission 22.884 ms, Total 138.255 ms
> Planning Time: 0.184 ms
> Execution Time: 159.214 ms
Now this is more like a bug. JIT still has its problems. I think not the
least of the problems is that PostgreSQL cannot predict how long the
compilation will take.
At the risk of drawing fire, I have now come to recommend turning JIT off
unless you have an analytical workload.
Yours,
Laurenz Albe