On Wed, 10 Sept 2025 at 09:32, Vivek Gadge <vvkgadge56@gmail.com> wrote:
> EXPLAIN ANALYZE VERBOSE
> SELECT m.txn_date, d.bank_ref
> FROM app.main m
> JOIN app.detail d ON m.txn_id = d.main_txn_id
> WHERE m.txn_id = 9999999999999999999;
>
>
> Both app.main and app.detail are range partitioned by month (e.g., main_202502, detail_202502, etc.) on a date
column,
>
> Attached below is the EXPLAIN ANALYZE VERBOSE output for the query. It shows that PostgreSQL is scanning all
partitionsin ascending order, even though the matching record is present in the main_202509 partition.
I think you might be under the false impression that the executor is
able to stop searching for the next row to join once it finds the
first matching row. If that were the case, then yes, it might be
faster to somehow scan the partition where the matching row exists
first as then you could forego scanning the remainder. PostgreSQL
does have some ability to do some of this with the "Unique Join"
functionality, but it does not apply to this case since there's no
proof that only a single row can match (this would require some sort
of Unique Index, which cannot exist in your case since we don't have
global indexes that exist over all partitions).
If you're able to patch PostgreSQL and recompile, try the attached
patch away from your production environment to check if there are any
gains to scanning the partitions in another order. I didn't bother to
change all the append paths that were generated, but I think I've
reversed the order of at least the one the planner is using in your
case.
It's probably true that there are some cases (when the row order does
not matter) where scanning partitions that are more present in shared
buffers first would be better as that means using buffers perhaps
before you evict them to make way for the buffers of some other
(normally unused) partition, but I don't believe buffer eviction comes
into effect for your case since you're basically just scanning an
index which doesn't contain any value for the search key. That
requires very few buffers.
A secondary thought here is that perhaps your partitioning strategy
needs revision. What's the reason you opted to partition by date? Does
that benefit some other queries better? Or is it just a case that it
works well for some data retention policy? IMO, partitioning by range
of the transaction ID would fit much better for this particular query.
David