David Rowley <dgrowleyml@gmail.com> wrote:
> On Fri, 19 Sept 2025 at 23:58, Antonin Houska <ah@cybertec.at> wrote:
> > Admittedly I haven't thought about clause like ORDER BY yet, but I wonder if
> > it'd really be useful. My understanding is that the purpose of clustering is
> > to make index scan more efficient: with a clustered table, the heap tuples
> > pertaining to given index tuple should be located on the same page, so the
> > heap access is not that random.
>
> I imagine that's true most of the time, but it could also be so that
> fewer pages are dirtied when an UPDATE updates a set of rows with the
> same or similar clustered column values.

Good point.

> > If IOT-AM table does not have anything like index, I imagine it has some kind
> > of ordering information in the system catalog. Without that the query planner
> > can hardly utilize the ordering. In such case REPACK should use the catalog
> > information on ordering rather than accept arbitrary ORDER BY clause.
>
> Well, it would be impossible to insert records without some metadata
> to indicate the IOT keys...
>
> You might assume that someone might change their mind one day about
> the chosen order and wish to change it. My point was about leaving the
> door open to support that by having some native syntax that could be
> used to trigger that change.

I doubted whether the current AM API is designed to do catalog changes, but
then recalled that CLUSTER does set pg_index.indisclustered, and that it does
so outside table_relation_copy_for_cluster(). So I can now imagine that REPACK
... ORDER BY can do something like that.
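
To illustrate the split I have in mind, here is a toy model in Python. It is
not PostgreSQL code: the classes and function names are made up for the
sketch, and only the indisclustered flag corresponds to a real catalog
column. The point is just that the AM-level copy step only reorders data,
while the caller is the one that records the chosen ordering.

```python
from dataclasses import dataclass, field


@dataclass
class IndexEntry:
    """Hypothetical stand-in for a pg_index row."""
    key: str
    indisclustered: bool = False


@dataclass
class Table:
    """Hypothetical stand-in for a relation plus its index catalog rows."""
    rows: list
    indexes: dict = field(default_factory=dict)


def copy_for_cluster(rows, key):
    # Stand-in for the AM callback: rewrites the data in key order
    # and does not touch any catalog state.
    return sorted(rows, key=lambda r: r[key])


def mark_clustered(table, index_name):
    # Catalog change done by the caller, outside the AM callback:
    # exactly one index ends up flagged as the clustering index.
    for name, idx in table.indexes.items():
        idx.indisclustered = (name == index_name)


def cluster(table, index_name):
    mark_clustered(table, index_name)
    table.rows = copy_for_cluster(table.rows, table.indexes[index_name].key)
```

A REPACK ... ORDER BY could, under the same assumption, update whatever
ordering metadata the AM keeps at the point where this sketch calls
mark_clustered(), and leave the copy callback unchanged.
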
--
Antonin Houska
Web: https://www.cybertec-postgresql.com