Re: PoC/WIP: Extended statistics on expressions - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: PoC/WIP: Extended statistics on expressions |
Date | |
Msg-id | 096d26e1-f053-a039-047d-a050dbb80ac6@enterprisedb.com Whole thread Raw |
In response to | Re: PoC/WIP: Extended statistics on expressions (Dean Rasheed <dean.a.rasheed@gmail.com>) |
Responses |
Re: PoC/WIP: Extended statistics on expressions
|
List | pgsql-hackers |
On 3/26/21 12:37 PM, Dean Rasheed wrote: > On Thu, 25 Mar 2021 at 19:59, Tomas Vondra > <tomas.vondra@enterprisedb.com> wrote: >> >> Attached is an updated patch series, with all the changes discussed >> here. I've cleaned up the ndistinct stuff a bit more (essentially >> reverting back from GroupExprInfo to GroupVarInfo name), and got rid of >> the UpdateStatisticsForTypeChange. >> > > I've looked over all that and I think it's in pretty good shape. I > particularly like how much simpler the ndistinct code has now become. > > Some (hopefully final) review comments: > > 1). I don't think index.c is the right place for > StatisticsGetRelation(). I appreciate that it is very similar to the > adjacent IndexGetRelation() function, but index.c is really only for > index-related code, so I think StatisticsGetRelation() should go in > statscmds.c > Ah, right, I forgot about this. I wonder if we should add catalog/statistics.c, similar to catalog/index.c (instead of adding it locally to statscmds.c). > 2). Perhaps the error message at statscmds.c:293 should read > > "expression cannot be used in multivariate statistics because its > type %s has no default btree operator class" > > (i.e., add the word "multivariate", since the same expression *can* be > used in univariate statistics even though it has no less-than > operator). > > 3). The comment for ATExecAddStatistics() should probably mention that > "ALTER TABLE ADD STATISTICS" isn't a command in the grammar, in a > similar way to other similar functions, e.g.: > > /* > * ALTER TABLE ADD STATISTICS > * > * This is no such command in the grammar, but we use this internally to add > * AT_ReAddStatistics subcommands to rebuild extended statistics after a table > * column type change. > */ > > 4). The comment at the start of ATPostAlterTypeParse() needs updating > to mention CREATE STATISTICS statements. > > 5). I think ATPostAlterTypeParse() should also attempt to preserve any > COMMENTs attached to statistics objects, i.e., something like: > > --- src/backend/commands/tablecmds.c.orig 2021-03-26 10:39:38.328631864 +0000 > +++ src/backend/commands/tablecmds.c 2021-03-26 10:47:21.042279580 +0000 > @@ -12619,6 +12619,9 @@ > CreateStatsStmt *stmt = (CreateStatsStmt *) stm; > AlterTableCmd *newcmd; > > + /* keep the statistics object's comment */ > + stmt->stxcomment = GetComment(oldId, StatisticExtRelationId, 0); > + > newcmd = makeNode(AlterTableCmd); > newcmd->subtype = AT_ReAddStatistics; > newcmd->def = (Node *) stmt; > > 6). Comment typo at extended_stats.c:2532 - s/statitics/statistics/ > > 7). I don't think that the big XXX comment near the start of > estimate_multivariate_ndistinct() is really relevant anymore, now that > the code has been simplified and we no longer extract Vars from > expressions, so perhaps it can just be deleted. > Thanks! I'll fix these, and then will consider getting it committed sometime later today, once the buildfarm does some testing on the other stuff I already committed. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: