Re: Multivariate MCV stats can leak data to unprivileged users - Mailing list pgsql-hackers
From | Dean Rasheed |
---|---|
Subject | Re: Multivariate MCV stats can leak data to unprivileged users |
Date | |
Msg-id | CAEZATCUcoAKrn-1NFkDN7md-wt6CQev0JXa+Tv31T9Jjkxgx5Q@mail.gmail.com Whole thread Raw |
In response to | Re: Multivariate MCV stats can leak data to unprivileged users (Stephen Frost <sfrost@snowman.net>) |
Responses |
Re: Multivariate MCV stats can leak data to unprivileged users
|
List | pgsql-hackers |
On Sun, 19 May 2019 at 00:48, Stephen Frost <sfrost@snowman.net> wrote: > > * Tom Lane (tgl@sss.pgh.pa.us) wrote: > > Tomas Vondra <tomas.vondra@2ndquadrant.com> writes: > > > > > I think we have four options - rework it before beta1, rework it after > > > beta1, rework it in PG13 and leave it as it is now. > > > > Yup, that's about what the options are. I'm just voting against > > "change it in v13". If we're going to change it, then the fewer > > major versions that have the bogus definition the better --- and > > since we're changing that catalog in v12 anyway, users will see > > fewer distinct behaviors if we do this change too. > > > > It's very possibly too late to get it done before beta1, > > unfortunately. But as Andres noted, post-beta1 catversion > > bumps are hardly unusual, so I do not think "rework after > > beta1" is unacceptable. > > Agreed. > Yes, that makes sense. I think we shouldn't risk trying to get this into beta1, but let's try to get it done as soon as possible after that. Actually, it doesn't appear to be as big a change as I had feared. As a starter for ten, here's a patch doing the basic split, moving the extended stats data into a new catalog pg_statistic_ext_data (I'm not particularly wedded to that name, it's just the first name that came to mind). With this patch the catalogs look like this: \d pg_statistic_ext Table "pg_catalog.pg_statistic_ext" Column | Type | Collation | Nullable | Default --------------+------------+-----------+----------+--------- oid | oid | | not null | stxrelid | oid | | not null | stxname | name | | not null | stxnamespace | oid | | not null | stxowner | oid | | not null | stxkeys | int2vector | | not null | stxkind | "char"[] | | not null | Indexes: "pg_statistic_ext_name_index" UNIQUE, btree (stxname, stxnamespace) "pg_statistic_ext_oid_index" UNIQUE, btree (oid) "pg_statistic_ext_relid_index" btree (stxrelid) \d pg_statistic_ext_data Table "pg_catalog.pg_statistic_ext_data" Column | Type | Collation | Nullable | Default -----------------+-----------------+-----------+----------+--------- stxoid | oid | | not null | stxndistinct | pg_ndistinct | C | | stxdependencies | pg_dependencies | C | | stxmcv | pg_mcv_list | C | | Indexes: "pg_statistic_ext_data_stxoid_index" UNIQUE, btree (stxoid) I opted to create/remove pg_statistic_ext_data tuples at the same time as the pg_statistic_ext tuples, in CreateStatistics() / RemoveStatisticsById(), so then it's easier to see that they're in a one-to-one relationship, and other code doesn't need to worry about the data tuple not existing. The other option would be to defer inserting the data tuple to ANALYZE. I couldn't resist moving the code block that declares pg_statistic_ext's indexes in indexing.h to the right place, assuming that file is (mostly) sorted alphabetically by catalog name. This puts the extended stats entries just after the normal stats entries which seems preferable. This is only a very rough first draft (e.g., no doc updates), but it passes all the regression tests. Regards, Dean
Attachment
pgsql-hackers by date: