Home > mailing lists

Re: Multivariate MCV stats can leak data to unprivileged users - Mailing list pgsql-hackers

From	Dean Rasheed
Subject	Re: Multivariate MCV stats can leak data to unprivileged users
Date	May 19, 2019 09:49:03
Msg-id	CAEZATCUcoAKrn-1NFkDN7md-wt6CQev0JXa+Tv31T9Jjkxgx5Q@mail.gmail.com Whole thread Raw
In response to	Re: Multivariate MCV stats can leak data to unprivileged users (Stephen Frost <sfrost@snowman.net>)
Responses	Re: Multivariate MCV stats can leak data to unprivileged users
List	pgsql-hackers

Tree view

On Sun, 19 May 2019 at 00:48, Stephen Frost <sfrost@snowman.net> wrote:
>
> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
> > Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> >
> > > I think we have four options - rework it before beta1, rework it after
> > > beta1, rework it in PG13 and leave it as it is now.
> >
> > Yup, that's about what the options are.  I'm just voting against
> > "change it in v13".  If we're going to change it, then the fewer
> > major versions that have the bogus definition the better --- and
> > since we're changing that catalog in v12 anyway, users will see
> > fewer distinct behaviors if we do this change too.
> >
> > It's very possibly too late to get it done before beta1,
> > unfortunately.  But as Andres noted, post-beta1 catversion
> > bumps are hardly unusual, so I do not think "rework after
> > beta1" is unacceptable.
>
> Agreed.
>

Yes, that makes sense.

I think we shouldn't risk trying to get this into beta1, but let's try
to get it done as soon as possible after that.

Actually, it doesn't appear to be as big a change as I had feared. As
a starter for ten, here's a patch doing the basic split, moving the
extended stats data into a new catalog pg_statistic_ext_data (I'm not
particularly wedded to that name, it's just the first name that came
to mind).

With this patch the catalogs look like this:

\d pg_statistic_ext
            Table "pg_catalog.pg_statistic_ext"
    Column    |    Type    | Collation | Nullable | Default
--------------+------------+-----------+----------+---------
 oid          | oid        |           | not null |
 stxrelid     | oid        |           | not null |
 stxname      | name       |           | not null |
 stxnamespace | oid        |           | not null |
 stxowner     | oid        |           | not null |
 stxkeys      | int2vector |           | not null |
 stxkind      | "char"[]   |           | not null |
Indexes:
    "pg_statistic_ext_name_index" UNIQUE, btree (stxname, stxnamespace)
    "pg_statistic_ext_oid_index" UNIQUE, btree (oid)
    "pg_statistic_ext_relid_index" btree (stxrelid)

\d pg_statistic_ext_data
              Table "pg_catalog.pg_statistic_ext_data"
     Column      |      Type       | Collation | Nullable | Default
-----------------+-----------------+-----------+----------+---------
 stxoid          | oid             |           | not null |
 stxndistinct    | pg_ndistinct    | C         |          |
 stxdependencies | pg_dependencies | C         |          |
 stxmcv          | pg_mcv_list     | C         |          |
Indexes:
    "pg_statistic_ext_data_stxoid_index" UNIQUE, btree (stxoid)

I opted to create/remove pg_statistic_ext_data tuples at the same time
as the pg_statistic_ext tuples, in CreateStatistics() /
RemoveStatisticsById(), so then it's easier to see that they're in a
one-to-one relationship, and other code doesn't need to worry about
the data tuple not existing. The other option would be to defer
inserting the data tuple to ANALYZE.

I couldn't resist moving the code block that declares
pg_statistic_ext's indexes in indexing.h to the right place, assuming
that file is (mostly) sorted alphabetically by catalog name. This puts
the extended stats entries just after the normal stats entries which
seems preferable.

This is only a very rough first draft (e.g., no doc updates), but it
passes all the regression tests.

Regards,
Dean

Attachment

split-up-pg-statistics-ext.patch

pgsql-hackers by date:

From: David Rowley
Date: 19 May 2019, 08:18:38
Subject: Re: Statistical aggregate functions are not working with PARTIAL aggregation

From: Akim Demaille
Date: 19 May 2019, 12:47:32
Subject: Remove useless associativity/precedence from parsers

Re: Multivariate MCV stats can leak data to unprivileged users - Mailing list pgsql-hackers

Attachment

Previous

Next