Home > mailing lists

Re: Postgres 15 SELECT query doesn't use index under RLS - Mailing list pgsql-performance

From	Alexander Okulovich
Subject	Re: Postgres 15 SELECT query doesn't use index under RLS
Date	October 18, 2023 14:07:38
Msg-id	dd874d42-ad02-48a6-82db-5666f1ee0ec1@stiltsoft.com Whole thread Raw
In response to	Re: Postgres 15 SELECT query doesn't use index under RLS (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: Postgres 15 SELECT query doesn't use index under RLS
List	pgsql-performance

Tree view

Hi Tom,

> If in fact you were getting decent performance from an indexscan plan
> before, the only explanation I can think of is that the repo_ids you
> are querying for are correlated with the tenant_id, so that the RLS
> filter doesn't eliminate very many rows from the index result.  The
> planner wouldn't realize that by default, but if you create extended
> statistics on repo_id and tenant_id then it might do better.  Still,
> you probably want the extra index.

Do you have any idea how to measure that correlation?

> You haven't shown any evidence suggesting that.
My suggestion is based on following backward reasoning.

We used the product with the default settings. The requests are simple. 
We didn't change the hardware (actually, we use even more performant 
hardware because of that issue) and DDL. I've checked the request on old 
and new databases. Requests that rely on this index execute more than 10 
times longer. Planner indeed used Index Scan before, but now it doesn't.

So, from my perspective, the only reason we experience that is database 
logic change. I think we could probably try to reproduce the issue on 
different Postgres versions and find the specific version that causes this.

> Adding tenant_id is going to bloat your indexes quite a bit,
> so I wouldn't do that except in cases where you've demonstrated
> it's important.

Any recommendations from the Postgres team on how to use the indexes 
under RLS would help a lot here, but I didn't find them.

Kind regards,

Alexander

On 13.10.2023 22:26, Tom Lane wrote:
> Alexander Okulovich <aokulovich@stiltsoft.com> writes:
>> Recently, we upgraded the AWS RDS instance from Postgres 12.14 to 15.4
>> and noticed extremely high disk consumption on the following query
>> execution:
>> select (exists (select 1 as "one" from "public"."indexed_commit" where
>> "public"."indexed_commit"."repo_id" in (964992,964994,964999, ...);
>> For some reason, the query planner starts using Seq Scan instead of the
>> index on the "repo_id" column when requesting under user limited with
>> RLS. On prod, it happens when there are more than 316 IDs in the IN part
>> of the query, on stage - 3. If we execute the request from Superuser,
>> the planner always uses the "repo_id" index.
> The superuser bypasses the RLS policy.  When that's enforced, the
> query can no longer use an index-only scan (because it needs to fetch
> tenant_id too).  Moreover, it may be that only a small fraction of the
> rows fetched via the index will satisfy the RLS condition.  So the
> estimated cost of an indexscan query could be high enough to persuade
> the planner that a seqscan is a better idea.
>
>> Luckily, we can easily reproduce this on our stage database (which is
>> smaller). If we add a multicolumn "repo_id, tenant_id" index, the
>> planner uses it (Index Only Scan) with any IN params count under RLS.
> Yeah, that would be the obvious way to ameliorate both problems.
>
> If in fact you were getting decent performance from an indexscan plan
> before, the only explanation I can think of is that the repo_ids you
> are querying for are correlated with the tenant_id, so that the RLS
> filter doesn't eliminate very many rows from the index result.  The
> planner wouldn't realize that by default, but if you create extended
> statistics on repo_id and tenant_id then it might do better.  Still,
> you probably want the extra index.
>
>> Could you please clarify if this is a Postgres bug or not?
> You haven't shown any evidence suggesting that.
>
>> Should we
>> include the "tenant_id" column in all our indexes to make them work
>> under RLS?
> Adding tenant_id is going to bloat your indexes quite a bit,
> so I wouldn't do that except in cases where you've demonstrated
> it's important.
>
>             regards, tom lane

pgsql-performance by date:

From: Alexander Okulovich
Date: 18 October 2023, 10:29:26
Subject: Re: Postgres 15 SELECT query doesn't use index under RLS

From: Tom Lane
Date: 18 October 2023, 20:35:50
Subject: Re: Postgres 15 SELECT query doesn't use index under RLS

Re: Postgres 15 SELECT query doesn't use index under RLS - Mailing list pgsql-performance

Previous

Next