Thread: RLS not using index scan but seq scan when condition gets a bit complicated
RLS not using index scan but seq scan when condition gets a bit complicated
From
Charles Huang
Date:
Hi Postgres community,
We are experiencing some performance issues when RLS is enabled for large tables. With simplified example:
We have a table:
CREATE TABLE emp.employees (employee_id INTEGER PRIMARY KEY,-- companies table are defined in a different schema, not accessible to emp servicecompany_id INTEGER NOT NULL,employee_name TEXT NOT NULL);
Index for employees table:
CREATE INDEX employees_company_id_idx ON emp.employees (company_id);
And for the table we have RLS select policy:
CREATE POLICY employee_select_policy ON emp.employees FOR SELECTUSING (company_id = ANY(coalesce(string_to_array(current_setting('emp.authorized_read_company_ids', TRUE), ',')::INTEGER[], ARRAY []::INTEGER[])));
When a very simple query is executed, for instance:
SET emp.authorized_read_company_ids = '1, 2, 3, ..., 200';SELECT count(*) FROM emp.employees WHERE TRUE; -- 68091 rows
The query plan for this query reads:
Aggregate (cost=1096.02..1096.03 rows=1 width=8) (actual time=8.740..8.740 rows=1 loops=1)Output: count(*)Buffers: shared hit=778-> Index Only Scan using employees_company_id_idx on emp.employees (cost=0.35..970.78 rows=50099 width=0) (actual time=0.124..4.976 rows=49953 loops=1)Output: company_idIndex Cond: (employees.company_id = ANY (COALESCE((string_to_array(current_setting('emp.authorized_read_company_ids'::text, true), ','::text))::integer[], '{}'::integer[])))Heap Fetches: 297Buffers: shared hit=778Planning:Buffers: shared hit=12Planning Time: 0.824 msExecution Time: 8.768 ms
The problem rises when we make the RLS select policy condition a bit more complicated by adding admin checks inside RLS select policy:
CREATE POLICY employee_select_policy ON emp.employees FOR SELECTUSING (coalesce(nullif(current_setting('emp.is_admin', TRUE), ''), 'false')::BOOLEANOR company_id = ANY(coalesce(string_to_array(current_setting('emp.authorized_read_company_ids', TRUE), ',')::INTEGER[], ARRAY []::INTEGER[])));
When the same simple query is executed:
SET emp.is_admin = TRUE;SET emp.authorized_read_company_ids = '1, 2, 3, ..., 200';SELECT count(*) FROM emp.employees WHERE TRUE; -- 68091 rows
The query plan now reads:
Aggregate (cost=6238.51..6238.52 rows=1 width=8) (actual time=2156.271..2156.272 rows=1 loops=1)Output: count(*)Buffers: shared hit=367-> Index Only Scan using employees_company_id_idx on emp.employees (cost=0.29..6099.16 rows=55740 width=0) (actual time=0.065..2151.939 rows=49953 loops=1)Output: company_idFilter: ((COALESCE(NULLIF(current_setting('emp.is_admin'::text, true), ''::text), 'false'::text))::boolean OR (employees.company_id = ANY (COALESCE((string_to_array(current_setting('emp.authorized_read_company_ids'::text, true), ','::text))::integer[], '{}'::integer[]))))Rows Removed by Filter: 11430Heap Fetches: 392Buffers: shared hit=367Planning Time: 0.744 msExecution Time: 2156.302 ms
We can see the performance has deteriorated horribly because the RLS is not using index any more for the company ids, the RLS scan happens for every single row in the result set against every single company id in the db context.
With the size of table and the number of company ids inside the db context growing, the execution time becomes longer and longer.
To summarise: We would like to have admin users run without any RLS restrictions, and normal users to have RLS enforced using an index based on company_ids. Unfortunately, we cannot have queries executed by admin users connect to the database as a different database user.
Is there anything you could suggest?
Thanks,
Charles