Thread: query multiple schemas
Hello,
I have a database with multiple tenants with a separate schema for each tenant.
The structure of each schema is identical, the tenant ID is the name of the schema.
What I would like to achieve is to able to query tables in all schemas at once with the tenant IDs added to the result set.
I experimented with typed solutions like described in https://dev.to/staab/how-to-query-multiple-schemas-with-postgresql-28kj without much success.
So I turned to a more dynamic JSON-based solution.
Please note that I'm new to plpgsql, so any (even a less related) advice is welcome :)
My current experimental function is:
CREATE OR REPLACE FUNCTION demo.tenant_union_query(tbl anyelement)
RETURNS setof json AS $func$
declare
_select text;
begin
_select := (select
string_agg(
format('select t.*, %L tenantId from %I.%I t', schema_name, schema_name, pg_typeof(tbl)),
E'\n' || ' union all ' || E'\n')
from (
SELECT schema_name
FROM information_schema.schemata
where schema_name not in ('information_schema') and schema_name not like 'pg_%'
) tenants
);
return query execute 'select row_to_json(r) from (' || _select || ') as r';
END;
$func$ LANGUAGE plpgsql;
And this is how I use it to query a "usual" result-set-like result with the tenant ID in the 1st column, followed by the fields from the given table:
select r->>'tenantid' tenantId, json_populate_record(null::mytable, r) from tenant_union_query(null::mytable) r order by tenantId; -- (1)
The above solution seems to work, my questions are:
- Is there a better way to achieve the same functionality? Maybe without using JSON as an intermediate representation?
- Is there a way to further simplify the usage, like instead of the query (1) above something more simple, like: select * from tenant_union_query_2(null::mytable) order by tenantId;
Thanks for your help in advance.
Best regards,
Norbi
Hello,
I have a database with multiple tenants with a separate schema for each tenant.
The structure of each schema is identical, the tenant ID is the name of the schema.What I would like to achieve is to able to query tables in all schemas at once with the tenant IDs added to the result set.
I experimented with typed solutions like described in https://dev.to/staab/how-to-query-multiple-schemas-with-postgresql-28kj without much success.
So I turned to a more dynamic JSON-based solution.Please note that I'm new to plpgsql, so any (even a less related) advice is welcome :)
My current experimental function is:
CREATE OR REPLACE FUNCTION demo.tenant_union_query(tbl anyelement)
RETURNS setof json AS $func$
declare
_select text;
begin
_select := (select
string_agg(
format('select t.*, %L tenantId from %I.%I t', schema_name, schema_name, pg_typeof(tbl)),
E'\n' || ' union all ' || E'\n')
from (
SELECT schema_name
FROM information_schema.schemata
where schema_name not in ('information_schema') and schema_name not like 'pg_%'
) tenants
);
return query execute 'select row_to_json(r) from (' || _select || ') as r';
END;
$func$ LANGUAGE plpgsql;
And this is how I use it to query a "usual" result-set-like result with the tenant ID in the 1st column, followed by the fields from the given table:
select r->>'tenantid' tenantId, json_populate_record(null::mytable, r) from tenant_union_query(null::mytable) r order by tenantId; -- (1)The above solution seems to work, my questions are:
- Is there a better way to achieve the same functionality? Maybe without using JSON as an intermediate representation?
- Is there a way to further simplify the usage, like instead of the query (1) above something more simple, like: select * from tenant_union_query_2(null::mytable) order by tenantId;
Thanks for your help in advance.
Best regards,
Norbi
The structure of each schema is identical, the tenant ID is the name of the schema.
The above solution seems to work, my questions are:
- Is there a better way to achieve the same functionality? Maybe without using JSON as an intermediate representation?
Steve Baldwin <steve.baldwin@gmail.com> writes: > If the number of tenant schemas is reasonably static, you could write a > plpgsql function to create a set of UNION ALL views with one view for each > table in all tenant schemas. You could re-run the function each time a > tenant schema is added. Having the set of views would allow you to query > them as you would any of the underlying tables, and the query planner could > likely optimise the query better. With your current function, if you needed > to add a WHERE clause and the underlying tables were large, it would likely > not perform as well as the UNION ALL view. Another idea is to build a partitioned table comprising all the individual tables. This might require the tables to have all the same owner (not sure about that) and it'd require you to store the partition key, ie tenant, explicitly in every table which might seem a bit wasteful. But it'll likely outperform any other solution for the union queries. Notably, adding or deleting a partition would be much less painful than redefining a UNION ALL view. regards, tom lane
Steve Baldwin <steve.baldwin@gmail.com> writes:
> If the number of tenant schemas is reasonably static, you could write a
> plpgsql function to create a set of UNION ALL views
Another idea is to build a partitioned table
> And if/when I get back to this issue myself, I'll do the same.
My current solution is not much different than the one I posted in my original question.CREATE OR REPLACE FUNCTION tenant_query_json(tbl anyelement)
RETURNS setof json AS $func$ [...]
CREATE OR REPLACE FUNCTION tenant_query(tbl anyelement)
RETURNS table(tenantId text, record anyelement) AS $func$ [...]
table_name
OF type_name
Hello,
I have a database with multiple tenants with a separate schema for each tenant.
The structure of each schema is identical, the tenant ID is the name of the schema.What I would like to achieve is to able to query tables in all schemas at once with the tenant IDs added to the result set.
declare
Result record;
schemas text;
begin
for schemas in select unnest(tenants) loop
execute Format('set local search_path to %s, public;', schemas);
for Result in execute Format('select $$%1$s$$ tenant, * from (%2$s) x', schemas, sql) loop
return next Result;
end loop;
end loop;
end;$function$;
select * from sql_per_tenant('select Order_ID, sum(Value) from Orders inner join Items using(Order_ID)