Home > mailing lists

Re: ToDo: show size of partitioned table - Mailing list pgsql-hackers

From	Pavel Stehule
Subject	Re: ToDo: show size of partitioned table
Date	November 22, 2018 11:42:14
Msg-id	CAFj8pRBJetR9tk5cNGt5mLewxwB4gXK0dm7L41fxUqBJ58h2wg@mail.gmail.com Whole thread Raw
In response to	Re: ToDo: show size of partitioned table (Michael Paquier <michael@paquier.xyz>)
Responses	Re: ToDo: show size of partitioned table
List	pgsql-hackers

Tree view

čt 22. 11. 2018 v 1:51 odesílatel Michael Paquier <michael@paquier.xyz> napsal:

On Wed, Nov 21, 2018 at 05:37:33PM +0100, Pavel Stehule wrote:
> st 21. 11. 2018 v 17:21 odesílatel Alvaro Herrera <alvherre@2ndquadrant.com>
> napsal:
>> Hmm, these tests are not going to work, because they have "pavel" in the
>> expected output.
>
> I was blind, thank you for check

+create table testtable_apple(logdate date);
+create table testtable_orange(logdate date);
+create index testtable_apple_index on testtable_apple(logdate);
+create index testtable_orange_index on testtable_orange(logdate);
There are already a bunch of partition relations with multiple levels
created as part of the regression tests, so instead of creating more of
those, I would suggest to test \dP and \dPt in create_table.sql, and
\dPi in indexing.sql (please make sure to add tests for \dP with
partitioned indexes as well).

I think that you should really add the direct parent of a partition in
at least the verbose output, now for multiple partition levels things
are confusing in my opinion. For example with such a schema:
CREATE TABLE parent_tab (id int) PARTITION BY RANGE (id);
CREATE INDEX parent_index ON parent_tab (id);
CREATE TABLE child_0_10 PARTITION OF parent_tab
FOR VALUES FROM (0) TO (10);
CREATE TABLE child_10_20 PARTITION OF parent_tab
FOR VALUES FROM (10) TO (20);
CREATE TABLE child_20_30 PARTITION OF parent_tab
FOR VALUES FROM (20) TO (30);
INSERT INTO parent_tab VALUES (generate_series(0,29));
CREATE TABLE child_30_40 PARTITION OF parent_tab
FOR VALUES FROM (30) TO (40)
PARTITION BY RANGE(id);
CREATE TABLE child_30_35 PARTITION OF child_30_40
FOR VALUES FROM (30) TO (35);
CREATE TABLE child_35_40 PARTITION OF child_30_40
FOR VALUES FROM (35) TO (40);
INSERT INTO parent_tab VALUES (generate_series(30,39));

Then with \dP+ I got that:
=# \dP+
List of partitioned relations
Schema | Name | Owner | Size | Description
--------+-------------+--------+--------+-------------
public | child_30_40 | ioltas | 48 kB |
public | parent_tab | ioltas | 120 kB |
(2 rows)
Showing the parent partition looks like a pretty important to me as I
would expect multi-level partitions to be a frequent case (perhaps it
should show up as well in the non-verbose output?). The field should be
NULL if the relation is the top of the tree.

it looks like bug for me much more.

your example - on my comp

there is about 5x 8KB data .. 40KB

But in views I got

there is 16KB more, what is really messy.

I think so most correct is removing child_30_40 from the report.

test=# SELECT n.nspname as "Schema",
c.relname as "Name",
pg_catalog.pg_get_userbyid(c.relowner) as "Owner",
(SELECT pg_catalog.pg_size_pretty(sum(pg_catalog.pg_table_size(relid)))
     FROM pg_catalog.pg_partition_tree(c.oid)) AS "Size",
pg_catalog.obj_description(c.oid, 'pg_class') as "Description"
FROM pg_catalog.pg_class c
     LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind IN ('p') and not c.relispartition
      AND n.nspname <> 'pg_catalog'
      AND n.nspname <> 'information_schema'
      AND n.nspname !~ '^pg_toast'
AND pg_catalog.pg_table_is_visible(c.oid)
ORDER BY 1,2;
+--------+------------+-------+-------+-------------+
| Schema |    Name    | Owner | Size | Description |
+--------+------------+-------+-------+-------------+
| public | parent_tab | pavel | 40 kB |             |
+--------+------------+-------+-------+-------------+
(1 row)

I afraid of unreadable result if we allow overlap in report. I think so can be strange if some disk space will be reported 2x or more times in one report. Unfortunately It means so some information will be hidden. In this moment I prefer readability and simple meaning.

I am not strong in this topics. Another possibility is show parent (this should be displayed every time, without it it is messy).

This query is much more complex, but the result is more informative

SELECT n.nspname as "Schema",
c.relname as "Name",
n2.nspname as "Parent schema",
c2.relname as "Parent name",
pg_catalog.pg_get_userbyid(c.relowner) as "Owner",
s.max as "Hiearchy deep",
s.size as "Size",
pg_catalog.obj_description(c.oid, 'pg_class') as "Description"
FROM pg_catalog.pg_class c
     LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
     LEFT JOIN pg_catalog.pg_inherits i ON c.oid = i.inhrelid
     LEFT JOIN pg_catalog.pg_class c2 ON c2.oid = i.inhparent
     LEFT JOIN pg_catalog.pg_namespace n2 ON n2.oid = c2.relnamespace,
     LATERAL (SELECT max(level), pg_catalog.pg_size_pretty(sum(pg_catalog.pg_table_size(relid))) as size
     FROM pg_catalog.pg_partition_tree(c.oid)) s
WHERE c.relkind IN ('p')
      AND n.nspname <> 'pg_catalog'
      AND n.nspname <> 'information_schema'
      AND n.nspname !~ '^pg_toast'
AND pg_catalog.pg_table_is_visible(c.oid)
ORDER BY 1,2;

Still I prefer to not show nested partitioned tables for simplicity, readability reasons. Displaying nested objects in one table doesn't look like good idea for me. But I am ready to accept different common opinion.

Still do you think so variant with parent should be preferred?

Again, with the previous schema:
=# \dPi *idx
List of partitioned indexes
Schema | Name | Owner | Table
--------+--------------------+--------+-------------
public | child_30_40_id_idx | ioltas | child_30_40
(1 row)
=# \dP *idx
Did not find any partitioned relations named "*idx"
I would have expected in the second case to have the partitioned
*relations* showing up in the output, and a relation can be an index as
well if the pattern matches.

I think so it is correct - I don't would to see the index here, because index size is calculated by total_relation_size already.

Here my position is strong. \dP for me doesn't mean "tables or indexes" - it means "partition tables with total relation size". I don't see any sense to show tables and indexes in one report.

Regards

Pavel

Could you please address those problems first? The basic shape of the
patch with the three new sub-commands is fine I think, so we can go
ahead with that, but the two problems reported are blockers in my
opinion.

--
Michael

pgsql-hackers by date:

From: Haozhou Wang
Date: 22 November 2018, 11:25:37
Subject: Re: Control your disk usage in PG: Introduction to Disk Quota Extension

From: Magnus Hagander
Date: 22 November 2018, 11:44:31
Subject: Re: New function pg_stat_statements_reset_query() to reset statisticsof a specific query

Re: ToDo: show size of partitioned table - Mailing list pgsql-hackers

Previous

Next