Thread: Column type modification in big tables
Hello all,
We have a Postgres table that is range-partitioned on a timestamp column. It is ~3TB in total, holds ~5 billion rows spread across ~150 daily partitions, and has 140+ columns. The table is also a child of another partitioned table, and partition creation on it is handled through the pg_partman extension.
We have a requirement to modify the existing column types/lengths as listed below. Doing it directly through a single alter command (for example, something like the statement sketched just after this list) will probably scan and rewrite the whole table, which may take hours of run time.
So we are trying to understand from the experts what the best possible way is to tackle such changes in a Postgres database, and what side effects we may see, considering this table is a child of another one and also depends on the pg_partman extension.
two of the columns from varchar(20) to numeric(3)
one of the columns from varchar(20) to varchar(2)
one of the columns from numeric(10,2) to numeric(8,2)
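For reference, a rough sketch of that direct, single-statement approach (the column names here are only placeholders; the varchar-to-numeric changes would need a USING clause since there is no automatic cast, and the varchar(2) change will fail outright if any existing value is longer than two characters):
ALTER TABLE tab1
    ALTER COLUMN vc_col1 TYPE numeric(3) USING vc_col1::numeric(3),
    ALTER COLUMN vc_col2 TYPE numeric(3) USING vc_col2::numeric(3),
    ALTER COLUMN vc_col3 TYPE varchar(2),
    ALTER COLUMN num_col TYPE numeric(8,2);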
Regards
Lok
On Wed, Aug 7, 2024 at 4:39 PM Lok P <loknath.73@gmail.com> wrote:
We have a requirement of modifying the existing column lengths as below. [...] So trying to understand from experts what is the best possible way to tackle such changes in a Postgres database?

Others may correct, but I think if you don't have an FK defined on these columns, you can do the below.
-- Add the new columns; this is very fast (seconds) since it only touches the catalog.
ALTER TABLE tab1 ADD COLUMN new_column1 numeric(3), ADD COLUMN new_column2 varchar(2);
-- Backfill the data partition by partition and commit, if it's really needed.
UPDATE tab1_part1 SET new_column1 = CAST(old_column1 AS numeric(3)), new_column2 = CAST(old_column2 AS varchar(2));
commit;
UPDATE tab1_part2 SET new_column1 = CAST(old_column1 AS numeric(3)), new_column2 = CAST(old_column2 AS varchar(2));
commit;
UPDATE tab1_part3 SET new_column1 = CAST(old_column1 AS numeric(3)), new_column2 = CAST(old_column2 AS varchar(2));
commit;
.....
-- Drop the old columns; also very fast (seconds) since it only marks them dropped in the catalog.
ALTER TABLE tab1 DROP COLUMN old_column1, DROP COLUMN old_column2;
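Presumably there would also be a final step of renaming the new columns to the original names so that existing queries keep working, something like:
ALTER TABLE tab1 RENAME COLUMN new_column1 TO old_column1;
ALTER TABLE tab1 RENAME COLUMN new_column2 TO old_column2;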
On Wed, Aug 7, 2024 at 4:51 PM sud <suds1434@gmail.com> wrote:
Others may correct, but I think if you don't have an FK defined on these columns, you can do the below. [...]
Thank you so much.
I understand this will be the fastest possible way to achieve the column modification.
But regarding the dropped columns that will still be sitting in the table and consuming storage space: is it fine to leave them as is, or will autovacuum remove the old column values behind the scenes? And in any case, once those partitions are purged, the space goes with them. Is this understanding correct?
Also, will this have any impact on the partition maintenance currently done by pg_partman, given that the table is now different internally (though not from the outside)? Will those dropped columns on the main table cause any conflict?
On Wed, Aug 7, 2024 at 5:00 PM Lok P <loknath.73@gmail.com> wrote:
But regarding the dropped columns that will still be sitting in the table and consuming storage space: is it fine to leave them as is, or will autovacuum remove the old column values behind the scenes? [...] Will those dropped columns on the main table cause any conflict?
I think leaving the table as is after dropping the columns will be fine for you, because your regular partition maintenance will slowly purge the historical partitions and they will eventually be removed. And if you update the new columns with the old column values, autovacuum should also take care of removing the old row versions (which are dead at that point).
Not sure whether pg_partman will cause any issue now that the table's column data types/lengths have changed. Others may confirm.
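As an aside, one way to see what the table looks like internally after the drop (tab1 is a placeholder name) is to query pg_attribute; the dropped columns stay behind as entries with attisdropped = true until the partitions are eventually rewritten or purged:
SELECT attnum, attname, attisdropped
FROM pg_attribute
WHERE attrelid = 'tab1'::regclass
  AND attnum > 0
ORDER BY attnum;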
On Thu, Aug 8, 2024 at 1:06 AM sud <suds1434@gmail.com> wrote:
Thank you so much.
Can anybody suggest any other possible way here? With this approach we still need to populate the existing values into the new columns using UPDATE commands (even if we do it one partition at a time), and since almost all the values in these columns are not null, that means updating almost ~5 billion rows across all the partitions. So my question is: is there any parameter (like work_mem, maintenance_work_mem, etc.) we can set to make this update faster?
Or is there any other way to get these columns altered apart from this method?
> On 8 Aug 2024, at 20:38, Lok P <loknath.73@gmail.com> wrote:
>
> Can anybody suggest any other possible way here? [...] is there any parameter (like work_mem, maintenance_work_mem etc) which we can set to make this update faster?
> or any other way to get this column altered apart from this method?

Just a wild thought here that I'm currently not able to check… Can you add views as partitions? They would be read-only of course, but that would allow you to cast the columns in your original partitions to the new format, while you can add any new partitions in the new format.

I suspect it's not allowed, but perhaps worth a try.

Alban Hertroys
--
There is always an exception to always.
On Thu, Aug 8, 2024 at 2:39 PM Lok P <loknath.73@gmail.com> wrote:
Can anybody suggest any other possible way here.
Sure - how about not changing the column type at all?
> one of the columns from varchar(20) to varchar(2)
ALTER TABLE foobar ADD CONSTRAINT twocharplease CHECK (length(mycol) <= 2) NOT VALID;
> one of the columns from numeric(10,2) to numeric(8,2)
ALTER TABLE foobar ADD CONSTRAINT eightprecision CHECK (abs(mycol) < 10^6) NOT VALID;
> two of the columns from varchar(20) to numeric(3)
This one is trickier, as we don't know the contents, nor why it is going to numeric(3) - not a terribly useful data type, but let's roll with it and assume the stuff in the varchar is a number of some sort, and that we don't allow nulls:
ALTER TABLE foobar ADD CONSTRAINT onekorless CHECK (mycol::numeric(3) is not null) NOT VALID;
You probably want to check on the validity of the existing rows: see the docs for ALTER TABLE ... VALIDATE CONSTRAINT.
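For example, validating the existing rows later on, using the constraint names above, would just be:
ALTER TABLE foobar VALIDATE CONSTRAINT twocharplease;
ALTER TABLE foobar VALIDATE CONSTRAINT eightprecision;
ALTER TABLE foobar VALIDATE CONSTRAINT onekorless;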
Cheers,
Greg
On Fri, Aug 9, 2024 at 2:06 AM Greg Sabino Mullane <htamfids@gmail.com> wrote:
Thank you so much. We will definitely try to evaluate this approach. The only concern I have is that this data moves downstream with exactly the same data types and lengths, so will this cause the downstream code to break when using these columns in join or filter criteria? Also, I believe the optimizer won't be able to utilize this information while preparing the execution plan.
Another thing, correct me if I'm wrong: my understanding is that if we run the VALIDATE CONSTRAINT command after adding the CHECK constraint with NOT VALID, it will do a full table scan across all the partitions, but it's still beneficial compared to updating the column values for every row. Is that right?
On Fri, Aug 9, 2024 at 6:39 AM Lok P <loknath.73@gmail.com> wrote:
Thank you so much. We will definitely try to evaluate this approach. The only concern I have is that this data moves downstream with exactly the same data types and lengths, so will this cause the downstream code to break when using these columns in join or filter criteria? Also, I believe the optimizer won't be able to utilize this information while preparing the execution plan.
Yes, this is not as ideal as rewriting the table, but you asked for other approaches :) As to the impact on your downstream stuff, I think you have to try it and see. Not clear what you mean about the optimizer; it's not going to really care about numeric(10) versus numeric(8) or varchar(20) vs varchar(2). It's possible the varchar -> numeric could cause issues, but without real-world queries and data we cannot say.
Another thing, correct me if I'm wrong: my understanding is that if we run the VALIDATE CONSTRAINT command after adding the CHECK constraint with NOT VALID, it will do a full table scan across all the partitions, but it's still beneficial compared to updating the column values for every row.
Yes, it needs to scan the entire table, but it's a lightweight lock, won't block concurrent access, will not need to detoast, and makes no table or index updates. Versus an entire table rewrite which will do heavy locking, take up tons of I/O, update all the indexes, and generate quite a lot of WAL.
Cheers,
Greg
On Fri, Aug 9, 2024 at 9:19 PM Greg Sabino Mullane <htamfids@gmail.com> wrote:
Thank you so much Greg.
Now considering the option where we are able to get a large downtime window to get this activity done:
Some teammates suggested altering the columns with a "USING" clause. I am not really able to understand the difference; when I tested on a simple table, the "USING" clause seemed to take more time compared to a plain ALTER. And again, I don't see any way to check the progress or estimated completion time. Can you share your thoughts on this?
ALTER TABLE foobar ALTER COLUMN mycol TYPE NUMERIC(3) USING mycol::NUMERIC(3);
VS
ALTER TABLE foobar ALTER COLUMN mycol TYPE NUMERIC(3) ;
*****
Another thing that comes to mind: should we just create a new partitioned table (say new_part_table) from scratch using the DDL of the existing table (say old_part_table), then load the data into it with an INSERT INTO new_part_table ... SELECT ... FROM old_part_table, and then create the indexes, constraints, etc.? Something like the steps below.
Will this approach be faster/better compared to the simple "alter table alter column" approach above, considering we will have 4-6 hours of downtime for altering three different columns on this ~5TB table?
-- Steps
Create a table exactly the same as the existing partitioned table but with the modified column types/lengths.
Drop indexes (except maybe the PK and FK indexes).
Drop constraints.
insert into new_part_table (...) select (...) from old_part_table; (a sketch of this step follows the list)
Create indexes concurrently.
Create constraints (but this table is also a child of another partitioned table, so creating the foreign key may be resource consuming here too).
Drop the old_part_table.
Rename the new_part_table to old_part_table.
Rename all the partitions.
VACUUM old_part_table;
ANALYZE old_part_table;
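A minimal sketch of the load step above, with placeholder column names (unchanged columns listed as-is; the columns whose types change need explicit casts, since there is no automatic varchar-to-numeric conversion on insert):
INSERT INTO new_part_table (id, created_at, vc_col1, vc_col2, vc_col3, num_col)
SELECT id, created_at,
       vc_col1::numeric(3),   -- errors if any value does not fit numeric(3)
       vc_col2::numeric(3),
       vc_col3::varchar(2),   -- an explicit cast silently truncates to 2 characters
       num_col::numeric(8,2)
FROM old_part_table;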
On Sat, Aug 10, 2024 at 12:52 AM Lok P <loknath.73@gmail.com> wrote:
My 2cents.
If you have enough time, then from a simplicity point of view the single ALTER command may look good, but how are you going to see how much progress it has made so far and how much time it is going to take to finish? You have ~6 hrs of downtime, but if it fails at the fifth hour you will be in a bad position.
On Sat, Aug 10, 2024 at 5:47 PM sud <suds1434@gmail.com> wrote:
Does that mean that, if we get enough downtime, we should rather go with the option of recreating the table from scratch, populating the data from the existing table, and then renaming it back? It does look more complicated, considering the many steps like recreating indexes and constraints, renaming, and then running VACUUM and ANALYZE, etc.
Can someone throw some light on this: if we get 5-6 hrs of downtime for this change, which method should we choose for this ALTER operation?
On Sat, Aug 10, 2024 at 5:06 PM Lok P <loknath.73@gmail.com> wrote:
Can someone throw some light on this: if we get 5-6 hrs of downtime for this change, which method should we choose for this ALTER operation?
We can't really answer that. Only you know what resources you have, what risk/reward you are willing to handle, and how long things may take. For that latter item, your best bet is to try this out on the same/similar hardware and see how long it takes. Do a smaller table and extrapolate if you need to. Or promote one of your replicas offline and modify that. I've given you a low-risk / medium-reward option with check constraints, but for the ALTER TABLE options you really need to try it and see (on non-prod).
it seems the "USING" clause takes more time as compared to normal ALTER. But again I don't see any way to see the progress and estimated completion time. Can you share your thoughts on this?
There should be no difference if they are doing the same conversion.
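One note, assuming the columns going to numeric(3) are currently varchar: there is no automatic cast from varchar to numeric, so for those columns only the USING form will be accepted, for example:
-- rejected: the varchar column cannot be cast to numeric automatically
ALTER TABLE foobar ALTER COLUMN mycol TYPE numeric(3);
-- accepted
ALTER TABLE foobar ALTER COLUMN mycol TYPE numeric(3) USING mycol::numeric(3);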
Will this approach be faster/better as compared to the simple "alter table alter column approach" as above
Seems a lot more complicated to me than a simple ALTER. But measurement is key. Create a new test cluster using pgBackRest or whatever you have. Then run your ALTER TABLE and see how long it takes (remember that multiple columns can be changed in a single ALTER TABLE statement).
Cheers,
Greg
On Tue, Aug 13, 2024 at 7:39 PM Greg Sabino Mullane <htamfids@gmail.com> wrote:
"Do a smaller table and extrapolate if you need to. Or promote one of your replicas offline and modify that. I've given you a low-risk / medium-reward option with check constraints, but for the ALTER TABLE options you really need to try it and see (on non-prod)."
Is there any possible method (maybe by looking into the catalog tables/views, etc.) to see the progress of the ALTER statement, so that we can estimate its expected completion time? I understand pg_stat_activity doesn't show any completion percentage for a statement, but I am wondering whether there is any other way to estimate how much time the ALTER command will take to complete in prod.
On Tue, 13 Aug 2024 at 19:39, Greg Sabino Mullane <htamfids@gmail.com> wrote:
1) Detaching the partitions
2) Altering the individual partitions to the required column types and lengths
3) Altering the parent table
4) Attaching the partitions back to the main table
This should be faster and also lets you work in a controlled fashion on each partition individually (a sketch for one partition follows below).
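A rough sketch of steps 1, 2 and 4 for a single partition, with placeholder names and bounds (the parent's column types must already match, per step 3, before the partition can be attached back):
-- 1) detach the partition so it can be altered independently
ALTER TABLE tab1 DETACH PARTITION tab1_p20240801;
-- 2) alter the detached partition (it is a standalone table at this point)
ALTER TABLE tab1_p20240801 ALTER COLUMN vc_col1 TYPE numeric(3) USING vc_col1::numeric(3);
-- 4) attach it back with its original bounds
ALTER TABLE tab1 ATTACH PARTITION tab1_p20240801 FOR VALUES FROM ('2024-08-01') TO ('2024-08-02');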
On 2024-08-14 01:26:36 +0530, Lok P wrote:
> Is there any possible method (maybe by looking into the catalog tables/views etc) to see the progress of the Alter statement by which we can estimate the expected completion time of the "Alter" command? [...]

You could look at the data files. Tables in PostgreSQL are stored as a series of 1GB files, so watching them being created and/or read gives you a pretty good idea about progress.

For example, here is an alter table (changing one column from int to bigint) on a 1.8 GB table on my laptop.

The original table: two data files with 1 and 0.8 GB respectively:
22:26:51 1073741824 Aug 13 22:24 266648
22:26:51  853794816 Aug 13 22:26 266648.1

The operation begins: a data file for the new table appears:
22:26:55 1073741824 Aug 13 22:26 266648
22:26:55  853794816 Aug 13 22:26 266648.1
22:26:55   79298560 Aug 13 22:26 266659

... and grows:
22:26:57 1073741824 Aug 13 22:26 266648
22:26:57  853794816 Aug 13 22:26 266648.1
22:26:57  208977920 Aug 13 22:26 266659

... and grows:
22:26:59 1073741824 Aug 13 22:26 266648
22:26:59  853794816 Aug 13 22:26 266648.1
22:26:59  284024832 Aug 13 22:26 266659

and now the table has exceeded 1 GB, so there's a second file:
22:27:17 1073741824 Aug 13 22:26 266648
22:27:17 1073741824 Aug 13 22:27 266659
22:27:17  853794816 Aug 13 22:27 266648.1
22:27:17    3022848 Aug 13 22:27 266659.1

... and a third:
22:27:44 1073741824 Aug 13 22:26 266648
22:27:44 1073741824 Aug 13 22:27 266659
22:27:44 1073741824 Aug 13 22:27 266659.1
22:27:44  853794816 Aug 13 22:27 266648.1
22:27:44   36798464 Aug 13 22:27 266659.2

almost finished:
22:28:08 1073741824 Aug 13 22:26 266648
22:28:08 1073741824 Aug 13 22:27 266659
22:28:08 1073741824 Aug 13 22:27 266659.1
22:28:08  853794816 Aug 13 22:27 266648.1
22:28:08   36798464 Aug 13 22:28 266659.2

Done: the old table has been reduced to an empty file (not sure why PostgreSQL keeps that around):
22:28:10 1073741824 Aug 13 22:27 266659
22:28:10 1073741824 Aug 13 22:27 266659.1
22:28:10   36798464 Aug 13 22:28 266659.2
22:28:10          0 Aug 13 22:28 266648

Of course you need to be postgres or root to do this. Be careful!

Watching the access times may be useful, too, but on Linux by default the access time is only updated under some special circumstances, so this may be misleading.

        hp

--
Peter J. Holzer | hjp@hjp.at | http://www.hjp.at/
"Story must make more sense than reality." -- Charles Stross, "Creative writing challenge!"
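As an aside, mapping a given partition to its data files (so you know which file names to watch) can be done with pg_relation_filepath; a quick sketch, with a placeholder partition name:
SELECT pg_relation_filepath('tab1_part1'::regclass);
-- returns a path like base/16384/266648, relative to the data directory;
-- additional 1GB segments show up as 266648.1, 266648.2, and so on.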
On Tue, Aug 13, 2024 at 10:54 PM Peter J. Holzer <hjp-pgsql@hjp.at> wrote:
> You could look at the data files. Tables in PostgreSQL are stored as a
> series of 1GB files, so watching them being created and/or read
> gives you a pretty good idea about progress.

Thanks Peter, very insightful. Appreciated. --DD
On Tue, Aug 13, 2024 at 7:39 PM Greg Sabino Mullane <htamfids@gmail.com> wrote:
Hello Greg,
In terms of testing on sample data and extrapolating: I picked the average partition size of the table (~20 GB), created a non-partitioned table with exactly the same columns, populated it with similar data, and created the same set of indexes on it; the underlying hardware is exactly the same as in production. I am seeing that it takes ~5 minutes to alter all four columns on this table. We have ~90 partitions in production with data in them; the remaining few are future partitions and are blank. (Note: I executed the ALTER with work_mem=4GB, maintenance_work_mem=30GB, max_parallel_workers_per_gather=8, max_parallel_maintenance_workers=16.)
So considering the above figures, can I safely assume it will take ~90 * 5 minutes = ~7.5 hours in production, and thus that many hours of downtime are needed for this ALTER? Or do we need to consider any other factors or activity here?
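For reference, the session settings mentioned above would be applied like this before timing the ALTER on the test table:
SET work_mem = '4GB';
SET maintenance_work_mem = '30GB';
SET max_parallel_workers_per_gather = 8;
SET max_parallel_maintenance_workers = 16;
-- then run the ALTER TABLE ... ALTER COLUMN statements being timed in the same session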
> On 15 Aug 2024, at 14:15, Lok P <loknath.73@gmail.com> wrote:
(…)
> So considering the above figures, can I safely assume it will take ~90 * 5 minutes = ~7.5 hours in production and thus that many hours of downtime needed for this alter, or do we need to consider any other factors or activity here?

Are all those partitions critical, or only a relative few?

If that's the case, you could:
1) detach the non-critical partitions
2) take the system down for maintenance
3) update the critical partitions
4) take the system up again
5) update the non-critical partitions
6) re-attach the non-critical partitions

That could shave a significant amount of time off your down-time. I would script the detach and re-attach processes first, to save some extra.

Admittedly, I haven't actually tried that procedure, but I see no reason why it wouldn't work. Apart perhaps from inserts happening that should have gone to some of those detached partitions. Maybe those could be sent to a 'default' partition that gets detached at step 7, after which you can insert+select those from the default into the appropriate partitions?

But you were going to test that first anyway, obviously.

Alban Hertroys
--
There is always an exception to always.
On Thu, Aug 15, 2024 at 9:18 PM Alban Hertroys <haramrae@gmail.com> wrote:
Thank you so much.
The partitioned table on which we are planning to apply the ALTER script is a child of another big partitioned table, and we have the foreign key defined at the table level, not partition to partition. So if we detach the partitions, alter the columns of each detached partition, and then re-attach, will that revalidate the foreign key again? If so, the re-attach step might consume a lot of time. Is my understanding correct here?
On Fri, Aug 16, 2024 at 2:04 AM Lok P <loknath.73@gmail.com> wrote:
Additionally, if we are okay with the ~7.5 hrs of downtime, is my calculation/extrapolation of the total time for the direct ALTER, based on a sample table, accurate? Because in that case I was thinking it's less complex and also less error prone to just do it in a single ALTER command rather than going through the multiple steps of detach, alter, and attach partition.
On Thu, Aug 15, 2024 at 4:41 PM Lok P <loknath.73@gmail.com> wrote:
Additionally, if we are okay with the ~7.5 hrs of downtime, is my calculation/extrapolation of the total time for the direct ALTER, based on a sample table, accurate? Because in that case I was thinking it's less complex and also less error prone to just do it in a single ALTER command rather than going through the multiple steps of detach, alter, and attach partition.
Well, it's meant to get you a ballpark figure, and yes, it seems as though you will probably be okay. But for something this critical involving production downtime, I would try out the exact command and see how long it takes on a test system. Restore a backup (you have backups, I hope) or use pg_basebackup to make a copy of your prod system somewhere.
Cheers,
Greg
On Thu, 15 Aug, 2024, 9:18 pm Alban Hertroys, <haramrae@gmail.com> wrote:
We were checking this strategy, but what we found is that while attaching any historical partition back to the child table, if there are inserts running against the other live partitions of that same child table, the ATTACH just hangs. Also, during this period the parent table (which is also partitioned) takes an exclusive lock on itself!!
Even detaching a partition CONCURRENTLY waits for any in-flight inserts to finish, even those on other partitions. Is this behavior expected?