Thread: alter column type
I have a database in which one table references the primary key of another. The type of the primary key was initially int, but I changed it to bigint. However, I forgot to update the type of a column that references it. So, I've initiated "ALTER TABLE foo ALTER COLUMN bar TYPE bigint", where foo/bar is the table/column referencing the primary key that is now of type bigint.
However, with 2^31 rows, it is taking a "long" time to write the rows (it's been 12 hours). Is there a more efficient way to do this? Even if/when this one finishes, there are other column types that I have to update. This update effectively locked me out of all access to the data anyway, so I don't foresee any concern of writes that might affect integrity.
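For concreteness, the shape of the schema is roughly this (the referenced table name is just a placeholder here; foo/bar are the same placeholders as above):

CREATE TABLE parent (id bigint PRIMARY KEY);            -- primary key, already altered from int to bigint
CREATE TABLE foo (bar integer REFERENCES parent (id));  -- referencing column, still int, ~2^31 rows
ALTER TABLE foo ALTER COLUMN bar TYPE bigint;           -- the statement currently running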
Cheers,
Casey

CAUTION: This is very dangerous and may cause corruption.
*** DO THIS IN A TEST DATABASE FIRST ***
--1. Get the oid for int8 (bigint)
SELECT t.oid
FROM pg_type t
WHERE typname = 'int8';
--2. Get the oid for your table
SELECT c.oid,
c.relname as table,
a.attname,
a.atttypid,
a.*
FROM pg_class c
JOIN pg_namespace n ON (n.oid = c.relnamespace)
JOIN pg_attribute a ON ( a.attrelid = c.oid )
WHERE c.relname = 'dogs'
AND n.nspname = 'public'
AND a.attname = 'good_watchdog'
AND NOT attisdropped;
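--3. Directly update the catalog entry for the column (this is the dangerous part)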
BEGIN;
UPDATE pg_attribute a
SET atttypid = <t.oid from 1>
WHERE a.attrelid = <c.oid from 2>
AND attname = <your column to change>;
COMMIT;
--
Melvin Davidson
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.

On Fri, Jun 5, 2015 at 12:23 PM, Casey Deccio <casey@deccio.net> wrote:
Probably too late for this time, but in the past when I've needed to redefine the type for a column, I've made a dump, edited the dump file to change the type and then renamed the table and reloaded it. That's usually several orders of magnitude faster.
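A rough SQL-only sketch of the same rebuild-and-swap idea, without the dump-file editing (foo/bar as in the original post; indexes, constraints, triggers and grants would have to be re-created separately):

BEGIN;
CREATE TABLE foo_new (LIKE foo INCLUDING DEFAULTS);
ALTER TABLE foo_new ALTER COLUMN bar TYPE bigint;  -- table is still empty, so this is instant
INSERT INTO foo_new SELECT * FROM foo;             -- one sequential load instead of an in-place rewrite
ALTER TABLE foo RENAME TO foo_old;
ALTER TABLE foo_new RENAME TO foo;
COMMIT;
-- once verified: DROP TABLE foo_old;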
--
Mike Nolan
Why is PG even re-writing all rows when the data type is being changed from a smaller (int) to a larger (bigint) type, which automatically means the existing data is safe? Changing from varchar(30) to varchar(50), for example, should involve no rewrite of existing rows.
On Fri, Jun 5, 2015 at 2:36 PM, Michael Nolan <htfoot@gmail.com> wrote:
Actually, not too late. My first ALTER is still running, and I still have four more to go. Sigh. I had thought of this but wasn't sure how it might compare. Thanks for the data point :)
Casey
On 6/5/2015 11:37 AM, Ravi Krishna wrote:
> Why is PG even re-writing all rows when the data type is being changed
> from smaller (int) to larger (bigint) type, which automatically means
> existing data is safe. Like, changing from varchar(30) to varchar(50)
> should involve no rewrite of existing rows.

int to bigint requires a storage change, as all bigints are 64 bit while all ints are 32 bit. it would be a MESS to try and keep track of a table that has some int and some bigint storage of a given field.

now, varchar 30 to 50, that I can't answer; are you sure that does a rewrite? the storage is exactly the same for those.

--
john r pierce, recycling bits in santa cruz
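The 4-byte vs 8-byte difference is easy to check from psql, e.g.:

SELECT pg_column_size(1::integer) AS int_bytes,     -- 4
       pg_column_size(1::bigint)  AS bigint_bytes;  -- 8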
On Fri, Jun 5, 2015 at 2:23 PM, Melvin Davidson <melvin6925@gmail.com> wrote:
CAUTION: This is very dangerous and may cause corruption.
*** DO THIS IN A TEST DATABASE FIRST ***
Thanks for the idea. Since I'm planning to dump the database first anyway (using Michael's suggestion) I'm thinking I'll try this on the live database, after I get it safely dumped. It seemed to work on a test database.
Being unfamiliar with the internals, what's the risk here? If postgres thinks something is a bigint, but previously stored it as an int, does that mean it will try to extract data beyond the boundary of some of the (old) 32-bit values and potentially throw off offsets for other values?
Casey
Perhaps I was not clear. I don't expect any re-write for a change of varchar(30) to 50, for the same reason you mentioned above.

Yes, it is normal to expect the storage size for bigint to be different from 32-bit int, but then PG uses MVCC. If and when the current row gets updated, MVCC will ensure a new row is written, which can fix the data type.

I believe PG adds or drops a col without a rewrite because of MVCC. For example, I add a new col-T to a table and drop col-S via a single ALTER TABLE command. I am assuming this is what happens internally: PG will simply do a dictionary update of the meta tables. So all new rows will reflect col-T, and as and when the old rows get modified, they too will get updated to the new structure.

If my above understanding is correct, why is it not applied in the case of an int -> bigint change?
I will clarify it a bit further: all new rows will have space allocated for col-T and no space allocated for col-S, while existing dormant rows are left unmodified.
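A minimal illustration of the contrast being discussed, using foo/bar from the original post and made-up names for col-T and col-S (and assuming the added column is nullable with no default):

-- Catalog-only changes: these return immediately even on a huge table.
ALTER TABLE foo ADD COLUMN col_t text, DROP COLUMN col_s;

-- Not catalog-only: changing int -> bigint rewrites every row of the table.
ALTER TABLE foo ALTER COLUMN bar TYPE bigint;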
Casey Deccio <casey@deccio.net> writes:
> Being unfamiliar with the internals, what's the risk here? If postgres
> thinks something is a bigint, but previously stored it as an int, does that
> mean it will try to extract data beyond the boundary of some of the (old)
> 32-bit values and potentially throw off offsets for other values?

Yes. This *will* break your table, spectacularly. The other person who was opining that it would work has no understanding of the actual storage layout.

regards, tom lane
On 6/5/2015 11:46 AM, Casey Deccio wrote:
if all 5 ALTERs were to the same table, you should have combined them into one ALTER statement.
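e.g., something along these lines (bar is from the original post; the other column names are made up):

ALTER TABLE foo
    ALTER COLUMN bar  TYPE bigint,
    ALTER COLUMN baz  TYPE bigint,
    ALTER COLUMN quux TYPE bigint;

that way the table only gets rewritten once instead of once per column.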
-- john r pierce, recycling bits in santa cruz
On Fri, Jun 5, 2015 at 4:00 PM, John R Pierce <pierce@hogranch.com> wrote:
Actually, there are six in four different tables.
But... I didn't know I could combine multiple ALTERs into a single statement. Thanks for the heads up.
Casey