Thread: alter column type
I have a database in which one table references the primary key of another. The type of the primary key was initially int, but I changed it to bigint. However, I forgot to update the type of a column that references it. So, I've initiated "ALTER TABLE foo ALTER COLUMN bar TYPE bigint", where foo/bar is the table/column referencing the primary key that is now of type bigint.
However, with 2^31 rows, it is taking a "long" time to write the rows (it's been 12 hours). Is there a more efficient way to do this? Even if/when this one finishes, there are other column types that I have to update. This update effectively locked me out of all access to the data anyway, so I don't foresee any concern of writes that might affect integrity.
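For concreteness, the shape of the schema is roughly this (the referenced table name is just a placeholder here; foo/bar are the same placeholders as above):

CREATE TABLE parent (id bigint PRIMARY KEY);            -- primary key, already altered from int to bigint
CREATE TABLE foo (bar integer REFERENCES parent (id));  -- referencing column, still int, ~2^31 rows
ALTER TABLE foo ALTER COLUMN bar TYPE bigint;           -- the statement currently running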
Cheers,
Casey

CAUTION: This is very dangerous and may cause corruption.
*** DO THIS IN A TEST DATABASE FIRST ***
--1. Get the oid for int8 (bigint)
SELECT t.oid
FROM pg_type t
WHERE typname = 'int8';
--2. Get the oid for your table
SELECT c.oid,
c.relname as table,
a.attname,
a.atttypid,
a.*
FROM pg_class c
JOIN pg_namespace n ON (n.oid = c.relnamespace)
JOIN pg_attribute a ON ( a.attrelid = c.oid )
WHERE c.relname = 'dogs'
AND n.nspname = 'public'
AND a.attname = 'good_watchdog'
AND NOT attisdropped;
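--3. Directly update the catalog entry for the column (this is the dangerous part)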
BEGIN;
UPDATE pg_attribute a
SET atttypid = <t.oid from 1>
WHERE a.attrelid = <c.oid from 2>
AND attname = <your column to change>;
COMMIT;
--
Melvin Davidson
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.

On Fri, Jun 5, 2015 at 12:23 PM, Casey Deccio <casey@deccio.net> wrote:
Probably too late for this time, but in the past when I've needed to redefine the type for a column, I've made a dump, edited the dump file to change the type and then renamed the table and reloaded it. That's usually several orders of magnitude faster.
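A rough SQL-only sketch of the same rebuild-and-swap idea, without the dump-file editing (foo/bar as in the original post; indexes, constraints, triggers and grants would have to be re-created separately):

BEGIN;
CREATE TABLE foo_new (LIKE foo INCLUDING DEFAULTS);
ALTER TABLE foo_new ALTER COLUMN bar TYPE bigint;  -- table is still empty, so this is instant
INSERT INTO foo_new SELECT * FROM foo;             -- one sequential load instead of an in-place rewrite
ALTER TABLE foo RENAME TO foo_old;
ALTER TABLE foo_new RENAME TO foo;
COMMIT;
-- once verified: DROP TABLE foo_old;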
--
Mike Nolan
Why is PG even re-writing all rows when the data type is being changed from a smaller (int) to a larger (bigint) type, which automatically means the existing data is safe? Changing from varchar(30) to varchar(50), for example, should involve no rewrite of existing rows.
On Fri, Jun 5, 2015 at 2:36 PM, Michael Nolan <htfoot@gmail.com> wrote:
Actually, not too late. My first ALTER is still running, and I still have four more to go. Sigh. I had thought of this but wasn't sure how it might compare. Thanks for the data point :)
Casey
On 6/5/2015 11:37 AM, Ravi Krishna wrote:
> Why is PG even re-writing all rows when the data type is being changed
> from smaller (int) to larger (bigint) type, which automatically means
> existing data is safe. Like, changing from varchar(30) to varchar(50)
> should involve no rewrite of existing rows.

int to bigint requires a storage change, as all bigints are 64 bit while all ints are 32 bit. it would be a MESS to try and keep track of a table that has some int and some bigint storage of a given field.

now, varchar 30 to 50, that I can't answer; are you sure that does a rewrite? the storage is exactly the same for those.

--
john r pierce, recycling bits in santa cruz
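The 4-byte vs 8-byte difference is easy to check from psql, e.g.:

SELECT pg_column_size(1::integer) AS int_bytes,     -- 4
       pg_column_size(1::bigint)  AS bigint_bytes;  -- 8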
On Fri, Jun 5, 2015 at 2:23 PM, Melvin Davidson <melvin6925@gmail.com> wrote:
CAUTION: This is very dangerous and may cause corruption.
*** DO THIS IN A TEST DATABASE FIRST ***
Thanks for the idea. Since I'm planning to dump the database first anyway (using Michael's suggestion) I'm thinking I'll try this on the live database, after I get it safely dumped. It seemed to work on a test database.
Being unfamiliar with the internals, what's the risk here? If postgres thinks something is a bigint, but previously stored it as an int, does that mean it will try to extract data beyond the boundary of some of the (old) 32-bit values and potentially throw off offsets for other values?
Casey
Perhaps I was not clear. I don't expect any re-write for a change of varchar(30) to 50, for the same reason you mentioned above.

Yes, it is normal to expect the storage size for bigint to be different from 32-bit int, but then PG uses MVCC. If and when the current row gets updated, MVCC will ensure a new row is written, which can fix the data type.

I believe PG adds or drops a col without a rewrite because of MVCC. For example, I add a new col-T to a table and drop col-S via a single ALTER TABLE command. I am assuming this is what happens internally: PG will simply do a dictionary update of the meta tables. So all new rows will reflect col-T, and as and when the old rows get modified, they too will get updated to the new structure.

If my above understanding is correct, why is it not applied in the case of an int -> bigint change?
I will clarify it a bit further: all new rows will have space allocated for col-T and no space allocated for col-S, while existing dormant rows are left unmodified.
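A minimal illustration of the contrast being discussed, using foo/bar from the original post and made-up names for col-T and col-S (and assuming the added column is nullable with no default):

-- Catalog-only changes: these return immediately even on a huge table.
ALTER TABLE foo ADD COLUMN col_t text, DROP COLUMN col_s;

-- Not catalog-only: changing int -> bigint rewrites every row of the table.
ALTER TABLE foo ALTER COLUMN bar TYPE bigint;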
Casey Deccio <casey@deccio.net> writes:
> Being unfamiliar with the internals, what's the risk here? If postgres
> thinks something is a bigint, but previously stored it as an int, does that
> mean it will try to extract data beyond the boundary of some of the (old)
> 32-bit values and potentially throw off offsets for other values?

Yes. This *will* break your table, spectacularly. The other person who was opining that it would work has no understanding of the actual storage layout.

regards, tom lane
On 6/5/2015 11:46 AM, Casey Deccio wrote:
if all 5 ALTERs were to the same table, you should have combined them into one ALTER statement.
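e.g., something along these lines (bar is from the original post; the other column names are made up):

ALTER TABLE foo
    ALTER COLUMN bar  TYPE bigint,
    ALTER COLUMN baz  TYPE bigint,
    ALTER COLUMN quux TYPE bigint;

that way the table only gets rewritten once instead of once per column.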
-- john r pierce, recycling bits in santa cruz
On Fri, Jun 5, 2015 at 4:00 PM, John R Pierce <pierce@hogranch.com> wrote:
Actually, there are six in four different tables.
But... I didn't know I could combine multiple ALTERs into a single statement. Thanks for the heads up.
Casey