Thread: Oracle migration : size on disk of data file greater in PG
Hi, At a customer site, we've made a migration from Oracle 8.1.5 to PGSQL 8.1.1. The migration happened without any problem and now the performances are better with PG than with Ora, but the customer noticed that the size of PG on disk where much greater than the size on disk of Oracle. And I'm not able to find an easy explanation. Is it normal, due to inner data storage mecanisms differents between Oracle and PG ? Of course, we've run VACUUM on both DB before measuring the size on disk. The database is used to store statistical data by month and therefore contain dozen of tables of the same layout containing most of the time hundred of thousands records. Thank you, Benoit Gerrienne
Hi Benoit, Are you talking specifically the database? You can run up 100's of gigs of log files in pg_log if you're not careful (he says after doing the very same). Cheers Chris -----Original Message----- From: Benoit.Gerrienne@BULL.BE [mailto:Benoit.Gerrienne@BULL.BE] Sent: 19 September 2006 10:01 To: pgsql-general@postgresql.org Subject: [GENERAL] Oracle migration : size on disk of data file greater in PG Hi, At a customer site, we've made a migration from Oracle 8.1.5 to PGSQL 8.1.1. The migration happened without any problem and now the performances are better with PG than with Ora, but the customer noticed that the size of PG on disk where much greater than the size on disk of Oracle. And I'm not able to find an easy explanation. Is it normal, due to inner data storage mecanisms differents between Oracle and PG ? Of course, we've run VACUUM on both DB before measuring the size on disk. The database is used to store statistical data by month and therefore contain dozen of tables of the same layout containing most of the time hundred of thousands records. Thank you, Benoit Gerrienne ---------------------------(end of broadcast)--------------------------- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq ----------------------------------------- The information contained in this email is confidential and is intended for the recipient only. If you have received it in error, please notify us immediately by reply email and then delete it from your system. Please do not copy it or use it for any purposes, or disclose its contents to any other person or store or copy this information in any medium. The views contained in this email are those of the author and not necessarily those of Lorien plc. Thank you for your co-operation.
Benoit.Gerrienne@BULL.BE writes: > At a customer site, we've made a migration from Oracle 8.1.5 to PGSQL > 8.1.1. The migration happened without any problem and now the performances > are better with PG than with Ora, but the customer noticed that the size > of PG on disk where much greater than the size on disk of Oracle. And I'm > not able to find an easy explanation. > > Is it normal, due to inner data storage mecanisms differents between > Oracle and PG ? To a certain degree yes. It's one of the topics under active discussion for improvement in the future. In particular you'll see a big difference if you have a lot of very small columns. You may also see some difference if you have very narrow rows because of the transaction status overhead. > Of course, we've run VACUUM on both DB before measuring the size on disk. There's a couple problems here though. Firstly VACUUM doesn't usually shrink the actual size of data on disk, it just notes where the free space is so it can be reused. To shrink the actual data on disk you would need VACUUM FULL, CLUSTER, or ALTER TABLE ... ALTER COLUMN ... TYPE USING. However under normal operation Postgres expects to have some amount of free space anyways. Running VACUUM FULL is usually pointless and actually hurts performance unless you have an unusual situation such as having done large batch updates recently or not having run VACUUM regularly enough in the past. Because Postgres keeps free space around in the tables for new versions of tuples you should include Oracle's rollback segments in the comparison since that effectively corresponds to the free space Postgres keeps. Or you could do a VACUUM FULL before comparing but I do not recommend VACUUM FULL for regular operation. > The database is used to store statistical data by month and therefore > contain dozen of tables of the same layout containing most of the time > hundred of thousands records. You may find it makes more sense to store this all in one table or in tables that are children of the same table such as described in: http://www.postgresql.org/docs/8.1/static/ddl-partitioning.html -- greg