Thread: too slow
I recently migrated from MySQL. The database size in MySQL was 1.4GB (it is a static database). It generated a dump file (.sql) of size 8GB, and it took 2 days to import the whole thing into Postgres. After all that, the response from Postgres is a disaster: it took 40 seconds to run "select count(logrecno) from sf10001;", which returned a value of 197569, and it took forever to display the table. How can I optimize the database so that I can expect faster access to the data?

Each table has 70 cols x 197569 rows (static data); I have 40 tables like that, everything static.

System configuration: P4 2.8GHz, 512MB RAM. OS: XP. Postgres version: 8.0.

Thanks a million in advance,
shashi
Shashi Gireddy wrote:
> I recently migrated from MySQL. The database size in MySQL was 1.4GB
> (it is a static database). [...] It took 40 seconds to run
> "select count(logrecno) from sf10001;" [...] How can I optimize the
> database so that I can expect faster access to the data?

First of all, you should run VACUUM FULL ANALYZE on all the tables (http://www.postgresql.org/docs/8.0/interactive/sql-vacuum.html) -- this should solve the problem. However, you should also think about changing the table structure, because PostgreSQL needs different indexes than MySQL. A few months ago I had the same problem, but after vacuuming and creating proper indexes, everything worked like a charm. Believe me, you can achieve the same speed -- it is only a matter of good db structure and environment settings (http://www.postgresql.org/docs/8.0/interactive/runtime.html).

Regards,
ML
On Wed, 2005-02-09 at 11:01, Marek Lewczuk wrote:
> First of all, you should run VACUUM FULL ANALYZE on all the tables
> [...] However, you should also think about changing the table
> structure, because PostgreSQL needs different indexes than MySQL.

This user may also benefit from better normalization, as I remember him having lots of fields in each table. Note that since PostgreSQL has to hit the tables AFTER hitting the indexes due to MVCC, having wide, denormalized tables with indexes is a sub-optimal solution, while narrower tables are often much faster.

It sounded to me like what the OP was doing was more fittingly described as "batch processing" and might well be better handled without a database at all.
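For reference, the advice in this subthread can be sketched as SQL. This is a minimal, illustrative pass for a static, read-mostly 8.0 database; the table and column names come from the OP's example, but which columns actually deserve indexes depends on the real queries:

```sql
-- Reclaim dead space left over from the bulk import and refresh
-- the planner's statistics for every table in the database.
VACUUM FULL ANALYZE;

-- Index the columns that actually appear in WHERE clauses.
-- (logrecno is taken from the OP's example query; a real workload
-- needs indexes matched to its own predicates.)
CREATE INDEX sf10001_logrecno_idx ON sf10001 (logrecno);
```

Note that an unqualified "select count(...)" over the whole table cannot be answered from an index alone in PostgreSQL: MVCC visibility is stored in the heap, so counting all rows of a wide 197,569-row table always costs a full table read. Narrower tables, or a qualified WHERE clause that can use an index, are the practical ways to speed that up.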
All:

Puzzling PG 7.4 behavior. Began yesterday, running under RedHat ES 2.1.

The first symptom was many query failures, along with the message that a server process had crashed, possibly corrupting shared memory. These messages kept coming. I shut down the postmaster and did ipcclean.

Started the postmaster again, but then found index problems with certain tables. Records that we knew were there would not be found via index scan queries. We then tried to reindex, and that failed as follows:

DETAIL: Table contains duplicated values.

This on a single column with a unique constraint, populated via a sequence!

We searched for and found the duplicates and pulled them into a separate table, deleted them from the original table, then reindexed. Looking at the duplicates, they look for all the world like multiple copies of the same row. Clearly the same record, but in each case the one with the later timestamp had "more" data. So we picked the latest records by timestamp and brought them back in.

We've found similar instances on other indexes, but let me start there.

Any ideas? Can MVCC "leak" stale rows back into the "live" space?

-- sgl

--------
Steve Lane
Vice President
Soliant Consulting, Inc.
(312) 850-3830 (V)
(312) 850-3930 (F)
slane@soliantconsulting.com
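The cleanup procedure described above can be sketched in SQL along these lines. The names are hypothetical ("mytable" for the affected table, "dup_id" for the sequence-driven unique column, "ts" for the timestamp column):

```sql
-- 1. Copy every row whose supposedly-unique key occurs more than
--    once into a side table for inspection.
SELECT * INTO duplicates
  FROM mytable
 WHERE dup_id IN (SELECT dup_id FROM mytable
                   GROUP BY dup_id HAVING count(*) > 1);

-- 2. Remove those rows from the original table, then rebuild the
--    unique index now that the violations are gone.
DELETE FROM mytable
 WHERE dup_id IN (SELECT dup_id FROM duplicates);
REINDEX TABLE mytable;

-- 3. Re-insert only the latest version of each duplicated row,
--    picking the one with the most recent timestamp.
INSERT INTO mytable
SELECT d.* FROM duplicates d
 WHERE d.ts = (SELECT max(ts) FROM duplicates
                WHERE dup_id = d.dup_id);
```

This recovers a consistent table but does not address the underlying corruption; keeping the "duplicates" side table around for later forensics (as the OP did) is prudent.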
Steve Lane wrote:
> Puzzling PG 7.4 behavior. Began yesterday, running under RedHat ES 2.1.

Which version of 7.4?

--
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
PostgreSQL support, programming, shared hosting and dedicated hosting.
+1-503-667-4564 - jd@commandprompt.com - http://www.commandprompt.com
PostgreSQL Replicator -- production quality replication for PostgreSQL
Sorry -- 7.4.5

-- sgl

> From: "Joshua D. Drake" <jd@commandprompt.com>
> Date: Wed, 09 Feb 2005 17:26:12 -0800
> To: Steve Lane <slane@soliantconsulting.com>
> Cc: <pgsql-admin@postgresql.org>, Jesse LaVere <jlavere@soliantconsulting.com>
> Subject: Re: [ADMIN] PG 7.4: duplicate rows in violation of unique constraint
>
> Which version of 7.4?
Steve Lane <slane@soliantconsulting.com> writes:
> Sorry -- 7.4.5

I'm wondering if this could be related to this post-7.4.5 fix:

2004-10-13 18:22  tgl

	* contrib/pgstattuple/pgstattuple.c,
	src/backend/access/heap/heapam.c,
	src/backend/utils/adt/ri_triggers.c (REL7_4_STABLE): Repair
	possible failure to update hint bits back to disk, per
	http://archives.postgresql.org/pgsql-hackers/2004-10/msg00464.php.

What you are describing sure sounds like a foulup in commit status marking for multiple versions of a row. I'm not convinced it would be fixed by the abovementioned patch, but it sounds close.

			regards, tom lane
Tom:

Thanks. What's odd here is that we seemed to have none of them, and then found a bunch immediately after processes started crashing.

This is not the first time this has happened -- we had an outbreak once before. It's quite possible that we had these lurking in there for a while, and only found them during the checks we did after the process crashes.

Does 7.4.6 incorporate the patch for this?

-- sgl

> From: Tom Lane <tgl@sss.pgh.pa.us>
> Date: Wed, 09 Feb 2005 23:50:47 -0500
> Subject: Re: [ADMIN] PG 7.4: duplicate rows in violation of unique
>
> I'm wondering if this could be related to this post-7.4.5 fix:
> [...]
> What you are describing sure sounds like a foulup in commit status
> marking for multiple versions of a row. I'm not convinced it would be
> fixed by the abovementioned patch, but it sounds close.
> From: Tom Lane <tgl@sss.pgh.pa.us>
> Date: Wed, 09 Feb 2005 23:50:47 -0500
> Subject: Re: [ADMIN] PG 7.4: duplicate rows in violation of unique
>
> What you are describing sure sounds like a foulup in commit status
> marking for multiple versions of a row.

By the way -- this problem looks very much like one I reported a few months ago. I'll replay that email here:

> I'm hoping I'm wrong, but I think I'm looking at a corrupted table or
> database. We have a column that's sequence-driven, with a unique index,
> but we're seeing duplicate values in that column. We're also seeing
> duplicate OIDs in the affected table. And we're seeing this ominous
> message:
>
> iep_db=# select distinct * into lavere_form_004 from iep_form_004;
> ERROR: missing chunk number 0 for toast value 42551774

We're getting the same thing tonight -- the same "missing chunk number XX for toast value YYYY" error. If this is some form of MVCC "leakage", can that also explain this error? Would the "leaked" rows somehow be incomplete, in the sense of not linking correctly to their corresponding toast data?

-- sgl
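When chasing suspected MVCC leakage like this, PostgreSQL's hidden system columns can help establish whether the "duplicates" are really multiple heap versions of one logical row. A diagnostic sketch (the table and key names are hypothetical; ctid, xmin, xmax, and oid are the real system columns -- oid only if the table was created with OIDs, the 7.4 default):

```sql
-- ctid is the physical tuple location; xmin/xmax are the transaction
-- IDs that created/deleted each tuple version. Two "duplicate" rows
-- with the same key but different ctid and xmin are distinct heap
-- tuples -- i.e. old row versions that have incorrectly become
-- visible again, which fits the commit-status-marking theory.
SELECT ctid, xmin, xmax, oid, dup_id, ts
  FROM mytable
 WHERE dup_id IN (SELECT dup_id FROM mytable
                   GROUP BY dup_id HAVING count(*) > 1)
 ORDER BY dup_id, xmin;
```

Capturing this output before deleting the duplicates preserves the evidence needed to correlate the breakage with a particular transaction range.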