Thread: vacuum, dead rows, usual solutions didn't help
hi, i have a postgresql-8.2.4 db, and vacuuming it does not remove the dead rows basically, the problem is this part of the vacuum-output: " HINT: Close open transactions soon to avoid wraparound problems. INFO: vacuuming "public.sessions" INFO: scanned index "sessions_pkey" to remove 2 row versions DETAIL: CPU 0.60s/0.25u sec elapsed 61.57 sec. INFO: "sessions": removed 2 row versions in 2 pages DETAIL: CPU 0.00s/0.00u sec elapsed 0.00 sec. INFO: index "sessions_pkey" now contains 6157654 row versions in 52923 pages DETAIL: 0 index row versions were removed. 0 index pages have been deleted, 0 are currently reusable. CPU 0.00s/0.00u sec elapsed 0.00 sec. INFO: "sessions": found 2 removable, 6157654 nonremovable row versions in 478069 pages DETAIL: 6155746 dead row versions cannot be removed yet. There were 8735 unused item pointers. 107 pages contain useful free space. 0 pages are entirely empty. CPU 6.02s/1.58u sec elapsed 598.05 sec. WARNING: oldest xmin is far in the past HINT: Close open transactions soon to avoid wraparound problems. INFO: vacuuming "pg_toast.pg_toast_5525738" INFO: index "pg_toast_5525738_index" now contains 13957669 row versions in 38328 pages DETAIL: 0 index row versions were removed. 0 index pages have been deleted, 0 are currently reusable. CPU 0.40s/0.04u sec elapsed 22.26 sec. INFO: "pg_toast_5525738": found 0 removable, 13957669 nonremovable row versions in 3461686 pages DETAIL: 13938280 dead row versions cannot be removed yet. There were 154 unused item pointers. 69 pages contain useful free space. 0 pages are entirely empty. CPU 39.95s/6.19u sec elapsed 1139.50 sec. INFO: analyzing "public.sessions" INFO: "sessions": scanned 3000 of 478438 pages, containing 12 live rows and 38419 dead rows; 12 rows in sample, 1914 estimated total rows INFO: free space map contains 26849 pages in 444 relations DETAIL: A total of 30736 page slots are in use (including overhead). 30736 page slots are required to track all free space. Current limits are: 153600 page slots, 1000 relations, using 1005 kB. " (the full vacuum-log is at http://www.nekomancer.net/tmp/vacuum.txt) the "sessions" table hold session-data for a web-application (the code uses the perl Apache::Session module btw.), so it changes very often, and is vacuumed every hour (using a cronjob). previously we were running this application with postgresql-7.4, and there the vacuuming worked fine. now we migrated this to postgresql-8.2.4, and it does not want to vacuum it properly. the migration to 8.2.4 happened approx. one month ago, and this dead-row count has been growing since then. what i tried: ps aux | grep postgres on the db-server, and found some connections that were quite old. i restarted the applications that "caused" those connections, so right now there are no too old connections. pg_stat_activity: the query_start of every entry is on today, for the entries with null query_start the postgres processes are not older than 2 days. on the db-server, 4 postgres processes are "idle in transaction", but none is older than 2 days. in pg_locks, all the locks that are for the "sessions" table are from "young" (today-created) connections, and their locks are RowShareLock or AccessShareLock. so currently i am out of ideas what to check... well, actually there is one more idea: maybe the autovacuuming process somehow "conflicts" with the manual-vacuuming cronjob? is that possible? any other ideas? thanks, gabor
=?iso-8859-1?Q?G=E1bor?= Farkas <gabor@nekomancer.net> writes: > basically, the problem is this part of the vacuum-output: > INFO: "sessions": found 2 removable, 6157654 nonremovable row versions > in 478069 pages > DETAIL: 6155746 dead row versions cannot be removed yet. The problem is that you've got some old open transactions that could potentially see those recently-dead rows, so VACUUM can't remove the rows without breaking MVCC rules for those transactions. Find the clients that are holding open transactions, and zap 'em. > previously we were running this application with postgresql-7.4, and > there the vacuuming worked fine. now we migrated this to > postgresql-8.2.4, and it does not want to vacuum it properly. I wonder whether you updated the client-side support libraries? regards, tom lane
Gábor Farkas wrote: > hi, > > i have a postgresql-8.2.4 db, > > and vacuuming it does not remove the dead rows > > > basically, the problem is this part of the vacuum-output: > on the db-server, 4 postgres processes are "idle in transaction", but > none is older than 2 days. If you have something idle in transaction, your vacuums are useless. You need to fix your app. Sincerely, Joshua D. Drake
Tom Lane wrote: > =?iso-8859-1?Q?G=E1bor?= Farkas <gabor@nekomancer.net> writes: >> basically, the problem is this part of the vacuum-output: > >> INFO: "sessions": found 2 removable, 6157654 nonremovable row versions >> in 478069 pages >> DETAIL: 6155746 dead row versions cannot be removed yet. > > The problem is that you've got some old open transactions that could > potentially see those recently-dead rows, so VACUUM can't remove the > rows without breaking MVCC rules for those transactions. > > Find the clients that are holding open transactions, and zap 'em. well, that's the problem :-) if i do a "ps aux | grep postgres" on the db-server, then (ignoring the bin/postgres, writer-process and stats-collector-process) then the oldest process is 3 days old. but this dead-row-issue we have since weeks. is it possible that a recently opened connection is blocking the vacuuming of older-dead-rows? > >> previously we were running this application with postgresql-7.4, and >> there the vacuuming worked fine. now we migrated this to >> postgresql-8.2.4, and it does not want to vacuum it properly. > > I wonder whether you updated the client-side support libraries? yes, they were updated too. a related question: on the db-server, let's say i have 3 databases: A, B, C. if i have problems vacuuming B, then i only have to look for processes/transactions touching B, correct? in other words, if i do a "ps aux | grep postgres", then i see also the username and the database-name in the process-list. and if a postgres-process is not touching B, then i do not have to check it. is this correct? gabor
Joshua D. Drake wrote: > Gábor Farkas wrote: >> hi, >> >> i have a postgresql-8.2.4 db, >> >> and vacuuming it does not remove the dead rows >> >> >> basically, the problem is this part of the vacuum-output: > >> on the db-server, 4 postgres processes are "idle in transaction", but >> none is older than 2 days. > > If you have something idle in transaction, your vacuums are useless. You > need to fix your app. > maybe i described it the wrong way: there is only 1 process, that is constantly "idle in transaction". the remaining 3 were only idle-in-transaction at that point. so if i would keep checking for idle-in-transaction processes, the list of them would keep changing. are you saying, that a process should NEVER be idle-in-transaction? not even for a short time? (like some seconds?) also, even if it is wrong, can an 'idle-in-transaction' connection that was opened today block the vacuuming of rows that were deleted yesterday? thanks, gabor
On Thu, 2008-01-10 at 07:52 +0100, Gábor Farkas wrote: > the remaining 3 were only idle-in-transaction at that point. so if i > would keep checking for idle-in-transaction processes, the list of them > would keep changing. > > are you saying, that a process should NEVER be idle-in-transaction? not > even for a short time? (like some seconds?) It's OK to be idle-in-transaction, but not OK for that state to last for days. > also, even if it is wrong, can an 'idle-in-transaction' connection that > was opened today block the vacuuming of rows that were deleted yesterday? Yes, if the rows were deleted after the connection started. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com
Simon Riggs wrote: > >> also, even if it is wrong, can an 'idle-in-transaction' connection that >> was opened today block the vacuuming of rows that were deleted yesterday? > > Yes, if the rows were deleted after the connection started. > to avoid any potential misunderstandings, i will summarize the situation: 1. the vacuum-cronjob refuses to remove dead rows since 1.jan.2008. 2. i know that no postgres-process is older than 7.jan.2008. (from "ps aux | grep postgres", and except the postgres-system-processes) how can this happen? gabor
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 10 Jan 2008, at 11:18, Gábor Farkas wrote: > Simon Riggs wrote: >>> also, even if it is wrong, can an 'idle-in-transaction' >>> connection that was opened today block the vacuuming of rows that >>> were deleted yesterday? >> Yes, if the rows were deleted after the connection started. > > to avoid any potential misunderstandings, i will summarize the > situation: > > 1. the vacuum-cronjob refuses to remove dead rows since 1.jan.2008. > > 2. i know that no postgres-process is older than 7.jan.2008. (from > "ps aux | grep postgres", and except the postgres-system-processes) > > how can this happen? > > To my understanding the question how old the processes are is only partially of interest, if a process touches those rows in a transaction just before the vacuum runs, it can't remove those rows. So all you need to get in this situation is a transaction that touches the rows in a transaction and keeps the transaction alive before vacuum runs. - -- Viele Grüße, Lars Heidieker lars@heidieker.de http://paradoxon.info - ------------------------------------ Mystische Erklärungen: Die mystischen Erklärungen gelten für tief; die Wahrheit ist, dass sie noch nicht einmal oberflächlich sind. -- Friedrich Nietzsche [ Die Fröhliche Wissenschaft Buch 3, 126 ] -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.7 (Darwin) iD8DBQFHhfwXcxuYqjT7GRYRAoKcAKCZgW/RI9rWN0/Gkd+c7F3T4WmV0gCg4Y6p VBxOBw50HJYsHBPFUjuaPa4= =8d+w -----END PGP SIGNATURE-----
On Thu, 2008-01-10 at 11:18 +0100, Gábor Farkas wrote: > Simon Riggs wrote: > > > >> also, even if it is wrong, can an 'idle-in-transaction' connection that > >> was opened today block the vacuuming of rows that were deleted yesterday? > > > > Yes, if the rows were deleted after the connection started. > > > > to avoid any potential misunderstandings, i will summarize the situation: > > 1. the vacuum-cronjob refuses to remove dead rows since 1.jan.2008. > > 2. i know that no postgres-process is older than 7.jan.2008. (from "ps > aux | grep postgres", and except the postgres-system-processes) > > how can this happen? They might be different set of dead rows, just roughly the same numbers each day. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com
On Jan 10, 2008, at 6:01 AM, Simon Riggs wrote: > On Thu, 2008-01-10 at 11:18 +0100, Gábor Farkas wrote: >> Simon Riggs wrote: >>> >>>> also, even if it is wrong, can an 'idle-in-transaction' >>>> connection that >>>> was opened today block the vacuuming of rows that were deleted >>>> yesterday? >>> >>> Yes, if the rows were deleted after the connection started. >>> >> >> to avoid any potential misunderstandings, i will summarize the >> situation: >> >> 1. the vacuum-cronjob refuses to remove dead rows since 1.jan.2008. >> >> 2. i know that no postgres-process is older than 7.jan.2008. (from >> "ps >> aux | grep postgres", and except the postgres-system-processes) >> >> how can this happen? > > They might be different set of dead rows, just roughly the same > numbers > each day. Or, put another way, this is probably the same problem recurring, not one constant instance of the issue. Erik Jones DBA | Emma® erik@myemma.com 800.595.4401 or 615.292.5888 615.292.0777 (fax) Emma helps organizations everywhere communicate & market in style. Visit us online at http://www.myemma.com
Erik Jones wrote: > > On Jan 10, 2008, at 6:01 AM, Simon Riggs wrote: > >> On Thu, 2008-01-10 at 11:18 +0100, Gábor Farkas wrote: >>> Simon Riggs wrote: >>>> >>>>> also, even if it is wrong, can an 'idle-in-transaction' connection >>>>> that >>>>> was opened today block the vacuuming of rows that were deleted >>>>> yesterday? >>>> >>>> Yes, if the rows were deleted after the connection started. >>>> >>> >>> to avoid any potential misunderstandings, i will summarize the >>> situation: >>> >>> 1. the vacuum-cronjob refuses to remove dead rows since 1.jan.2008. >>> >>> 2. i know that no postgres-process is older than 7.jan.2008. (from "ps >>> aux | grep postgres", and except the postgres-system-processes) >>> >>> how can this happen? >> >> They might be different set of dead rows, just roughly the same numbers >> each day. > > Or, put another way, this is probably the same problem recurring, not > one constant instance of the issue. unfortunately, i do not think that's the case, here is why: this vacuum-process is running every hour, and i have the logs from roughly 450 vacuum runs. so, for one specific table, that had these unremovable rows: the number of "removable dead rows" was between 0 and 11, and the number of "unremovable dead rows" grew by a number between 0 and 41106 every hour (it was three times zero, and the rest was between 86 and 41106). so i do not think it happened with different rows, just roughly the same number. on the good side, we changed the code for that one process, that kept being in "idle in transaction", and now the vacuuming works nicely. and this is still a mystery for me, because i understand that idle-in-transaction is wrong, but even so, a process that i start today, in my opinion simply cannot block the recovery of dead rows, that were deleted yesterday. but i'm probably misunderstanding something, so if i will have some more time for this in the future, i will read more about mvcc, and maybe start a thread here :-) thanks for all your help, gabor
=?ISO-8859-1?Q?G=E1bor_Farkas?= <gabor@nekomancer.net> writes: > and this is still a mystery for me, because i understand that > idle-in-transaction is wrong, but even so, a process that i start today, > in my opinion simply cannot block the recovery of dead rows, that were > deleted yesterday. Well, it's not one but two longest-transaction-lifetimes. That is, take the oldest transaction that's running now, and consider the oldest transaction that was running when it started. VACUUM can reclaim rows that were deleted by transactions that started before that one. You didn't explain your application's behavior exactly, but does that help? regards, tom lane