Thread: temporary tables, and lots of 0 byte files
I have an application which, during the normal course of its operation, creates a single temporary table. BEGIN; -- bunch of stuff here CREATE TEMPORARY TABLE results ON COMMIT DROP AS SELECT <big huge select here> -- a bit more stuff here ROLLBACK; In this case, *every* transaction ends in a rollback. Here is the problem: When I started benchmarking this application, I noticed it (postgresql) started chewing up inodes on the logical volume which houses /var/lib/pgsql. It chewed up a few thousand inodes, and then a few thousand more, and then there were no more. I investigated /var/lib/pgsql/data/ and found within the database's directory many thousands of 0-byte files. I stopped the application expecting at the end of a /session/ that the inodes would be unlinked, but there was no change. lsof told me nobody had them open. However, a start/stop of postgresql removed (at least most of) the 0-byte files. After some trial and error, I straced the right process (the bgwriter process) and found that, upon signal to leave, it would write some checkpoint-y stuff and then proceed about unlinking most or all of the 0-byte files. I traced this to, I think, src/backend/access/transam/slru.c in the SimpleLruFlush routine. My guess (probably wrong) is that there is no way to signal to the LRU cache that a file is /really/ no longer needed without flushing all of them. Since creating a temporary table results in (probably several) files that are created, these files are not removed until either a) room for them is necessary in the LRU cache or, b) the server is shut down. I probably have this wrong. What can be done to remove all of the files associated with dropped (temporary) tables _when_ the they are dropped? I'm using 8.4.5. -- Jon
Jon Nelson <jnelson+pgsql@jamponi.net> writes: > Here is the problem: When I started benchmarking this application, I > noticed it (postgresql) started chewing up inodes on the logical > volume which houses /var/lib/pgsql. It chewed up a few thousand > inodes, and then a few thousand more, and then there were no more. > I investigated /var/lib/pgsql/data/ and found within the database's > directory many thousands of 0-byte files. I stopped the application > expecting at the end of a /session/ that the inodes would be unlinked, > but there was no change. lsof told me nobody had them open. However, a > start/stop of postgresql removed (at least most of) the 0-byte files. > After some trial and error, I straced the right process (the bgwriter > process) and found that, upon signal to leave, it would write some > checkpoint-y stuff and then proceed about unlinking most or all of the > 0-byte files. This is not a bug; it's normal behavior when dropping a table. The table file is truncated to zero bytes at commit, but it's not unlinked until the next checkpoint. I surmise that you may have a very long checkpoint cycle time selected. > What can be done to remove all of the files associated with dropped > (temporary) tables _when_ the they are dropped? Nothing other than forcing a checkpoint. There are race-condition-related reasons for doing it like this, which I don't have at the top of my brain, but you can find them in the archives if you care enough. regards, tom lane
On 11/22/10, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Jon Nelson <jnelson+pgsql@jamponi.net> writes: >> Here is the problem: When I started benchmarking this application, I >> noticed it (postgresql) started chewing up inodes on the logical >> volume which houses /var/lib/pgsql. It chewed up a few thousand >> inodes, and then a few thousand more, and then there were no more. > >> I investigated /var/lib/pgsql/data/ and found within the database's >> directory many thousands of 0-byte files. I stopped the application >> expecting at the end of a /session/ that the inodes would be unlinked, >> but there was no change. lsof told me nobody had them open. However, a >> start/stop of postgresql removed (at least most of) the 0-byte files. >> After some trial and error, I straced the right process (the bgwriter >> process) and found that, upon signal to leave, it would write some >> checkpoint-y stuff and then proceed about unlinking most or all of the >> 0-byte files. > > This is not a bug; it's normal behavior when dropping a table. The > table file is truncated to zero bytes at commit, but it's not unlinked > until the next checkpoint. I surmise that you may have a very long > checkpoint cycle time selected. Indeed I do. An explicit CHECKPOINT also clears the inodes up. I was very surprised to chew threw some 16 thousand inodes in a minute or two. Thanks! -- Jon