Thread: Re: [PATCHES] smartvacuum() instead of autovacuum
Hitoshi Harada wrote: > I am trying to implement smartvacuum(), which do vacuum only tables > having many dead rows, instead of autovacuum. How is this different from what autovacuum does? -- Peter Eisentraut http://developer.postgresql.org/~petere/
Hi, Peter, > How is this different from what autovacuum does? My application needs to do vacuum by itself, while autovacuum does it as daemon. The database is updated so frequently that normal vacuum costs too much and tables to be updated are not so many as the whole database is vacuumed. I want to use autovacuum except the feature of daemon, but want to control when to vacuum and which table to vacuum. So, nothing is different between autovacuum and smartvacuum(), but former is daemon and later is user function. c.f. I ran autovacuum before, and my batch script did vacuum while autovacuum did one as well on the other session at the same time. I found the vacuum analyze conflicts each other sometime... so I want to control vacuum my self. http://archives.postgresql.org/pgsql-bugs/2002-12/msg00198.php http://archives.postgresql.org/pgsql-general/2004-05/msg00015.php Regards, Hitoshi Harada > -----Original Message----- > From: Peter Eisentraut [mailto:peter_e@gmx.net] > Sent: Sunday, October 22, 2006 10:08 PM > To: Hitoshi Harada > Cc: pgsql-hackers@postgresql.org > Subject: Re: [PATCHES] smartvacuum() instead of autovacuum > > Hitoshi Harada wrote: > > I am trying to implement smartvacuum(), which do vacuum only tables > > having many dead rows, instead of autovacuum. > > How is this different from what autovacuum does? > > -- > Peter Eisentraut > http://developer.postgresql.org/~petere/
"Hitoshi Harada" <hitoshi_harada@forcia.com> writes: >> How is this different from what autovacuum does? > My application needs to do vacuum by itself, while > autovacuum does it as daemon. > The database is updated so frequently that > normal vacuum costs too much and tables to be updated are > not so many as the whole database is vacuumed. > I want to use autovacuum except the feature of daemon, > but want to control when to vacuum and which table to vacuum. > So, nothing is different between autovacuum and smartvacuum(), > but former is daemon and later is user function. This seems completely unconvincing. What are you going to do that couldn't be done by autovacuum? regards, tom lane
Ok, But my point is, autovacuum may corrupt with vacuum analyze command on another session. My intention of smartvacuum() is based on this. Any solution for this?? Regards, Hitoshi Harada > -----Original Message----- > From: pgsql-hackers-owner@postgresql.org > [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Tom Lane > Sent: Monday, October 23, 2006 11:10 AM > To: Hitoshi Harada > Cc: 'Peter Eisentraut'; pgsql-hackers@postgresql.org > Subject: Re: [HACKERS] [PATCHES] smartvacuum() instead of autovacuum > > "Hitoshi Harada" <hitoshi_harada@forcia.com> writes: > >> How is this different from what autovacuum does? > > > My application needs to do vacuum by itself, while > > autovacuum does it as daemon. > > The database is updated so frequently that > > normal vacuum costs too much and tables to be updated are > > not so many as the whole database is vacuumed. > > I want to use autovacuum except the feature of daemon, > > but want to control when to vacuum and which table to vacuum. > > So, nothing is different between autovacuum and smartvacuum(), > > but former is daemon and later is user function. > > This seems completely unconvincing. What are you going to do that > couldn't be done by autovacuum? > > regards, tom lane > > ---------------------------(end of broadcast)--------------------------- > TIP 5: don't forget to increase your free space map settings
If the decision to vacuum based on autovacuum criteria is good enough for you then I think you should just focus on getting autovac to do what you want it to do. Perhaps you just need to decrease the sleep time to a few seconds, so that autovac will quickly detect when something needs to be vacuumed. The only case I can think of where autovac might not work as well as smartvacuum would be if you had a lot of databases in the cluster, since autovacuum will only vacuum one database at a time. On Mon, Oct 23, 2006 at 11:18:39AM +0900, Hitoshi Harada wrote: > Ok, > > But my point is, autovacuum may corrupt with vacuum analyze command > on another session. My intention of smartvacuum() is based on this. > Any solution for this?? > > Regards, > > > Hitoshi Harada > > > -----Original Message----- > > From: pgsql-hackers-owner@postgresql.org > > [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Tom Lane > > Sent: Monday, October 23, 2006 11:10 AM > > To: Hitoshi Harada > > Cc: 'Peter Eisentraut'; pgsql-hackers@postgresql.org > > Subject: Re: [HACKERS] [PATCHES] smartvacuum() instead of autovacuum > > > > "Hitoshi Harada" <hitoshi_harada@forcia.com> writes: > > >> How is this different from what autovacuum does? > > > > > My application needs to do vacuum by itself, while > > > autovacuum does it as daemon. > > > The database is updated so frequently that > > > normal vacuum costs too much and tables to be updated are > > > not so many as the whole database is vacuumed. > > > I want to use autovacuum except the feature of daemon, > > > but want to control when to vacuum and which table to vacuum. > > > So, nothing is different between autovacuum and smartvacuum(), > > > but former is daemon and later is user function. > > > > This seems completely unconvincing. What are you going to do that > > couldn't be done by autovacuum? > > > > regards, tom lane > > > > ---------------------------(end of broadcast)--------------------------- > > TIP 5: don't forget to increase your free space map settings > > > ---------------------------(end of broadcast)--------------------------- > TIP 5: don't forget to increase your free space map settings > -- Jim Nasby jim@nasby.net EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
"Jim C. Nasby" <jim@nasby.net> writes: > The only case I can think of where autovac might not work as well as > smartvacuum would be if you had a lot of databases in the cluster, since > autovacuum will only vacuum one database at a time. It's conceivable that it'd make sense to allow multiple autovac processes running in parallel. (The infrastructure part of this is easy enough, the hard part is keeping them from all deciding to vacuum the same table.) One reason we have not done that already is the thought that multiple vacuum processes would suck too much I/O to be reasonable. Now you could dial back their resource demands with the cost-delay settings, but it's not clear that ten autovacs running at one-tenth speed are better than one autovac using all the cycles you can spare. Usually I think it's best if a vacuum transaction finishes as fast as it can. In any case, these exact same concerns would apply to manual vacuums or a combination of manual and auto vacuum. regards, tom lane
Tom Lane wrote: > "Jim C. Nasby" <jim@nasby.net> writes: >> The only case I can think of where autovac might not work as well as >> smartvacuum would be if you had a lot of databases in the cluster, since >> autovacuum will only vacuum one database at a time. > > It's conceivable that it'd make sense to allow multiple autovac > processes running in parallel. (The infrastructure part of this is easy > enough, the hard part is keeping them from all deciding to vacuum the > same table.) > > One reason we have not done that already is the thought that multiple > vacuum processes would suck too much I/O to be reasonable. Now you > could dial back their resource demands with the cost-delay settings, > but it's not clear that ten autovacs running at one-tenth speed are > better than one autovac using all the cycles you can spare. Usually > I think it's best if a vacuum transaction finishes as fast as it can. I think this is one of the reasons table specific delay settings were designed in from the beginning. I think the main use cases for multiple vacuums at once are: 1) Vacuum per table space assuming each table space is on a different drive with it's own I/O. 2) the frequently updated table that can't wait to be vacuumed while a large table is being vacuumed. In this case if you set a system default delay setting and set a more aggressive table specific delay setting for your hot spot tables then multiple vacuums become a clear win. This is an important case that I hope we handle soon. At this point it's one of the main failings of the current autovacuum system.
On Mon, Oct 23, 2006 at 03:08:03PM -0400, Tom Lane wrote: > "Jim C. Nasby" <jim@nasby.net> writes: > > The only case I can think of where autovac might not work as well as > > smartvacuum would be if you had a lot of databases in the cluster, since > > autovacuum will only vacuum one database at a time. > > It's conceivable that it'd make sense to allow multiple autovac > processes running in parallel. (The infrastructure part of this is easy > enough, the hard part is keeping them from all deciding to vacuum the > same table.) It might be worth creating a generic framework that prevents multiple vacuums from hitting a table at once, autovac or not. > One reason we have not done that already is the thought that multiple > vacuum processes would suck too much I/O to be reasonable. Now you > could dial back their resource demands with the cost-delay settings, > but it's not clear that ten autovacs running at one-tenth speed are > better than one autovac using all the cycles you can spare. Usually > I think it's best if a vacuum transaction finishes as fast as it can. There's other things that would benefit from having some idea on what IO resources are available. For example, having a separate bgwriter (or reader) for each set of physical volumes. So a means of grouping tablespaces wouldn't hurt. > In any case, these exact same concerns would apply to manual vacuums > or a combination of manual and auto vacuum. Well, the advantage to manual vacuums is that you can tune things to utilize multiple arrays... -- Jim Nasby jim@nasby.net EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
Jim C. Nasby wrote: > On Mon, Oct 23, 2006 at 03:08:03PM -0400, Tom Lane wrote: > > "Jim C. Nasby" <jim@nasby.net> writes: > > > The only case I can think of where autovac might not work as well as > > > smartvacuum would be if you had a lot of databases in the cluster, since > > > autovacuum will only vacuum one database at a time. > > > > It's conceivable that it'd make sense to allow multiple autovac > > processes running in parallel. (The infrastructure part of this is easy > > enough, the hard part is keeping them from all deciding to vacuum the > > same table.) > > It might be worth creating a generic framework that prevents multiple > vacuums from hitting a table at once, autovac or not. That one is easy, because vacuum gets a lock on the affected table that conflicts with itself. The problem is that the second vacuum would actually wait for the first to finish. A naive idea is to use ConditionalLockAcquire, and if it fails just skip the table. > > One reason we have not done that already is the thought that multiple > > vacuum processes would suck too much I/O to be reasonable. Now you > > could dial back their resource demands with the cost-delay settings, > > but it's not clear that ten autovacs running at one-tenth speed are > > better than one autovac using all the cycles you can spare. Usually > > I think it's best if a vacuum transaction finishes as fast as it can. In the scenario where one table is huge and another is very small, it can certainly be useful to vacuum the small table several times while the huge one has only been vacuumed once. For that you definitively need the ability to run parallel vacuums. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
> If the decision to vacuum based on autovacuum criteria is good enough > for you then I think you should just focus on getting autovac to do what > you want it to do. Perhaps you just need to decrease the sleep time to a > few seconds, so that autovac will quickly detect when something needs to > be vacuumed. Thanks, I'll do it. My database is updated frequently all the day and runs big building process a day. Almost all the day autovac is ok but in the big building process autovac annoys it, so I wished there might be the way to order autovac to do its process. Hitoshi Harada > -----Original Message----- > From: pgsql-hackers-owner@postgresql.org > [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Jim C. Nasby > Sent: Tuesday, October 24, 2006 3:36 AM > To: Hitoshi Harada > Cc: 'Tom Lane'; 'Peter Eisentraut'; pgsql-hackers@postgresql.org > Subject: Re: [HACKERS] [PATCHES] smartvacuum() instead of autovacuum > > If the decision to vacuum based on autovacuum criteria is good enough > for you then I think you should just focus on getting autovac to do what > you want it to do. Perhaps you just need to decrease the sleep time to a > few seconds, so that autovac will quickly detect when something needs to > be vacuumed. > > The only case I can think of where autovac might not work as well as > smartvacuum would be if you had a lot of databases in the cluster, since > autovacuum will only vacuum one database at a time. > > On Mon, Oct 23, 2006 at 11:18:39AM +0900, Hitoshi Harada wrote: > > Ok, > > > > But my point is, autovacuum may corrupt with vacuum analyze command > > on another session. My intention of smartvacuum() is based on this. > > Any solution for this?? > > > > Regards, > > > > > > Hitoshi Harada > > > > > -----Original Message----- > > > From: pgsql-hackers-owner@postgresql.org > > > [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Tom Lane > > > Sent: Monday, October 23, 2006 11:10 AM > > > To: Hitoshi Harada > > > Cc: 'Peter Eisentraut'; pgsql-hackers@postgresql.org > > > Subject: Re: [HACKERS] [PATCHES] smartvacuum() instead of autovacuum > > > > > > "Hitoshi Harada" <hitoshi_harada@forcia.com> writes: > > > >> How is this different from what autovacuum does? > > > > > > > My application needs to do vacuum by itself, while > > > > autovacuum does it as daemon. > > > > The database is updated so frequently that > > > > normal vacuum costs too much and tables to be updated are > > > > not so many as the whole database is vacuumed. > > > > I want to use autovacuum except the feature of daemon, > > > > but want to control when to vacuum and which table to vacuum. > > > > So, nothing is different between autovacuum and smartvacuum(), > > > > but former is daemon and later is user function. > > > > > > This seems completely unconvincing. What are you going to do that > > > couldn't be done by autovacuum? > > > > > > regards, tom lane > > > > > > ---------------------------(end of broadcast)--------------------------- > > > TIP 5: don't forget to increase your free space map settings > > > > > > ---------------------------(end of broadcast)--------------------------- > > TIP 5: don't forget to increase your free space map settings > > > > -- > Jim Nasby jim@nasby.net > EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) > > ---------------------------(end of broadcast)--------------------------- > TIP 3: Have you checked our extensive FAQ? > > http://www.postgresql.org/docs/faq