Re: Experimental patch for inter-page delay in VACUUM - Mailing list pgsql-hackers
From | Jan Wieck |
---|---|
Subject | Re: Experimental patch for inter-page delay in VACUUM |
Date | |
Msg-id | 3FA436E5.8040506@Yahoo.com Whole thread Raw |
In response to | Re: Experimental patch for inter-page delay in VACUUM ("Stephen" <jleelim@xxxxxx.com>) |
Responses |
Re: Experimental patch for inter-page delay in VACUUM
|
List | pgsql-hackers |
Stephen wrote: > I tried the Tom Lane's patch on PostgreSQL 7.4-BETA-5 and it works > fantastically! Running a few short tests show a significant improvement in > > responsiveness on my RedHat 9 Linux 2.4-20-8 (IDE 120GB 7200RPM UDMA5). I am currently looking at implementing ARC as a replacement strategy. I don't have anything that works yet, so I can't really tell what the result would be and it might turn out that we want both features. All I can say is that the theory looks like an extremely smart and generalized version of the crude hack I had done. And that one is able to lower the impact of VACUUM on the foreground clients while increasing the VACUUM speed. The 7.3.4 version of my crude hack is attached. Jan > > I didn't feel any noticeable delay when vacuum_page_delay is set to 5ms, 10 > ms. Vacuum takes 15 to 24 times longer to complete (as expected) > > but I don't mind at all. Vmstat BI/BO load is reduced by 5 times when > vacuum_page_delay = 1ms. Load average reduced significantly > > also as there are less processes waiting to complete. I find a value of 1ms > to 5ms is quite good and will keep system responsive. Going from 10ms to 1ms > didn't seem to reduce the total vacuum time by much and I'm not sure why. > > Any chance we can get this patched into 7.4 permanently? > > I cannot say how well it would work on a heavy load, but on a light load > this patch is highly recommended for 24/7 large DB systems. The > > database is mostly read-only. There are 133,000 rows and each row is about > 2.5kB in size (mostly due to the bytea column holding a binary > > image). The long row causes system to TOAST the table. I repeatedly ran the > following tests while system is idling: > > > Normal operation with no VACUUM > =============================== > > tsdb=# explain analyze select * from table1 where id = > '0078997ac809877c1a0d1f76af753608'; > QUERY PLAN > ---------------------------------------------------------------------------- > ------------------------------------------------------ > Index Scan using table1_pkey on table1 (cost=0.00..6.01 rows=2 width=344) > (actual time=19.030..19.036 rows=1 loops=1) > Index Cond: ((id)::text = '0078997ac809877c1a0d1f76af753608'::text) > Total runtime: 19.206 ms > (3 rows) > > > > VACUUM at vacuum_page_delay = 0 > =============================== > > -bash-2.05b$ vmstat 1 > procs memory swap io system > cpu > r b w swpd free buff cache si so bi bo in cs us sy > id > 0 1 0 176844 3960 17748 146704 0 0 1408 0 296 556 0 1 > 99 > 0 1 0 176844 3960 17748 146264 0 0 1536 0 285 546 0 2 > 98 > > > tsdb=# explain analyze select * from table1 where id = > '00e5ae5f4fddab371f7847f7da65eebb'; > QUERY PLAN > ---------------------------------------------------------------------------- > ---------------------------------------------------- > Index Scan using table1_pkey on table1 (cost=0.00..6.01 rows=2 width=344) > (actual time=298.028..298.047 rows=1 loops=1) > Index Cond: ((id)::text = '0036edc4a92b6afd41304c6c8b76bc3c'::text) > Total runtime: 298.275 ms > (3 rows) > > tsdb=# explain analyze select * from table1 where id = > '0046751ac3ec290b9f66ea1d66431923'; > QUERY PLAN > ---------------------------------------------------------------------------- > -------------------------------------------------------- > Index Scan using table1_pkey on table1 (cost=0.00..6.01 rows=2 width=344) > (actual time=454.727..454.746 rows=1 loops=1) > Index Cond: ((id)::text = '0046751ac3ec290b9f66ea1d66431923'::text) > Total runtime: 454.970 ms > (3 rows) > > tsdb=# explain analyze select * from table1 where id = > '00a74e6885579a2d50487f5a1dceba22'; > QUERY PLAN > ---------------------------------------------------------------------------- > -------------------------------------------------------- > Index Scan using table1_pkey on table1 (cost=0.00..6.01 rows=2 width=344) > (actual time=344.483..344.501 rows=1 loops=1) > Index Cond: ((id)::text = '00a74e6885579a2d50487f5a1dceba22'::text) > Total runtime: 344.700 ms > (3 rows) > > > VACUUM at vacuum_page_delay = 1 > =============================== > > procs memory swap io system > cpu > r b w swpd free buff cache si so bi bo in cs us sy > id > 0 0 0 176840 4292 23700 137416 0 0 384 0 127 302 0 0 > 100 > 0 0 0 176840 4220 23700 137116 0 0 512 0 118 286 0 0 > 100 > 1 0 0 176840 4220 23700 136656 0 0 384 0 132 303 0 1 > 99 > > > tsdb=# explain analyze select * from table1 where id = > '003d5966f8b9a06e4b0fff9fa8e93be0'; > QUERY PLAN > ---------------------------------------------------------------------------- > ------------------------------------------------------ > Index Scan using table1_pkey on table1 (cost=0.00..6.01 rows=2 width=344) > (actual time=74.575..74.584 rows=1 loops=1) > Index Cond: ((id)::text = '003d5966f8b9a06e4b0fff9fa8e93be0'::text) > Total runtime: 74.761 ms > (3 rows) > > tsdb=# explain analyze select * from table1 where id = > '00677fe46cd0af3d98564068f34db1cf'; > QUERY PLAN > ---------------------------------------------------------------------------- > ------------------------------------------------------ > Index Scan using table1_pkey on table1 (cost=0.00..6.01 rows=2 width=344) > (actual time=31.779..31.785 rows=1 loops=1) > Index Cond: ((id)::text = '00677fe46cd0af3d98564068f34db1cf'::text) > Total runtime: 31.954 ms > (3 rows) > > tsdb=# explain analyze select * from table1 where id = > '00b7c3e2fffdf39ff4ac50add04336b7'; > QUERY PLAN > ---------------------------------------------------------------------------- > ------------------------------------------------------ > Index Scan using table1_pkey on table1 (cost=0.00..6.01 rows=2 width=344) > (actual time=78.974..78.989 rows=1 loops=1) > Index Cond: ((id)::text = '00b7c3e2fffdf39ff4ac50add04336b7'::text) > Total runtime: 79.172 ms > (3 rows) > > tsdb=# explain analyze select * from table1 where id = > '008d49c007f711d5f5ec48b67a8e58f0'; > QUERY PLAN > ---------------------------------------------------------------------------- > ------------------------------------------------------ > Index Scan using table1_pkey on table1 (cost=0.00..6.01 rows=2 width=344) > (actual time=30.143..30.148 rows=1 loops=1) > Index Cond: ((id)::text = '008d49c007f711d5f5ec48b67a8e58f0'::text) > Total runtime: 30.315 ms > (3 rows) > > > VACUUM at vacuum_page_delay = 5 > =============================== > > -bash-2.05b$ vmstat 1 > procs memory swap io system > cpu > r b w swpd free buff cache si so bi bo in cs us sy > id > 0 0 0 176840 4228 22668 138212 0 0 512 0 117 276 0 0 > 100 > 0 0 0 176840 4220 22668 138212 0 0 384 0 132 296 0 1 > 99 > 0 0 0 176840 4220 22668 137764 0 0 384 0 114 276 0 0 > 100 > > > tsdb=# explain analyze select * from table1 where id = > '000aa16ffe019fa327b68b7e610e5ac0'; > QUERY PLAN > ---------------------------------------------------------------------------- > ------------------------------------------------------ > Index Scan using table1_pkey on table1 (cost=0.00..6.01 rows=2 width=344) > (actual time=14.089..14.094 rows=1 loops=1) > Index Cond: ((id)::text = '000aa16ffe019fa327b68b7e610e5ac0'::text) > Total runtime: 14.252 ms > (3 rows) > > tsdb=# explain analyze select * from table1 where id = > '00aacc4684577737498df0536be1fac8'; > QUERY PLAN > ---------------------------------------------------------------------------- > ------------------------------------------------------ > Index Scan using table1_pkey on table1 (cost=0.00..6.01 rows=2 width=344) > (actual time=16.747..16.752 rows=1 loops=1) > Index Cond: ((id)::text = '00aacc4684577737498df0536be1fac8'::text) > Total runtime: 16.910 ms > (3 rows) > > tsdb=# explain analyze select * from table1 where id = > '00e295f5644d4cb77a5ebc4efbbaa770'; > QUERY PLAN > ---------------------------------------------------------------------------- > ------------------------------------------------------ > Index Scan using table1_pkey on table1 (cost=0.00..6.01 rows=2 width=344) > (actual time=16.684..16.690 rows=1 loops=1) > Index Cond: ((id)::text = '00e295f5644d4cb77a5ebc4efbbaa770'::text) > Total runtime: 16.886 ms > (3 rows) > > VACUUM at vacuum_page_delay = 10 > ================================ > > -bash-2.05b$ vmstat 1 > procs memory swap io system > cpu > r b w swpd free buff cache si so bi bo in cs us sy > id > 0 0 0 176840 4336 20968 139780 0 0 384 108 121 294 0 0 > 100 > 0 0 0 176840 4336 20968 140164 0 0 384 0 130 281 0 1 > 99 > > > tsdb=# explain analyze select * from table1 where id = > '007841017b9f7c80394f2bb4314ba8c1'; > QUERY PLAN > ---------------------------------------------------------------------------- > ------------------------------------------------------ > Index Scan using table1_pkey on table1 (cost=0.00..6.01 rows=2 width=344) > (actual time=19.576..19.587 rows=1 loops=1) > Index Cond: ((id)::text = '007841017b9f7c80394f2bb4314ba8c1'::text) > Total runtime: 19.854 ms > (3 rows) > > tsdb=# explain analyze select * from table1 where id = > '0070724846c4d0d0dbb8f3e939fd1da4'; > QUERY PLAN > ---------------------------------------------------------------------------- > ------------------------------------------------------ > Index Scan using table1_pkey on table1 (cost=0.00..6.01 rows=2 width=344) > (actual time=10.616..10.624 rows=1 loops=1) > Index Cond: ((id)::text = '0070724846c4d0d0dbb8f3e939fd1da4'::text) > Total runtime: 10.795 ms > (3 rows) > > tsdb=# explain analyze select * from table1 where id = > '00fc92bf0f5048d7680bd8fa2d4c6f3a'; > QUERY PLAN > ---------------------------------------------------------------------------- > ------------------------------------------------------ > Index Scan using table1_pkey on table1 (cost=0.00..6.01 rows=2 width=344) > (actual time=28.007..28.014 rows=1 loops=1) > Index Cond: ((id)::text = '00fc92bf0f5048d7680bd8fa2d4c6f3a'::text) > Total runtime: 28.183 ms > (3 rows) > > > > ---------------------------(end of broadcast)--------------------------- > TIP 8: explain analyze is your friend -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com # diff -crN ../src.orig/backend/commands/vacuum.c ./backend/commands/vacuum.c *** ../src.orig/backend/commands/vacuum.c Thu Oct 30 10:17:36 2003 --- ./backend/commands/vacuum.c Thu Oct 30 10:27:57 2003 *************** *** 98,103 **** --- 98,107 ---- } VRelStats; + bool vacuum_buffer_hack = true; + bool vacuum_buffer_hack_active = false; + + static MemoryContext vac_context = NULL; static int elevel = -1; *************** *** 279,284 **** --- 283,295 ---- } /* + * Change the buffer cache policy to avoid vacuum causing a + * full cache eviction. + */ + if (vacuum_buffer_hack) + vacuum_buffer_hack_active = true; + + /* * Loop to process each selected relation. */ foreach(cur, vrl) *************** *** 320,325 **** --- 331,341 ---- } } } + + /* + * Change back to normal buffer cache policy. + */ + vacuum_buffer_hack_active = false; /* * Finish up processing. diff -crN ../src.orig/backend/storage/buffer/bufmgr.c ./backend/storage/buffer/bufmgr.c *** ../src.orig/backend/storage/buffer/bufmgr.c Thu Oct 30 10:17:38 2003 --- ./backend/storage/buffer/bufmgr.c Thu Oct 30 10:44:10 2003 *************** *** 64,69 **** --- 64,72 ---- bool zero_damaged_pages = false; + extern bool vacuum_buffer_hack_active; + + static void WaitIO(BufferDesc *buf); static void StartBufferIO(BufferDesc *buf, bool forInput); static void TerminateBufferIO(BufferDesc *buf); *************** *** 205,210 **** --- 208,222 ---- StartBufferIO(bufHdr, false); LWLockRelease(BufMgrLock); } + } + else + { + /* The block was not found in the buffer cache. If we are in + * vacuum buffer cache strategy mode, mark that this buffers + * content was caused by vacuum. + */ + if (vacuum_buffer_hack_active) + bufHdr->flags |= BM_READ_BY_VACUUM; } /* diff -crN ../src.orig/backend/storage/buffer/freelist.c ./backend/storage/buffer/freelist.c *** ../src.orig/backend/storage/buffer/freelist.c Thu Oct 30 10:17:38 2003 --- ./backend/storage/buffer/freelist.c Thu Oct 30 12:38:42 2003 *************** *** 71,79 **** #endif /* BMTRACE */ IsNotInQueue(bf); ! /* change bf so it points to inFrontOfNew and its successor */ ! bf->freePrev = SharedFreeList->freePrev; ! bf->freeNext = Free_List_Descriptor; /* insert new into chain */ BufferDescriptors[bf->freeNext].freePrev = bf->buf_id; --- 71,93 ---- #endif /* BMTRACE */ IsNotInQueue(bf); ! if ((bf->flags & BM_READ_BY_VACUUM) != 0) ! { ! /* in the case the buffer was read or created by vacuum, ! * add it to the end of the queue so that GetFreeBuffer() ! * will return it next again. ! */ ! bf->freePrev = Free_List_Descriptor; ! bf->freeNext = SharedFreeList->freeNext; ! ! bf->flags &= ~BM_READ_BY_VACUUM; ! } ! else ! { ! /* change bf so it points to inFrontOfNew and its successor */ ! bf->freePrev = SharedFreeList->freePrev; ! bf->freeNext = Free_List_Descriptor; ! } /* insert new into chain */ BufferDescriptors[bf->freeNext].freePrev = bf->buf_id; diff -crN ../src.orig/backend/tcop/postgres.c ./backend/tcop/postgres.c *** ../src.orig/backend/tcop/postgres.c Thu Oct 30 10:17:39 2003 --- ./backend/tcop/postgres.c Thu Oct 30 10:25:07 2003 *************** *** 69,74 **** --- 69,77 ---- extern int optind; extern char *optarg; + extern bool vacuum_buffer_hack; + extern bool vacuum_buffer_hack_active; + char *debug_query_string; /* for pgmonitor and * log_min_error_statement */ *************** *** 1874,1879 **** --- 1877,1888 ---- for (;;) { + /* + * Reset the vacuum buffer hack active flag, could be left + * true if a vacuum got aborted. + */ + vacuum_buffer_hack_active = false; + /* * Release storage left over from prior query cycle, and create a * new query input buffer in the cleared QueryContext. diff -crN ../src.orig/include/storage/buf_internals.h ./include/storage/buf_internals.h *** ../src.orig/include/storage/buf_internals.h Thu Oct 30 10:17:43 2003 --- ./include/storage/buf_internals.h Thu Oct 30 10:29:56 2003 *************** *** 41,46 **** --- 41,47 ---- #define BM_IO_ERROR (1 << 6) #define BM_JUST_DIRTIED (1 << 7) #define BM_PIN_COUNT_WAITER (1 << 8) + #define BM_READ_BY_VACUUM (1 << 9) typedef bits16 BufFlags;
pgsql-hackers by date: