Re: GIN data corruption bug(s) in 9.6devel - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: GIN data corruption bug(s) in 9.6devel |
Date | |
Msg-id | 568B8F15.9000407@2ndquadrant.com Whole thread Raw |
In response to | Re: GIN data corruption bug(s) in 9.6devel (Jeff Janes <jeff.janes@gmail.com>) |
Responses |
Re: GIN data corruption bug(s) in 9.6devel
|
List | pgsql-hackers |
Hi, On 12/23/2015 09:33 PM, Jeff Janes wrote: > On Mon, Dec 21, 2015 at 11:51 AM, Tomas Vondra > <tomas.vondra@2ndquadrant.com> wrote: >> >> >> On 12/21/2015 07:41 PM, Jeff Janes wrote: >>> >>> On Sat, Dec 19, 2015 at 3:19 PM, Tomas Vondra >>> <tomas.vondra@2ndquadrant.com> wrote: >> >> >> ... >> >>>> So both patches seem to do the trick, but (2) is faster. Not sure >>>> if this is expected. (BTW all the results are without asserts >>>> enabled). >>> >>> >>> Do you know what the size of the pending list was at the end of each >>> test? >>> >>> I think last one may be faster because it left a large mess behind >>> that someone needs to clean up later. >> >> >> No. How do I measure it? > > pageinspect's gin_metapage_info, or pgstattuple's pgstatginindex Hmmm, so this turns out not very useful, because at the end the data I get from gin_metapage_info is almost exactly the same for both patches (more details below). >> >>> >>> Also, do you have the final size of the indexes in each case? >> >> >> No, I haven't realized the patches do affect that, so I haven't measured it. > > There shouldn't be a difference between the two approaches (although I > guess there could be if one left a larger pending list than the other, > as pending lists is very space inefficient), but since you included > 9.5 in your test I thought it would be interesting to see how either > patched version under 9.6 compared to 9.5. Well, turns out there's a quite significant difference, actually. The index sizes I get (quite stable after multiple runs): 9.5 : 2428 MB 9.6 + alone cleanup : 730 MB 9.6 + pending lock : 488 MB So that's quite a significant difference, I guess. The load duration for each version look like this: 9.5 : 1415 seconds 9.6 + alone cleanup : 1310 seconds 9.6 + pending lock : 1380 seconds I'd say I'm happy with sacrificing ~5% of time in exchange for ~35% reduction of index size. The size of the index on 9.5 after VACUUM FULL (so pretty much the smallest index possible) is 440MB, which suggests the "pending lock" patch does a quite good job. The gin_metapage_info at the end of one of the runs (pretty much all the runs look exactly the same) looks like this: pending lock alone cleanup 9.5 -------------------------------------------------------- pending_head 2 2 310460 pending_tail 338 345 310806 tail_free_size 812 812 812 n_pending_pages 330 339 347 n_pending_tuples 1003 1037 1059 n_total_pages 2 2 2 n_entry_pages 1 1 1 n_data_pages 0 0 0 n_entries 0 0 0 version 2 2 2 So almost no difference, except for the pending_* attributes, and even in that case the values are only different for 9.5 branch. Not sure what conclusion to draw from this - maybe it's necessary to collect the function input while the load is running (but that'd be tricky to process, I guess). regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
pgsql-hackers by date: