Re: Index no longer being used, destroying and recreating it restores use. - Mailing list pgsql-general
From | Adrian Klaver
---|---
Subject | Re: Index no longer being used, destroying and recreating it restores use.
Date |
Msg-id | 92ea007a-abe0-1dd2-7e5c-080da70c2e2c@aklaver.com
In response to | Re: Index no longer being used, destroying and recreating it restores use. (Koen De Groote <kdg.dev@gmail.com>)
Responses | Re: Index no longer being used, destroying and recreating it restores use.
List | pgsql-general
On 6/5/20 7:05 AM, Koen De Groote wrote:
> I've collected all relevant info (I think so at least) and put it here:
>
> The table in question is used to keep filepath data, of files on a
> harddrive.
> The query in question is used to retrieve items which should be backed
> up, but have not yet been.
>
> The relevant columns of the table:
>
>                            Table "public.item"
>       Column      |            Type             | Collation | Nullable |             Default
> ------------------+-----------------------------+-----------+----------+----------------------------------
>  id               | bigint                      |           | not null | nextval('item_id_seq'::regclass)
>  shouldbebackedup | boolean                     |           | not null | true
>  backupperformed  | boolean                     |           | not null | false
>  itemCreated      | timestamp without time zone |           |          | now()
>  filepath         | text                        |           |          |
>
> The existing index, which no longer gets used:
> "index_in_question" btree (shouldbebackedup, backupperformed,
> itemCreated, filepath) WHERE shouldbebackedup = true AND backupperformed
> = false
>
> The new index, made of the exact same columns and conditions, gets
> used immediately after creation:
> CREATE INDEX CONCURRENTLY index_test ON item USING btree
> (shouldbebackedup, backupperformed, itemCreated, filepath) WHERE
> shouldbebackedup = true AND backupperformed = false;
>
> The query in question will look something like this:
> select * from item where shouldbebackedup=true and
> itemCreated<='2020-06-05 00:00:00.000' and backupperformed=false order
> by filepath asc, id asc limit 100 offset 10400;

The result of EXPLAIN ANALYZE for above.

> Having done a count, there are around 13000 items here, without the
> offset and limit.
> That being said, the amount is entirely dependent on what was added on a
> previous day.
>
> I tried creating an extended statistic, like this, but it had no effect:
> CREATE STATISTICS s1 (dependencies) ON shouldbebackedup, backupperformed
> FROM item;
>
> Settings from the conf file I think are related:
>
> shared_buffers = 1024MB
> effective_cache_size = 2048MB
> random_page_cost = 1.1
> effective_io_concurrency = 200
> work_mem = 32MB
>
> Finally, I state again that this database gets a nightly "vacuum analyze".
>
> My thanks for looking at this and any suggestions one might have.
>
> Regards,
> Koen
>
> On Thu, Jun 4, 2020 at 7:08 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
>     Adrian Klaver <adrian.klaver@aklaver.com> writes:
>     > On 6/4/20 9:43 AM, Tom Lane wrote:
>     >> It's possible that the index had bloated to the point where the planner
>     >> thought it was cheaper to use a seqscan. Did you make a note of the
>     >> cost estimates for the different plans?
>
>     > I missed the part where the OP pointed to a SO question. In that
>     > question were links to explain.depesz.com output.
>
>     Ah, I didn't bother to chase that link either.
>
>     So the cost estimates are only a fraction of a percent apart, making
>     it unsurprising for not-so-large changes in the index size to cause
>     a flip in the apparently-cheapest plan.
>     The real question then is why the cost estimates aren't actually
>     modeling the real execution times very well; and I'd venture that
>     that question boils down to why is this rowcount estimate so far off:
>
>     >  ->  Parallel Seq Scan on oscar mike_three
>     >      (cost=0.000..1934568.500 rows=2385585 width=3141) (actual
>     >      time=159.800..158018.961 rows=23586 loops=3)
>     >        Filter: (four AND (NOT bravo) AND (zulu <=
>     >        'echo'::timestamp without time zone))
>     >        Rows Removed by Filter: 8610174
>
>     We're not going to be able to answer that if the OP doesn't wish
>     to decloak his data a bit more ... but a reasonable guess is that
>     those filter conditions are correlated. With late-model Postgres
>     you might be able to improve matters by creating extended statistics
>     for this table.
>
>     regards, tom lane

-- 
Adrian Klaver
adrian.klaver@aklaver.com
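
As a concrete illustration of Tom Lane's closing suggestion, here is a minimal sketch, not something posted in the thread: extended statistics covering all three correlated filter columns from Koen's table, rather than only the two booleans his s1 object covered. The statistics object name is made up; the dependencies kind needs PostgreSQL 10 or later, the mcv kind needs 12 or later, and the planner only picks the statistics up after a fresh ANALYZE.

    -- Hypothetical statistics object spanning every column in the query's filter,
    -- so the planner can account for correlation between the booleans and the timestamp.
    CREATE STATISTICS item_backup_stats (dependencies, mcv)
        ON shouldbebackedup, backupperformed, itemCreated
        FROM item;

    -- Extended statistics are only populated (and used) after an ANALYZE.
    ANALYZE item;

Whether this actually repairs the estimate depends on the data; the rows figures in EXPLAIN ANALYZE before and after are the thing to compare.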
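
For the bloat hypothesis raised earlier in the thread, a sketch (again not from the thread itself) of rebuilding the existing index in place and checking its size afterwards. REINDEX ... CONCURRENTLY requires PostgreSQL 12 or later; on older versions the usual workaround is what the OP effectively did, creating a replacement with CREATE INDEX CONCURRENTLY and dropping the old index.

    -- Rebuild the suspect partial index without blocking writes (PostgreSQL 12+).
    REINDEX INDEX CONCURRENTLY index_in_question;

    -- Compare the index size before and after the rebuild; a large drop
    -- suggests bloat was pushing its estimated cost above the seqscan's.
    SELECT pg_size_pretty(pg_relation_size('index_in_question'));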