Re: Weird indices - Mailing list pgsql-general
From | Joseph Shraibman |
---|---|
Subject | Re: Weird indices |
Date | |
Msg-id | 3A93103B.8F549B94@selectacast.net Whole thread Raw |
In response to | Re[2]: Weird indices (Jean-Christophe Boggio <cat@thefreecat.org>) |
Responses |
Re: Weird indices
|
List | pgsql-general |
Jean-Christophe Boggio wrote: > > Joseph, > > I think you're going a bit too far... Tom and Stephan have been very > patient explaining you the basics of indices. > They are being patient. I thank them. I'm not trying to point fingers at anybody and say 'you idiot! why aren't you thinking like me?'. It's just that this whole visibility meets index meets planner thing is still eluding me. > >> The name of the game here is to make a plan *without* actually going > >> out and expending large amounts of time to find out the true state of > >> affairs; by the time you know for sure, you've already done the query. > > Believe this. All the best DB engines including PostgreSQL work that > way. This is based on measures, on real life. > I believe it. I just seemed to me that that looking at the index should have been before the cutoff point. And actually seemed to be in my little experiment. > JS> Well I'd hope that extracting the count from the index should be very > JS> low cost. That is what indecies are for. > > No, indices are made for finding a record in one go or for isolating a > small range of values. > When I learned data structures in cs 102 I learned how to use an index to quickly get a count. I assume the postgres developers know this as well. > JS> But certain things could be done. Like planning for the case of there > JS> being a single not null value, and updating the indecies not to point at > JS> expired rows. > > And then you'll ask when there are 2 not null values...? I think I mispoke myself there. I meant the case of there being the single repeated value. I can think offhand of a few ways to do it but I don't know how postgres uses indeces. > > JS> Isn't the point of a vacuum to get rid of old rows? Then > JS> why doesn't it update the index as well? > > It does. Look at vacuum verbose. > In my test case it didn't. My database is vacuumed every night, and I hadn't touched it that day before I did my test. Therefore it should have known there were only 16 matches and not 52. > JS> I mean the explain shows that getting the count(*) from the field that > JS> is indexed has to do a seq scan, presumably to determine if the rows are > JS> in fact valid. > > count(*) means you want all the rows that have all the fields "not > null". Read carefully : ALL THE FIELDS. Uh, no. Tom said so in a different reply. I understand that keeping different views for different open transactions can be difficult, but after a transaction that updates a row is over why isn't the row marked as 'universally visible' for all new transactions until another update occurs? Maybe I'm not making myself understood. Another way of asking the same thing: Say there is a transaction that is looking at a non-current version of a row. 'non-current' could be the value it was at the start of the transaction (and was updated by another transaction) or was updated by this transaction but not committed yet. When this transaction is over is it really that hard to get rid of the refrence to the old version of the row? There should be a 1 bit field 'is old value and isn't being used by any transaction'. Is that really hard? Maybe this is part of the whole 'vacuum later' vs. 'update now' philosophy. If the point of vacuum later is to put off the performance hit until later if it is causing these performance hits on queries because index scans aren't being used then doesn't that mean 'update now' is more likely to pay off in the short run? -- Joseph Shraibman jks@selectacast.net Increase signal to noise ratio. http://www.targabot.com
pgsql-general by date: