Re: shared-memory based stats collector - Mailing list pgsql-hackers
From: Kyotaro HORIGUCHI
Subject: Re: shared-memory based stats collector
Date: 
Msg-id: 20180705.120423.49626073.horiguchi.kyotaro@lab.ntt.co.jp
In response to: Re: shared-memory based stats collector (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: shared-memory based stats collector
List: pgsql-hackers
Hello.

At Wed, 04 Jul 2018 17:23:51 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote in <67470.1530739431@sss.pgh.pa.us>
> Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> writes:
> > At Mon, 2 Jul 2018 14:25:58 -0400, Robert Haas <robertmhaas@gmail.com> wrote in <CA+TgmoYQhr30eAcgJCi1v0FhA+3RP1FZVnXqSTLe=6fHy9e5oA@mail.gmail.com>
> >> Copying the whole hash table kinds of sucks, partly because of the
> >> time it will take to copy it, but also because it means that memory
> >> usage is still O(nbackends * ntables).  Without looking at the patch,
> >> I'm guessing that you're doing that because we need a way to show each
> >> transaction a consistent snapshot of the data, and I admit that I
> >> don't see another obvious way to tackle that problem.  Still, it would
> >> be nice if we had a better idea.
>
> > The consistency here means "repeatable read" of an object's stats
> > entry, not a snapshot covering all objects.  We don't need to copy
> > all the entries at once following this definition.  The attached
> > version makes a cache entry only for requested objects.
>
> Uh, what?  That's basically destroying the long-standing semantics of
> statistics snapshots.  I do not think we can consider that acceptable.
> As an example, it would mean that scan counts for indexes would not
> match up with scan counts for their tables.

The current stats collector mechanism sends at most 8 table stats in a
single message.  Messages split up this way by multiple transactions can
reach the collector in shuffled order, so the resulting snapshot can be
"inconsistent" if an INQUIRY message comes in between such split
messages.  Of course a single message is enough for common transactions,
but not for all of them.

Even though the inconsistency would happen more frequently with this
patch, I don't think users expect such strict consistency of table
stats, especially on a busy system.  And I believe it is a good thing if
users see more "useful" information in exchange for the relaxed
consistency.  (What exactly "useful" means is outside the current
focus :p)

Meanwhile, if we must keep the practical consistency, a giant lock is
out of the question, so we need transactional stats in some shape.  It
could be a whole-image snapshot, a regular MVCC table, or perhaps the
current dshash with UNDO logs.  Since there really are many states, some
storage to reproduce each state is unavoidable.

I think the consensus is that the whole-image snapshot takes too much
memory.  MVCC is clearly overkill for the purpose.  UNDO logs seem
somewhat promising.  If we look at stats within a long transaction, the
memory required for the UNDO information can easily reach the same
amount as the whole-image snapshot, but I expect that case to be
uncommon.  I'll consider that separately from the current patch.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
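
The batching behaviour mentioned above can be sketched roughly as
follows.  This is a toy, self-contained illustration, not the real
pgstat.c code: the message layout, the MAX_ENTRIES_PER_MSG constant and
all function names here are invented (the actual per-message limit,
PGSTAT_NUM_TABENTRIES, is derived from the message payload size).  It
only shows why a transaction that touched many tables gets split across
several messages, which other backends' messages or an INQUIRY can then
interleave with.

/*
 * Toy sketch: per-table counters are packed into fixed-size messages,
 * so one transaction may flush several messages to the collector.
 */
#include <stdio.h>

#define MAX_ENTRIES_PER_MSG 8   /* illustrative stand-in for PGSTAT_NUM_TABENTRIES */

typedef struct TabEntry
{
    unsigned relid;
    long     tuples_inserted;
} TabEntry;

typedef struct TabstatMessage
{
    int      nentries;
    TabEntry entry[MAX_ENTRIES_PER_MSG];
} TabstatMessage;

static TabstatMessage pending = { 0 };
static int            messages_sent = 0;

/* stand-in for sending one message to the collector */
static void
flush_message(void)
{
    if (pending.nentries == 0)
        return;
    messages_sent++;
    printf("message %d carries %d table entries\n",
           messages_sent, pending.nentries);
    pending.nentries = 0;
}

/* record activity for one table, flushing when the message is full */
static void
report_table(unsigned relid, long inserted)
{
    if (pending.nentries == MAX_ENTRIES_PER_MSG)
        flush_message();
    pending.entry[pending.nentries].relid = relid;
    pending.entry[pending.nentries].tuples_inserted = inserted;
    pending.nentries++;
}

int
main(void)
{
    /*
     * A transaction that touched 20 tables needs 3 messages; an INQUIRY
     * arriving between them sees only part of the transaction's stats.
     */
    for (unsigned relid = 1; relid <= 20; relid++)
        report_table(relid, 1);
    flush_message();            /* flush the final partial message */
    return 0;
}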
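
Similarly, the per-entry "repeatable read" caching described in the
quoted paragraph can be sketched like this.  Again a toy illustration
under invented names (StatsEntry, stats_fetch, txn_end), not the patch's
actual dshash-based code: the first access to an object's stats within a
transaction copies that one entry into a transaction-local cache, later
accesses return the copy, and the cache is dropped at transaction end;
no snapshot covering all objects is taken.

/*
 * Toy sketch: repeatable read of a single object's stats entry,
 * without a whole-image snapshot.
 */
#include <stdio.h>
#include <string.h>

#define NOBJECTS 4

typedef struct StatsEntry
{
    unsigned oid;       /* object id */
    long     scans;     /* e.g. number of scans */
} StatsEntry;

/* stand-in for the stats hash in shared memory */
static StatsEntry shared_stats[NOBJECTS];

/* transaction-local cache, filled lazily per object */
static StatsEntry local_cache[NOBJECTS];
static int        cached[NOBJECTS];

/* fetch stats for one object with per-entry repeatable read */
static const StatsEntry *
stats_fetch(unsigned oid)
{
    if (!cached[oid])
    {
        local_cache[oid] = shared_stats[oid];   /* copy on first access */
        cached[oid] = 1;
    }
    return &local_cache[oid];
}

/* drop the cache at transaction end */
static void
txn_end(void)
{
    memset(cached, 0, sizeof(cached));
}

int
main(void)
{
    shared_stats[1].oid = 1;
    shared_stats[1].scans = 10;

    printf("first read:  %ld\n", stats_fetch(1)->scans);   /* 10 */

    /* concurrent activity bumps the shared counter */
    shared_stats[1].scans = 42;

    /* the same transaction still sees the cached value */
    printf("second read: %ld\n", stats_fetch(1)->scans);   /* 10 */

    txn_end();

    /* a new transaction sees the new value */
    printf("new txn:     %ld\n", stats_fetch(1)->scans);   /* 42 */
    return 0;
}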