Re: Sequence Access Method WIP - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: Sequence Access Method WIP |
Date | |
Msg-id | 20131118094856.GA5526@awork2.anarazel.de Whole thread Raw |
In response to | Re: Sequence Access Method WIP (Heikki Linnakangas <hlinnakangas@vmware.com>) |
Responses |
Re: Sequence Access Method WIP
|
List | pgsql-hackers |
On 2013-11-18 10:54:42 +0200, Heikki Linnakangas wrote: > On 15.11.2013 20:21, Andres Freund wrote: > >Well, it exposes log_sequence_tuple() - together with the added "am > >private" column of pg_squence that allows to do quite a bit of different > >things. I think unless we really implement pluggable RMGRs - which I > >don't really see coming - we cannot be radically different. > > The proposed abstraction leaks like a sieve. The plugin should not need to > touch anything but the private amdata field. log_sequence_tuple() is way too > low-level. Why? It's *less* low level than a good part of other crash recovery code we have. I have quite some doubts about the layering, but it's imo pretty sensible to centralize the wal logging code that plugins couldn't write. If you look at what e.g the _alloc callback currently gets passed. a) Relation: Important for metadata like TupleDesc, name etc. b) SeqTable entry: Important to setup state for cur/lastval c) Buffer of the tuple: LSN etc. d) HeapTuple itself: a place to store the tuples changed state And if you then look at what gets modified in that callback: Form_pg_sequence->amdata ->is_called ->last_value ->log_cnt SeqTable ->last SeqTable ->cached SeqTable ->last_valid You need is_called, last_valid because of our current representation of sequences state (which we could change, to remove that need). The rest contains values that need to be set depending on how you want your sequence to behave: * Amdata is obvious. * You need log_cnt to influence/remember how big the chunks are you WAL log at once are. Which e.g. you need to controlif want a sequence that doesn't leak values in crashes * cached is needed to control the per-backend caching. > Just as a thought-experiment, imagine that we decided to re-implement > sequences so that all the sequence values are stored in one big table, or > flat-file in the data directory, instead of the current implementation where > we have a one-block relation file for each sequence. If the sequence access > method API is any good, it should stay unchanged. That's clearly not the > case with the proposed API. I don't think we can entirely abstract away the reality that now they are based on relations and might not be at some later point. Otherwise we'd have to invent an API that copied all the data that's stored in struct RelationData. Like name, owner, ... Which non SQL accessible API we provide has a chance of providing that level of consistency in the face of fundamental refactoring? I'd say none. > Stepping back a bit and looking at this problem from a higher level, why do > you need to hack this stuff into the sequences? Couldn't you just define a > new set of functions, say bdr_currval() and bdr_nextval(), to operate on > these new kinds of sequences? Basically two things: a) User interface. For one everyone would need to reimplement the entire sequence DDL from start. For another it means it's hard to write (client) code that doesn't depend on a specific implementation. b) It's not actually easy to get similar semantics in "userspace". How would you emulate the crash-safe but non-transactional semantics of sequences without copying most of sequence.c? Without writing XLogInsert()s which we cannot do from a out-of-tree code? > You're not making much use of the existing sequence infrastructure, anyway, so it might be best to just keep the > implementation completely separate. If you need it for compatibility with > applications, you could create facade currval/nextval() functions that call > the built-in version or the bdr version depending on the argument. That doesn't get you very far: a) the default functions created by sequences are pg_catalog prefixed. So you'd need to hack up the catalog to get your own functions to be used if you want the application to work transparently. In which you need to remember the former function because you now cannot call it normally anymore. Yuck. b) One would need nearly all of sequence.c again. You need the state across calls, the locking, the WAL logging, DDL support. Pretty much the only thing *not* used would be nextval_internal() and do_setval(). Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
pgsql-hackers by date: