autonomous transactions - Mailing list pgsql-hackers
From | Peter Eisentraut |
---|---|
Subject | autonomous transactions |
Date | |
Msg-id | 659a2fce-b6ee-06de-05c0-c8ed6a01979e@2ndquadrant.com Whole thread Raw |
Responses |
Re: autonomous transactions
Re: autonomous transactions Re: autonomous transactions Re: autonomous transactions Re: autonomous transactions background sessions |
List | pgsql-hackers |
I would like to propose the attached patch implementing autonomous transactions for discussion and review. This work was mostly inspired by the discussion about pg_background a while back [0]. It seemed that most people liked the idea of having something like that, but couldn't perhaps agree on the final interface. Most if not all of the preliminary patches in that thread were committed, but the user interface portions were then abandoned in favor of other work. (I'm aware that rebased versions of pg_background existing. I have one, too.) The main use case, in a nutshell, is to be able to commit certain things independently without having it affected by what happens later to the current transaction, for example for audit logging. My patch consists of three major pieces. (I didn't make them three separate patches because it will be clear where the boundaries are.) - A API interface to open a "connection" to a background worker, run queries, get results: AutonomousSessionStart(), AutonomousSessionEnd(), AutonomousSessionExecute(), etc. The communication happens using the client/server protocol. - Patches to PL/pgSQL to implement Oracle-style autonomous transaction blocks: AS $$ DECLARE PRAGMA AUTONOMOUS_TRANSACTION; BEGIN FOR i IN 0..9 LOOP START TRANSACTION; INSERT INTO test1 VALUES (i); IF i % 2 = 0 THEN COMMIT; ELSE ROLLBACK; END IF; END LOOP; RETURN 42; END; $$; This is very incomplete and has some open technical issues that I will discuss below. But those are all issues of PL/pgSQL, not really issues of how autonomous sessions work. Basically, a block that is declared with that pragma uses the autonomous C API instead of SPI to do its things. - Patches to PL/Python to implement a context manager for autonomous sessions (similar to how subtransactions work there): with plpy.autonomous() as a: for i in range(0, 10): a.execute("BEGIN") a.execute("INSERT INTO test1 (a) VALUES (%d)" % i) if i % 2 == 0: a.execute("COMMIT") else: a.execute("ROLLBACK") This works quite well, except perhaps some tuning with memory management and some caching and some refactoring. While the PL/pgSQL work is more of a top-level goal, I added the PL/Python implementation because it is easier to map the C API straight out to something more accessible, so testing it out is much easier. The main technical problem I had with PL/pgSQL is how to parse named parameters. If you're in PL/Python, say, you do plan = a.prepare("INSERT INTO test1 (a, b) VALUES ($1, $2)", ["int4", "text"]) and that works fine, because it maps straight to the client/server protocol. But in PL/pgSQL, you will want something like DECLARE x, y ... BEGIN INSERT INTO test1 (a, b) VALUES (x, y) When running in-process (SPI), we install parser hooks that allow the parser to check back into PL/pgSQL about whether x, y are variables and what they mean. When we run in an autonomous session, we don't have that available. So my idea was to extend the protocol Parse message to allow sending a symbol table instead of parameter types. So instead of saying, there are two parameters and here are their types, I would send a list of symbols and types, and the server would respond to the Parse message with some kind of information about which symbols it found. I think that would work, but I got lost in the weeds and didn't get very far. But you can see some of that in the code. If anyone has other ideas, I'd be very interested. Other than that, I think there are also other bits and pieces that are worth looking at, and perhaps have some overlap with other efforts, such as: - Refining the internal APIs for running queries, with more flexibility than SPI. There have recently been discussions about that. I just used whatever was in tcop/postgres.c directly, like pg_background does, and that seems mostly fine, but if there are other ideas, they would be useful for this, too. - An exception to the "mostly fine" is that the desirable handling of log_statement, log_duration, log_min_duration_statement for non-top-level execution is unclear. - The autonomous session API could also be useful for other things, such as perhaps implementing a variant of pg_background on top of them, or doing other asynchronous or background execution schemes. So input on that is welcome. - There is some overlap with the protocol handling for parallel query, including things like error propagation, notify handling, encoding handling. I suspect that other background workers will need similar facilities, so we could simplify some of that. - Client encoding in particular was recently discussed for parallel query. The problem with the existing solution is that it makes assign_client_encoding() require hardcoded knowledge of all relevant background worker types. So I tried a more general solution, with a hook. - I added new test files in the plpgsql directory. The main test for plpgsql runs as part of the main test suite. Maybe we want to move that to the plpgsql directory as well. - More guidance for using some of the background worker and shared memory queue facilities. For example, I don't know what a good queue size would be. - Both PL/pgSQL and PL/Python expose some details of SPI in ways that make it difficult to run some things not through SPI. For example, return codes are exposed directly by PL/Python. PL/pgSQL is heavily tied to the API flow of SPI. It's fixable, but it will be some work. I had originally wanted to hide the autonomous session API inside SPI or make it fully compatible with SPI, but that was quickly thrown out. PL/Python now contains some ugly code to make certain things match up so that existing code can be used. It's not always pretty. - The patch "Set log_line_prefix and application name in test drivers" (https://commitfest.postgresql.org/10/717/) is helpful in testing and debugging this. [0]: https://www.postgresql.org/message-id/flat/CA+Tgmoam66dTzCP8N2cRcS6S6dBMFX+JMba+mDf68H=KAkNjPQ@mail.gmail.com -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
pgsql-hackers by date: