Re: Speed dblink using alternate libpq tuple storage - Mailing list pgsql-hackers
From: Tom Lane
Subject: Re: Speed dblink using alternate libpq tuple storage
Date:
Msg-id: 15437.1332454036@sss.pgh.pa.us
In response to: Re: Speed dblink using alternate libpq tuple storage (Marko Kreen <markokr@gmail.com>)
Responses:
  Re: Speed dblink using alternate libpq tuple storage
  Re: Speed dblink using alternate libpq tuple storage
List: pgsql-hackers
Marko Kreen <markokr@gmail.com> writes:
> [ patches against libpq and dblink ]

I think this patch needs a lot of work.

AFAICT it breaks async processing entirely, because many changes have been made that fail to distinguish "insufficient data available as yet" from "hard error". As an example, this code at the beginning of getAnotherTuple:

        /* Get the field count and make sure it's what we expect */
        if (pqGetInt(&tupnfields, 2, conn))
!           return EOF;

is considering three cases: it got a 2-byte integer (and can continue on), or there aren't yet 2 more bytes available in the buffer, in which case it should return EOF without doing anything, or pqGetInt detected a hard error and updated the connection error state accordingly, in which case again there is nothing to do except return EOF. In the patched code we have:

        /* Get the field count and make sure it's what we expect */
        if (pqGetInt(&tupnfields, 2, conn))
!       {
!           /* Whole the message must be loaded on the buffer here */
!           errmsg = libpq_gettext("protocol error\n");
!           goto error_and_forward;
!       }

which handles neither the second nor third case correctly: it thinks that "data not here yet" is a hard error, and then makes sure it is an error by destroying the parsing state :-(. And if in fact pqGetInt did log an error, that possibly-useful error message is overwritten with an entirely useless "protocol error" text.

I don't think the error return cases for the row processor have been thought out too well either. The row processor is not in charge of what happens to the PGresult, and it certainly has no business telling libpq to just "exit immediately from the topmost libpq function". If we do that we'll probably lose sync with the data stream and be unable to recover use of the connection at all. Also, do we need to consider any error cases for the row processor other than out-of-memory? If so it might be a good idea for it to have some ability to store a custom error message into the PGconn, which it cannot do given the current function API.

In the same vein, I am fairly uncomfortable with the blithe assertion that a row processor can safely longjmp out of libpq. This was never foreseen in the original library coding and there are any number of places that that might break, now or in the future. Do we really need to allow it? If we do, it would be a good idea to decorate the libpq functions that are now expected to possibly longjmp with comments saying so. Tracing all the potential call paths that might be aborted by a longjmp is an essential activity anyway.

Another design deficiency is PQskipResult(). This is badly designed for async operation because once you call it, it will absolutely not give back control until it's read the whole query result. (It'd be better to have it set some kind of flag that makes future library calls discard data until the query result end is reached.) Something else that's not very well-designed about it is that it might throw away more than just incoming tuples. As an example, suppose that the incoming data at the time you call it consists of half a dozen rows followed by an ErrorResponse. The ErrorResponse will be eaten and discarded, leaving the application no clue why its transaction has been aborted, or perhaps even the entire session cancelled. What we probably want here is just to transiently install a row processor that discards all incoming data, but the application is still expected to eventually fetch a PGresult that will tell it whether the server side of the query completed or not.
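[As an illustration of the flag-based discard approach described in the preceding paragraph, here is a small self-contained C model. ModelConn, handle_message, and the payload strings are hypothetical stand-ins for this sketch, not libpq's actual structures; only the message tags ('D' for DataRow, 'E' for ErrorResponse, 'C' for CommandComplete) come from the real v3 protocol.]

    #include <stdio.h>
    #include <stdbool.h>

    /*
     * Sketch only: skip mode is a per-connection flag consulted in the
     * normal message loop, so control returns to the caller between
     * messages, row data is thrown away while the flag is set, and an
     * ErrorResponse is still surfaced rather than eaten.
     */
    typedef struct
    {
        bool discard_rows;      /* transiently installed "skip" behavior */
    } ModelConn;

    static void
    handle_message(ModelConn *conn, char tag, const char *payload)
    {
        switch (tag)
        {
            case 'D':           /* DataRow */
                if (conn->discard_rows)
                    return;     /* discard the tuple, but stay in sync */
                printf("row: %s\n", payload);
                break;
            case 'E':           /* ErrorResponse: never swallow this */
                printf("error: %s\n", payload);
                break;
            case 'C':           /* CommandComplete: end of the result */
                conn->discard_rows = false;     /* skip mode ends here */
                printf("complete: %s\n", payload);
                break;
        }
    }

    int
    main(void)
    {
        ModelConn conn = { .discard_rows = true };

        handle_message(&conn, 'D', "(1,foo)");          /* discarded */
        handle_message(&conn, 'E', "deadlock detected"); /* reported */
        handle_message(&conn, 'C', "SELECT 1");         /* ends skipping */
        return 0;
    }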
In the dblink patch, given that you have to worry about elogs coming out of BuildTupleFromCStrings and tuplestore_puttuple anyway, it's not clear what the benefit is of using malloc rather than palloc to manage the intermediate buffers in storeHandler --- what that seems to mostly accomplish is increase the risk of permanent memory leaks. I don't see much value in per-column buffers either; it'd likely be cheaper to just palloc one workspace that's big enough for all the column strings together. And speaking of leaks, doesn't storeHandler leak the constructed tuple on each call, not to mention whatever might be leaked by the called datatype input functions?

It also seems to me that the dblink patch breaks the case formerly handled in materializeResult() of a PGRES_COMMAND_OK rather than PGRES_TUPLES_OK result. The COMMAND case is supposed to be converted to a single-column text result, and I sure don't see where that's happening now.

BTW, am I right in thinking that some of the hunks in this patch are meant to fix a pre-existing bug of failing to report the correct connection name? If so, it'd likely be better to split those out and submit them as a separate patch, instead of conflating them with a feature addition.

Lastly, as to the pqgetrow patch, the lack of any demonstrated test case for these functions makes me uncomfortable as to whether they are well designed. Again, I'm unconvinced that the error handling is good or that they work sanely in async mode. I'm inclined to just drop these for this go-round, and to stick to providing the features that we can test via the dblink patch.

			regards, tom lane
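[As an aside on the palloc-versus-malloc point above, here is a rough sketch of the single-workspace idea the review suggests: one palloc'd buffer for all column strings, plus a per-row memory context that is reset after each tuple. StoreState and store_one_row are hypothetical names for this illustration, and the two contexts are assumed to be created elsewhere; this is not the dblink patch's code. Since palloc'd memory is released automatically when elog(ERROR) unwinds, the permanent-leak risk of malloc goes away.]

    #include "postgres.h"
    #include "utils/memutils.h"

    /* Hypothetical per-query state for the sketch. */
    typedef struct StoreState
    {
        MemoryContext percall_ctx;  /* lives for the whole query */
        MemoryContext row_ctx;      /* reset after each row */
        char         *workspace;    /* one buffer for all column strings */
        Size          workspace_sz;
    } StoreState;

    static void
    store_one_row(StoreState *st, Size total_width)
    {
        MemoryContext oldcxt;

        /* Grow the shared workspace, kept in the long-lived context. */
        if (st->workspace_sz < total_width)
        {
            oldcxt = MemoryContextSwitchTo(st->percall_ctx);
            st->workspace = st->workspace
                ? repalloc(st->workspace, total_width)
                : palloc(total_width);
            st->workspace_sz = total_width;
            MemoryContextSwitchTo(oldcxt);
        }

        /* Build the tuple in the per-row context so it is reclaimed. */
        oldcxt = MemoryContextSwitchTo(st->row_ctx);
        /* ... copy column strings into st->workspace, then
         * BuildTupleFromCStrings + tuplestore_puttuple here; the
         * tuplestore copies the tuple into its own memory ... */
        MemoryContextSwitchTo(oldcxt);

        MemoryContextReset(st->row_ctx);    /* frees the row's allocations */
    }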