Thread: Pipelining INSERTs using libpq
I would like to pipeline INSERT statements. The idea is to avoid waiting
for server round trips if the INSERT has no RETURNING clause and runs in a
transaction. In my case, the failure of an individual INSERT is not
particularly interesting (it's a "can't happen" scenario, more or less). I
believe this is how the X toolkit avoided network latency issues.

I wonder what's the best way to pipeline requests to the server using the
libpq API. Historically, I have used COPY FROM STDIN instead, but that
requires (double) encoding and some client-side buffering plus heuristics
if multiple tables are filled.

It does not seem possible to use the asynchronous APIs for this purpose,
or am I missing something?

--
Florian Weimer / Red Hat Product Security Team
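For reference, a minimal sketch of the COPY FROM STDIN approach described
above, using libpq's copy functions. The table "events", its columns, and
the row count are made up for illustration, and error handling is
abbreviated.

/* Sketch: bulk-loading rows with COPY FROM STDIN via libpq.
 * Table "events" and its columns are hypothetical; error
 * handling is abbreviated. */
#include <stdio.h>
#include <libpq-fe.h>

static void copy_rows(PGconn *conn)
{
    PGresult *res = PQexec(conn, "COPY events (id, payload) FROM STDIN");
    if (PQresultStatus(res) != PGRES_COPY_IN) {
        fprintf(stderr, "COPY failed: %s", PQerrorMessage(conn));
        PQclear(res);
        return;
    }
    PQclear(res);

    for (int i = 0; i < 1000; i++) {
        char line[64];
        /* Text-format COPY rows: tab-separated columns, newline-terminated.
         * This is the extra encoding layer mentioned above: values must be
         * escaped for COPY on top of their own formatting. */
        int len = snprintf(line, sizeof line, "%d\tpayload-%d\n", i, i);
        if (PQputCopyData(conn, line, len) != 1) {
            fprintf(stderr, "PQputCopyData: %s", PQerrorMessage(conn));
            break;
        }
    }

    if (PQputCopyEnd(conn, NULL) == 1) {
        res = PQgetResult(conn);
        if (PQresultStatus(res) != PGRES_COMMAND_OK)
            fprintf(stderr, "COPY did not finish: %s", PQerrorMessage(conn));
        PQclear(res);
    }
}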
On Fri, Dec 21, 2012 at 4:31 AM, Florian Weimer <fweimer@redhat.com> wrote:
> I would like to pipeline INSERT statements. The idea is to avoid waiting
> for server round trips if the INSERT has no RETURNING clause and runs in a
> transaction. In my case, the failure of an individual INSERT is not
> particularly interesting (it's a "can't happen" scenario, more or less). I
> believe this is how the X toolkit avoided network latency issues.
>
> I wonder what's the best way to pipeline requests to the server using the
> libpq API. Historically, I have used COPY FROM STDIN instead, but that
> requires (double) encoding and some client-side buffering plus heuristics
> if multiple tables are filled.
>
> It does not seem possible to use the asynchronous APIs for this purpose,
> or am I missing something?

How you attack this problem depends a lot on whether all the data you want
to insert is available at once or you have to wait for it from some actor
on the client side. The purpose of the asynchronous API is to allow
client-side work to continue while the server is busy with the query. So it
would only help in your case if there were some other processing you needed
to do to gather the data and/or prepare the queries. Maybe then you'd
PQsend multiple insert statements with a single call.

merlin
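A minimal sketch of the batching idea Merlin describes: several INSERTs
sent in a single PQsendQuery() call, so the whole batch costs one round
trip. The table "events" and the literal values are hypothetical; real
code would escape values (e.g. with PQescapeLiteral) or use parameters.

/* Sketch: batch several INSERTs into one PQsendQuery() call.
 * Table "events" and the values are made up for illustration. */
#include <stdio.h>
#include <libpq-fe.h>

static void send_batch(PGconn *conn)
{
    const char *batch =
        "BEGIN;"
        "INSERT INTO events VALUES (1, 'a');"
        "INSERT INTO events VALUES (2, 'b');"
        "INSERT INTO events VALUES (3, 'c');"
        "COMMIT";

    if (!PQsendQuery(conn, batch)) {
        fprintf(stderr, "PQsendQuery: %s", PQerrorMessage(conn));
        return;
    }

    /* One PGresult comes back per statement; drain them all. */
    PGresult *res;
    while ((res = PQgetResult(conn)) != NULL) {
        if (PQresultStatus(res) != PGRES_COMMAND_OK)
            fprintf(stderr, "statement failed: %s",
                    PQresultErrorMessage(res));
        PQclear(res);
    }
}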
On 12/21/2012 03:29 PM, Merlin Moncure wrote:
> How you attack this problem depends a lot on whether all the data you
> want to insert is available at once or you have to wait for it from some
> actor on the client side. The purpose of the asynchronous API is to
> allow client-side work to continue while the server is busy with the
> query.

The client has only very little work to do until the next INSERT.

> So it would only help in your case if there were some other processing
> you needed to do to gather the data and/or prepare the queries. Maybe
> then you'd PQsend multiple insert statements with a single call.

I want to use parameterized queries, so I'll have to create an INSERT
statement which inserts multiple rows. Given that it's still stop-and-wait
(even with PQsendQueryParams), I can get through at most one batch per RTT,
so the number of rows would have to be rather large for a cross-continental
bulk load. It's probably doable for local bulk loading.

Does the wire protocol support pipelining? The server doesn't have to do
much to implement it. It just has to avoid discarding unexpected bytes
after the current frame and queue them for subsequent processing instead.

(Sorry if this message arrives twice.)

--
Florian Weimer / Red Hat Product Security Team
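A rough sketch of the multi-row parameterized INSERT described above: the
statement text "INSERT INTO events VALUES ($1,$2),($3,$4),..." is built at
runtime and executed with PQexecParams(). The table name, batch size, and
fixed buffer size are assumptions for illustration.

/* Sketch: one parameterized INSERT carrying many rows per round trip.
 * Builds "INSERT INTO events VALUES ($1,$2),($3,$4),..." at runtime.
 * Table name, batch size, and buffer size are assumptions. */
#include <stdio.h>
#include <libpq-fe.h>

#define ROWS 100
#define COLS 2

static int insert_batch(PGconn *conn, char *values[ROWS][COLS])
{
    char sql[8192];
    const char *params[ROWS * COLS];
    int off = snprintf(sql, sizeof sql, "INSERT INTO events VALUES ");

    for (int r = 0; r < ROWS; r++) {
        /* Append "($1,$2)", "($3,$4)", ... and bind the row's values. */
        off += snprintf(sql + off, sizeof sql - off, "%s($%d,$%d)",
                        r ? "," : "", r * COLS + 1, r * COLS + 2);
        params[r * COLS + 0] = values[r][0];
        params[r * COLS + 1] = values[r][1];
    }

    /* Still stop-and-wait: the whole batch is one query and one RTT. */
    PGresult *res = PQexecParams(conn, sql, ROWS * COLS, NULL,
                                 params, NULL, NULL, 0);
    int ok = (PQresultStatus(res) == PGRES_COMMAND_OK);
    if (!ok)
        fprintf(stderr, "INSERT failed: %s", PQresultErrorMessage(res));
    PQclear(res);
    return ok;
}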
Florian Weimer <fweimer@redhat.com> writes:
> Does the wire protocol support pipelining?

Yes, but libpq doesn't really expose the capability, because it's too
simple-minded to deal with more than one query in flight.

			regards, tom lane
On Sunday, December 23, 2012, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Florian Weimer <fweimer@redhat.com> writes:
>> Does the wire protocol support pipelining?
>
> Yes, but libpq doesn't really expose the capability, because it's too
> simple-minded to deal with more than one query in flight.
>
> regards, tom lane
You can do it with libpqtypes via array of records...
merlin
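libpqtypes itself binds the arrays natively; as a rough illustration of
the same idea without that dependency, here is a sketch using plain libpq
that passes parallel arrays (rather than a true array of records) in a
single call and expands them into rows server-side. The table, columns,
and array literals are hypothetical.

/* Sketch: pass whole columns as array parameters in one call and let
 * the server expand them into rows. Plain libpq with parallel arrays,
 * standing in for libpqtypes' array-of-records approach. Table
 * "events" and the array literals are hypothetical. */
#include <stdio.h>
#include <libpq-fe.h>

static void insert_via_arrays(PGconn *conn)
{
    /* Array literals in text format; the i-th elements of the two
     * arrays pair up into one row (arrays must be the same length). */
    const char *params[2] = {
        "{1,2,3}",               /* ids */
        "{\"a\",\"b\",\"c\"}"    /* payloads */
    };

    PGresult *res = PQexecParams(conn,
        "INSERT INTO events (id, payload) "
        "SELECT unnest($1::int4[]), unnest($2::text[])",
        2, NULL, params, NULL, NULL, 0);

    if (PQresultStatus(res) != PGRES_COMMAND_OK)
        fprintf(stderr, "INSERT failed: %s", PQresultErrorMessage(res));
    PQclear(res);
}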