Home > mailing lists

Re: RFC: Async query processing - Mailing list pgsql-hackers

From	Claudio Freire
Subject	Re: RFC: Async query processing
Date	January 3, 2014 17:06:18
Msg-id	CAGTBQpahBeCjUnrA=vj_8w=fGSKejGcUMitArZa+uuFKJ3LHOA@mail.gmail.com Whole thread
In response to	Re: RFC: Async query processing (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: RFC: Async query processing Re: RFC: Async query processing Re: RFC: Async query processing Re: RFC: Async query processing
List	pgsql-hackers

Tree view

On Fri, Jan 3, 2014 at 12:20 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Claudio Freire <klaussfreire@gmail.com> writes:
>> On Fri, Jan 3, 2014 at 10:22 AM, Florian Weimer <fweimer@redhat.com> wrote:
>>> Loading data into the database isn't such an uncommon task.  Not everything
>>> is OLTP.
>
>> Truly, but a sustained insert stream of 10 Mbps is certainly way
>> beyond common non-OLTP loads. This is far more specific than non-OLTP.
>
> I think Florian has a good point there, and the reason is this: what
> you are talking about will be of exactly zero use to applications that
> want to see the results of one query before launching the next.  Which
> eliminates a whole lot of apps.  I suspect that almost the *only*
> common use case in which a stream of queries can be launched without
> feedback is going to be bulk data loading.  It's not clear at all
> that pipelining the PQexec code path is the way to better performance
> for that --- why not use COPY, instead?

You're forgetting ORM workloads.

ORMs can usually plan the inserts to be in a sequence that both don't
require feedback (except the knowledge that they were successful), and
that do not violate constraints.

Flushing a whole object hierarchy for instance, can be done without
feedback. Not even serial columns need feedback, since many ORMs
(SQLAlchemy, Hibernate) support allocation of ID sequences in batches
(by issuing a proper select nextval).

I agree, that with the proposed API, it's too error prone to be
useful. But I also think, if the API is simple and fool-proof enough,
it could be "build them and they will come". I know I'll be happy to
implement support for SQLAlchemy (since it will benefit me), if the
API resembles the proposition below (at least in simplicity).

Per-query expectations could be such a thing. And it can even work with PQexec:

PQexec(con, "SELECT nextval('a_id_seq') FROM generate_series(1,10);");
--read--
PQexec(con, "SELECT nextval('b_id_seq') FROM generate_series(1,10);");
--read--
PQexec(con, "INSERT INTO a (...);", PQEXPECT_NO_RESULT | PQASYNC_CORK);
PQexec(con, "INSERT INTO b (...);", PQEXPECT_NO_RESULT | PQASYNC_CORK);
PQexec(con, "INSERT INTO a (...);", PQEXPECT_NO_RESULT | PQASYNC_CORK);
PQexec(con, "INSERT INTO b (...);", PQEXPECT_NO_RESULT | PQASYNC_CORK);
PQexec(con, "INSERT INTO a (...);", PQEXPECT_NO_RESULT | PQASYNC_CORK);
PQexec(con, "INSERT INTO b (...);", PQEXPECT_NO_RESULT | PQASYNC_CORK);
... 9 times...
PQexec(con, "INSERT INTO a (...);", PQEXPECT_NO_RESULT | PQASYNC_CORK);
PQexec(con, "INSERT INTO b (...);", PQEXPECT_NO_RESULT | PQASYNC);
do {  // do something useful
} while (PQflush());

Here, the PQASYNC flag would temporarily switch to non-blocking I/O,
and buffer what cannot be sent. PQASNC_CORK, would only buffer (only
send if the buffer is full). After any ASYNC call, PQflush would be
necessary (to flush the send queue and to consume the expected
responses), but I can imagine any synchronous call (PQexec,
PQsendQuery or whatever) could detect a non-empty buffer and just
blockingly flush right there.

This can benefit many useful patterns. ORM flush, is one, if there can
be preallocation of IDs (which I know at least SQLAlchemy and
Hibernate both support).

Execute-many of prepared statements is another one, quite common.

I'm not sure what would happen if one of the queries returned an
error. If in a transaction, all the following queries would error out
I'd imagine. If not, they would simply be executed blindly.. am I
correct?

pgsql-hackers by date:

From: Merlin Moncure
Date: 03 January 2014, 16:31:45
Subject: Re: RFC: Async query processing

From: Merlin Moncure
Date: 03 January 2014, 17:22:55
Subject: Re: RFC: Async query processing

Re: RFC: Async query processing - Mailing list pgsql-hackers

Previous

Next