Thread: Using libpq, floats and binary data
Hi Guys,
I have just started to use the libpq interface to PostgreSQL, and was
wondering if someone with some experience could explain the 'best
practices' :)

I am preparing a query, then submitting it with binary data. Six values
are ints, and one is a float.

For the ints, I understand that since I need network byte ordering, I
have to pump them through htonl(). The same is also true for the floats,
I assume, so I have done some dirty, dirty casting, but this feels like
a source of problems later on. It does the correct thing, but is there a
better way of doing it?

Mike

char qString[] = "INSERT INTO raw_results (ca,kf,ks,lk,na,iinj,
AP1height) VALUES( $1::int4, $2::int4, $3::int4, $4::int4, $5::int4,
$6::int4, $7::float4)";

PGresult* pPreparedQuery = PQprepare(pDB, "InsertQ", qString, nParams, NULL);

float AP1Height = 3.141;
/* TROUBLE AHEAD! */
if( sizeof(float) != sizeof(int32_t) ) throw MyException();
int32_t AP_int = * ( (int32_t*) &AP1Height );
int32_t AP_intSwapped = htonl(AP_int);
float AP1HeightSwapped = *( (float*) &AP_intSwapped );
/* TROUBLE OVER */

int32_t ca  = htonl(100);
int32_t kf  = htonl(200);
int32_t ks  = htonl(300);
int32_t lk  = htonl(400);
int32_t na  = htonl(500);
int32_t iinj = htonl(600);

const char* values[nParams] = {
    (char*)&ca,
    (char*)&kf,
    (char*)&ks,
    (char*)&lk,
    (char*)&na,
    (char*)&iinj,
    (char*)&AP1HeightSwapped,
};
int lengths[nParams] = {
    sizeof(ca),
    sizeof(kf),
    sizeof(ks),
    sizeof(lk),
    sizeof(na),
    sizeof(iinj),
    sizeof(AP1HeightSwapped),
};

int binary[nParams] = {1,1,1,1,1,1,1};

PGresult* res = PQexecPrepared( pDB, "InsertQ", nParams, values,
lengths, binary, 0);
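[Editor's note: a less fragile way to do the float byte swap, as a minimal
sketch (the function name pack_float4 is not from the thread). Going through
memcpy instead of pointer casts sidesteps the strict-aliasing trouble the
TROUBLE comments flag, while htonl still does the actual swap:]

```c
/* Sketch only: pack a host float into the 4 network-order bytes that a
   binary $n::float4 parameter expects. memcpy is the well-defined way
   to reinterpret the bits; the pointer casts above technically violate
   strict aliasing. Assumes IEEE-754 single precision (4 bytes). */
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>   /* htonl, ntohl */

static void pack_float4(float f, char out[4])
{
    uint32_t u;
    assert(sizeof f == sizeof u);   /* 4-byte float expected */
    memcpy(&u, &f, sizeof u);       /* bit-copy, no aliasing issues */
    u = htonl(u);                   /* host -> network byte order   */
    memcpy(out, &u, sizeof u);
}
```

The 4-byte buffer then goes into values[6] with lengths[6] = 4 and
binary[6] = 1, with no byte-swapped float object ever materialised.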
On Wed, Sep 15, 2010 at 12:02 PM, Michael Hull <mikehulluk@googlemail.com> wrote:
> I am preparing a query, then submitting it with binary data. 6 values
> are ints, and one is a float.
> [original post snipped]

there is a much better way -- libpqtypes:
http://libpqtypes.esilo.com/

char qString[] = "INSERT INTO raw_results (ca,kf,ks,lk,na,iinj,AP1height)
VALUES( %int4, %int4, %int4, %int4, %int4, %int4, %float4)";

res = PQexecf(conn, qString, ca, kf, ks, lk, na, iinj, AP1Height);

getting data out is similarly easy.

merlin
Hi,
Inserting many rows is almost always IO bound. Converting ints and floats to text is CPU bound, and really fast anyway. To speed things up, first look at things like indexes, how often you need to COMMIT, or using COPY. Only then look at prepared statements and binary transfer modes. Otherwise it's simply not worth the headache.
Groeten, Arjen
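[Editor's note: to make the COPY suggestion concrete, a minimal sketch
(table and columns borrowed from the original post; format_copy_row is an
invented name, and the libpq calls appear only in comments since they need
a live connection). Text-format COPY takes one tab-separated line per row:]

```c
/* Sketch: batch rows through COPY ... FROM STDIN in text format.
   One tab-separated line per row; far fewer round trips than
   per-row INSERTs. */
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format one raw_results row as a COPY text line.
   Returns the number of bytes written (excluding the NUL). */
static int format_copy_row(char *buf, size_t buflen,
                           int ca, int kf, int ks, int lk,
                           int na, int iinj, float ap1height)
{
    return snprintf(buf, buflen, "%d\t%d\t%d\t%d\t%d\t%d\t%g\n",
                    ca, kf, ks, lk, na, iinj, ap1height);
}

/* Feeding it to the server would look roughly like:
 *   PQexec(pDB, "COPY raw_results (ca,kf,ks,lk,na,iinj,ap1height) "
 *               "FROM STDIN");
 *   for each row: PQputCopyData(pDB, buf, len);
 *   then:         PQputCopyEnd(pDB, NULL);
 */
```

No htonl, no casting, and the per-statement overhead is paid once for the
whole batch instead of once per row.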
On Wed, Sep 15, 2010 at 6:02 PM, Michael Hull <mikehulluk@googlemail.com> wrote:
> [original post snipped]
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
On Wed, Sep 15, 2010 at 6:18 PM, Arjen Nienhuis <a.g.nienhuis@gmail.com> wrote:
> Inserting many of rows is almost always IO bound. Converting ints and floats
> to text is CPU bound and really fast anyway. [snip]

That's an awfully broad statement, and untrue... many applications are
CPU bound. It's easier to scale storage than CPU after a point. Also,
solid state storage is going to become increasingly common going
forward.

Not all type parsing on receive is as trivial as you claim; timestamps
and bytea, for example, are significantly cheaper to send in the binary
wire format. Anyway, libpqtypes gives you all the advantages without the
fuss. If you are really looking to shave cycles, we allow you to prepare
the format string as well as the statement before sending it. We wrote
this interface for a reason: I'd say it cuts query time by around 20% on
average, in addition to the other advantages it provides.

merlin
On Thu, Sep 16, 2010 at 12:37 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
> That's an awfully broad statement, and untrue... many applications are
> cpu bound. It's easier to scale storage than cpu after a point. Also,
> solid state storage is going to become increasingly common moving
> forwards.
>
> Not all type receiving parsing is trivial as you claim; timestamps and
> bytea for example are significantly cheaper to send in binary wire
> format. Anyways, libpqtypes gives you all the advantages without all
> the fuss. If you are really looking to shave cycles we allow you to
> prepare the format string as well as prepare the statement before
> sending it. We wrote this interface for a reason: I'd say on average
> it cuts down query time around 20% in addition to the other
> advantages it provides.
>
> merlin
I think we can agree on one thing: trying to do it without libpqtypes is a bad idea.
Groeten, Arjen
On Wed, Sep 15, 2010 at 6:52 PM, Arjen Nienhuis <a.g.nienhuis@gmail.com> wrote:
> I think we can agree on one thing: trying to do it without libpqtypes is a
> bad idea.

Yeah, except in the special case of bytea, where binary lets you avoid
the escaping/encoding and its fairly large efficiency hit. The
performance benefit of sending ints in binary vs text is barely
measurable. Timestamps are the biggest win of the common types,
especially if you are starting from something that is itself a binary
time representation.
Another reason a lot of people try to eke every cycle they can out of a
query (preparing the statement and so on) is to cut query execution time
as much as possible, even in cases where it doesn't make much difference
in terms of overall system load. This can happen, for example, when you
have generated SQL that crawls the server record by record. We all know
this is often not the best approach, but when working under those
constraints the method of query execution becomes quite important: exec
prepared in binary vs exec params in text can result in a greater than
50% reduction in query time in such cases (some of that benefit is on
the client). If you're firing off big queries (in terms of running
time), obviously all this matters very little.

merlin
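[Editor's note: to make the timestamp point concrete, a hedged sketch of
the binary encoding, assuming the server uses integer datetimes (the
default on modern versions; pack_pg_timestamp is an invented name). The
binary timestamp is a big-endian int64 counting microseconds from
2000-01-01 00:00:00, which is 946684800 seconds after the Unix epoch:]

```c
/* Sketch: encode a Unix-epoch microsecond count as the 8 network-order
   bytes of a binary timestamp parameter. Assumes integer datetimes on
   the server side. */
#include <assert.h>
#include <stdint.h>

#define PG_EPOCH_USEC (INT64_C(946684800) * 1000000)  /* 2000-01-01 */

static void pack_pg_timestamp(int64_t unix_usec, unsigned char out[8])
{
    /* Shift the epoch, then emit the 8 bytes most-significant first. */
    uint64_t v = (uint64_t)(unix_usec - PG_EPOCH_USEC);
    for (int i = 7; i >= 0; --i) {
        out[i] = (unsigned char)(v & 0xff);
        v >>= 8;
    }
}
```

Compare that with the text format, where the client has to print and the
server has to parse a string like '2010-09-15 18:02:00', which is where
the binary savings Merlin describes come from.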