Thread: [Q] optimizing postgres for 'single client' / many small queries
Hi,
our application is using Postgres in a rather unusual way. It is used by
a GUI application to store several hundred thousand 'parameters'.
Basically it is used like a big INI file.

There are about 50 tables with various parameters.

The application typically goes like this:

select id, child_tb_key_id, <fields with parms> from tb1

then for each selected row above it does a select from the child table
(like the above), and so on -- many levels deep.

I know that this is not a proper way to use SQL; instead we should be
selecting many rows at once, joining them, etc. But it is what it is
now...

Queries are very fast though: Postgres reports that all the queries for
a typical 'load' operation take 0.8 seconds -- however, the overall time
the GUI user perceives is 8 seconds. Out of those 8 seconds a big chunk
is spent sending the SQL statements and receiving results back -- just
network traffic, parsing, etc. There are about 2400 queries in total
(just selects) during that period.

I am trying to figure out how I can tune the PG configuration to suit
such a contrived deployment of Postgres. For example, we do not mind PG
running on the same machine as the client app (it is connected via the
Qt SQL PG plugin, so it uses the PG native access library underneath).
Are there any optimizations that can be done for that?

Also, this is a single-client/single-connection system -- what
optimizations can be done for that?

And finally, since most of the queries are very quick index-based
selects, what can be done to optimize the traffic between PG and the
client?

Thank you in advance for any recommendations/pointers.

--
Vlad P
author of C++ ORM http://github.com/vladp/CppOrm/tree/master

--
http://www.fastmail.fm - Send your email first class
Hello,

are you sure you have to use PostgreSQL? Maybe SQLite or memcached is
better for your task.

regards
Pavel Stehule

2009/9/2 V S P <toreason@fastmail.fm>:
> Hi,
> our application is using Postgres in a rather unusual way.
> It is used by a GUI application to store several hundred
> thousand 'parameters'. Basically it is used like a big INI
> file.
> [...]
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Hi,
yes, I am sure I have to continue supporting Postgres at this time; it
would take enormous effort to change to something else. But yes, SQLite
or Tokyo Cabinet would in my view be good options (the decision was made
some time ago, unfortunately).

On Wed, 02 Sep 2009 19:49 +0200, "Pavel Stehule" <pavel.stehule@gmail.com> wrote:
> Hello
>
> Are you sure, so you have to use PostgreSQL - maybe SQLite or
> memcached is better for your task.
> [...]
"V S P" <toreason@fastmail.fm> writes:
> The application typically goes like this
> select id, child_tb_key_id, <fields with parms> from tb1
> then for each selected row above
> select from the child table do a select (like the above)
> and so on -- many levels deep

Seems like you need to fix your data representation so that this
operation can be collapsed into one query. The main problem looks
to be a bogus decision to have separate child tables rather than
one big table with an extra key column.

			regards, tom lane
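Tom's suggestion -- one table with an extra key column instead of
per-level child tables, so the whole tree comes back in one query -- can
be sketched as follows. This is a minimal stand-in using Python's stdlib
sqlite3 (the `params` table and its columns are hypothetical, not the
original schema); PostgreSQL 8.4+ supports the same WITH RECURSIVE form.

```python
import sqlite3

# In-memory stand-in schema: one 'params' table with a parent_id key
# column replaces the separate per-level child tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE params (
        id        INTEGER PRIMARY KEY,
        parent_id INTEGER,          -- NULL for top-level parameters
        name      TEXT,
        value     TEXT
    );
    INSERT INTO params VALUES
        (1, NULL, 'display',   NULL),
        (2, 1,    'width',     '800'),
        (3, 1,    'height',    '600'),
        (4, 2,    'min_width', '640');
""")

# The N+1 style from the original post: one query per row, one round
# trip per query (the slow pattern).
def load_n_plus_1(parent_id):
    rows = conn.execute(
        "SELECT id, name, value FROM params WHERE parent_id IS ?",
        (parent_id,)).fetchall()
    result = []
    for rid, name, value in rows:
        result.append((name, value, load_n_plus_1(rid)))  # extra query per row
    return result

# Collapsed: a single recursive query fetches the whole subtree in one
# round trip, no matter how many levels deep it goes.
def load_subtree(root_id):
    return conn.execute("""
        WITH RECURSIVE tree(id, parent_id, name, value) AS (
            SELECT id, parent_id, name, value FROM params WHERE id = ?
            UNION ALL
            SELECT p.id, p.parent_id, p.name, p.value
            FROM params p JOIN tree t ON p.parent_id = t.id
        )
        SELECT id, parent_id, name, value FROM tree
    """, (root_id,)).fetchall()

print(len(load_subtree(1)))  # whole 4-row subtree, one query
```

The client-side reassembly of the flat row list into a tree is cheap in
memory; the expensive part -- 2400 round trips -- is what the single
recursive query removes.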
Well, actually somebody has written a C++ ORM that allows one to do
things like this (just a sketch):

class L : public CDbCapable {
public:
    int prop1;
    int prop2;
};

class A : public CDbCapable {
public:
    QArray<L> list_of_props_xyz;
};

int main()
{
    A inst1;
    inst1.create_or_update_DbSchemaIfNeeded();
    inst1.readFromDb();
    // modify something
    inst1.writeToDb();
}

As is well known, C++'s lack of reflection prevents it from having
standardized data serialization libraries for files or databases, so an
in-house mechanism was developed to do the above. It took some time and
it is not possible to just yank it out.

Of course, internally in the ORM's implementation a somewhat
questionable decision was made: processing the arrays of 'children' for
a given instance requires separate SQL statements. That's where the
problem comes from. I understand what needs to be done to redesign the
approach, and that will take more time than is currently available.

Therefore, I just wanted to ask if there are some things in PG that I
can experiment with (local client/server communication via IPC, reducing
the cost of SQL parsing, any other possible tricks).

Thank you in advance

On Wed, 02 Sep 2009 14:26 -0400, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
> Seems like you need to fix your data representation so that this
> operation can be collapsed into one query. The main problem looks
> to be a bogus decision to have separate child tables rather than
> one big table with an extra key column.
> [...]
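On the "reducing the cost of SQL parsing" point: when the same handful
of SELECT shapes is executed 2400 times, PostgreSQL's server-side
prepared statements (PREPARE/EXECUTE in SQL, or PQprepare and
PQexecPrepared in libpq) let the server parse and plan each shape once.
A minimal sketch of the client-side pattern, using Python's stdlib
sqlite3 as a stand-in (table and column names are hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tb1 (id INTEGER PRIMARY KEY, child_tb_key_id INTEGER, parm TEXT);
    INSERT INTO tb1 VALUES (1, 10, 'a'), (2, 20, 'b'), (3, 30, 'c');
""")

# Anti-pattern: a fresh SQL string per lookup -- the server must parse
# (and, in PostgreSQL, plan) every one of the 2400 statements.
def lookup_unprepared(key):
    return conn.execute(
        "SELECT parm FROM tb1 WHERE id = %d" % key).fetchone()[0]

# Pattern: one parameterized statement reused for every lookup. In
# PostgreSQL this maps to
#   PREPARE get_parm(int) AS SELECT parm FROM tb1 WHERE id = $1;
#   EXECUTE get_parm(1);
# so parsing/planning happens once per statement shape, not per call.
QUERY = "SELECT parm FROM tb1 WHERE id = ?"
def lookup_prepared(key):
    return conn.execute(QUERY, (key,)).fetchone()[0]

print([lookup_prepared(k) for k in (1, 2, 3)])  # ['a', 'b', 'c']
```

This does not eliminate the per-query round trip, but it removes the
parse/plan overhead from each of the repeated small selects.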
"V S P" <toreason@fastmail.fm> writes:
> Well, actually somebody has written a C++ ORM
> [ that is causing all your problems and you say you can't discard ]

Just out of curiosity, does anyone know of any ORM anywhere that doesn't
suck? They seem to be uniformly awful, at least in terms of their
interfaces to SQL databases. If there were some we could recommend,
maybe people would be less stuck with these bogus legacy architectures.

			regards, tom lane
I do not know of any for C++. That's why I started my own (which is not
the one used for the problem I am having :-) ):
http://github.com/vladp/CppOrm

It works with PG 8.3+ and the VC++ compiler so far (support for more
platforms and DBs will be added in the future). My ORM is not really an
ORM, because I did not implement anything that would 'traverse' object
instance relationships (which is what the ORM in question here is
doing). Instead I just automagically generate SQL code for
inserts/updates/deletes for classes that map to tables (one-to-one).

The basic problem is that the C++ standards committee, in my view, just
sucks... I do not have any better words for it. It is because of the
lack of reflection (the ability to identify variable names and function
names at runtime) that an ORM, an HTTP session storage/retrieval
mechanism, or JSON/XML parsers that parse text right into class
instances cannot be implemented -- basically the things needed to deal
with 'typeless' data at runtime (such as XML/JSON/database query
results) and map that data onto C++ object instances. Which is, in the
high-level view, why C++ is not used for web development. Yes, there are
'attempts' in that area -- but they are all different, require quite a
bit of sophistication, and are not complete (the reflection mechanism I
implemented for my cpporm is not complete either).

If C++ supported reflection, then there would be C++_Hibernate,
C++_LINQ, C++_json, C++_xml, C++_HTTP, C++_HTTPSession and so on (and
no, they would not have been memory hogs -- thanks to now-standard
reference counting in C++ via shared_ptr and good use of allocators).

Sorry for the rant; still looking for any bright ideas on optimizing for
many small queries/local DB host situations.
Thanks

On Wed, 02 Sep 2009 14:45 -0400, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
> Just out of curiosity, does anyone know of any ORM anywhere that doesn't
> suck?
> [...]
On Wed, Sep 02, 2009 at 02:45:39PM -0400, Tom Lane wrote:
> Just out of curiosity, does anyone know of any ORM anywhere that
> doesn't suck? They seem to be uniformly awful, at least in terms of
> their interfaces to SQL databases. If there were some we could
> recommend, maybe people would be less stuck with these bogus legacy
> architectures.

Hibernate has the very nice feature of being able to get out of your
way. Properly used, it can keep completely out of the business of
making (wrong) guesses based on DDL, which is what ORMs often do.
DBIx::Class <http://search.cpan.org/dist/DBIx-Class/> has gone a long
way in the right direction.

Ones which (attempt to) dictate decisions about DDL are just off the
map. :P

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
On Wed, Sep 2, 2009 at 4:35 PM, David Fetter <david@fetter.org> wrote:
> Hibernate has the very nice feature of being able to get out of your
> way. Properly used, it can keep completely out of the business of
> making (wrong) guesses based on DDL, which is what ORMs often do.
> [...]

David, do you know how well these kinds of ORMs work when it comes to
mapping non-trivial schema designs? For example, how would these work
when creating a mapping for the multiple inheritance design that you've
blogged about earlier?

--
Regards,
Richard Broersma Jr.

Visit the Los Angeles PostgreSQL Users Group (LAPUG)
http://pugs.postgresql.org/lapug
On Thu, Sep 03, 2009 at 07:24:50AM -0700, Richard Broersma wrote:
> David, do you know how well these kinds of ORMs work when it comes to
> mapping non-trivial schema designs? For example, how would these
> work when creating a mapping for the multiple inheritance design
> that you've blogged about earlier?

If your mapper only does the job of mapping, you can choose
classes/objects and then map them to the appropriate, possibly
parameterized, SQL queries, which the DB people can then freely
rearrange. One nice feature of such a system is that the DBA and/or DB
developer has a way to know what the client code expects.

In OO terms, there are public interfaces--everything mentioned in the
ORM layer--and private interfaces--DDL, DML, and DCL--to the database.
Programmers who like to use object-oriented languages and methods
should be happy about this object-oriented approach to database
management, but for some reason, a lot of them don't understand that
the idea of public and private interfaces applies to what they (too
simplistically, much of the time) think of as "the persistence layer."

Cheers,
David.
Hi,

Tom Lane <tgl@sss.pgh.pa.us> writes:
> Just out of curiosity, does anyone know of any ORM anywhere that doesn't
> suck? They seem to be uniformly awful, at least in terms of their
> interfaces to SQL databases. If there were some we could recommend,
> maybe people would be less stuck with these bogus legacy architectures.

It seems that the people interested in ORMs are the ones who do not
want to tackle SQL... For people with some time to spend on the
possibility of finding a good ORM:
  http://blogs.tedneward.com/2006/06/26/The+Vietnam+Of+Computer+Science.aspx

A more practical discussion seems to be here (I still have to read it):
  http://omniti.com/seeds/orms-done-right

Regards,
--
dim

PS: In short, my advice is always to choose an ORM where it's easy to
bypass query generation, and to stick to not letting it generate SQL.
Sometimes basic CRUD is OK, though (INSERT/UPDATE/DELETE one object/row
at a time).
Tom Lane <tgl@sss.pgh.pa.us> writes:
> Just out of curiosity, does anyone know of any ORM anywhere that doesn't
> suck?
> [...]

I personally like SQLAlchemy - http://www.sqlalchemy.org/. Some of the
good things:

1. Table reflection, i.e. dynamic generation of mapped Python classes -
   great at development time - no mismatch between the db and
   client-side code.
2. Easily extensible - about 10 lines of code gives you an interface to
   GEOS geometries retrieved from PostGIS.
3. It seems to be possible to produce arbitrarily complex SQL
   statements. OK, they're usually less readable than SQL, but for
   dynamic query generation it definitely beats manually chopping up
   text.

Cheers,
Will T
On 2009-09-02, V S P <toreason@fastmail.fm> wrote:
> The application typically goes like this
>
> select id, child_tb_key_id, <fields with parms> from tb1
>
> then for each selected row above
> select from the child table do a select (like the above)
>
> and so on -- many levels deep

Rewriting it as a single query will speed up the operation.

> Queries are very fast though, Postgres reports that all the queries
> for a typical 'load' operation take 0.8 seconds -- however the
> overall time that the GUI user perceives is 8 seconds.

0.8s processing, 7.2s latency.

> For example, we do not mind PG running on the same machine
> as the client app (it is connected via the Qt SQL PG plugin, so it
> uses the PG native access library underneath).
>
> Are there any optimizations that can be done for that?

Try using libpq directly.

> Also this is a 'single' client/single connection system --
> what optimizations can be done for that?

Asynchronous queries, if you can know what to ask second before the
answer to the first query arrives.

> and finally since most of the queries are very quick index-based
> selects what can be done to optimize the traffic between pg and
> the client?

Unix-domain sockets.
But seriously, bite the bullet and redo it as a single query that
returns 2400 rows instead of 2400 queries that return one row each. Or
slurp all the tables into dictionaries (hash tables/arrays/whatever) in
your Qt application, and write your queries in C++ instead of SQL.
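The "slurp everything" suggestion above can be sketched like this: load
each table once with a single SELECT, index the rows by their parent key
in a dictionary, and do the nested parent/child traversal entirely in
client memory. A minimal stand-in using Python's stdlib sqlite3 (the
two-table schema here is hypothetical, not the original one):

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tb1 (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tb2 (id INTEGER PRIMARY KEY, tb1_id INTEGER, parm TEXT);
    INSERT INTO tb1 VALUES (1, 'engine'), (2, 'display');
    INSERT INTO tb2 VALUES (10, 1, 'rpm'), (11, 1, 'temp'), (12, 2, 'width');
""")

# One SELECT per *table* (2 queries total), not one per row.
parents = {rid: name for rid, name in conn.execute("SELECT id, name FROM tb1")}

children = defaultdict(list)            # parent id -> list of child parms
for cid, tb1_id, parm in conn.execute("SELECT id, tb1_id, parm FROM tb2"):
    children[tb1_id].append(parm)

# The nested traversal now runs against in-memory dictionaries, with no
# further round trips to the database.
tree = {name: children[rid] for rid, name in parents.items()}
print(tree)  # {'engine': ['rpm', 'temp'], 'display': ['width']}
```

With ~50 tables this turns ~2400 round trips into ~50, and the GUI's
existing recursive traversal logic can stay, just pointed at the
dictionaries instead of the connection.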