Thread: Prepared statement vs. parameterized
I've only used the PG JDBC driver for pretty standard things in the past, and on pretty small db's. However, the current project I'm working on will be using rather large data sets and likely make use of some more advanced features. So, I have a few questions related to JDBC parameterized statements and their relation to prepared statements on the PG backend. It appears that parameterized statements are always prepared- is this correct? If so, why (JDBC spec, performance..)? As a general rule, we always use parameterized statements (courtesy of Hibernate). So, if the above is true, then all of our queries are also prepared. After reading previous discussions on -jdbc and -performance, it appears that this is very often not so good for performance since the PG backend basically has to plan a query without values. Now, on a small database, this wouldn't be much of an issue, especially as the planner gets better at picking reasonable plans without values (seems pretty tricky to me..). However, as your data sizes grow (> 500 million records or so) I would think it's critical to have the values when planning. (I'm starting a project with very large data sizes, so I will soon be testing some of this..) So my question is, given the above, would it possibly makes sense to be able to use parameterized statements without preparing them? Maybe this could be a user configurable option? (if not, the other request is really more for the back-end to improve the performance of prepared statements... which seems quite a bit more complicated). Thanks, Bucky
On Fri, 27 Oct 2006, Bucky Jordan wrote: > I've only used the PG JDBC driver for pretty standard things in the > past, and on pretty small db's. However, the current project I'm working > on will be using rather large data sets and likely make use of some more > advanced features. So, I have a few questions related to JDBC > parameterized statements and their relation to prepared statements on > the PG backend. > > It appears that parameterized statements are always prepared- is this > correct? If so, why (JDBC spec, performance..)? They are always prepared (using the V3 protocol), but there are two methods of prepared statement execution. A statement can be prepared with an explicit name (that is good for the rest of the connection) or on the "unnamed" statement (that is good until reused). Statements prepared on the unnamed statement use the bound parameter values for planning purposes so you should get performance equivalent to a non-parameterized query. Statements prepared with explicit names do use generic planning placeholders and can get bad query plans if you have uneven data distributions. When a PreparedStatement object is first created it will use the unnamed statement for its first several uses. Once the driver detects that it is being reused again and again it will switch over to a named statement. This execution count is kept in the PreparedStatement object and no pooling is done, so you have to look at your PreparedStatement object lifetime to determine if it's being reused or created again. The switchover point is configured (or disabled) via a URL parameter prepareThreshold[1] or may be set on each individual connection or statement by using pg specific java code. Kris Jurka [1] http://jdbc.postgresql.org/documentation/81/connect.html#connection-parameters