Re: psycopg3 and adaptation choices - Mailing list psycopg
From | Adrian Klaver |
---|---|
Subject | Re: psycopg3 and adaptation choices |
Date | |
Msg-id | 5d2207c6-cebb-082d-cf5b-a7a0fe8e58d4@aklaver.com Whole thread Raw |
In response to | Re: psycopg3 and adaptation choices (Daniele Varrazzo <daniele.varrazzo@gmail.com>) |
Responses |
Re: psycopg3 and adaptation choices
|
List | psycopg |
On 11/8/20 2:21 PM, Daniele Varrazzo wrote: > On Sun, 8 Nov 2020 at 20:35, Adrian Klaver <adrian.klaver@aklaver.com> wrote: > >> Alright I understand now. >> More below. >> >>> >>> In psycopg3 the idea is to use a more advanced protocol, which >>> separates query and parameters. It brings several benefits: can use >>> prepared statements (send a query once, several parameters later), >>> passing large data doesn't bloat the parser (the params don't hit the >>> lexer/parser), can use binary format (useful to pass large binary >>> blobs without escaping them in a textual form), the format of the data >>> is more homogeneous (no need to quoting), so we can use Python objects >>> in COPY instead of limiting the interface for the copy functions to >>> file-like objects only. >>> >>> Both in psycopg2 and 3 there is an adaptation from Python types to >>> Postgres string representation. In pg2 there is additional quoting, >>> because apart from numbers and bools you need to quote a literal >>> string to merge it to the query and make it syntactically valid. >> >> So the issue in the psycopg3 protocol is making the parameters that are >> passed in separately match up correctly in type to what the server is >> expecting(or can cast implicitly)? > > Yes, correct. What we have to choose is which Postgres oid to map to > each Python type. > > Sometimes the mapping is trivial (e.g. `datetime.date` -> `date` in > Postgres, `uuid.UUID` -> `uuid`...) > > Sometimes it might be ambiguous: is a `datetime.datetime` a > `timestamp` or a `timestamptz`? In some cases we don't care (here we > can say `timestamptz` no problem: if the Python datetime doesn't have > tzinfo, Postgres will use the `TimeZone` setting). > > Sometimes it's messy: what Python type corresponds to a Postgres > `jsonb`? It might be a dict, or a list, or types that have other > representations too (numbers, strings, bools). In this case, as in > psycopg2, there can be a wrapper, e.g. `Json`, to tell psycopg that > this dict, or list, or whatever else, must be jsonified for the db. > > When there are mismatches, sometimes the database cast rules help > (e.gi in the timestamp[tz] case). Sometimes not: if we say `text` to a > jsonb field, it will raise an error. Sometimes a cast is automatic on > inserting in a table but not on passing a function parameter. > > Numbers are messy, as they usually are: Python has int, float, > Decimal, Postgres has int2, int4, int8, float4, float8, numeric. The > mappings float -> float8 and Decimal -> numeric are more or less > straightforward. `int` is not, as in Python it's unbounded. If you say > `select 10` in psql, the server understands "unknown type, but a > number", and can try if either int* or numeric fit the context. But we > don't have the help from the syntax that psql has: because 10 doesn't > have quotes, Postgres is sure that it is a number, and not a string, > but executing query/params separately we lose that expressivity: we > cannot quote the strings and not the number. So choices are: > > 1. If we specify `numeric` or `int8` as oid, inserting in an int field > in a table will work ok, but some functions/operators won't (e.g. "1 What is not working here? >>> %s"). > 2. If we specify `int4` it would work for those few functions defined > as `integer`, but if we try to write a number that doesn't fit in 32 > bits into a Postgres bigint field I assume something will overflow > along the way, even if both python and postgres can handle it. > 3. If we specify `unknown` it might work more often, but > `cursor.execute("select %s", [10]) will return the string "10" instead > of a number. > > So I wonder what's the best compromise to do here: the less bad seems > 1. 3. might work in more contexts, but it's a very counterintuitive > behaviour, and roundtripping other objects (dates, uuid) works no > problem: they don't come back as strings. There is a lot to digest here. I'm going to have to do some thinking on this. > > -- Daniele > -- Adrian Klaver adrian.klaver@aklaver.com