Re: Connection function - Mailing list psycopg
From | Adrian Klaver |
---|---|
Subject | Re: Connection function |
Date | |
Msg-id | 4F6F6E20.40708@gmail.com Whole thread Raw |
In response to | Re: Connection function (Bill House <wch-tech@house-grp.net>) |
Responses |
Re: Connection function
|
List | psycopg |
On 03/25/2012 08:14 AM, Bill House wrote: > On 03/24/2012 08:25 PM, Adrian Klaver wrote: >> On 03/24/2012 05:03 PM, Bill House wrote: >>> Hello all, >>> >>> I am very new to postgresql and python, but I have written a function to >>> retrieve some admin data from the server. >>> >>> I think this is more of a python question rather than a psycopg2 >>> question, but the context is psycopg2. >>> >>> I was wondering if this is the right place to post it and ask for a >>> critique? >> >> I would say yes, though I am not sure what the question is:)? >> >>> >>> Thanks, >>> >>> Bill >>> >> >> > Thanks, > > Sometimes it is helpful to know how many records one is working with. > > From my reading in the psycopg2 documentation (2.4.4) and > experimentation, I think that this information is only available if one > uses an unnamed cursor (in which case all the data will have been > fetched to the client) or with an named cursor (but only after all the > data has been fetched). > > If there is a large amount of data, this may be problematic. > > Am I wrong about this? If so, please let me know. See comments in line. > > Anyway, the way I approached this issue was to count the records I was > interested in and return the value. > > The code below works. > > I am interested in a critique of it from more experienced persons. Are > there more elegant or efficient ways to do what I have done? > > In particular, I am interested in: > 1) the python string substitution technique I used That works, though it also possible to use dictionaries for the Psycopg strings see here: http://initd.org/psycopg/docs/usage.html#query-parameters I find the dict form easier to use for many parameters because it is self documenting. The rest of the string substitution seems to be tied to using exec which I do not think is necessary, see below for more. > 2) the choice of the postgresql source of data I was seeking (the > record count) > 3) if there was an existing function I overlooked which does the > same thing (in psycopg2 or postgresql) > > Regarding # 2 above, I had been relying on the table: > pg_stat_user_tables and the n_live_tup value. This item has shown the > correct count for many days. Today, after updating my machines and > rebooting, the count is 0 and I am not really sure why. Records are > still there; and even though I have selected data from the database, the > count is still 0. The above are system tables/views used by the statistics mechanism of Postgres. For a starting point on this look at: http://www.postgresql.org/docs/9.0/interactive/sql-analyze.html Short version running Analyze will repopulate the tables. > > So I took the more direct but time consuming approach of just counting > the records in the table. That is the only way to get a 'true' count due to the MVCC nature of Postgres. There has been a recent optimization of that, but I cannot remember what version of Postgres has it. > > This will be better anyway because many times one will not be selecting > all the records in a table, so this function can be expanded to include > a parameter for the selection criteria so that that identical criteria > may be used again in the work portion of any operation. > > Thanks in advance for your input. > > Bill > > code (indent 2): > ------------------------------------------------------- > #!/usr/bin/python > # z_psycopg2_row_cnt7f.py > # A program to illustrate the methods of creating or supplying runtime > variables > # to functions using psycopg2 functions > > import sys > import os > > import psycopg2 > import psycopg2.extras > import psycopg2.extensions > > #for string generator > import string > import random > > con = None > > def gen_random_str(str_len): > """generates a random string str_len long - > randomly chooses and assembles a number of characters from the digits > and the upper and lower case alphabet """ > rand_str = ''.join(random.choice(string.ascii_uppercase + > string.digits + string.ascii_lowercase) for x in range(str_len)) > return rand_str > > def rec_cnt_func(conn, table_name): > """Count the records selected from the table, right now all of them""" > #Generate a random string to use as a curson name (short right now for > debugging purposes) > #The commands following will use this string > rand_curs_name = gen_random_str(3) > #Substitute data into a command string and execute > #create the cursor > exec "%s = conn.cursor(cursor_factory=psycopg2.extras.DictCursor)" % > (rand_curs_name,) > #count the records in the table > exec "%s.execute(\"SELECT COUNT (*) FROM %s;\")" % (rand_curs_name, > table_name) Not sure why you are using exec? Why not?: cur = conn.cursor(cursor_factory=psycopg2.extras.DictCursor,name=rand_curs_name) Same for below. > #fetch the data > exec "row = %s.fetchone()" % (rand_curs_name,) > #All done now, throw the cursor away > exec "%s.close()" % (rand_curs_name,) > #return the data to the calling program > return row[0] > > > def main(): > #supply your connection string here > current_dsn = "host=xx" > > try: > con = psycopg2.connect(current_dsn) > > #supply your table name here > table_name_in = "xxxx" > > #Call the function which will retrieve the data from postgresql > rec_cnt_val = rec_cnt_func(con, table_name_in) > > #print the formatted data in an informative way > print "Records in table: {}: {:> 20,}".format(table_name_in, > rec_cnt_val) > > except psycopg2.DatabaseError, e: > print "psycopg2 reports error: {}".format(e) > sys.exit(1) > > finally: > > if con: > con.close() > return > > if __name__ == '__main__': > main() > ----------------------------------------------------- > end code > -- Adrian Klaver adrian.klaver@gmail.com