Thread: Problem using COPY command to load data
I'm not sure if this is the write place to post this question or not, but I hope someone can help me out. I am using the copy_from command from the python psycopg2 library to do some bulk data loading of a postgres database. This had been working OK, until my script barfed because I was being careless, and it seemed to leave the database in a strange state that I can't recover from. Here is what happened: I am connecting to postgres via TCP, but I happen to be running the script on the same host ast the server right now. I had forgotten to clear the contents of one table before loading it from a file and the copy_from command exited with error because of conflicting primary keys. After this any time I attempted to execute copy_from it would just hang (after successfully connecting to the database and doing a series of DELETE FROM SQL commands). I just see something like this if I look through the output of ps: postgres 30432 1.7 1.2 42884 12512 ? Ss 13:16 0:00 postgres: biomart mgi_biomart_intermediate 127.0.1.1(41831) COPY The script hangs forever, and I don't see anything in the postgres log. If I switched my script to connect through a unix socket (remove host='hostname' from the psycopg2.connect() call), then it works again. I've restarted postgresql many times, the TCP connection always hangs on the COPY from my script now. I have fixed my script to be more careful and handle the copy_from() error in a more sane way, but I'd like to fix whatever is now wrong with postgres. Here are some entries from pg_hba.conf: host all all 10.0.0.0/8 md5 host all all 127.0.0.1/32 md5 host all all 127.0.1.1/32 md5 local all biomart md5 Everything else in pg_hba.conf is default The only change to postgresql.conf is to change the listen_address to '*' Any insight would be appreciated. -- Glen L. Beane Software Engineer The Jackson Laboratory Phone (207) 288-6153
Glen Beane <Glen.Beane@jax.org> writes: > I am using the copy_from command from the python psycopg2 library to do some > bulk data loading of a postgres database. This had been working OK, until > my script barfed because I was being careless, and it seemed to leave the > database in a strange state that I can't recover from. What PG version is this exactly? What does pg_stat_activity show? Is there any indication of un-granted locks in pg_locks? The most direct evidence about why it's stuck would probably be had by attaching to the backend process with gdb and getting a stack trace. regards, tom lane
On 11/11/08 2:25 PM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote: > Glen Beane <Glen.Beane@jax.org> writes: >> I am using the copy_from command from the python psycopg2 library to do some >> bulk data loading of a postgres database. This had been working OK, until >> my script barfed because I was being careless, and it seemed to leave the >> database in a strange state that I can't recover from. > > What PG version is this exactly? 8.3.3 > What does pg_stat_activity show? This is the only non-idle connection: 16498 | mgi_biomart_intermediate | 31356 | 16386 | biomart | COPY markers FROM stdin USING DELIMITERS ' ' | f | 2008-11-11 17:10:36.124919-05 | 2008-11-11 17:10:36.125522-05 | 2008-11-11 17:10:32.213159-05 | 127.0.1.1 | 35021 > Is > there any indication of un-granted locks in pg_locks? gbeane=# select * from pg_locks; locktype | database | relation | page | tuple | virtualxid | transactionid | classid | objid | objsubid | virtualtransaction | pid | mode | granted ---------------+----------+----------+------+-------+------------+---------- -----+---------+-------+----------+--------------------+-------+------------ ------+--------- virtualxid | | | | | 5/3 | | | | | 5/3 | 31361 | ExclusiveLock | t relation | 16385 | 10969 | | | | | | | | 5/3 | 31361 | AccessShareLock | t relation | 16498 | 16507 | | | | | | | | 3/1151 | 31356 | RowExclusiveLock | t virtualxid | | | | | 3/1151 | | | | | 3/1151 | 31356 | ExclusiveLock | t transactionid | | | | | | 637 | | | | 3/1151 | 31356 | ExclusiveLock | t relation | 16498 | 16510 | | | | | | | | 3/1151 | 31356 | RowExclusiveLock | t > The most direct evidence about why it's stuck would probably be had by > attaching to the backend process with gdb and getting a stack trace. > I'll take a look with gdb when I have a moment. > regards, tom lane -- Glen L. Beane Software Engineer The Jackson Laboratory Phone (207) 288-6153
On 11/11/08 2:25 PM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote: > The most direct evidence about why it's stuck would probably be had by > attaching to the backend process with gdb and getting a stack trace. It wasn't built with debugging symbols so there is some missing info, but here is what I get if I attach with gdb: (gdb) where #0 0xb7f63410 in __kernel_vsyscall () #1 0xb7ac9273 in read () from /lib/tls/i686/cmov/libc.so.6 #2 0xb7d42d37 in ?? () from /usr/lib/i686/cmov/libcrypto.so.0.9.8 #3 0x081a1226 in ?? () #4 0xb7d40db1 in BIO_read () from /usr/lib/i686/cmov/libcrypto.so.0.9.8 #5 0xb7e05362 in ssl3_read_n () from /usr/lib/i686/cmov/libssl.so.0.9.8 #6 0xb7e05b2e in ssl3_read_bytes () from /usr/lib/i686/cmov/libssl.so.0.9.8 #7 0xb7e03096 in ssl3_read () from /usr/lib/i686/cmov/libssl.so.0.9.8 #8 0xb7e13b78 in SSL_read () from /usr/lib/i686/cmov/libssl.so.0.9.8 #9 0x081a128a in secure_read () #10 0x081a94c1 in ?? () #11 0x081a95d9 in pq_getbytes () #12 0x081a9744 in pq_getmessage () #13 0x0813853c in ?? () #14 0x0813883a in ?? () #15 0x08138a05 in ?? () #16 0x0813b456 in DoCopy () #17 0x082302e4 in ProcessUtility () #18 0x0822cc6b in ?? () #19 0x0822d8e3 in ?? () #20 0x0822e104 in PortalRun () #21 0x08228eb5 in ?? () #22 0x08229f6b in PostgresMain () #23 0x081fa39d in ?? () #24 0x081fb3cb in PostmasterMain () #25 0x081ab902 in main () -- Glen L. Beane Software Engineer The Jackson Laboratory Phone (207) 288-6153
Glen Beane <Glen.Beane@jax.org> writes: > On 11/11/08 2:25 PM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote: >> The most direct evidence about why it's stuck would probably be had by >> attaching to the backend process with gdb and getting a stack trace. > It wasn't built with debugging symbols so there is some missing info, but > here is what I get if I attach with gdb: > (gdb) where > #0 0xb7f63410 in __kernel_vsyscall () > #1 0xb7ac9273 in read () from /lib/tls/i686/cmov/libc.so.6 > #2 0xb7d42d37 in ?? () from /usr/lib/i686/cmov/libcrypto.so.0.9.8 > #3 0x081a1226 in ?? () > #4 0xb7d40db1 in BIO_read () from /usr/lib/i686/cmov/libcrypto.so.0.9.8 > #5 0xb7e05362 in ssl3_read_n () from /usr/lib/i686/cmov/libssl.so.0.9.8 > #6 0xb7e05b2e in ssl3_read_bytes () from /usr/lib/i686/cmov/libssl.so.0.9.8 > #7 0xb7e03096 in ssl3_read () from /usr/lib/i686/cmov/libssl.so.0.9.8 > #8 0xb7e13b78 in SSL_read () from /usr/lib/i686/cmov/libssl.so.0.9.8 > #9 0x081a128a in secure_read () > #10 0x081a94c1 in ?? () > #11 0x081a95d9 in pq_getbytes () > #12 0x081a9744 in pq_getmessage () > #13 0x0813853c in ?? () > #14 0x0813883a in ?? () > #15 0x08138a05 in ?? () > #16 0x0813b456 in DoCopy () Well, this is sufficient to draw a conclusion: it's waiting for the client to send it some more COPY data. So next you need to look into what the client's state is. What software are you using on the client side, anyway? regards, tom lane
On 11/11/08 9:09 PM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote: > Glen Beane <Glen.Beane@jax.org> writes: >> On 11/11/08 2:25 PM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote: >>> The most direct evidence about why it's stuck would probably be had by >>> attaching to the backend process with gdb and getting a stack trace. > >> It wasn't built with debugging symbols so there is some missing info, but >> here is what I get if I attach with gdb: > >> (gdb) where >> #0 0xb7f63410 in __kernel_vsyscall () >> #1 0xb7ac9273 in read () from /lib/tls/i686/cmov/libc.so.6 >> #2 0xb7d42d37 in ?? () from /usr/lib/i686/cmov/libcrypto.so.0.9.8 >> #3 0x081a1226 in ?? () >> #4 0xb7d40db1 in BIO_read () from /usr/lib/i686/cmov/libcrypto.so.0.9.8 >> #5 0xb7e05362 in ssl3_read_n () from /usr/lib/i686/cmov/libssl.so.0.9.8 >> #6 0xb7e05b2e in ssl3_read_bytes () from /usr/lib/i686/cmov/libssl.so.0.9.8 >> #7 0xb7e03096 in ssl3_read () from /usr/lib/i686/cmov/libssl.so.0.9.8 >> #8 0xb7e13b78 in SSL_read () from /usr/lib/i686/cmov/libssl.so.0.9.8 >> #9 0x081a128a in secure_read () >> #10 0x081a94c1 in ?? () >> #11 0x081a95d9 in pq_getbytes () >> #12 0x081a9744 in pq_getmessage () >> #13 0x0813853c in ?? () >> #14 0x0813883a in ?? () >> #15 0x08138a05 in ?? () >> #16 0x0813b456 in DoCopy () > > Well, this is sufficient to draw a conclusion: it's waiting for the > client to send it some more COPY data. So next you need to look into > what the client's state is. What software are you using on the client > side, anyway? Python with the psycopg2 library. I swear this was working earlier today. Maybe I am imagining things :) It does work with a unix socket, and I have a deadline to meet, so for now I just need to make sure I am running this on the same host as the postgresql server
On Tuesday 11 November 2008 6:20:09 pm Glen Beane wrote: > > Python with the psycopg2 library. I swear this was working earlier today. > Maybe I am imagining things :) > It does work with a unix socket, and I have a deadline to meet, so for now > I just need to make sure I am running this on the same host as the >postgresqlserver See these two posts for potential solution: http://lists.initd.org/pipermail/psycopg/2008-October/006224.html http://lists.initd.org/pipermail/psycopg/2008-October/006225.html They point to a problem with ssl and psycopg2 COPY. Also I know there where some problems with psycopg2 and copy that where fixed in psycopg2 version 2.0.7. -- Adrian Klaver aklaver@comcast.net
On Tuesday 11 November 2008 6:31:04 pm Adrian Klaver wrote: > On Tuesday 11 November 2008 6:20:09 pm Glen Beane wrote: > > Python with the psycopg2 library. I swear this was working earlier > > today. Maybe I am imagining things :) > > It does work with a unix socket, and I have a deadline to meet, so for > > now I just need to make sure I am running this on the same host as the > > postgresqlserver > > See these two posts for potential solution: > http://lists.initd.org/pipermail/psycopg/2008-October/006224.html > http://lists.initd.org/pipermail/psycopg/2008-October/006225.html > > They point to a problem with ssl and psycopg2 COPY. Also I know there where > some problems with psycopg2 and copy that where fixed in psycopg2 version > 2.0.7. > -- > Adrian Klaver > aklaver@comcast.net That should have been version 2.0.8. -- Adrian Klaver aklaver@comcast.net