Re: plsql gets "out of memory" - Mailing list pgsql-admin
From | Rural Hunter |
---|---|
Subject | Re: plsql gets "out of memory" |
Date | |
Msg-id | 4E619F52.7050104@gmail.com Whole thread Raw |
In response to | Re: plsql gets "out of memory" (Rural Hunter <ruralhunter@gmail.com>) |
Responses |
Re: plsql gets "out of memory"
|
List | pgsql-admin |
Hi Kevin, I did another try with following additional changes based on our discussion: 1. use the tcp connection 2. turn off autovacuum 3. turn off full_page_writes I could import more than 30G data in about 2 hours. That's totally acceptable performance to me with the current server capability. There is a minor issue though. I saw a few errors during the import: ERROR: invalid byte sequence for encoding "UTF8": 0xe6272c ERROR: invalid byte sequence for encoding "UTF8": 0xe5272c ERROR: invalid byte sequence for encoding "UTF8": 0xe5272c ERROR: invalid byte sequence for encoding "UTF8": 0xe5272c ERROR: invalid byte sequence for encoding "UTF8": 0xe68e27 ERROR: invalid byte sequence for encoding "UTF8": 0xe7272c ERROR: invalid byte sequence for encoding "UTF8": 0xe5272c ERROR: invalid byte sequence for encoding "UTF8": 0xe5a427 My data was exported from an UTF8 MySQL database and my pgsql db is also UTF8. I got 8 errors above only with about 3 million records imported. The strange thing is, I usually see the problematic SQL output in the log if there is any error for that SQL so I have a chance to fix the data manually. But for the errors above, I don't see any SQL logged. The pgsql log just output error log same as above with no additional info: 2011-09-01 11:26:32 CST ERROR: invalid byte sequence for encoding "UTF8": 0xe6272c 2011-09-01 11:26:47 CST ERROR: invalid byte sequence for encoding "UTF8": 0xe5272c 2011-09-01 11:26:53 CST ERROR: invalid byte sequence for encoding "UTF8": 0xe5272c 2011-09-01 11:26:58 CST ERROR: invalid byte sequence for encoding "UTF8": 0xe5272c 2011-09-01 11:26:58 CST ERROR: invalid byte sequence for encoding "UTF8": 0xe68e27 2011-09-01 11:27:01 CST ERROR: invalid byte sequence for encoding "UTF8": 0xe7272c 2011-09-01 11:27:06 CST ERROR: invalid byte sequence for encoding "UTF8": 0xe5272c 2011-09-01 11:27:15 CST ERROR: invalid byte sequence for encoding "UTF8": 0xe5a427 What could be the cause of that? 于 2011/8/30 9:29, Rural Hunter 写道: > Thank you. I didn't understand what 'vacuum freeze' actually does. I > will check the detail to see if it's good for my situation. and I will > also test the load by tcp connection. Thanks again for all your > advices and they are really very helpful to me! > > 于 2011/8/30 0:06, Kevin Grittner 写道: >> Rural Hunter<ruralhunter@gmail.com> wrote: >>> 2011/8/29 23:18, Kevin Grittner: >> >>>> I also recommend a VACUUM FREEZE ANALYZE on the database unless >>>> most of these rows will be deleted or updated before you run a >>>> billion database transactions. Otherwise you will get a painful >>>> "anti-wraparound" autovacuum on everything, probably at a time >>>> of heavy usage. >>> hmm....I will try to turn autovacuum off though I didn't see any >>> resource intension caused by it. >> >> Well, turning off autovacuum during a bulk load is probably a net >> gain if it's insert-only (i.e., no need to query just-loaded data to >> decide what to do with new rows); but that's not what I was getting >> at. Bulk loading 200 GB of data which is not going to be deleted or >> updated heavily is setting a performance time bomb without a VACUUM >> FREEZE. At some point, perhaps months later, it will be necessary >> to freeze the tuples to prevent data loss, and since this occurs >> based on a threshold of how many transaction IDs have been consumed, >> it is most likely to happen at peak OLTP loads, when it will be the >> biggest problem. A VACUUM FREEZE (and you might as well throw in >> ANALYZE while you're at it) will take care of that up front. As a >> side benefit it will keep SELECT statements from generating heavy >> *write* loads on the first access to tuples, and will perform other >> maintenance which will improve database performance. >> >> I just count the time for VACUUM FREEZE ANALYZE as part of the bulk >> load time required before letting in users. >> >>>> Network latency? >> >>> No, I do the import locally on the db server so the network >>> letency can be excluded. >> >> Hmm... I don't remember the details, but there was a problem at >> some point where Linux pipe connections could introduce significant >> latency, and you could get much better performance on a TCP >> connection through localhost. It might be worth a try. (Maybe >> someone else will remember the details.) >> >> -Kevin >> >
pgsql-admin by date: