Thread: Cannot allocate memory for output buffer
I am looking for some help regarding a Python OperationalError that I recently received while executing a Python script using SQLAlchemy and psycopg2. The script parses an XML file stored on a networked drive and enters the information into a PostgreSQL database. Sometimes these XML files reference a binary file, also located on the networked drive. These files are subsequently read in and stored in a table along with each file's MD5 hash. The binary data is stored in a bytea column. This script worked well until recently, when it came across a binary file about 258MB in size. While reading the file off the networked drive I received an OperationalError indicating that it was unable to allocate memory for the output buffer. My initial guess was that it ran out of memory, but according to the task manager the machine had close to 2GB free when the error occurred.

I'm not 100% sure that this is a PostgreSQL problem, but all Google searches show the exact error message within libpq's source code, which is why I'm starting here.

The machine that runs the script is a Windows XP machine running Python 2.6, psycopg2 2.0.10, and PostgreSQL 8.4. The database is running PostgreSQL 8.4 on a FreeBSD 8.0 box with approximately 16GB of memory.

I've been searching Google and the mailing list archives for the past couple of days trying to find an answer, with no success. Any help is greatly appreciated. Thanks in advance.

--
Pete Erickson
redlamb _at_ redlamb _dot_ net
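For concreteness, a minimal sketch of the kind of code path being described (the real script goes through SQLAlchemy, but psycopg2 does the equivalent underneath; the connection string, path, and table schema here are hypothetical):

    import hashlib
    import psycopg2

    # Hypothetical sketch: read the whole referenced binary file into
    # memory, hash it, and insert it into a bytea column in one statement.
    conn = psycopg2.connect("dbname=archive")
    cur = conn.cursor()

    path = r"\\fileserver\share\referenced.bin"
    with open(path, "rb") as f:
        data = f.read()  # the entire file (here ~258MB) is held in memory

    cur.execute(
        "INSERT INTO files (md5, contents) VALUES (%s, %s)",
        (hashlib.md5(data).hexdigest(), psycopg2.Binary(data)),
    )
    conn.commit()

Note that the whole file is read into one Python string and then handed to the driver in a single statement, which is where the large allocations come from.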
Pete Erickson <redlamb@redlamb.net> writes:
> [...]
> While reading the file off the networked drive I received an
> OperationalError indicating that it was unable to allocate memory for
> the output buffer. My initial guess was that it ran out of memory, but
> according to the task manager the machine had close to 2GB free when
> the error occurred.

Out of memory is probably exactly right. The textual representation of arbitrary bytea data is normally several times the size of the raw bits (worst case is 5x bigger, typical case perhaps half that). In addition to that you have to consider that there are likely to be several copies of the string floating around in your process' memory space. If you're doing this in a 32-bit environment it doesn't surprise me at all that 258MB of raw data would exhaust available memory.

Going to a 64-bit implementation would help some, but I'm not sure that that's an available option for you on Windows, and anyway it doesn't eliminate the problem completely. If you want to process really large binary files you're going to need to divide them into segments.

			regards, tom lane
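A minimal sketch of the segmenting approach suggested here, assuming a hypothetical file_chunks(file_id, seq, data bytea) table; the chunk size is arbitrary:

    import psycopg2

    CHUNK_SIZE = 8 * 1024 * 1024  # 8MB per row; keeps each bytea literal small

    def store_in_segments(cur, file_id, path):
        # Stream the file in fixed-size chunks so no single INSERT has to
        # build a huge textual bytea representation in memory.
        with open(path, "rb") as f:
            seq = 0
            while True:
                chunk = f.read(CHUNK_SIZE)
                if not chunk:
                    break
                cur.execute(
                    "INSERT INTO file_chunks (file_id, seq, data) "
                    "VALUES (%s, %s, %s)",
                    (file_id, seq, psycopg2.Binary(chunk)),
                )
                seq += 1

Reassembly is then a matter of selecting the chunks ordered by seq and concatenating them on the client side.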
Thanks. Out of curiosity, if memory exhaustion was the problem, any idea why the task manager would show that I'm only using 1.2GB of the 3GB of memory?

On 11/27/2009 5:15 PM, Tom Lane wrote:
> Out of memory is probably exactly right. The textual representation of
> arbitrary bytea data is normally several times the size of the raw bits
> (worst case is 5x bigger, typical case perhaps half that). In addition
> to that you have to consider that there are likely to be several copies
> of the string floating around in your process' memory space.
> [...]
Peter Erickson <redlamb@redlamb.net> writes:
> Thanks. Out of curiosity, if memory exhaustion was the problem, any idea
> why the task manager would show that I'm only using 1.2GB of the 3GB of
> memory?

Well, it would've failed to allocate the next copy of the string that it needed ... and I think also we try to keep the output buffer size at a power of 2, so it might have been asking for 2GB for the output buffer.

			regards, tom lane
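Rough, illustrative arithmetic under those assumptions: escaping 258MB of raw bytes at, say, 4 bytes of text per input byte gives roughly 1032MB of bytea text; rounding the output buffer up to the next power of 2 turns that into a single 2048MB allocation request, which a 32-bit Windows process (normally limited to 2GB, or 3GB with the /3GB boot switch, of user address space) cannot satisfy regardless of how much physical RAM is free.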
On Fri, Nov 27, 2009 at 05:55:30PM -0500, Peter Erickson wrote:
> Thanks. Out of curiosity, if memory exhaustion was the problem, any idea
> why the task manager would show that I'm only using 1.2GB of the 3GB of
> memory?

Note that what usually kills you first on 32-bit systems is that you want 250MB of *contiguous* memory. Depending on how the program is started and whether you have memory-map randomization, it is entirely possible to have more than 1GB free yet no block big enough to hold 250MB together. 64-bit systems don't suffer from this problem; the address space is so large that it's just not an issue.

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while
> boarding. Thank you for flying nlogn airlines.
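A quick way to see the contiguity point from Python (illustrative only; whether the allocation succeeds depends entirely on how fragmented the process's address space already is):

    # A bytearray needs one contiguous block, so on a fragmented 32-bit
    # process this can raise MemoryError even while the task manager
    # still reports plenty of free memory.
    try:
        block = bytearray(250 * 1024 * 1024)  # one contiguous 250MB request
    except MemoryError:
        print("no contiguous 250MB block available")
    else:
        print("got a contiguous 250MB block")
        del block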