Thread: pg_restore out of memory
I am trying to restore a file that was created with pg_dump -Fc on a PostgreSQL 8.1.4 machine, using pg_restore on a PostgreSQL 8.2.4 machine. The restore machine has the following settings.

/etc/sysctl.conf (FreeBSD machine):
kern.ipc.shmall=262144
kern.ipc.shmmax=534773760   #510MB
kern.ipc.semmap=256

/boot/loader.conf:
kern.ipc.semmni=256
kern.ipc.semmns=512
kern.ipc.semmnu=256
kern.maxdsiz="1600MB"   #1.6GB
kern.dfldsiz="1600MB"   #1.6GB
kern.maxssiz="128M"     #128MB

postgresql.conf:
shared_buffers = 450MB
temp_buffers = 8MB
work_mem = 8MB                  # min 64kB
maintenance_work_mem = 64M
max_fsm_pages = 5000000         # had an error with 100,000 and increased to 5 million while trying the pg_restore

The OS can see 3.5GB of RAM. Swap is 20GB.

The error that I got was:

pg_restore: [archiver (db)] Error from TOC entry 2146; 0 988154 TABLE DATA message_attachments pgsql
pg_restore: [archiver (db)] COPY failed: ERROR:  out of memory
DETAIL:  Failed on request of size 134217728 (128MB)

Syslog was:

Jun 14 10:17:56 bk20 postgres[7294]: [130-1] ERROR:  out of memory
Jun 14 10:17:56 bk20 postgres[7294]: [130-2] DETAIL:  Failed on request of size 134217728.

I see a similar thread back in December 2006, http://tinyurl.com/3aa29g, but I don't see a final resolution there. Which parameter do I need to increase?
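(A quick sanity check after a reboot, in case the loader.conf values are not actually taking effect: "limits" is the stock FreeBSD utility for showing resource limits, and whether kern.maxdsiz is exported through sysctl depends on the release, so treat this only as a sketch.)

    # shared memory settings
    sysctl kern.ipc.shmmax kern.ipc.shmall
    # loader tunables (may not be exported as sysctls on every release)
    sysctl kern.maxdsiz kern.dfldsiz kern.maxssiz
    # resource limits of the current shell / login class
    limits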
Francisco Reyes <lists@stringsutils.com> writes:
> Syslog was:
> Jun 14 10:17:56 bk20 postgres[7294]: [130-1] ERROR:  out of memory
> Jun 14 10:17:56 bk20 postgres[7294]: [130-2] DETAIL:  Failed on request of size 134217728.

Can we see the context-sizes dump that should've come out right before that in the log?

regards, tom lane
Tom Lane writes:
> Francisco Reyes <lists@stringsutils.com> writes:
>> Syslog was:
>> Jun 14 10:17:56 bk20 postgres[7294]: [130-1] ERROR:  out of memory
>> Jun 14 10:17:56 bk20 postgres[7294]: [130-2] DETAIL:  Failed on request of size 134217728.
>
> Can we see the context-sizes dump that should've come out right before
> that in the log?

Is this the one you need? It was right after the error:

Jun 14 10:17:56 bk20 postgres[7294]: [130-3] CONTEXT:  COPY message_attachments, line 60490: "2720290 7225017 research/crew holds.sit sit 88885753 t 1
Jun 14 10:17:56 bk20 postgres[7294]: [130-4] U3R1ZmZJdCAoYykxOTk3LTIwMDIgQWxhZGRpbiBTeX..."
Jun 14 10:17:56 bk20 postgres[7294]: [130-5] STATEMENT:  COPY message_attachments (attachment_id, message_id, filename, extension, attachment_size, name_real,
Jun 14 10:17:56 bk20 postgres[7294]: [130-6] parser_version, attachment_search_text, attachment_body, delete_status, delete_status_date) FROM stdin;
"Francisco Reyes" <lists@stringsutils.com> writes: > kern.maxdsiz="1600MB" #1.6GB > kern.dfldsiz="1600MB" #1.6GB > kern.maxssiz="128M" # 128MB It ought to be maxdsiz which seems large enough. > The error that I got was: > pg_restore: [archiver (db)] Error from TOC entry 2146; 0 988154 TABLE DATA > message_attachments pgsql > pg_restore: [archiver (db)] COPY failed: ERROR: out of memory > DETAIL: Failed on request of size 134217728 (128MB) What does the output of "ulimit -a" show? Can you arrange to run ulimit -a in the same environment as the server? Either by starting the server in shell manually or by putting ulimit -a in the startup script which starts the server if you have one? -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
Gregory Stark writes:
> What does the output of "ulimit -a" show?

In FreeBSD, as far as I know, what controls the size of a program is /boot/loader.conf and /etc/login.conf. The default /etc/login.conf has unlimited size. /boot/loader.conf is set to a maximum program size of 1.6GB.

> Can you arrange to run ulimit -a in
> the same environment as the server?

There is no "ulimit -a" in csh, which is what I use. I guessed this might be a bash setting, so I tried that. The output of ulimit -a is:

core file size          (blocks, -c) unlimited
data seg size           (kbytes, -d) 524288
file size               (blocks, -f) unlimited
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) unlimited
open files                      (-n) 11095
pipe size            (512 bytes, -p) 1
stack size              (kbytes, -s) 65536
cpu time               (seconds, -t) unlimited
max user processes              (-u) 5547
virtual memory          (kbytes, -v) unlimited

I don't see any limit at 128MB, the size at which the program crashed.
Francisco Reyes <lists@stringsutils.com> writes:
> Tom Lane writes:
>> Can we see the context-sizes dump that should've come out right before
>> that in the log?

> Is this the one you need?

No.  [squints...]  Hm, you're looking at a syslog log, aren't you. The memory dump only comes out on stderr (I think because of paranoia about running out of memory while trying to report we're out of memory). Can you get the postmaster's stderr output?

regards, tom lane
"Francisco Reyes" <lists@stringsutils.com> writes: > There is no "ulimit -a" in cshell which is what I use. > I guessed this may be a bash setting .. so tried that.. > The output of ulimit -a is: The csh equivalent is just "limit". > core file size (blocks, -c) unlimited > data seg size (kbytes, -d) 524288 > > Don't see any limit at 128MB, the size at which the program crashed. You might try starting Postgres with ulimit -d unliited or limit datasize unlimit You're right that your limit is above 128M but the error just means it tried to allocated 128M and failed, it may already have allocated 400M and been trying to go over the 524M mark. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
Tom Lane writes:
> The memory dump only comes out on stderr (I think because of paranoia
> about running out of memory while trying to report we're out of memory).
> Can you get the postmaster's stderr output?

From stderr:

pg_restore: restoring data for table "message_attachments"
pg_restore: [archiver (db)] Error from TOC entry 2146; 0 988154 TABLE DATA message_attachments pgsql
pg_restore: [archiver (db)] COPY failed: ERROR:  out of memory
DETAIL:  Failed on request of size 134217728.
CONTEXT:  COPY message_attachments, line 60490: "2720290 7225017 research/crew holds.sit sit 88885753 t 1U3R1ZmZJdCAoYykxOTk3LTIwMDIgQWxhZGRpbiBTeX..."
Gregory Stark writes:
> You're right that your limit is above 128M, but the error just means it tried
> to allocate 128M and failed; it may already have allocated 400M and been
> trying to go over the 512MB (524288 kB) mark.

My limit should be 1.6GB as per /boot/loader.conf.

I ran a ps every 20 seconds to try to catch the program as it was going over the size. The ps right before the crash is:

USER    PID %CPU %MEM    VSZ    RSS  TT  STAT STARTED      TIME COMMAND
pgsql  9364  6.2 22.8 904472 803436  ??  Rs    8:34PM 139:37.46 postgres: pgsql backaway_copy [local] COPY (postgres)
fran   9363  8.2  8.6 338928 305332  p3  S+    8:34PM 244:49.41 pg_restore -U pgsql -v -d backaway_copy backaway-fc-2007-06-03.pg_d
pgsql 15373  2.3 10.6 613892 374000  ??  Rs    3:06AM  11:45.28 postgres: pgsql full_table_restores [local] VACUUM (postgres)

904472 kB = 883MB, so if it was trying to allocate 128MB more, it crashed somewhere around 1011MB. Although, since I only sample every 20 seconds, it could have gone beyond that.

The question is, what type of memory is it trying to allocate? Shared memory? Shared memory comes from kern.ipc.shmmax=534773760, well below the roughly 1011MB at which I estimate it crashed.

The FreeBSD default is to cap programs at 512MB, so the fact that the program is crashing around 1011MB means the /boot/loader.conf setting of a 1.6GB memory cap is active.

If it is of any help, some information about the data: it is hex64 encoded. The original data getting encoded can be up to 100MB. I am not sure how much bigger the hex encoding makes it.
"Francisco Reyes" <lists@stringsutils.com> writes: > Gregory Stark writes: > >> You're right that your limit is above 128M but the error just means it tried >> to allocated 128M and failed, it may already have allocated 400M and been >> trying to go over the 524M mark. > > My limit should be 1.6GB as per /boot/loader.conf Well according to "limit" it was 524M. I don't know how that relates to the kernel limits you see in /boot/loader.conf. It may be the same thing but being lowered by something in the startup scripts or it may be unrelated. > The question is, what type of memory is that trying to allocate? > Shared memory? If it couldn't allocate the shared memory it wanted it wouldn't start up. > The FreeBSD default is to cap programs at 512MB so thet fact the program is > crashing at 1011MB means that the /boot/loader.conf setting of 1.6GB memory cap > is active. I'm skeptical that you can trust ps's VSZ reporting at this level of detail. On some platforms VSZ includes a proportionate share of its shared memory or might not include memory allocated but not actually written to yet (due to copy-on-write). -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
Gregory Stark writes:
> I'm skeptical that you can trust ps's VSZ reporting at this level of detail.
> On some platforms VSZ includes a proportionate share of shared memory, or
> might not include memory allocated but not actually written to yet (due to
> copy-on-write).

Understood. But at least it shows that the program was already above the default 512MB limit of the operating system.
Francisco Reyes wrote:
> Tom Lane writes:
>
> > The memory dump only comes out on stderr (I think because of paranoia
> > about running out of memory while trying to report we're out of memory).
> > Can you get the postmaster's stderr output?
>
> From stderr:
>
> pg_restore: restoring data for table "message_attachments"
> pg_restore: [archiver (db)] Error from TOC entry 2146; 0 988154 TABLE DATA
> message_attachments pgsql

This is pg_restore's stderr. What Tom wants to see is the postmaster's. It is probably redirected (hopefully to a file, but regretfully it is common to see it go to /dev/null) in the init script that starts the service. If it's going to /dev/null, change it to somewhere more reasonable and try again. Or you could start the postmaster by hand on a terminal under your control.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
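(If starting it by hand, something along these lines captures the backend's stderr. The binary and data-directory paths are only assumptions about a typical FreeBSD port layout; adjust to your install.)

    # as the pgsql user, with the server stopped
    /usr/local/bin/postmaster -D /usr/local/pgsql/data \
        > /tmp/postmaster-stderr.log 2>&1 &

The memory-context dump Tom is asking about should then show up in /tmp/postmaster-stderr.log the next time the COPY fails.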
Francisco Reyes <lists@stringsutils.com> writes:
> If it is of any help, some information about the data: it is hex64 encoded.
> The original data getting encoded can be up to 100MB. I am not sure how much
> bigger the hex encoding makes it.

Hmm. I'm wondering about a memory leak in the input converter. What datatype exactly are the wide column(s)? Also, do you have any ON INSERT triggers on this table?

regards, tom lane
Tom Lane writes:
> Hmm. I'm wondering about a memory leak in the input converter. What
> datatype exactly are the wide column(s)?

Text.

> Also, do you have any ON INSERT triggers on this table?

No.
Alvaro Herrera writes:
> This is pg_restore's stderr. What Tom wants to see is the postmaster's. It
> is probably redirected (hopefully to a file, but regretfully it is
> common to see it go to /dev/null) in the init script that starts the
> service.

How would I run it manually? When I do pg_ctl start, no output ever goes to the screen. I also looked at the FreeBSD startup script; it also uses pg_ctl.

Do I just need to pass "-l <filename>" to pg_ctl? According to the man page the default is to send output to /dev/null. It does seem like -l will redirect to a file, so I am going to try that.

man pg_ctl
...
In start mode, a new server is launched. The server is started in the background, and standard input is attached to /dev/null. The standard output and standard error are either appended to a log file (if the -l option is used), or redirected to pg_ctl's standard output (not standard error).
...
Alvaro Herrera writes:
> This is pg_restore's stderr. What Tom wants to see is the postmaster's. It
> is probably redirected (hopefully to a file, but regretfully it is
> common to see it go to /dev/null) in the init script that starts the
> service.

pg_ctl -l <file> didn't work. Trying now with changes to postgresql.conf:

redirect_stderr = on
log_directory = '/data/logs'
log_filename = 'postgresql-%Y-%m-%d.log'
log_error_verbosity = verbose

I also increased all error levels to debug1. That looks promising already; I see errors going to the file. Trying again. It will be a few hours before it gets to the line where it runs out of memory.
Tom Lane writes:
> Can we see the context-sizes dump that should've come out right before
> that in the log?

Hope this is what you are looking for. I included a few lines before the error in case that is of any help. These lines are from the postgresql log, with stderr redirected to a file.

pg_amop_opr_opc_index: 1024 total in 1 blocks; 328 free (0 chunks); 696 used
pg_amop_opc_strat_index: 1024 total in 1 blocks; 216 free (0 chunks); 808 used
pg_aggregate_fnoid_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
MdSmgr: 8192 total in 1 blocks; 6616 free (0 chunks); 1576 used
LOCALLOCK hash: 8192 total in 1 blocks; 3912 free (0 chunks); 4280 used
Timezones: 48616 total in 2 blocks; 5968 free (0 chunks); 42648 used
ErrorContext: 8192 total in 1 blocks; 8176 free (0 chunks); 16 used
ERROR:  53200: out of memory
DETAIL:  Failed on request of size 134217728.
CONTEXT:  COPY message_attachments, line 60490: "2720290 7225017 research/crew holds.sit sit 88885753 t 1 U3R1ZmZJdCAoYykxOTk3LTIwMDIgQWxhZGRpbiBTeX..."
LOCATION:  AllocSetRealloc, aset.c:907
STATEMENT:  COPY message_attachments (attachment_id, message_id, filename, extension, attachment_size, name_real, parser_version, attachment_search_text, attachment_body, delete_status, delete_status_date) FROM stdin;

The table has no insert triggers, but does have a foreign-key constraint:

"message_attachments_message_id_fkey" FOREIGN KEY (message_id) REFERENCES messages(message_id)

It also has 3 indexes and 2 check constraints.
Francisco Reyes <lists@stringsutils.com> writes:
> Tom Lane writes:
>> Can we see the context-sizes dump that should've come out right before
>> that in the log?

> Hope this is what you are looking for.
> I included a few lines before the error in case that is of any help.

Well, that's the last few lines of what I wanted, but not the part that was interesting :-(. Please show all the lines that are like the "n total in m blocks" format. Or at least look for the one(s) that contain large numbers...

> The table has no insert triggers, but does have a foreign-key constraint:
> "message_attachments_message_id_fkey" FOREIGN KEY (message_id) REFERENCES
> messages(message_id)

Hmm, you may well be blowing out the deferred-foreign-key-check list. But pg_dump normally orders its operations so that the data is loaded before trying to set up FK constraints. Are you perhaps trying to do a data-only restore?

regards, tom lane
Tom Lane writes:
> Well, that's the last few lines of what I wanted, but not the part that
> was interesting :-(. Please show all the lines that are like the "n
> total in m blocks" format. Or at least look for the one(s) that contain
> large numbers...

Since you know best what you are looking for, I just put the log for you at http://public.natserv.net/postgresql-2007-06-15.log

> Hmm, you may well be blowing out the deferred-foreign-key-check list.
> But pg_dump normally orders its operations so that the data is loaded
> before trying to set up FK constraints. Are you perhaps trying to do a
> data-only restore?

I don't believe so. The lines I am using for both backup and restore are:

pg_dump -h <host> -Fc <database> >/data2/backups/`date "+%F"`.pg_dump
pg_restore -U pgsql -v -L Load-list.txt -d copydb 2007-06-03.pg_dump
Francisco Reyes <lists@stringsutils.com> writes:
> Tom Lane writes:
>> Well, that's the last few lines of what I wanted, but not the part that
>> was interesting :-(. Please show all the lines that are like the "n
>> total in m blocks" format. Or at least look for the one(s) that contain
>> large numbers...

> Since you know best what you are looking for, I just put the log for you at
> http://public.natserv.net/postgresql-2007-06-15.log

The only important space consumption seems to be here:

PortalHeapMemory: 335682660 total in 13 blocks; 68880 free (52 chunks); 335613780 used

So it's not a deferred-trigger problem (trigger events would be in their own context). Also, the fact that it's trying to allocate exactly 128MB lets out a lot of possibilities --- that has to be a single huge request, not just the end result of a boatload of little ones. So it probably isn't a memory leak per se.

One time that we've seen something like this before is when someone had a COPY datafile that had had its newlines munged, such that COPY thought it was all one long line and tried to read it all in at once. What's the history of your datafile --- has it maybe gotten passed through a Windows newline conversion?

regards, tom lane
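(A rough way to check for the newline munging Tom describes, if a plain-text dump of the table is available; it does not apply directly to a -Fc archive, which is a binary format, and the filename here is just a placeholder.)

    # count carriage returns in the dump
    tr -cd '\r' < message_attachments.sql | wc -c
    # report the longest line in the dump
    awk '{ if (length($0) > max) max = length($0) } END { print max }' message_attachments.sql

A nonzero carriage-return count, or a single line that is wildly longer than the widest expected field, would point at a newline problem rather than a genuinely huge row.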
Tom Lane writes:
> thought it was all one long line and tried to read it all in at once.
> What's the history of your datafile --- has it maybe gotten passed
> through a Windows newline conversion?

The database has emails. The table in question holds the attachments; each row represents one attachment. The file to be inserted is hex64 encoded before insertion. The email capture and hex64 conversion are both done on FreeBSD machines. Although the data never passes through a Windows machine, the attachments for the most part are coming from Windows machines.
I tried a pg_dump without -Fc to see if I could get that one table loaded. It still failed:

psql:message-attachments-2007-06-15.sql:2840177: ERROR:  out of memory
DETAIL:  Failed on request of size 88885765.
CONTEXT:  COPY message_attachments, line 60660: "2720290 7225017 research/crew holds.sit sit 88885753 t 1 U3R1ZmZJdCAoYykxOTk3LTIwMDIgQWxhZGRpbiBTeX..."

Have I encountered a bug? I looked at the record in question and the length of the long column in that row is 88885753 (84MB). Any suggestions?
Francisco Reyes <lists@stringsutils.com> writes:
> I tried a pg_dump without -Fc to see if I could get that one table loaded.
> It still failed:

> psql:message-attachments-2007-06-15.sql:2840177: ERROR:  out of memory
> DETAIL:  Failed on request of size 88885765.
> CONTEXT:  COPY message_attachments, line 60660: "2720290 7225017
> research/crew holds.sit sit 88885753 t 1
> U3R1ZmZJdCAoYykxOTk3LTIwMDIgQWxhZGRpbiBTeX..."

> I looked at the record in question and the length of the long column in that
> row is 88885753 (84MB).

If that actually is the length of the line, the only answer is to raise the memory ulimit setting the postmaster runs under.

regards, tom lane
Tom Lane writes:
>> I looked at the record in question and the length of the long column in that
>> row is 88885753 (84MB).
>
> If that actually is the length of the line, the only answer is to raise
> the memory ulimit setting the postmaster runs under.

The memory limit is 1.6GB.

/boot/loader.conf:
kern.ipc.semmni=256
kern.ipc.semmns=512
kern.ipc.semmnu=256
kern.maxdsiz="1600MB"   #1.6GB
kern.dfldsiz="1600MB"   #1.6GB
kern.maxssiz="128M"     #128MB

Also, I have several postgres processes at 400MB+ in size as reported by top.

Report from limit:
cputime         unlimited
filesize        unlimited
datasize        2097152 kbytes
stacksize       131072 kbytes
coredumpsize    unlimited
memoryuse       unlimited   <---
vmemoryuse      unlimited
descriptors     11095
memorylocked    unlimited
maxproc         5547
sbsize          unlimited
Tom Lane writes:
> If that actually is the length of the line, the only answer is to raise
> the memory ulimit setting the postmaster runs under.

I did another test to see whether the problem is that particular row or the size of the row. Another record of greater size also failed.

Any ideas where this 84MB limit could be coming from? I have:

shared_buffers at 450MB
maintenance_work_mem = 64MB (which I increased to 100MB with the same result)
OS limit for applications at 1.6GB

Also, when I start postgresql I see several postgresql processes using 400MB+, so I don't see how it could be the OS limit.
On Jun 15, 2007, at 8:24 AM, Francisco Reyes wrote:
> Understood. But at least it shows that the program was already
> above the default 512MB limit of the operating system.

But the assertion that the limit is 512Mb is false. On a random system of mine running FreeBSD/i386 it shows the default data limit as 1Gb, and on a random FreeBSD/amd64 box I see it at about 32Gb. I do no global tweaking of the size limits.
Vivek Khera writes:
> But the assertion that the limit is 512Mb is false. On a random
> system of mine running FreeBSD/i386 it shows the default data limit
> as 1Gb, and on a random FreeBSD/amd64 box I see it at about 32Gb. I
> do no global tweaking of the size limits.

Understood. I only showed "limit" because it was asked for. I have already set /boot/loader.conf to 1600MB.

Also, the error is about running out of memory when trying to allocate 84MB. The default FreeBSD limit is 512MB, so 84MB is well below that.

At this point this is basically stopping me from loading a table, and so far I have not been able to get any insight into how this could be fixed. I wonder if there is any additional debugging I can turn on to help better troubleshoot this.
On Jun 18, 2007, at 2:10 PM, Francisco Reyes wrote:
> Also, the error is about running out of memory when trying to
> allocate 84MB.
> The default FreeBSD limit is 512MB, so 84MB is well below that.

Try being less stingy than 128Mb for your stack. The default stack is 512Mb.
Francisco Reyes <lists@stringsutils.com> writes:
> Also, the error is about running out of memory when trying to allocate 84MB.
> The default FreeBSD limit is 512MB, so 84MB is well below that.

Keep in mind though that the COPY process is going to involve several working copies of that data (at least four that I can think of --- line input buffer, field input buffer, constructed text object, and constructed tuple). I'm also not clear on whether the 512MB limit you refer to will count the PG shared memory area, but if so that could easily be a couple hundred meg off the top of what a backend can allocate as temporary workspace. So it seems entirely likely to me that you'd need a ulimit above 512MB to push around 84MB fields.

regards, tom lane
Tom Lane writes:
> Keep in mind though that the COPY process is going to involve several
> working copies of that data (at least four that I can think of ---
> line input buffer, field input buffer, constructed text object, and
> constructed tuple).

Will this come out of the shared_buffers memory?

> I'm also not clear on whether the 512MB limit you refer to will count
> the PG shared memory area

The OS limit is set to 1.6GB. I increased shared_buffers to 450MB and it still failed.

> hundred meg off the top of what a backend can allocate as temporary
> workspace.

Is there anything I can change in my log settings so I can produce something which will help you narrow down this problem?

> So it seems entirely likely to me that you'd need a ulimit above 512MB
> to push around 84MB fields.

The issue I am trying to figure out is which limit: the OS limit is set to 1.6GB. I am now trying to increase my shared_buffers; so far I have them at 450MB and it is still failing. I will also try the setting Vivek suggested, although for that I may need to restart the machine.
Francisco Reyes <lists@stringsutils.com> writes:
> The issue I am trying to figure out is which limit: the OS limit is set to
> 1.6GB. I am now trying to increase my shared_buffers; so far I have them at
> 450MB and it is still failing.

For this problem, increasing shared_buffers is either useless or downright counterproductive. It cannot increase the amount of temporary workspace a particular backend can grab, and it might decrease it.

regards, tom lane
Francisco Reyes <lists@stringsutils.com> writes:
> Also, the error is about running out of memory when trying to allocate 84MB.
> The default FreeBSD limit is 512MB, so 84MB is well below that.

Well, no, it's not. I traced through a test case involving loading a multi-megabyte text value, and what I find is that there are actually five concurrently allocated multi-megabyte areas:

* copy's line buffer
* copy's field buffer
* textin's result value
* heap_form_tuple result
* output workspace for toast_compress_datum

What's more, because the line and field buffers are StringInfos that are intended for reuse across multiple lines/fields, they're not simply made equal to the exact size of the big field. They're rounded up to the next power-of-2, ie, if you've read an 84MB field during the current COPY IN then they'll be 128MB apiece. In short, COPY is going to need 508MB of process-local RAM to handle this row.

That's on top of the few megabytes of random housekeeping info that a backend keeps around. And it's entirely likely that your 450MB of shared buffers (plus whatever else is in your shared memory area) gets counted against each process' ulimit, too. In short, you need a bigger per-process memory allowance.

BTW: I think if you were using different client and server encodings there would be yet a sixth large buffer involved, for the output of pg_client_to_server.

Basically Postgres is designed on the assumption that you have room for multiple copies of the widest field you want to process. I have not bothered to see how many copies of the field would be involved in a "SELECT * FROM ..." operation, but I can assure you it'd be several. If you can't afford a factor of 5 or 10 headroom on your widest fields, you should look at storing them as large objects so you can store and fetch them a chunk at a time.

regards, tom lane
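(For what it's worth, a minimal sketch of the large-object route Tom mentions at the end. The column name attachment_body_lo and the file path are invented for illustration; the attachment_id value is the one from the failing row. psql's \lo_import sets the LASTOID variable, which the following UPDATE can reference.)

    psql -d copydb <<'SQL'
    -- hypothetical extra column holding a large-object OID instead of inline text
    ALTER TABLE message_attachments ADD COLUMN attachment_body_lo oid;
    \lo_import '/tmp/crew_holds.sit'
    UPDATE message_attachments
       SET attachment_body_lo = :LASTOID
     WHERE attachment_id = 2720290;
    SQL

A client can then stream the content back in chunks through the large-object API (lo_read/lo_export) instead of pulling an 84MB text field through COPY or SELECT in one piece.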
Tom Lane writes:
> What's more, because the line and field buffers are StringInfos that are
> intended for reuse across multiple lines/fields, they're not simply made
> equal to the exact size of the big field. They're rounded up to the
> next power-of-2, ie, if you've read an 84MB field during the current
> COPY IN then they'll be 128MB apiece. In short, COPY is going to need
> 508MB of process-local RAM to handle this row.

Of shared memory? I am a little confused; yesterday you said that increasing shared_buffers may be counterproductive. Or are you referring to the OS size? The OS size is 1.6GB, but today I am going to try increasing kern.maxssiz. Vivek recommended increasing it.

> In short, you need a bigger per-process memory allowance.

I wrote a mini Python program to copy one of the records that is failing. The client program is using 475MB with 429MB resident.

The server has been running all night on this single insert. The server is using 977MB with 491MB resident. Yesterday I saw it grow as big as 1000MB with 900MB+ resident.

> BTW: I think if you were using different client and server encodings
> there would be yet a sixth large buffer involved, for the output of
> pg_client_to_server.

Using the default encoding.

The log file keeps growing with similar messages. I put a subset of the log at http://public.natserv.net/postgresql-2007-06-19.txt

However, the process doesn't crash.
Francisco Reyes wrote:
> Tom Lane writes:
>
> > What's more, because the line and field buffers are StringInfos that are
> > intended for reuse across multiple lines/fields, they're not simply made
> > equal to the exact size of the big field. They're rounded up to the
> > next power-of-2, ie, if you've read an 84MB field during the current
> > COPY IN then they'll be 128MB apiece. In short, COPY is going to need
> > 508MB of process-local RAM to handle this row.
>
> Of shared memory?

No.

> I am a little confused; yesterday you said that increasing shared_buffers
> may be counterproductive.

Yes, that's what he said.

> Or are you referring to the OS size?

Yes.

> The OS size is 1.6GB, but today I am going to try increasing kern.maxssiz.
> Vivek recommended increasing it.

The problem is probably the ulimit. I don't know what kern.maxssiz is, though.

> > In short, you need a bigger per-process memory allowance.
>
> I wrote a mini Python program to copy one of the records that is failing.
> The client program is using 475MB with 429MB resident.
>
> The server has been running all night on this single insert.
> The server is using 977MB with 491MB resident.
> Yesterday I saw it grow as big as 1000MB with 900MB+ resident.

Can you send the program along? And the table definition (including indexes, etc)?

-- 
Alvaro Herrera                      Developer, http://www.PostgreSQL.org/
"Find a bug in a program, and fix it, and the program will work today.
Show the program how to find and fix a bug, and the program will work
forever" (Oliver Silfridge)
Alvaro Herrera writes:
>> The OS size is 1.6GB, but today I am going to try increasing kern.maxssiz.
>> Vivek recommended increasing it.

kern.maxssiz is the maximum stack size. I increased it to 512MB; it didn't help.

> The problem is probably the ulimit.

I did a shell script which continuously ran "ps auxwm | head -n >>SizePS.txt" as I tried to load the data using the copy command.

>> > In short, you need a bigger per-process memory allowance.

The program crashed right after it reached 895704 kB (874MB) in size and 400212 kB (390MB) resident.

> Can you send the program along? And the table definition (including
> indexes, etc)?

The problem, as far as I can tell, is the data. I wrote the program to see whether the problem was only with "copy from". The program without the data is not likely to be of much use. I will see if I can get authorization to provide the data (1 record) and the table structure. The data is about 60MB compressed, though. The file was created using COPY TO, so to load it I just use "copy <table> from '<file>'".
Alvaro Herrera writes:
> Can you send the program along? And the table definition (including
> indexes, etc)?

I put the table definition and a Python program that reproduces the problem at:

http://public.natserv.net/table-schema.sql
http://public.natserv.net/large_record.py

The program uses the psycopg2 library. All the program does is send test data to the table, in particular a large text value for a text column called attachment_body.

The database doesn't give an out of memory error like it does when trying to run a copy, but instead generates a lot of output to the log, such as http://public.natserv.net/postgresql-2007-06-19.txt

The out of memory error when trying to use the copy command is at http://public.natserv.net/postgresql-2007-06-15.log

Thanks for any further pointers and suggestions. I don't know what else to try at this point.
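(A rough, shell-only way to reproduce the COPY failure without the real attachment data. The test table name, file path, and the ~90MB field size are invented for illustration; a throwaway table avoids the foreign key and check constraints on the real one. \copy reads the file on the client but the server parses it through the same COPY FROM code path, so the per-backend memory demands should be comparable.)

    # create a throwaway table so constraints on message_attachments don't get in the way
    psql -d copydb -c "CREATE TABLE copy_width_test (id int, junk int, body text);"

    # build a one-row COPY file whose text field is ~90MB
    awk 'BEGIN { s = "x"; while (length(s) < 90000000) s = s s;
                 printf "1\t1\t%s\n", substr(s, 1, 90000000) }' > /tmp/bigrow.dat

    # load it; this should hit the same out-of-memory behavior if the limit is the cause
    psql -d copydb -c "\copy copy_width_test from '/tmp/bigrow.dat'"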
Would it help at all to run a ktrace? Or are the logs I have supplied enough?
Alvaro Herrera writes:
> The problem is probably the ulimit.

The problem occurs on i386, but not on the 64-bit architecture. I tested 5 machines:

3 i386 FreeBSD 6.2
2 AMD64 FreeBSD 6.2

The 64-bit machines, with postgresql compiled from ports, worked. One of the machines had the default OS limits. The second is already a dedicated postgresql machine, so it already had its OS limits increased.

So is this a bug in the i386 version of Postgresql or a limitation of FreeBSD i386?
I re-did the machine that was running out of memory and installed FreeBSD 6.2 AMD64. The rows that used to fail now load.

Therefore, the problem is only with the i386 version. Should I report this as a bug, or is this "normal" and expected?
On Jun 25, 2007, at 9:33 PM, Francisco Reyes wrote:
> Therefore, the problem is only with the i386 version.
> Should I report this as a bug, or is this "normal" and expected?

I wouldn't call it a bug to need more resources than you've got available :-) Obviously the limits on the i386 version of FreeBSD are lower than those on the amd64 version, but I haven't got any "big" i386 machines (ie, more than 1Gb of RAM) to compare. All my big boxes are amd64.
Vivek Khera writes:
> I wouldn't call it a bug to need more resources than you've got
> available :-)

Hm... now I am really confused. The same settings on AMD64 work. So how are "more resources available" when I have the same amount of memory and the same settings? I even tried a machine with less memory, on AMD64, and the records that failed on i386 loaded fine. Even with default settings in postgresql.conf the problem records load, whereas they fail on i386.

I think it must be a variable that is likely 4 bytes on i386 and 8 bytes on 64-bit.
On Jun 25, 2007, at 10:32 PM, Francisco Reyes wrote:
> Hm... now I am really confused.
> The same settings on AMD64 work. So how are "more resources
> available" when I have the same amount of memory and the same
> settings?

Did you set your maxdsiz to the same as on i386? On even my smallest amd64 boxes I see a data size limit of about 32Gb per process.
Vivek Khera writes:
> Did you set your maxdsiz to the same as on i386?

On the AMD64 machine I have /boot/loader.conf as:

kern.ipc.semmni=256
kern.ipc.semmns=512
kern.ipc.semmnu=256

On i386 I believe I had set:

kern.ipc.semmni=256
kern.ipc.semmns=512
kern.ipc.semmnu=256
kern.maxdsiz="1600MB"
kern.dfldsiz="1600MB"
kern.maxssiz="512MB"

On another machine, my laptop, I did not increase anything. I used the defaults straight from postgresql.conf and the large columns loaded fine. All 4 trouble rows were above 80MB.

Limits compared.

AMD64 limit:
cputime         unlimited
filesize        unlimited
datasize        33554432 kbytes
stacksize       524288 kbytes
coredumpsize    unlimited
memoryuse       unlimited
vmemoryuse      unlimited
descriptors     11095
memorylocked    unlimited
maxproc         5547
sbsize          unlimited

Limit on i386 was:
cputime         unlimited
filesize        unlimited
datasize        2097152 kbytes
stacksize       131072 kbytes
coredumpsize    unlimited
memoryuse       unlimited
vmemoryuse      unlimited
descriptors     11095
memorylocked    unlimited
maxproc         5547
sbsize          unlimited