Thread: New PG14 server won't start with >2GB shared_buffers
Hello All,
I have a strange issue that I cannot solve.
I have a new server that has 1TB of RAM (not a typo). I have been fighting with it this evening, trying to get it to start up with a shared_buffers setting over 2GB. When the memory allocation goes over 2GB, the database won’t start. I’ve dropped it down to 500M to get the database online. We’ve tried both SysV memory and mmap memory.
Any ideas on what is going on and how to resolve? I’m sure I’m missing something obvious, but it is totally eluding me.
OS: Ubuntu 18.04
Memory: 1007GB
CPU: Intel(R) Xeon(R) CPU X7550 @ 2.00GHz (64 cores total)
Kernel Memory:
kernel.shmmax = 274877906944
kernel.shmall = 17179869184
Postgres Config:
name | setting
autovacuum_max_workers | 3
autovacuum_work_mem | -1
effective_cache_size | 98992128
fsync | on
full_page_writes | on
maintenance_work_mem | 2097152
shared_buffers | 64000
shared_memory_type | sysv
temp_buffers | 4096
wal_buffers | 2048
work_mem | 119990
If you need additional parameters, please ask
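For reference, a small sketch of how the shared_buffers value in the listing translates into bytes. It assumes the setting is reported in blocks of 8192 bytes, which is the usual pg_settings unit for a default build; the listing itself does not state the unit.

```shell
# Assumption: shared_buffers is shown in 8192-byte blocks (default block size).
block_size=8192
shared_buffers_blocks=64000   # value from the config listing in the post

shared_buffers_bytes=$((shared_buffers_blocks * block_size))
shared_buffers_mib=$((shared_buffers_bytes / 1024 / 1024))
echo "shared_buffers = ${shared_buffers_bytes} bytes (${shared_buffers_mib} MiB)"
```

Under that assumption the listing matches the 500M fallback setting described in the post.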
SysV Error:
Feb 25 01:55:16 appdb-server01.production.aweberint.com postgres[34680]: FATAL: could not create shared memory segment: Cannot allocate memory
DETAIL: Failed system call was shmget(key=2359323, size=2063564800, 03600).
HINT: This error usually means that PostgreSQL's request for a shared memory segment exceeded your kernel's SHMALL parameter. You might need to reconfigure the kernel with larger SHMALL.
Mmap Error:
FATAL: could not map anonymous shared memory: Cannot allocate memory
HINT: This error usually means that PostgreSQL's request for a shared memory segment exceeded available memory, swap space, or huge pages. To reduce the request size (currently 1635737600 bytes), reduce PostgreSQL's shared memory usage, perhaps by reducing shared_buffers or max_connections.
Thanks,
Chris Hoover
Senior DBA
AWeber.com
Cell: (803) 528-2269
Email: chrish@aweber.com
Chris Hoover <chrish@aweber.com> writes:
> I have a new server that has 1TB ram (not a typo). I have been fighting with this evening to start up with a shared_buffer setting over 2GB. When the memory allocation breaks 2GB, the database won’t start. I’ve dropped it down to 500M to get the database online. We’ve tried both SysV memory and mmap memory.
Are you running inside a VM, or cgroup, or something else that might be imposing a memory limit?
regards, tom lane
This is bare metal.
Thanks,
Chris Hoover
On Feb 24, 2023, at 21:19, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Are you running inside a VM, or cgroup, or something else that might be imposing a memory limit?
Why did you have to drop shared buffers to 500MB, instead of something just below 2GB?
And what's "your kernel's SHMALL parameter"?
On 2/24/23 20:08, Chris Hoover wrote:
> I’ve dropped it down to 500M to get the database online.
> HINT: This error usually means that PostgreSQL's request for a shared memory segment exceeded your kernel's SHMALL parameter. You might need to reconfigure the kernel with larger SHMALL.
--
Born in Arizona, moved to Babylonia.
Hello,
Could this be something to do with large (huge) pages?
Regards
From: "Ron" <ronljohnsonjr@gmail.com>
To: "pgsql-admin" <pgsql-admin@lists.postgresql.org>
Sent: Saturday, February 25, 2023 11:25:10
Subject: Re: New PG14 server won't start with >2GB shared_buffers
Why did you have to drop shared buffers to 500MB, instead of something just below 2GB?
And what's "your kernel's SHMALL parameter"?
Ron,
Honestly, I grabbed 500MB just to get the database up; by that time I was very frustrated and just wanted it running. :) I think I was at 1500MB and it still would not start. So, to save what was left of my evening out, I just picked 500.
Here is shmall:
kernel.shmall = 17179869184
Thanks,
Chris Hoover
I just don't get what all this fuss is about trying to get PG up on the smallest possible values for shared_buffers, especially considering this person has 1 TB of memory?
Regards,
Michael Vitale
703-600-9343

Chris Hoover wrote on 2/25/2023 8:07 AM:
> Here is shmall:
> kernel.shmall = 17179869184
Hi Chris,
What are your limits set to for the account that the database runs as?
Regards,
Ken
MichaelDBA,
The issue is that PostgreSQL won’t start with a large shared_buffers setting. If I try to start it with more than ~2GB, it won’t start (see the errors in the original post). That’s the issue we are trying to resolve.
This is a large server running on bare metal that won’t let PG start with a large shared_buffers setting. Until we can resolve this, we can’t upgrade to this server.
Thanks,
Chris Hoover
On Sat, Feb 25, 2023 at 9:05 AM MichaelDBA <MichaelDBA@sqlexec.com> wrote:
> I just don't get what all this fuss is about trying to get PG up on the smallest possible values for shared_buffers, especially considering this person has 1 TB of memory?
The fuss is why he can't make it bigger. No one wants to make it as small as possible.
Chris Hoover wrote on 2/25/2023 8:07 AM:
> Here is shmall:
> kernel.shmall = 17179869184
How did that get so small? On my Ubuntu 18.04, it comes out of the box at 2^64 - 1. Why would anyone have set about making it smaller? Now the current value should still be big enough, but I wonder what else got changed while someone was monkeying with things that didn't need to be monkeyed with.
Cheers,
Jeff
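A quick arithmetic check of the remark that the current value "should still be big enough". kernel.shmall is counted in pages, while kernel.shmmax is in bytes. The 4096-byte page size below is an assumption (typical for x86-64; `getconf PAGE_SIZE` reports the real value), and the request size is taken from the shmget() failure in the original post.

```shell
page_size=4096                 # assumption; confirm with: getconf PAGE_SIZE
shmall_pages=17179869184       # kernel.shmall from the thread (pages, system-wide total)
shmmax_bytes=274877906944      # kernel.shmmax from the thread (bytes, per segment)
request_bytes=2063564800       # size passed to shmget() in the reported error

shmall_bytes=$((shmall_pages * page_size))
shmall_tib=$((shmall_bytes / 1024 / 1024 / 1024 / 1024))
shmmax_gib=$((shmmax_bytes / 1024 / 1024 / 1024))
echo "shmall: ${shmall_tib} TiB, shmmax: ${shmmax_gib} GiB"

# Compare the failed request against both kernel limits.
if [ "$request_bytes" -le "$shmmax_bytes" ] && [ "$request_bytes" -le "$shmall_bytes" ]; then
    verdict="within both limits"
else
    verdict="exceeds a limit"
fi
echo "request of ${request_bytes} bytes is ${verdict}"
```

If the page-size assumption holds, the request is far below both settings, which suggests the SHMALL hint in the error message is not the real cause here.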
I sincerely apologize. I should have read this stuff more closely.
Jeff Janes wrote on 2/25/2023 9:51 AM:
> The fuss is why he can't make it bigger. No one wants to make it as small as possible.
Regards,
Michael Vitale
Ken,
$ id
uid=998(postgres) gid=1003(postgres) groups=1003(postgres),115(ssl-cert)
$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 4128123
max locked memory (kbytes, -l) 65536
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 4128123
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
$ sysctl -a | grep shm
kernel.shm_next_id = -1
kernel.shm_rmid_forced = 0
kernel.shmall = 17179869184
kernel.shmmax = 274877906944
kernel.shmmni = 4096
Thanks,
Chris Hoover
On Feb 25, 2023, at 9:49 AM, Kenneth Marshall <ktm@rice.edu> wrote:
> Hi Chris,
> What are your limits set to for the account that the database runs as?
NP, it’s early on a Saturday here on the East Coast. Just grateful for the assistance. :)
Thanks,
Chris Hoover
Bear in mind that if you are using systemd to start postgres, then these user limits may not apply.
The logged-in user limits are set by the PAM subsystem, but when systemd starts a service no actual login occurs and PAM is never invoked.
The systemd unit file must have the limits set within it.
In the [Service] section, use one or more of:
LimitCPU= Seconds
LimitFSIZE= Bytes
LimitSTACK= Bytes
LimitCORE= Bytes
LimitRSS= Bytes
LimitNOFILE= integer
LimitAS= Bytes
LimitNPROC= integer
LimitMEMLOCK= Bytes
LimitLOCKS= integer
LimitSIGPENDING= integer
LimitMSGQUEUE= Bytes
LimitNICE= Nice Level
LimitRTPRIO= Realtime Priority
LimitRTTIME= Microseconds
-- Evan Rempel 250.721.7691 Senior Systems Administrator erempel@uvic.ca Data Centre Services, University Systems, University of Victoria
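The point about systemd-started services can be checked by reading the limits of the running server process instead of those of a login shell. A minimal sketch: soft_limit is a helper name made up for this example, and it only parses the standard /proc/<pid>/limits layout on Linux.

```shell
# Print the soft limit for one resource from a /proc/<pid>/limits-style file.
# $1 = resource name exactly as shown in the first column, $2 = file to read.
soft_limit() {
    awk -v name="$1" '
        index($0, name) == 1 {
            $0 = substr($0, length(name) + 1)   # drop the name; fields are re-split
            print $1                            # first remaining field is the soft limit
        }' "$2"
}

# Example on the current shell (Linux only); for the server, point it at
# /proc/<postmaster pid>/limits instead.
if [ -r /proc/self/limits ]; then
    soft_limit "Max locked memory" /proc/self/limits
fi
```

On the systemd side, `systemctl show patroni.service -p LimitMEMLOCK -p LimitAS` should report what is applied to the unit (patroni.service is the unit name that appears in the lscgroup output in this thread).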
Hi.
Just to ask some basic questions:
What do the following commands output?
cat /etc/*release*
uname -a
ulimit -a # run this as both root and the postgres user
Best regards.
Evan Rempel <erempel@uvic.ca> writes:
> Bear in mind that if you are using systemd to start postgres, then these
> user limits may not apply.
Yeah, it seems likely that the PG server is being started under more-restrictive limits than what these manual reports suggest. It would be useful to try logging in as the Postgres OS user and manually starting the server -- just do "postgres &" and see what happens. (If it does start, "pg_ctl stop" can be used to shut it down again, or you can manually send SIGTERM to the postmaster process.)
I tried to reproduce the problem by intentionally setting "ulimit -v" too small for my PG settings, and I got error messages that were similar but not identical to what Chris reported. (I think the sysv case failed at shmat() not shmget().) So it's probably not ulimit per se that's responsible. But if the server is being started under systemd, then I can entirely believe that systemd has some poorly-documented feature that sets additional limits for daemon processes. I'm still wondering about cgroups, too.
regards, tom lane
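The cgroup question can be checked directly. A rough sketch, assuming cgroup v1 with the memory controller mounted at /sys/fs/cgroup/memory (believed to be the default layout on Ubuntu 18.04, but worth verifying on the host). memory_cgroup_path is a helper name made up for this example.

```shell
# Print the memory-controller cgroup path from a /proc/<pid>/cgroup-style file.
# Under cgroup v1, lines look like "9:memory:/system.slice/patroni.service".
memory_cgroup_path() {
    awk -F: '$2 == "memory" { print $3 }' "$1"
}

# Example on the current shell; use /proc/<postmaster pid>/cgroup for the server.
if [ -r /proc/self/cgroup ]; then
    path=$(memory_cgroup_path /proc/self/cgroup)
    limit_file="/sys/fs/cgroup/memory${path}/memory.limit_in_bytes"
    if [ -n "$path" ] && [ -r "$limit_file" ]; then
        echo "memory limit for ${path}: $(cat "$limit_file")"
    fi
fi
```

A very large number in memory.limit_in_bytes normally means no limit is set for that group.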
Do you have huge pages configured in the OS? Is the Postgres huge_pages setting on or off?
On Sat, Feb 25, 2023 at 11:35 AM Carlos Martinez <camarti@gmail.com> wrote:
> Just to ask some basic questions: what do the following commands output?
> cat /etc/*release*
> uname -a
> ulimit -a # run this as both root and the postgres user
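The OS side of the huge pages question can be read from /proc/meminfo. A small sketch: meminfo_value is a helper name made up for this example, and it relies only on the standard "Key: value" layout of that file.

```shell
# Print the numeric value for one key from a /proc/meminfo-style file.
meminfo_value() {
    awk -v key="$1:" '$1 == key { print $2 }' "$2"
}

if [ -r /proc/meminfo ]; then
    echo "HugePages_Total: $(meminfo_value HugePages_Total /proc/meminfo)"
    echo "HugePages_Free:  $(meminfo_value HugePages_Free /proc/meminfo)"
    echo "Hugepagesize kB: $(meminfo_value Hugepagesize /proc/meminfo)"
fi
```

The PostgreSQL side can be read with `psql -c "SHOW huge_pages;"` once the server is up with the smaller setting.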
What is the output of: free -h
On Sat, Feb 25, 2023 at 12:56 PM Tim <timfosho@gmail.com> wrote:
> Do you have hugepages configured in the OS? Is the Postgres hugepages setting set to on or off?
Tom,
Got a chance to work on this today. Here is what I’m getting. I used the same command that is used when the config has small memory.
$ /usr/lib/postgresql/14/bin/postgres -D /var/lib/postgresql/14/<dir> --config-file=/etc/postgresql/14/appdb/postgresql.conf --listen_addresses=0.0.0.0 --port=5432 --cluster_name=<name> --wal_level=logical --hot_standby=on --max_connections=150 --max_wal_senders=10 --max_prepared_transactions=0 --max_locks_per_transaction=64 --track_commit_timestamp=off --max_replication_slots=10 --max_worker_processes=12 --wal_log_hints=on
FATAL: could not map anonymous shared memory: Cannot allocate memory
HINT: This error usually means that PostgreSQL's request for a shared memory segment exceeded available memory, swap space, or huge pages. To reduce the request size (currently 21967716352 bytes), reduce PostgreSQL's shared memory usage, perhaps by reducing shared_buffers or max_connections.
LOG: database system is shut down
$ /usr/lib/postgresql/14/bin/postgres -D /var/lib/postgresql/14/<dir> --config-file=/etc/postgresql/14/appdb/postgresql.conf --listen_addresses=0.0.0.0 --port=5432 --cluster_name=<name> --wal_level=logical --hot_standby=on --max_connections=150 --max_wal_senders=10 --max_prepared_transactions=0 --max_locks_per_transaction=64 --track_commit_timestamp=off --max_replication_slots=10 --max_worker_processes=12 --wal_log_hints=on
FATAL: could not create shared memory segment: Cannot allocate memory
DETAIL: Failed system call was shmget(key=2359323, size=21967716352, 03600).
HINT: This error usually means that PostgreSQL's request for a shared memory segment exceeded your kernel's SHMALL parameter. You might need to reconfigure the kernel with larger SHMALL.
The PostgreSQL documentation contains more information about shared memory configuration.
LOG: database system is shut down
$ ulimit -v
unlimited
I’m not sure about the cgroup stuff. I installed lscgroup and this is what it shows. (BTW, we are using Patroni to control PostgreSQL, but it does not appear to be part of the problem since I’ve duplicated the errors without it):
$ lscgroup
devices:/
devices:/user.slice
devices:/system.slice
devices:/system.slice/irqbalance.service
devices:/system.slice/system-systemd\x2dfsck.slice
devices:/system.slice/syslog-ng.service
devices:/system.slice/systemd-networkd.service
devices:/system.slice/systemd-udevd.service
devices:/system.slice/cron.service
devices:/system.slice/oddjobd.service
devices:/system.slice/sys-fs-fuse-connections.mount
devices:/system.slice/sys-kernel-config.mount
devices:/system.slice/networkd-dispatcher.service
devices:/system.slice/sys-kernel-debug.mount
devices:/system.slice/certmonger.service
devices:/system.slice/accounts-daemon.service
devices:/system.slice/swapfile.swap
devices:/system.slice/numad.service
devices:/system.slice/systemd-journald.service
devices:/system.slice/unattended-upgrades.service
devices:/system.slice/sensu-client.service
devices:/system.slice/ssh.service
devices:/system.slice/dev-mqueue.mount
devices:/system.slice/rpc-gssd.service
devices:/system.slice/vnstat.service
devices:/system.slice/var-lib-postgresql.mount
devices:/system.slice/rpcbind.service
devices:/system.slice/chrony.service
devices:/system.slice/sssd.service
devices:/system.slice/proc-sys-fs-binfmt_misc.mount
devices:/system.slice/run-rpc_pipefs.mount
devices:/system.slice/autofs.service
devices:/system.slice/patroni.service
devices:/system.slice/consul.service
devices:/system.slice/telegraf.service
devices:/system.slice/dev-hugepages.mount
devices:/system.slice/dbus.service
devices:/system.slice/system-getty.slice
devices:/system.slice/systemd-logind.service
cpuset:/
cpu,cpuacct:/
cpu,cpuacct:/user.slice
cpu,cpuacct:/user.slice/user-5088.slice
cpu,cpuacct:/system.slice
memory:/
pids:/
pids:/user.slice
pids:/user.slice/user-5088.slice
pids:/user.slice/user-5088.slice/user@5088.service
pids:/user.slice/user-5088.slice/session-267.scope
pids:/user.slice/user-5088.slice/session-269.scope
pids:/user.slice/user-5088.slice/session-88.scope
pids:/system.slice
pids:/system.slice/irqbalance.service
pids:/system.slice/system-systemd\x2dfsck.slice
pids:/system.slice/syslog-ng.service
pids:/system.slice/systemd-networkd.service
pids:/system.slice/systemd-udevd.service
pids:/system.slice/cron.service
pids:/system.slice/oddjobd.service
pids:/system.slice/sys-fs-fuse-connections.mount
pids:/system.slice/sys-kernel-config.mount
pids:/system.slice/networkd-dispatcher.service
pids:/system.slice/sys-kernel-debug.mount
pids:/system.slice/certmonger.service
pids:/system.slice/accounts-daemon.service
pids:/system.slice/swapfile.swap
pids:/system.slice/numad.service
pids:/system.slice/systemd-journald.service
pids:/system.slice/unattended-upgrades.service
pids:/system.slice/sensu-client.service
pids:/system.slice/ssh.service
pids:/system.slice/dev-mqueue.mount
pids:/system.slice/rpc-gssd.service
pids:/system.slice/vnstat.service
pids:/system.slice/var-lib-postgresql.mount
pids:/system.slice/rpcbind.service
pids:/system.slice/chrony.service
pids:/system.slice/sssd.service
pids:/system.slice/proc-sys-fs-binfmt_misc.mount
pids:/system.slice/run-rpc_pipefs.mount
pids:/system.slice/autofs.service
pids:/system.slice/patroni.service
pids:/system.slice/consul.service
pids:/system.slice/telegraf.service
pids:/system.slice/dev-hugepages.mount
pids:/system.slice/dbus.service
pids:/system.slice/system-getty.slice
pids:/system.slice/system-getty.slice/getty@tty1.service
pids:/system.slice/systemd-logind.service
rdma:/
hugetlb:/
perf_event:/
blkio:/
freezer:/
net_cls,net_prio:/
Thanks,
Chris Hoover
Senior DBA
AWeber.com
Cell: (803) 528-2269
Email: chrish@aweber.com
On Feb 25, 2023, at 11:35 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Evan Rempel <erempel@uvic.ca> writes:
> Bear in mind that if you are using systemd to start postgres, then these
> user limits may not apply.
Yeah, it seems likely that the PG server is being started under
more-restrictive limits than what these manual reports suggest.
It would be useful to try logging in as the Postgres OS user and
manually starting the server -- just do "postgres &" and see what
happens. (If it does start, "pg_ctl stop" can be used to shut it
down again, or you can manually send SIGTERM to the postmaster
process.)
I tried to reproduce the problem by intentionally setting
"ulimit -v" too small for my PG settings, and I got error messages
that were similar but not identical to what Chris reported.
(I think the sysv case failed at shmat() not shmget().) So it's
probably not ulimit per se that's responsible. But if the
server is being started under systemd, then I can entirely
believe that systemd has some poorly-documented feature that
sets additional limits for daemon processes.
I'm still wondering about cgroups, too.
regards, tom lane
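One way to compare the limits a running server actually received against what an interactive `ulimit -a` reports is to read the process's limits file under /proc. The snippet below is a minimal sketch, not something posted in the thread: `show_limit` is a hypothetical helper name, and `$PGDATA` in the usage comment stands in for the data directory.

```shell
# Print one row from a /proc/<pid>/limits-style file.
# $1: limit name exactly as it appears in the file, e.g. "Max address space"
# $2: path to the limits file (defaults to the limits of the calling process)
show_limit() {
    grep -F "$1" "${2:-/proc/self/limits}"
}

# Usage against a running postmaster (the first line of postmaster.pid is its PID):
#   show_limit "Max address space" "/proc/$(head -n 1 "$PGDATA/postmaster.pid")/limits"
#   show_limit "Max locked memory" "/proc/$(head -n 1 "$PGDATA/postmaster.pid")/limits"
# Limits configured on the systemd unit itself can be listed with, for example:
#   systemctl show patroni.service -p LimitAS -p LimitMEMLOCK
```

If the values differ from the interactive session, the service manager is applying its own limits; if they match, as turned out to be the case when the failure was reproduced interactively, the cause lies elsewhere.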
Chris Hoover <chrish@aweber.com> writes:
> Got a chance to work on this today. Here is what I’m getting. I used the same command that is used when the config has small memory.
OK, so it still happens when interactive, which seems to exclude systemd.
Have you tried setting huge_pages = off, as somebody suggested upthread?
(The default setting of "try" ought to work anyway, but we've heard a few
reports suggesting sometimes it doesn't.)
Googling for ways to limit resources under Linux led me to mentions of
/etc/security/limits.conf, which might be worth checking. The comments
in that say it only applies to logins via PAM, which should include your
interactive session and I'm not too sure about Patroni.
Otherwise I'm kind of running out of ideas :-(. *Something* is clearly
interfering with your resource limits, but I don't know what.
regards, tom lane
Tom and everyone,
We are leaning towards some sort of hardware/physical memory issue. Our server admins are going to take the server and do some in-depth testing of the hardware.
Thanks again for all the assistance over the weekend.
Thanks,
Chris Hoover
Senior DBA
AWeber.com
Cell: (803) 528-2269
Email: chrish@aweber.com
On Feb 26, 2023, at 7:22 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Chris Hoover <chrish@aweber.com> writes:
> Got a chance to work on this today. Here is what I’m getting. I used the same command that is used when the config has small memory.
OK, so it still happens when interactive, which seems to exclude systemd.
Have you tried setting huge_pages = off, as somebody suggested upthread?
(The default setting of "try" ought to work anyway, but we've heard a few
reports suggesting sometimes it doesn't.)
Googling for ways to limit resources under Linux led me to mentions of
/etc/security/limits.conf, which might be worth checking. The comments
in that say it only applies to logins via PAM, which should include your
interactive session and I'm not too sure about Patroni.
Otherwise I'm kind of running out of ideas :-(. *Something* is clearly
interfering with your resource limits, but I don't know what.
regards, tom lane
Just to wrap this up, we finally found the issue.
vm.overcommit_ratio was set to 0. After setting vm.overcommit_ratio to 50, the issue resolved.
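A note on why this setting matters: vm.overcommit_ratio only takes effect when vm.overcommit_memory is 2 (strict accounting). The thread does not show that setting, so treat it as an assumption here. In that mode the kernel caps total committed memory at CommitLimit = (RAM - huge page reservation) * overcommit_ratio / 100 + swap, so a ratio of 0 leaves only swap available for allocations such as shared_buffers. The sketch below works through the arithmetic; `commit_limit_kb` is a hypothetical helper, and the 2 GB swap figure is illustrative only since the actual swap size was never posted.

```shell
# Estimate CommitLimit in kB under vm.overcommit_memory=2.
# Arguments: MemTotal_kB  hugepage_reservation_kB  SwapTotal_kB  overcommit_ratio
commit_limit_kb() {
    echo $(( ($1 - $2) * $4 / 100 + $3 ))
}

# Illustrative figures: 1007 GB RAM, no huge pages, 2 GB swap (swap size is assumed).
commit_limit_kb 1055916032 0 2097152 0     # prints 2097152   (about 2 GB: swap only)
commit_limit_kb 1055916032 0 2097152 50    # prints 530055168 (about 505 GB)

# Live values on a Linux host:
#   sysctl vm.overcommit_memory vm.overcommit_ratio
#   grep -E 'CommitLimit|Committed_AS' /proc/meminfo
```

Under these assumptions, any request that pushes Committed_AS past CommitLimit fails with "Cannot allocate memory" regardless of how much physical RAM is free, which would be consistent with the shmget() and mmap failures reported above.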
Thanks,
Chris Hoover
Senior DBA
AWeber.com
Cell: (803) 528-2269
Email: chrish@aweber.com
On Feb 27, 2023, at 11:25 AM, Chris Hoover <chrish@aweber.com> wrote:
> Tom and everyone,
> We are leaning towards some sort of hardware/physical memory issue. Our server admins are going to take the server and do some in-depth testing of the hardware.
> Thanks again for all the assistance over the weekend.
The default is 50, so somebody changed it?
Chris Hoover wrote on 3/2/2023 4:53 PM:
> Just to wrap this up, we finally found the issue.
> vm.overcommit_ratio was set to 0. After setting vm.overcommit_ratio to 50, the issue resolved.
Regards,
Michael Vitale
703-600-9343
