Thread: Need for help!
Dear all,
I am running PostgreSQL 8.2.5 with a ~6 GB database that serves lots of simple index-based SELECTs. I can see that they make heavy use of shared memory.
Previously my server had 4 GB of RAM, shmmax 1 GB, shared_buffers 256 MB, and effective_cache_size 300 MB. When I tested its performance with the options -c 40 -t 1000, the result was about 54.542 tps, but when I raised the number of clients above 64 it refused to run. Now the server has 8 GB of RAM, shmmax 3 GB, and shared_buffers 2 GB, and it uses ~7 GB of cache; after the same benchmark (-c 40 -t 1000) the result is 57.658 tps (???). But even after the upgrade the maximum number of clients is still 64 (?!?). Is this the maximum number of clients supported by the pgbench program (my server runs PostgreSQL 8.2.5 on Linux; pgbench runs on Windows with PostgreSQL 8.3.1)? And is 57 tps fast?
Another question: I have heard that PostgreSQL does not support Hyper-Threading (HT) Technology. Is that right?
Last question: I don't really understand shmmax and shared_buffers. After upgrading the server from 4 GB to 8 GB of RAM, I first set shmmax to 2 GB and shared_buffers to 1 GB, and the server started fine. Then I set shmmax to 4 GB and restarted, and it failed (?!?); the error log said there was not enough shared memory. Finally I set shmmax to 3 GB and shared_buffers to 2 GB, and it runs. I don't know why. Can you explain?
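Concretely, the three combinations I tried looked like this (sysctl values are my best recollection, written out in bytes):

```
# attempt 1: server starts fine
kernel.shmmax = 2147483648     # 2 GB
shared_buffers = 1GB

# attempt 2: fails at startup, log says not enough shared memory
kernel.shmmax = 4294967296     # 4 GB
shared_buffers = 2GB

# attempt 3: server starts fine
kernel.shmmax = 3221225472     # 3 GB
shared_buffers = 2GB
```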
Thanks so much!
Regards,
On Tue, May 13, 2008 at 2:43 PM, Semi Noob <seminoob@gmail.com> wrote:

> But after upgrade the max clients is also 64 (?!?) Is this the maximum
> clients support by program pgbench (my server on Linux ver8.2.5, pgbench
> on Windows - version postgresql is 8.3.1)? And the number 57 tps is fast?

You did not give CPU and disk info, but still, 57 seems a small number. My guess is that you're running pgbench with scale factor 1 (since you haven't mentioned the scale factor), and that causes extreme contention on the small tables with a large number of clients.

Regarding the maximum number of clients, check your "max_connections" setting.

> Another questions, i heard that PostgreSQL does not support HT Technology,
> is it right?

I'm not sure what you mean by HT, but if it's Hyper-Threading, then IMO that statement is not completely true. Postgres is not multi-threaded, so a single process (or connection) may not be able to use all the CPUs, but as long as there are multiple connections (each connection corresponds to one backend process), that many CPUs will be used.

> Last question, i don't understand so much the shmmax, shared_buffers, after
> upgrading my server from 4 GB RAM to 8 GB RAM, first i configure shmmax to
> 2GB, share_buffers to 1GB and start server, it runs, after that i set shmmax
> to 4GB and restart, it fails (?!?). The error logs said that not enough
> share memory! and final i set shmmax to 3GB and share buffer to 2GB, it
> runs. Don't know why, can you explain?

That doesn't make sense. I am guessing that you are running a 32-bit OS; a shmmax of 4 GB won't work on a 32-bit OS.

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com
On Thu, May 15, 2008 at 3:48 PM, Semi Noob <seminoob@gmail.com> wrote:

> I set max_connections is 200.

What error message do you get when you try with more than 64 clients?

> 57 seems a small number, according to you, how much tps is normal or fast?

It's difficult to say how much is good. On my laptop, for s = 10, c = 40, t = 1000, I get 51 tps. But on a larger box (2 CPUs, 2 GB RAM, 3 RAID 0 disks for data and a separate disk for xlog) I get 232 tps.

> and what is the different of "shared_buffers" and "effective_cache_size".

"shared_buffers" is the size of the buffer pool that Postgres uses to cache data blocks. "effective_cache_size" is usually the size of the shared buffers plus an estimate of whatever data the OS can cache. The planner uses this approximation to choose the right plan for execution.

http://www.postgresql.org/docs/8.3/interactive/runtime-config-query.html

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com
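To put rough numbers on those two settings for an 8 GB box mostly dedicated to Postgres (illustrative values only, not a tuned recommendation):

```
shared_buffers = 2GB           # memory Postgres itself allocates and manages
effective_cache_size = 6GB     # planner-only hint: shared_buffers plus an
                               # estimate of what the OS page cache holds
```

Note that effective_cache_size allocates nothing; it only influences the planner's cost estimates.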
Thank you for your answer!
> You did not give CPU and disk info. But still 57 seems a small number.
> What I guess is you're running pgbench with scale factor 1 (since you
> haven't mentioned scale factor) and that causes extreme contention for
> smaller tables with large number of clients.
My server has two CPUs: Intel(R) Xeon(TM) 3.20 GHz. The disk subsystem is RAID-5, and the OS is CentOS. The scale factor used for the pgbench initialization is 100, which generates 10,000,000 rows in the accounts table. The fillfactor is the default.
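For reference, my initialization and run were roughly the following (using the 8.3 pgbench on Windows; host and database names here are placeholders):

```shell
# initialize with scale factor 100: 100 x 100,000 = 10,000,000 accounts rows
pgbench -i -s 100 -h myserver -U postgres testdb

# benchmark: 40 clients, 1000 transactions each
pgbench -c 40 -t 1000 -h myserver -U postgres testdb
```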
By the way, I have heard that PostgreSQL performs better on RAID-10 than on RAID-5. Is that right?
> Regarding maximum number of clients, check your "max_connections" setting.
I set max_connections to 200.
57 seems a small number; in your experience, how many tps is normal or fast? And what is the difference between "shared_buffers" and "effective_cache_size"?
Thank you once more!
Regards,
Semi Noob
Pavan Deolasee wrote:
> On Thu, May 15, 2008 at 3:48 PM, Semi Noob <seminoob@gmail.com> wrote:
>> I set max_connections is 200.
>
> What error message you get when you try with more than 64 clients ?

I have max_connections set to 50. You want to be careful with this setting: if there are a lot of active users and the processes/connections get hung open, this will eat up memory on the server. This number needs to be set to the maximum number of users you ever want on the server at any one time. If you get too many hung-open processes, performance suffers.
>> 57 seems a small number, according to you, how much tps is normal or fast?
>
> It's difficult to say how much is good. On my laptop for s = 10, c = 40,
> t = 1000, I get 51 tps. But on a larger 2 CPU, 2 GB, 3 RAID 0 disks for
> data and a separate disk for xlog, I get 232 tps.

As Pavan has stated, the tps number is directly related to how fast your hardware is. 51 tps is not very good given the hardware specs you have stated.
Now, the server I have gets 1500 to 2000 tps, depending on the test. We had a pretty detailed discussion about performance numbers back in March:
http://archives.postgresql.org/pgsql-performance/2008-03/thrd3.php#00370
The thread is called "Benchmark: Dell/Perc 6, 8 disk RAID 10".
RAID 5 is a terrible setup for performance. RAID 10 seems to be what everybody goes with, unless you get into SAN storage or other more complicated setups.
>> and what is the different of "shared_buffers" and "effective_cache_size".
>
> "shared_buffers" is the size of the buffer pool which Postgres uses to
> cache the data blocks. "effective_cache_size" is usually size of the
> shared buffer plus estimate of whatever data OS can cache. Planner uses
> this approximation to choose right plan for execution.
>
> http://www.postgresql.org/docs/8.3/interactive/runtime-config-query.html
Semi Noob wrote:
> My CPU is 2CPU: Intel(R) Xeon(TM) CPU 3.20GHz. Disk: disk system is RAID-5;

Early versions of PostgreSQL had issues with P4 HT CPUs, but I believe they have been resolved. I am quite certain that it related only to the early P4s, not the Xeon.

--
Shane Ambler
pgSQL (at) Sheeky (dot) Biz

Get Sheeky @ http://Sheeky.Biz
On Thu, May 15, 2008 at 11:08 AM, Shane Ambler <pgsql@sheeky.biz> wrote:
> Semi Noob wrote:
>> My CPU is 2CPU: Intel(R) Xeon(TM) CPU 3.20GHz. Disk: disk system is
>> RAID-5;
>
> Early versions of postgresql had issues with P4 HT CPU's but I believe
> they have been resolved.
>
> I am quite certain that it only related to the early P4's not the Xeon.

The real problem was with the various OS kernels not knowing how to treat a HT "core" versus a real core. Linux in particular was a bad performer with HT turned on, and pgsql made it suffer more than many other apps for not knowing the difference. The Linux kernel has known for some time now how to treat a HT core properly.
Scott Marlowe wrote:
> The real problem was with the various OS kernels not knowing how to
> treat a HT "core" versus a real core. Linux in particular was a bad
> performer with HT turned on, and pgsql made it suffer more than many
> other apps for not knowing the difference. The Linux kernel has known
> for some time now how to treat a HT core properly.

From everything I have read about Hyper-Threading, it should just be turned off. There is so much overhead to process that it killed its own performance if the application was not designed to take advantage of it. A really cool idea that proved unfeasible at the time.
Intel says it is bringing back Hyper-Threading for Nehalem: http://en.wikipedia.org/wiki/Nehalem_(CPU_architecture)
If you can, I would turn it off and see what the results are.
Justin <justin@emproshunts.com> writes:
> From every thing i have read about Hyper Threading, it should be just
> turned off.

Depends entirely on what you're doing. I usually leave it turned on because compiling Postgres from source is measurably faster with it than without it on my dual-Xeon box. I'd recommend experimenting with your own workload before making any decisions.

			regards, tom lane
Tom Lane wrote:
> Justin <justin@emproshunts.com> writes:
>> From every thing i have read about Hyper Threading, it should be just
>> turned off.
>
> Depends entirely on what you're doing. I usually leave it turned on
> because compiling Postgres from source is measurably faster with it
> than without it on my dual-Xeon box. I'd recommend experimenting with
> your own workload before making any decisions.

Since PostgreSQL is not a multi-threaded but a single-threaded application that spawns little executables, how is Hyper-Threading helping PostgreSQL at all?

To be perfectly honest, my programming skills with multi-threaded apps are non-existent, as is my experience with the Linux world, but in the Windows world single-threaded apps saw no measurable performance boost; quite the opposite, it killed their performance, and a lot of multi-threaded apps also got their performance smashed. And if you really want to kill performance, turn on HT while running W2K.
On Thu, May 15, 2008 at 12:53 PM, Justin <justin@emproshunts.com> wrote:
> Sense PostgreSql is not a multi-threaded but a single thread application
> which spawns little exe's how is hyper threading helping Postgresql at
> all ??

Two ways:

* The stats collector / autovacuum / bgwriter can operate on one CPU while you, the user, are using another (whether they're physically separate cores on different sockets, dual or quad cores on the same socket, or the sort-of extra CPU provided by hyperthreading P4s).

* You can use > 1 connection, and each new connection spawns another process.

Note that this actually makes PostgreSQL scale to a large number of CPUs better than many multi-threaded apps, which can have a hard time using > 1 CPU at a time without a lot of extra help.
Justin wrote:
> [Since] PostgreSql is not multi-threaded but a single thread
> application which spawns little exe's how is hyper threading helping
> Postgresql at all ??

Multiple threads and multiple processes are two ways to tackle a similar problem: how to do more than one thing on the CPU(s) at once.

Applications that use multiple cooperating processes benefit from more CPUs, CPU cores, and CPU hardware execution threads (HT) just like applications that use multiple threads do, so long as there is enough work to keep multiple CPU cores busy.

There's really not *that* much difference between a multi-threaded executable and an executable with multiple processes cooperating using shared memory (like PostgreSQL). Nor is there much difference in how they use multiple logical CPUs.

The main difference between the two models is that multiple processes with shared memory don't share address space except where they have specifically mapped it. This means that it's relatively hard for one process to mangle other processes' state, especially if it's properly careful with its shared memory. By contrast, it's depressingly easy for one thread to corrupt the shared heap, or even to corrupt other threads' stacks, in a multi-threaded executable.

On Windows, threads are usually preferred because Windows has such a horrible per-process overhead, but it's very good at creating threads quickly and cheaply. On UNIX, which has historically been bad at threading and very good at creating and destroying processes, the use of multiple processes is preferred.

It's also worth noting that you can combine multi-process and multi-threaded operation. For example, if PostgreSQL were ever to support evaluating a single query on multiple CPU cores, one way it could do that would be to spawn multiple threads within a single backend. (Note: I know it's not even remotely close to that easy; I've been doing way too much threaded coding lately.)

So ... honestly, whether PostgreSQL is multi-threaded or multi-process just doesn't matter. Even if it were multi-threaded instead of multi-process, so long as it can only execute each query on a maximum of one core, single queries will not benefit (much) from having multiple CPU cores, multiple physical CPUs, or CPUs with hyperthreading. However, multiple CPU-bound queries running in parallel will benefit massively from extra cores or physical CPUs, and might also benefit from hyperthreading.

> To perfectly honest my programming skills with milti-threading apps is
> non-existent along with Linux world but in the Windows world single
> threaded apps saw no measurable performance boost

Of course. They cannot use the extra logical core for anything, so it's just overhead. You will find, though, that hyperthreading may improve system responsiveness under load even when using only single-threaded apps, because two different single-threaded apps can run (kind of) at the same time. It's pretty useless compared to real multiple cores, though.

> and allot of multi-threaded apps also got
> there performance smashed?

That will depend a lot on details of CPU cache use, exactly what they were doing on their various threads, how their thread priorities were set up, etc. Some apps benefit, some lose.

--
Craig Ringer
Craig Ringer wrote:
> Justin wrote:
>> [Since] PostgreSql is not multi-threaded but a single thread
>> application which spawns little exe's how is hyper threading
>> helping Postgresql at all ??
>
> [snip]
>
> You will find, though, that hyperthreading may improve system
> responsiveness under load even when using only single threaded apps,
> because two different single threaded apps can run (kind of) at the
> same time.

Isn't that the rub with HT processors? A process running on the HT virtual processor can lock up a specific chunk of the real processor that would not normally be locked, so the process running on the real processor gets blocked and put to sleep until the process running on the HT side is cleared out.

The problem, to my understanding, is that on Windows the kernel scheduler did not understand that HT was a virtual processor, so it screwed up scheduling orders and whatnot. This added a lot of overhead to the processor to sort out what was going on, sending sleep commands to keep the processor from rolling over and dying.

As time has progressed, I imagine the kernel schedulers have improved to better understand that this virtual processor locks parts of the real processor, so they schedule things a little better and don't keep dumping things onto the HT processor that will be needed a second later by another process queued for the real processor.

> It's pretty useless compared to real multiple cores, though.
>
>> and allot of multi-threaded apps also got there performance smashed?
>
> That will depend a lot on details of CPU cache use, exactly what they
> were doing on their various threads, how their thread priorities were
> set up, etc. Some apps benefit, some lose.

I understand these things in theory; I just never had to do any of the programming. Life has taught me that theory and books differ a lot from real life and fact.
On Thu, May 15, 2008 at 04:57:03PM -0400, Justin wrote:
> Isn't that the rub point in HT processor. A process running in HT
> virtual processor can lock up a specific chunk of the real processor up
> that would not normally be locked. So the process running in the Real
> processor gets blocked and put to sleep till the process running in HT
> is cleared out.

There is no "real" or "virtual" processor; HT is symmetric. Both look like real CPUs, but internally they share certain resources. The advantage is that if a program gets a cache miss, it will stall for dozens of cycles waiting for memory. On a real multicore CPU that's wasted time; on an HT CPU there is another thread which can (hopefully) keep running and use the resources the other isn't using. For programs like GCC, which are waiting for memory 50% of the time or so, HT can provide a measurable increase in performance. For computationally expensive programs it may be worse.

> As time has progress the kernel schedulers i imagine have improved to
> better understand that this virtual processor locks parts of the real
> processor up so it needs to schedule things a little better so it don't
> keep dumping things into the HT processor that it will need a second
> latter for another process it queried for the real processor.

The thing is that HT processors share an L1 cache, so switching between two HT processors on the same die is much less expensive than switching to another core. But if you have only two running processes, it's better to split them across two cores. Schedulers know this now; they didn't at first.

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while
> boarding. Thank you for flying nlogn airlines.