Re: 60 core performance with 9.3 - Mailing list pgsql-performance
From:           Mark Kirkwood
Subject:        Re: 60 core performance with 9.3
Date:
Msg-id:         53D84E16.90609@catalyst.net.nz
In response to: Re: 60 core performance with 9.3 (Mark Kirkwood <mark.kirkwood@catalyst.net.nz>)
Responses:      Re: 60 core performance with 9.3
                Re: 60 core performance with 9.3
List:           pgsql-performance
On 17/07/14 11:58, Mark Kirkwood wrote:
>
> Trying out with numa_balancing=0 seemed to get essentially the same
> performance. Similarly wrapping postgres startup with --interleave.
>
> All this made me want to try with numa *really* disabled. So rebooted
> the box with "numa=off" appended to the kernel cmdline. Somewhat
> surprisingly (to me anyway), the numbers were essentially identical. The
> profile, however, is quite different:
>

A little more tweaking got some further improvement:

rwlocks patch as before
wal_buffers = 256MB
checkpoint_segments = 1920
wal_sync_method = open_datasync
LSI RAID adaptor: read ahead and write cache disabled (SSD fast path mode)
numa_balancing = 0

Pgbench scale 2000 again:

 clients | tps (prev) | tps (tweaked config)
---------+------------+----------------------
       6 |       8175 |  8281
      12 |      14409 | 15896
      24 |      17191 | 19522
      48 |      23122 | 29776
      96 |      22308 | 32352
     192 |      23109 | 28804

Now recall that we were seeing no actual tps change with numa_balancing set
to 0 or 1 (so the improvement above comes from the other changes), but we
figured it might be informative to track down what the non-numa bottlenecks
looked like. Profiling the entire 10-minute run showed the stats collector
as a possible source of contention:

 3.86%  postgres  [kernel.kallsyms]  [k] _raw_spin_lock_bh
        |
        --- _raw_spin_lock_bh
           |
           |--95.78%-- lock_sock_nested
           |          udpv6_sendmsg
           |          inet_sendmsg
           |          sock_sendmsg
           |          SYSC_sendto
           |          sys_sendto
           |          tracesys
           |          __libc_send
           |          |
           |          |--99.17%-- pgstat_report_stat
           |          |          PostgresMain
           |          |          ServerLoop
           |          |          PostmasterMain
           |          |          main
           |          |          __libc_start_main
           |          |
           |          |--0.77%-- pgstat_send_bgwriter
           |          |          BackgroundWriterMain
           |          |          AuxiliaryProcessMain
           |          |          0x7f08efe8d453
           |          |          reaper
           |          |          __restore_rt
           |          |          PostmasterMain
           |          |          main
           |          |          __libc_start_main
           |          |
           |           --0.07%-- [...]
           |
           |--2.54%-- __lock_sock
           |          |
           |          |--91.95%-- lock_sock_nested
           |          |          udpv6_sendmsg
           |          |          inet_sendmsg
           |          |          sock_sendmsg
           |          |          SYSC_sendto
           |          |          sys_sendto
           |          |          tracesys
           |          |          __libc_send
           |          |          |
           |          |          |--99.73%-- pgstat_report_stat
           |          |          |          PostgresMain
           |          |          |          ServerLoop

Disabling track_counts and rerunning pgbench:

 clients | tps (no counts)
---------+-----------------
       6 |  9806
      12 | 18000
      24 | 29281
      48 | 43703
      96 | 54539
     192 | 36114

While these numbers look great in the middle range (12-96 clients), the
benefit looks to be tailing off as client numbers increase. Also, running
with no stats (and hence no autovacuum or analyze) is way too scary!

Trying out less write-heavy workloads shows that the stats overhead does
not appear to be significant for *read*-heavy cases, so the result above is
perhaps more of a curiosity than anything (given that read-heavy is more
typical... and our real workload is more similar to read-heavy).

The profile with counts off looks like:

 4.79%  swapper  [kernel.kallsyms]  [k] read_hpet
        |
        --- read_hpet
           |
           |--97.10%-- ktime_get
           |          |
           |          |--35.24%-- clockevents_program_event
           |          |          tick_program_event
           |          |          |
           |          |          |--56.59%-- __hrtimer_start_range_ns
           |          |          |          |
           |          |          |          |--78.12%-- hrtimer_start_range_ns
           |          |          |          |          tick_nohz_restart
           |          |          |          |          tick_nohz_idle_exit
           |          |          |          |          cpu_startup_entry
           |          |          |          |          |
           |          |          |          |          |--98.84%-- start_secondary
           |          |          |          |          |
           |          |          |          |           --1.16%-- rest_init
           |          |          |          |                     start_kernel
           |          |          |          |                     x86_64_start_reservations
           |          |          |          |                     x86_64_start_kernel
           |          |          |          |
           |          |          |           --21.88%-- hrtimer_start
           |          |          |                      tick_nohz_stop_sched_tick
           |          |          |                      __tick_nohz_idle_enter
           |          |          |                      |
           |          |          |                      |--99.89%-- tick_nohz_idle_enter
           |          |          |                      |          cpu_startup_entry
           |          |          |                      |          |
           |          |          |                      |          |--98.30%-- start_secondary
           |          |          |                      |          |
           |          |          |                      |           --1.70%-- rest_init
           |          |          |                      |                     start_kernel
           |          |          |                      |                     x86_64_start_reservations
           |          |          |                      |                     x86_64_start_kernel
           |          |          |                       --0.11%-- [...]
           |          |
           |          |--40.25%-- hrtimer_force_reprogram
           |          |          __remove_hrtimer
           |          |          |
           |          |          |--89.68%-- __hrtimer_start_range_ns
           |          |          |          hrtimer_start
           |          |          |          tick_nohz_stop_sched_tick
           |          |          |          __tick_nohz_idle_enter
           |          |          |          |
           |          |          |          |--99.90%-- tick_nohz_idle_enter
           |          |          |          |          cpu_startup_entry
           |          |          |          |          |
           |          |          |          |          |--99.04%-- start_secondary
           |          |          |          |          |
           |          |          |          |           --0.96%-- rest_init
           |          |          |          |                     start_kernel
           |          |          |          |                     x86_64_start_reservations
           |          |          |          |                     x86_64_start_kernel
           |          |           --0.10%-- [...]

Any thoughts on how to proceed further appreciated!

Cheers,

Mark
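[Editor's note] The "tailing off" of the track_counts gain can be made explicit by dividing the no-counts tps by the tweaked-config tps at each client count. A quick sketch, with the figures copied from the two pgbench tables in the message above:

```python
# Relative speedup from disabling track_counts, per client count.
# tps figures are copied verbatim from the two pgbench tables above.
tweaked = {6: 8281, 12: 15896, 24: 19522, 48: 29776, 96: 32352, 192: 28804}
no_counts = {6: 9806, 12: 18000, 24: 29281, 48: 43703, 96: 54539, 192: 36114}

# Speedup factor = no-counts tps / tweaked-config tps.
speedup = {c: round(no_counts[c] / tweaked[c], 2) for c in tweaked}
for c in sorted(speedup):
    print(f"{c:>4} clients: {speedup[c]:.2f}x")
```

The gain peaks around 96 clients (about 1.69x) and drops back to about 1.25x at 192 clients, consistent with the observation that the benefit tails off as client numbers increase.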