Re: Performances issues with SSD volume ? - Mailing list pgsql-admin

From:           Thomas SIMON
Subject:        Re: Performances issues with SSD volume ?
Date:
Msg-id:         555CAB39.1090403@neteven.com
In response to: Re: Performances issues with SSD volume ? (Glyn Astill <glynastill@yahoo.co.uk>)
Responses:      Re: Performances issues with SSD volume ?
List:           pgsql-admin
Hi Glyn, I'll try to answer these points.

I've run some benchmarks, and indeed kernel 3.2 is not helping. Not helping at all. I moved to 3.14 and the gap is quite big! With the pgbench RW test: 3.2 -> 4200 TPS; 3.14 -> 6900 TPS under the same conditions. With the pgbench RO test: 3.2 -> 37000 TPS; 3.14 -> 95000 TPS, same conditions too.

So things should be better now, but when the server was in production, even with the bad kernel, performance was quite good at first before it quickly degraded. So I too think I have another configuration problem.

> You say you're IO bound, so some output from sar / iostat / dstat and pg_stat_activity etc before and during the issue would be of use.

My server is not in production right now, so it is difficult to replay the production load and collect useful metrics. The best way I've found is to replay traffic from the logs with pgreplay. I hoped the server would break down again while replaying this traffic, but it never happened... Another thing I can't understand... Below is my dstat output from replaying this traffic (i.e. while the server runs normally). Unfortunately I no longer have any output from when the server's performance degraded.
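Next time the server degrades, a small collector loop could record the before/during picture asked for. A minimal sketch, assuming PostgreSQL 9.3 (which still has the `waiting` column in pg_stat_activity); the interval, output path and connection details are illustrative placeholders, not values from this setup:

```shell
#!/bin/sh
# Snapshot disk and session activity once a minute (60 iterations here,
# i.e. roughly one hour) so usable metrics exist the next time
# performance drops. OUT path, interval and -d postgres are assumptions.
OUT=/var/tmp/perf-capture
mkdir -p "$OUT"
i=0
while [ "$i" -lt 60 ]; do
    ts=$(date +%Y%m%d-%H%M%S)
    # 5-second extended device statistics (2 samples; the first is
    # the since-boot average, the second the live one)
    iostat -dxm 5 2 > "$OUT/iostat-$ts" 2>&1
    # Session states; 'waiting' is the 9.3-era lock-wait flag
    psql -X -d postgres -c \
      "SELECT now(), state, waiting, count(*)
         FROM pg_stat_activity GROUP BY 1, 2, 3;" \
      > "$OUT/activity-$ts" 2>&1
    sleep 55
    i=$((i + 1))
done
```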
[17:31:12]root@db10:~$ dstat -alrmy
----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system-- ---load-avg--- --io/total- ------memory-usage----- ---system--
usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw |  1m   5m  15m | read  writ| used  buff  cach  free| int   csw
  1   1  98   0   0   0|4870k   16M|   0     0 |   0     0 |4666    35k|4.68 1.48 0.56| 419   774 |6204M 47.1M 70.9G  175G|4666    35k
 10   4  81   5   0   0| 177M  368k| 510B 9674B|   0     0 |  32k   81k|5.11 1.62 0.61|14.7k 54.0 |6232M 47.1M 71.1G  175G|  32k   81k
  3   2  92   3   0   0|  81M 7568k| 224B  738B|   0     0 |  12k   28k|5.11 1.62 0.61|8687   822 |6233M 47.1M 71.1G  175G|  12k   28k
  4   2  92   2   0   0|  87M 7844k|1046B   10k|   0     0 |  12k   25k|5.11 1.62 0.61|8031   837 |6235M 47.1M 71.2G  175G|  12k   25k
 14   5  78   3   0   0| 155M 2972k| 556B 1084B|   0     0 |  28k   69k|5.11 1.62 0.61|12.7k  176 |6645M 47.1M 71.4G  174G|  28k   70k
 19   4  73   4   0   0| 172M 3012k|1082B 9182B|   0     0 |  33k   73k|5.11 1.62 0.61|12.5k  248 |6727M 47.1M 71.5G  174G|  33k   72k
 23   5  66   6   0   0| 247M   20M| 252B  800B|   0     0 |  42k   88k|5.42 1.75 0.65|20.1k  426 |6827M 47.1M 71.8G  173G|  42k   88k
 16   3  78   4   0   0| 119M 1384k| 744B 7680B|   0     0 |  28k   47k|5.42 1.75 0.65|11.3k  113 |6883M 47.1M 71.9G  173G|  28k   47k
  5   1  92   2   0   0|  55M 5952k|6960B 8210B|   0     0 |  16k   25k|5.42 1.75 0.65|6304   676 |6890M 47.1M 72.0G  173G|  16k   25k
  5   2  91   2   0   1|  56M  568k|9593B   13k|   0     0 |  16k   26k|5.42 1.75 0.65|6383  10.0 |6892M 47.1M 72.0G  173G|  16k   26k
  5   1  92   2   0   0|  51M    0 | 546B  888B|   0     0 |  14k   24k|5.42 1.75 0.65|6091     0 |6891M 47.1M 72.1G  173G|  14k   24k
 21   5  71   4   0   0| 110M 1416k| 384B 6874B|   0     0 |  31k   58k|6.18 1.97 0.73|11.1k  203 |7023M 47.1M 72.2G  173G|  31k   58k
 19   4  74   2   0   0| 110M 2364k| 312B  846B|   0     0 |  23k   44k|6.18 1.97 0.73|9399   221 |7055M 47.2M 72.3G  173G|  22k   44k
 26   6  65   3   0   0| 110M 6040k|1770B 7068B|   0     0 |  31k   59k|6.18 1.97 0.73|10.5k  781 |7255M 47.2M 72.4G  172G|  31k   59k
 23   5  69   3   0   0| 128M 2912k|1146B 1350B|   0     0 |  27k   52k|6.18 1.97 0.73|11.2k  240 |7314M 47.2M 72.5G  172G|  27k   52k

Other things you asked:

System memory size: 256 GB
SSD model numbers and how many: 4 SSD disks; RAID 10; model INTEL SSDSC2BB480G4
Raid controller: MegaRAID SAS 2208
Partition alignments and stripe sizes: see fdisk below
Kernel options: the config file is here: ftp://ftp.ovh.net/made-in-ovh/bzImage/3.14.43/config-3.14.43-xxxx-std-ipv6-64
Filesystem used and mount options: ext4, see mtab below
IO scheduler: noop [deadline] cfq for my SSD RAID volume
PostgreSQL version and configuration: 9.3.5

max_connections = 1800
shared_buffers = 8GB
temp_buffers = 32MB
work_mem = 100MB
maintenance_work_mem = 12GB
bgwriter_lru_maxpages = 200
effective_io_concurrency = 4
wal_level = hot_standby
wal_sync_method = fdatasync
wal_writer_delay = 2000ms
commit_delay = 1000
checkpoint_segments = 80
checkpoint_timeout = 15min
checkpoint_completion_target = 0.7
archive_command = 'rsync ....'
max_wal_senders = 10
wal_keep_segments = 38600
vacuum_defer_cleanup_age = 100
hot_standby = on
max_standby_archive_delay = 5min
max_standby_streaming_delay = 5min
hot_standby_feedback = on
random_page_cost = 1.0
effective_cache_size = 240GB
log_min_error_statement = warning
log_min_duration_statement = 0
log_checkpoints = on
log_connections = on
log_disconnections = on
log_line_prefix = '%m|%u|%d|%c|'
log_lock_waits = on
log_statement = 'all'
log_timezone = 'localtime'
track_activities = on
track_functions = pl
track_activity_query_size = 8192
autovacuum_max_workers = 5
autovacuum_naptime = 30s
autovacuum_vacuum_threshold = 40
autovacuum_analyze_threshold = 20
autovacuum_vacuum_scale_factor = 0.10
autovacuum_analyze_scale_factor = 0.10
autovacuum_vacuum_cost_delay = 5ms
default_transaction_isolation = 'read committed'
max_locks_per_transaction = 128

Connection pool sizing (pgpool2):
num_init_children = 1790
max_pool = 1

I also add the megacli parameters:

Virtual Drive: 2 (Target Id: 2)
Name                : datassd
RAID Level          : Primary-1, Secondary-0, RAID Level Qualifier-0
Size                : 893.25 GB
Sector Size         : 512
Is VD emulated      : Yes
Mirror Data         : 893.25 GB
State               : Optimal
Strip Size          : 256 KB
Number Of Drives per span : 2
Span Depth          : 2
Default Cache Policy: WriteThrough, ReadAheadNone, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteThrough, ReadAheadNone, Direct, No Write Cache if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy   : Enabled
Encryption Type     : None
Bad Blocks Exist    : No
PI type             : No PI
Is VD Cached        : No

Other outputs:

fdisk -l

Disk /dev/sdc: 959.1 GB, 959119884288 bytes
255 heads, 63 sectors/track, 116606 cylinders, total 1873281024 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x00000000

Disk /dev/mapper/vg_datassd-lv_datassd: 751.6 GB, 751619276800 bytes
255 heads, 63 sectors/track, 91379 cylinders, total 1468006400 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x00000000

cat /etc/mtab

/dev/mapper/vg_datassd-lv_datassd /datassd ext4 rw,relatime,discard,nobarrier,data=ordered 0 0
(I added the nobarrier option)

cat /sys/block/sdc/queue/scheduler

noop [deadline] cfq

sysctl kernel | grep sched

kernel.sched_child_runs_first = 0
kernel.sched_rr_timeslice_ms = 25
kernel.sched_rt_period_us = 1000000
kernel.sched_rt_runtime_us = 950000

I've read some advice about setting kernel.sched_autogroup_enabled=0 and kernel.sched_migration_cost_ns=5000000, but these parameters are not recognized by my kernel. So I don't know what to do with that...
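On that last point: whether those two knobs exist depends on the kernel build, not the kernel version. To my knowledge, kernel.sched_autogroup_enabled only appears when the kernel was compiled with CONFIG_SCHED_AUTOGROUP, and kernel.sched_migration_cost_ns only with CONFIG_SCHED_DEBUG (older kernels spelled it kernel.sched_migration_cost, without the _ns suffix). A quick, non-destructive way to see what a given kernel actually exposes:

```shell
# List every sched_* sysctl this kernel build exposes. A missing entry
# means the matching CONFIG_* option was not compiled in, not a typo.
ls /proc/sys/kernel | grep '^sched' || echo "no sched_* sysctls exposed"
```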
sysctl vm

vm.admin_reserve_kbytes = 8192
vm.block_dump = 0
vm.dirty_background_bytes = 8388608
vm.dirty_background_ratio = 0
vm.dirty_bytes = 67108864
vm.dirty_expire_centisecs = 3000
vm.dirty_ratio = 0
vm.dirty_writeback_centisecs = 500
vm.drop_caches = 3
vm.laptop_mode = 0
vm.legacy_va_layout = 0
vm.lowmem_reserve_ratio = 256 256 32
vm.max_map_count = 65530
vm.memory_failure_early_kill = 0
vm.memory_failure_recovery = 1
vm.min_free_kbytes = 65008
vm.min_slab_ratio = 5
vm.min_unmapped_ratio = 1
vm.mmap_min_addr = 4096
vm.nr_pdflush_threads = 0
vm.numa_zonelist_order = default
vm.oom_dump_tasks = 1
vm.oom_kill_allocating_task = 0
vm.overcommit_kbytes = 0
vm.overcommit_memory = 2
vm.overcommit_ratio = 50
vm.page-cluster = 3
vm.panic_on_oom = 0
vm.percpu_pagelist_fraction = 0
vm.scan_unevictable_pages = 0
vm.stat_interval = 1
vm.swappiness = 60
vm.user_reserve_kbytes = 131072
vm.vfs_cache_pressure = 100
vm.zone_reclaim_mode = 0

select * from pg_stat_activity

I get hundreds of entries for that when in production, and I can't paste them here for confidentiality reasons. It is usually around 50 million queries per day (35% selects; 55% updates; 5% inserts).

lspci | grep -E 'RAID|SCSI|IDE|SATA'

00:1f.2 SATA controller: Intel Corporation C600/X79 series chipset 6-Port SATA AHCI Controller (rev 06)
02:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 2208 [Thunderbolt] (rev 05)
07:00.0 Serial Attached SCSI controller: Intel Corporation C602 chipset 4-Port SATA Storage Control Unit (rev 06)

Thanks
Thomas

On 18/05/2015 16:29, Glyn Astill wrote:
>> From: Koray Eyidoğan <korayey@gmail.com>
>> To: Thomas SIMON <tsimon@neteven.com>
>> Cc: pgsql-admin@postgresql.org
>> Sent: Monday, 18 May 2015, 14:51
>> Subject: Re: [ADMIN] Performances issues with SSD volume ?
>>
>> Hi Thomas,
>>
>> 3.2 kernel may be the #1 cause of your I/O load problem:
>>
>> http://www.databasesoup.com/2014/09/why-you-need-to-avoid-linux-kernel-32.html
>> https://medium.com/postgresql-talk/benchmarking-postgresql-with-different-linux-kernel-versions-on-ubuntu-lts-e61d57b70dd4
>>
>> Have a nice day.
>>
>> Koray
>
> Likely the 3.2 kernel isn't helping, but I think we need much more information before jumping to conclusions.
>
> You say you're IO bound, so some output from sar / iostat / dstat and pg_stat_activity etc before and during the issue would be of use.
>
> Also:
> System memory size
> SSD model numbers and how many
> Raid controller
> Partition alignments and stripe sizes
> Kernel options
> Filesystem used and mount options
> IO Scheduler
> Postgresql version and configuration
> Connection pool sizing
>
> Perhaps you could throw us the output of some of these:
>
> fdisk -l
> cat /etc/mtab
> cat /sys/block/<ssd device>/queue/scheduler
> sysctl kernel | grep sched
> sysctl vm
> select * from pg_stat_activity
> select name, setting from pg_settings
> lspci | grep -E 'RAID|SCSI|IDE|SATA'
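For completeness, the pg_settings dump requested in that list can be captured non-interactively; a minimal sketch (the database name and connection details are placeholders, and filtering on the source column is an optional refinement that keeps only non-default values):

```shell
# Non-interactive dump of pg_settings, limited to values that differ
# from the built-in defaults. "-d postgres" is a placeholder; -X skips
# psqlrc, -A/-F produce unaligned "name = setting" output.
psql -X -A -F ' = ' -d postgres -c \
  "SELECT name, setting
     FROM pg_settings
    WHERE source <> 'default'
    ORDER BY name;"
```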