Re: Recommendations for SSDs in production? - Mailing list pgsql-general
From:           Yeb Havinga
Subject:        Re: Recommendations for SSDs in production?
Date:
Msg-id:         4EB3F618.30504@gmail.com
In response to: Re: Recommendations for SSDs in production? (Kurt Buff <kurt.buff@gmail.com>)
Responses:      Re: Recommendations for SSDs in production?
List:           pgsql-general
On 2011-11-04 04:21, Kurt Buff wrote:
> Oddly enough, Tom's Hardware has a review of the Intel offering today
> - might be worth your while to take a look at it.
>
> Kurt

Thanks for that link! Seeing the media wearout comparison between 'consumer grade' and 'enterprise' disks was enough to make me stop thinking about the Vertex 3 and Intel 510 behind hardware RAID: I'm going to stick with the Intel 710 and Vertex 2 Pro on onboard SATA.

Tom's Hardware also showed how to test wearout using the workload indicator, so I thought: let's do that with a pgbench workload.

First, if you're interested in doing a test like this yourself: I'm testing on Ubuntu 11.10, but even though that is a brand-new distribution, the SMART drive database was a few months old. Running 'update-smart-drivedb' turned the attribute names into something useful: instead of the number of LBAs written, the drive now reports the number of 32MiB units written, and there are three new 'workload'-related attributes:

225 Host_Writes_32MiB       0x0032 100 100 000 Old_age  Always - 108551
226 Workld_Media_Wear_Indic 0x0032 100 100 000 Old_age  Always - 17
227 Workld_Host_Reads_Perc  0x0032 100 100 000 Old_age  Always - 0
228 Workload_Minutes        0x0032 100 100 000 Old_age  Always - 211
232 Available_Reservd_Space 0x0033 100 100 010 Pre-fail Always - 0
233 Media_Wearout_Indicator 0x0032 100 100 000 Old_age  Always - 0
241 Host_Writes_32MiB       0x0032 100 100 000 Old_age  Always - 108551
242 Host_Reads_32MiB        0x0032 100 100 000 Old_age  Always - 21510

Tom's Hardware shows on http://www.tomshardware.com/reviews/ssd-710-enterprise-x25-e,3038-4.html how to turn these numbers into useful values. The numbers above were taken 211 minutes after I cleared the workload values with

smartctl -t vendor,0x40 /dev/sda

If you do that, the workload values first become 0, after a few minutes they all read 65535, and only after 60 minutes of testing do they return useful values.

During the test I did two one-hour pgbench runs on an md RAID 1 of the Intel 710 and Vertex 2 Pro, with WAL in RAM:

pgbench -i -s 300 t    (fits in RAM)
pgbench -j 20 -c 20 -M prepared -T 3600 -l t    (run twice)

The percentage of media wear caused by the workload is Workld_Media_Wear_Indic / 1024:

17 / 1024 = 0.0166015625 %

Let's turn this into a number of days until wearout. I take the most pessimistic figure of 120 minutes of actual pgbench load, instead of the 211 minutes since the workload reset:

120 / (17/1024/100) / 60 / 24 = 501.9608599031 days

The Host_Writes_32MiB value was 91099 before the test and is now 108551, so:

(108551 - 91099) * 32 / 1024 = 545 GB written during the test
(108551 - 91099) * 32 / 1024 / 1024 / (17/1024/100) = 3208 TB before media wearout

This number fits between Tom's Hardware's calculated wearout figures of 7268 TB for a sequential and 1437 TB for a random load.
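If you want to redo the arithmetic on your own drive, something along these lines should do it. This is an untested sketch, run as root; the attribute names are the ones from the updated drive database, and the defaults (120 minutes of load, 91099 writes before the test) are the numbers from this run:

#!/bin/sh
# Rough wearout projection from the Intel workload attributes (sketch).
# $1 = minutes of actual test load, $2 = Host_Writes_32MiB noted before
# the test.  Needs root for smartctl.
DEV=/dev/sda
MINUTES=${1:-120}
WRITES_BEFORE=${2:-91099}

# Raw values are in the 10th column of 'smartctl -A' output.
WEAR=$(smartctl -A $DEV | awk '/Workld_Media_Wear_Indic/ {print $10}')
WRITES_NOW=$(smartctl -A $DEV | awk '/Host_Writes_32MiB/ {print $10; exit}')

# Workld_Media_Wear_Indic / 1024 = % of media life used by the workload.
echo "media wear:      $(echo "scale=10; $WEAR / 1024" | bc) %"

# Minutes of load divided by the fraction of life used, expressed in days.
echo "days to wearout: $(echo "scale=10; $MINUTES / ($WEAR / 1024 / 100) / 60 / 24" | bc)"

# 32MiB units written during the test, and projected TB before wearout.
echo "GB written:      $(echo "scale=10; ($WRITES_NOW - $WRITES_BEFORE) * 32 / 1024" | bc)"
echo "TB to wearout:   $(echo "scale=10; ($WRITES_NOW - $WRITES_BEFORE) * 32 / 1024 / 1024 / ($WEAR / 1024 / 100)" | bc)"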
-- 
Yeb

PS: info on the test setup:

Model Number:      INTEL SSDSA2BZ100G3
Firmware Revision: 6PB10362

Model Number:      OCZ-VERTEX2 PRO
Firmware Revision: 1.35

Partitions are aligned on a 512kB boundary; the workload runs on a ~20GB software RAID mirror (the drives are 100GB).

Linux client46 3.0.0-12-generic #20-Ubuntu SMP Fri Oct 7 14:56:25 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux

PostgreSQL 9.2devel on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu/Linaro 4.6.1-9ubuntu3) 4.6.1, 64-bit

/proc/sys/vm/dirty_background_bytes is set to 178500000.

Non-standard PostgreSQL parameters:

maintenance_work_mem = 1GB           # pgtune wizard 2011-10-28
checkpoint_completion_target = 0.9   # pgtune wizard 2011-10-28
effective_cache_size = 16GB          # pgtune wizard 2011-10-28
work_mem = 80MB                      # pgtune wizard 2011-10-28
wal_buffers = 8MB                    # pgtune wizard 2011-10-28
checkpoint_segments = 96
shared_buffers = 5632MB              # pgtune wizard 2011-10-28
max_connections = 300                # pgtune wizard 2011-10-28

Latency and tps graphs of *one* of the 20 clients during the second pgbench run are here: http://imgur.com/a/jjl13 - note that the maximum latency has dropped from ~3 seconds in earlier tests to ~1 second, mainly due to increasing checkpoint_segments from 16 to 96.
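For completeness, the whole procedure boils down to roughly the sequence below. I did not run it as a single script; the device, database name and scale are the ones from this run, and the RAM disk for WAL has to be set up separately beforehand:

#!/bin/sh
# End-to-end sketch of one test run, pieced together from the commands
# quoted above; adjust device, database name and scale for your setup.
DEV=/dev/sda

# Note Host_Writes_32MiB before the run, then clear the workload counters.
smartctl -A $DEV | grep Host_Writes_32MiB
smartctl -t vendor,0x40 $DEV

# Database 't' lives on the md RAID 1 of the two SSDs; WAL is on a RAM
# disk (set up separately).  Scale 300 fits in RAM on this machine.
pgbench -i -s 300 t

# Two one-hour runs: 20 clients, prepared statements, per-transaction log.
pgbench -j 20 -c 20 -M prepared -T 3600 -l t
pgbench -j 20 -c 20 -M prepared -T 3600 -l t

# Read back the workload attributes (only meaningful after >60 minutes).
smartctl -A $DEV | egrep 'Workld|Workload|Host_Writes_32MiB'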