Re: Recommendations for SSDs in production? - Mailing list pgsql-general
From:           Yeb Havinga
Subject:        Re: Recommendations for SSDs in production?
Date:
Msg-id:         4EB3F618.30504@gmail.com
In response to: Re: Recommendations for SSDs in production? (Kurt Buff <kurt.buff@gmail.com>)
Responses:      Re: Recommendations for SSDs in production?
List:           pgsql-general
On 2011-11-04 04:21, Kurt Buff wrote:
> Oddly enough, Tom's Hardware has a review of the Intel offering today
> - might be worth your while to take a look at it.
>
> Kurt

Thanks for that link! Seeing the media wearout comparison between 'consumer grade' and 'enterprise' disks was enough to make me stop thinking about the Vertex 3 and Intel 510 behind hardware RAID: I'm going to stick with the Intel 710 and Vertex 2 Pro on onboard SATA.

Tom's Hardware also showed how to test wearout using the workload indicator, so I thought: let's do that with a pgbench workload.

First, if you're interested in doing a test like this yourself: I'm testing on Ubuntu 11.10, but even though that is a brand-new distribution, the SMART drive database was a few months old. Running 'update-smart-drivedb' turned the attribute names into something useful: instead of the number of LBAs written, the drive now reports the number of 32MiB units written, and there are three new 'workload'-related attributes:

225 Host_Writes_32MiB       0x0032 100 100 000 Old_age  Always - 108551
226 Workld_Media_Wear_Indic 0x0032 100 100 000 Old_age  Always - 17
227 Workld_Host_Reads_Perc  0x0032 100 100 000 Old_age  Always - 0
228 Workload_Minutes        0x0032 100 100 000 Old_age  Always - 211
232 Available_Reservd_Space 0x0033 100 100 010 Pre-fail Always - 0
233 Media_Wearout_Indicator 0x0032 100 100 000 Old_age  Always - 0
241 Host_Writes_32MiB       0x0032 100 100 000 Old_age  Always - 108551
242 Host_Reads_32MiB        0x0032 100 100 000 Old_age  Always - 21510

Tom's Hardware shows on http://www.tomshardware.com/reviews/ssd-710-enterprise-x25-e,3038-4.html how to turn these numbers into useful values. The numbers above were taken 211 minutes after I cleared the workload values with

smartctl -t vendor,0x40 /dev/sda

If you do that, the workload values first become 0, after a few minutes they all read 65535, and only after 60 minutes of testing do they return useful values.

During the test I did two one-hour pgbench runs on an md RAID 1 of the Intel 710 and Vertex 2 Pro, with WAL in RAM:

pgbench -i -s 300 t    (fits in RAM)
pgbench -j 20 -c 20 -M prepared -T 3600 -l t    (run twice)

The percentage of media wear caused by the workload is Workld_Media_Wear_Indic / 1024:

17 / 1024 = 0.0166015625 %

Let's turn this into a number of days until wearout. I take the most pessimistic figure of 120 minutes of actual pgbench load, instead of the 211 minutes since the workload reset:

120 / (17/1024/100) / 60 / 24 = 501.9608599031 days

The Host_Writes_32MiB value was 91099 before the test and is now 108551, so:

(108551 - 91099) * 32 / 1024 = 545 GB written during the test
(108551 - 91099) * 32 / 1024 / 1024 / (17/1024/100) = 3208 TB before media wearout

This number fits between Tom's Hardware's calculated wearout figures of 7268 TB for a sequential and 1437 TB for a random load.
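If you want to redo the arithmetic on your own drive, something along these lines should do it. This is an untested sketch, run as root; the attribute names are the ones from the updated drive database, and the defaults (120 minutes of load, 91099 writes before the test) are the numbers from this run:

#!/bin/sh
# Rough wearout projection from the Intel workload attributes (sketch).
# $1 = minutes of actual test load, $2 = Host_Writes_32MiB noted before
# the test.  Needs root for smartctl.
DEV=/dev/sda
MINUTES=${1:-120}
WRITES_BEFORE=${2:-91099}

# Raw values are in the 10th column of 'smartctl -A' output.
WEAR=$(smartctl -A $DEV | awk '/Workld_Media_Wear_Indic/ {print $10}')
WRITES_NOW=$(smartctl -A $DEV | awk '/Host_Writes_32MiB/ {print $10; exit}')

# Workld_Media_Wear_Indic / 1024 = % of media life used by the workload.
echo "media wear:      $(echo "scale=10; $WEAR / 1024" | bc) %"

# Minutes of load divided by the fraction of life used, expressed in days.
echo "days to wearout: $(echo "scale=10; $MINUTES / ($WEAR / 1024 / 100) / 60 / 24" | bc)"

# 32MiB units written during the test, and projected TB before wearout.
echo "GB written:      $(echo "scale=10; ($WRITES_NOW - $WRITES_BEFORE) * 32 / 1024" | bc)"
echo "TB to wearout:   $(echo "scale=10; ($WRITES_NOW - $WRITES_BEFORE) * 32 / 1024 / 1024 / ($WEAR / 1024 / 100)" | bc)"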
-- 
Yeb

PS: info on the test setup:

Model Number:      INTEL SSDSA2BZ100G3
Firmware Revision: 6PB10362

Model Number:      OCZ-VERTEX2 PRO
Firmware Revision: 1.35

Partitions are aligned on a 512kB boundary; the workload runs on a ~20GB software RAID mirror (the drives are 100GB).

Linux client46 3.0.0-12-generic #20-Ubuntu SMP Fri Oct 7 14:56:25 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux

PostgreSQL 9.2devel on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu/Linaro 4.6.1-9ubuntu3) 4.6.1, 64-bit

/proc/sys/vm/dirty_background_bytes is set to 178500000.

Non-standard PostgreSQL parameters:

maintenance_work_mem = 1GB           # pgtune wizard 2011-10-28
checkpoint_completion_target = 0.9   # pgtune wizard 2011-10-28
effective_cache_size = 16GB          # pgtune wizard 2011-10-28
work_mem = 80MB                      # pgtune wizard 2011-10-28
wal_buffers = 8MB                    # pgtune wizard 2011-10-28
checkpoint_segments = 96
shared_buffers = 5632MB              # pgtune wizard 2011-10-28
max_connections = 300                # pgtune wizard 2011-10-28

Latency and tps graphs of *one* of the 20 clients during the second pgbench run are here: http://imgur.com/a/jjl13 - note that the maximum latency has dropped from ~3 seconds in earlier tests to ~1 second, mainly due to increasing checkpoint_segments from 16 to 96.
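For completeness, the whole procedure boils down to roughly the sequence below. I did not run it as a single script; the device, database name and scale are the ones from this run, and the RAM disk for WAL has to be set up separately beforehand:

#!/bin/sh
# End-to-end sketch of one test run, pieced together from the commands
# quoted above; adjust device, database name and scale for your setup.
DEV=/dev/sda

# Note Host_Writes_32MiB before the run, then clear the workload counters.
smartctl -A $DEV | grep Host_Writes_32MiB
smartctl -t vendor,0x40 $DEV

# Database 't' lives on the md RAID 1 of the two SSDs; WAL is on a RAM
# disk (set up separately).  Scale 300 fits in RAM on this machine.
pgbench -i -s 300 t

# Two one-hour runs: 20 clients, prepared statements, per-transaction log.
pgbench -j 20 -c 20 -M prepared -T 3600 -l t
pgbench -j 20 -c 20 -M prepared -T 3600 -l t

# Read back the workload attributes (only meaningful after >60 minutes).
smartctl -A $DEV | egrep 'Workld|Workload|Host_Writes_32MiB'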