V2 of PITR performance improvement for 8.4 - Mailing list pgsql-hackers
From: Koichi Suzuki
Subject: V2 of PITR performance improvement for 8.4
Msg-id: a778a7260811270404g49254640x8ed58b12b7c65d0b@mail.gmail.com
Responses: Re: V2 of PITR performance improvement for 8.4
List: pgsql-hackers
Please find enclosed a revised version of pg_readahead and a patch to invoke it. Changes from the previous version are as follows: pg_readahead no longer returns a prefetched point. It simply prefetches all the data pages referred to by WAL records in a given WAL segment, except for those whose first WAL record includes a full page write. Because of this change, the core patch was changed so that pg_readahead is invoked when a WAL segment is opened. Details can be found in the README.

I've done a benchmark to see the effect of the prefetch. Here's the report.

--------------------------------
Benchmark: DBT-2
Database size: 20GB
We used fewer transactions than the DBT-2 default to avoid an overload condition. We ran the benchmark for one hour with checkpoint_timeout = 30min and checkpoint_completion_target = 0.5, then collected all the archived WAL and ran PITR.
Disks: RAID 0 array (8 disks, 7200rpm)
Detailed conditions are given at the end.

Measurement results are as follows (for readability, a PDF chart is also attached):

-----------------------+------------+--------------------+---------------
WAL conditions         | Recovery   | Amount of          | Recovery rate
                       | time (sec) | physical read (MB) | (TX/min)
-----------------------+------------+--------------------+---------------
w/o prefetch,          |      6,611 |              5,435 |           402
archived with cp,      |            |                    |
FPW=off                |            |                    |
-----------------------+------------+--------------------+---------------
w/o prefetch,          |      1,683 |                801 |         1,458
archived with cp,      |            |                    |
FPW=on (8.3)           |            |                    |
-----------------------+------------+--------------------+---------------
w/o prefetch,          |      6,644 |              5,090 |           369
archived with lesslog, |            |                    |
FPW=on                 |            |                    |
-----------------------+------------+--------------------+---------------
With prefetch,         |      1,161 |              5,543 |         2,290
archived with cp,      |            |                    |
FPW=off                |            |                    |
-----------------------+------------+--------------------+---------------
With prefetch,         |      1,415 |              2,157 |         1,733
archived with cp,      |            |                    |
FPW=on                 |            |                    |
-----------------------+------------+--------------------+---------------
With prefetch,         |      1,196 |              5,369 |         2,051
archived with lesslog, |            |                    |
FPW=on (this proposal) |            |                    |
-----------------------+------------+--------------------+---------------

* lesslog means pg_compresslog
** DBT-2 throughput: 682TPM (FPW=on), 739TPM (FPW=off)

This shows that although prefetch does not reduce the amount of physical read, it tremendously improves the time to read. As a result, if the WAL archive is taken with pg_lesslog and prefetch is done, recovery duration is somewhat shorter than the current FPW=on score. The important point is that the recovery rate is much higher than the DBT-2 throughput. Therefore, this can be combined with synchronous replication and hot standby, tremendously reducing the amount of logs to be shipped (to as little as one tenth), improving recovery time, and maintaining the chance of successful crash recovery. Without prefetch, recovery with FPW=off or with pg_compresslog does not catch up.

Because the current pg_readahead only works on Linux, I'd like the patch to go into the core and pg_readahead into contrib.

Other (major) environment details are given below.

----<< H/W and OS >>-------------------
CPU: Pentium D, 2.8GHz
Memory: 2GB
Internal disk: SATA 150GB, used to archive WAL
External disk: RAID 0 (Ultra Wide SCSI), 8 disks (SATA 7200rpm)
OS: RHEL ES 5.1 (64bit)

----<< Other PostgreSQL configuration >>--------
PostgreSQL: 8.4 dev. head, as of Oct. 28th
max_connections: 100
shared_buffers: 32MB
checkpoint_segments: 1000
checkpoint_timeout: 30min
checkpoint_completion_target: 0.5
archive_mode: on
autovacuum: on
logging_collector: on

--
------
Koichi Suzuki