Re: Proposing pg_hibernate - Mailing list pgsql-hackers
From | Gurjeet Singh |
---|---|
Subject | Re: Proposing pg_hibernate |
Date | |
Msg-id | CABwTF4WJpaUQmr0Dg9uGmNS=LyjA=uy_MSdZ5dg+tcQEmRTZgw@mail.gmail.com |
In response to | Re: Proposing pg_hibernate (Robert Haas <robertmhaas@gmail.com>) |
Responses | Re: Proposing pg_hibernate; Re: Proposing pg_hibernate |
List | pgsql-hackers |
On Thu, Jun 12, 2014 at 12:35 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> So, are you proposing this for inclusion in PostgreSQL core?

Yes, as a contrib module.

> If so, that's different: you'll need to demonstrate the benefits via
> convincing proof points

Please see the attached charts, and the spreadsheet that these charts were generated from. Quoting from my blog, where I first published these charts:

<quote>
As can be seen in the chart below, the database ramp-up time drops dramatically when Postgres Hibernator is enabled. The sooner the database TPS can reach the steady state, the faster your applications can start performing at full throttle.

The ramp-up time is even shorter if you wait for the Postgres Hibernator processes to end, before starting your applications.

As is quite evident, waiting for Postgres Hibernator to finish loading the data blocks before starting the application yields a 97% improvement in database ramp-up time (2300 seconds to get to 122k TPS without Postgres Hibernator vs. 70 seconds).

### Details

Please note that this is not a real benchmark, just something I developed to showcase this extension at its sweet spot.

The full source of this mini benchmark is available with the source code of the Postgres Hibernator, at its [Git repo][pg_hibernator_git].

```
Hardware: MacBook Pro 9,1
OS Distribution: Ubuntu 12.04 Desktop
OS Kernel: Linux 3.11.0-19-generic
RAM: 8 GB
Physical CPU: 1
CPU Count: 4
Core Count: 8
pgbench scale: 260 (~ 4 GB database)
```

Before every test run, except the last ('DB-only restart; No Hibernator'), the Linux OS caches are dropped to simulate an OS restart.

In 'First Run', the Postgres Hibernator is enabled, but since this is the first ever run of the database, Postgres Hibernator doesn't kick in until shutdown, to save the buffer list.

In 'Hibernator w/ App', the application (pgbench) is started right after database restart.
The Postgres Hibernator is restoring the data blocks to shared buffers while the application is also querying the database.

In the 'App after Hibernator' case, the application is started _after_ the Postgres Hibernator has finished reading database blocks. This took 70 seconds for reading the ~4 GB database.

In the 'DB-only restart; No Hibernator' run, the OS caches are not dropped, but just the database service is restarted. This simulates database minor version upgrades, etc.
</quote>

> and you'll also need to show that the
> disadvantages are in fact minor and that the scenario is in fact
> unlikely.

Attached is the new patch that addresses this concern. Right at startup, Postgres Hibernator renames all .save files to .save.restoring. Later, BlockReaders restore the blocks listed in the .save.restoring files. If, for any reason, the database crashes and restarts, the next startup of Hibernator will first remove all .save.restoring files.

So in the case of my contrived example,

<scenario>
1) A database is shut down, which creates the save-files in $PGDATA/pg_hibernator/.

2) The database is restarted.

3) BlockReaders begin to read and restore the disk blocks into buffers.

4) Before the BlockReaders could finish*, a copy of the database is taken (rsync/cp/FS-snapshot/etc.). This causes the save-files to be present in the copy, because the BlockReaders haven't deleted them yet.

* (The BlockReaders ideally finish their task in the first few minutes after the first of them is started.)

5) The copy of the database is used to restore and erect a warm-standby.

6) The warm-standby starts replaying logs from WAL archive/stream.

7) Some time (hours/weeks/months) later, the warm-standby is promoted to be a master.

8) It starts the Postgres Hibernator, which sees save-files in $PGDATA/pg_hibernator/ and launches BlockReaders.
</scenario>

Right at step 2 the .save files will be renamed to .save.restoring, and later at step 8 Hibernator removes all .save.restoring files before proceeding further. So the BlockReaders will not restore stale save-files.

Best regards,
--
Gurjeet Singh http://gurjeet.singh.im/

EDB www.EnterpriseDB.com
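To make the rename-then-cleanup ordering described above concrete, here is a minimal sketch in Python. It is not code from the patch: the function name `prepare_save_files`, the file name `1.save`, and the temporary directory standing in for $PGDATA/pg_hibernator/ are all assumptions made for illustration. It only models the startup step (remove leftover .save.restoring files first, then rename the fresh .save files) and does not read any blocks.

```python
import tempfile
from pathlib import Path
from typing import List

SAVE_SUFFIX = ".save"                      # written at shutdown
RESTORING_SUFFIX = ".save.restoring"       # claimed at startup


def prepare_save_files(save_dir: Path) -> List[Path]:
    """Hypothetical startup step: drop stale files, then claim fresh ones.

    Returns the files a BlockReader would be allowed to restore from.
    """
    # Any *.save.restoring file found at startup was left behind by an
    # earlier startup (crash, or a copy taken mid-restore), so it is
    # treated as stale and removed.
    for stale in save_dir.glob("*" + RESTORING_SUFFIX):
        stale.unlink()

    # Fresh *.save files come from a clean shutdown; rename them so that a
    # later startup can tell they were already picked up once.
    claimed = []
    for fresh in sorted(save_dir.glob("*" + SAVE_SUFFIX)):
        target = fresh.with_name(fresh.name + ".restoring")
        fresh.rename(target)
        claimed.append(target)
    return claimed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        save_dir = Path(tmp)
        (save_dir / "1.save").write_text("buffer list")  # step 1: shutdown

        # Step 2: restart renames 1.save to 1.save.restoring.
        print([p.name for p in prepare_save_files(save_dir)])

        # Steps 4-8: a copy taken now still holds 1.save.restoring; the next
        # startup removes it instead of restoring from it.
        print([p.name for p in prepare_save_files(save_dir)])
```

In this model, the second call returns an empty list, which corresponds to step 8 of the scenario: the promoted standby finds only a .save.restoring file, deletes it, and has nothing stale to restore.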
Attachment