Re: Checkpointer split has broken things dramatically (was Re: DELETE vs TRUNCATE explanation) - Mailing list pgsql-hackers

From Greg Stark
Subject Re: Checkpointer split has broken things dramatically (was Re: DELETE vs TRUNCATE explanation)
Date
Msg-id CAM-w4HMeTjbVKooUq_sEqjnEp+7fSZQj823XaE5geQ7-in4pdA@mail.gmail.com
In response to Re: Checkpointer split has broken things dramatically (was Re: DELETE vs TRUNCATE explanation)  (Craig Ringer <ringerc@ringerc.id.au>)
List pgsql-hackers


On Wed, Jul 18, 2012 at 1:13 AM, Craig Ringer <ringerc@ringerc.id.au> wrote:
> That makes me wonder if on top of the buildfarm, extending some buildfarm machines into a "crashfarm" is needed:
>
> - Keep kvm instances with copy-on-write snapshot disks and the build env on them
> - Fire up the VM, do a build, and start the server
> - From outside the vm have the test controller connect to the server and start a test run
> - Hard-kill the OS instance at a random point in time.

For what it's worth, you don't need to hard-kill the vm and start over repeatedly to get crashes at different times. You could take a snapshot of the disk storage and keep running, and you could take many snapshots from a single run. Each snapshot would represent the storage that would exist if the machine had crashed at the moment the snapshot was taken.

You do want the snapshots to be taken by something outside the virtual machine: either the kvm storage layer or lvm on the host, but not lvm inside the guest virtual machine.
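
To make the idea concrete, here is a rough sketch of a host-side controller that takes several snapshots at random points during a single test run, instead of killing and restarting the vm each time. It assumes the guest disk is backed by an lvm logical volume on the host. The volume path, snapshot names, snapshot size, and helper function names are all made up for illustration.

```python
import random
import subprocess
import time


def snapshot_times(duration_s, count, seed=None):
    """Pick `count` random offsets (in seconds) within one test run, sorted."""
    rng = random.Random(seed)
    return sorted(rng.uniform(0, duration_s) for _ in range(count))


def lvm_snapshot_cmd(origin, name, size="1G"):
    """Build the host-side lvcreate command that snapshots the guest's backing volume.

    `origin` is a hypothetical path such as /dev/vg0/guest-disk.
    """
    return ["lvcreate", "--snapshot", "--name", name, "--size", size, origin]


def take_snapshots(origin, offsets, run=subprocess.run, sleep=time.sleep):
    """Wait until each offset, then snapshot; the guest keeps running throughout.

    Timing is approximate: the time spent inside lvcreate itself is ignored.
    `run` and `sleep` are injectable so the loop can be dry-run without root.
    """
    names = []
    elapsed = 0.0
    for i, offset in enumerate(offsets):
        sleep(offset - elapsed)
        elapsed = offset
        name = f"crash-{i:03d}"  # hypothetical naming scheme
        run(lvm_snapshot_cmd(origin, name), check=True)
        names.append(name)
    return names
```

Each resulting snapshot could then be attached to a fresh vm to see whether the server recovers from it. Something like `take_snapshots("/dev/vg0/guest-disk", snapshot_times(600, 20))` would give twenty simulated crash points from one ten-minute run.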

And yes, the hard part that always stopped me from looking at this was having any way to test the correctness of the data.

--
greg
