Thread: Rapid Seek Devices (feature request)

Rapid Seek Devices (feature request)

From

NTPT

Date:

16 August 2009, 10:15:23

Hi all

I have some idea/feature request.

Now, there are several devices available, that can be called "rapid seek
devices" (RSD in future text). I mean SSD disks, some devices like
gigabyte I-RAM and other (semi)profesional "ram disk like" solutions for
example Acard ANS-9010 . Rapid seek because there are not moving parts,
thus non-sequential access (seeks) have no penalty (no moving heads
time). I think  it would be cool to have explicitly support for that
kind of devices.


1: Postgresql has its tablespaces, so creating a tablespce on RSD and
put some speed critical indexes/tables there, is optimal from the costs
point (RSD are not cheap, so use it for the speed critical  tasks within
single database is reasonable)  but it is only a half of the needed.
Query planner act strongly "pro sequentional reads" while it try to
guess best query plan. This bias is fine for classic disk based storage,
but is not necesery in RSD. So, for tables/indexes on RSD, a suboptimal
query plan is likely to be generated  (fix me if I am wrong).

So I suggest we should have "random_page_cost" and
"Sequential_page_cost" configurable  on per tablespace basis. And query
planner that is aware of it, include situation where tables and
corresponding indexes are on different speed tablespaces.

Imagine this scenario: Default tablespace on sata disk,
random_page_cost=8, sequential_page_cost=3 (from config file),

CREATE TABLESPACE fast /path/to/multiple_15k_rpm_SCSI_hw_RAID
random_page_cost=4 sequential_page_cost=2
CREATE TABLESPACE lightspeed /path/to/SSD_OR_I-RAM random_page_cost=1
sequential_page_cost=1.2

and now scatter your tables / indexes around new tablespaces :-)


2: Many of that RSD devices are not so much reliable (power outage in
ramdisk, weak auxillarity battery in i-ram like devices, block "wear
out" in SSD). While moving only an indexes to this device (  I found
this article showing there IS a big difference having only an indexes on
SSD -  http://linux.com/archive/feature/142658.) may be appropriate, and
just "reindex" in worst case, this is not suitable in high availability
enviroment.

So I suggest to have something like this to solve reliability problems
on some RSD:

CREATE TABLESPACE lightspeed /path/to/SSD_OR_i-RAM random_page_cost=1
sequential_page_cost=1.2 TEE name_of_the_slow_tablespace

ie read from fast tablespace, write to both fast and slow, reconstruct
fast from slow if appropriate.

Thanx for your attention.

PS: Execuse my wrong english

Re: Rapid Seek Devices (feature request)

From

Craig Ringer

Date:

17 August 2009, 00:21:02

On 16/08/2009 9:06 PM, NTPT wrote:

> So I suggest we should have "random_page_cost" and
> "Sequential_page_cost" configurable on per tablespace basis.

That strikes me as a REALLY good idea, personally, though I don't know
enough about the planner to factor in implementation practicalities and
any cost for people _not_ using the feature.

> 2: Many of that RSD devices are not so much reliable (power outage in
> ramdisk, weak auxillarity battery in i-ram like devices, block "wear
> out" in SSD).

[snip]

> ie read from fast tablespace, write to both fast and slow, reconstruct
> fast from slow if appropriate.

This can probably be done as well or better at the OS block layer, using
device-mapper or the `md' driver (on Linux).

What'd be interesting, though, would be if Pg had support for
auto-rebuilding indexes and (if/when explicit support is added)
materialized views if it finds the backing files are missing. This would
be helpful for such transient devices as RAM disks if you didn't want to
bother having physical storage backing it.

--
Craig Ringer

Re: Rapid Seek Devices (feature request)

From

Jeremy Harris

Date:

17 August 2009, 16:04:15

On 08/17/2009 03:24 AM, Craig Ringer wrote:
> On 16/08/2009 9:06 PM, NTPT wrote:
>> So I suggest we should have "random_page_cost" and
>> "Sequential_page_cost" configurable on per tablespace basis.
>
> That strikes me as a REALLY good idea, personally, though I don't know
> enough about the planner to factor in implementation practicalities and
> any cost for people _not_ using the feature.

Could not pgsql *measure* these costs (on a sampling basis, and with long
time-constants)?

- Jeremy

Re: Rapid Seek Devices (feature request)

From

NTPT

Date:

17 August 2009, 19:34:48

AFAIK postgresql measure characteristic of the data distribution in the
tables and indexes (that is what vacuum ANALYSE does) , but results of
that  measures  are **weighted by** random_page_cost and
sequential_page_cost. So  measurements are correct,  but costs (weight)
should reflect a real speed for sequentional and random operation of the
storage device(s) (tablespaces) involved.

Jeremy Harris napsal(a):
> On 08/17/2009 03:24 AM, Craig Ringer wrote:
>> On 16/08/2009 9:06 PM, NTPT wrote:
>>> So I suggest we should have "random_page_cost" and
>>> "Sequential_page_cost" configurable on per tablespace basis.
>>
>> That strikes me as a REALLY good idea, personally, though I don't know
>> enough about the planner to factor in implementation practicalities and
>> any cost for people _not_ using the feature.
>
> Could not pgsql *measure* these costs (on a sampling basis, and with long
> time-constants)?
>
> - Jeremy
>

Re: Rapid Seek Devices (feature request)

From

Greg Stark

Date:

17 August 2009, 20:21:27

2009/8/17 Jeremy Harris <jgh@wizmail.org>:
> Could not pgsql *measure* these costs (on a sampling basis, and with long
> time-constants)?

In theory, sure. In practice, well, there are some engineering
challenges to solve.

1) The cost model isn't perfect so the it's not clear exactly what to
measure to get the best result. As the cost model stands now you would
have to include some of the cpu time as well as the i/o time.

2) Timing is expensive. On many platforms getting the current time
takes a long time and would make a big difference to the runtime of
the query.



--
greg
http://mit.edu/~gsstark/resume.pdf