Thread: Table clustering idea

Table clustering idea

From

"Dawid Kuroczko"

Date:

25 June 2006, 20:48:47

There is a well known command called CLUSTER which organizes table<br />in specified index's order.  It has a drawback,
thatnew tuples added are<br />not in this order. Last night I had idea which could be interesting, I hope. <br /><br
/>Theidea is to make use of 'histogram_bounds' collected statistical data.<br />Instead of inserting row into first
suitablespot in a table, a table would<br />be "divided" into sections, one for each of histogram_bounds ranges. <br
/>Wheninserting, the database would try to find most suitable section<br />to insert (using the histogram_bounds), and
ifthere were free spots<br />there, would insert there.  If not, it would either look for a tuple in nearby <br
/>sections,or first suitable place.<br /><br />What would it do?  It would try to keep table somewhat organized,<br
/>keepingrows of similar values close together (within SET STATISTICS<br />resolution, so a common scenario would be 50
or100 "sections"). <br />It would make it a bit hard for a table to shrink (since new rows would<br />be added
throughoutthe table, not at the beginning).<br /><br />Other idea than using histogram_bounds would be using the
position<br/>of key inside the index to determine the "ideal" place of row inside <br />the table and find the closest
freespot there. This would be of course<br />much more precise and wouldn't rely on statistic.<br /><br />  
Regards,<br/>      Dawid<br />

Re: Table clustering idea

From

"Luke Lonergan"

Date:

25 June 2006, 21:02:59

Dawid,

> Other idea than using histogram_bounds would be using the
> position of key inside the index to determine the "ideal"
> place of row inside the table and find the closest free spot
> there. This would be of course much more precise and wouldn't
> rely on statistic.

This is generally known as "index organized tables" in other databases.
It implements a clustered storage of the table, which dramatically
speeds access on the chosen indexed column and makes inserts fast.

This also eases some of the visibility/MVCC implementation issues being
discussed on hackers, but does not help with the maintenance of
non-storage key indices.

Other DBMS have index organized tables that can use either hash or btree
organizations, both of which have their uses.  We are planning to
implement btree organized tables sometime - anyone else interested in
this idea?

- Luke

Re: Table clustering idea

From

Josh Berkus

Date:

25 June 2006, 21:06:07

Luke,

> Other DBMS have index organized tables that can use either hash or btree
> organizations, both of which have their uses.  We are planning to
> implement btree organized tables sometime - anyone else interested in
> this idea?

Search the archives.  It's been dicussed on this list at least twice
before, that I know of.

--
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco

Re: Table clustering idea

From

"Jim C. Nasby"

Date:

27 June 2006, 00:15:38

On Sun, Jun 25, 2006 at 08:04:18PM -0400, Luke Lonergan wrote:
> Other DBMS have index organized tables that can use either hash or btree
> organizations, both of which have their uses.  We are planning to
> implement btree organized tables sometime - anyone else interested in
> this idea?

I'm curious how you'll do it, as I was once told that actually trying to
store heap data in a btree structure would be a non-starter (don't
remember why).

On a somewhat related note, I think that it would be advantageous if the
FSM had a means to prefer certain pages for a given tuple over other
pages. This would allow for a better way to keep heap and possibly index
data more compacted, and it would also be a means of keeping tables
loosely clustered. It would also make it far easier to shrink heaps that
have become bloated, because the FSM could be told to favor pages at the
beginning of the relation.
-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: Table clustering idea

From

"Luke Lonergan"

Date:

27 June 2006, 03:31:37

Jim,

On 6/26/06 8:15 PM, "Jim C. Nasby" <jnasby@pervasive.com> wrote:

> On a somewhat related note, I think that it would be advantageous if the
> FSM had a means to prefer certain pages for a given tuple over other
> pages. This would allow for a better way to keep heap and possibly index
> data more compacted, and it would also be a means of keeping tables
> loosely clustered. It would also make it far easier to shrink heaps that
> have become bloated, because the FSM could be told to favor pages at the
> beginning of the relation.

Interesting idea - page affinity implemented using the FSM.

WRT feasibility of BTREE organized tables, I'm not sure I see the problem.
Teradata implemented a hashing filesystem for their heap storage and I've
always wondered about how they handle collision and chaining efficiently,
but it's a solved problem for sure - knowing that makes the challenge that
much easier :-)
- Luke

Re: Table clustering idea

From

Kim Bisgaard

Date:

27 June 2006, 05:56:12

Jim C. Nasby wrote:
> On Sun, Jun 25, 2006 at 08:04:18PM -0400, Luke Lonergan wrote:
>   
>> Other DBMS have index organized tables that can use either hash or btree
>> organizations, both of which have their uses.  We are planning to
>> implement btree organized tables sometime - anyone else interested in
>> this idea?
>>     
>
> I'm curious how you'll do it, as I was once told that actually trying to
> store heap data in a btree structure would be a non-starter (don't
> remember why).
>   
Ingres is now open source - they have clustering on btree/isam/hash 
(it's called "modify table xx to btree on col1,col2")

Regards,
Kim

Re: Table clustering idea

From

"Jim C. Nasby"

Date:

27 June 2006, 13:39:58

On Mon, Jun 26, 2006 at 11:31:24PM -0700, Luke Lonergan wrote:
> Jim,
> 
> On 6/26/06 8:15 PM, "Jim C. Nasby" <jnasby@pervasive.com> wrote:
> 
> > On a somewhat related note, I think that it would be advantageous if the
> > FSM had a means to prefer certain pages for a given tuple over other
> > pages. This would allow for a better way to keep heap and possibly index
> > data more compacted, and it would also be a means of keeping tables
> > loosely clustered. It would also make it far easier to shrink heaps that
> > have become bloated, because the FSM could be told to favor pages at the
> > beginning of the relation.
> 
> Interesting idea - page affinity implemented using the FSM.
> 
> WRT feasibility of BTREE organized tables, I'm not sure I see the problem.
> Teradata implemented a hashing filesystem for their heap storage and I've
> always wondered about how they handle collision and chaining efficiently,
> but it's a solved problem for sure - knowing that makes the challenge that
> much easier :-)

I know there were discussions in the past, though as per usual I can't
find them in the archives. At one point I had suggested clustering not
on a row level, but on a page level, since it doesn't really matter
terribly if the tuples in a page are clustered (worst case you can scan
the entire page).

I think one of the issues might have been: how will you handle other
indexes on the table when you can no longer point them at an item (since
items will need to move to maintain an IOT).
-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: Table clustering idea

From

Csaba Nagy

Date:

27 June 2006, 13:51:58

> I think one of the issues might have been: how will you handle other
> indexes on the table when you can no longer point them at an item (since
> items will need to move to maintain an IOT).

I guess you shouldn't allow any other indexes. That's a perfectly
acceptable compromise I think... it would be still very useful for big
and narrow tables which would benefit from being clustered.

The other concern is how would you do sequential scans on the table if
items are allowed to move ? I think some other DBs have a facility to
make a "fast index scan" which is essentially a sequential scan of the
index file, something like that would be needed here too.

Cheers,
Csaba.

Re: Table clustering idea

From

"J. Andrew Rogers"

Date:

27 June 2006, 14:13:13

On Jun 27, 2006, at 9:39 AM, Jim C. Nasby wrote:
> I think one of the issues might have been: how will you handle other
> indexes on the table when you can no longer point them at an item  
> (since
> items will need to move to maintain an IOT).

There are clean ways to handle this.  The table is organized on the  
primary key, a typical requirement for IOTs.  Any indexes you add to  
IOT reference the primary key of the heap tuple.  Since the heap and  
PK index are the same thing, external indexes use the PK as the tuple  
identifier.

The only caveat is that this creates performance asymmetries.  IOTs  
have significantly faster access through their primary keys but  
slower external index access since two B-Trees have to be traversed.   
An IOT is typically only used for tables that are only accessed  
through their primary key.  Not supporting external indexes on IOTs  
is a functional implementation (and probably recommended in  
practice), though most real implementations allow external indexes if  
not always in their first version.

J. Andrew Rogers
jrogers@neopolitan.com

Re: Table clustering idea

From

Josh Berkus

Date:

28 June 2006, 00:46:42

Jim,

> I know there were discussions in the past, though as per usual I can't
> find them in the archives.

Search on "B-Tree Organized Tables".

From what I can find, this feature isn't prohibitively useless.  It's just a
singnificant amount of effort for a result which is a tradeoff.   That is,
you'd *only* want to use it on tables which are *always* accessed by their
primary key.

What stopped the features AFAICT is that the interested parties weren't up to
doing the code.

--
Josh Berkus
PostgreSQL @ Sun
San Francisco