
Re: MD5 aggregate - Mailing list pgsql-hackers

From	Andres Freund
Subject	Re: MD5 aggregate
Date	June 14, 2013 14:57:02
Msg-id	20130614145657.GH19500@alap2.anarazel.de
In response to	Re: MD5 aggregate (Dean Rasheed <dean.a.rasheed@gmail.com>)
List	pgsql-hackers


On 2013-06-14 15:49:31 +0100, Dean Rasheed wrote:
> On 14 June 2013 15:19, Stephen Frost <sfrost@snowman.net> wrote:
> > * Andrew Dunstan (andrew@dunslane.net) wrote:
> >> I'd rather go the other way, processing the records without having
> >> to process them otherwise at all. Turning things into text must slow
> >> things down, surely.
> >
> > That's certainly an interesting idea also..
> >
> 
> md5_agg(record) ?
> 
> Yes, I like it.

It's more complex than just memcmp()ing HeapTupleData though. At least
if the Datum contains varlena columns, there are so many different
representations (short, long, compressed, external, external compressed)
of the same data that an md5 computed without normalizing them wouldn't
be very interesting.
So you would at least need a normalizing version of
toast_flatten_tuple() that also deals with short/long varlenas. But even
after that, you would still need to deal with Datums that can have
different representations (like short numerics, old style hstore, ...).
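
To illustrate the point about representations, here is a minimal sketch in Python. It is a toy model only: the one-letter tags and the header layouts are made up for the example and are not PostgreSQL's actual varlena format, and normalize() is a hypothetical stand-in for the normalizing flatten step described above. It shows that hashing the stored bytes gives a different md5 for each representation of the same value, while hashing the normalized payload gives one.

```python
import hashlib
import zlib

# Toy encodings: the tags and header sizes are invented for illustration
# and do NOT match PostgreSQL's real varlena headers.
def encode_short(payload: bytes) -> bytes:
    # 1-byte length header, so the payload must be shorter than 256 bytes
    return b"S" + bytes([len(payload)]) + payload

def encode_long(payload: bytes) -> bytes:
    # 4-byte length header
    return b"L" + len(payload).to_bytes(4, "big") + payload

def encode_compressed(payload: bytes) -> bytes:
    # 4-byte length header followed by a zlib-compressed body
    body = zlib.compress(payload)
    return b"C" + len(body).to_bytes(4, "big") + body

def normalize(stored: bytes) -> bytes:
    # Hypothetical normalizer: strip the header and undo compression
    tag = stored[:1]
    if tag == b"S":
        return stored[2:]
    if tag == b"L":
        return stored[5:]
    if tag == b"C":
        return zlib.decompress(stored[5:])
    raise ValueError("unknown representation tag")

def md5_hex(data: bytes) -> str:
    return hashlib.md5(data).hexdigest()

payload = b"hello world" * 3
stored_forms = [
    encode_short(payload),
    encode_long(payload),
    encode_compressed(payload),
]

# One digest per physical representation versus one digest per logical value
raw_digests = {md5_hex(s) for s in stored_forms}
normalized_digests = {md5_hex(normalize(s)) for s in stored_forms}
```

Here raw_digests ends up with three distinct entries and normalized_digests with a single one, which is why the raw tuple bytes are not a useful thing to hash.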

It might be more realistic to use the binary output functions, but I am
not sure whether all of those are sufficiently reproducible.
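
A rough sketch of what hashing a per-column binary form could look like, again in Python rather than backend C. Everything here is an assumption made for illustration: encode_field() is a hypothetical stand-in for a type's binary output function, and the length prefix with -1 for NULL is just one way to keep field boundaries unambiguous. Whether the real binary output functions are stable enough for this is exactly the open question.

```python
import hashlib
import struct

def encode_field(value) -> bytes:
    # Hypothetical stand-in for a per-type binary output function.
    if value is None:
        return struct.pack("!i", -1)      # length -1 marks NULL
    if isinstance(value, int):
        body = struct.pack("!q", value)   # fixed-width, network byte order
    elif isinstance(value, str):
        body = value.encode("utf-8")
    else:
        raise TypeError(f"unsupported type: {type(value).__name__}")
    # The length prefix keeps ("ab", "c") distinct from ("a", "bc")
    return struct.pack("!i", len(body)) + body

def md5_agg(rows) -> str:
    # Feed every row into one running md5 state, in input order
    h = hashlib.md5()
    for row in rows:
        h.update(struct.pack("!i", len(row)))   # column count per row
        for value in row:
            h.update(encode_field(value))
    return h.hexdigest()

rows = [(1, "a"), (2, None)]
same = md5_agg(rows) == md5_agg(list(rows))
order_sensitive = md5_agg(rows) != md5_agg(rows[::-1])
```

Note that the result depends on the order in which rows are fed in, so a caller would need a defined ordering to get a reproducible digest.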

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
