From Stefan Keller
Subject Re: Postgres for a "data warehouse", 5-10 TB
Date
Msg-id CAFcOn2_PA=e4fbhfgRrzov8_OTyzmk7mrpii-NeuCXA7XuJ9wA@mail.gmail.com
In response to Re: Postgres for a "data warehouse", 5-10 TB  (Marti Raudsepp <marti@juffo.org>)
List pgsql-performance
Interesting debate.

2011/9/13 Marti Raudsepp <marti@juffo.org>:
> Yes, every BEGIN/EXCEPTION block creates a subtransaction -- like a
> SAVEPOINT it can roll back to in case of an error.

Are you sure? I always understood that, in theory, there are no
"subtransactions".

In fact, the docs do address this in chapter 39.6.6: "By default,
any error occurring in a PL/pgSQL function aborts execution
of the function, and indeed of the surrounding transaction as well.
You can trap errors and recover from them by using a BEGIN block with
an EXCEPTION clause."
(http://www.postgresql.org/docs/current/interactive/plpgsql-control-structures.html)

So the doc isn't totally explicit about this. But whatever: what would
be the function of a subtransaction? To give the possibility to
recover and continue within the surrounding transaction?
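
To make concrete what I mean, here is an untested sketch (table and
function names are made up): the error is trapped by the EXCEPTION
clause, so only the work inside that BEGIN block is rolled back, and
the surrounding transaction continues.

```sql
-- Hypothetical sketch: division_by_zero is trapped by the EXCEPTION
-- clause, so the error does not abort the caller's transaction.
CREATE OR REPLACE FUNCTION safe_divide(a numeric, b numeric)
RETURNS numeric AS $$
BEGIN
    RETURN a / b;
EXCEPTION
    WHEN division_by_zero THEN
        -- recover and continue within the surrounding transaction
        RETURN NULL;
END;
$$ LANGUAGE plpgsql;

-- BEGIN;
-- INSERT INTO t VALUES (1);
-- SELECT safe_divide(1, 0);  -- returns NULL instead of aborting
-- COMMIT;                    -- the INSERT above still commits
```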

Stefan

2011/9/13 Marti Raudsepp <marti@juffo.org>:
> On Tue, Sep 13, 2011 at 19:34, Robert Klemme <shortcutter@googlemail.com> wrote:
>> I don't think so.  You only need to catch the error (see attachment).
>> Or does this create a sub transaction?
>
> Yes, every BEGIN/EXCEPTION block creates a subtransaction -- like a
> SAVEPOINT it can roll back to in case of an error.
>
>> Yes, I mentioned the speed issue.  But regardless of the solution for
>> MySQL's "INSERT..ON DUPLICATE KEY UPDATE" which Igor mentioned you
>> will have the locking problem anyhow if you plan to insert
>> concurrently into the same table and be robust.
>
> In a mass-loading application you can often divide the work between
> threads in a manner that doesn't cause conflicts.
>
> For example, if the unique key is foobar_id and you have 4 threads,
> thread 0 will handle rows where (foobar_id%4)=0, thread 1 takes
> (foobar_id%4)=1 etc. Or potentially hash foobar_id before dividing the
> work.
>
> I already suggested this in my original post.
>
> Regards,
> Marti
>
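
As a side note, the partitioning scheme Marti describes can be
sketched roughly as below (Python here only for illustration; the
row layout, `foobar_id` key, and `load` stub are assumptions, not
actual loader code). Each worker only ever touches keys in its own
residue class, so no two workers can conflict on the unique key.

```python
# Sketch: divide rows among N workers by foobar_id % N so that no two
# workers ever insert the same unique key, avoiding lock conflicts.
from concurrent.futures import ThreadPoolExecutor

N_THREADS = 4

def partition(rows, n_threads):
    """Group rows by foobar_id modulo n_threads; one bucket per worker."""
    buckets = [[] for _ in range(n_threads)]
    for row in rows:
        buckets[row["foobar_id"] % n_threads].append(row)
    return buckets

def load(bucket):
    # A real loader would INSERT the bucket's rows here; this stub
    # just returns the ids it would have written.
    return [row["foobar_id"] for row in bucket]

rows = [{"foobar_id": i} for i in range(10)]
buckets = partition(rows, N_THREADS)
with ThreadPoolExecutor(max_workers=N_THREADS) as pool:
    results = list(pool.map(load, buckets))
```

Hashing foobar_id first (as Marti suggests) would just replace
`row["foobar_id"] % n_threads` with `hash(row["foobar_id"]) % n_threads`
to spread skewed key distributions more evenly.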
