
Re: Proposal: Commit timestamp - Mailing list pgsql-hackers

From	Jan Wieck
Subject	Re: Proposal: Commit timestamp
Date	February 3, 2007 18:09:14
Msg-id	45C507FC.4010405@Yahoo.com
In response to	Re: Proposal: Commit timestamp (Theo Schlossnagle <jesus@omniti.com>)
Responses	Re: Proposal: Commit timestamp Re: Proposal: Commit timestamp
List	pgsql-hackers


On 2/3/2007 4:58 PM, Theo Schlossnagle wrote:
> On Feb 3, 2007, at 4:38 PM, Jan Wieck wrote:
> 
>> On 2/3/2007 4:05 PM, Theo Schlossnagle wrote:
>>> On Feb 3, 2007, at 3:52 PM, Jan Wieck wrote:
>>>> On 2/1/2007 11:23 PM, Jim Nasby wrote:
>>>>> On Jan 25, 2007, at 6:16 PM, Jan Wieck wrote:
>>>>>> If a per database configurable tslog_priority is given, the
>>>>>> timestamp will be truncated to milliseconds and the increment
>>>>>> logic is done on milliseconds. The priority is added to the
>>>>>> timestamp. This guarantees that no two timestamps for commits
>>>>>> will ever be exactly identical, even across different servers.
>>>>> Wouldn't it be better to just store that information
>>>>> separately, rather than mucking with the timestamp?
>>>>> Though, there's another issue here... I don't think NTP is good
>>>>> for any better than a few milliseconds, even on a local network.
>>>>> How exact does the conflict resolution need to be, anyway?
>>>>> Would it really be a problem if transaction B committed 0.1
>>>>> seconds after transaction A yet the cluster thought it was the
>>>>> other way around?
>>>>
>>>> Since the timestamp is basically a Lamport counter which is just
>>>> bumped by the clock as well, it doesn't need to be too precise.
>>> Unless I'm missing something, you are _treating_ the counter as a
>>> Lamport timestamp, when in fact it is not and thus does not
>>> provide the semantics of a Lamport timestamp. As such, any
>>> algorithms that use Lamport timestamps as a basis or assumption
>>> for the proof of their correctness will not translate (provably)
>>> to this system.
>>> How is your counter semantically equivalent to Lamport timestamps?
>>
>> Yes, you must be missing something.
>>
>> The last used timestamp is remembered. When a remote transaction is  
>> replicated, the remembered timestamp is set to max(remembered,  
>> remote). For a local transaction, the remembered timestamp is set  
>> to max(remembered+1ms, systemclock) and that value is used as the  
>> transaction commit timestamp.
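The rule in the paragraph above can be sketched in a few lines. This is only an illustration under stated assumptions, not the proposed patch: timestamps are taken to be integer milliseconds, and `CommitClock`, `observe_remote`, `next_local_commit`, and the injected `system_clock_ms` callable are hypothetical names that do not appear in the proposal or in PostgreSQL. The per-database tslog_priority mentioned earlier in the thread is not modeled.

```python
class CommitClock:
    """Hypothetical sketch of the remembered-timestamp rule (integer ms)."""

    def __init__(self, system_clock_ms):
        # system_clock_ms: callable returning wall-clock time in milliseconds
        # (assumed; injected so the behavior can be exercised deterministically)
        self._system_clock_ms = system_clock_ms
        self.remembered = 0  # last timestamp used or seen on this node

    def observe_remote(self, remote_ts):
        # Replicating a remote transaction: remembered = max(remembered, remote)
        self.remembered = max(self.remembered, remote_ts)

    def next_local_commit(self):
        # Local commit: remembered = max(remembered + 1ms, systemclock),
        # and that value becomes the commit timestamp.
        self.remembered = max(self.remembered + 1, self._system_clock_ms())
        return self.remembered


# A node whose system clock is stuck at 1000 ms, to make the effect visible.
clock = CommitClock(lambda: 1000)
first = clock.next_local_commit()    # taken from the system clock
second = clock.next_local_commit()   # clock did not advance, so +1 ms
clock.observe_remote(5000)           # a remote node is "ahead"
third = clock.next_local_commit()    # continues after the remote timestamp
```

In this sketch local commit timestamps on one node are strictly increasing even when the system clock stalls or lags behind a timestamp received from another node; the system clock only ever pushes the value forward.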
> 
> A Lamport clock, IIRC, requires a cluster-wide tick.  This seems based
> only on activity and is thus an observational tick only, which means
> various nodes can have various perspectives at different times.
> 
> Given that time skew is prevalent, why is the system clock involved  
> at all?

This question was already answered.

> As is usual with distributed systems problems, they are very hard to
> explain casually and also hard to review from a theoretical angle
> without a proof.  Are you basing this off a paper?  If so which one?   
> If not, have you written a rigorous proof of correctness for this  
> approach?

I don't have any such paper and the proof of concept will be the 
implementation of the system. I do however see enough resistance against 
this proposal to withdraw the commit timestamp at this time. The new 
replication system will therefore require the installation of a patched, 
non-standard PostgreSQL version, compiled from sources cluster-wide in 
order to be used. I am aware that this will dramatically reduce its 
popularity but it is impossible to develop this essential feature as an 
external module.

I thank everyone for their attention.


Jan

-- 
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #
