Re: [GENERAL] using ID as a key - Mailing list pgsql-general
From | Ed Loehr |
---|---|
Subject | Re: [GENERAL] using ID as a key |
Date | |
Msg-id | 389EEBC1.AC9E65E9@austin.rr.com Whole thread Raw |
In response to | Re: [GENERAL] using ID as a key (davidb@vectormath.com) |
Responses |
Re: [GENERAL] using ID as a key
Re: [GENERAL] using ID as a key Re: [GENERAL] using ID as a key |
List | pgsql-general |
Sevo Stille wrote: > > davidb@vectormath.com wrote: > > > > ID generation ought to be handled programmatically. > > At the server? This is what OIDs do - alas, these are even less portable > than serials and sequences. At the client interface? Nice, as long as > you have a single user database. In a multiuser environment, generating > and maintaining a unique ID externally to the database is close to > impossible. Simply incrementing the highest available ID from the > database by one is error prone, even if you catch duplicate insertions > by making the ID field UNIQUE and incrementing the ID in a loop until > you succeed - a slow client in a heavily updated database might > permanently fail to insert his record before the generated ID is used by > some other, faster client. And generating unique IDs based on an > database-independent would require some synchronized mechanism for ID > generation, adding dependency on the ID source to dependency on the > database. These are resolvable problems. One way to do a programmatic ID generation as David B. suggests is to have a DB server whose purpose is to issue "seed numbers" which designate, through use of a pre-determined multiplier, a *range* of allocated IDs guaranteed to be unique across all seed requesters. This avoids both of the problems raised above (throughput and synchronization). For example, suppose we have 10 servers, a1 through a10, each of whom needs to insert new primary keys that are unique across all 10 servers. Upon startup, each server queries the seed server, call it S, for a seed. Suppose the pre-determined multiplier is 100,000, meaning ranges will be allocated in sets of 100,000. S returns 1 on the first request from, say, a7. That means that a7 is now allocated IDs 100,000 thru 199,999. S might then return seed value 2 to a4, meaning a4 can use 200,000 thru 299,999, and so on for each of the other servers. Internally, S could simply use a sequence object to handle the atomic update requirement. When S's next internal sequence value is at 7, for example, that means it has allocated IDs through 699,999. This removes the bottleneck on serial generation at the expense of (infrequent) dependencies on the seed server S, and the serials are not time-ordered. A few more details must be handled, and there is some maintainance overhead, but it seems to work fairly well. Cheers, Ed Loehr
pgsql-general by date: