Thread: string primary key
Is there a disadvantage to having the primary key for a table be a text type vs. an integer type? Performance? Any difference between having a varchar or char as a primary key? My instinct tells me that an integer is preferred, but I'm looking for a more concrete answer. Thanks, Mark
On Thu, 2006-05-11 at 10:52, Mark Gibson wrote: > Is there a disadvantage to having the primary key for a table be a text > type vs. an integer type? Performance? Any difference between having a > varchar or char as a primary key? > > My instinct tells me that an integer is preferred, but I'm looking for a > more concrete answer. If you need a unique constraint on the text field anyway, and it's a natural key, you're generally better of using that field as the pk. However, if it's not a natually unique key, then it shouldn't be the pk, and int is a perhaps better choice. There are two VERY oppositional schools of thought on natural versus artificial keys out there, and neither side is likely to change their minds. My preference is generally for artificial keys (i.e. sequence generated ones) because I've had requirements change underfoot too many times to rely on natural keys all the time.
Scott Marlowe wrote: > > If you need a unique constraint on the text field anyway, and it's a > natural key, you're generally better of using that field as the pk. > > However, if it's not a natually unique key, then it shouldn't be the pk, > and int is a perhaps better choice. > > There are two VERY oppositional schools of thought on natural versus > artificial keys out there, and neither side is likely to change their > minds. > > My preference is generally for artificial keys (i.e. sequence generated > ones) because I've had requirements change underfoot too many times to > rely on natural keys all the time. > Thanks for your answer. It sounds like your saying that in terms of performance, there is no difference between a character field pk and an integer pk. I've got a followup - The primary key for the table in question consists of 2 varchar fields: picture 'state' and 'city' where city is guaranteed to be unique within a state, and (state, city) form a unique key. This sounds like a good candidate for a sequence key. Is there a difference in terms of performance in this case?
On Thu, 2006-05-11 at 11:43, Mark Gibson wrote: > Scott Marlowe wrote: > > > > > If you need a unique constraint on the text field anyway, and it's a > > natural key, you're generally better of using that field as the pk. > > > > However, if it's not a natually unique key, then it shouldn't be the pk, > > and int is a perhaps better choice. > > > > There are two VERY oppositional schools of thought on natural versus > > artificial keys out there, and neither side is likely to change their > > minds. > > > > My preference is generally for artificial keys (i.e. sequence generated > > ones) because I've had requirements change underfoot too many times to > > rely on natural keys all the time. > > > > Thanks for your answer. It sounds like your saying that in terms of > performance, there is no difference between a character field pk and an > integer pk. Sort of. Generally, the int pk-fk relationship will be a tad faster. However, the maintenance of the unique / primary key index is what really costs you, and if you've gotta have one unique key (on the text) the extra time spent mainaining another on an artificial key (in an int) will lost you as much time as you gain from the faster joins on an integer. Generally. > I've got a followup - The primary key for the table in question consists > of 2 varchar fields: picture 'state' and 'city' where city is guaranteed > to be unique within a state, and (state, city) form a unique key. This > sounds like a good candidate for a sequence key. Is there a difference > in terms of performance in this case? ahhh. You'll need the unique key anyway, right? Then if you're going to do 99.999% selects, it will likely be faster to have an artificial key (i.e. integers from a sequence) than using the natural key, since the updates will be seldom, if ever. However, the more updates you do (percentage wise) the more the second index will cost you for maintenance, and eventually, you'll run slower, on average, than if you had just the one index. It's all about usage patterns. Some usage patterns favor one solution or another. There are few, if any, absolutes. except always make sure your key types match up.
On Thu, May 11, 2006 at 09:52:41 -0600, Mark Gibson <mark@gibsonsoftware.com> wrote: > Is there a disadvantage to having the primary key for a table be a text > type vs. an integer type? Performance? Any difference between having a > varchar or char as a primary key? You probably want to use 'text' unless there is a busniess rule limiting the size of the field.
On Thu, May 11, 2006 at 10:43:50 -0600, Mark Gibson <mark@gibsonsoftware.com> wrote: > > I've got a followup - The primary key for the table in question consists > of 2 varchar fields: picture 'state' and 'city' where city is guaranteed > to be unique within a state, and (state, city) form a unique key. This > sounds like a good candidate for a sequence key. Is there a difference > in terms of performance in this case? That might not be such a good idea. I did a quick check of some GNS data and found what appear to be 4 different cities in Vermont with the same name. They are in 4 different counties, so it isn't likely that it is a single city spanning multiple counties. VT Mill Village ppl Orange 435738N0721758W Vershire 1014 VT Mill Village ppl Orleans 443958N0722233W Albany 1066 VT Mill Village ppl Essex 442951N0713937W Gilman 1276 VT Mill Village ppl Washington 442029N0724454W Middlesex
Using interger as opposed to a character as a primary key has an advantage when it comes to querying data in the table, it is faster searching with an interger as compared to characters date types +-----------------------------------------------------+ | Martin W. Kuria (Mr.) martin.kuria@unon.org +----------------------------------------------------+ >From: Bruno Wolff III <bruno@wolff.to> >To: Mark Gibson <mark@gibsonsoftware.com> >CC: pgsql-general@postgresql.org >Subject: Re: [GENERAL] string primary key >Date: Fri, 12 May 2006 01:54:17 -0500 > >On Thu, May 11, 2006 at 09:52:41 -0600, > Mark Gibson <mark@gibsonsoftware.com> wrote: > > Is there a disadvantage to having the primary key for a table be a text > > type vs. an integer type? Performance? Any difference between having a > > varchar or char as a primary key? > >You probably want to use 'text' unless there is a busniess rule limiting >the size of the field. > >---------------------------(end of broadcast)--------------------------- >TIP 1: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@postgresql.org so that your > message can get through to the mailing list cleanly _________________________________________________________________ Don't just search. Find. Check out the new MSN Search! http://search.msn.click-url.com/go/onm00200636ave/direct/01/