Thread: Fwd: Proposal - UUID data type
---------- Forwarded message ---------- From: Jerry Stuckle <jstuck...@attglobal.net> Date: Jul 13, 10:29 pm Subject: Proposal - UUID data type To: comp.databases.mysql Kless wrote: > On Jul 13, 9:37 pm, Jerry Stuckle <jstuck...@attglobal.net> wrote: >> Rather, I would think you should propose the UUID data type be added to >> the SQL standard. There are enough variations from the standards now. > Those variations must be managed by the language and not by the RDBMS, > where its *main work* is storing data using the specific data types. > A lot of languages already have modules to working with UUID [1]. Yes, they must be managed by the language. Which is why it should be part of the standard. That way, changing databases does not require changing code. >> And as it is, UUID's can be stored as a 32 byte string. So it isn't as >> if there is not an alternative. > If they're stored in ASCII form (32 hex digits), would indeed be very > inefficient. So it's necessary that the RDBMS have a specific data > type to handle the UUIDs. > In PostgreSQL they're stored as 16 binary bytes [2], and the core > database does not include any function for generating UUIDs > [1]http://en.wikipedia.org/wiki/Uuid#Implementations > [2]http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/utils/adt/... Yep, which in the grand scheme of things, probably makes zero difference. The difference between 16 and 32 bytes in any single row is minuscule. I am not in favor of adding more database-specific types to ANY database - and I think PostGres doing it was a mistake. If there is a demand for it, then it should be added to the SQL standard. That is the correct way to propose a change. That's why there are standards. -- ================== Remove the "x" from my email address Jerry Stuckle JDS Computer Training Corp. jstuck...@attglobal.net ==================
On Mon, Jul 14, 2008 at 01:04:33AM -0700, Kless wrote: > I am not in favor of adding more database-specific types to ANY > database > - and I think PostGres doing it was a mistake. So you think that adding full text indexing, gist/gin indexes, text, geometric types should have waited until the SQL standard specified them? With that kind of thinking we'd still be in the database stone age. One of postgresql's greatest strengths is user-defined types, lets use it. > If there is a demand for it, then it should be added to the SQL > standard. That is the correct way to propose a change. That's why > there are standards. You are ofcourse free to propose it to them, but the question is if they'd listen... Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Please line up in a tree and maintain the heap invariant while > boarding. Thank you for flying nlogn airlines.
On Jul 14, 2008, at 01:34, Martijn van Oosterhout wrote: >> I am not in favor of adding more database-specific types to ANY >> database >> - and I think PostGres doing it was a mistake. > > So you think that adding full text indexing, gist/gin indexes, text, > geometric types should have waited until the SQL standard specified > them? With that kind of thinking we'd still be in the database stone > age. Besides which, I seriously doubt that the SQL standard limits data types. Best, David
Kless wrote:<br /><blockquote cite="mid:e82ea310-cfd9-4b2e-9aec-4795f9073948@i76g2000hsf.googlegroups.com" type="cite"><prewrap="">Yes, they must be managed by the language. Which is why it should be part of the standard. That way, changing databases does not require changing code. </pre></blockquote><br /> You are correct that putting widely used features into a standard that is implementedby everyone is good.<br /><br /> This does not extend to the conclusion that one should never put in a featureuntil it is standard. Look at any successful software product and see how it usually leads the standard rather thanfollows it. People only tend to make standards once they realize things are getting out of control, which is long afterthe products are in use.<br /><br /><blockquote cite="mid:e82ea310-cfd9-4b2e-9aec-4795f9073948@i76g2000hsf.googlegroups.com"type="cite"><blockquote type="cite"><pre wrap="">InPostgreSQL they're stored as 16 binary bytes [2], and the core database does not include any function for generating UUIDs </pre></blockquote><pre wrap=""> Yep, which in the grand scheme of things, probably makes zero difference. The difference between 16 and 32 bytes in any single row is minuscule. </pre></blockquote><br /> This is incorrect. UUID at 16 bytes is already "long" in terms of being used as aprimary index. In an 8K page, one can only fit 512 UUIDs (forgetting the requirement for headers) - if it was stored as32 bytes - or 36 bytes, or 40 bytes (with punctuation), it would be at less than 256 UUIDs per page. For a join table joiningone set of UUID to another set, that's < 256 vs < 128. Doubling the size of an index row roughly doubles thetime to look up the value.<br /><br /><blockquote cite="mid:e82ea310-cfd9-4b2e-9aec-4795f9073948@i76g2000hsf.googlegroups.com"type="cite"><pre wrap="">I am not in favor ofadding more database-specific types to ANY database - and I think PostGres doing it was a mistake. </pre></blockquote><br /> As somebody who wrote his own module to do UUIDfor PostgreSQL when I needed it in PostgreSQL 8.0, I don't agree. Just as you think defining it in a standard is betterthan each vendor doing it their own way, I think doing it in one product is better than each user of the product doingit their own way.<br /><br /><blockquote cite="mid:e82ea310-cfd9-4b2e-9aec-4795f9073948@i76g2000hsf.googlegroups.com"type="cite"><pre wrap="">If there is a demandfor it, then it should be added to the SQL standard. That is the correct way to propose a change. That's why there are standards. </pre></blockquote><br /> Provide a real example of any similar product doing this. Exactly which enhancementto a standard was defined without even a prototype existing used in an existing product that purports to implementthe standard?<br /><br /> I'm sure one or two examples must exist, but I cannot think of any. Every enhancementI can think of that eventually made it into a standard, was first implemented within a popular product, and thendemanded as a standard to be applied to all other products.<br /><br /> Cheers,<br /> mark<br /><br /><pre class="moz-signature"cols="72">-- Mark Mielke <a class="moz-txt-link-rfc2396E" href="mailto:mark@mielke.cc"><mark@mielke.cc></a> </pre>
I write here the answer of Jerry Stuckle [1] because it looks me interesting and enough logical. [1] http://groups.google.com/group/comp.databases.mysql/browse_thread/thread/89557609239a995e ----------------------- Quite frankly, I don't care that PostGres has user-defined types. They restrict you to a single database, when others might be better for other reasons. And yes, I think other things should have been proposed to the SQL standards committee. It doesn't take that long to get a good proposal into the standards. No, it isn't immediate. But if there is a case to be made for it, then the committee will act. Then all databases get the feature, eventually. As I said. Do it the right way. Submit your proposal. If you have a case, it will be added to the SQL standard. If not, then it's not that important. ----------------------- On Jul 14, 9:34 am, klep...@svana.org (Martijn van Oosterhout) wrote: > On Mon, Jul 14, 2008 at 01:04:33AM -0700, Kless wrote: > > I am not in favor of adding more database-specific types to ANY > > database > > - and I think PostGres doing it was a mistake. > > So you think that adding full text indexing, gist/gin indexes, text, > geometric types should have waited until the SQL standard specified > them? With that kind of thinking we'd still be in the database stone > age. > > One of postgresql's greatest strengths is user-defined types, lets use > it. > > > If there is a demand for it, then it should be added to the SQL > > standard. That is the correct way to propose a change. That's why > > there are standards. > > You are ofcourse free to propose it to them, but the question is if > they'd listen... >
On Jul 14, 2008, at 10:49, Kless wrote: > I write here the answer of Jerry Stuckle [1] because it looks me > interesting and enough logical. It just sounds narrow-minded to me. See: http://www.oreillynet.com/pub/a/network/2005/07/29/cjdate.html Best, David
Kless wrote: > I write here the answer of Jerry Stuckle [1] because it looks me > interesting and enough logical. > Jerry's answer isn't a real answer - and we don't care what MySQL does or does not do. PostgreSQL developers are not going to invest time into helping you get a feature into MySQL - if this is what you are trying to do, please stop. MySQL didn't implement SQL-standards views until what - MySQL 4 or 5? Obviously "standards" is not their goal either. In Open Source / Free Software, the free contributions are from people with itches that they scratched. In a company like MySQL, it is more about business value or somewhere in between. I was a MySQL 3.x/4.x user until I learned PostgreSQL, and I have no intention of going back. They have so many incorrect assumptions built into their system, that I chose to switch databases instead of arguing with them. It's not worth my time, and I don't intend to go back. So, I will not be helping you get UUID into MySQL because I just don't care about MySQL... Cheers, mark > > [1] http://groups.google.com/group/comp.databases.mysql/browse_thread/thread/89557609239a995e > ----------------------- > Quite frankly, I don't care that PostGres has user-defined types. > They > restrict you to a single database, when others might be better for > other reasons. > > And yes, I think other things should have been proposed to the SQL > standards committee. It doesn't take that long to get a good proposal > into the standards. No, it isn't immediate. But if there is a case > to > be made for it, then the committee will act. > > Then all databases get the feature, eventually. > > As I said. Do it the right way. Submit your proposal. If you have a > case, it will be added to the SQL standard. If not, then it's not > that > important. > ----------------------- > -- Mark Mielke <mark@mielke.cc>
Kless wrote: > I write here the answer of Jerry Stuckle [1] because it looks me > interesting and enough logical. > > > [1] http://groups.google.com/group/comp.databases.mysql/browse_thread/thread/89557609239a995e > ----------------------- > Quite frankly, I don't care that PostGres has user-defined types. > They > restrict you to a single database, when others might be better for > other reasons. > > And yes, I think other things should have been proposed to the SQL > standards committee. It doesn't take that long to get a good proposal > into the standards. No, it isn't immediate. But if there is a case > to > be made for it, then the committee will act. > > Then all databases get the feature, eventually. > > As I said. Do it the right way. Submit your proposal. If you have a > case, it will be added to the SQL standard. If not, then it's not > that > important. > > The time taken to get something into the standard is a lifetime in computing terms. If my client has a need for a UDF they need it now, not when the standards committee gets around to thinking about it. Many UDTs will be specialised to a single user, and never be candidates for inclusion in the standards. Excluding any type that isn't in the standard would be to throw away one of Postgres' greatest strengths, one we are justly proud of. Maybe you don't care about that, but we do, and our clients do. In any case, a standard for UUIDs would almost certainly not specify how it is to be stored, which is where we got into this discussion. This debate started with a misconception about how Postgres actually stores UUIDs, and doesn't seem to have gained much point since then. cheers andrew
"Mark Mielke" <mark@mark.mielke.cc> writes: > Kless wrote: >> Yes, they must be managed by the language. Which is why it should be >> part of the standard. That way, changing databases does not require >> changing code. >> > > You are correct that putting widely used features into a standard that is > implemented by everyone is good. > > This does not extend to the conclusion that one should never put in a feature > until it is standard. Look at any successful software product and see how it > usually leads the standard rather than follows it. People only tend to make > standards once they realize things are getting out of control, which is long > after the products are in use. > To be fair there are two types of standards. Some standards follow implementations, others lead and prevent the babel of incompatible implementations from ever developing. Both ways work in the right context. But (perhaps unfortunately) the SQL spec is very definitely of the former type. Nothing goes into the SQL spec that one of the implementors hasn't already done. But that said Postgres isn't one of the main participants in the spec committee. We don't really want to add features that later get added to the spec in an incompatible way and we have no say in the committee to avoid that situation. >>> In PostgreSQL they're stored as 16 binary bytes [2], and the core >>> database does not include any function for generating UUIDs >>> >> >> Yep, which in the grand scheme of things, probably makes zero >> difference. The difference between 16 and 32 bytes in any single row >> is minuscule. Really? It sounds like 100% difference to me. If you applied that logic to everything you would have a database which runs at half the speed it would otherwise. Keep in mind also that your primary keys have to be stored in every other table as foreign keys too... > I'm sure one or two examples must exist, but I cannot think of any. Every > enhancement I can think of that eventually made it into a standard, was first > implemented within a popular product, and then demanded as a standard to be > applied to all other products. C99? SMTP? NTP? It tends to be important for network protocols since there's no gain in having non-interoperable protocols. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's PostGIS support!
Gregory Stark <stark@enterprisedb.com> writes: > "Mark Mielke" <mark@mark.mielke.cc> writes: >> I'm sure one or two examples must exist, but I cannot think of any. Every >> enhancement I can think of that eventually made it into a standard, was first >> implemented within a popular product, and then demanded as a standard to be >> applied to all other products. > C99? SMTP? NTP? > It tends to be important for network protocols since there's no gain in having > non-interoperable protocols. Actually, the IETF's mantra has always been "rough consensus and running code" (cf http://www.faqs.org/rfcs/rfc2031.html). Network protocols don't get standardized in advance of a working prototype, either. (No, I take that back: there were some that did. Ever heard of OSI?) regards, tom lane
Gregory Stark wrote: <blockquote cite="mid:877ibonuyk.fsf@oxford.xeocode.com" type="cite"><pre wrap="">"Mark Mielke" <aclass="moz-txt-link-rfc2396E" href="mailto:mark@mark.mielke.cc"><mark@mark.mielke.cc></a> writes: </pre><blockquotetype="cite"><pre wrap="">I'm sure one or two examples must exist, but I cannot think of any. Every enhancement I can think of that eventually made it into a standard, was first implemented within a popular product, and then demanded as a standard to be applied to all other products. </pre></blockquote><pre wrap=""> C99? SMTP? NTP? It tends to be important for network protocols since there's no gain in having non-interoperable protocols. </pre></blockquote><br /> For C99 - GCC had most of the C99 features years before C99 started.There are now some incompatibles that need to be dealt with.<br /><br /> For SMTP and NTP I think these protocolsare just so old that people don't realize how much they have evolved, and how many products existed. I wasn't inthe know at the time they were written (I was either a baby or in grade school), but I bet either: 1) they were writtenbefore it existed at all (not really an enhancment), or 2) they followed the prototype as it was implemented. Therehave been many extensions to SMTP that I have been aware of included support for SSL, that I doubt were in the standardfirst. The "RFC" is a "request for comment". The "STD" process came a lot later.<br /><br /> If we grab a phrasefrom RFC 1305 for NTP - "In Version 3 a new algorithm to combine the offsets of a number of peer time servers is presentedin Appendix F. This algorithm is modelled on those used by national standards laboratories to combine the weightedoffsets from a number of standard clocks to construct a synthetic laboratory timescale more accurate than that ofany clock separately." This seems pretty clear that the "standard" was updated based upon existing implementation.<br /><br/> To some degree, except for the simplest of designs, it is almost bad to write down what WILL be done, without havingexperience, or a prototype to based ones conclusions from. Ivory tower stuff. The purpose of a standard is to haveone common way that things are done - hopefully the best way - not just the only way that was considered. :-)<br /><br/> Cheers,<br /> mark<br /><br /><pre class="moz-signature" cols="72">-- Mark Mielke <a class="moz-txt-link-rfc2396E" href="mailto:mark@mielke.cc"><mark@mielke.cc></a> </pre>
An answer of Jerry Stuckle: ----------------------- >> Yes, they must be managed by the language. Which is why it should be >> part of the standard. That way, changing databases does not require >> changing code. > You are correct that putting widely used features into a standard that > is implemented by everyone is good. > This does not extend to the conclusion that one should never put in a > feature until it is standard. Look at any successful software product > and see how it usually leads the standard rather than follows it. > People > only tend to make standards once they realize things are getting out > of > control, which is long after the products are in use. Non-standard features just force people to stick with that one product. In the long run, the only people who benefit are the product developers. >>> In PostgreSQL they're stored as 16 binary bytes [2], and the core >>> database does not include any function for generating UUIDs >> Yep, which in the grand scheme of things, probably makes zero >> difference. The difference between 16 and 32 bytes in any single row >> is minuscule. > This is incorrect. UUID at 16 bytes is already "long" in terms of > being > used as a primary index. In an 8K page, one can only fit 512 UUIDs > (forgetting the requirement for headers) - if it was stored as 32 > bytes > - or 36 bytes, or 40 bytes (with punctuation), it would be at less > than > 256 UUIDs per page. For a join table joining one set of UUID to > another > set, that's < 256 vs < 128. Doubling the size of an index row roughly > doubles the time to look up the value. Incorrect. Doubling the size of the index has very little effect on how long it takes to look up a value. Intelligent databases use a binary search so doubling the size only means one additional comparison need be done. And heavily used indexes are generally cached in memory anyway. >> I am not in favor of adding more database-specific types to ANY >> database >> - and I think PostGres doing it was a mistake. > As somebody who wrote his own module to do UUID for PostgreSQL when I > needed it in PostgreSQL 8.0, I don't agree. Just as you think defining > it in a standard is better than each vendor doing it their own way, I > think doing it in one product is better than each user of the product > doing it their own way. Fine. Whatever you want for your code. But don't expect the rest of the world to jump because you want it. >> If there is a demand for it, then it should be added to the SQL >> standard. That is the correct way to propose a change. That's why >> there are standards. > Provide a real example of any similar product doing this. Exactly > which > enhancement to a standard was defined without even a prototype > existing > used in an existing product that purports to implement the standard? > I'm sure one or two examples must exist, but I cannot think of any. > Every enhancement I can think of that eventually made it into a > standard, was first implemented within a popular product, and then > demanded as a standard to be applied to all other products. Most features added to the SQL standard, for instance. Like explicit JOINs, recursive SQL and a bunch more. Also changes to the C++ standard such as exceptions were at least in the process of being evaluated and approved before they were in any product. There's a reason for having a process to propose features to a product. And it does not require the proposed change to be in any product. -- ================== Remove the "x" from my email address Jerry Stuckle JDS Computer Training Corp. jstuck...@attglobal.net ================== -----------------------
At 2008-07-15 08:34:01 -0700, jonas.esp@googlemail.com wrote: > > An answer of Jerry Stuckle: Please stop cross-posting messages from this list to whatever MySQL list you're on. It's a boring, pointless waste of time at best, and at worst will get you written off as a troll in both places pretty soon. -- ams
First - please stop copying this list - this is not the "convince Jerry to include UUID in MySQL" mailing list.<br /><br/> Really - I don't care what he thinks. But, on the subjects themselves and how they apply to *PostgreSQL*:<br /><br/><blockquote cite="mid:f1048bd4-52bf-428b-af62-71fba0c645fa@t54g2000hsg.googlegroups.com" type="cite"><pre wrap=""> Non-standard features just force people to stick with that one product. In the long run, the only people who benefit are the product developers. </pre></blockquote><br /> I chose PostgreSQL over MySQL because it provided numerous features - both standardand non - that I needed on the day I made my decision. I don't care about the long run as a user. One might as wellsay 90% of the world is wrong for using Microsoft products, because it locks one into Microsoft. One can say this - andpeople do say this - but none of this changes the fact that 90% of the world is relatively happy with their choice. Theyvoted with their dollars. All decisions should be made on a cost-benefit analysis - they should not be based on somearbitrary code like "I will not choose a solution that locks me in".<br /><br /> Additionally - in the context of MySQL- the main reason I chose PostgreSQL over MySQL is because it provided things like CREATE VIEW, which MySQL did notat the time. People such as Jerry can pretend that standards guarantee that a feature is in all products, but it seemsquite clear that just because something is a standard does NOT mean it is implemented the same everywhere, or even atall. At the time I chose PostgreSQL it was my opinion that PostgreSQL was far more standards-compliant than MySQL was goingto be for at least a few years. I am glad I came to the correct conclusion. MySQL implemented ACID as an after-thought.I mean - comone.<br /><br /><blockquote cite="mid:f1048bd4-52bf-428b-af62-71fba0c645fa@t54g2000hsg.googlegroups.com"type="cite"><blockquote type="cite"><pre wrap="">Thisis incorrect. UUID at 16 bytes is already "long" in terms of being used as a primary index. In an 8K page, one can only fit 512 UUIDs (forgetting the requirement for headers) - if it was stored as 32 bytes - or 36 bytes, or 40 bytes (with punctuation), it would be at less than 256 UUIDs per page. For a join table joining one set of UUID to another set, that's < 256 vs < 128. Doubling the size of an index row roughly doubles the time to look up the value. </pre></blockquote><pre wrap=""> Incorrect. Doubling the size of the index has very little effect on how long it takes to look up a value. Intelligent databases use a binary search so doubling the size only means one additional comparison need be done. And heavily used indexes are generally cached in memory anyway. </pre></blockquote><br /> Wrong. A binary search thatmust read double the number of pages, and compare double the number of bytes, will take double the amount of time. Thereare factors that will reduce this, such as if you assume that most of the pages are in memory or cache memory, thereforethe time to read the page is zero, therefore it's only the time to compare bytes - but at this point, the majorityof the time is spent comparing bytes, and it's still wrong. If we add in accounting for the fact that UUID is comparedusing a possibly inlined memcpy() compared to treating it as a string where it is variable sized, and much harderto inline (double the number of oeprations), and it's pretty clear that the person who would make such a statementas above is wrong.<br /><br /> As another poster wrote - why not double the size of all other data structures too.It costs nothing, right?<br /><br /> Why does MySQL have a 3-byte integer support if they truly believe that saving 1byte in 4 doesn't result in a savings for keys?<br /><br /> Cheers,<br /> mark<br /><br /><pre class="moz-signature" cols="72">-- Mark Mielke <a class="moz-txt-link-rfc2396E" href="mailto:mark@mielke.cc"><mark@mielke.cc></a> </pre>
I'm sorry, but it was necessary that certain answers were answered by someone with wide knowledge on databases and overall about its own database. This one was the only way, and I believe that it has been enough positive, at least for the end users -every one that choose its data base-. At least this clarifies how is working each community, and what is to be true or not. On Jul 15, 6:45 pm, a...@oryx.com (Abhijit Menon-Sen) wrote: > At 2008-07-15 08:34:01 -0700, jonas....@googlemail.com wrote: > > > > > An answer of Jerry Stuckle: > > Please stop cross-posting messages from this list to whatever MySQL list > you're on. It's a boring, pointless waste of time at best, and at worst > will get you written off as a troll in both places pretty soon. >
Kless wrote: > I'm sorry, but it was necessary that certain answers were answered by > someone with wide knowledge on databases and overall about its own > database. This one was the only way, and I believe that it has been > enough positive, at least for the end users -every one that choose its > data base-. At least this clarifies how is working each community, and > what is to be true or not. Nonsense. It was not at all necessary. If someone wants to post on this mailing list they should do it themselves. If not, you shouldn't cross-post for them. cheers andrew