OSDN Database conference report (long) - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | OSDN Database conference report (long) |
Date | |
Msg-id | 6072.973226830@sss.pgh.pa.us |
List | pgsql-hackers |
On Oct. 30 and 31 I attended OSDN's rather grandiosely named "Open Source Database Summit" (despite what you might infer from the name, it was just a small, open-to-the-public conference). Their info about the conference is at http://www.osdn.com/conf/osd/conf_index.shtml, though I'm not sure how long that page will remain up.

OSDN invited a number of the principal suspects from each major open-source database project to speak, and paid for airfare and hotel rooms for the speakers. The invited speakers were Bruce Momjian and myself from Postgres, David Axmark and Monty Widenius from MySQL, Mike Olson and Mike Ubell from Sleepycat (Berkeley DB), and Ann Harrison from InterBa^H^H^H^H IBPhoenix; also Britt Johnston, who as CTO of NuSphere can fairly be ranked in the MySQL camp; plus Tim Perdue and Rob Ferber as representative application-builders.

Total attendance was about forty or fifty, so we had a pretty good crowd of interested people. Ned Lilly of Great Bridge was also there (on GB's dime), as well as two or three more NuSphere people, but mostly it seemed to be users and potential users of open-source databases.

Sunday evening, Bruce and Ned and I filtered in at different times. OSDN had laid out a spread of free food in one of the meeting rooms, but the hotel staff didn't tell any of us about it, so we ended up hanging out in the hotel bar with a number of similarly ill-informed souls. It was particularly interesting to talk to John Scott, who is working for Verizon Wireless on redoing the software for their nationwide paging service. It turns out they are looking at using Postgres for the customer database, and either Postgres or Berkeley DB for the realtime database that handles paging messages being pumped through the system. That'll be a feather in our caps if it happens!

Monday, Britt Johnston opened the formal proceedings with what amounted to a pep talk for OS DB work.
I have a fairly interesting table in my notes, giving current total Web-search hits for various databases:

    Oracle        3.0 mil
    MySQL         2.3 mil
    Postgres      0.7 mil
    SQL Server    0.6 mil
    IBM DB2       0.5 mil
    Interbase     0.1 mil

(I had to copy the last half of the table from memory, so it may not be exactly what he said, but it's close.) This says that MySQL+PG together are *already* as interesting as Oracle for web work. He also announced that NuSphere would be financing substantial work on MySQL --- I have a note about 10000 concurrent transactions on a single server, which'd be pretty impressive (he didn't say what size server, though).

The conference format alternated between group-wide sessions and pairs of concurrent workshop talks, so after that we split into two groups. I went to hear Tim Perdue talk, while Bruce and Ned listened to Mike Ubell; they'll have to report on what Mike said.

Tim's discussion was about building apps atop PHP and a database. He pointed out that for most website builders, the path of least resistance given their existing skills is to construct an "application heavy" system in which most of the logic is in application code. He contrasted this with "database heavy" design, in which more reliance is put on database functionality, such as constraints, triggers, views, etc. Unfortunately (from our point of view) Postgres excels for the database-heavy style, whereas MySQL's lean feature set is sufficient or at least self-reinforcing for the application-heavy style. It'll be difficult for PG to achieve world domination until Web developers become more database-savvy ;-).

Tim encouraged a great deal of comment from the audience, and went so far as to make everyone introduce themselves first.
(One interesting thing that emerged at that point was that there were *very* few MySQL users, and no MySQL developers, at this talk --- though I guess that just meant that the MySQL people all wanted to hear what Mike Ubell had to say, since he was talking about a directly-MySQL-related subject.) One of the longest-running parts of the discussion had to do with giving good error messages and how it is hard to get friendly messages when you rely on the database to do error checking. I thought this pointed up the need we've been aware of for awhile to overhaul our error reporting.

Tim also had a "wish list" for PG that included better admin tools, such as a way to see exactly what queries are running; and a way to retrieve all the database-generated items in a just-inserted row, not only the OID. Both of these have also been on the radar screen for awhile.

After a fine lunch (all the food was superb BTW; OSDN made an excellent choice of hotel), we reconvened to hear David Axmark talk about the history and philosophy of MySQL. The only thing that really surprised me is that that project is quite young: it started in 1995. Given that Monty seems to do the vast majority of the development work, there are not many man-years in it, certainly far less than in Postgres. They've done well to come as far as they have.

The subsequent breakout was between Rob Ferber talking about shedding database processing load to stateless clients, and me talking about Postgres' transaction model. I was quite annoyed that I couldn't go hear Rob, because his talk abstract sounded very interesting :-(. You can find the slides from my talks (also Bruce's) at http://www.postgresql.org/osdn/index.html, so I won't go into detail, but I hope Bruce will report on Rob's talk.

That evening there was a cocktail hour in the hotel's library (free booze, courtesy of the conference) followed by dinner at the hotel's better restaurant.
I spent a good part of the cocktail hour talking with Ann Harrison and several other people about organizing some sort of open-source database benchmarking project. It turns out that DEC's (now Compaq's) performance measurement group has a nearly-done reference implementation of AS3AP, which they're thinking of releasing as an open source project. Everyone agreed that would be a fine starting point. We also got to hear Ann's version of the InterBase situation --- more about that later.

Towards the end of the hour I wandered over and started to talk to Monty and David. That stretched into eating dinner with them. Since I'd had a couple glasses of wine already, and a couple more during dinner, while they'd started with vodka and then joined in on the wine, I doubt that either side could repeat much of the conversation word-for-word ;-). But it was all pleasant and perhaps will serve to dispel some of the bad blood that's existed between the two projects for awhile.

The next morning, the opening speaker was me, with a presentation on the internals of Postgres (see slides at above URL). The subsequent breakout had Bruce giving a talk on the history and project-management practices of Postgres (see slides), while I went to hear Ann Harrison talk about integrity checking in databases.

Before she could get into her promised topic, the audience pretty much forced her to give a rundown on the InterBase situation. Bottom line: it's a mess. She feels Borland were being unreasonable (and in her telling of it, they indeed seem to be) while they felt, or said they felt, that she was. She thinks that once she and others had come up with a business plan for doing something with InterBase rather than dropping it, Borland/Inprise decided they could execute the business plan without her --- and that may be pretty accurate.
Anyway, Inprise now has a small in-house development team with few if any original developers, Ann has only jawbone control over a dozen or so open-source developers (these also with little or no deep knowledge of the source, apparently), and there's a code fork between the Inprise version and the "Firebird" open-source project. The two groups are apparently talking enough to try to keep their trees from diverging too much, in the hopes that the fork might be reunited someday, but Ann didn't sound all that hopeful about it. Things sound mighty bleak to me --- but perhaps InterBase is just going through a transition comparable to Postgres' transition from a Berkeley project to an open project.

To get back to the technical part of Ann's talk, the thing I came away with is a realization that IB did a lot of things pretty similar to Postgres. In particular, it sounds like they have a multi-versioning model nearly identical to Postgres'. They also have some ideas we might be able to adopt --- for example, their indexes point only to the newest version of a row, not all versions. It'd be worth our while to dig through their code for ideas. However, Ann admitted that they are woefully short on internals documentation, so extracting useful ideas promises to be painful :-(

The final group-wide session featured Mike Olson of Sleepycat as speaker. Most of you know that Mike was part of the Berkeley Postgres team years ago (if you don't, try scanning our sources for the initials mao) so I count him still a Postgres man, even though Sleepycat is currently in bed with MySQL.

Mike had some *extremely* interesting things to say about the prospects for open-source databases making inroads against commercial competition.
He pointed out that the notion that we have any chance of doing so is mostly founded on the success Linux has been having competing with Windows --- but that success is founded on (a) a cost advantage, (b) a reliability advantage, and (c) an advantage in the applications space: Linux runs sendmail, bind, Apache, and all the other core Internet server apps, whereas Windows doesn't run them especially well.

Mike pointed out that Oracle could *easily* afford to give away their software for free and make all their money on support contracts (license fees are already only 1/3rd of their revenue, so it wouldn't be that big a switch). That would make the cost advantage a harder sell. We could still make a good case for open databases on total cost of ownership, but a key ball to keep our eye on is the ease of installation and administration of our servers. Much of the differential comes from the fact that qualified Oracle DBAs are scarce and obscenely well-paid. We have to be sure that Joe Average Unix Sysadmin can deal with our servers without much trouble.

As for point (b), the news is bad: we are *not* up to Oracle standards on reliability. (Mike only said that it's unproven that we are up to commercial standards, but from here in the trenches I'd say we ain't.) We need to keep our noses to the grindstone on this issue, and even so it's unlikely that we'll ever have the same sort of obvious reliability advantage that Linux has over Windows, simply because the commercial databases aren't anywhere near as bad as Windows.

That leaves point (c) --- we have to exploit the open-source nature of our systems to encourage a flowering of compatible applications. And we'd better make sure that people can make money building apps atop open-source databases, or that flowering won't happen.

A thought-provoking talk indeed; probably the best one at the conference, IMHO.

The final pair of speakers were Monty on the history and project-management practices of MySQL, and Rob Ferber on Open Sales' ^H^H^H^H Zelerate's way of building distributed transaction processing. Bruce went to hear Monty, I went to hear Rob. It was pretty interesting: basically, they do not try to replicate state, but instead distribute "events" --- maybe better called "actions", since the events are things like "decrement available-stock by 1". Each server in their network is "authoritative" for events that it originates, and is responsible for transmitting those events to other servers. Each server maintains state tables that represent the integral of all the events it knows of so far, but it's explicitly recognized that these state tables may be out of sync due to network latency, communication failures, etc. With appropriate application programming it's possible to build a highly robust distributed system, sitting atop non-distributed database servers. Their system is open source and all coded in Perl, so you can go have a look if you want to learn more.

Bruce and John Scott and I wasted most of Tuesday evening in a fruitless search for the Computer Literacy bookstore that used to exist near Apple headquarters, so I can't say if anything interesting happened around the hotel then. But it seemed that things were winding down and a lot of people were departing that evening, so probably not...

Overall it was a very interesting and worthwhile conference. I have to congratulate Mark Stone and Christine Dzierzeski of OSDN on organizing a great conference on little time and minimal budget. If they invite me to the next one, I'll be there.

regards, tom lane