Re: What popular, large commercial websites run - Mailing list pgsql-general
From | Shaun Thomas |
---|---|
Subject | Re: What popular, large commercial websites run |
Date | |
Msg-id | Pine.LNX.4.44.0205020920250.16874-100000@hamster.lee.net Whole thread Raw |
In response to | Re: What popular, large commercial websites run (postgres@vrane.com) |
Responses |
Re: What popular, large commercial websites run
Re: What popular, large commercial websites run |
List | pgsql-general |
On Wed, 1 May 2002 postgres@vrane.com wrote: > I'm very curious to know why you have problem with growing > database. Does the performance suffer significantly > if you don't do the FULL vacuum? Surely if you can > afford the oracle you can afford relatively much > cheaper storage. You must have other reasons > than just not liking large database Well, it's not the fact that it's growing that's the problem. It's the fact that 100 actual MB of frequently changed data becomes 2gigs if not frequently vacuumed. Even with hourly full vacuums, it still slowly bloats to 200mb in two weeks, with the same amount of data. The worst part about this is that the more it bloats, the longer vacuum takes, and the speed of the bloating increases almost exponentially. Given two months, it's back up to 2 gigs. Full dump and restore? Back down to 100MB. I'm sorry, but no amount of disk storage should have to compensate for a database growing to 60x larger than the actual data stored (the data dump is 30 megs, but you can expect a certain amount of bloat due to column sizes). When the data files start hitting 2GB each, the Linux file-size limit comes into play, and you have no choice but to dump and restore. Even worse? Since there is only about 100 MB of real data in that 2GB morass, there's 1.9GB of old or invalid rows that Vacuum didn't clean up. That makes all subsequent vacuums slower, which makes their locks last longer, which means all selects on the tables being vacuumed are stalled until the vacuum is done. What happens when the vacuum takes half an hour, and it's a web application? Oh darn, you stop serving pages that use the database. Only full vacuum on non-peak times, you say? We tried that. The datafiles exploded to 2GB within days. *DAYS* The hourly vacuum brought it under control, but our insert script which runs every hour and replaces about 10% of the data per run, really cries bloody murder while the vacuum is running. As a result, this machine commonly has a load approaching 20 almost all the time. Turn off postgres? Less than 1, even with a concurrent Mysql DB that has a *valid* 2GB database that contains 2GB of actual data. I'm not passing blame. I'm not even angry. I'm just frustrated and tired of babying Postgres so it doesn't cause the server to burst into flames, crash, or otherwise fail. I actually had less admin overhead with an Oracle database. Anyone who has used Oracle knows just how hard it is to administer, but I sure as hell never had to completely dump and restore the data every month to keep it from eating my drives. I mean, what good is MVCC to avoid locking, when frequent full vacuums are even worse? I'd rather have a little locking contention, than have a completely useless database for ten to twenty minutes, every hour. Heck, maybe it's just our special case, that such a high rate of data turnover just bewilders postgres. But as of now, it's just plain unusable. Until vacuum goes away completely, which it never will as long as MVCC is in place, postgres is not an option for us. I just can't see any way around it. I hate mysql when I want things like foreign keys or subselects, so we still need a full DBMS. Hell, we're even considering giving a look to freaking Interbase, for the love of God. We're desperate, here. ^_^ As a side note, Sybase used to be our DBMS of choice, but it didn't play nice with PHP (segfaults on connects, sometimes), and only old versions fall under the free license, so we ditched it too. It also had weird syntax (go, for crying out loud!?), so our developers hated it. If I could only take all the good things from the databases I liked, and make them one database... Ease of administration of Mysql + Features of Oracle, for instance. But that database is only available in the wonderful, magical world of 'you must be kidding' land. Oh well. -- +-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-+ | Shaun M. Thomas INN Database Administrator | | Phone: (309) 743-0812 Fax : (309) 743-0830 | | Email: sthomas@townnews.com AIM : trifthen | | Web : www.townnews.com | | | | "Most of our lives are about proving something, either to | | ourselves or to someone else." | | -- Anonymous | +-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-+
pgsql-general by date: