Hello,
Thanks for the response. I believe I have these bases covered:
> - You run vacuum analyze often. This is one of the most important
> procedures and needs to be frequent. If you fail to do so, your database
> will most likely slow down.
I run ANALYZE at least once every 8 hours, whenever a new data import finishes.
I VACUUM irregularly, since my tables mostly grow in size and don't get
trimmed regularly; I VACUUM whenever there is any major shrinkage.
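In other words, the maintenance amounts to roughly the following (a minimal
sketch; individual tables can be named instead of hitting the whole database):

  -- after each import finishes: refresh the planner's statistics
  ANALYZE;

  -- only after major shrinkage: also reclaim dead space
  VACUUM ANALYZE;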
> - Are you REALLY sure that indices are being used? For that, use
> EXPLAIN <query> to see. Note that I had real trouble until I noticed
> that PostgreSQL still does not apply type casts automatically, so for
> instance, if you have a bigint key, a "select * from table where key =
> 12312" will not use indices. A "select * from table where key =
> 12312::int8" will be necessary. This is valid for EVERY "non-standard" type.
I'm certain that the indices are being used. I've EXPLAINed these queries to
death over the last year. I've even wrapped some of them in "SET
ENABLE_NESTLOOP=OFF; SET ENABLE_SEQSCAN=OFF; <query>; RESET ..." to force
the planner to use the indices.
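For the record, that wrapper looks something like this (a sketch only; the
actual query is commented out since the table names don't matter here):

  -- steer the planner away from seq scans and nested loops for this session
  SET enable_seqscan = off;
  SET enable_nestloop = off;

  -- EXPLAIN SELECT ... FROM big_table WHERE indexed_key = 12345;

  -- put the planner back to its defaults afterwards
  RESET enable_seqscan;
  RESET enable_nestloop;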
We don't use any non-standard types (although we may move to a BIGSERIAL
one day) except for BOOLEAN.
> - If your "data importing" is done via inserts, make sure that the
> batch wraps a transaction around every 200 (or so) inserts. If you
> don't, each insert will be its own transaction, which will slow you down.
Indeed. At first, I did:
BEGIN WORK;INSERT...; INSERT...;COMMIT; and so forth to ensure they were in
a transaction.
Later, I imported the whole thing into a temporary table, then INSERT INTO
real_table SELECT * FROM temp_table to make it even faster (or so I thought).
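Concretely, the current import looks something like this (table and column
definitions are placeholders standing in for the real schema):

  -- stand-ins for the real staging and destination tables
  CREATE TEMP TABLE temp_table (id bigint, payload text);
  CREATE TABLE real_table (id bigint, payload text);

  -- one transaction per batch of a few hundred rows
  BEGIN WORK;
  INSERT INTO temp_table VALUES (1, 'first row');
  INSERT INTO temp_table VALUES (2, 'second row');
  -- ...a few hundred more per batch...
  COMMIT;

  -- then move the whole batch into the real table in a single statement
  INSERT INTO real_table SELECT * FROM temp_table;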
The biggest slowdown seems to come when there are queries of the form:
1) INSERT INTO tableA SELECT * FROM temp_table
executing simultaneously with queries of the form
2) INSERT INTO tableB SELECT column FROM tableA WHERE various clauses
Of course, #1 happens after a bunch of inserts into temp_table, but those
go very fast.
Each of those queries, taken on its own, is either slow (#2) or frequent (#1);
the big slowdown comes when they run at the same time.
We have 8GB RAM. I've allocated about 700 megs to shared memory. The rest
is buffer cached by the O/S. I can't afford a 32 gig server as 2 gig RAM
modules are exorbitantly expensive. The database won't fit into RAM anyway.
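For reference, assuming the 700 megs refers to shared_buffers, the relevant
postgresql.conf lines would look roughly like this (a sketch only: newer
releases accept memory units, older ones want a count of 8 kB buffers, and
effective_cache_size is an illustrative addition describing the O/S cache):

  # hypothetical postgresql.conf excerpt
  shared_buffers = 700MB        # ~700 MB of PostgreSQL shared memory
                                # (older versions: shared_buffers = 89600, in 8 kB buffers)
  effective_cache_size = 6GB    # rough size of the O/S buffer cache the planner may assume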
At this point, after working with variants of this for a year, and watching
my database grow to several tables of 100 million records, I need
professional, high-quality, in-depth help.
Thanks,
Doug