Moving Indices to a different disk and various performance questions - Mailing list pgsql-general
From | Doug Fields |
---|---|
Subject | Moving Indices to a different disk and various performance questions |
Date | |
Msg-id | 5.1.0.14.2.20021212144719.02107570@pop.pexicom.com Whole thread Raw |
Responses |
Re: Moving Indices to a different disk and various
|
List | pgsql-general |
Hello all, (7.2.1 on Debian Woody, so it's using backported 7.2.3 patches.) I've found several threads on this in the past, so I've done a bit of homework. (Questions at end.) Most of you who follow this know I put out a request for commercial help a few days ago. I'm swamped with doing workarounds so I haven't been able to follow up yet (some of my engineers are on vacation). However, one immediate suggestion was for me to put the indices on another disk. (The only other disk I have right now is the transaction log disk, so it's not necessarily viable, but I can relatively quickly add an external RAID enclosure.) That's easy enough to do: find the index in pg_class, get the filename, and move the corresponding files to another disk. However, these indexes are fairly large: almost 6 gigs in one case (my slow VARCHAR index). This causes PostgreSQL to create multiple index files: 6402181, 6402181.1, 6402181.2, etc. It's obvious that as the index grows, it will need more files. So, say I symlink the existing ones to another drive (while the database is offline, of course). Now the index expands to another file. That will obviously be on the "wrong" disk now. What would happen if I did something like "touch 6402181.7" on the next few file IDs, move the empty files over, and then put in symlinks? Will that screw anything up? What will the database server do? Does 7.3 have any provision to say "create all index files in such-and-such directory" and so on? Final miscellaneous questions: 1) Does using those "conditional indexes" speed up indexing? For example, on one table, I have two indexes, one on columns (type, col1) and another on (type, col2) - all INTEGER columns. The index on (type, col1) is used for some values of type, and the index on (type, col2) is used for other values of type. Should I turn those into conditional indexes which list the exact values of type which they should be used for? Should I have a separate conditional index for each possible type value instead? (I'd end up with 20-odd indices, but I don't think that's a big deal.) 2) How badly does turning on stats_row_level and stats_block_level impact performance? a) Which one(s) do I need to get useful data out of pg_stat(io)_user_indexes and such, so I can see where most of my reads and writes are going to? b) What's the difference between *_blks_hit and *_blks_read? c) What are toast_blks? tidx_blocks? (It refers to a TOAST table but not sure what that is.) d) Is there a reference for how to interpret this data? (Beyond the current 7.2.1 manual "monitoring-stats" section) 3) It says I'm using client/server encoding SQL_ASCII. Is this the fastest one? I know I'm supposed to use the "C" locale, but I'm not sure if that locale is SQL_ASCII. 4) Does anyone have any positive or negative feelings on Hyperthreading? It's enabled on my dual P4 Xeon, but I'm considering disabling it, as there are really only 2 CPUs and PostgreSQL might be blocking on locks thinking there are 4, or the same thing at the O/S level. Thanks, Doug
pgsql-general by date: