Re: [HACKERS] Open 6.5 items - Mailing list pgsql-hackers
From: Tatsuo Ishii
Subject: Re: [HACKERS] Open 6.5 items
Msg-id: 199905310824.RAA21429@srapc451.sra.co.jp
In response to: Re: [HACKERS] Open 6.5 items (Vadim Mikheev <vadim@krs.ru>)
Responses: Re: [HACKERS] Open 6.5 items
List: pgsql-hackers
> Tom Lane wrote:
>>
>> Vadim Mikheev <vadim@krs.ru> writes:
>> >> If I recall the dynahash.c code correctly, a null return value
>> >> indicates either damage to the structure of the table (ie someone
>> >> stomped on memory that didn't belong to them) or running out of memory
>> >> to add entries to the table. The latter should be impossible if we
>>
>> > Quite different cases and should result in different reactions.
>>
>> I agree; will see about cleaning up hash_search's call convention after
>> 6.5 is done. Actually, maybe I should do it now? I'm not convinced yet
>> whether the reports we're seeing are due to memory clobber or running
>> out of space... fixing this may be the easiest way to find out.
>
> Imho, we have to fix it in some way before 6.5.
> Either by changing dynahash.c (to return 0x1 if the table is
> corrupted and 0x0 if out of space) or by changing
> elog(NOTICE) to elog(ERROR).
>
>> > #define NLOCKS_PER_XACT 40
>> >                         ^^
>> > Isn't it too low?
>>
>> You tell me ... that was the number that was in the 6.4 code, but I
>> have no idea if it's right or not. (Does MVCC require more locks
>> than the old stuff?) What is a good upper bound on the number
>> of concurrently existing locks?
>
> Probably yes, because writers can continue to work and lock
> other tables instead of sleeping on the first lock due to a concurrent
> select. I'll change it to 64, but this should be a configurable
> thing.
>
>> > /* xidHash table */
>> > size += hash_estimate_size(maxBackends,
>> >                            ^^^^^^^^^^^
>> >                            SHMEM_XIDTAB_KEYSIZE,
>> >                            SHMEM_XIDTAB_DATASIZE);
>>
>> > Why just maxBackends here? NLOCKENTS should be used too
>> > (each transaction lock requires its own xidhash entry).
>>
>> Should it be NLOCKENTS(maxBackends) xid entries, or do you mean
>> NLOCKENTS(maxBackends) + maxBackends?
>> Feel free to stick in any
>> estimates that you like better --- what's there now is an interpretation
>> of what the 6.4 code was trying to do (but it was sufficiently buggy and
>> unreadable that it was probably coming out with different numbers in
>> the end...)
>
> Just NLOCKENTS(maxBackends) - I'll change it now.

I have just done a cvs update and seen your changes. I tried the same
testing as before (64 concurrent connections, each connection executing
100 transactions), but it failed again:

(1) without -B 1024, it failed with: out of free buffers: time to abort!
(2) with -B 1024, it went into a stuck spin lock

So I looked into the sources a little bit and made a minor change to
include/storage/lock.h, from:

#define INIT_TABLE_SIZE 100

to:

#define INIT_TABLE_SIZE 4096

then restarted postmaster with -B 1024 (this should prevent the
out-of-free-buffers problem, I guess). Now everything seems to work
great!

I suspect that the huge INIT_TABLE_SIZE prevented dynamic expansion of
the hash tables, and that there's something wrong in the routines
responsible for that. Comments?
--
Tatsuo Ishii