Re: [HACKERS] Hashjoin status report - Mailing list pgsql-hackers

From	Tatsuo Ishii
Subject	Re: [HACKERS] Hashjoin status report
Date	May 6, 1999 22:33:18
Msg-id	199905070231.LAA18421@srapc451.sra.co.jp
In response to	Re: [HACKERS] Hashjoin status report (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: [HACKERS] Hashjoin status report
List	pgsql-hackers

> The Hermit Hacker <scrappy@hub.org> writes:
> >> Opinions?  Should I plow ahead, or leave this to fix after 6.5 release?
> 
> > Estimate of time involved to fix this?  vs likelihood of someone
> > triggering the bug in production?
> 
> I could probably get the coding done this weekend, unless something else
> comes up to distract me.  It's the question of how much testing it'd
> receive before release that worries me...
> 
> As for the likelihood, that's hard to say.  It's very easy to trigger
> the bug as a test case.  (Arrange for a hashjoin where the inner table
> has a lot of identical rows, or at least many sets of more-than-10-
> rows-with-the-same-value-in-the-field-being-hashed-on.)  In real life
> you'd like to think that that's pretty improbable.
> 
> What started this go-round was Contzen's report of seeing the
> "hash table out of memory. Use -B parameter to increase buffers"
> message in what was evidently a real-life scenario.  So it can happen.
> Do you recall having seen many complaints about that error before?
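
To make the trigger condition quoted above concrete, here is a toy model in Python. It is only a sketch under stated assumptions: the per-bucket limit of 10 is borrowed from the "more-than-10" figure in the quote, the function names are made up for illustration, and none of this is the actual executor code. It shows that rows sharing one join value always land in the same bucket, however many buckets there are.

```python
from collections import Counter

# Assumption: a fixed per-bucket limit, taken from the "more-than-10"
# figure in the quoted text. The real hashjoin code may differ.
BUCKET_CAPACITY = 10


def bucket_sizes(keys, nbuckets):
    """Count how many inner-table keys land in each hash bucket."""
    return Counter(hash(k) % nbuckets for k in keys)


def overflowing_buckets(keys, nbuckets, capacity=BUCKET_CAPACITY):
    """Return the bucket numbers that hold more than `capacity` keys."""
    sizes = bucket_sizes(keys, nbuckets)
    return sorted(b for b, n in sizes.items() if n > capacity)


# 1000 distinct join values spread evenly over the buckets.
distinct = list(range(1000))

# 50 rows share the value 7; the remaining 950 values are distinct.
skewed = [7] * 50 + list(range(950))
```

In this model `overflowing_buckets(distinct, 128)` is empty, while `overflowing_buckets(skewed, 128)` and `overflowing_buckets(skewed, 1024)` both report bucket 7: adding buckets does not help when the duplicates hash to the same place.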

We already have a good example of this "hash table out of memory. Use
-B parameter to increase buffers" syndrome in our source tree. Go to
src/test/bench, remove "-B 256" from the last line of runwisc.sh, then
run the test. The "-B 256" was not there originally; I added it
while fixing the test suite and elog() (see included posting).  I
don't see the error message in 6.4.2, so I guess this is due to the
changes in the optimizer.

IMHO, we should either fix this before 6.5 is out or change the
default setting of -B to 256 or so. The latter may cause a shortage of
shared memory, however.

P.S. At that time I mistakenly thought the problem was that I didn't
have enough sort memory :-<

>Message-Id: <199904160654.PAA00221@srapc451.sra.co.jp>
>From: Tatsuo Ishii <t-ishii@sra.co.jp>
>To: hackers@postgreSQL.org
>Subject: [HACKERS] elog() and wisconsin bench test fix
>Date: Fri, 16 Apr 1999 15:54:16 +0900
>
>I have modified elog() so that it uses its own pid (obtained with getpid()) as
>the first parameter for kill() in some cases. It used to get its own
>pid from the MyProcPid global variable. This was fine until I ran the
>wisconsin benchmark test suite (test/bench/). In the test, postgres is
>run as a command and MyProcPid is set to 0. As a result, elog() called
>kill() with the first parameter set to 0, and SIGQUIT was sent
>to the whole process group, not just the postgres process itself! This was why
>/bin/sh dumped core whenever I ran the bench test.
>
>Also, I fixed several bugs in the test queries.
>
>One thing that still remains is that some queries fail due to insufficient sort
>memory. I modified the test script to add the -B option. But is this
>normal? I think not. I thought postgres should use disk files instead
>of memory if there's not enough sort buffer.
>
>Comments?
>--
>Tatsuo Ishii
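
The kill() behaviour described in the included posting follows from the POSIX rules for the first argument. The sketch below spells those rules out in Python without sending any signal. Both function names are hypothetical, and `signal_target` only illustrates the idea of falling back to getpid() when the cached pid is unset; it is not the actual elog() change.

```python
import os


def kill_scope(pid: int) -> str:
    """Describe which processes POSIX kill(pid, sig) would signal."""
    if pid > 0:
        return "the single process with that pid"
    if pid == 0:
        return "every process in the caller's process group"
    if pid == -1:
        return "every process the caller may signal"
    return f"every process in process group {-pid}"


def signal_target(cached_pid: int) -> int:
    """Hypothetical helper: choose a safe first argument for kill().

    A cached pid of 0 means "never set" (as with MyProcPid when postgres
    is run as a plain command), so ask the OS for the real pid instead.
    """
    return cached_pid if cached_pid > 0 else os.getpid()
```

With a cached pid of 0, `kill_scope(0)` reports the whole process group, which is how the shell running the benchmark ended up receiving SIGQUIT, whereas `kill_scope(signal_target(0))` reports a single process.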
