Thread: Intermediate report for AIX 5L port
This is an intermediate report for the AIX 5L port.

o gmake check sometimes hangs up
o pgbench almost always hangs up

In summary, current on AIX 5L is quite unstable (7.1 worked very well).

FYI, here is the back trace info from while pgbench hangs up. It includes backtraces for 3 processes. I will continue to investigate as long as the hardware is available to me.

[ps indicates UPDATE]
semop(??, ??, ??) at 0xd02be73c
IpcSemaphoreLock() at 0x100091d0
LWLockAcquire() at 0x10019df4
LockRelease() at 0x10052540
UnlockRelation() at 0x10051950
index_endscan() at 0x10051104
ExecCloseR() at 0x100a0584
ExecEndIndexScan() at 0x100a3a30
ExecEndNode() at 0x1009e960
EndPlan() at 0x1009d700
ExecutorEnd() at 0x1009e434
ProcessQuery() at 0x100b60e8
pg_exec_query_string() at 0x1001c5b4
PostgresMain() at 0x1001c0a8
DoBackend() at 0x10003380
BackendStartup() at 0x1000287c
ServerLoop() at 0x10002be8
PostmasterMain() at 0x10004934
main() at 0x100004ec
(dbx)

[ps indicates UPDATE]
semop(??, ??, ??) at 0xd02be73c
IpcSemaphoreLock() at 0x100091d0
LWLockAcquire() at 0x10019df4
LockRelease() at 0x10052540
UnlockRelation() at 0x10051950
relation_close() at 0x10014f1c
SearchCatCache() at 0x100103f8
SearchSysCache() at 0x1000daac
IndexSupportInitialize() at 0x100ae150
RelationInitIndexAccessInfo() at 0x100332a4
RelationBuildDesc() at 0x10030a20
RelationNameGetRelation() at 0x100337b0
relation_openr() at 0x10014f84
index_openr() at 0x100513dc
SearchCatCache() at 0x1001022c
SearchSysCache() at 0x1000daac
pg_aclcheck() at 0x1008d844
ExecCheckRTEPerms() at 0x1009c838
ExecCheckRTPerms() at 0x1009c94c
ExecCheckQueryPerms() at 0x1009cb28
InitPlan() at 0x1009da10
ExecutorStart() at 0x1009e6f4
ProcessQuery() at 0x100b6028
pg_exec_query_string() at 0x1001c5b4
PostgresMain() at 0x1001c0a8
DoBackend() at 0x10003380
BackendStartup() at 0x1000287c
ServerLoop() at 0x10002be8
PostmasterMain() at 0x10004934
main() at 0x100004ec
(dbx)

[ps indicates UPDATE waiting]
semop(??, ??, ??) at 0xd02be73c
IpcSemaphoreLock() at 0x100091d0
ProcSleep() at 0x1001d680
WaitOnLock() at 0x10051e70
LockAcquire() at 0x10052d00
XactLockTableWait() at 0x10051580
heap_update() at 0x100133d0
ExecReplace() at 0x1009ce28
ExecutePlan() at 0x1009d678
ExecutorRun() at 0x1009e564
ProcessQuery() at 0x100b60c4
pg_exec_query_string() at 0x1001c5b4
PostgresMain() at 0x1001c0a8
DoBackend() at 0x10003380
BackendStartup() at 0x1000287c
ServerLoop() at 0x10002be8
PostmasterMain() at 0x10004934
main() at 0x100004ec
> Tatsuo Ishii <t-ishii@sra.co.jp> writes:
> > In summary, current on AIX 5L is quite unstable (7.1 worked very well).
>
> This looks like a problem with the new LWLock logic. I guess I have
> to answer for that ;-). I am willing to dig into it if you can get
> me access to the machine.

Thanks. Since the machine is behind a firewall, I have to negotiate with the network admin. Once I succeed, I will let you know.

> BTW, is this a single-CPU or multi-CPU machine?

It's a 4-way machine.
--
Tatsuo Ishii
Tatsuo Ishii <t-ishii@sra.co.jp> writes:
> In summary, current on AIX 5L is quite unstable (7.1 worked very well).

This looks like a problem with the new LWLock logic. I guess I have to answer for that ;-). I am willing to dig into it if you can get me access to the machine.

BTW, is this a single-CPU or multi-CPU machine?

            regards, tom lane
Got it. The AIX compiler apparently feels free to rearrange the sequence

    proc->lwWaiting = true;
    proc->lwExclusive = (mode == LW_EXCLUSIVE);
    proc->lwWaitLink = NULL;
    if (lock->head == NULL)
        lock->head = proc;
    else
        lock->tail->lwWaitLink = proc;
    lock->tail = proc;

    /* Can release the mutex now */
    SpinLockRelease_NoHoldoff(&lock->mutex);

into something wherein the SpinLockRelease (which is just "x = 0") occurs before the last two assignments into the lock structure. Boo, hiss. Evidently, on your multiprocessor machine, there may be another CPU that is able to obtain the spinlock and then read the un-updated lock values before the stores occur.

Declaring the lock pointer "volatile" seems to prevent this misbehavior.

Personally I'd call this a compiler bug; isn't it supposed to consider semicolons as sequence points? I never heard that rearranging the order of stores into memory was considered a kosher optimization of C code.

            regards, tom lane
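For readers following along, here is a minimal, self-contained sketch of the pattern and the fix being described. It is not the actual lwlock.c code: "LWLockStub", "ProcStub", and the field layout are simplified stand-ins. The point is that once the lock pointer is volatile-qualified, the head/tail updates and the final mutex store are all accesses to volatile objects, which the compiler may not reorder relative to one another at compile time.

    typedef struct ProcStub ProcStub;
    struct ProcStub
    {
        int       lwWaiting;
        int       lwExclusive;
        ProcStub *lwWaitLink;
    };

    typedef struct
    {
        int       mutex;          /* spinlock word */
        ProcStub *head;
        ProcStub *tail;
    } LWLockStub;

    static void
    enqueue_self(volatile LWLockStub *lock, ProcStub *proc, int exclusive)
    {
        /* ... assume lock->mutex is already held at this point ... */
        proc->lwWaiting = 1;
        proc->lwExclusive = exclusive;
        proc->lwWaitLink = NULL;
        if (lock->head == NULL)
            lock->head = proc;
        else
            lock->tail->lwWaitLink = proc;
        lock->tail = proc;

        /* release the spinlock; because "lock" points to a volatile
         * object, this store may not be hoisted above the head/tail
         * updates by the compiler */
        lock->mutex = 0;
    }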
...
> Declaring the lock pointer "volatile" seems to prevent this misbehavior.

Great. That is what it is anyway, right?

> Personally I'd call this a compiler bug; isn't it supposed to consider
> semicolons as sequence points? I never heard that rearranging the order
> of stores into memory was considered a kosher optimization of C code.

Sure it is. Presumably "-O0" or equivalent would have kept this from happening, but seemingly unrelated stores into non-overlapping memory are always fair game at even modest levels of optimization. The "x = 0" is cheaper than the other operations, though it may be reordered for internal RISC-y reasons rather than "cheapest first" considerations.

                    - Thomas
Thomas Lockhart <lockhart@fourpalms.org> writes:
>> Declaring the lock pointer "volatile" seems to prevent this misbehavior.

> Great. That is what it is anyway, right?

The reason I hadn't declared it volatile in the first place was that I was assuming there'd be sequence points at the spin lock and unlock calls. The order of operations *while holding the lock* is, and should be, optimizable. Pushing updates outside of the range where the lock is held, however, isn't cool.

Now that I think a little more, I am worried about the idea that the same thing could potentially happen in other modules that access shared memory and use spinlocks to serialize the access. Perhaps the fix I applied was wrong, and the correct fix is to change S_LOCK from

    #define S_UNLOCK(lock)    (*(lock) = 0)

to

    #define S_UNLOCK(lock)    (*((volatile slock_t *) (lock)) = 0)

Assuming that the compiler does faithfully treat that as a sequence point, it would solve potential related problems in other modules, not only LWLock. I note that we've carefully marked all the TAS operations as using volatile pointers ... but we forgot about S_UNLOCK.

Comments?

            regards, tom lane
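To make the asymmetry being pointed out concrete, here is an illustrative sketch, not the actual s_lock.h: the acquire side takes a volatile-qualified pointer (as the TAS declarations reportedly do), while the old-style unlock is a plain store. The tas() body and the _OLD/_NEW macro names are placeholders invented for this sketch; a real TAS is an atomic test-and-set instruction.

    typedef unsigned char slock_t;

    /* acquire side: declared with a volatile pointer; the body here is
     * only a placeholder and is not atomic */
    static int
    tas(volatile slock_t *lock)
    {
        if (*lock)
            return 1;            /* lock already held */
        *lock = 1;
        return 0;
    }

    /* release side, before and after the proposed change */
    #define S_UNLOCK_OLD(lock)  (*(lock) = 0)                           /* plain store */
    #define S_UNLOCK_NEW(lock)  (*((volatile slock_t *) (lock)) = 0)    /* volatile store */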
> Got it. The AIX compiler apparently feels free to rearrange the
> sequence
>
>     proc->lwWaiting = true;
>     proc->lwExclusive = (mode == LW_EXCLUSIVE);
>     proc->lwWaitLink = NULL;
>     if (lock->head == NULL)
>         lock->head = proc;
>     else
>         lock->tail->lwWaitLink = proc;
>     lock->tail = proc;
>
>     /* Can release the mutex now */
>     SpinLockRelease_NoHoldoff(&lock->mutex);
>
> into something wherein the SpinLockRelease (which is just "x = 0")
> occurs before the last two assignments into the lock structure.
> Boo, hiss. Evidently, on your multiprocessor machine, there may be
> another CPU that is able to obtain the spinlock and then read the
> un-updated lock values before the stores occur.
>
> Declaring the lock pointer "volatile" seems to prevent this misbehavior.
>
> Personally I'd call this a compiler bug; isn't it supposed to consider
> semicolons as sequence points? I never heard that rearranging the order
> of stores into memory was considered a kosher optimization of C code.

Looks funny to me too. I will let the IBM engineers know what you have found. Thanks.
--
Tatsuo Ishii
I said:
> Now that I think a little more, I am worried about the idea that the
> same thing could potentially happen in other modules that access shared
> memory and use spinlocks to serialize the access. Perhaps the fix I
> applied was wrong, and the correct fix is to change S_LOCK from
> #define S_UNLOCK(lock)    (*(lock) = 0)
> to
> #define S_UNLOCK(lock)    (*((volatile slock_t *) (lock)) = 0)

I have applied this patch also, since on reflection it seems the clearly Right Thing. However, I find that AIX 5's compiler must have the LWLock pointers marked volatile as well, else it still generates bad code.

Original assembly code fragment (approximately lines 244-271 of lwlock.c):

    l       r3,8(r25)
    stb     r24,44(r25)
    cmpi    0,r0,0
    stb     r4,45(r25)
    st      r23,48(r25)
    cal     r5,0(r0)
    stx     r23,r28,r27          <----- spinlock release
    bc      BO_IF_NOT,CR0_EQ,__L834
    st      r25,12(r26)          <----- store into lock->head
    st      r25,16(r26)          <----- store into lock->tail
    l       r4,12(r25)
    bl      .IpcSemaphoreLock{PR}

With "volatile" added in S_UNLOCK:

    stb     r24,44(r25)
    stb     r3,45(r25)
    cmpi    0,r0,0
    cal     r5,0(r0)
    st      r23,48(r25)
    bc      BO_IF_NOT,CR0_EQ,__L81c
    st      r25,12(r26)          <----- store into lock->head
    stx     r23,r28,r27          <----- spinlock release
    l       r3,8(r25)
    st      r25,16(r26)          <----- store into lock->tail
    l       r4,12(r25)
    bl      .IpcSemaphoreLock{PR}

With "volatile" lock pointer in LWLockAcquire:

    stb     r25,44(r23)
    stb     r3,45(r23)
    cmpi    0,r0,0
    cal     r5,0(r0)
    st      r24,48(r23)
    bc      BO_IF_NOT,CR0_EQ,__L850
    st      r23,12(r26)          <----- store into lock->head
    st      r23,16(r26)          <----- store into lock->tail
    stx     r24,r28,r27          <----- spinlock release
    l       r3,8(r23)
    l       r4,12(r23)
    bl      .IpcSemaphoreLock{PR}

I believe the second of these cases is inarguably a compiler bug. It is moving a store (into lock->tail) across a store through a volatile-qualified pointer. As I read the spec, that's not kosher.

            regards, tom lane
> Thomas Lockhart <lockhart@fourpalms.org> writes:
> >> Declaring the lock pointer "volatile" seems to prevent this misbehavior.
>
> > Great. That is what it is anyway, right?
>
> The reason I hadn't declared it volatile in the first place was that I
> was assuming there'd be sequence points at the spin lock and unlock
> calls. The order of operations *while holding the lock* is, and should
> be, optimizable. Pushing updates outside of the range where the lock is
> held, however, isn't cool.
>
> Now that I think a little more, I am worried about the idea that the
> same thing could potentially happen in other modules that access shared
> memory and use spinlocks to serialize the access. Perhaps the fix I
> applied was wrong, and the correct fix is to change S_LOCK from

Here is my limited experience with volatile. There was a BSD/OS multiport card that mapped card memory to a RAM address, but the pointers pointing to that address weren't marked as volatile. An upgrade to a better compiler caused the driver to fail, and I finally figured out why. Marking them as volatile fixed it.

Seems this is the same case. We are not pointing to memory on a card but to shared memory which can change on its own, hence it is volatile.

Tom, I assume what you are saying is that the access to the spinlocks, already marked as volatile, should have prevented any code from migrating over those locks. I guess my big question is: does any volatile access prevent optimization of other variables across that volatile access? I didn't think that was guaranteed.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Tom, I assume what you are saying is that the access to the spinlocks,
> already marked as volatile, should have prevented any code from
> migrating over those locks. I guess my big question is does any
> volatile access prevent optimization of other variables across that
> volatile access? I didn't think that was guaranteed.

After eyeballing the C spec some more, I think you might be right. If that's the correct reading then it is indeed necessary for lwlock.c to mark the whole lock structure as volatile, not only the spinlock fields.

However, if that's true then (a) 7.2 has three other modules that are potentially vulnerable to similar problems; (b) prior releases had many more places that were potentially vulnerable --- ie, all the modules that used to use spinlocks directly and now use LWLocks. If this sort of behavior is allowed, ISTM we should have been seeing major instability on lots of SMP machines.

Comments? Do we need to put a bunch of "volatile" keywords into every place that uses raw spinlocks? If so, why wasn't the previous code equally broken??

I don't think the places that use LWLocks need volatile markers on their data structures, since the LWLock lock and unlock calls will be out-of-line subroutine calls. But for spinlocks that can be inlined, it seems there is a risk.

            regards, tom lane
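Bruce's question can be restated as a small hedged sketch (illustrative names only, not PostgreSQL code): only the spinlock word is volatile below, and the open issue is whether the ordinary store to "counter" may legally be moved across the volatile mutex stores.

    typedef struct
    {
        volatile int mutex;      /* spinlock word: accesses are volatile */
        int          counter;    /* ordinary shared field: not volatile  */
    } SharedStub;

    static void
    bump(SharedStub *s)
    {
        /* NOT a real test-and-set: atomicity is ignored here; this sketch
         * is only about compile-time ordering */
        while (s->mutex != 0)
            ;
        s->mutex = 1;            /* "acquire": a volatile store */

        /* an ordinary, non-volatile store: whether the compiler may move
         * it across the volatile mutex stores above and below is exactly
         * the question under discussion */
        s->counter++;

        s->mutex = 0;            /* "release": a volatile store */
    }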
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Tom, I assume what you are saying is that the access to the spinlocks,
> > already marked as volatile, should have prevented any code from
> > migrating over those locks. I guess my big question is does any
> > volatile access prevent optimization of other variables across that
> > volatile access? I didn't think that was guaranteed.
>
> After eyeballing the C spec some more, I think you might be right.
> If that's the correct reading then it is indeed necessary for lwlock.c
> to mark the whole lock structure as volatile, not only the spinlock
> fields.

OK.

> However, if that's true then (a) 7.2 has three other modules that are
> potentially vulnerable to similar problems; (b) prior releases had

That was going to be my next question.

> many more places that were potentially vulnerable --- ie, all the
> modules that used to use spinlocks directly and now use LWLocks.
> If this sort of behavior is allowed, ISTM we should have been seeing
> major instability on lots of SMP machines.

Again, a good question. No idea.

Here is a more general question: If you do:

    get lock;
    a=4
    release lock;

Can the compiler reorder that to:

    a=4
    get lock;
    release lock;

It can see the lock values don't have any effect on 'a'. What actually does keep this stuff from moving around?

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
...
> It can see the lock values don't have any effect on 'a'. What actually
> does keep this stuff from moving around?

Lack of ambition? I'm pretty sure that the only reasons *to* reorder instructions are:

1) there could be a performance gain, as in
   a) loop unrolling
   b) pipeline fill considerations
   c) unnecessary assignment (e.g. result is ignored, or only used on one path)
2) the optimization level allows it (-O0 does not reorder at all)

I vaguely recall that the gcc docs discuss the kinds of optimizations allowed at each level. Presumably IBM's AIX compiler was a bit more aggressive in evaluating costs or pipeline fills than is gcc on other processors.

                    - Thomas
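As one concrete (and deliberately trivial) illustration of the kind of reordering being described: two stores to distinct fields of the same struct have no language-imposed order in the emitted code, so a scheduler optimizing for pipeline fill is free to swap them. That is harmless in single-threaded code, but not when one of the stores is a lock release that other CPUs observe. The names here are invented for the sketch.

    struct pair
    {
        int a;
        int b;
    };

    /* the abstract-machine result is identical either way, so the
     * compiler may emit these two stores in whichever order schedules
     * better on the target pipeline */
    void
    store_pair(struct pair *p)
    {
        p->a = 1;
        p->b = 2;
    }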
Tom, have you fixed the case on AIX 5L? I still see hanging backends with pgbench -c 64. Maybe AIX 5L (more precisely xlc) needs additional fixes? If so, I'm wondering why I see no improvements even with gcc.
--
Tatsuo Ishii

(dbx) where
semop(??, ??, ??) at 0xd02be73c
IpcSemaphoreLock() at 0x100091d0
LWLockAcquire() at 0x10019df4
ReleaseBuffer() at 0x100205a4
CatalogIndexFetchTuple() at 0x1005a31c
AttributeRelidNumIndexScan() at 0x1005a4e4
build_tupdesc_ind() at 0x10030c5c
RelationBuildTupleDesc() at 0x10031180
RelationBuildDesc() at 0x100309c0
RelationNameGetRelation() at 0x100337b0
relation_openr() at 0x10014f84
heap_openr() at 0x10014d3c
CatalogCacheInitializeCache() at 0x1000f194
SearchCatCache() at 0x1000fe9c
SearchSysCache() at 0x1000daac
eqsel() at 0x100c5388
OidFunctionCall4() at 0x10045ccc
restriction_selectivity() at 0x100c7594
clauselist_selectivity() at 0x100c72a4
restrictlist_selectivity() at 0x100c7424
set_baserel_size_estimates() at 0x100c8924
set_plain_rel_pathlist() at 0x100e1268
set_base_rel_pathlists() at 0x100e13a4
make_one_rel() at 0x100e1518
subplanner() at 0x100e0b6c
query_planner() at 0x100e0d98
grouping_planner() at 0x100df0f0
subquery_planner() at 0x100dff00
planner() at 0x100dffe0
pg_plan_query() at 0x1001c6b0
pg_exec_query_string() at 0x1001c530
PostgresMain() at 0x1001c0a8
DoBackend() at 0x10003380
BackendStartup() at 0x1000287c
ServerLoop() at 0x10002be8
PostmasterMain() at 0x10004934
main() at 0x100004ec
(dbx)

> I said:
> > Now that I think a little more, I am worried about the idea that the
> > same thing could potentially happen in other modules that access shared
> > memory and use spinlocks to serialize the access. Perhaps the fix I
> > applied was wrong, and the correct fix is to change S_LOCK from
> > #define S_UNLOCK(lock)    (*(lock) = 0)
> > to
> > #define S_UNLOCK(lock)    (*((volatile slock_t *) (lock)) = 0)
>
> I have applied this patch also, since on reflection it seems the clearly
> Right Thing. However, I find that AIX 5's compiler must have the LWLock
> pointers marked volatile as well, else it still generates bad code.
>
> Original assembly code fragment (approximately lines 244-271 of
> lwlock.c):
>
>     l       r3,8(r25)
>     stb     r24,44(r25)
>     cmpi    0,r0,0
>     stb     r4,45(r25)
>     st      r23,48(r25)
>     cal     r5,0(r0)
>     stx     r23,r28,r27          <----- spinlock release
>     bc      BO_IF_NOT,CR0_EQ,__L834
>     st      r25,12(r26)          <----- store into lock->head
>     st      r25,16(r26)          <----- store into lock->tail
>     l       r4,12(r25)
>     bl      .IpcSemaphoreLock{PR}
>
> With "volatile" added in S_UNLOCK:
>
>     stb     r24,44(r25)
>     stb     r3,45(r25)
>     cmpi    0,r0,0
>     cal     r5,0(r0)
>     st      r23,48(r25)
>     bc      BO_IF_NOT,CR0_EQ,__L81c
>     st      r25,12(r26)          <----- store into lock->head
>     stx     r23,r28,r27          <----- spinlock release
>     l       r3,8(r25)
>     st      r25,16(r26)          <----- store into lock->tail
>     l       r4,12(r25)
>     bl      .IpcSemaphoreLock{PR}
>
> With "volatile" lock pointer in LWLockAcquire:
>
>     stb     r25,44(r23)
>     stb     r3,45(r23)
>     cmpi    0,r0,0
>     cal     r5,0(r0)
>     st      r24,48(r23)
>     bc      BO_IF_NOT,CR0_EQ,__L850
>     st      r23,12(r26)          <----- store into lock->head
>     st      r23,16(r26)          <----- store into lock->tail
>     stx     r24,r28,r27          <----- spinlock release
>     l       r3,8(r23)
>     l       r4,12(r23)
>     bl      .IpcSemaphoreLock{PR}
>
> I believe the second of these cases is inarguably a compiler bug.
> It is moving a store (into lock->tail) across a store through a
> volatile-qualified pointer. As I read the spec, that's not kosher.
>
>             regards, tom lane
Tatsuo Ishii <t-ishii@sra.co.jp> writes:
> Tom, have you fixed the case on AIX 5L? I still see hanging backends
> with pgbench -c 64. Maybe AIX 5L (more precisely xlc) needs additional
> fixes? If so, I'm wondering why I see no improvements even with gcc.

Hmm, I thought I'd fixed it ... are you using CVS tip? The largest test case I tried was 5 clients * 10000 transactions, but that ran to completion just fine.

            regards, tom lane
> Tatsuo Ishii <t-ishii@sra.co.jp> writes:
> > Tom, have you fixed the case on AIX 5L? I still see hanging backends
> > with pgbench -c 64. Maybe AIX 5L (more precisely xlc) needs additional
> > fixes? If so, I'm wondering why I see no improvements even with gcc.
>
> Hmm, I thought I'd fixed it ... are you using CVS tip? The largest
> test case I tried was 5 clients * 10000 transactions, but that ran to
> completion just fine.

I believe I grabbed the latest current source. But I will try again...
--
Tatsuo Ishii