From: postmaster@bt.net
Subject: Delivery Report (failure) for pete.fenelon@zet...
Date: Tue, 29 Apr 1997 01:12:19 +0100
------------------------------ Start of body part 1
This report relates to your message: Subject: Pentium Pro and coherence,
Message-ID: <199704211614.MAA02731@cse...>,
To: 9fans@cse...
of Mon, 21 Apr 1997 17:26:16 +0100
Your message was not delivered to pete.fenelon@zet...
for the following reason:
Message timed out
***** The following information is directed towards the local administrator
***** and is not intended for the end user
*
* DR generated by: mta BT.NET
* in /PRMD=BTNet/ADMD=BT/C=GB/
* at Tue, 29 Apr 1997 00:37:29 +0100
*
* Converted to RFC 822 at bt.net at Tue, 29 Apr 1997 01:12:19 +0100
*
* Delivery Report Contents:
*
* Subject-Submission-Identifier: [/PRMD=BTNet/ADMD=BT/C=GB/;<199704211614.MAA02731@cse...]
* Content-Identifier: Pentium Pro a...
* Original-Encoded-Information-Types: ia5-text
* Subject-Intermediate-Trace-Information: /PRMD=BTNet/ADMD=BT/C=GB/arrival Mon, 21 Apr 1997 17:26:16 +0100 action Relayed
* Content-Correlator: Subject: Pentium Pro and coherence,
* Message-ID: <199704211614.MAA02731@cse...>,
* To: 9fans@cse...
* Recipient-Info: pete.fenelon@zet...,
* /RFC-822=pete.fenelon(a)zetnet.co.uk/O=BTnet/PRMD=BTNet/ADMD=BT/C=GB/;
* FAILURE reason Unable-To-Transfer (1);
* diagnostic Maximum-Time-Expired (5);
* last trace (ia5-text) Mon, 21 Apr 1997 17:26:16 +0100;
* converted eits ia5-text;
* supplementary info "Message timed out";
****** End of administration information
------------------------------ Start of forwarded message 1
Received: from minster.cs.york.ac.uk by relay.bt.net with SMTP (PP); Mon, 21 Apr 1997 17:26:17 +0100
From: 9fans@cse...
Received: from localhost (majordom@localhost) by cse.psu.edu (8.8.5/8.7.3) with SMTP id MAA02775;
Mon, 21 Apr 1997 12:15:31 -0400 (EDT)
Received: by claven.cse.psu.edu (bulk_mailer v1.5); Mon, 21 Apr 1997 12:15:26 -0400
Received: (from majordom@localhost) by cse.psu.edu (8.8.5/8.7.3) id MAA02739 for 9fans-outgoing;
Mon, 21 Apr 1997 12:15:08 -0400 (EDT)
X-Authentication-Warning: claven.cse.psu.edu: majordom set sender to owner-9fans using -f
Received: from plan9.cs.bell-labs.com (plan9.bell-labs.com [204.178.16.2]) by cse.psu.edu (8.8.5/8.7.3) with SMTP id MAA02731
for <9fans@cse...>; Mon, 21 Apr 1997 12:14:59 -0400 (EDT)
>From: presotto@pla...
Message-Id: <199704211614.MAA02731@cse...>
To: 9fans@cse...
Date: Mon, 21 Apr 1997 10:33:10 -0400
Subject: Pentium Pro and coherence
Sender: owner-9fans@cse...
Reply-To: 9fans@cse...
Precedence: bulk
Sorry for yet another long message...
> From: hamnavoe.demon.co.uk!miller
> To: cse.psu.edu!9fans
> Subject: Re: porting linux programs and drivers to plan9
> presotto@pla... writes:
> > [a fascinating account of how the Pentium Pro's out-of-order
> > instruction execution breaks the Plan 9 sleep/wakeup code on
> > a multi-CPU system]
I didn't write those words. I may have written what
accompanied them but not having seen the message, I don't
know.
The exact ordering I gave in my last mail was impossible
because of the locks. An equally illustrative
(and this time actually possible) version follows.
    /* sleeper */                           /* waker */
                                            wakeup_condition = 1;
    p = u->p;
    lock(&p->rlock);
    r->p = p;   /* put myself in the rendezvous structure */
A:  if(wakeup_condition){
        r->p = 0;   /* no need to sleep */
        unlock(&p->rlock);
        return;
    } else {
        /* go to sleep */
        p->state = Wakeme;
        p->r = r;
        unlock(&p->rlock);
                                            p = r->p;
                                        B:  if(p == 0)
                                                return;
                                            lock(&p->rlock);
                                            if(r->p == p && p->r == r){
                                                r->p = 0;
                                                p->r = 0;
                                                ready(p);
                                            }
                                            unlock(&p->rlock);
        sched();
    }
The ordering of the critical instructions is the same but at least
this time I got the ordering of the locked pieces right. The critical
points are A and B. With speculative reads, both r->p and
wakeup_condition may appear to be 0 (depending on what lock()
does or doesn't do).
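A toy model can make the failure at A and B concrete. The sketch below is only an illustration under a loud assumption: each CPU holds its latest write in a private one-entry buffer until something drains it. That is a stand-in mechanism, not a claim about what the Pentium Pro does internally. The names Cpu, cpu_write, cpu_drain, cpu_read and lost_wakeup are all hypothetical.

```c
#include <assert.h>

/* Toy model, NOT a description of the Pentium Pro: each CPU keeps its
 * latest write in a private one-entry buffer until it is drained. */
enum { WAKEUP_CONDITION, R_P, NVARS };	/* the two shared variables */

typedef struct Cpu Cpu;
struct Cpu {
	int pending;	/* 1 if a write is still buffered */
	int var;	/* which shared variable it targets */
	int val;	/* value waiting to be written */
};

static int memory[NVARS];

/* Buffer a write; other CPUs cannot see it yet. */
static void cpu_write(Cpu *c, int var, int val)
{
	c->pending = 1;
	c->var = var;
	c->val = val;
}

/* Make the buffered write visible (stand-in for an interlocking instruction). */
static void cpu_drain(Cpu *c)
{
	if(c->pending){
		memory[c->var] = c->val;
		c->pending = 0;
	}
}

/* A CPU sees its own buffered write, otherwise shared memory. */
static int cpu_read(Cpu *c, int var)
{
	if(c->pending && c->var == var)
		return c->val;
	return memory[var];
}

/* Replay the critical interleaving; returns 1 if the wakeup is lost. */
int lost_wakeup(int interlock)
{
	Cpu sleeper = {0}, waker = {0};
	int saw_condition, saw_sleeper;

	memory[WAKEUP_CONDITION] = 0;
	memory[R_P] = 0;

	cpu_write(&waker, WAKEUP_CONDITION, 1);	/* wakeup_condition = 1 */
	cpu_write(&sleeper, R_P, 1);		/* r->p = p */
	if(interlock){
		cpu_drain(&waker);
		cpu_drain(&sleeper);
	}
	saw_condition = cpu_read(&sleeper, WAKEUP_CONDITION);	/* point A */
	saw_sleeper = cpu_read(&waker, R_P);			/* point B */

	/* Lost wakeup: the sleeper sleeps and the waker finds nobody to wake. */
	return !saw_condition && !saw_sleeper;
}
```

In this model lost_wakeup(0) reports a lost wakeup because each side reads the other's variable before the other's write has become visible, while lost_wakeup(1) does not, since both writes are drained before either read.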
> It appears that the slightly different version of sleep/wakeup
> given in the Volume 2 paper `Process Sleep and Wakeup on a
> Shared-memory Multiprocessor' should be immune to the effects
> of weak memory coherency, because the shared variables are
> referenced only inside a lock/unlock pair. Is this right?
I'm not sure. It depends a bit on what we believe fixes
the coherence. We don't really know what's happening inside the
Pro; we're just guessing. We're not even certain that speculative
reads are the problem. The Pro people have remained silent
on the subject (we've sent email).
Assuming that it was indeed speculative reads, the simplest mechanism
that I can imagine Intel providing is to cancel speculative
reads whenever an interlocking instruction is encountered.
If this is indeed the case, then keeping every reference to shared
variables between lock and unlock would be sufficient.
(Unfortunately, we don't do that
because of the interaction between postnote and sleep/wakeup. Postnote
doesn't know what r is without first looking at p->r outside of any
possible lock. We could fix sleep/wakeup by moving the problem so
that it lies between sleep and postnote. However, it'd be the same
problem. This is perhaps another story.)
Of course, I could be totally wrong about the speculative reads and
it may be the interlock instruction on the writer and not the
reader that causes the processors to become coherent. In that case, at the
very least, we'd have to make unlock() end with an interlocking
instruction. The released version just sets 'l->val = 0'.
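For illustration, here is a minimal sketch of a spin lock whose release is itself an interlocked operation, written with C11 atomics rather than the kernel's own primitives. The Lock layout, canlock() and the bodies of lock() and unlock() are assumptions made for this example; only the field name val and the idea of replacing the plain store come from the text above.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical lock layout; only the field name 'val' is taken from the text. */
typedef struct Lock Lock;
struct Lock {
	atomic_int val;	/* 0 = free, 1 = held */
};

/* Try once; returns 1 if we took the lock. atomic_exchange is a
 * read-modify-write, i.e. an interlocking operation. */
int canlock(Lock *l)
{
	return atomic_exchange(&l->val, 1) == 0;
}

/* Spin until the lock is ours. */
void lock(Lock *l)
{
	while(!canlock(l))
		;	/* spin */
}

/* Release with an interlocked exchange instead of a plain store of 0. */
void unlock(Lock *l)
{
	atomic_exchange(&l->val, 0);
}
```

The design choice is only in unlock(): the exchange costs more than a store, but it gives the writer's side an interlocking operation as well, which covers the case where coherence is driven by the writer and not the reader.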
We have discovered empirically that performing an interlock instruction
between setting one shared variable and looking at the other seems
sufficient. Nothing less seemed to work for us. Putting everything
back inside the locks might have worked but we didn't try it because of
postnote().
Since we're paranoid, we now perform an interlocking instruction
before checking the state variables in sleep() and wakeup() AND
at the end of unlock(). Everywhere else, we seem to be following
a strict policy of changing or looking at shared things only inside lock/unlock.
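The "set one shared variable, interlock, then look at the other" pattern can be sketched in portable C11, using atomic_thread_fence as the stand-in for the interlocking instruction. This is a sketch under assumptions, not the kernel code: Proc, Rendez, sleep_check() and wakeup_check() are hypothetical, and the locks, process state and scheduler calls are left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical, stripped-down types for illustration only. */
typedef struct Proc Proc;
typedef struct Rendez Rendez;

struct Proc {
	int pid;
};

struct Rendez {
	_Atomic(Proc *) p;	/* process waiting here, or NULL */
	atomic_int condition;	/* stands in for wakeup_condition */
};

/* Sleeper side: announce ourselves, fence, then test the condition.
 * Returns 1 if the caller must really go to sleep. */
int sleep_check(Rendez *r, Proc *me)
{
	atomic_store_explicit(&r->p, me, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* the "interlock" */
	if(atomic_load_explicit(&r->condition, memory_order_relaxed)){
		atomic_store_explicit(&r->p, (Proc *)NULL, memory_order_relaxed);
		return 0;	/* no need to sleep */
	}
	return 1;
}

/* Waker side: post the condition, fence, then look for a sleeper.
 * Returns the process to make ready, or NULL if nobody is waiting. */
Proc *wakeup_check(Rendez *r)
{
	atomic_store_explicit(&r->condition, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* the "interlock" */
	return atomic_load_explicit(&r->p, memory_order_relaxed);
}
```

Under the C11 memory model, with a sequentially consistent fence between the store and the load on both sides, at least one of the two loads must observe the other side's store, so the outcome "sleeper sleeps and waker sees nobody" is ruled out. Whether a given interlocking instruction on the Pro gives the same guarantee is exactly the open question here.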
> Perhaps the moral is that it's better to be conservative with
> locks than to trust hardware designers to do what we expect.
I certainly agree. We are going to encounter more relaxed ordering
in multiprocessors. The question is, what do the hardware
designers consider conservative? Forcing an interlock
at both the beginning and end of a locked section seems to be
pretty conservative to me, but I clearly am not imaginative
enough. The Pro manuals go into excruciating detail in describing
the caches and what keeps them coherent but don't seem to care
to say anything detailed about execution or read ordering. The
truth is that we have no way of knowing whether we're conservative
enough.
------------------------------ End of forwarded message 1