From: miller@ham...
Subject: Kernel question: i386 test-and-set problem
Date: Thu, 20 Jul 2000 14:54:57 BST
jmk@pla... writes:
> The sleep/wakeup/postnote Rendez structure still has a lock which
> protects it, it just moved somewhere else.
Sorry, I didn't explain in enough detail. At /sys/src/9/port/proc.c:588,
wakeup() reads r->p (the pointer from the Rendez to the sleeping process)
without first acquiring any lock. That's the unprotected access I was
referring to: it's dangerous because r->p is also accessed asynchronously
by sleep() and postnote().
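
To make the hazard concrete, here is a minimal user-space sketch of one bad interleaving. It is not the kernel source: the struct layouts are cut down to the two pointers under discussion, and the helper names (peek_sleeper, detach, finish_wakeup) and the readied counter are hypothetical, invented only for illustration. The unlocked wakeup is split into its "read r->p" and "act on it" halves so that a postnote()-like path can be run in between, by hand, on a single thread.

```c
#include <stddef.h>

typedef struct Proc Proc;
typedef struct Rendez Rendez;

struct Rendez {
    Proc *p;        /* process sleeping on this Rendez, or NULL */
};

struct Proc {
    Rendez *r;      /* Rendez this process is sleeping on, or NULL */
    int readied;    /* hypothetical counter: times the process was made ready */
};

/* First half of an unlocked wakeup: read r->p with no lock held. */
Proc *peek_sleeper(Rendez *r)
{
    return r->p;
}

/* postnote()-like path: detach p from its Rendez and make it ready. */
void detach(Proc *p)
{
    if (p->r != NULL) {
        p->r->p = NULL;
        p->r = NULL;
        p->readied++;
    }
}

/* Second half of the unlocked wakeup: act on the pointer read earlier.
 * Nothing here notices that the pointer may already be stale. */
void finish_wakeup(Rendez *r, Proc *seen)
{
    if (seen != NULL) {
        r->p = NULL;
        seen->r = NULL;
        seen->readied++;
    }
}
```

If peek_sleeper() runs, then detach(), then finish_wakeup() with the pointer from the first step, the same process is made ready twice. Holding one lock across both halves of the wakeup would rule this interleaving out; whether this particular interleaving is the one behind any given crash is not something the sketch can show.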
The original 2nd edition kernel (CD version) had a lock in the Rendez
structure, and all accesses to r->p were protected by acquiring
the lock first. However, p->r (pointer from sleeping process to Rendez)
was shared between sleep() and postnote() without locking.
A later kernel update (845586056.rc) introduced a new lock in the Proc
structure (p->rlock) to protect the shared access to p->r, but eliminated
the lock in the Rendez structure. This left r->p exposed again. I believe
that's why you need coherence() calls.
> The 2nd Edition code would
> have needed coherence() calls too, but in different places, had it not
> been rewritten before we tried running on a multiprocessor Pentium Pro.
When I added MP support to the 2nd edition for my dual Pentium Pro system,
I reinstated the Rendez lock, and kept p->rlock as well, so in the
three-way conversation between sleep(), wakeup() and postnote() both
r->p and p->r are protected. I didn't add any explicit coherence()
calls anywhere, and the system has been running stably for over two years.
If I remove the lock around the r->p access in wakeup(), a few simultaneous
'du -a /' commands will quickly cause a crash.
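
For illustration only, the two-lock arrangement described above might look roughly like this in user space. This is a sketch under stated assumptions, not the kernel code: pthread mutexes stand in for kernel locks, the field name lk, the state names, and the functions sleep_register() and wakeup_locked() are all hypothetical, and interrupt priority handling and scheduling are left out entirely.

```c
#include <pthread.h>
#include <stddef.h>

enum { Running, Wakeme, Ready };    /* illustrative process states */

typedef struct Proc Proc;
typedef struct Rendez Rendez;

struct Rendez {
    pthread_mutex_t lk;     /* stands in for the reinstated Rendez lock */
    Proc *p;                /* sleeping process, or NULL; guarded by lk */
};

struct Proc {
    pthread_mutex_t rlock;  /* stands in for p->rlock */
    Rendez *r;              /* Rendez slept on, or NULL; guarded by rlock */
    int state;
};

/* sleep() side: link p and r together with both locks held. */
void sleep_register(Proc *p, Rendez *r)
{
    pthread_mutex_lock(&r->lk);
    pthread_mutex_lock(&p->rlock);
    r->p = p;
    p->r = r;
    p->state = Wakeme;
    pthread_mutex_unlock(&p->rlock);
    pthread_mutex_unlock(&r->lk);
}

/* wakeup() side: the Rendez lock is taken before r->p is examined.
 * Returns 1 if a sleeper was made ready, 0 otherwise. */
int wakeup_locked(Rendez *r)
{
    int woke = 0;

    pthread_mutex_lock(&r->lk);
    Proc *p = r->p;                 /* read under the Rendez lock */
    if (p != NULL) {
        pthread_mutex_lock(&p->rlock);
        if (p->state == Wakeme && p->r == r) {
            r->p = NULL;
            p->r = NULL;
            p->state = Ready;
            woke = 1;
        }
        pthread_mutex_unlock(&p->rlock);
    }
    pthread_mutex_unlock(&r->lk);
    return woke;
}
```

Both functions here take the Rendez lock first and the process lock second. A path that starts from the process side, as postnote() does, would naturally want the locks in the opposite order, so it would need a try-lock with retry or some other fixed ordering to avoid deadlock; that part is not shown.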
-- Richard Miller