
From: presotto@pla...
Subject: Re: [9fans] Kernel question: i386 test-and-set problem
Date: Mon, 31 Jul 2000 13:26:57 -0400

We did try your solution since it was the obvious one.  Consider the
following:

process x calls postnote:
	postnote(p):
		p->notepending = 1
		lock(p->rlock)
		r = p->r
		if r != 0
			lock(r)
			if(r->p == p)
				r->p = 0
				p->r = 0
				ready(p)
			unlock(r)
		unlock(p->rlock)

Immediately after the r = p->r is executed,
process q calls wakeup

	process q:
		wakeup(r):    {wakeup condition is satisfied}
			lock(r)
			p = r->p
			if p != 0
				r->p = 0
				p->r = 0	
				ready(p)
			unlock(r)

Process p now continues after the sleep:

	process p:
		sleep(r);
		free(r)

Process y now does

		xxx = malloc(234);
		xxx->a = 12;

And finally process x does its lock(r).  We've just
clobbered some other process's kernel structure.

We don't know that p->r really points to a valid
Rendezvous structure: they are in malloc'd structures
and p->r may easily be pointing to something in
another structure.  That means that the lock(r)
steps on some random piece of memory.  Usually that's
not a problem.  This took forever to track down.
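
The interleaving above can be replayed in a single thread to see the
clobber happen.  This is only a sketch under loud assumptions: the
struct layouts, toy_malloc, toy_free, toy_lock and replay_race are all
hypothetical stand-ins, not the kernel's definitions.  A one-slot
allocator stands in for malloc so that the freed memory is certain to
be handed back out, and lock(r) is modelled as nothing more than
setting the first word of the structure.

```c
#include <string.h>

/* Hypothetical stand-ins; NOT the Plan 9 kernel definitions. */
typedef struct Rendez { int lock; void *p; } Rendez;	/* lock word first */
typedef struct Other { int a; } Other;	/* process y's unrelated structure */

/* One-slot toy allocator: a freed slot is handed straight back out,
 * so the reuse is deterministic and no real freed memory is touched. */
static union { Rendez r; Other o; } slot;
static int slot_busy;

static void *toy_malloc(void)
{
	if (slot_busy)
		return 0;
	slot_busy = 1;
	memset(&slot, 0, sizeof slot);
	return &slot;
}

static void toy_free(void *v)
{
	(void)v;
	slot_busy = 0;
}

/* Stand-in for lock(r): sets the lock word at the start of the structure. */
static void toy_lock(void *l)
{
	int held = 1;
	memcpy(l, &held, sizeof held);
}

/* Replays the interleaving in one thread; returns the final xxx->a. */
int replay_race(void)
{
	Rendez *r = toy_malloc();	/* the Rendez that p sleeps on */
	void *stale = r;		/* x: r = p->r */

	toy_free(r);			/* q wakes p; p returns from sleep, frees r */

	Other *xxx = toy_malloc();	/* y: xxx = malloc(...), gets the same memory */
	xxx->a = 12;

	toy_lock(stale);		/* x: lock(r) through the stale pointer */
	return xxx->a;			/* y's field has been overwritten */
}
```

Calling replay_race() returns 1 rather than 12: the "lock" taken by x
landed on y's field.  In a real kernel the damage depends on what the
allocator hands out next, which is why it usually goes unnoticed.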

A possible solution is to use both locks in all three
places, but that runs into lock ordering problems.
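
One way to read the ordering problem, sketched below, assumes that
wakeup has to take r's lock first (it only learns p from r->p) while
postnote takes p->rlock first.  That assumption is mine, drawn from the
pseudocode above.  The Lock type and orders_deadlock are hypothetical,
and canlock here is a toy single-threaded try-lock, not the real
primitive.

```c
/* Toy single-threaded try-lock; hypothetical, for illustration only. */
typedef struct Lock { int held; } Lock;

/* Returns 1 if the lock was free and is now held, 0 if it was already held. */
static int canlock(Lock *l)
{
	if (l->held)
		return 0;
	l->held = 1;
	return 1;
}

/* Returns 1 if the two acquisition orders leave both sides stuck. */
int orders_deadlock(void)
{
	Lock rlock = {0};	/* stands for p->rlock */
	Lock rendez = {0};	/* stands for the lock in r */
	int x_blocked, q_blocked;

	canlock(&rlock);		/* x (postnote): takes p->rlock first */
	canlock(&rendez);		/* q (wakeup): takes r first */
	x_blocked = !canlock(&rendez);	/* x now wants r, held by q */
	q_blocked = !canlock(&rlock);	/* q now wants p->rlock, held by x */
	return x_blocked && q_blocked;
}
```

Here orders_deadlock() returns 1: each side holds the lock the other
needs next, so with blocking locks neither would ever proceed.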