
From: Alexander Viro <viro@mat...>
Subject: Re: [9fans] rfork(), getss() etc etc
Date: Sat, 2 Sep 2000 06:52:57 -0400 (EDT)



On Sat, 2 Sep 2000 nigel@9fs.org wrote:

> >>	(Thread_Data *) (ESP & -Alignment) + Alignment - sizeof(Thread_Data)
> 
> Hadn't escaped my radar. We're getting into machine dependency here again,
> but it is a solution that I had tried.
> 
> >>	consequences - you are welcome, just let's avoid imitating *.advocacy.
> 
> One man's advocacy is another man's technical discussion. I simply do not
> buy the "clone() is perfect and cannot be changed" attitude, or for that
> matter the "FreeBSD rfork() is perfect and cannot be changed" attitude either.

FreeBSD rfork() can still be used to panic the box ;-/

> In both cases, the problem could be solved by adding a spot of functionality,
> and taking away none.
> 
> So one could add this feature, break nothing, and aid a whole class of applications.
> Why not?

	OK, I'll try to describe the reasons. Let's hope that I'm awake
enough to do that...

	Splitting the stack means that we get two classes of
pointers - stack and non-stack ones. E.g. if you are doing coroutines you
can't pass pointers to auto variables even if their lifetimes are OK,
since the same stack address refers to different memory in each process.
It's not nice, to put it mildly.
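
	A minimal sketch of the coroutine case, assuming a glibc-style
<ucontext.h> (makecontext/swapcontext). The names shared_slot,
coroutine_body and bump_via_coroutine are made up for illustration. The
coroutine runs on its own malloc'ed stack, yet it writes through a pointer
to an auto variable of its caller. That only works because both stacks sit
at distinct addresses in one address space.

```c
#include <assert.h>
#include <stdlib.h>
#include <ucontext.h>

static ucontext_t main_ctx, co_ctx;
static int *shared_slot;	/* hypothetical hand-off: points at the caller's auto variable */

/* Runs on the coroutine's own stack, but writes into the caller's frame. */
static void coroutine_body(void)
{
	assert(shared_slot != NULL);
	*shared_slot += 1;
	/* Returning here resumes the context named in uc_link. */
}

/* Returns start + 1 on success, -1 if the coroutine could not be set up. */
int bump_via_coroutine(int start)
{
	int counter = start;		/* auto variable on the caller's stack */
	size_t stack_size = 64 * 1024;
	char *stack = malloc(stack_size);

	if (stack == NULL)
		return -1;
	if (getcontext(&co_ctx) != 0) {
		free(stack);
		return -1;
	}

	shared_slot = &counter;
	co_ctx.uc_stack.ss_sp = stack;
	co_ctx.uc_stack.ss_size = stack_size;
	co_ctx.uc_link = &main_ctx;	/* where to go when coroutine_body returns */
	makecontext(&co_ctx, coroutine_body, 0);

	/* Switch stacks; we come back here once the coroutine finishes. */
	if (swapcontext(&main_ctx, &co_ctx) != 0) {
		free(stack);
		return -1;
	}

	free(stack);
	return counter;
}
```

	With a private stack segment mapped at the same address in every
process, the equivalent hand-off between two processes would dereference
the receiver's own stack page instead of the sender's.
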
	On the kernel side we would have to use separate page tables for
every process, even if they share a VM context. That has a lot of
interesting implications. One of them is that unmapping becomes very
expensive. Another, and that's more serious, is that _every_ context switch
leads to a complete TLB flush.
	You will not notice the effect by simply benchmarking schedule(),
but you will get a big slowdown spread over the userland. It's not pure
theory - the effect is quite visible.
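
	A toy model of the context-switch cost, not kernel code. It assumes
hardware without address-space tags in the TLB, where loading a different
page table root flushes the TLB. struct task, mm_id, pgd_id and
count_tlb_flushes are hypothetical names used only for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Toy description of a schedulable task. */
struct task {
	int mm_id;	/* which VM context the task belongs to */
	int pgd_id;	/* which page table root the hardware must load for it */
};

/*
 * Count the TLB flushes caused by running the tasks in the given order,
 * under the assumption that a flush happens exactly when the page table
 * root changes between two consecutive tasks.
 */
size_t count_tlb_flushes(const struct task *run_order, size_t n)
{
	size_t flushes = 0;

	assert(run_order != NULL || n == 0);
	for (size_t i = 1; i < n; i++)
		if (run_order[i].pgd_id != run_order[i - 1].pgd_id)
			flushes++;
	return flushes;
}
```

	Four tasks in one VM context that share a single page table root
give zero flushes in this model. Give each of them its own root, as a
private stack segment would require, and every switch between them counts
as a flush.
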
	Trying to work around that would make a serious mess of the VM
code. If you have a clean way to do that - great. So far all proposals
have been extremely messy.
	Moreover, we _have_ support for a large number of mappings. It
makes the situation very different - a solution that works for Plan 9 will
break horribly on such types of use. "Don't do it, then" is a nice policy,
but it works both ways.
	Having the "same VM - same memory" policy simplifies things in a
big way.

	IOW, mixing these things will require serious changes in whichever
kernel tries it, and I'm less than sure that it's worth the trouble. The
same goes for doing tons of segments on the Plan 9 side (bloated kernel
memory and a serious slowdown, or changing the data structures in
not-too-obvious ways).

	Features may be nice, but they must be doable in a clean way.
Otherwise you end up with SVR4 on your hands.

	One of the things that might make sense on our side would be
sharing a VMA (more or less equivalent to a Plan 9 segment) between several
VMs. Right now I don't see how to do it without very bad behaviour in the
case of a VM with many areas. Hell knows, it might be doable. However, the
semantics of mmap() become rather interesting with such a change. And life
without a feature is IMO better than life with an ugly kludge.