uKernel Paging Management

Crazed123 · Post by **Crazed123** » Sun Dec 02, 2007 4:47 pm

I'm doing Computer Science, and I'm still alive.

Anyway, I've long since given up on my original idea of a microkernel based on portal IPC. It just doesn't create a useful environment for application development.

However, I read up on Plan 9 from Bell Labs, and consider it an extremely elegant OS. I'd like to write a micro-kernel to act as a Plan-9-like "server multiplexer" using 9P2000. File-related system calls get translated by the kernel into 9P2000 messages (or messages in a slightly modified protocol, see below), which the kernel then sends to actual servers. Servers, however, are implemented as portals, with absolutely fixed protocols (i.e. parameter and return-value passing conventions) made only for receiving 9P requests and returning responses, instead of as processes that perform complicated, standard asynchronous message-passing.

I want it to be a micro-kernel so that it can scale up or down, from embedded devices like smart-phones to huge servers in corporate data-centers. This way, users could construct a computer system tailored to their needs by linking various desired devices running this system and having them authenticate with one another.

However, a micro-kernel always has the trouble of how to handle virtual memory. Many simply don't support it, but I don't believe a modern general-purpose operating system can really go without it. I liked how L4 handled the problem by providing a mechanism for user-space to handle its own damn paging, but that complicated the IPC APIs beyond the point of usefulness.

Can anyone think of a good virtual memory mechanism for a microkernel that doesn't complicate the APIs that way? The best I've been able to think of is adding a mechanism for sending single pages that would look and work like the rendezvous call from Plan 9.

mystran · Post by **mystran** » Sun Dec 02, 2007 9:31 pm

Yup. Consider device drivers (I mean, those that actually talk to real devices) part of the kernel. Now you can have a swap partition (or at least a swap-disk if nothing else) which is independent of anything outside the kernel.

Problem solved.

And yes, I know that's against the spirit of uKernels but if you want the kernel to dynamically allocate memory (that is, you don't want to reserve physical memory in the kernel for the worst-case) then you basically have two options:

1. introduce real-time constraints into the system by requiring virtual memory servers to be able to free memory within some amount of time, and then somehow manage to not starve them when you run out of memory and need them to free some for you (the kernel).

2. just accept that physical disks are part of the "system" just like processors and physical memory, and handle them in kernel

Crazed123 · Post by **Crazed123** » Tue Dec 04, 2007 8:36 am

Well the idea was that a micro-kernel without swapping code could be used on devices that don't have or need a swap partition.

And the idea of the VMM system was to allow user-level servers to control swapping and memory mapping without outsourcing the mechanism of virtual memory (control of page tables) to a user-level server. Think like L4.

Of course, it is a ***** to make a working API.

mystran · Post by **mystran** » Tue Dec 04, 2007 12:32 pm

Crazed123 wrote: And the idea of the VMM system was to allow user-level servers to control swapping and memory mapping without outsourcing the mechanism of virtual memory (control of page tables) to a user-level server. Think like L4.

The main problem is how to regulate the memory..

L4 says that the initial memory server gets all the memory, at which point it's that server's problem how to regulate it. The kernel then need not care. The problem still remains though: there has to be a centralized entity that decides who gets how much memory. If you ever allow allocations to be overcommitted (which is normal on most systems, 'cos it's typical that many programs allocate more than they really need.. say 1 MB for each thread's stack, even if only a couple of pages were needed) then you need some method for either swapping out to disk when you run out of memory (which allows overcommit of memory but not swap space.. but swap space is cheaper so who cares) or forcibly freeing memory at times (think of the OOM killer in Linux).

Anyway, the problem is, if you offload memory management to userspace, you can no longer allocate memory in kernel. Of course you can reserve some amount of memory for the kernel, then let userspace manage the rest, but if you run out of the reserved memory in kernel, it's practically impossible to figure out how to reserve more.

So in order to manage memory in userspace, you basically have to be able to live with statically allocated memory in kernel. That basically means that your kernel can't have a malloc(), or at least that you'll have to decide the size of your kernel heap at boot time, then live with whatever you have.
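To make the "decide at boot, then live with it" constraint concrete, here is a minimal sketch in C of a kernel heap that is one static arena with a bump pointer. The names `KHEAP_SIZE`, `kalloc_static` and `kheap_remaining` are made up for illustration and the 64 KiB size is an arbitrary assumption; this is not how L4 or any particular kernel does it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical kernel heap size, fixed before the kernel ever runs.
 * 64 KiB is an arbitrary illustrative value. */
#define KHEAP_SIZE (64u * 1024u)

/* The whole kernel heap: a static arena, 8-byte aligned via uint64_t. */
static uint64_t kheap_words[KHEAP_SIZE / sizeof(uint64_t)];
static size_t kheap_used = 0;

/* Bump allocator over the static arena. Returns NULL once the arena is
 * exhausted: there is nowhere to get more memory from, because
 * everything else belongs to the userspace memory managers. */
void *kalloc_static(size_t size)
{
    size_t aligned = (size + 7u) & ~(size_t)7u;   /* round up to 8 bytes */

    if (size == 0 || aligned < size)              /* empty request or overflow */
        return NULL;
    if (aligned > KHEAP_SIZE - kheap_used)        /* would not fit */
        return NULL;

    void *p = (uint8_t *)kheap_words + kheap_used;
    kheap_used += aligned;
    assert(kheap_used <= KHEAP_SIZE);             /* arena never overruns */
    return p;
}

/* Bytes still available; this number only ever goes down. */
size_t kheap_remaining(void)
{
    return KHEAP_SIZE - kheap_used;
}
```

Once `kalloc_static` starts returning NULL, the only fix in this model is a rebuild or reboot with a bigger `KHEAP_SIZE`, which is exactly the sizing problem.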

That sounds simple enough, until you think about it in detail... say, how many threads are we going to have running at a time? If you reserve kernel memory structures for ten thousand threads, and you only need a hundred, you're wasting quite a bit of memory. On the other hand, if you go too low, you might run against the thread max.

Systems that manage memory in kernel can easily cap the number of threads (and other similar things) to say "half the physical memory", and if they only need 1MB then they only use 1MB, but they have the ability to take another GB of memory for the purpose if it's needed. Linux for example does this.
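As rough arithmetic, a cap like that might be computed as below. This is only a sketch: `KTHREAD_BYTES` (per-thread kernel cost) and `max_threads_for` are hypothetical names, and the 8 KiB figure is an assumption rather than a number from Linux or any other kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed kernel-side cost of one thread (control block plus kernel
 * stack). 8 KiB is an illustrative guess. */
#define KTHREAD_BYTES (8u * 1024u)

static_assert(KTHREAD_BYTES > 0, "per-thread cost must be non-zero");

/* Upper bound on threads if thread bookkeeping may consume at most
 * half of physical memory. It is only a limit: memory is actually
 * taken per thread as threads are created, not reserved up front. */
uint64_t max_threads_for(uint64_t phys_bytes)
{
    return (phys_bytes / 2) / KTHREAD_BYTES;
}
```

With 1 GiB of physical memory this gives a cap of 65536 threads, but a system running a hundred threads still only pays for a hundred.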

Of course if you know at boot time how much of whatever resource you are going to need (say, an embedded system) then this is a non-issue, but for such a system you probably also know how much memory you need for whichever process, and even a memory manager isn't necessary, just convenient.

One approach is to have static tables of pointers for the maximum number of each object type (like the threads), then require the userspace entity wanting to create a new thread to also give the memory required to manage the kernel-space parts of it. But that means you'll need to be able to kill a thread of process A because A's memory manager wants back the page that A told the kernel to use for its thread data... and if you'd like to avoid this, you somehow need to communicate to A's memory manager that this page is more important than the other pages of A.
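A minimal sketch of that scheme in C, to show where the ugly part sits. All names here (`MAX_THREADS`, `thread_slot`, `thread_create`, `page_revoke`) are hypothetical, and a real kernel would also have to validate and map the donated page, which is left out.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_THREADS 8   /* illustrative static cap on threads */

/* One statically allocated slot per possible thread. The actual
 * per-thread kernel data lives in a page donated by userspace. */
struct thread_slot {
    void *donated_page;  /* NULL means the slot is free */
    int   owner;         /* id of the process that donated the page */
};

static struct thread_slot thread_table[MAX_THREADS];

/* Create a thread backed by a page the caller gives up to the kernel.
 * Returns the slot index, or -1 on a bad page, a page that is already
 * in use, or a full table. */
int thread_create(int owner, void *donated_page)
{
    int free_slot = -1;

    assert(owner >= 0);              /* owner ids are non-negative in this sketch */
    if (donated_page == NULL)
        return -1;

    for (int i = 0; i < MAX_THREADS; i++) {
        if (thread_table[i].donated_page == donated_page)
            return -1;               /* page already backs another thread */
        if (free_slot < 0 && thread_table[i].donated_page == NULL)
            free_slot = i;
    }
    if (free_slot < 0)
        return -1;                   /* ran against the thread max */

    thread_table[free_slot].donated_page = donated_page;
    thread_table[free_slot].owner = owner;
    return free_slot;
}

/* The owner's memory manager takes the page back. The kernel has no
 * choice but to kill the thread living on it. Returns the index of
 * the killed thread, or -1 if the page backed no thread. */
int page_revoke(void *page)
{
    if (page == NULL)
        return -1;
    for (int i = 0; i < MAX_THREADS; i++) {
        if (thread_table[i].donated_page == page) {
            thread_table[i].donated_page = NULL;   /* thread is gone */
            return i;
        }
    }
    return -1;
}
```

`page_revoke` is the whole problem in one function: nothing in this interface lets process A tell its memory manager that this particular page should be the last one to go.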

Oh... and did I mention the trouble of making intelligent swapping decisions? Like, say you wanna do LRU or even the clock algorithm? That basically means being able to access (read/write) the page tables of processes while looking for a page to throw away. In kernel this is simple. In a userspace memory manager (that is not allowed to write to page tables) it basically means doing a kernel call for each table entry you want to reset a flag on. You could of course move such policies to the kernel, but then what's the point of having a userspace memory manager in the first place?
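To put a number on that overhead, here is a toy C sketch of the clock algorithm as a userspace manager would have to run it. The "kernel call" `sys_test_and_clear_referenced` is a made-up stub that just counts invocations, and the `referenced` array stands in for the accessed bits in the real page tables; none of this is an actual L4 or Plan 9 interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NPAGES 4   /* tiny illustrative page set */

/* Stand-in for the accessed bits held in kernel-owned page tables. */
static bool referenced[NPAGES];

/* Counts how many times userspace had to trap into the kernel. */
static unsigned long syscall_count;

/* Hypothetical kernel call: read and clear one page's referenced bit.
 * A userspace manager cannot touch the page tables itself, so every
 * single entry costs one of these. */
static bool sys_test_and_clear_referenced(size_t page)
{
    assert(page < NPAGES);
    syscall_count++;
    bool was_referenced = referenced[page];
    referenced[page] = false;
    return was_referenced;
}

static size_t clock_hand;   /* next page the hand will inspect */

/* Classic clock sweep: give each referenced page a second chance by
 * clearing its bit, and evict the first page found with the bit
 * already clear. */
size_t clock_pick_victim(void)
{
    for (;;) {
        size_t page = clock_hand;
        clock_hand = (clock_hand + 1) % NPAGES;
        if (!sys_test_and_clear_referenced(page))
            return page;
    }
}
```

In the worst case (every page recently referenced) one eviction decision costs `NPAGES + 1` kernel calls here, where an in-kernel implementation would do the same sweep with plain memory accesses.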

Finally, if you attempt to put several memory managers in a row, you basically have all the above-mentioned problems on each level, not to mention tons of extra overhead.

Really. I'd like the kernel to manage memory. If memory is physical memory then fine, applications could still be enabled to do their own swapping (which is easier than managing memory in kernel). If memory is virtual memory, then you need the drivers for the swap device in the kernel.

Anyway... L4 is a nice research toy. In practice, there are a lot of design decisions in L4 that basically go like "let's simplify the kernel, then prove that userspace mechanisms are theoretically sufficient" with the result that yes, the userspace mechanisms are theoretically sufficient, but what you win in simplicity of kernel, you get back as orders of magnitude more complexity in userspace, and where you win in performance of the kernel itself, you lose as orders of magnitude more overhead in the system as a whole.

In short: L4 really sucks big time once you look outside the kernel.

Colonel Kernel · Post by **Colonel Kernel** » Tue Dec 04, 2007 2:31 pm

Wow... that's the most complete and useful summary of these issues that I've ever seen. It's the content of maybe 3 or 4 long threads from the days of yore condensed down into one post.

I tend to agree with mystran, except for one thing: I think it is possible (perhaps even practical) to have an external swapper in user space. In my OS design (still unimplemented sadly, given my busy work/life) all the policies, management of page tables, etc. are in the kernel, but the actual swap I/O is in a trusted process in user space. The API between the two is not at all elegant, but I think it will work. However, one simplification in my overall design is that my OS will not support shared memory. This has the side effect of simplifying kernel/swapper interactions somewhat.

I'm at work now and don't have my design notes in front of me, but if anybody's curious I can go into more detail later.