How to allocate kernel stacks so that they can grow?

jpdoane · Post by **jpdoane** » Sat Feb 06, 2021 8:39 am

I'm a bit confused about how to best manage memory for kernel stack(s) in my 32bit OS. My main question is whether kernel stacks should be allowed to grow beyond their initially allocated size (e.g. using a guard page) and if so, how to implement this. There seem to be at least 3 ways of mapping the kernel stacks that I can envision:

Option 1: Processes share all kernel memory, and kernel stacks are allocated sequentially in virtual memory. Kernel stack is fixed in size
Option 2: Processes share all kernel memory, and kernel stack virtual addresses are spaced out a bit. Kernel stacks can grow slightly, until they hit another stack
Option 3: Processes each have their own individual memory map for kernel stack(s). Each stack (or at least one per process) can grow down arbitrarily until it hits the heap

Option 1 is simple, but kernel processes need to be very careful not to overflow their stacks (no recursion, etc). I think this is what linux does? Option 2 might be a bit more flexible but still requires choosing a maximum stack size and thus has the same issue as option 1 (limited by finite virtual space rather than physical space). Option 3 seems like it would be the most flexible, but I'm not sure what pitfalls I may encounter if processes start having different kernel memory maps. And this option gets less elegant when there are multiple kernel stacks per process. Furthermore, I think I will face the fundamental issue again with user mode multithreading when multiple stacks must share the same memory space.

Its entirely possible I am overthinking this. Is there an obvious answer here or are any of these valid design choices?

Thanks in advance!

AndrewAPrice · Post by **AndrewAPrice** » Sat Feb 06, 2021 8:44 pm

Without recursion, I'd find it pretty hard to overflow a stack. What operations are you doing in the kernel that require recursion and could you implement it as a non-recursive loop?

jpdoane · Post by **jpdoane** » Sat Feb 06, 2021 9:25 pm

Im not sure - I havent written much kernel code yet, and I dont fully appreciate the implications of designing in limitations like this. Im just trying to work out what good design options are. And Ive encountered several posts (eg viewtopic.php?p=268915#p268915) that indicated that I should consider allowing for kernel stack growth. But perhaps this is impractical and/or unnecessary?

Relatedly, how does this same problem typically get solved for user mode multithreading? If the heap grows up and the stack grows down, where do you put the stack for a new thread? I guess you just have to decide a priori how much virtual memory you want to allocate respectively to heap and stack(s)? I suppose this only really matters if you have close to 4GB of physical memory and need to really optimize virtual address placement.

jpdoane · Post by **jpdoane** » Sat Feb 06, 2021 9:49 pm

After googling about usermode multithreading, I see that posix threads have a configurable stack size, which seems like a reasonable approach.

nullplan · Post by **nullplan** » Sat Feb 06, 2021 9:59 pm

Option 3 has the giant drawback that kernel threads can no longer share stack data among each other. I am implementing my main synchronization method (futexes) with list nodes allocated on stack (because the tasks have to wait for the node to become unlisted anyway), and this would just not be possible in your case.

I'm going with option 1. Each task has a three page kernel stack and a guard page (so four pages VM). Additionally, interrupts are enabled unless less than one page is left. In case of a page fault due to the guard page being hit, I have a special kernel panic message. So far, I have yet to see that message.

In user space, the threading library allocates all stacks except the primary one, so no need to worry about that.

nexos · Post by **nexos** » Sun Feb 07, 2021 7:36 am

In my OS, kernel stacks are going to be part of the u area. If you are unfamiliar with Unix, he u area contains the Thread Control Block and the kernel stack (sometimes). The TCBs will start 0x1000, and end at 0x3000. The kernel stack area will will start at 0x4000, and end at 0xFFFFF. Every kernel stack will have 4 pages reserved, with just one page actually mapped. When the it hits the unmapped area, a double fault will occur, and the kernel will recognize via the address in CR2 that the kernel stack needs to be expanded. It will then allocate the next page for the stack. If the stack has hit the maximum size, it will panic. I am doing it this very complex way to help conserve memory usage and perhaps gain a little performance boost as the stack only has what is needed mapped.

rdos · Post by **rdos** » Sun Feb 07, 2021 7:57 am

AndrewAPrice wrote:Without recursion, I'd find it pretty hard to overflow a stack. What operations are you doing in the kernel that require recursion and could you implement it as a non-recursive loop?

The most common scenario that results in kernel stack overflow is bugs in interrupt handlers that reenters themselves. In that scenario, expanding the stack will do no good and it will always end up as a panic. However, if the stack is simply too small to handle legitime stack usages & interrupt frames, then it needs to be expanded. I used to have only 512 bytes of kernel stack, but have gradually increased it to one page (4k). I don't think a larger stack is needed as that indicates overusage of the kernel stack for inappropriate purposes.

AndrewAPrice · Post by **AndrewAPrice** » Sun Feb 07, 2021 8:44 am

You'd generally want your interrupt handlers to use their own stack, not the stack of the running thread (be it kernel or userspace) at the time the interrupt was fired.

The easiest and most common thing to do is keep interrupts disabled while in the interrupt handler. Keep your interrupt handlers short and sweet. If it's a long running operation, you can put the user thread to sleep and kick off a kernel thread to do the actual processing work, then return from your interrupt handler.

You could turn your interrupt handler into a long running thread, by assigning a new stack in the TSS before calling 'sti'. (Some other work will have to be done, such as creating and scheduling a new thread object to represent your handler-turned-thread.) It's less effort to just keep interrupts disabled while in your interrupt handler and to create a kernel thread to do the work.

nexos · Post by **nexos** » Sun Feb 07, 2021 10:30 am

Or you could be like a Solaris and use interrupt threads to handle interrupts. For example, if you have a pool of five threads, you dispatch the interrupt to a thread. If the pool is empty, you create a new thread to go in the pool. I plan on doing that to make all interrupt handlers preemptible (as my kernel is eventually going to be fully preemptible).

rdos · Post by **rdos** » Sun Feb 07, 2021 10:34 am

AndrewAPrice wrote:You'd generally want your interrupt handlers to use their own stack, not the stack of the running thread (be it kernel or userspace) at the time the interrupt was fired.

Interrupt handlers never use the userspace stack. If userspace is running while the interrupt occurs, the processor will first switch to the kernel stack and then invoke the interrupt handler on top of it.

AndrewAPrice wrote: The easiest and most common thing to do is keep interrupts disabled while in the interrupt handler. Keep your interrupt handlers short and sweet. If it's a long running operation, you can put the user thread to sleep and kick off a kernel thread to do the actual processing work, then return from your interrupt handler.

You could turn your interrupt handler into a long running thread, by assigning a new stack in the TSS before calling 'sti'. (Some other work will have to be done, such as creating and scheduling a new thread object to represent your handler-turned-thread.) It's less effort to just keep interrupts disabled while in your interrupt handler and to create a kernel thread to do the work.

The drawback of disabling interrupts in the interrupt handler is lousy interrupt latency, which you don't want in a kernel that wants to be highly responsive. I usually offload most of the work to kernel server threads, but I still run most of the interrupt handlers with interrupts enabled.

jpdoane · Post by **jpdoane** » Sun Feb 07, 2021 10:38 am

Should I be differentiating between the small kernel stacks I need to set up for system calls and ISRs, vs the primary stack for persistent kernel-mode processes (e.g. scheduler, drivers, etc). The former does not need much stack space, but I could see how a small stack with only a few pages might be tight for the latter? I am just starting out with this OS project and haven't written any kernel code for my OS that actually requires a huge stack, but I have worked with linux device drivers that were fairly complex and had a pretty deep call tree. I don't want to make this more complicated than it needs to be but I do want to understand the implications of my design choices.

One (possibly dumb) idea might be to provide each kernel-mode proces with their own private stack (and heap?) below 0xc0000000, thats managed just like user-mode processes, and which is not visible to other processes. As mentioned in an earlier comment this could be a significant limitation, but perhaps not so bad as long as shared kernel memory was available via kmalloc, etc?

All of your feedback on this is really appreciated!

rdos · Post by **rdos** » Sun Feb 07, 2021 10:39 am

nexos wrote:Or you could be like a Solaris and use interrupt threads to handle interrupts. For example, if you have a pool of five threads, you dispatch the interrupt to a thread. If the pool is empty, you create a new thread to go in the pool. I plan on doing that to make all interrupt handlers preemptible (as my kernel is eventually going to be fully preemptible).

I usually have dedicated threads that receive signals from interrupt handlers. This also helps in keeping them on the same core.

rdos · Post by **rdos** » Sun Feb 07, 2021 10:43 am

jpdoane wrote:Should I be differentiating between the small kernel stacks I need to set up for system calls and ISRs, vs the primary stack for persistent kernel-mode processes (e.g. scheduler, drivers, etc). The former does not need much stack space, but I could see how a small stack with only a few pages might be tight for the latter? I am just starting out with this OS project and haven't written any kernel code for my OS that actually requires a huge stack, but I have worked with linux device drivers that were fairly complex and had a pretty deep call tree. I don't want to make this more complicated than it needs to be but I do want to understand the implications of my design choices.

One (possibly dumb) idea might be to provide each kernel-mode proces with their own private stack (and heap?) below 0xc0000000, thats managed just like user-mode processes, and which is not visible to other processes. As mentioned in an earlier comment this could be a significant limitation, but perhaps not so bad as long as shared kernel memory was available via kmalloc, etc?

All of your feedback on this is really appreciated!

I don't differentiate these. It's possible for kernel threads to create larger stacks than the usual kernel stack, but in practise, I always use the default kernel stack size.

For the scheduler, I have another 4k per-core stack that it uses when it switches between threads.

AndrewAPrice · Post by **AndrewAPrice** » Sun Feb 07, 2021 8:26 pm

jpdoane wrote:Should I be differentiating between the small kernel stacks I need to set up for system calls and ISRs, vs the primary stack for persistent kernel-mode processes (e.g. scheduler, drivers, etc).

Yes, I'd give the system call and interrupts their own stack. The scheduler, for example, triggers with a timer interrupt and would use your ISR stack. For most cases the ISRs should be short (e.g. they wake up a driver.) Even a heavy system call (e.g. if you had a system call that was "download Assassin's Creed Valhalla" which would take a while to download 50 GB) doesn't have to do all the work in the system call handler. You could either create a thread or wake up a thread that does the actual download, put the calling thread to sleep, then return from the handler into the next awake thread. No need to do any heavy work in the system call handler itself, even if by putting the calling thread to sleep while the work is being done gives the illusion to the process that it's a long running system call.

For other things the kernel may do (e.g. drivers in a monolithic kernel), unless you're planning on having a single threaded kernel that uses an event loop, there wouldn't be a "primary" kernel thread/stack. You'd have one stack per thread, and each driver/service can create as many threads as they want, and each thread would have their own stack.

A kernel's main() function usually ends along the lines of:

Code: Select all

asm("sti");
for(;;) asm("hlt");

You enable interrupts (which kicks your scheduler into gear and transfers control to a running thread), and the "primary" thread is kept around as the "idle thread" you switch to when your schedule can't find any other awake threads.

OSDev.org

How to allocate kernel stacks so that they can grow?

How to allocate kernel stacks so that they can grow?

Re: How to allocate kernel stacks so that they can grow?

Re: How to allocate kernel stacks so that they can grow?

Re: How to allocate kernel stacks so that they can grow?

Re: How to allocate kernel stacks so that they can grow?

Re: How to allocate kernel stacks so that they can grow?

Re: How to allocate kernel stacks so that they can grow?

Re: How to allocate kernel stacks so that they can grow?

Re: How to allocate kernel stacks so that they can grow?

Re: How to allocate kernel stacks so that they can grow?

Re: How to allocate kernel stacks so that they can grow?

Re: How to allocate kernel stacks so that they can grow?

Re: How to allocate kernel stacks so that they can grow?

Re: How to allocate kernel stacks so that they can grow?