I'm at the point in my OS where I need to start thinking about allocating userspace address spaces. This comes with a few interesting design decisions that need to be made.
To actually build the page directory, I think I can abuse recursive paging, by putting the top level directory as an entry in the kernel's top level entry at some known offset, then map the same offset in the process' top level to itself - kind of making a recursive page table, but that's accessible from the kernel address space. So far so good, I see no issues with this beyond finding the time to build it.
Separately, the kernel also needs to be mapped into all process address spaces, to allow for syscalls and interrupts. Intel provides the GLOBAL flag, which looks like it's for exactly this purpose, but the docs are very unclear on how it works in practice. AIUI, setting GLOBAL on the relevant pages in the kernel address space means that, on switching to a userspace address space, those pages will still be mapped, without my necessarily having populated the relevant entries in the userspace address spaces. However, I'm not certain that I've understood that correctly. It's also unclear whether I need to set GLOBAL on every entry recursively, or just at the bottom level, or just at the top level. Setting it on every entry probably won't hurt, but it would be good to clarify.
Does anyone else make use of GLOBAL pages for kernel mapping, who can shed some light on how to make it work in practice as opposed to just on paper? I know it can work for what I'm trying to do, it's just a question of implementation details that the Intel manuals are a bit sparse on.
Relatedly, I'm almost executing userspace programs, which is a very exciting milestone.
On understanding the x86 paging GLOBAL flag
Re: On understanding the x86 paging GLOBAL flag
Global pages do not get evicted from the TLB when reloading CR3. If you are employing a normal higher-half scheme, this will improve performance of task switches, because the kernel is mapped into both the old and new address spaces at the same place, and the TLBs can just remain.
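To make the "mapped into both the old and new address spaces" part concrete: as far as I know, GLOBAL only controls what survives in the TLB across a CR3 reload. It does not create a mapping, so the kernel entries still have to be present in every process's own top-level table. A minimal sketch of that step, assuming 4-level long-mode paging and a kernel that owns the upper 256 top-level slots (the names `share_kernel_half` and `KERNEL_PML4_FIRST` are made up for illustration, and a 32-bit kernel would copy the upper part of its page directory instead):

```c
#include <stdint.h>
#include <string.h>

#define PML4_ENTRIES      512
#define KERNEL_PML4_FIRST 256  /* assumed split: slots 256..511 belong to the kernel */

/* Hypothetical helper: prepare the top-level table of a new process.
 * The user half starts out empty; the kernel half is copied from the
 * kernel's own top-level table, so both address spaces point at the
 * same lower-level kernel tables. */
static void share_kernel_half(uint64_t *new_pml4, const uint64_t *kernel_pml4)
{
    memset(new_pml4, 0, KERNEL_PML4_FIRST * sizeof(uint64_t));
    memcpy(&new_pml4[KERNEL_PML4_FIRST],
           &kernel_pml4[KERNEL_PML4_FIRST],
           (PML4_ENTRIES - KERNEL_PML4_FIRST) * sizeof(uint64_t));
}
```

Because only the top-level entries are copied, later changes inside the shared lower-level kernel tables show up in every process without further work. Adding a new top-level kernel entry after processes exist would need propagating, which is one reason some kernels pre-allocate all kernel top-level slots at boot.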
Without looking at the documentation, I'd just set it on every level, because I have no reason not to. With the normal address-space split, I know from the highest level which pages are going to be global.
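On the "which level" question, my reading of the Intel SDM is that the G bit (bit 8) only means something in an entry that maps a page: a 4 KiB PTE, or a PDE/PDPTE with PS set. In entries that point to another table, that bit is ignored, so setting it everywhere is harmless but only the leaf matters. Global pages also do nothing until CR4.PGE (bit 7) is set. A minimal sketch, assuming 64-bit entries; the helper names are invented for illustration:

```c
#include <stdint.h>
#include <stdbool.h>

/* Architectural flag bits in an x86 paging-structure entry. */
#define PTE_PRESENT  (1ull << 0)
#define PTE_WRITABLE (1ull << 1)
#define PTE_USER     (1ull << 2)
#define PTE_GLOBAL   (1ull << 8)  /* honoured only in entries that map a page */

#define CR4_PGE      (1ull << 7)  /* must be set in CR4 for G to take effect */

/* Hypothetical helper: leaf entry for a 4 KiB kernel page, marked global. */
static inline uint64_t make_kernel_leaf(uint64_t phys, bool writable)
{
    uint64_t e = (phys & ~0xfffull) | PTE_PRESENT | PTE_GLOBAL;
    if (writable)
        e |= PTE_WRITABLE;
    return e;
}

/* Hypothetical helper: non-leaf entry pointing at a lower-level table.
 * No G bit here, since the CPU ignores it at this level. Permissions
 * are left permissive and restricted at the leaf instead. */
static inline uint64_t make_table_entry(uint64_t table_phys, bool user)
{
    uint64_t e = (table_phys & ~0xfffull) | PTE_PRESENT | PTE_WRITABLE;
    if (user)
        e |= PTE_USER;
    return e;
}
```

One thing to watch: a global entry must map to the same physical page in every address space, otherwise a stale translation can survive a CR3 switch.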
Carpe diem!
Re: On understanding the x86 paging GLOBAL flag
nullplan wrote: ↑Sun Nov 24, 2024 11:42 am
Global pages do not get evicted from the TLB when reloading CR3. If you are employing a normal higher-half scheme, this will improve performance of task switches, because the kernel is mapped into both the old and new address spaces at the same place, and the TLBs can just remain.
I remember trying this out, but giving it up. Now that you describe the function, I think I understand why it failed. My logic for TLB flushing is based on keeping a small list of pages that need to be invalidated per core, but when the page count grows above some maximum amount, I use a reload of CR3 instead, as this would be faster. Of course, on the 386, invalidate page is not even supported, which means a CR3 reload must always be used there.
So, how are you supposed to handle invalidation of hundreds of pages, some of which might be in kernel space, when the global flag is used for kernel space?
Re: On understanding the x86 paging GLOBAL flag
Why would you need to invalidate hundreds of kernel pages? That's an event that should be rare. Just invlpg all the pages that need it, is my guess.
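For reference, "just invlpg all the pages" could look roughly like the sketch below. INVLPG drops the translation for one page on the current core, and to my knowledge it does so for global entries too. The `flush_range` helper and its callback parameter are invented here so the loop can be exercised without running a privileged instruction; pass `invlpg` in kernel code, or NULL to only count pages.

```c
#include <stdint.h>
#include <stddef.h>

#define KPAGE_SIZE 4096u

#if defined(__x86_64__) || defined(__i386__)
/* Invalidate the TLB entry for one page on the current core (ring 0 only). */
static inline void invlpg(uintptr_t va)
{
    __asm__ volatile("invlpg (%0)" : : "r"(va) : "memory");
}
#endif

typedef void (*flush_page_fn)(uintptr_t va);

/* Hypothetical helper: call `flush` once per page overlapping
 * [start, start + len) and return the number of pages visited.
 * Assumes start + len does not wrap around the address space. */
static size_t flush_range(uintptr_t start, size_t len, flush_page_fn flush)
{
    uintptr_t va  = start & ~(uintptr_t)(KPAGE_SIZE - 1);
    uintptr_t end = start + len;
    size_t n = 0;

    for (; va < end; va += KPAGE_SIZE) {
        if (flush)
            flush(va);
        n++;
    }
    return n;
}
```

This only covers the local core. On SMP the same range still has to reach the other cores somehow, typically via an IPI.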
But the 386 doesn't even have global pages.
Carpe diem!
Re: On understanding the x86 paging GLOBAL flag
I can see situations where this happens, like when a large heap object in the kernel is freed. It could also be that a user process is terminated, along with freeing a small number of kernel pages. The point is also that this creates a "TLB shootdown" that must be handled by all cores in the system. IOW, it's not enough to do these invalidations on the current core; they must be done by all cores in the system. You cannot have an unlimited list of page invalidations per core, unless you allocate it dynamically, which is quite inefficient.
Also, rare cases sometimes happen, and you cannot have those bringing down your system in uncontrolled ways.
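One way to keep the per-core list bounded while still coping with global pages is to remember whether any queued page was global and pick the fallback accordingly. A CR3 reload leaves global entries in place, so the overflow path needs a flush that covers them as well (toggling CR4.PGE is one option). This is only a sketch of the bookkeeping; the structure, names, and threshold are all assumptions, not anyone's actual implementation:

```c
#include <stdint.h>
#include <stdbool.h>
#include <stddef.h>

#define TLB_BATCH_MAX 16  /* assumed threshold; the right value needs measuring */

/* Hypothetical per-core queue of pending invalidations. */
struct tlb_batch {
    uintptr_t pages[TLB_BATCH_MAX];
    size_t    count;
    bool      full_flush;   /* list overflowed: flush everything instead */
    bool      need_global;  /* at least one queued page was a global page */
};

enum tlb_action {
    TLB_NONE,         /* nothing queued */
    TLB_INVLPG_LIST,  /* INVLPG each queued page */
    TLB_RELOAD_CR3,   /* full flush, non-global entries only */
    TLB_TOGGLE_PGE    /* full flush that also drops global entries */
};

static void tlb_batch_add(struct tlb_batch *b, uintptr_t va, bool is_global)
{
    if (is_global)
        b->need_global = true;
    if (b->full_flush)
        return;                    /* already committed to a full flush */
    if (b->count == TLB_BATCH_MAX) {
        b->full_flush = true;      /* too many pages: stop tracking them */
        b->count = 0;
        return;
    }
    b->pages[b->count++] = va;
}

static enum tlb_action tlb_batch_action(const struct tlb_batch *b)
{
    if (b->full_flush)
        return b->need_global ? TLB_TOGGLE_PGE : TLB_RELOAD_CR3;
    return b->count ? TLB_INVLPG_LIST : TLB_NONE;
}
```

The list never grows past a fixed size, so no dynamic allocation is needed, and the rare "hundreds of pages" case degrades to one full flush per core rather than an overflow.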
Re: On understanding the x86 paging GLOBAL flag
Octocontrabass wrote: ↑Sun Nov 24, 2024 3:17 pm
If you really need to flush the entire TLB including global pages, you can toggle CR4.PGE.
Yes. I found this solution here: https://stackoverflow.com/questions/283 ... b-flushing
They also describe the trade-off between entire TLB invalidation and the use of INVLPG.
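A rough sketch of the CR4.PGE toggle, for anyone landing here later. As I understand the SDM, any write to CR4 that changes PGE invalidates all TLB entries, global ones included, so flipping the bit and then restoring it does the job. The function names are mine, and this assumes ring 0 on x86 with GCC/Clang inline asm. You probably want interrupts disabled around the two writes so nothing runs with the intermediate CR4 value.

```c
#include <stdint.h>

#define CR4_PGE (1ul << 7)

/* Pure helper: the intermediate CR4 value with PGE flipped. */
static inline unsigned long cr4_toggle_pge(unsigned long cr4)
{
    return cr4 ^ CR4_PGE;
}

#if defined(__x86_64__) || defined(__i386__)
static inline unsigned long read_cr4(void)
{
    unsigned long v;
    __asm__ volatile("mov %%cr4, %0" : "=r"(v));
    return v;
}

static inline void write_cr4(unsigned long v)
{
    __asm__ volatile("mov %0, %%cr4" : : "r"(v) : "memory");
}

/* Flush the whole TLB on this core, including global entries. */
static inline void tlb_flush_all_including_global(void)
{
    unsigned long cr4 = read_cr4();
    write_cr4(cr4_toggle_pge(cr4));  /* changing PGE flushes everything */
    write_cr4(cr4);                  /* restore the original setting */
}
#endif
```

Like INVLPG, this only affects the core that executes it, so a shootdown still has to run it on every core.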