LRU on SMP memory manager

Discussions on more advanced topics such as monolithic vs micro-kernels, transactional memory models, and paging vs segmentation should go here. Use this forum to expand and improve the wiki!
Post Reply
nexos
Member
Member
Posts: 1081
Joined: Tue Feb 18, 2020 3:29 pm
Libera.chat IRC: nexos

LRU on SMP memory manager

Post by nexos »

Hello,
I have been planning out my memory manager, but I had one question come to mind. I know many memory managers use the "clock algorithm", which is really approximate LRU. But, one question I have is how does this work on SMP? For example, if the clock hand changes the referenced bit, wouldn't that require a shootdown? So my question is, how do modern systems implement page replacement algorithms?
Thanks,
nexos
"How did you do this?"
"It's very simple — you read the protocol and write the code." - Bill Joy
Projects: NexNix | libnex | nnpkg
thewrongchristian
Member
Member
Posts: 426
Joined: Tue Apr 03, 2018 2:44 am

Re: LRU on SMP memory manager

Post by thewrongchristian »

nexos wrote:Hello,
I have been planning out my memory manager, but I had one question come to mind. I know many memory managers use the "clock algorithm", which is really approximate LRU. But, one question I have is how does this work on SMP? For example, if the clock hand changes the referenced bit, wouldn't that require a shootdown? So my question is, how do modern systems implement page replacement algorithms?
Thanks,
nexos
The same way as UP, but with TLB shoot downs as required, as you say. How a page mapping is changed to clear a reference bit is very platform specific, so the implementation details are hidden behind an abstraction, but the process of scanning memory for reference bits to clear should be platform agnostic, as the referenced bit should be maintained in a central page frame database. [1]

In clock, the scan rate generally tends to be proportional to memory pressure. No memory pressure, very little scanning, and so little overhead from resulting TLB shoot downs. Under high memory pressure, you'll notice thrashing overhead more than TLB shoot down overhead, so it is worth a little bit of pain to avoid thrashing.

Of course, this very much depends on your abstractions. The SVR4 and BSD VMM have a much cleaner separation between platform independent and platform specific code, than say, Linux. Linux is more like how 4 BSD prior to 4.4 expected platform specific memory management to look VAX like, and interpret the VAX like page tables onto the local CPU. [2]

Sun threw that out for SunOS (originally 4.1 BSD based?), and replaced it with a VM subsystem that used a much more abstract interface to the MMU using HAT (Hardware Address Translation) API, and was eventually subsumed into SVR4.

4.4 BSD also threw that out, replacing it with the Mach VM, again with a much more abstract interface to the MMU using the pmap API.

But in both the SunOS/SVR4 and BSD/MachVM cases, the page scanning is very much done in the platform independent portion of the VMM, and both have since scaled quite happily SMP wise.

[1] Of course, how the reference bit makes it from the platform MMU to the platform independent page frame database is the crux of the problem. My own personal code doesn't export that information from the hardware reference bit, and instead is filled in by software in response to page faults. Thus, no reference bit is required in hardware, so I should be fine when I port to VAX :)

[2] Linux and Windows provide an idealised multi-level page table in which to communicate mappings, including maintenance of reference information. I'm not sure if Linux/Windows scans these page table structures directly for reference information, or whether that information is also mirrored to the central page database.
nexos
Member
Member
Posts: 1081
Joined: Tue Feb 18, 2020 3:29 pm
Libera.chat IRC: nexos

Re: LRU on SMP memory manager

Post by nexos »

Ok thanks! I guess I'll be looking into lazy shootdown algorithms
"How did you do this?"
"It's very simple — you read the protocol and write the code." - Bill Joy
Projects: NexNix | libnex | nnpkg
Korona
Member
Member
Posts: 1000
Joined: Thu May 17, 2007 1:27 pm
Contact:

Re: LRU on SMP memory manager

Post by Korona »

The clock algorithm (at least the one described on Wikipedia) requires a reverse map (from physical to virtual addresses) though, if the CPU has virtually-mapped access bits. That sounds quite inconvenient. How do OSes that use such an algorithm typically solve this issue?
managarm: Microkernel-based OS capable of running a Wayland desktop (Discord: https://discord.gg/7WB6Ur3). My OS-dev projects: [mlibc: Portable C library for managarm, qword, Linux, Sigma, ...] [LAI: AML interpreter] [xbstrap: Build system for OS distributions].
thewrongchristian
Member
Member
Posts: 426
Joined: Tue Apr 03, 2018 2:44 am

Re: LRU on SMP memory manager

Post by thewrongchristian »

Korona wrote:The clock algorithm (at least the one described on Wikipedia) requires a reverse map (from physical to virtual addresses) though, if the CPU has virtually-mapped access bits. That sounds quite inconvenient. How do OSes that use such an algorithm typically solve this issue?
MachVM/BSD provides pmap_is_referenced, which takes a vm_page structure.

Rather than cycling through the physical page database, the BSDs link the vm_page objects onto LRU queues, so it doesn't look like even the BSDs require a reverse map. Forward mapping is enough.

I may be reading it wrong though, I was going from the NetBSD code here:

uvmpdpol_selectvictim
Post Reply