Memory management idea

Post by oscoder »

Hi,
For my OS design I've recently thought of an interesting idea for memory management. Instead of the kernel allocating a heap-space for each process, I plan to allow each process control of its own address space by providing calls 'enable_memory(address)' and 'disable_memory(address)'. My reasons for this are that it makes kernel code a little simpler and gives applications greater flexibility, but might there be any other problems/advantages with this design?

OScoder

Post by AndrewAPrice »

One program could map any part of system/kernel memory into itself.
Last edited by AndrewAPrice on Sun Nov 11, 2007 3:46 am, edited 1 time in total.
My OS is Perception.

Post by Avarok »

:shock: Well sir, you *could* make each application set its own space. They used to do that in the DOS days. Now the compiler specifies the values for us (they're in the program header) and the OS uses the values specified there.

It's only a good idea if you can suggest a good reason to do it compared to that. I can't see it, myself, off the top.

Post by AndrewAPrice »

oscoder wrote:My reasons for this are that it makes kernel code a little simpler and gives applications greater flexibility
How so?

Also, if programs manage their own memory, where will a list of free/used areas be stored? E.g. How does program 1 know not to use 0x45327->0x54420 because program 2 is also occupying that area of memory? This information is usually stored in the kernel, and programs rely on the kernel to give them an area of memory to work with. Paging/segmentation is merely a layer on top of this.

If you're talking about your userland malloc/free implementation, then this should be up to the application, unless you wish to syscall each malloc/new request. :D
Last edited by AndrewAPrice on Fri Feb 05, 2021 2:51 pm, edited 2 times in total.

Post by nick8325 »

I think the idea is fine, as long as the addresses you're talking about are virtual. (I think MessiahAndrw has assumed they're physical.) Both UNIX and Windows have functions like your enable_memory/disable_memory (mmap/munmap on UNIX, VirtualAlloc/VirtualFree on Windows).
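
For illustration, here is a minimal sketch of how the two proposed calls could be emulated on a POSIX system with mmap. The names reserve_region, enable_memory and disable_memory are hypothetical (only the last two come from the original post, and their semantics here are an assumption), and the sketch only touches pages inside a span it reserved itself, because MAP_FIXED silently replaces whatever was mapped there before.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS on glibc in strict ISO C modes */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Reserve a span of address space without backing it.
   PROT_NONE pages fault on any access. Returns NULL on failure. */
void *reserve_region(size_t len)
{
    void *p = mmap(NULL, len, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical enable_memory(): back the page at addr with zeroed
   read/write memory. Only call this inside a reserved region. */
int enable_memory(void *addr)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    assert((uintptr_t)addr % page == 0);   /* addr must be page-aligned */
    void *p = mmap(addr, page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    return p == MAP_FAILED ? -1 : 0;
}

/* Hypothetical disable_memory(): drop the page's contents but keep the
   address reserved, so nothing else gets mapped there by accident. */
int disable_memory(void *addr)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    assert((uintptr_t)addr % page == 0);
    void *p = mmap(addr, page, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    return p == MAP_FAILED ? -1 : 0;
}
```

A caller would reserve a region once at startup and then enable or disable individual pages inside it. A page that is disabled and enabled again comes back zero-filled.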

Post by JAAman »

No, this isn't safe even for virtual addresses. (Although you really can't separate the two: if you allow the process to write to the page tables, it can set whatever it wants. I think this is what he was talking about: taking memory allocation out of the kernel.)

But even if you do it that way, if an application decides to allocate itself a portion of kernel memory, then it has access to protected memory which it shouldn't have.

I think what you are talking about is more the process being able to request whatever memory it wants (where physical addresses are still handled by the kernel, and restricted addresses can be refused). However, his reason was to simplify the kernel by removing memory management from it. My conclusion, then, is that he wanted to turn control of the page tables over to the application...
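
To make "restricted addresses can be refused" concrete, here is a rough sketch of the check a kernel might run before honouring a mapping request. Everything in it is an assumption for illustration: a 32-bit address space, a 4 KiB page, a 3 GiB user/kernel split, and the names user_range_ok, USER_BASE and KERNEL_BASE.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_BYTES  4096u         /* assumed page size */
#define USER_BASE   0x00400000u   /* hypothetical lowest address a process may map */
#define KERNEL_BASE 0xC0000000u   /* hypothetical start of kernel space */

/* True if [addr, addr + len) is page-aligned and lies wholly in user space. */
bool user_range_ok(uint32_t addr, uint32_t len)
{
    if (len == 0 || addr % PAGE_BYTES != 0 || len % PAGE_BYTES != 0)
        return false;
    if (addr < USER_BASE || addr >= KERNEL_BASE)
        return false;
    /* Compare against the remaining room so addr + len can never overflow. */
    return len <= KERNEL_BASE - addr;
}
```

The overflow-safe comparison matters: a naive `addr + len <= KERNEL_BASE` wraps around for large lengths and would let a request reach into kernel space.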

Post by nick8325 »

I got the impression he was just trying to remove virtual memory allocation from the kernel...of course, if the system call allows the program to read and write the kernel's memory, or the memory of other processes, then it's not safe. I suppose we won't be able to tell which the OP means until he replies :)

Post by Craze Frog »

oscoder wrote:Hi,
For my OS design I've recently thought of an interesting idea for memory management. Instead of the kernel allocating a heap-space for each process, I plan to allow each process control of its own address space by providing calls 'enable_memory(address)' and 'disable_memory(address)'. My reasons for this are that it makes kernel code a little simpler and gives applications greater flexibility, but might there be any other problems/advantages with this design?

OScoder
This is exactly the normal way to do it (assuming, of course, that the program can't deallocate memory that was mapped by the kernel). On top of this sits a malloc implementation in userspace, and this is what the actual program calls.
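
As a sketch of that layering, here is about the simplest possible userspace allocator sitting on top of a page-granting call. It uses POSIX mmap as the stand-in for the kernel call. The name bump_alloc and the 1 MiB arena size are made up for the example, and it never frees anything, so it is a toy rather than a real malloc.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS on glibc in strict ISO C modes */
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#define ARENA_BYTES (1u << 20)   /* arbitrary 1 MiB arena for the sketch */

static unsigned char *arena_base;   /* start of the arena, mapped on first use */
static size_t arena_used;           /* bytes handed out so far */

/* Hand out n bytes, 16-byte aligned. Returns NULL when the arena is full. */
void *bump_alloc(size_t n)
{
    if (arena_base == NULL) {
        /* One trip into the kernel fetches the pages... */
        void *p = mmap(NULL, ARENA_BYTES, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED)
            return NULL;
        arena_base = p;
    }
    /* ...and every later request is served purely in userspace. */
    size_t start = (arena_used + 15u) & ~(size_t)15u;
    if (n > ARENA_BYTES - start)
        return NULL;
    arena_used = start + n;
    return arena_base + start;
}
```

The point is the division of labour: the kernel only deals in whole pages, while the small, frequent requests never leave userspace.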

Post by dxcnjupt »

This has been done by L4.
See the article "User-Level Management of Kernel Memory".

If you just want to let the process control its own memory (but not the kernel's), there are other ways to do that.

Post by oscoder »

nick8325 wrote: I got the impression he was just trying to remove virtual memory allocation from the kernel...of course, if the system call allows the program to read and write the kernel's memory, or the memory of other processes, then it's not safe. I suppose we won't be able to tell which the OP means until he replies :)
You've got it, I think! The idea is that, instead of the kernel managing the process's heap, etc., it presents two system calls (plus a few more for setting memory properties) that enable and disable pages of user memory. When the OS gets a call, it allocates physical memory and maps the given page to it.

There is added flexibility mainly because the developer can set pages as readonly, executable, etc.
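
A small sketch of what "setting memory properties" looks like with the existing POSIX calls, for comparison: mprotect changes the protection of pages that are already mapped. The helper name make_readonly_page is hypothetical.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS on glibc in strict ISO C modes */
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map one writable page, copy msg into it, then make the page read-only.
   Returns the page, or NULL on failure. Later writes to it would fault. */
const char *make_readonly_page(const char *msg)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    char *p = mmap(NULL, page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    /* The fresh page is zero-filled, so the copy stays NUL-terminated. */
    strncpy(p, msg, page - 1);
    if (mprotect(p, page, PROT_READ) != 0) {
        munmap(p, page);
        return NULL;
    }
    return p;
}
```

The same call with PROT_READ | PROT_EXEC is how a loader or JIT would mark a page executable, subject to whatever policy the OS enforces.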
nick8325 wrote: Both UNIX and Windows have functions like your enable_memory/disable_memory (mmap/munmap on UNIX, VirtualAlloc/VirtualFree on Windows).
This was the kind of information I was looking for. Do these OSes also provide a memory map to let the process know what parts of memory are not used and so can be allocated?

Thanks,
OScoder

Post by mystran »

oscoder wrote: This was the kind of information I was looking for. Do these OS'es also provide a memory map to let the process know what parts of memory are not used and so can be allocated?
Is that even necessary? If you make sure that the process (or a dynamic-linker on behalf of the process, or whatever) is always loaded at a known address, and then heap starts after that, and ends at a known offset (say, where kernel memory starts) and then let the process itself (or the dynamic linker on behalf of the process) map everything including stacks into that known region, then all the information required is already in userspace, right?

:)

OK, the dynamic loader and CRT startup code that gets it all rolling for a new process will necessarily get a bit low-level and complicated, somewhat like the lowest levels of kernel code, but since that code is the same for every process (and ideally lives in a shared library) it doesn't really matter.

Well, you can of course have such a system call to query the kernel for a map, if you maintain such a map in the kernel. But in any case I'd design process loading as something like:

1. load (or mark for demand-loading) dynamic linker-loader at known address
2. load (blah blah) the main program binary at another known address after the linker-loader
3. jump to the entry point of the linker-loader, and let it fix up the rest of the environment for the main binary, including setting up the stack and loading any necessary libraries and such
4. jump into the main binary's CRT startup code, which can then set up heaps and whatever, and then call the actual application main()

Now, if the linker tells the CRT where the stack is and where the free area for the heap is, then the CRT can manage the region allocations by itself quite fine. :)
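
To show that the bookkeeping really can live entirely in userspace, here is a sketch of a tiny first-fit region tracker that a CRT could keep once it has been told the bounds of the free area. The names (region_map, region_reserve, region_release) and the fixed-size table are assumptions for the example; actually mapping the pages would still go through the kernel.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_REGIONS 32

struct region { uintptr_t start, end; };   /* half-open: [start, end) */

struct region_map {
    uintptr_t lo, hi;                  /* free area reported by the loader */
    struct region used[MAX_REGIONS];   /* sorted by start, non-overlapping */
    size_t count;
};

/* First fit: claim the lowest gap of at least len bytes.
   Returns the start of the new region, or 0 if nothing fits. */
uintptr_t region_reserve(struct region_map *m, size_t len)
{
    if (len == 0 || m->count == MAX_REGIONS)
        return 0;
    uintptr_t cursor = m->lo;
    size_t i;
    for (i = 0; i < m->count; i++) {
        if (m->used[i].start - cursor >= len)
            break;                     /* gap before used[i] is big enough */
        cursor = m->used[i].end;
    }
    if (i == m->count && m->hi - cursor < len)
        return 0;                      /* tail gap is too small as well */
    for (size_t j = m->count; j > i; j--)
        m->used[j] = m->used[j - 1];   /* shift up to keep the table sorted */
    m->used[i].start = cursor;
    m->used[i].end = cursor + len;
    m->count++;
    return cursor;
}

/* Forget the region that begins at start. Returns 0, or -1 if unknown. */
int region_release(struct region_map *m, uintptr_t start)
{
    for (size_t i = 0; i < m->count; i++) {
        if (m->used[i].start == start) {
            for (size_t j = i + 1; j < m->count; j++)
                m->used[j - 1] = m->used[j];
            m->count--;
            return 0;
        }
    }
    return -1;
}
```

With something like this in the CRT, the kernel never needs to answer "where is free space?", only "map this page" and "unmap this page".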

Of course that's just how I intend to make it eventually. YMMV.
The real problem with goto is not with the control transfer, but with environments. Properly tail-recursive closures get both right.

Post by mystran »

Oh, btw, the NT kernel supposedly always keeps a one-to-one relationship between mapped regions of memory and regions of files. Well, except for shared memory, which can't be one-to-one, but let's ignore that.

The advantage is that mapping a region of memory now just involves storing somewhere in the kernel that "this area of this program's memory refers to this file." If it's not really part of any file, then a sufficient amount of space is allocated from the page file, and that's used instead.

Now, this system is quite nice, because it gives you the following essentially for free:

- you don't need to load anything before a page faults, since you know where the data is coming from.. either you load it from the file that was mapped, or you give a zero page, if it's a region of pagefile that's marked to contain all zeroes (if the FS supports sparse files, you don't even need anything extra for this)

- you don't need to figure out where to store stuff when you need to free the physical memory for other uses... either you write it to the file that was mapped (possibly pagefile), or if it wasn't modified, you just discard it (it's in the file anyway, right?)

- as a side-effect you now support memory mapping of files.
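
A sketch of why this comes "for free": if every mapping records which file region backs it, a page-fault handler only has to look up the faulting address to know what to read in. The struct layout, the names, and the PAGEFILE_ID convention here are assumptions for illustration, not how NT actually stores its mappings.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_BYTES  4096u   /* assumed page size */
#define PAGEFILE_ID (-1)    /* hypothetical id for pagefile-backed memory */

/* One mapped region: which span of which file backs which addresses. */
struct mapping {
    uintptr_t addr;      /* page-aligned start address */
    size_t    len;       /* length in bytes */
    int       file_id;   /* hypothetical file handle, or PAGEFILE_ID */
    uint64_t  file_off;  /* offset of the region within that file */
};

struct backing { int file_id; uint64_t offset; };

/* On a fault at fault_addr, work out which file page has to be loaded.
   Returns false if the address is not mapped (a real access violation). */
bool resolve_fault(const struct mapping *maps, size_t n,
                   uintptr_t fault_addr, struct backing *out)
{
    for (size_t i = 0; i < n; i++) {
        const struct mapping *m = &maps[i];
        if (fault_addr >= m->addr && fault_addr - m->addr < m->len) {
            uintptr_t page_start = fault_addr & ~(uintptr_t)(PAGE_BYTES - 1);
            out->file_id = m->file_id;
            out->offset  = m->file_off + (page_start - m->addr);
            return true;
        }
    }
    return false;
}
```

Eviction is the same lookup run in reverse: the record says where a dirty page has to be written back to.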

So you get memory mapping of files, demand-loading, and virtual memory together. Oh and it gets better:

- to implement shared memory, all you need is a buffer cache that stores the cached disk data on pages of memory, and allows direct mapping of those pages into processes. You can now map the same file (or region of pagefile) into two processes, and the buffer cache gives you the same pages for each process. And if you propagate "modified" flags from the page tables back to the buffer cache for write-back, shared memory pages can be reliably swapped just like anything else.
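
The user-visible effect of such a shared cache can be seen on any POSIX system: map the same file twice with MAP_SHARED and a write through one view shows up in the other. A minimal sketch, with a made-up function name and a temporary file standing in for the shared object:

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Map one temporary file twice, write through the first view and read
   through the second. Returns the byte seen, or -1 on any failure. */
int shared_views_demo(unsigned char value)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    FILE *f = tmpfile();
    if (f == NULL)
        return -1;
    int fd = fileno(f);
    int seen = -1;
    if (ftruncate(fd, (off_t)page) == 0) {   /* give the file one page */
        unsigned char *a = mmap(NULL, page, PROT_READ | PROT_WRITE,
                                MAP_SHARED, fd, 0);
        unsigned char *b = mmap(NULL, page, PROT_READ, MAP_SHARED, fd, 0);
        if (a != MAP_FAILED && b != MAP_FAILED) {
            a[100] = value;   /* write through view A */
            seen = b[100];    /* same underlying page, seen through view B */
        }
        if (a != MAP_FAILED)
            munmap(a, page);
        if (b != MAP_FAILED)
            munmap(b, page);
    }
    fclose(f);
    return seen;
}
```

Both views here live in one process to keep the example short. Two separate processes mapping the same file behave the same way.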

Finally, for memory-mapped I/O (say video memory), just make the device driver present a virtual file that can be mapped into processes. Basically all you need is a fake buffer cache that just hands over the real video memory addresses when asked for a page.

The only thing that this model does NOT make easy is the traditional Unix style of creating new processes: supporting copy-on-write style fork() means your filesystems need to support copy-on-write... which is why Windows builds new processes from scratch (?) instead of making copies of other processes.
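
For reference, this is the behaviour fork() has to provide, whatever mechanism sits underneath: after the call, parent and child each see their own copy of memory. A small sketch with a hypothetical function name. Note that copy-on-write itself is an implementation detail and cannot be observed from this code, only the isolation it produces.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork, let the child overwrite a variable, and report what each side saw.
   Returns 0 on success, -1 if fork or wait failed. */
int fork_isolation_demo(int *parent_sees, int *child_saw)
{
    volatile int value = 1;
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        value = 2;      /* lands on the child's private copy of the page */
        _exit(value);   /* report the child's view via the exit status */
    }
    int status;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    *child_saw = WEXITSTATUS(status);
    *parent_sees = value;   /* the parent's copy is untouched */
    return 0;
}
```

On a successful run the child reports 2 while the parent still reads 1.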

Anyway, just some food for your thoughts.

Post by crazygray »

Somebody wrote: Are you sure laziness isn't your motive for this idea?
Imagine if a creature came from a 4 dimensional world, would he think you to be flat?

Post by JoeKayzA »

mystran wrote:The only thing that this model does NOT make easy, is the traditional Unix style of creating new processes: supporting copy-on-write style fork() means your filesystems need to support copy-on-write... which is why Windows builds new processes from the scratch (?) instead of making copies of other processes.
AFAIK, this has always been one of the main problems with POSIX runtimes under Windows NT (psxss and Cygwin).

Alternatively, one could also implement copy-on-write above the filesystem through layered mappings, that is, map a file (or region of pagefile) to a region of virtual memory and overlap it with a second mapping, which is triggered on write accesses only and copies the underlying pages when used. Thus, reads will go straight through to the lower layer and writes allocate a page in the layer above (and copy the contents). Just a quick design thought which seems pretty clean and general enough.
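
A toy model of that layered idea, to make it concrete: a read-only lower layer shared by everyone, and a per-view upper layer that only gets a page once that page is written. The page size, the names, and the use of malloc for the private copies are all stand-ins for the example. A kernel would do this with page-table entries and write faults rather than function calls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_BYTES 64u   /* tiny "pages" keep the sketch readable */
#define NPAGES     4u

struct cow_view {
    const unsigned char *base;     /* lower layer: shared, never written */
    unsigned char *priv[NPAGES];   /* upper layer: NULL until first write */
};

/* Reads go to the private copy if one exists, else straight through. */
unsigned char cow_read(const struct cow_view *v, size_t off)
{
    assert(off < NPAGES * PAGE_BYTES);
    size_t pg = off / PAGE_BYTES;
    const unsigned char *src =
        v->priv[pg] ? v->priv[pg] : v->base + pg * PAGE_BYTES;
    return src[off % PAGE_BYTES];
}

/* The first write to a page copies it into the upper layer. */
bool cow_write(struct cow_view *v, size_t off, unsigned char val)
{
    assert(off < NPAGES * PAGE_BYTES);
    size_t pg = off / PAGE_BYTES;
    if (v->priv[pg] == NULL) {
        unsigned char *copy = malloc(PAGE_BYTES);
        if (copy == NULL)
            return false;
        memcpy(copy, v->base + pg * PAGE_BYTES, PAGE_BYTES);
        v->priv[pg] = copy;
    }
    v->priv[pg][off % PAGE_BYTES] = val;
    return true;
}
```

Two views over the same base stay independent: a write through one allocates a private page for that view only, and the base and the other view keep the original contents.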
oscoder wrote:Do these OS'es also provide a memory map to let the process know what parts of memory are not used and so can be allocated?
The only way to get such a memory map under Unix is (AFAIK) to query a pseudo-file in the /proc directory. But this really shouldn't be used to find a free area of the userspace address space. As mystran already said, when you define a fixed layout at startup, everything can be tracked in userspace only.
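
For the curious, here is a sketch of what reading that pseudo-file looks like on Linux, where it is /proc/self/maps for the calling process. This is Linux-specific (other Unixes differ or lack it), and the function name is made up. Each line starts with "start-end perms" in hex, which is all the sketch parses.

```c
#include <stdio.h>

/* Print the calling process's mappings to out and return how many there
   are, or -1 if /proc/self/maps cannot be opened (e.g. not on Linux). */
int dump_own_mappings(FILE *out)
{
    FILE *f = fopen("/proc/self/maps", "r");
    if (f == NULL)
        return -1;
    char line[512];
    int count = 0;
    while (fgets(line, sizeof line, f) != NULL) {
        unsigned long start, end;
        char perms[5];
        /* Lines look like: 00400000-0040b000 r-xp 00000000 08:01 123 /bin/cat */
        if (sscanf(line, "%lx-%lx %4s", &start, &end, perms) == 3) {
            fprintf(out, "%#lx..%#lx %s (%lu KiB)\n",
                    start, end, perms, (end - start) / 1024);
            count++;
        }
    }
    fclose(f);
    return count;
}
```

It is handy for debugging, but as said, parsing it to pick allocation addresses would be both slow and racy.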

cheers
Joe

Post by mystran »

JoeKayzA wrote: AFAIK, this has always been one of the main problems with posix-runtimes under windows nt (psxss and cygwin).
Correct.
JoeKayzA wrote: Alternatively, one could also implement copy-on-write above the filesystem through layered mappings, that is, map a file (or region of pagefile) to a region of virtual memory and overlap it with a second mapping, which is triggered on write accesses only and copies the underlying pages when used. Thus, reads will go straight through to the lower layer and writes allocate a page in the layer above (and copy the contents). Just a quick design thought which seems pretty clean and general enough.
Well, yeah, the filesystem on disk doesn't have to understand those, but at least the buffer cache will have to, and now you get a lot more complexity, because your mapping isn't <memaddr,len,file,fileoffset> anymore, 'cos you'll have to keep track of whether the "copy" has been modified at some address, or whether it still refers to the original, and you have to track how many references there are to the original file, so you know when it's no longer shared... and so on and so on.

If you implement copy-on-write on filesystem level instead, you actually get several other things for free: free snapshotting (just make a copy-on-write of the whole filesystem), versioning (snapshot every once in a while or something), transactions (basically, write a log entry, make a snapshot of the files affected, modify that, then on commit replace the original with the snapshot and finally write another log entry)...

Anyway, IMHO the whole POSIX fork() is somewhat overrated. The main thing where it's useful is servers that have one process listening for incoming connections, then fork() for each client. That's a nice way to serve lots of clients if you lack multi-threading and proper asynchronous I/O. If you've got those instead, then the only remaining advantages of fork() are the ability to lower the privileges of the process (not such an issue if you've got more fine-grained access control) and the ability to restart the main server without disconnecting the existing connections (could be solved if connections can be transferred between processes).

The only non-server process I can think of where fork() also makes things easier is command line shells (which can fork() subshells), but here you don't get any extra functionality, it just makes programming the shell somewhat easier.

Well, YMMV. There are several ways to organize it all, and POSIX is just one of those, and has its advantages and disadvantages.