FS Drivers: How to integrate stock filesystems in exokernels

Combuster
Member
Posts: 9301
Joined: Wed Oct 18, 2006 3:45 am
Libera.chat IRC: [com]buster
Location: On the balcony, where I can actually keep 1½m distance

Post by Combuster »

I'm stuck with a design problem: In traditional exokernel design, it is customary that an application has access to the locations (in sectors) of files, and can read or write individual sectors accordingly. More importantly, it should be able to say that file Y should follow directly after X, so that I can read them both without having to perform a seek.

I've checked out the idea behind MIT's exokernel, and it just assigns stretches of disk blocks to a user library, which then manages its own little filesystem at its own discretion. This means that two distinct applications need to have access to the same library to be able to use data from each other. Based on that, the filesystem should be a preset like it is on any other system, and all applications should be able to work with whatever filesystem is present.

This led to the following initial design:
  • The FS driver controls which applications get access to what sectors
  • Applications can ask the FS for a file, and it will return (part of) a blocklist, and tell the disk driver to give the process the relevant permissions
  • Applications can ask the FS for free blocks, and can ask the FS to create a file using a provided blocklist.
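A minimal sketch of the three points above, written in Python for brevity. Every name here (`DiskDriver`, `FsDriver`, `alloc_blocks`, `create_file`, `open_file`) is hypothetical and only illustrates the split of responsibilities: the FS driver owns the metadata and decides who gets which sectors, while the disk driver only enforces the resulting permissions.

```python
class DiskDriver:
    """Enforces per-process sector permissions; knows nothing about files."""

    def __init__(self, sector_count, sector_size=512):
        self.sector_size = sector_size
        self.sectors = [bytes(sector_size) for _ in range(sector_count)]
        self.grants = {}  # pid -> set of sector numbers the process may touch

    def grant(self, pid, blocks):
        self.grants.setdefault(pid, set()).update(blocks)

    def _check(self, pid, lba):
        if lba not in self.grants.get(pid, set()):
            raise PermissionError(f"pid {pid} has no access to sector {lba}")

    def read(self, pid, lba):
        self._check(pid, lba)
        return self.sectors[lba]

    def write(self, pid, lba, data):
        self._check(pid, lba)
        # Pad or truncate to exactly one sector.
        self.sectors[lba] = data.ljust(self.sector_size, b"\0")[: self.sector_size]


class FsDriver:
    """Owns the name -> blocklist metadata and decides who gets which sectors."""

    def __init__(self, disk):
        self.disk = disk
        self.files = {}  # file name -> list of sector numbers, in file order
        self.free = set(range(len(disk.sectors)))

    def alloc_blocks(self, pid, count, start=None):
        """Hand out free blocks; `start` lets the caller ask for a specific position."""
        if start is not None:
            blocks = list(range(start, start + count))
        else:
            blocks = sorted(self.free)[:count]
        if len(blocks) < count or not self.free.issuperset(blocks):
            raise OSError("requested blocks are not free")
        self.free.difference_update(blocks)
        self.disk.grant(pid, blocks)
        return blocks

    def create_file(self, pid, name, blocks):
        """Turn a caller-provided blocklist into a file, if the caller holds the blocks."""
        if not self.disk.grants.get(pid, set()).issuperset(blocks):
            raise PermissionError("caller does not hold all blocks in the list")
        self.files[name] = list(blocks)

    def open_file(self, pid, name):
        """Return the blocklist and tell the disk driver to let `pid` access it."""
        blocks = self.files[name]
        self.disk.grant(pid, blocks)
        return list(blocks)
```

With this shape, placing file Y directly after file X is just two `alloc_blocks` calls with adjacent `start` values. Note that nothing in the sketch handles the FS moving blocks later, which is where the problems listed next come from.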
This would work on any FS that supports fragmentation and does not perform journaling. That's exactly where the problems start:
  • In SFS, files need to be contiguous on disk, and the FS driver needs to defragment the disk the moment it does not have space for the extent needed, potentially invalidating all blocklists
  • Similarly, defragmenting a FAT system leads to the same problem.
  • Data journaling is not possible: An application would potentially overwrite parts of files and leave the rest intact. Redirecting writes elsewhere breaks an application's assumption that files are stored consecutively on disk, and can cause fragmentation.
  • The above problem probably manifests at its peak horror with versioning filesystems.
Do any of you have a solution to the above problems without requiring excess messaging overhead, lock contention, or tons of if(property)'s on the client side to make things work together efficiently? Ideas or partial solutions are welcome as well.

Thanks in advance
"Certainly avoid yourself. He is a newbie and might not realize it. You'll hate his code deeply a few years down the road." - Sortie
[ My OS ] [ VDisk/SFS ]
Solar
Member
Posts: 7615
Joined: Thu Nov 16, 2006 12:01 pm
Location: Germany

Post by Solar »

Wouldn't it be up to the libOS to handle any necessary abstraction? That the exokernel allows the libOS access at the block level doesn't mean the libOS should pass this access on to the application, does it?

The freedom with an exokernel is that you can have a libOS handling blocks x-y this way, and another libOS handling blocks y-z that way, and allowing an application to bring its very own libOS into the arena. But as far as I understood the concept, to have any cooperation between applications, they have to share the same libOS, or at least compatible libOS's.

(I dropped the idea of an exokernel for me, personally. Too many headaches. 8) )
Every good solution is obvious once you've found it.

Post by Combuster »

The reasons I'm not doing it that way are:
- An application with a specific libOS won't be able to read e.g. an SD card when FAT16 is not part of the library.
- An application could either spoof a libOS and reinterpret the security bits, or all ACL-related features should be part of a host FS, which leads back to the problem that only one FS can exist per medium.

Post by Solar »

That's what you get for using an exokernel IMHO: Maximum customizability, maximum confusion.
Owen
Member
Posts: 1700
Joined: Fri Jun 13, 2008 3:21 pm
Location: Cambridge, United Kingdom

Post by Owen »

IMO, the only way to realise a practical Exokernel system is to assume that, while there may be multiple operating systems running under it, there is one primary OS responsible for resource control; all others are slave to that. In fact, I'd go further and say that applications should run exclusively under the primary OS; secondary OS's are used either for (a) paravirtualization or (b) specialized device drivers.

Beyond that, I have only seen two practical applications of exokernels/similar systems:
  • Symbian/EKA2: EKA2 is a realtime "nanokernel"; Symbian is a non-realtime OS built on top of it. EKA2 is used to allow the GSM/UMTS/LTE signalling stack to share the processor with the Symbian platform; in this case, the signalling stack is effectively a glorified driver
  • The aforementioned virtualization and paravirtualization; particularly where one of the VMs is responsible for arbitrating hardware access for the others (e.g. Xen style Dom0/DomUs)
My personal opinion is that, for FS access, everything should go through the file system. This shouldn't impose much overhead either: App calls FS and gives it page(s) to scatter data into; FS calls disk driver and gives it page(s) and block list; disk driver reads data and then returns notification (and if you're good, you should be able to make the notification go directly back to the app). This seems to me to be especially necessary in the case of stacks; for example, EXT2 in loopback device on ReiserFS on LVM on Soft RAID-5 on 3 physical disks.
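The call chain described above (app hands pages to the FS, the FS hands pages plus a block list to the disk driver, the driver notifies the app directly) could look roughly like this. It is only a sketch: `Disk`, `FileSystem`, the `notify` callback, and the assumption that a page is exactly one sector long are all made up for illustration.

```python
class Disk:
    def __init__(self, sectors):
        self.sectors = sectors  # list of equally sized bytes objects

    def read(self, blocklist, pages, notify):
        # Scatter each requested sector into the caller-supplied page.
        for lba, page in zip(blocklist, pages):
            page[: len(self.sectors[lba])] = self.sectors[lba]
        # The completion goes straight to the app, not back through the FS.
        notify(len(blocklist))


class FileSystem:
    def __init__(self, disk, table):
        self.disk = disk
        self.table = table  # file name -> blocklist

    def read(self, name, pages, notify):
        # The FS only resolves the file to a block list and forwards the request.
        blocklist = self.table[name][: len(pages)]
        self.disk.read(blocklist, pages, notify)
```

The application never sees sector numbers in this variant; it only supplies `bytearray` pages and a callback.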

Post by Combuster »

Come on guys, this is not about religion or bad practices, this is about showing that things can be done. And apparently I should have said that earlier. :(

Anyway:

I want a design for a filesystem driver that works as a server in a microkernel environment, such that I can
- access files per block and manually schedule (within FS limits) when something is read or written.
- have filesystem independence. Should operate nicely with established FSes, including but not limited to FAT, SFS, XFS, ISO 9660, UDF, Reiser, JFS, ZFS, Btrfs and *FS.
- have permissions to the level allowed by the filesystem.
- minimize the amount of communication.

Any takers? (or people trying)
bewing
Member
Posts: 1401
Joined: Wed Feb 07, 2007 1:45 pm
Location: Eugene, OR, US

Post by bewing »

Well, honestly, it seems like you have a true conundrum here.

Leaving aside WORM-type devices, any rewritable medium in any reasonable filesystem in any reasonable theoretical OS is going to need some kind of defragging/reorganization daemon. Just as an example, you must recopy sectors of a magnetic disk every few years, or the magnetic domains will deteriorate.

So, assuming the computer is on for an extended period of time: an application may grab some sectors, write a file to them, and then the daemon is going to come along eventually and move the sectors. This is a generic situation, and should be handled generically. It sounds to me like it is absolutely necessary for the LBA->file mapping to have a method for invalidation. Your only other hope is to add one layer of indirection -- which defeats the entire spirit of an exokernel, really.

So the only real question seems to me to be how the mapping gets tested for invalidation. Perhaps the FS manager changes the permission bits on open mappings that have been invalidated? Then, if the app actually tries to access the mapping again after it's been invalidated, it gets an "invalidated, please remap" error? Perhaps that error could even be handled automatically, without even informing the app that it happened?
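A rough sketch of that invalidate-and-remap idea, in Python. The names (`FsManager`, `FileHandle`, `MappingInvalidated`) are invented for this example, and the "permission bits" are modelled as a simple set of still-valid (pid, file) pairs.

```python
class MappingInvalidated(Exception):
    """The 'invalidated, please remap' error."""


class FsManager:
    def __init__(self):
        self.blocks = {}    # file name -> current blocklist
        self.sectors = {}   # lba -> data
        self.valid = set()  # (pid, file name) pairs whose mapping is still valid

    def map_file(self, pid, name):
        self.valid.add((pid, name))
        return list(self.blocks[name])

    def move_file(self, name, new_blocks):
        """Defragmenter: relocate the data, then revoke every open mapping."""
        data = [self.sectors.pop(lba) for lba in self.blocks[name]]
        for lba, chunk in zip(new_blocks, data):
            self.sectors[lba] = chunk
        self.blocks[name] = list(new_blocks)
        self.valid = {(p, n) for (p, n) in self.valid if n != name}

    def read(self, pid, name, lba):
        if (pid, name) not in self.valid:
            raise MappingInvalidated(name)
        if lba not in self.blocks[name]:
            raise PermissionError(f"sector {lba} is not part of {name}")
        return self.sectors[lba]


class FileHandle:
    """Client-side wrapper that hides the remap from the application."""

    def __init__(self, fs, pid, name):
        self.fs, self.pid, self.name = fs, pid, name
        self.blocks = fs.map_file(pid, name)

    def read_block(self, index):
        try:
            return self.fs.read(self.pid, self.name, self.blocks[index])
        except MappingInvalidated:
            # Fetch the new blocklist and retry once.
            self.blocks = self.fs.map_file(self.pid, self.name)
            return self.fs.read(self.pid, self.name, self.blocks[index])
```

The wrapper retries only once, so a file that keeps moving would still surface the error to the caller.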

Post by Combuster »

I was thinking along those lines, and yes, it can work. There's a problem though: how would you prevent starvation when some app repeatedly writes a file, invalidating the mapping each time before some other app can read it?

There's a second problem, an ABA problem (file A moves, file B moves into A's position, and reading a sector that used to belong to A returns B's data), which can be solved with tagging, but that isn't the cleanest solution either.
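One way the tagging could work, sketched in Python under the assumption that every placement of a file gets a fresh generation number and every read request carries the (file, generation) pair the reader believes is current. `TaggedDisk`, `place` and the tag layout are hypothetical.

```python
class TaggedDisk:
    def __init__(self):
        self.data = {}  # lba -> bytes
        self.tag = {}   # lba -> (file name, generation) currently stored there

    def place(self, name, generation, blocks, payload):
        """Store a file's data at `blocks` and stamp each sector with its tag."""
        for lba, chunk in zip(blocks, payload):
            self.data[lba] = chunk
            self.tag[lba] = (name, generation)

    def read(self, lba, name, generation):
        # A stale reader presents an old tag and is refused instead of
        # silently receiving whatever file now lives in that sector.
        if self.tag.get(lba) != (name, generation):
            raise LookupError("stale mapping, please remap")
        return self.data[lba]
```

A reader that still holds A's old sector after B moved in gets a `LookupError` rather than B's data. The cost is one extra comparison and one tag per sector (or per extent), which is the "not the cleanest" part.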

At least it is a step closer to the goal.
quanganht
Member
Posts: 301
Joined: Fri May 16, 2008 7:13 pm
Location: Hanoi, Vietnam

Post by quanganht »

Owen's idea seems to be more suitable for this kernel design. Actually, it is how Xen VM works. Xen's Dom0 is a modified Linux kernel which acts as a master, and all other kernels are slaves. That way maybe you can control applications better.
"Programmers are tools for converting caffeine into code."
Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!

Post by Brendan »

Hi,

There are basic rules. An OS can bend the rules a little, but the rules can't be avoided.

At any point in time; for all types of resources (I/O ports, IRQs, sectors, files, network connections, software interfaces, whatever), for all types of OS (exo-kernel, micro-kernel, monolithic, whatever):
  • one piece of code may be given exclusive read/write access to the resource, or
  • many pieces of code may share read only access to the resource
For groups of resources of any type or any mixture of different types (for all types of OS):
  • the group of resources can be sub-divided, or
  • the group of resources can be used to create a different kind of resource
Initially (at least for 80x86) there are only 3 types of resources:
  • areas of the physical address space (pages)
  • areas of the I/O address space (I/O ports)
  • CPUs
All other types of resources are created (either directly or indirectly) from these initial resources.

So, an example:
  • "Physical Address Space Manager" is given exclusive access to all of physical memory, and sub-divides it.
  • "Memory Manager" is given all "usable RAM" pages by "Physical Address Space Manager", and sub-divides it.
  • "Device Manager" is given exclusive access to all I/O ports, and "Physical Address Space Manager" gives "Device Manager" exclusive access to all memory mapped I/O areas. "Device Manager" keeps some of these resources to itself (e.g. the I/O ports needed to access PCI configuration space) and subdivides the rest of the resources (giving different I/O ports and different memory mapped I/O areas to the corresponding device drivers).
  • "PIC Driver" is given exclusive access to a few I/O ports by "Device Manager", and uses them to create a new "IRQ" resource. "PIC Driver" gives exclusive access to all IRQs back to "Device Manager".
  • "I/O APIC Driver" is given exclusive access to a memory mapped I/O area by "Device Manager", and uses it to create a new "IRQ" resource. "I/O APIC Driver" gives exclusive access to all IRQs back to "Device Manager".
  • "SATA Controller Driver" is given exclusive access to a few I/O ports and an IRQ by "Device Manager"; and uses them to create a new "SATA channel" resource. It sub-divides the new SATA channel resource.
  • "SATA Disk Driver #1" is given exclusive use of a SATA channel by "SATA Controller Driver". It uses this resource to create a new type of "sectors" resource; and sub-divides this new resource.
  • "Database Task" is given exclusive use of 1 million sectors by "SATA Disk Driver #1". It uses these resources to create a new type of "SQL channel" resource. The (potentially unlimited) number of "SQL channel" resources are sub-divided.
  • "FAT File System" is given exclusive use of 1 million sectors by "SATA Disk Driver #1". It uses these resources to create a new type of "file" resource. The new "file" resources are sub-divided (some tasks are given exclusive read/write access to some files, some tasks are given shared read only access to other files, etc).
  • "SATA CD-ROM Driver #1" is given exclusive use of a SATA channel by "SATA Controller Driver". It uses this resource to create a new type of "sectors" resource; and sub-divides this new resource.
  • "ISO9660 File System" is given exclusive use of all of the sectors provided by "SATA CD-ROM Driver #1". It uses these resources to create a new type of "file" resource. The new "file" resources are sub-divided (different tasks are given shared read only access to files, etc).
  • "Unzip" is given exclusive use of a file by "FAT File System". It uses this resource to create a new type of "file" resource. The new "file" resources are sub-divided (some tasks are given exclusive read/write access to some files, some tasks are given shared read only access to other files, etc).
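The two rules (exclusive read/write or shared read only; sub-divide or build something new) can be written down as a small model. This is only an illustration in Python; the `Resource` class and its method names are assumptions, not part of any OS described in this thread.

```python
class Resource:
    def __init__(self, name, units):
        self.name = name
        self.units = set(units)  # e.g. sector numbers or I/O ports
        self.writer = None       # holder with exclusive read/write access
        self.readers = set()     # holders with shared read-only access

    def grant_exclusive(self, holder):
        if self.writer is not None or self.readers:
            raise PermissionError(f"{self.name} is already in use")
        self.writer = holder

    def grant_shared(self, holder):
        if self.writer is not None:
            raise PermissionError(f"{self.name} is held exclusively")
        self.readers.add(holder)

    def subdivide(self, name, units):
        """Carve a child resource out of this one; the units leave the parent."""
        units = set(units)
        if not units <= self.units:
            raise ValueError("units not available in parent")
        self.units -= units
        return Resource(name, units)
```

In the terms of the example list, the disk driver would call `subdivide` twice on its "sectors" resource and grant one part exclusively to the FAT file system and the other to the database task; the file system then creates its own "file" resources on top.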

Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.

Post by Combuster »

Brendan wrote:
  • one piece of code may be given exclusive read/write access to the resource, or
  • many pieces of code may share read only access to the resource
That is completely wrong. Shared memory disagrees, Windows' file management disagrees: both can have two separate entities share read/write access to the same block. And arguably the same holds for NICs where you occasionally want to be able to send any traffic from any app, and read all traffic coming in (think virtual machines)

I don't really see what the rest is supposed to help me with, being that such a hierarchical system already exists within my OS :?
quanganht wrote:Owen's idea seems to be more suitable for this kernel design. Actually, it is how Xen VM works. Xen's Dom0 is a modified Linux kernel which acts as a master, and all other kernels are slaves. That way maybe you can control applications better.
What part of "See if it can be done" needs elaboration? :(

My OS is not Xen. My OS is not a clone of a famous exokernel. My OS does not virtualize many guest OSes. In fact, I want this to work independently of the kernel so I can plug the same code into e.g. my development platform to build whatever disk images I want.

And besides, Owen's design is one for inter-process communication that uses shared memory; it is NOT a driver interface at all.

Post by Brendan »

Hi,
Combuster wrote:
Brendan wrote:
  • one piece of code may be given exclusive read/write access to the resource, or
  • many pieces of code may share read only access to the resource
That is completely wrong. Shared memory disagrees, Windows' file management disagrees: both can have two separate entities share read/write access to the same block. And arguably the same holds for NICs where you occasionally want to be able to send any traffic from any app, and read all traffic coming in (think virtual machines)
A better example of "an OS can bend the rules a little" would've been "append mode" file access, where one or more tasks can have read access to a file while another task has append access to the same file.

For shared memory and Windows' file management, how do you guarantee that different pieces of code don't screw each other up? Do you have a reentrancy lock (where whoever holds the lock has exclusive access), or (for shared memory) do you have atomic pointer/s (maybe "head" and "tail" pointers) saying which tasks have exclusive access to which areas of the shared memory, or (for Windows' file management) do you give a writer a virtual copy of the file (e.g. where writes are buffered and don't affect reads made by other tasks) so that all tasks have read access to the original file and writers have exclusive access to the parts that they modified?

Mostly what I'm asking is "in which way are the rules bent a little".
Combuster wrote:I don't really see what the rest is supposed to help me with, being that such a hierarchical system already exists within my OS :?
Did you see any of your design problems in my example?
Combuster wrote:I'm stuck with a design problem: In traditional exokernel design, it is customary that an application has access to the locations (in sectors) of files, and can read or write individual sectors accordingly.
If an application has been granted access to a file (one type of resource), then it has not been given access to any raw sectors (which are a completely different type of resource). I have no idea why you made the mistake of thinking "it's customary", or how you could forget about things like NFS.


Cheers,

Brendan
quartsize
Posts: 8
Joined: Mon Sep 24, 2007 1:23 pm

Post by quartsize »

Hello,
Combuster wrote:I've checked out the idea behind MIT's exokernel, and it just assigns stretches of disk blocks to an user library, which then manage their own little filesystem at their own discretion.
I believe XN is a little more complicated than that. If I remember correctly, through the use of UDFs, XN knows enough about the filesystem metadata format to make access control decisions, and therefore can multiplex a filesystem among several libOSes.

It really just goes to show that you have to jump through a lot of hoops to achieve exokernel-level flexibility.
rdos
Member
Posts: 3320
Joined: Wed Oct 01, 2008 1:55 pm

Post by rdos »

I don't think there is a need for a "Device Manager" which handles all kinds of resources. It works just as well to simply let each driver use its allocated I/Os (either hard-coded, through PCI or something else). IRQs will be handled by an IRQ handler, which can export functions to either share an IRQ or request exclusive access to it. When it comes to protecting data structures, I use a simple "section" primitive defined in the task manager. A device that needs to protect its data structures simply creates a section and uses enter/leave when it wants to protect them. IOW, I use a non-hierarchical, local approach to resources and resource protection.

Post by Combuster »

Thanks to everyone who tried to help, including those testing my stubbornness :wink:
bewing wrote:So the only real question seems to me to be how the mapping gets tested for invalidation. Perhaps the FS manager changes the permission bits on open mappings that have been invalidated? Then, if the app actually tries to access the mapping again after it's been invalidated, it gets a "invalidated, please remap" error? Perhaps that error could even be handled automatically, without even informing the app that it happened?
I came up with the following design. It's not perfect, but at least it leaves lock ownership on the trusted side. Essentially, what I plan on doing is putting an MMU on top of the disk, then letting the FS driver manage the virtual spaces (of which there would be one per file). Then, if the filesystem needs to change something, it can temporarily put a lock over the space's permissions, causing a stall on the least privileged side. At any time, the process can look up the exact mappings from this MMU, and it can subscribe to notifications to be sent when its projection of the disk changes. Negotiation of the exact sectors happens between the application and the FS server.

By making the MMU recursive, file (capability) sharing and loopback devices can be implemented without adding a level of indirection.
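A very rough model of such a "disk MMU", in Python. Everything here is an assumption made for illustration: the `DiskSpace` name, the use of a `threading.Lock` to model the stall, and the choice to resolve nested spaces by walking the parent chain on every lookup (a real implementation could cache the flattened result to avoid the extra hops).

```python
import threading


class DiskSpace:
    """One virtual block space (e.g. one file) on top of a parent space or the raw disk."""

    def __init__(self, parent=None):
        self.parent = parent           # None means the table holds physical LBAs
        self.table = {}                # virtual block -> block in the parent space
        self.subscribers = []          # callbacks run after the FS changes the table
        self._lock = threading.Lock()  # held by the FS driver while it rearranges

    def translate(self, vblock):
        """Application side: resolve to a physical LBA; stalls while the FS holds the lock."""
        with self._lock:
            block = self.table[vblock]
        return block if self.parent is None else self.parent.translate(block)

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def remap(self, changes):
        """FS driver side: update mappings under the lock, then notify subscribers."""
        with self._lock:
            self.table.update(changes)
        for notify in self.subscribers:
            notify(dict(changes))
```

A loopback file is then a `DiskSpace` whose `parent` is the space of the image file. The sketch does not forward a parent's notifications to the subscribers of nested spaces; that would be needed for a complete version.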

Comments?