OdinOS: I'd love some design feedback

Discussions on more advanced topics such as monolithic vs micro-kernels, transactional memory models, and paging vs segmentation should go here. Use this forum to expand and improve the wiki!
Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!

Re: OdinOS: I'd love some design feedback

Post by Brendan »

Hi,
bwat wrote:
Brendan wrote:Are you sure you're not talking about "embedded bootable applications"? I can't think of an OS intended for embedded use that is a modern OS but doesn't support multiple processes in some way.
Multiple processes does not mean multiple applications. There's no need to confuse the two. To give a specific example, the old Ericsson EMP modem which can be found in many smartphones was an ARM chip running Enea's OSE operating system and a single control application which was broken up into several processes.
It's probably a bad idea to prefix something with the words "the old" when you're trying to invent examples of "modern". The web site for Enea's OSE says it is "an RTOS optimized for distributed, fault-tolerant, Linux enabled systems". I'm fairly sure that Linux supports multiple processes and multiple applications, even if some third-party real-time layer sits above it.

Note: Sadly there is a need to distinguish between "multiple processes" and "multiple applications"; because we're talking about an OS that has many processes (for device drivers, etc) but only allows one "application process".
bwat wrote:
Brendan wrote:Not that it actually matters - mrstobbe said "4 or more CPUs recommended" and I don't think his idea would work on single-CPU; so that rules out almost all embedded systems. He also implied 80x86, which would rule out almost all embedded systems again.
There's plenty of embedded applications that run on x86 hardware (I've worked on two different process control systems that ran on x86 hardware). There's also plenty of embedded systems that have multiple CPUs (I worked on an embedded system which ran on a twin DEC Alpha CPU board back in the late 1990s/early 2000s, and more recently several different platforms with multicore PowerPC chips).
So you agree with me, except that in order to continue being an argumentative troll you had to pretend that 2 out of 2 million or more is "plenty", while ignoring the fact that "almost all" is not "all"?
bwat wrote:
Brendan wrote:If I was wrong (or "opining from a blinkered world-view") someone would've provided a sane reason for "single application" by now. That someone could've been you, but I guess you weren't able.
Embedded systems, hardware and software dedicated to a single use. This is the second time I've provided that demonstration. You can deny it all you want but you're just exposing your limited experience.
Here's a detailed list of reasons why combustion engines are good: car, boat, train.

Notice how I only wrote a list of things that combustion engines might be used for, and failed to provide any reasons why combustion engines are good? This is exactly what you've done - a list of things and zero reasons.

Now see if you find any reason why "single application only" (and not "one or more applications") is better for "embedded system" or "hardware" or "software dedicated to a single use".

To save time here's some possible suggestions that you could've attempted if your head wasn't firmly jammed somewhere warm and smelly:
  • "It reduces the number of TLB misses and/or cache misses (compared to "one or more applications" when only one application is being run)". Note: This is false.
  • "It makes something easier for the end user (or the person using the OS as part of an embedded system, or whatever). Note: This is false too.
  • "It reduces the size/complexity of the OS". Note: In general this might have been a valid reason (e.g. back when CPUs were 8-bit); but not for the OS that mrstobbe described (that runs drivers as processes and needs the size/complexity involved anyway).
  • "It makes it easier to do a real time OS". Note: This is false (you just need a real time scheduler, and there's plenty of existing real time OSs and research papers to choose from.
Now notice how I'm doing a better job of arguing your point than you did? Funny that. Did you also notice that a relatively high number of topics that you've gotten involved with ended up locked because someone was an antagonistic moron? That should tell you something about your inability to construct a rational argument.


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
OSwhatever
Member
Posts: 595
Joined: Mon Jul 05, 2010 4:15 pm

Re: OdinOS: I'd love some design feedback

Post by OSwhatever »

Brendan wrote:Hi,

It's probably a bad idea to prefix something with the words "the old" when you're trying to invent examples of "modern". The web site for Enea's OSE says it is "an RTOS optimized for distributed, fault-tolerant, Linux enabled systems". I'm fairly sure that Linux supports multiple processes and multiple applications, even if some third-party real-time layer sits above it.
In this particular case, the OSE operating system is supposed to be used together with Linux systems, where the Linux system communicates with the OSE system via their proprietary LINX interface. That's what they are selling, if I read it correctly. In practice this setup could be an embedded system with an MMU-equipped CPU running Linux and then smaller CPUs without an MMU running OSE. The Linux part is the main controlling part and runs the main applications, while the OSE part does the specialized processing. An embedded system with an ARM Cortex A9 running Linux and one or more Cortex R/M cores running some simpler MMU-less OS is a very common setup.
Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!

Re: OdinOS: I'd love some design feedback

Post by Brendan »

Hi,
OSwhatever wrote:
Brendan wrote:Are you sure you're not talking about "embedded bootable applications"? I can't think of an OS intended for embedded use that is a modern OS but doesn't support multiple processes in some way.
There are plenty of small embedded kernels that have no real process concept. As the software in a small embedded system is known in advance, there is less need to protect parts of the software from each other. These OSes do support what we usually know as threads, and also loadable code (usually ELF files), but no process isolation.
If there's no support for loadable executables then it falls into the "embedded bootable applications" category (e.g. if it exists at all, the "OS" is built into the application/executable). If there is support for loadable executables, then those executables (when loaded/running) are "processes in some way".

Processes without isolation are still processes. For example, MS-DOS supported multiple processes, even though the parent had to wait for the child to terminate, and even though isolation wasn't enforced. For another example, there's at least one "embedded Linux" (maybe uClinux - too lazy to check) that has similar restrictions to MS-DOS (e.g. "fork()" won't return until the child terminates, and no real isolation between processes due to support for CPUs that don't have an MMU, etc).

For an OS to only support one process, it'd have to support loadable executables (to avoid being a bootable application) and have no way to fork/spawn a second process once one is already running. This is technically possible (e.g. perhaps the kernel provides a "terminate_then_execute(command)" function).
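As a rough user-space analogue of that hypothetical call (the kernel function name above is made up; the sketch below just uses POSIX execve(), which replaces the current program image in place so a second process never exists):

/* Minimal sketch: "terminate the current application, then run the next one"
 * done with execve().  Only an analogue of the hypothetical kernel call
 * above, not an implementation of it. */
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    char *argv[] = { "/bin/echo", "next application running", NULL };
    char *envp[] = { NULL };

    puts("first application handing over the machine...");
    execve(argv[0], argv, envp);   /* returns only on failure */

    perror("execve");
    return 1;
}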
OSwhatever wrote:
Brendan wrote:It's probably a bad idea to prefix something with the words "the old" when you're trying to invent examples of "modern". The web site for Enea's OSE says it is "an RTOS optimized for distributed, fault-tolerant, Linux enabled systems". I'm fairly sure that Linux supports multiple processes and multiple applications, even if some third-party real-time layer sits above it.
In this particular case, the OSE operating system is supposed to be used together with Linux systems where the Linux system communicates with the OSE system via their proprietary LINX interface. That's what they are selling if I read it correctly.
That sounds right to me. According to their whitepaper the RTOS layer is able to handle multiple real time processes (and does priority-based preemptive scheduling), while the Linux part handles multiple normal (non-real time) processes.

To be honest, RTOS actually sounds like a really nice OS; and has some impressive/interesting features (its own messaging/communication system that can handle distributed systems, memory management with support for "with or without MMU", support for AMP and SMP, etc). It's a pity they went and stuck Linux under it... ;)

In any case; "RTOS+Linux" is definitely not a valid example "a modern OS that doesn't support multiple processes".

Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
OSwhatever
Member
Posts: 595
Joined: Mon Jul 05, 2010 4:15 pm

Re: OdinOS: I'd love some design feedback

Post by OSwhatever »

Brendan wrote:Hi,

That sounds right to me. According to their whitepaper the RTOS layer is able to handle multiple real time processes (and does priority-based preemptive scheduling), while the Linux part handles multiple normal (non-real time) processes.
When I read it, it's like they rearranged the basic operating system terms, thread and process.
OSE’s natural programming model is based on passing direct, asynchronous messages between tasks (or “processes” in OSE terminology). This model tends to promote, though not strictly enforce, consistent behavior when coding applications.
The OSE kernel allows its processes to be collected into “blocks”, where each block of processes can be assigned its own separate (non-fragmenting) RAM memory pool (Figure 11). This compartmentalization prevents problems in one memory pool (such as a memory leak) from affecting other blocks.
If the target CPU has a memory management unit (“MMU”), OSE can take advantage of the MMU to establish hardware-enforced, RTOS-aware memory protection between OSE blocks (or groups of blocks) within a single processor. This allows separate blocks (or groups of blocks) to have their own separate memory address spaces.
process = thread
block = process

?
Combuster
Member
Posts: 9301
Joined: Wed Oct 18, 2006 3:45 am
Libera.chat IRC: [com]buster
Location: On the balcony, where I can actually keep 1½m distance

Re: OdinOS: I'd love some design feedback

Post by Combuster »

I don't think they redefined anything. Processes, tasks and applications are ambiguous terms to say the least - open a random printed dictionary and I'm sure you'll find something not to your taste. These terms simply need proper definitions before you compare discussions about them.

In fact, my kernel doesn't know a thing about processes, tasks, or applications. It only has threads and address spaces as primitives.
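For anyone who wants something concrete, here's a rough sketch (illustrative only, not my actual kernel code) of what having just those two primitives can look like:

/* Hypothetical sketch: a kernel whose only primitives are address spaces
 * and threads.  "Process" then becomes a userland convention: a group of
 * threads that happen to share one address space. */
#include <stdint.h>

struct address_space {
    uint64_t page_table_root;         /* e.g. the value loaded into CR3 */
    /* ... mapping bookkeeping ... */
};

struct thread {
    uint64_t              kernel_stack_top;
    struct address_space *aspace;     /* which mappings this thread runs under */
    struct thread        *next_ready; /* scheduler ready-queue linkage */
};

/* Two threads pointing at the same address_space behave like "one process
 * with two threads"; two threads with different address spaces behave like
 * "two processes"; the kernel itself never needs either word. */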
"Certainly avoid yourself. He is a newbie and might not realize it. You'll hate his code deeply a few years down the road." - Sortie
[ My OS ] [ VDisk/SFS ]
Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!

Re: OdinOS: I'd love some design feedback

Post by Brendan »

Hi,
OSwhatever wrote:
Brendan wrote:That sounds right to me. According to their whitepaper the RTOS layer is able to handle multiple real time processes (and does priority-based preemptive scheduling), while the Linux part handles multiple normal (non-real time) processes.
When I read it, it's like they rearranged the basic operating system terms, thread and process.
OSE’s natural programming model is based on passing direct, asynchronous messages between tasks (or “processes” in OSE terminology). This model tends to promote, though not strictly enforce, consistent behavior when coding applications.
It's possible that they're using something like Linux, where the scheduler schedules "tasks", and a task may be either a process or a thread depending on which attributes were used to create it (e.g. if the task shares its creator's virtual address space or not).
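As a concrete illustration (a minimal user-space sketch using Linux's clone(2), not anything from OSE): the same primitive creates something thread-like or process-like depending only on whether CLONE_VM is passed.

/* With CLONE_VM the child shares our address space ("thread-like"), so its
 * write to 'shared' is visible to us; drop CLONE_VM and it gets a copy
 * ("process-like"), so we'd still see 0.  Linux/glibc only. */
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>

static int shared = 0;

static int child(void *arg)
{
    shared = 42;
    return 0;
}

int main(void)
{
    char *stack = malloc(64 * 1024);
    if (!stack)
        return 1;

    pid_t tid = clone(child, stack + 64 * 1024, CLONE_VM | SIGCHLD, NULL);
    waitpid(tid, NULL, 0);

    printf("shared = %d\n", shared);   /* 42 here; 0 without CLONE_VM */
    free(stack);
    return 0;
}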
OSwhatever wrote:
The OSE kernel allows its processes to be collected into “blocks”, where each block of processes can be assigned its own separate (non-fragmenting) RAM memory pool (Figure 11). This compartmentalization prevents problems in one memory pool (such as a memory leak) from affecting other blocks.
If the target CPU has a memory management unit (“MMU”), OSE can take advantage of the MMU to establish hardware-enforced, RTOS-aware memory protection between OSE blocks (or groups of blocks) within a single processor. This allows separate blocks (or groups of blocks) to have their own separate memory address spaces.
I think this part actually means processes when it says "processes". For example; all processes that belong to the same user might be grouped together and use the same memory pool (similar to per-user disk quotas in Linux).
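As a rough sketch of that "each block gets its own RAM pool" idea (illustrative only; this is not how OSE actually implements it):

/* Each block allocates only from its private arena, so exhausting one
 * arena (e.g. via a leak) cannot take memory away from another block. */
#include <stddef.h>
#include <stdint.h>

struct block_pool {
    uint8_t *base;   /* start of this block's arena */
    size_t   size;   /* arena size in bytes         */
    size_t   used;   /* bytes handed out so far     */
};

static void *pool_alloc(struct block_pool *p, size_t n)
{
    n = (n + 15) & ~(size_t)15;       /* keep allocations 16-byte aligned */
    if (p->used + n > p->size)
        return NULL;                  /* only this block runs dry         */
    void *mem = p->base + p->used;
    p->used += n;
    return mem;
}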


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
mrstobbe
Member
Posts: 62
Joined: Fri Nov 08, 2013 7:40 pm

Re: OdinOS: I'd love some design feedback

Post by mrstobbe »

Long day of work and what-not and everyone's been arguing amongst themselves over a bunch of stuff that has nothing to do with my design :). I'll respond in reverse order (newest first) so nothing gets missed here:

First, just to clarify...
Combuster wrote:Processes, tasks and applications are ambiguous terms to say the least
Couldn't agree more... in my head I have a tendency to think that "processes" are one or more "threads" in a single unique memory space. Therefore multiple "threads" may share the same address space, but no two "processes" (or their "threads") may use each other's memory space under normal conditions. Loose definition, but computers aren't limited to one specific paradigm like that, so you can call anything you want anything you want.

In regards to this system, we're using "process" as I just described above, and "worker", "thread", "task", and "job" to mean "thread" as described above. It's pretty simple: it's a mono-process system with exactly CPU-count number of threads (one as the main process thread, and the others sticky to each remaining CPU). The exception here of course is that the drivers run on CPU0 as well, but it's not such an exception that the spirit of the design couldn't be called mono-process/multi-tasking (I mean, TSRs existed for MS-DOS that could hook the PIC, but no one would argue that it's not a mono-process OS).
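To make that layout concrete, here's a rough user-space sketch of the thread placement using Linux affinity calls (illustrative only; this isn't OdinOS code and the names are made up):

/* "One main thread on CPU 0, one sticky worker on every other CPU."
 * Compile with -pthread on Linux/glibc. */
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <unistd.h>

static void *worker(void *arg)
{
    long cpu = (long)arg;
    printf("worker pinned to CPU %ld\n", cpu);
    /* ... this CPU's event/work loop would run here ... */
    return NULL;
}

int main(void)
{
    long ncpus = sysconf(_SC_NPROCESSORS_ONLN);
    cpu_set_t set;

    /* Main thread (and, in the design above, the drivers) stays on CPU 0. */
    CPU_ZERO(&set);
    CPU_SET(0, &set);
    sched_setaffinity(0, sizeof(set), &set);

    /* One worker per remaining CPU, pinned before it starts running. */
    for (long cpu = 1; cpu < ncpus; cpu++) {
        pthread_attr_t attr;
        pthread_t tid;
        CPU_ZERO(&set);
        CPU_SET(cpu, &set);
        pthread_attr_init(&attr);
        pthread_attr_setaffinity_np(&attr, sizeof(set), &set);
        pthread_create(&tid, &attr, worker, (void *)cpu);
        pthread_attr_destroy(&attr);
    }
    pause();   /* main thread keeps handling the "CPU 0 work" */
    return 0;
}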

That's all I have to say about that so I'll skip up past the next handful of posts arguing semantics...
Brendan wrote:
bwat wrote:
Brendan wrote:Are you sure you're not talking about "embedded bootable applications"? I can't think of an OS intended for embedded use that is a modern OS but doesn't support multiple processes in some way.
Multiple processes does not mean multiple application. There's no need to confuse the two. To give a specific example, the old Ericsson EMP modem which can be found in many smartphones was an ARM chip running Enea's OSE operating system and a single control application which was broken up into several processes.
So you agree with me, except that in order to continue being an argumentative troll you had to pretend that 2 of out 2 million or more is "plenty" while ignoring the fact that "almost all" is not "all"?
I was hoping to get some feedback on the design, not argue over the practicality of designing for an embedded system market while focusing on the x86 arch (which doesn't even apply to OdinOS right now).

Moving on...
Brendan wrote:To save time here's some possible suggestions that you could've attempted if your head wasn't firmly jammed somewhere warm and smelly:
Brendan wrote:"It reduces the number of TLB misses and/or cache misses (compared to "one or more applications" when only one application is being run)". Note: This is false.
You're wrong. I'm not going to be nice anymore. You're flat-out-unarguably wrong. Remember when I said, "correct me if I'm wrong" when addressing that earlier? I was just being socially "nice". I followed up with being nice and not calling you flat-out wrong, but... yes... you're wrong. Any unnecessary cache miss or invalidation, any unnecessary inefficiencies in branch prediction, any unnecessary major/minor page faults, any at all, even the slightest bit, is needless expense (and in many cases, very expensive). You keep stating that it's negligible simply because you're confusing the idea that having a lot of things a system can do is worth the cost of context switching with your particular design vision, completely ignoring the very fact that context switching, by its very inherent nature, is simply expensive. Stop confusing a big-picture design idea (in your head) with the in-the-moment reality during a context switch. I'm looking at it from a pure "inherent nature" perspective, and you're looking at it from an "acceptable/negligible cost of doing business for what I want to achieve" perspective... stop confusing the two.
Brendan wrote:"It makes something easier for the end user (or the person using the OS as part of an embedded system, or whatever). Note: This is false too.
Who said that? You just did, but I didn't see anyone else in this thread say anything along those lines.
Brendan wrote:"It reduces the size/complexity of the OS". Note: In general this might have been a valid reason (e.g. back when CPUs were 8-bit); but not for the OS that mrstobbe described (that runs drivers as processes and needs the size/complexity involved anyway).
No, this is still absolutely valid. It will never be invalid unless computers suddenly turn into processing gods who magically give the answer 42 before you even think up the question. Any reduction of complexity adds to the overall efficiency of the system. Again, I think you're just thinking, "Ohhhh... I can do a lot and there doesn't seem to be any major ramifications... CPUs just get better... memory gets cheaper... etc". You're really missing the bigger aggregate picture entirely. As for my design, it can clearly be incredibly simpler (even with a powerful event-driven API) than any common modern general-purpose OS. At this point, I really don't think you have any kind of a solid grasp of even the basics of my design, so it's pretty hard for you to make statements like that and look reasonable. Go re-read it. I'm pretty much done being nice with you here. Still don't agree after re-reading it, with the exact same arguments? Fine... but keep them to yourself, because you've already said the few thought-provoking things you're probably ever going to say, and ten times more irrelevant and false things on top of that.
Brendan wrote:"It makes it easier to do a real time OS". Note: This is false (you just need a real time scheduler, and there's plenty of existing real time OSs and research papers to choose from).
Again, not relevant.

Well... that was a whole bunch of negative writing... moving on, hopefully this one's a positive contribution.
OSwhatever wrote:
mrstobbe wrote:Hypothetically they then could become almost perfectly CPU bound as work demands, but would be extremely energy efficient while more "idle". In a general purpose SMP design, there is tons of context switching going on even in a mostly "idle" scenario, which is expensive in terms of energy, and a process (or processes) never get the opportunity to fully utilize a CPU because they are constantly being preempted at that point.
Um no, when my microkernel is idle, the CPUs are really idle. There is no context switching going on at all and they are just waiting for some external event.
Yes, the same could be said of monolithic kernels as well. That's really not about whether a kernel is monolithic or micro in nature; I was comparing this system to general-purpose OSs that have a bunch of processes constantly running, many of which frequently switch in because they need to do simple "book-keeping" tasks or "check-and-sleep" tasks if nothing else.
OSwhatever wrote:As you bring up that your goal is to reduce power consumption, there are systems where your model can be beneficial. For example, a system with one less powerful but extremely power-efficient CPU core and then several bigger cores that are powerful but draw a lot of power. Having only the power-efficient CPU handle the interrupts might be good in this case, as that can prevent you from powering up the bigger ones; it is also the core that wakes up the others if necessary.
Ooohhh... definitely going to research that... building support for that into the kernel may end up being fantastic. Thanks!
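For what it's worth, on existing systems the "steer the interrupts at the efficient core" part is mostly just IRQ affinity. A rough Linux-flavoured illustration (the IRQ number and mask below are placeholders, not real values):

/* Route one device interrupt to CPU 0 by writing a CPU mask to
 * /proc/irq/<n>/smp_affinity.  IRQ 42 is a placeholder. */
#include <stdio.h>

int main(void)
{
    FILE *f = fopen("/proc/irq/42/smp_affinity", "w");
    if (!f) {
        perror("smp_affinity");
        return 1;
    }
    fputs("1\n", f);   /* hex mask 0x1 = CPU 0 only (the efficient core) */
    return fclose(f) ? 1 : 0;
}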
OSwhatever wrote:[snip... more (good) stuff related...]
Definitely a good idea. I should probably stub out the basic separate architectures in the build tree. For the time being though, I'm focused on cheap commodity server hardware... the kind of stuff that gets deployed 10-50 servers at a time. That means x86 (in fact, I'm only focusing on long-mode right now). Simple market fact. Wonderful idea though (at least to experiment with).

The whole problem so far (what you're trying to help address here) is that I have no idea how efficient the main CPU0 stuff will be if I take this road, regardless of architecture. It might be far more expensive than I would like, and might end up being a major bottleneck. I keep going back and forth in my head about how I should approach it.

Thanks for some great input! :)

Moving on...
Brendan wrote:
bwat wrote:Absurd and childish. Why can't you accept that the OP is designing this system and he/she just might have needs that you do not share. If the OP wants to build a dedicated system then he/she is free to do so, and very possibly justified in doing so.

It surely isn't beyond your imagination to envisage a scenario where a dedicated OS is going to be preferable to a general OS.
It is beyond my imagination - I can't think of a single sane reason for "single application" for a modern OS (regardless of whether the OS is intended for a special purpose or not).
Sigh... Brendan, my point exactly, see my previous comments in this post.
Brendan wrote:Do you honestly think that Google or Facebook use "single application only" OSs? Why would they bother when any "as many application's as you like" OS is capable of only running one application?
First, I know people at Facebook (but not at Google), and yes... what I described (one single purpose per system, and X number of those systems load balanced and failover capable) is exactly what they do. Everyone else I know does it too. You've never even run a dig to find out that a company has a large round-robin DNS pool? What exactly do you think is behind that pool? You think those IPs point directly to the "end" servers? No, they point to load balancers. What do you think is behind those load balancers? An array of servers each with static HTTP, dynamic HTTP, DNS, DB, MTA, backup, etc... all managing to be magically in sync with each other? It shocks me that you seriously haven't even given this any thought, even if you don't have any direct experience with it. It was actually a rhetorical question and I didn't expect you to actually come back and literally think that large infrastructures don't do this. I think it's completely common knowledge for most nerds, even ones not involved in server infrastructure themselves. Basically you just screamed out to the world, "I have no idea how the modern internet operates at all!" That's perfectly fine (there's nothing wrong with having a lack of knowledge about something), but don't pretend like you do. That's really what's irritating me right this second. You act like a know-it-all.

Sigh... more negative crap... moving on.
Brendan wrote:The problem with running multiple operating systems under virtual machines on the same physical hardware is that those OSs don't/can't cooperate to ensure that the most important work is done before less important work. To allow more important work to be done before less important work, it's far better to run both applications under the same OS. This has the additional benefit of avoiding the overhead of virtualisation.
Ummm... correct, if it mattered. That's presuming that both OSs are general-purpose SMPs but they don't know how to talk to each other. Where is it even implied that either of those conditions is met? The only thing implied was that the host OS is probably general purpose (Xen virtualization under Linux or what-not). Man, you like arguing without thinking. Is this a hobby for you?
Brendan wrote:I didn't say that I expect servers to be general purpose; only that you can have special purpose servers without pointlessly crippling a kernel for no reason.
Holy monkeys... seriously. Doing these responses in reverse is amazing. Not only have you been making that clear from where I picked up on this response, you were just as adamant about that before this very statement. Here, let me google, no wait, quote that for you...
Brendan wrote:
mrstobbe wrote:
Brendan wrote:Finally; there's a severe difference in magnitudes here. For example, to avoid a task switch that might cause a few microseconds of delays, you're considering wasting entire CPUs for many seconds.
Wait, what? Explain if you could please. I can't even remotely envision this at all.
Ok, imagine a computer being used for a HTTP server. There's 4 CPUs and 4 threads, and all CPUs spend 95% of their time waiting for network and/or waiting for disk.

Now imagine you also want to run an FTP server too. You've got 4 CPUs that are 95% wasted, but you can't use that wasted CPU time for the FTP server because the OS can't handle that; so you have to go and buy a complete second computer so that the FTP server can waste 95% of 4 more CPUs.

To avoid wasting a tiny little bit of CPU time (with task switches), you're wasting massive truckloads of CPU time.
That's just the very beginning. Just enjoy reading your own posts from start to even where you made the comment "I didn't say that I expect servers to be general purpose" in this thread. Seriously, get a glass of wine or whatever and take your time. So you're saying, a server should be able to do quite a few special-purpose things (let's just say, picking some based on your posts here, HTTP server, backup, and pre-building some sort of cache). Right? So you think I, or anyone, would design an OS specifically to do those three things? No, that's called a general-purpose OS. To go back to quoting you, "Now imagine you also want to run an FTP server too"... yeah, well it's a general purpose server at that point. This is exactly your argument. I'm seriously... speechless... that you've been making this argument and then suddenly, out of the blue, contradict yourself, only to switch back to making that same exact argument.

Are you for real?

Holy crap, again... moving on (please be something good, please be something good)... evidently not...
Brendan wrote:
mrstobbe wrote:
Brendan wrote:If one process starts a second process and waits until the child process terminates, then that's 2 processes (where only one is given CPU time, but both share memory, have file handles, etc) and not a single process. Of course if you're planning to have drivers running in their own virtual address spaces (as processes) it's not really single process anyway; and you're effectively doing "multiple processes and multi-tasking, with different obscure limits on what different types of processes can do".
Pure semantics... if one process starts another but can't execute (in terms of processor time... can't see the light of day again) until the other one exits, it's still a mono-process system. Again, pure semantics.
If an elephant has ears, legs and a heartbeat, and you temporarily pause its heartbeat, does the elephant cease to exist? Of course not.

If a process consists of a virtual address space, one or more threads, and any number of other things (file handles, signal handlers, etc); and you pause its threads, does the process cease to exist?

You can call it pure semantics if you like; as long as you understand that the semantics you've been using to describe your ideas are extremely confusing because you've mangled the terminology everyone else uses.
You're an idiot. Officially. Let me give you a medal. By your logic everyone should think of a kernel as a process as well. Oh wait, there's the BIOS to consider too. Maybe the video BIOS as well? Crap, we forgot about that custom RAID card with its own stuff going on. Therefore, since the dawn of man, there has never been a mono-processing system. Of course... it sees the light of day, something has a resource and therefore it's automatically considered a process, therefore it's a multi-process system. Never mind the fact that to be a multi-process system, you have to schedule those processes. Sorry, would the term "task" work better for you there? How about multi-tasking system? Was MS-DOS a multi-tasking (using that term just for you) system because it had COMMAND.COM which could then execute something else? It's freaking semantics. You're the type of person who doesn't listen to anyone else. You hear the words, but don't actually listen to what's being said because all you care about is when it's your turn to talk so you can attack those words. I bet it makes you feel smart a lot of the time, doesn't it? My ideas seem to be "extremely confusing" to you, and only you. Not a lot of other people have chimed in, but they also haven't reverted into some sort of primal "hey, look at me! i'm smart! you know how you can tell? you're wrong about everything no matter what it is!" state over the basic idea simply because they didn't understand it (or, as I gather by now, a lot of basics about modern usage of computers). I asked you what I should do to clarify things and all you've done is come back and act like a know-it-all and look like a prick. If you were confused, why didn't your first reply start with, "I'm confused. This, this, and this doesn't make sense to me. By 'term a' do you mean 'definition that I think is term a'? Can you please explain some of your ideas to me? I think I understand this and that, but think you might be wrong about this because of that, that, and that." How do you think I would've responded? You think this would have been a fun and productive dialog? Instead your first two posts were you trying to puff up and look like you knew what you were talking about. You subsequently, and systematically, followed up with the most brain-dead, mostly off-topic, argumentative-just-for-the-sake-of-arguing replies through the duration. bwat is right about you. I tried to be nice and even (several times) "agreed" with you in hopes of getting a better conversation started, but please stop posting in this thread.

Major sigh. If you can't have a reasonable conversation about OS design on one of the few forums in the world that deals only with that subject, then there's something wrong. I mean, I can be a bit of a jerk when I'm in a bad mood posting, but Brendan, you're just a troll. I don't think you even know it either.

Sadly, it looks like that's the end of the replies too. I was hoping for something positive to end on. OSwhatever, at least there's your comments about the high/low power CPU thing. Productive. :)

I am being genuine in asking for comments, suggestions, concerns, red-flags, criticisms, etc. Please share them. I sincerely apologize for having to take up almost an entire post addressing Brendan in a progressively negative manner, but I think it was warranted. Brendan is hopefully an exception to the rule here.

EDIT: Cleaned up a tiny bit of bad grammar (stupid brain :))
rdos
Member
Posts: 3297
Joined: Wed Oct 01, 2008 1:55 pm

Re: OdinOS: I'd love some design feedback

Post by rdos »

mrstobbe wrote:You're wrong. I'm not going to be nice anymore. You're flat-out-unarguably wrong. Remember when I said, "correct me if I'm wrong" when addressing that earlier, I was just being socially "nice". I followed up with being nice and not calling you flat-out wrong, but... yes... you're wrong. Any unnecessary cache miss or invalidation, any unnecessary inefficiencies in branch prediction, any unnecessary major/minor page faults, any at all, even the slightest bit, is needless expense (and in many cases, very expensive). Why you keep stating that it's negligible simply because you're confusing the idea that having a lot of things that a system can do is worth the cost of context switching with your particular design vision, completely ignoring the very fact that context switching, but it's very inherent nature is simply expensive. Stop confusing a big picture design idea (in your head) with the in-the-moment reality during a context switch. I'm looking at it from a pure "inherent nature" perspective, and you're looking at it from a "acceptable/negligible cost of doing business for what I want to achieve" perspective... stop confusing the two.
Yes, he is wrong, and that's why traditional microkernels will never be successful on the larger market. Doing isolation between drivers with address-space switches is so costly that it never will be able to compete with simpler designs.

And the 4-step page lookup process in x86-64 is very costly even if most of it has been implemented in hardware. And this hardware still consumes power, even if it doesn't affect performance a lot during normal conditions.
Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!

Re: OdinOS: I'd love some design feedback

Post by Brendan »

Hi,
mrstobbe wrote:
Brendan wrote:"It reduces the number of TLB misses and/or cache misses (compared to "one or more applications" when only one application is being run)". Note: This is false.
You're wrong. I'm not going to be nice anymore. You're flat-out-unarguably wrong. Remember when I said, "correct me if I'm wrong" when addressing that earlier, I was just being socially "nice". I followed up with being nice and not calling you flat-out wrong, but... yes... you're wrong. Any unnecessary cache miss or invalidation, any unnecessary inefficiencies in branch prediction, any unnecessary major/minor page faults, any at all, even the slightest bit, is needless expense (and in many cases, very expensive). Why you keep stating that it's negligible simply because you're confusing the idea that having a lot of things that a system can do is worth the cost of context switching with your particular design vision, completely ignoring the very fact that context switching, but it's very inherent nature is simply expensive. Stop confusing a big picture design idea (in your head) with the in-the-moment reality during a context switch. I'm looking at it from a pure "inherent nature" perspective, and you're looking at it from a "acceptable/negligible cost of doing business for what I want to achieve" perspective... stop confusing the two.
Please just stop and think about it. If an OS is only running one process then there is never any need for any task switches (regardless of whether the OS supports running more than one process or not).
mrstobbe wrote:
Brendan wrote:"It makes something easier for the end user (or the person using the OS as part of an embedded system, or whatever). Note: This is false too.
Who said that? You just did, but I didn't see anyone else in this thread say anything along those lines.
Because bwat failed to provide reasons for "single application", I provided my own. Perhaps you should've read the conversation between bwat and me before replying to it?
mrstobbe wrote:
Brendan wrote:"It reduces the size/complexity of the OS". Note: In general this might have been a valid reason (e.g. back when CPUs were 8-bit); but not for the OS that mrstobbe described (that runs drivers as processes and needs the size/complexity involved anyway).
No, this is still absolutely valid. It will never be invalid unless computers suddenly turn into processing gods who magically give the answer 42 before you even think up the question. Any reduction of complexity adds to the overall efficiency of the system.
Let's make an OS like Linux less complex by removing all disk caches and removing the complex fine-grained locking (go back to the "big kernel lock"). Let's remove the complex trees used for finding names in the VFS and replace them with a simple linear search through a linked list. The idea of using the CPUs' time stamp counters (and keeping them calibrated and synchronised) is too complex, so let's just go back to using the PIT (and get rid of local APIC timer support and HPET too). SSE/AVX is a little more complex, so get rid of all that. All of these things will reduce complexity, and therefore will make Linux a lot more efficient!

Of course I'm being sarcastic. A reduction of complexity does not add to the overall efficiency.
mrstobbe wrote:As for my design, it can clearly be incredibly simpler (even with a powerful event-driven API) than any common modern general-purpose OS.
For your design; your scheduler will need to handle task switching between device drivers, so not allowing the same code to be used for switching between "application processes" doesn't avoid any complexity (and adds a little more complexity, because your scheduler has to know if a process is "the process" or device driver, file system, etc).
mrstobbe wrote:At this point, I really don't think you have any kind of a solid grasp of even the basics of my design, so it's pretty hard for you to make statements like that and look reasonable.
If you wouldn't mind pointing out where your description was inadequate and caused me any additional confusion (beyond your original misuse of terminology that I think I've already corrected), then that could help a lot.
mrstobbe wrote:Well... that was a whole bunch of negative writing...
Please understand that the "negative writing" was a reply to bwat and had nothing to do with you in the first place...
mrstobbe wrote:
Brendan wrote:Do you honestly think that Google or Facebook use "single application only" OSs? Why would they bother when any "as many application's as you like" OS is capable of only running one application?
First, I know people at Facebook (but not at Google), and yes... what I described (one single purpose per system, and X number of those systems load balanced and fail over capable) is exactly what they do.
I did not ask if Google or Facebook use "single purpose systems" (they do). I asked if Google or Facebook use "single application only" OSs (they do not).

Google use a (modified version of) Linux (which is obviously capable of running multiple processes). Facebook also use a (modified version of) Linux (which is obviously capable of running multiple processes). They are both proof that an OS capable of running multiple processes can be used for single purpose systems.

Please note that I checked to ensure I was right before I asked "Do you honestly think that Google or Facebook use "single application only" OSs?". There are plenty of web sites and news articles that will confirm this - you'd only need to do a few web searches if you have any doubts.
mrstobbe wrote:
Brendan wrote:The problem with running multiple operating systems under virtual machines on the same physical hardware is that those OSs don't/can't cooperate to ensure that the most important work is done before less important work. To allow more important work to be done before less important work, it's far better to run both applications under the same OS. This has the additional benefit of avoiding the overhead of virtualisation.
Ummm... correct if it mattered. That's presuming that both OSs are general purposes SMPs but they don't know how to talk to each other. Where is it even implied that either of those conditions are met. The only thing implied was that the host OS is probably general purpose [Xen virtualization under Linux or what-not]). Man you like arguing without thinking. This a hobby for you?
This part of the conversation started by you claiming that running normal OSs under virtual machines on the same physical hardware is silly (note: your actual words were "The virtualization point is a great one actually because you're essentially already in an SMP and the last thing you need is another SMP under that SMP"). I agree with this (I do think it's silly). However; your conclusion was that people should run "single application only" OSs under virtual machines on the same physical hardware. This is why I had to explain why it's a bad idea (due to lack of control over priorities between the applications), and better to use a "many applications" OS without any virtual machine at all (where you can give different applications different priorities).
mrstobbe wrote:
Brendan wrote:I didn't say that I expect servers to be general purpose; only that you can have special purpose servers without pointlessly crippling a kernel for no reason.
Holy monkeys... seriously. Doing these responses in reverse is amazing. Not only have you been making that clear from where I picked up on this response, you were just as adamant about that before. Here, let me google, no wait, quote that for you...
mrstobbe wrote:Are you for real?
Yes. The problem is that you seem to get "single application OS", "general purpose OS" and "special purpose server" all mixed up.

To make it easier for you to understand, there are 3 things I am saying:

a) There's nothing wrong with using a general purpose OS for a special purpose server (especially if the general purpose OS can be installed in a "minimal" configuration without GUI or unnecessary applications). Most single purpose servers use general purpose OSs like this.

b) There's no sane reason to use a "single application OS" for a special purpose server at all.

c) If you don't use a "single application OS" for special purpose servers, then you can use the same OS for "multi-purpose servers" (and avoid wasting resources).


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!

Re: OdinOS: I'd love some design feedback

Post by Brendan »

Hi,
rdos wrote:
mrstobbe wrote:You're wrong. I'm not going to be nice anymore. You're flat-out-unarguably wrong. Remember when I said, "correct me if I'm wrong" when addressing that earlier, I was just being socially "nice". I followed up with being nice and not calling you flat-out wrong, but... yes... you're wrong. Any unnecessary cache miss or invalidation, any unnecessary inefficiencies in branch prediction, any unnecessary major/minor page faults, any at all, even the slightest bit, is needless expense (and in many cases, very expensive). Why you keep stating that it's negligible simply because you're confusing the idea that having a lot of things that a system can do is worth the cost of context switching with your particular design vision, completely ignoring the very fact that context switching, but it's very inherent nature is simply expensive. Stop confusing a big picture design idea (in your head) with the in-the-moment reality during a context switch. I'm looking at it from a pure "inherent nature" perspective, and you're looking at it from a "acceptable/negligible cost of doing business for what I want to achieve" perspective... stop confusing the two.
Yes, he is wrong, and that's why traditional microkernels will never be successful on the larger market. Doing isolation between drivers with address-space switches is so costly that it never will be able to compete with simpler designs.

And the 4-step page lookup process in x86-64 is very costly even if most of it has been implemented in hardware. And this hardware still consumes power, even if it doesn't affect performance a lot during normal conditions.
Mrstobbe was planning to do "separate virtual address space for each device driver" from the very beginning; and neither of us was talking about "micro-kernel vs. monolithic" at all.


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
Combuster
Member
Posts: 9301
Joined: Wed Oct 18, 2006 3:45 am
Libera.chat IRC: [com]buster
Location: On the balcony, where I can actually keep 1½m distance

Re: OdinOS: I'd love some design feedback

Post by Combuster »

rdos wrote:Yes, he is wrong, and that's why traditional microkernels will never be successful on the larger market. Doing isolation between drivers with address-space switches is so costly that it never will be able to compete with simpler designs.
Not successful? Not competitive? You, sir, are a liar.
"Certainly avoid yourself. He is a newbie and might not realize it. You'll hate his code deeply a few years down the road." - Sortie
[ My OS ] [ VDisk/SFS ]
mrstobbe
Member
Posts: 62
Joined: Fri Nov 08, 2013 7:40 pm

Re: OdinOS: I'd love some design feedback

Post by mrstobbe »

rdos wrote:
mrstobbe wrote:You're wrong. I'm not going to be nice anymore. You're flat-out-unarguably wrong. Remember when I said, "correct me if I'm wrong" when addressing that earlier, I was just being socially "nice". I followed up with being nice and not calling you flat-out wrong, but... yes... you're wrong. Any unnecessary cache miss or invalidation, any unnecessary inefficiencies in branch prediction, any unnecessary major/minor page faults, any at all, even the slightest bit, is needless expense (and in many cases, very expensive). Why you keep stating that it's negligible simply because you're confusing the idea that having a lot of things that a system can do is worth the cost of context switching with your particular design vision, completely ignoring the very fact that context switching, but it's very inherent nature is simply expensive. Stop confusing a big picture design idea (in your head) with the in-the-moment reality during a context switch. I'm looking at it from a pure "inherent nature" perspective, and you're looking at it from a "acceptable/negligible cost of doing business for what I want to achieve" perspective... stop confusing the two.
Yes, he is wrong, and that's why traditional microkernels will never be successful on the larger market. Doing isolation between drivers with address-space switches is so costly that it never will be able to compete with simpler designs.

And the 4-step page lookup process in x86-64 is very costly even if most of it has been implemented in hardware. And this hardware still consumes power, even if it doesn't affect performance a lot during normal conditions.
This is why the novel idea here is to stick to the one multi-purpose CPU (CPU0)... again, it's a risk and certainly might not pan out. Any experience with this? Have lessons learned you'd like to share? Best case is that everything is perfectly balanced (incredibly unlikely). Worst case is that CPU0 is a bottleneck (at which point, the idea goes out the window in its current state). I'm hoping for an "average case" in which, at completely full load, the bottleneck is usually the worker tasks (like a DB where essentially all high-impact data is properly cached, or a static HTTP server where all high-impact files have been buffered), but CPU0 is still busy enough to not be "wasted". Think tens of thousands or hundreds of thousands of concurrent static HTTP requests a second across 8 cores. The first core does all the "dirty" work (dealing with uncached stuff, handling IRQs, etc), the other 7 are the HTTP workers (again, imagine, at that load, that nearly all requests in a given slice of time are primed)... it just comes down to processing power at that point (at least for the workers). It's a gamble in terms of design.

I agree about the whole PML4 thing. I'm planning on only using the full 4-step paging setup if it's necessary (if the hardware even has that much RAM; controllable as a kernel arg). The idea is to keep the high bits 0 so that any paging mode makes sense no matter what the compiler did. Basically, the smallest table chain possible is planned to be the default. I say planning, because I just started working on getting paging set up a couple of days ago, so... there's a lot of work to even begin to get dynamic about it (and I have a full time job :)). Paging still sucks in terms of TLB, but at least most cores wouldn't be missing constantly.
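For reference, this is just the textbook index breakdown for the 4-level walk (nothing OdinOS-specific; the address is an arbitrary example). Keeping addresses low means only entry 0 of the PML4 and of the PDPT is ever touched, which is why the table chain can stay so small:

/* How a 48-bit x86-64 virtual address splits across PML4/PDPT/PD/PT. */
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint64_t vaddr = 0x00000000001ff123ULL;   /* an example "low" address */

    unsigned pml4 = (vaddr >> 39) & 0x1ff;    /* bits 47..39 */
    unsigned pdpt = (vaddr >> 30) & 0x1ff;    /* bits 38..30 */
    unsigned pd   = (vaddr >> 21) & 0x1ff;    /* bits 29..21 */
    unsigned pt   = (vaddr >> 12) & 0x1ff;    /* bits 20..12 */
    unsigned off  =  vaddr        & 0xfff;    /* bits 11..0  */

    printf("PML4=%u PDPT=%u PD=%u PT=%u offset=0x%x\n",
           pml4, pdpt, pd, pt, off);
    return 0;
}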

Back to your point about microkernels... and of course, the general idea... any thoughts, ideas, or experience with this? Give me more :) I'm starving for comments like yours.
Kevin
Member
Posts: 1071
Joined: Sun Feb 01, 2009 6:11 am
Location: Germany

Re: OdinOS: I'd love some design feedback

Post by Kevin »

Brendan wrote:The problem with running multiple operating systems under virtual machines on the same physical hardware is that those OSs don't/can't cooperate to ensure that the most important work is done before less important work. To allow more important work to be done before less important work, it's far better to run both applications under the same OS. This has the additional benefit of avoiding the overhead of virtualisation.

For a simple example, imagine running an FTP server and a HTTP server on a physical machine; where the HTTP server is more important (response times) and the FTP server is less important (as FTP has always tolerated "varying upload/download speed" well). If you run both in their own little "single application" OS, and put both of the OSs inside virtual machines on the same physical machine; then you're screwed - they will share CPU and network bandwidth equally (even though it is less important, the FTP server is allowed to take CPU time and network bandwidth away from the HTTP server).
Yes, if you don't configure any limits, they will share CPU time and network bandwidth equally. Both the VM and a normal application. And if you do configure limits, suddenly they don't share equally any more, but as you configured them. Both the VM and a normal application.

Funny, isn't it?
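(If anyone wants the concrete version: on a current Linux host this is just a weight per control group, and it applies the same way whether the member of the group is a VM's qemu process or an ordinary server process. The group names, paths and numbers below are purely illustrative.)

/* Give the "http" group five times the CPU weight of the "ftp" group
 * under cgroup v2.  Paths and values are placeholders. */
#include <stdio.h>

static int set_weight(const char *path, const char *weight)
{
    FILE *f = fopen(path, "w");
    if (!f)
        return -1;
    fputs(weight, f);
    return fclose(f);
}

int main(void)
{
    set_weight("/sys/fs/cgroup/http/cpu.weight", "500\n");
    set_weight("/sys/fs/cgroup/ftp/cpu.weight", "100\n");
    return 0;
}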
Kevin wrote:Yes, but I think Brendan still has a valid point: Your drivers are processes that exist all the time and run in parallel with the single application. So you already have to have some kind of multitasking in order to run the drivers. (And if Brendan and I agree on something, there are chances it is right - because it doesn't happen too often.)

For this reason, a microkernel without multitasking is probably a contradiction in itself. The difference that you can make compared to the "normal" OS is that you don't schedule based on a timer, but only on events like IRQs or explicit calls into a driver function.
Okay, I guess now you know why it doesn't happen too often that Brendan and I agree... ;) I still think that this is the crucial point in the whole discussion: How do you deal with the multiple processes that you do get with your drivers and how do they play together with your application process?

If I understood it correctly from the latest few posts, you're going to run all drivers on CPU 0 and the application threads on CPU 1-n. Is this correct? If so, interfacing with a driver always means that you need to switch to a different CPU. Doesn't this hurt your latencies? If everything can indeed be done asynchronously, at least the throughput should be okay, but if your static HTTP server only delivers small pages instead of huge files, you're probably more interested in latency.
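(To make the asynchronous case concrete: I'd expect something like a per-application-CPU single-producer/single-consumer ring that CPU 0 drains. The sketch below is purely illustrative, not anything from OdinOS.)

/* One lock-free SPSC ring per application CPU: that CPU pushes driver
 * requests, the driver CPU (CPU 0) pops them on its next pass. */
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SLOTS 256                       /* power of two */

struct drv_request { uint32_t op; uint64_t arg; };

struct drv_ring {
    _Atomic uint32_t   head;                 /* written by the application CPU */
    _Atomic uint32_t   tail;                 /* written by the driver CPU      */
    struct drv_request slot[RING_SLOTS];
};

/* Application CPU side: queue a request without blocking or locking. */
static bool ring_push(struct drv_ring *r, struct drv_request req)
{
    uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);
    if (head - tail == RING_SLOTS)
        return false;                        /* full; caller retries later */
    r->slot[head % RING_SLOTS] = req;
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return true;
}

/* Driver CPU side: drain whatever has arrived since the last pass. */
static bool ring_pop(struct drv_ring *r, struct drv_request *out)
{
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);
    if (tail == head)
        return false;                        /* nothing pending */
    *out = r->slot[tail % RING_SLOTS];
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return true;
}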

Also, I think it requires that your drivers do nothing CPU-heavy. You won't be able to implement compressed or encrypted file systems, for example, without hurting other threads that just want to send some network packets at the same time, when you force both drivers to run on the same CPU.

Wouldn't it make more sense to run the drivers in the CPU of their caller, and perhaps also distribute IRQ handlers across the CPUs?

Then, of course, you would have multiple processes on each CPU, and the limitation to one application process becomes rather arbitrary. So maybe another question is what advantages you get from the limitation.
Developer of tyndur - community OS of Lowlevel (German)
Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!

Re: OdinOS: I'd love some design feedback

Post by Brendan »

Hi,
Kevin wrote:
Brendan wrote:The problem with running multiple operating systems under virtual machines on the same physical hardware is that those OSs don't/can't cooperate to ensure that the most important work is done before less important work. To allow more important work to be done before less important work, it's far better to run both applications under the same OS. This has the additional benefit of avoiding the overhead of virtualisation.

For a simple example, imagine running an FTP server and a HTTP server on a physical machine; where the HTTP server is more important (response times) and the FTP server is less important (as FTP has always tolerated "varying upload/download speed" well). If you run both in their own little "single application" OS, and put both of the OSs inside virtual machines on the same physical machine; then you're screwed - they will share CPU and network bandwidth equally (even though it is less important, the FTP server is allowed to take CPU time and network bandwidth away from the HTTP server).
Yes, if you don't configure any limits, they will share CPU time and network bandwidth equally. Both the VM and a normal application. And if you do configure limits, suddenly they don't share equally any more, but as you configured them. Both the VM and a normal application.

Funny, isn't it?
Sounds great if load is static and latency isn't important. Sadly, load is usually constantly changing and latency can be important; and even within one "single application" VM different things are more or less important than others.

For example; how can the host OS know that a disk read from one virtual machine's guest trying to pre-fetch data into its VFS cache is less important than a disk read from another VM's guest that wants to "page-in" data for a critical application's memory mapped file; when all the host OS can see is a few disk reads with no information to help make the best "disk IO scheduling" decision?


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
Kevin
Member
Posts: 1071
Joined: Sun Feb 01, 2009 6:11 am
Location: Germany

Re: OdinOS: I'd love some design feedback

Post by Kevin »

You're starting again to change the requirements you made...

You said that "even though it is less important, the FTP server is allowed to take CPU time and network bandwidth away from the HTTP server", and this simply isn't true. A VM has the same ways of dealing with it as an OS directly running both servers has. The reason is that, given two read requests, one from the HTTP server and one from the FTP server, neither setup can know what the application is going to use the data for.

The only reason why in your new scenario the OS can be a bit more clever is because it was the OS itself who issued the VFS prefetch read, so it can know that it's less important. Fortunately, there is a way to avoid the problem in the guest if you're really worried about this: Tell the guest OS that it shouldn't prefetch in the first place and let the host do it. But I wouldn't even bother...
Developer of tyndur - community OS of Lowlevel (German)