Reinventing Unix is not my problem

Discussions on more advanced topics such as monolithic vs micro-kernels, transactional memory models, and paging vs segmentation should go here. Use this forum to expand and improve the wiki!
User avatar
eekee
Member
Member
Posts: 892
Joined: Mon May 22, 2017 5:56 am
Location: Kerbin
Discord: eekee
Contact:

Re: Reinventing Unix is not my problem

Post by eekee »

spotracite wrote:A specific quote, however, from 1987 on the Usenet, draws my attention: "Those who do not understand Unix are condemned to reinvent it, poorly."
In 1987, there were a great many operating systems worse than Unix. I think only a few OSs were better, and they mostly cost Big Money or came with Big Money hardware. OS/9 is one of the "worse" OSs of that era; I recall someone complaining about having to work with it on this forum. It's a bit hard to search for without getting Mac references though. Some of the worse OSs made life difficult by trying to deny you opportunities to do something wrong. Thus, you had to work very hard to do anything interesting. Or their APIs would just be badly designed. A common issue was taking features which could complement each other & making it hard to use them together. Or just to invent a whole different interface for every little feature. We've come a long way.

Edit: I forgot the ever-present bugs. Before the Internet, many of the cheaper OSs wouldn't be updated regularly. When they were, old bugs would be replaced by new ones.

As for my plans, they are, as always, different. How different? Well, for example, I could say "What are these 'directories' of which you speak?" :mrgreen: But that's just one weird tangent I might go on. I'm very unlikely to ever implement a normal window system. I'd only do that for other peoples' sake, not my own.
Kaph — a modular OS intended to be easy and fun to administer and code for.
"May wisdom, fun, and the greater good shine forth in all your work." — Leo Brodie
kzinti
Member
Member
Posts: 898
Joined: Mon Feb 02, 2015 7:11 pm

Re: Reinventing Unix is not my problem

Post by kzinti »

eekee wrote:OS/9 is one of the "worse" OSs of that era; I recall someone complaining about having to work with it on this forum.
I loved OS/9 back in the days of the CoCo 2 and CoCo 3. I think it was great at the time. Being able to multitask on a home computer at the time was quite amazing. Consider that the CoCo 2 didn't even have an MMU and that the CoCo 3's MMU only had a granularity of 8K (8 banks of 8 KB = 64 KB accessible at a time).

I believe nullplan is still working with some descendant/modern version of OS-9/68K and it apparently hasn't aged so well.
eekee wrote:It's a bit hard to search for without getting Mac references though.
Search for "microware os-9" or "coco os-9".
nullplan
Member
Member
Posts: 1801
Joined: Wed Aug 30, 2017 8:24 am

Re: Reinventing Unix is not my problem

Post by nullplan »

kzinti wrote:I believe nullplan is still working with some descendant/modern version of OS-9/68K and it apparently hasn't aged so well.
Oh dear, did it ever. Although all the ports to other architectures call themselves OS-9000, as if a couple of zeroes would make up for the lack of features. As it currently stands it is the worst blend of DOS and UNIX. Actually more like a DOS with multi-tasking support.
eekee wrote:A common issue was taking features which could complement each other & making it hard to use them together.
Yeah, OS-9 has you covered there. I think it is pretty awful design to implement the BSD socket API on top of an OS that only allows up to 32 paths (OS-9 equivalent of FDs) per process. I will need to write a TCP service that listens to a bunch of ports, and as of yet I will have to launch it multiple times to actually handle all 32 ports (because of course, stdin, stdout, and stderr all count towards those 32 paths as well).
Carpe diem!
nexos
Member
Member
Posts: 1081
Joined: Tue Feb 18, 2020 3:29 pm
Libera.chat IRC: nexos

Re: Reinventing Unix is not my problem

Post by nexos »

spotracite wrote:A specific quote, however, from 1987 on the Usenet, draws my attention: "Those who do not understand Unix are condemned to reinvent it, poorly."
Well, I believe the writer probably was talking about the Unix variants that started sprouting up (cough cough Linux) that were poor representations of the simplicity that had made Unix. Now, computers did get more powerful, but some Unixes just made all out awful design choices then. Like SVR4's VM system. To allocate memory it searched through a linked list. That was an idea I came up with when I was a rookie OS developer! Don't even get me started on Linux, and especially GNU. They are perhaps more bloated then Windows in all of its bloat. Using a red black tree in the scheduler? I'm being serious, I really do want somebody to explain the benefits of that over an O(1) queue. So, to close modern Unixes are nothing like the old Unixes. The last Unix that held onto the philosophy is SVR3 / 4.3 BSD, and maybe Mach and Solaris
"How did you do this?"
"It's very simple — you read the protocol and write the code." - Bill Joy
Projects: NexNix | libnex | nnpkg
vvaltchev
Member
Member
Posts: 274
Joined: Fri May 11, 2018 6:51 am

Re: Reinventing Unix is not my problem

Post by vvaltchev »

nexos wrote:Don't even get me started on Linux, and especially GNU. They are perhaps more bloated then Windows in all of its bloat.
I politely disagree :-)
nexos wrote:Using a red black tree in the scheduler? I'm being serious, I really do want somebody to explain the benefits of that over an O(1) queue.
Well, that's because processes or, better, tasks have not only a fixed priority, but also a dynamic priority that depends on how they behaved. Also, with simple "fair" schedulers, without dynamic priority, a low-priority process may never get to run, until there is a higher-priority runnable process. That's OK for real-time processes, but definitively not for regular ones as lower-priority processes might suffer from starvation for an indefinite amount of time. What we actually want from a completely fair scheduler instead, is that, no matter what, all processes are guaranteed to get some % of the CPU time, in proportion to their priority.

Assume there are just two processes on the system, process A with prio 9 and process B with prio 11. Assume that higher prio number means higher priority (on linux that is the opposite for regular processes). Assume also that both A and B want to run all the time because they are completely CPU-bound. With a simple scheduler, B will run forever (or, until ends) and A will never run before B dies, despite B have just slightly higher priority than A. What we want instead, is A to get 9/20 of the CPU time and B to get 11/20 of the CPU time. If, as a process runs and consumes its time-slices it's dynamic priority gets lower, at some point A will preempt the higher-priority process B and will run for some time. That's what's desirable, for most of the processes on desktop and server systems.

Because a process' total computed priority (static + dynamic) changes all the time, a red-black tree is used to select the process with the highest priority. Real-time scheduling instead is much simpler and it's done in O(1).

Of course, that was a very rough explanation. For a much better one, read the 4th chapter "Process scheduling" of the Linux Kernel Development book, 3rd edition.
Tilck, a Tiny Linux-Compatible Kernel: https://github.com/vvaltchev/tilck
nexos
Member
Member
Posts: 1081
Joined: Tue Feb 18, 2020 3:29 pm
Libera.chat IRC: nexos

Re: Reinventing Unix is not my problem

Post by nexos »

vvaltchev wrote:I politely disagree :-)
I don't mind! I don't want everyone having the same opinion as me.
vvaltchev wrote:Well, that's because processes or, better, tasks have not only a fixed priority, but also a dynamic priority that depends on how they behaved. Also, with simple "fair" schedulers, without dynamic priority, a low-priority process may never get to run, until there is a higher-priority runnable process. That's OK for real-time processes, but definitively not for regular ones as lower-priority processes might suffer from starvation for an indefinite amount of time. What we actually want from a completely fair scheduler instead, is that, no matter what, all processes are guaranteed to get some % of the CPU time, in proportion to their priority.

Assume there are just two processes on the system, process A with prio 9 and process B with prio 11. Assume that higher prio number means higher priority (on linux that is the opposite for regular processes). Assume also that both A and B want to run all the time because they are completely CPU-bound. With a simple scheduler, B will run forever (or, until ends) and A will never run before B dies, despite B have just slightly higher priority than A. What we want instead, is A to get 9/20 of the CPU time and B to get 11/20 of the CPU time. If, as a process runs and consumes its time-slices it's dynamic priority gets lower, at some point A will preempt the higher-priority process B and will run for some time. That's what's desirable, for most of the processes on desktop and server systems.

Because a process' total computed priority (static + dynamic) changes all the time, a red-black tree is used to select the process with the highest priority. Real-time scheduling instead is much simpler and it's done in O(1).

Of course, that was a very rough explanation. For a much better one, read the 4th chapter "Process scheduling" of the Linux Kernel Development book, 3rd edition.
That makes sense. I personally implement KISS in my projects, so an O(1) queue is better for my direction.
"How did you do this?"
"It's very simple — you read the protocol and write the code." - Bill Joy
Projects: NexNix | libnex | nnpkg
vvaltchev
Member
Member
Posts: 274
Joined: Fri May 11, 2018 6:51 am

Re: Reinventing Unix is not my problem

Post by vvaltchev »

nexos wrote:That makes sense. I personally implement KISS in my projects, so an O(1) queue is better for my direction.
I totally understand that. I also try to keep Tilck as simple as possible. Just, "big projects" sometimes have they own reasons for complex code.

Another example: how do you implement memcpy()? It can be done in C with a simple loop or, with some fancy inline assembly to make it faster. Or, do what glibc does: first do a per-byte copy to align the destination, then consider if it's worth using special system code for copying pages, then copies the last page a word at a time. Not KISS, but very efficient:
https://github.com/lattera/glibc/blob/m ... g/memcpy.c

Or strlen(): it can implemented with a few instructions or with some fancy code that reads 4 bytes at a time:
https://github.com/lattera/glibc/blob/m ... g/strlen.c

What we consider the best solution, depends on the context. For an small embedded system, that implementation is definitively overkill. But... for full-blown desktops and servers, it's worth the code size increase to get 2x performance boost.

A completely different story is when the code bloat comes from unnecessary template instantiations, boiler-plate code, copy pasting, abstract factory-factory-factories and friends :-)
Tilck, a Tiny Linux-Compatible Kernel: https://github.com/vvaltchev/tilck
thewrongchristian
Member
Member
Posts: 426
Joined: Tue Apr 03, 2018 2:44 am

Re: Reinventing Unix is not my problem

Post by thewrongchristian »

vvaltchev wrote:
nexos wrote:That makes sense. I personally implement KISS in my projects, so an O(1) queue is better for my direction.
I totally understand that. I also try to keep Tilck as simple as possible. Just, "big projects" sometimes have they own reasons for complex code.
Depends on how you define "complex".

CFS, while sounding complex, implements a relatively simple principle.

Keeping the tasks sorted in an RBTree is simply an implementation detail, the same could be implemented with a linear linked list (simple, O(n) to insert a pre-empted task back into the run queue.)

But the CFS has very little in the way of heuristics, tunables and corner cases, with the result that the behaviour is less complex and easier to predict, as well as being fairer.

And the use of RBTree shouldn't add significant runtime overhead, the run queue is likely to only ever be a handful of tasks, probably in the low single figures (my laptop linux has a loadavg of about 2, with a couple of web page's JS keeping firefox on the processor more than I'd like), so finding the right place in the tree will be on average at or around 2-3 comparisons.
vvaltchev
Member
Member
Posts: 274
Joined: Fri May 11, 2018 6:51 am

Re: Reinventing Unix is not my problem

Post by vvaltchev »

thewrongchristian wrote:Depends on how you define "complex".

CFS, while sounding complex, implements a relatively simple principle.

Keeping the tasks sorted in an RBTree is simply an implementation detail, the same could be implemented with a linear linked list (simple, O(n) to insert a pre-empted task back into the run queue.)

But the CFS has very little in the way of heuristics, tunables and corner cases, with the result that the behaviour is less complex and easier to predict, as well as being fairer.

And the use of RBTree shouldn't add significant runtime overhead, the run queue is likely to only ever be a handful of tasks, probably in the low single figures (my laptop linux has a loadavg of about 2, with a couple of web page's JS keeping firefox on the processor more than I'd like), so finding the right place in the tree will be on average at or around 2-3 comparisons.
I totally agree with all you said. In that context "complex" meant something more complicated than trivial queues per priority level. Considering the tricky dynamic priority calculation, the needs for BSTs and other stuff, CFS is objectively harder to implement than many other types of scheduling algorithms but, as you said, the result behavior is easier to predict and it's fair. So, to me that kind of complexity is fully justified.
Tilck, a Tiny Linux-Compatible Kernel: https://github.com/vvaltchev/tilck
nexos
Member
Member
Posts: 1081
Joined: Tue Feb 18, 2020 3:29 pm
Libera.chat IRC: nexos

Re: Reinventing Unix is not my problem

Post by nexos »

vvaltchev wrote:
nexos wrote:That makes sense. I personally implement KISS in my projects, so an O(1) queue is better for my direction.
I totally understand that. I also try to keep Tilck as simple as possible. Just, "big projects" sometimes have they own reasons for complex code.
Yes, they do. My goal is to do everything as simple as possible without losing to much speed. It'll be interesting to see the results of my OS! Tilck is exactly the kind of OS I'm talking about: simple, yet comprehensive for many applications
vvaltchev wrote:Another example: how do you implement memcpy()? It can be done in C with a simple loop or, with some fancy inline assembly to make it faster. Or, do what glibc does: first do a per-byte copy to align the destination, then consider if it's worth using special system code for copying pages, then copies the last page a word at a time. Not KISS, but very efficient:
https://github.com/lattera/glibc/blob/m ... g/memcpy.c

Or strlen(): it can implemented with a few instructions or with some fancy code that reads 4 bytes at a time:
https://github.com/lattera/glibc/blob/m ... g/strlen.c

What we consider the best solution, depends on the context. For an small embedded system, that implementation is definitively overkill. But... for full-blown desktops and servers, it's worth the code size increase to get 2x performance boost.
Makes perfect sense! I personally would use SSE / AVX nowadays, however :)
A completely different story is when the code bloat comes from unnecessary template instantiations, boiler-plate code, copy pasting, abstract factory-factory-factories and friends :-)
Exactly why I dislike Windows and C++!
"How did you do this?"
"It's very simple — you read the protocol and write the code." - Bill Joy
Projects: NexNix | libnex | nnpkg
vvaltchev
Member
Member
Posts: 274
Joined: Fri May 11, 2018 6:51 am

Re: Reinventing Unix is not my problem

Post by vvaltchev »

nexos wrote:Yes, they do. My goal is to do everything as simple as possible without losing to much speed. It'll be interesting to see the results of my OS!
I'd be curious about that too! Just, if you aim for speed and simplicity a monolithic kernel is still the way to go, IMHO.
nexos wrote:Tilck is exactly the kind of OS I'm talking about: simple, yet comprehensive for many applications
Thanks :-)
nexos wrote:Makes perfect sense! I personally would use SSE / AVX nowadays, however :)
I believe that glibc's memcpy() has a bunch of versions like __memcpy_sse2, __memcpy_ssse3, __memcpy_avx512_no_vzeroupper etc. But.. in kernel we generally cannot use SIMD instructions because it's too expensive to save and restore the extra registers, unless you're going to write plenty of data. In that case, it might be worth saving and restoring the registers, overall.
nexos wrote:Exactly why I dislike Windows and C++!
I have a complicated love-hate relationship with C++ :-)
Tilck, a Tiny Linux-Compatible Kernel: https://github.com/vvaltchev/tilck
nexos
Member
Member
Posts: 1081
Joined: Tue Feb 18, 2020 3:29 pm
Libera.chat IRC: nexos

Re: Reinventing Unix is not my problem

Post by nexos »

vvaltchev wrote:I'd be curious about that too! Just, if you aim for speed and simplicity a monolithic kernel is still the way to go, IMHO.
Yeah, I was going to do a microkernel, until I figured out just how complex the boot process was going to be, how slow it probably was going to be, so I decided to make a hybrid-esque kernel.
"How did you do this?"
"It's very simple — you read the protocol and write the code." - Bill Joy
Projects: NexNix | libnex | nnpkg
User avatar
zaval
Member
Member
Posts: 660
Joined: Fri Feb 17, 2017 4:01 pm
Location: Ukraine, Bachmut
Contact:

Re: Reinventing Unix is not my problem

Post by zaval »

to not create a separate thread for such a tiny question, I'm gonna ask it here, since it smells unix more, than enough.

I create a library, that is gonna be a shared object (.so), it has 2 sets of functions:
1) ones for internal usage only - should not be called by nor visible to the library clients.
2) exported functions, comprising the library's API.

the question is how in an ELF toolchain, one achieves this separation if it's possible (I hope it is, because if not, then well...)? separation means preventing the functions from the set 1 to appear in any symtables or whatever mechanism ELF provides for exporting functions. what (trickery) do the library writer apply for telling to the toolchain, that this function is not about to get exported and this one is. in the PE realm, exporting for example can be indicated by a .def file, that tells everything the linker should know for building export as it was supposed to be. does ELF realm have something like that?
ANT - NT-like OS for x64 and arm64.
efify - UEFI for a couple of boards (mips and arm). suspended due to lost of all the target park boards (russians destroyed our town).
Korona
Member
Member
Posts: 1000
Joined: Thu May 17, 2007 1:27 pm
Contact:

Re: Reinventing Unix is not my problem

Post by Korona »

There is a GNU extension, google for __attribute__((visibility ("hidden"))). You can get something close to Windows by passing -fvisibility=hidden and explicitly annotating the public symbols.

Note that ELF does not distinguish exports and imports.

In general, ELF is a bit more sane than PE with regards to dynamic linking. (Ever tried to build a DLL that uses a non-C ABI, for example by exposing funftions that take complicated C++ classes? It's a mess.)
managarm: Microkernel-based OS capable of running a Wayland desktop (Discord: https://discord.gg/7WB6Ur3). My OS-dev projects: [mlibc: Portable C library for managarm, qword, Linux, Sigma, ...] [LAI: AML interpreter] [xbstrap: Build system for OS distributions].
User avatar
zaval
Member
Member
Posts: 660
Joined: Fri Feb 17, 2017 4:01 pm
Location: Ukraine, Bachmut
Contact:

Re: Reinventing Unix is not my problem

Post by zaval »

Korona wrote: There is a GNU extension, google for __attribute__((visibility ("hidden"))). You can get something close to Windows by passing -fvisibility=hidden and explicitly annotating the public symbols.
Thanks! That's a hope, even if an "extension", yet making the code ugly (I kind of accepted that). :)
In general, ELF is a bit more sane than PE with regards to dynamic linking. (Ever tried to build a DLL that uses a non-C ABI, for example by exposing funftions that take complicated C++ classes? It's a mess.)
Don't know saner, or not, I'm concerned with only C for now. And I would be definitely not feel "sane" if it turned out, that all of the private functions of a module (say, main kernel module's ones) got visible, littering those tables. Or, imagine some interpreter library, that has like a couple of API functions - RunThis() and RunThat() and hundreds of internal routines, absolutely not intended to be exposed. Its "exporting" tables (and those, dealing with indirection, because it's PIC) would consist 99% of unneeded garbage.
ANT - NT-like OS for x64 and arm64.
efify - UEFI for a couple of boards (mips and arm). suspended due to lost of all the target park boards (russians destroyed our town).
Post Reply