OS optimised for network communications

jamesread
Member
Posts: 49
Joined: Wed Dec 09, 2020 11:32 am

OS optimised for network communications

Post by jamesread »

Hi all,

I want to design a new OS, built from the ground up to prioritise network activity, so that applications such as web servers and web crawlers are high performance by default. I'm hoping to start a stimulating conversation about which design choices are appropriate for such a goal.

You may think I'd be better off modifying an existing system to achieve this goal. That's OK. But I am sure there are gains to be had from designing the system from the ground up specifically for this purpose.

Here's my thinking so far. Web servers have seen massive performance gains by switching from a threaded model to an event driven model using for example epoll in Linux. Nginx outperforms Apache using this model. But is there further work that can be done to increase throughput and number of parallel connections supported by redesigning the OS to prioritise network communications over and above all other work on the system?
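
As a rough illustration of the event-driven model mentioned above, here is a minimal sketch of a single-threaded echo loop. It uses Python's standard `selectors` module, which picks epoll on Linux. The helper names `make_listener` and `poll_once` are made up for this example, and the echo reply stands in for real request handling.

```python
import selectors
import socket


def make_listener(host="127.0.0.1", port=0):
    """Create a non-blocking listening socket; port 0 lets the OS pick one."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen()
    sock.setblocking(False)
    return sock


def poll_once(sel, listener, timeout=1.0):
    """Handle every socket that is ready right now, then return."""
    for key, _mask in sel.select(timeout):
        if key.fileobj is listener:
            # New connection: accept it and watch it for incoming data.
            conn, _addr = listener.accept()
            conn.setblocking(False)
            sel.register(conn, selectors.EVENT_READ)
        else:
            conn = key.fileobj
            data = conn.recv(4096)
            if data:
                # Sketch only: assumes the small reply fits in the send buffer.
                conn.sendall(b"echo:" + data)
            else:
                # Peer closed the connection.
                sel.unregister(conn)
                conn.close()
```

One thread serves every connection, so the cost per idle connection is a registered file descriptor rather than a thread stack. A real server would call `poll_once` in a loop and buffer partial writes.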

Thanks in advance for your insights.
bzt
Member
Posts: 1584
Joined: Thu Oct 13, 2016 4:55 pm

Re: OS optimised for network communications

Post by bzt »

jamesread wrote:Hi all,

I want to design a new OS, built from the ground up to prioritise network activity, so that applications such as web servers and web crawlers are high performance by default. I'm hoping to start a stimulating conversation about which design choices are appropriate for such a goal.
You should learn about appliances. There are many. They are specialized hardware because the weak link is usually not the software, but the hardware. For example, Avalanche can generate HTTP requests at a really high rate (hundreds of thousands of connections per second), filling up a 100G network line. And some Cisco web server appliances can serve webpages through six 10G ports at once.

So true high-performance means specialized hardware.
jamesread wrote:You may think I'd be better off modifying an existing system to achieve this goal. That's OK. But I am sure there are gains to be had from designing the system from the ground up specifically for this purpose.
I really don't think there's much you could add to the existing solutions.
jamesread wrote:Here's my thinking so far. Web servers have seen massive performance gains by switching from a threaded model to an event driven model using for example epoll in Linux. Nginx outperforms Apache using this model. But is there further work that can be done to increase throughput
The Linux kernel had a built-in webserver, but it was obsoleted, and now there's khttpd for maximum performance. It is fast because it needs no processes, no threads, and not even epoll, and there is no kernel-space to user-space transition overhead at all. That's the fastest software solution there can be. The price of that performance is terrible security, though (everything runs in ring 0, meaning a crafted HTTP request could trigger a buffer overflow directly inside the kernel).
jamesread wrote:and number of parallel connections supported by redesigning the OS to prioritise network communications over and above all other work on the system?
Now that's a completely different question. It's not the priority that matters (most servers do not run anything other than a webserver anyway), and there are RT versions of the Linux kernel you can use to minimize the overhead (combined with khttpd there'll be next to no added latency). The problem is more related to the number of parallel connections. The proper term is "number of concurrent connections"; google that. It is limited by many factors:
1. the NIC throughput
2. the MTU size (which determines how often a NIC IRQ is required for a stream within a connection; Cisco calls frames with a larger-than-standard MTU jumbo frames)
3. the number of IRQs per sec possible on the hardware (see "interrupts limit" within the Linux kernel, and also read this)
4. size of the TCP session table (if it's full, a webserver can't accept any more new connections, read this)
5. number of available file descriptors (webserver uses some per connection, and without threading and forking this is a global limit, even with khttpd)
6. amount of RAM (how much of the webpage content can be cached)
7. speed of the storage (how slow it is to read the content into the cache)
...etc. However, all of these are either hardware-related (not much an OS can do about them) or already run-time configurable in mainstream OSes (use "sysctl -a" under Linux, for example).
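
To put rough numbers on two of the factors in the list above, here is a small sketch. It assumes standard Ethernet framing (38 bytes of per-frame overhead on the wire) and, as a worst case, one interrupt per full-size frame; NICs with interrupt coalescing will raise far fewer. The function names are made up for this example, and `RLIMIT_NOFILE` is the per-process descriptor limit, not a system-wide one.

```python
import resource

# Per-frame Ethernet overhead on the wire, in bytes:
# 14 (header) + 4 (FCS) + 8 (preamble) + 12 (inter-frame gap).
ETHERNET_OVERHEAD_BYTES = 38


def frames_per_second(line_rate_bits, mtu_bytes):
    """Upper bound on full-size frames per second at a given line rate."""
    wire_bytes = mtu_bytes + ETHERNET_OVERHEAD_BYTES
    return line_rate_bits / (wire_bytes * 8)


def fd_limits():
    """Return the (soft, hard) per-process limit on open file descriptors."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    for mtu in (1500, 9000):
        print(mtu, round(frames_per_second(10e9, mtu)))
    print("open files (soft, hard):", fd_limits())
```

On a 10G link this gives roughly 813,000 frames per second at an MTU of 1500 and roughly 138,000 at an MTU of 9000, which is why larger frames reduce per-packet work.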

Furthermore, using a dedicated cache server is also common practice. With a low TTL (1 sec) users won't notice a thing, yet it takes a lot of load off the CPU. High-traffic sites often use clusters instead of a single server, with a very complex architecture:

clients -- load balancer -- cacheserver(s) -- application server(s)
With VM platforms like AWS, the load balancer can start new cache and application server instances dynamically if the load increases, meaning high performance is achieved by running many machines in parallel. The cheapest load balancer could be round-robin DNS with multiple A records, but more serious (and expensive) solutions use VRRP (see also RFC 5798), which operates at the ARP (MAC address) level and provides very low latency in routing.
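
The cache and load-balancer stages of that pipeline can be sketched in a few lines. This is only an illustration under simple assumptions: `TTLCache`, `round_robin`, and the `fetch` callable are hypothetical names, the cache is in-process rather than a separate server, and the rotation merely mimics what round-robin DNS approximates.

```python
import itertools
import time


class TTLCache:
    """Cache rendered pages briefly so repeated hits skip the backend."""

    def __init__(self, fetch, ttl_seconds=1.0, clock=time.monotonic):
        self._fetch = fetch        # hypothetical callable: path -> page body
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so expiry can be tested
        self._entries = {}         # path -> (expiry_time, body)

    def get(self, path):
        now = self._clock()
        entry = self._entries.get(path)
        if entry is not None and entry[0] > now:
            return entry[1]        # still fresh: no backend work
        body = self._fetch(path)   # miss or expired: ask the backend
        self._entries[path] = (now + self._ttl, body)
        return body


def round_robin(backends):
    """Return an iterator that yields the backends in endless rotation."""
    return itertools.cycle(backends)
```

With a 1-second TTL, a page requested a thousand times per second reaches the application server about once per second, which is where the CPU saving comes from.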

Anyway, I don't think you can come up with a better solution than what's already implemented in Linux. (I don't want to ruin your enthusiasm, but the fact is, Linux serves almost every high-traffic site, so it has been put to the test under real-world conditions for several decades now. Everything worth trying to improve performance has already been tried with Linux, hardware-backed solutions included.)

Knowing this, I don't want to say you shouldn't try, but be prepared: you have to be very, very experienced to even have a chance of outperforming a properly configured khttpd. I strongly suggest studying the subject by researching the phrases and terminology I've used in this post.

Cheers,
bzt
jamesread
Member
Posts: 49
Joined: Wed Dec 09, 2020 11:32 am

Re: OS optimised for network communications

Post by jamesread »

According to comments on https://askubuntu.com/questions/1298952 ... ource-code khttpd seems to be obsolete. No activity since kernel version 2.5.24

Have you ever got khttpd running?
eekee
Member
Posts: 892
Joined: Mon May 22, 2017 5:56 am
Location: Kerbin
Discord: eekee

Re: OS optimised for network communications

Post by eekee »

jamesread wrote:According to comments on https://askubuntu.com/questions/1298952 ... ource-code khttpd seems to be obsolete. No activity since kernel version 2.5.24
I don't know about taking info from Ubuntu forums, but 2.5.24 was a veeery long time ago, so I guess you're right. Instead - and this is the one thing I know which bzt didn't cover - we have sendfile() which takes 2 file descriptors and copies data from one to the other inside the kernel. That's the Linux version anyway. Different OSs have different sendfiles, which suggests there may possibly be room for improvement.
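
For reference, a minimal sketch of the Linux-style sendfile() described above, via Python's `os.sendfile` wrapper. The helper name `send_file_over_socket` is made up for this example, and it assumes a blocking, already-connected socket.

```python
import os
import socket


def send_file_over_socket(sock, path):
    """Send a whole file over a connected socket; return bytes sent."""
    sent = 0
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        while sent < size:
            # The kernel copies from the file descriptor to the socket
            # descriptor; the data never enters a user-space buffer.
            n = os.sendfile(sock.fileno(), f.fileno(), sent, size - sent)
            if n == 0:
                break  # nothing more could be sent (e.g. file truncated)
            sent += n
    return sent
```

The loop is there because a single call may send fewer bytes than requested.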
Kaph — a modular OS intended to be easy and fun to administer and code for.
"May wisdom, fun, and the greater good shine forth in all your work." — Leo Brodie
bzt
Member
Posts: 1584
Joined: Thu Oct 13, 2016 4:55 pm

Re: OS optimised for network communications

Post by bzt »

jamesread wrote:Have you ever got khttpd running?
Never used it in production because as I've said the security risk was way too high.

I've used TUX though, back in the days when I had to create a local webpage that listed the available Bluetooth devices (for a high-accuracy indoor positioning system). The web-based map UI (running on a UMPC, because tablets had not been invented yet) queried the data from localhost, so there was no remote access at all. (The web UI then used the strength of the signals and triangulation to place a dot on the map showing the position of the user carrying the UMPC.) That was the only project where I've ever used an in-kernel webserver.

When my job was to create architectures for high-performance webservers, I always used appliances and load-balanced clusters because they are a much better and much more scalable choice. No matter how hard you try, a software-only solution running on a single server will always come second to a cluster which distributes the load among multiple instances. There are now two in-kernel HTTP servers that have been obsoleted; you should think about why that is.

Cheers,
bzt