OSDev.org

Posted: **Sun Jun 10, 2007 1:07 pm**

While reading about IPC mechanisms for microkernels, a question came up: would it be possible to exploit the 80x86 ISA so that the problem of modularity is solved in another, more effective way?

The problem in monolithic operating systems is that each module sees all other modules all the time. Microkernels solve this problem by completely isolating one module from the other, and using special mechanisms for communication between modules. But microkernels have their own complexity and set of well-known problems.

So how about doing a flat-address 'monolithic' kernel where each module is isolated from the others? On 80x86, it could be done with a trick: manually changing the base and limit of the CS segment descriptor on an inter-module call.

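A good question is why one would change the contents of the CS segment descriptor manually when a far jump can do it automatically. Well, for the following reasons:

1) the LDT does not have that many entries (a descriptor table holds at most 8,192 descriptors, so even the roughly 16K available across the GDT and an LDT may not be enough modules).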
2) the CPU's caches are better utilized if the LDT is short.
3) the protection check is avoided.

When changing the CS segment descriptor manually, a problem may come up: the new CS:IP address might be invalid! To avoid this problem, the inter-module procedure should be invoked through another procedure which has the same IP address in both modules.

The pseudo-assembly for the inter-module invocation could be like this:

Code:

SWITCH_MODULE:
    mov EAX, <target address>                                   ;load target address, relative to the target module
    mov ECS:EDX, <current CS descriptor>                        ;load current CS descriptor in 64-bit register
    mov [<address of CS descriptor>], <target CS descriptor>    ;change the CS descriptor

NO_MANS_LAND:    
    NOP
    NOP
    ...
    NOP
    
RETURN_POINT:

In the target module, execution resumes with the following code (INVOKE_TARGET and NO_MANS_LAND have the same address relative to the start of each respective module):

Code:

INVOKE_TARGET:
    call EAX                                                    ;call the target subroutine which saves ECX and EDX
    mov [<address of CS descriptor>], ECS:EDX                   ;return to caller
END_INVOKE_TARGET:

After the last instruction is executed, execution should continue from address RETURN_POINT in the caller's segment.

The benefits of this trick are:

1) the target module can't randomly invoke other modules. Any access outside the module will result in an exception.
2) the TLB is not flushed (not like when changing CR3).
3) there is no need for a full context switch.

All that it takes for this to work is to invoke the target procedure through another procedure which has a common address between the two modules.

The rest of the registers can be modified accordingly (including the data and stack segments).

What do you think? I would like your advice if it could work.

Posted: **Sun Jun 10, 2007 1:36 pm**

Sounds interesting... There is another idea that involves using segmentation, but in a different way: "small address spaces" or "address space multiplexing". This technique keeps the same architecture as a microkernel, but tries to avoid some of the high costs of address-space switches (TLB re-fills, etc.).

This basically involves putting many "small" processes into a single flat linear address space by dividing them into separate logical address spaces using segmentation. Then, when a "small" process grows too large, its contents are copied to a new linear address space (i.e. -- its own set of pages) and from that point on it behaves like a "normal" process.

http://ertos.nicta.com.au/publications/ ... SHH_02.pdf
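To make the "small address spaces" idea concrete, here is a minimal sketch in C of the check a segment performs: an offset is compared against the limit and then added to the base. The `small_space` type and `small_space_translate` function are hypothetical names made up for illustration; they are not taken from the linked paper or from any real kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one "small" address space: a window inside the
 * shared linear address space, described by a base and a limit. */
typedef struct {
    uint32_t base;   /* linear address where the window starts */
    uint32_t limit;  /* highest valid offset inside the window */
} small_space;

/* Translate a logical offset to a linear address the way a segment would:
 * reject offsets beyond the limit, otherwise add the base.
 * Returns false where real hardware would raise a protection fault. */
bool small_space_translate(const small_space *s, uint32_t offset,
                           uint32_t *linear)
{
    assert(s != NULL && linear != NULL);
    if (offset > s->limit)
        return false;
    *linear = s->base + offset;
    return true;
}
```

Two small processes with different bases but the same page tables can then use identical offsets without seeing each other, and switching between them does not require a new CR3 value.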

Your idea might work too, but I like the conceptual clarity of a microkernel architecture better.

Posted: **Sun Jun 10, 2007 1:46 pm**

I won't go into details about the inherent flaws with this system (I would love to see you try it; it might just work, and I don't know all that much about x86 segmentation in practice), but I'll say this: if there were a viable method of separating modules in a monolithic system without ruining speed, while maintaining adequate protection, chances are someone would be using it right now, and I have never heard of such a thing.

Posted: **Sun Jun 10, 2007 2:16 pm**

axilmar wrote:mov ECS:EDX, <current CS descriptor>

Please tell me you mean ECX:EDX? Or is this AT&T syntax?

Also, you can't just update CS by changing the relevant descriptor's entry in the GDT/LDT. You need to reload CS with a far jmp, far call, retf or iret. The reason is that the GDT isn't accessed every time the processor fetches an instruction; rather, an instruction that loads CS fills in a hidden part of the segment register with the base and limit.
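The effect of that hidden part can be shown with a toy model in C. This is only a sketch: `toy_cpu`, `toy_load_cs` and `toy_fetch_linear` are hypothetical names, and the descriptor is reduced to a base and a limit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified descriptor: only the fields the discussion needs. */
typedef struct { uint32_t base, limit; } descriptor;

/* Toy CPU: a descriptor table in "memory" plus the hidden part of CS. */
typedef struct {
    descriptor table[4];   /* stands in for the GDT */
    descriptor cs_hidden;  /* cached copy used for every fetch */
} toy_cpu;

/* Models a far jmp/call/iret: the only point where the table is read. */
void toy_load_cs(toy_cpu *cpu, size_t index)
{
    assert(cpu != NULL && index < 4);
    cpu->cs_hidden = cpu->table[index];
}

/* Models an instruction fetch: uses the cached base, never the table. */
uint32_t toy_fetch_linear(const toy_cpu *cpu, uint32_t eip)
{
    assert(cpu != NULL);
    return cpu->cs_hidden.base + eip;
}
```

In this model, writing a new base into `table[1]` leaves the result of `toy_fetch_linear` unchanged until `toy_load_cs` runs again, which mirrors why editing the GDT entry alone does not move the running code segment.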

There is, however, nothing to stop you from doing:

Code:

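; Overwrite the code segment's GDT entry with the new descriptor.
mov edi, code_segment_number
shl edi, 3                          ; each descriptor is 8 bytes
add edi, gdt_address
mov esi, new_code_segment_descriptor
mov ecx, [esi]                      ; x86 has no memory-to-memory mov,
mov [edi], ecx                      ; so copy through a register
mov ecx, [esi + 4]
mov [edi + 4], ecx

mov edx, inter_module_stub          ; offset of the stub in the new segment
mov eax, proc_to_call               ; must survive until the stub runs

; Same again for the data segment's GDT entry.
mov edi, data_segment_number
shl edi, 3
add edi, gdt_address
mov esi, new_data_segment_descriptor
mov ecx, [esi]
mov [edi], ecx
mov ecx, [esi + 4]
mov [edi + 4], ecx

; Reload the data segment registers so their hidden parts are refreshed.
mov ecx, [data_segment_selector]
mov ds, cx
mov es, cx
mov fs, cx
mov gs, cx
mov ss, cx                          ; the stack moves with SS

; eax is not touched again here, so it still holds proc_to_call.
jmp [code_segment_descriptor]:edx   ; pseudo-code: far jump that reloads CS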

The last line can't actually be expressed in x86 asm, you'll need to find some way around it.

You also somehow need to set up the stack for the next process you're jumping to.
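For reference, the `shl edi, 3` step and the two 32-bit halves being copied correspond to the descriptor layout sketched below in C. The function names `make_descriptor` and `descriptor_offset` are made up for this example; the bit positions follow the standard protected-mode segment descriptor format.

```c
#include <assert.h>
#include <stdint.h>

/* Pack a segment descriptor into its 8-byte form.
 * limit20 is the 20-bit limit field, access is the access byte
 * (present, DPL, type), flags is the upper nibble (G, D/B, L, AVL). */
uint64_t make_descriptor(uint32_t base, uint32_t limit20,
                         uint8_t access, uint8_t flags)
{
    assert(limit20 <= 0xFFFFFu && flags <= 0xFu);
    uint64_t d = 0;
    d |= (uint64_t)(limit20 & 0xFFFFu);             /* limit 15:0  */
    d |= (uint64_t)(base & 0xFFFFFFu) << 16;        /* base 23:0   */
    d |= (uint64_t)access << 40;                    /* access byte */
    d |= (uint64_t)((limit20 >> 16) & 0xFu) << 48;  /* limit 19:16 */
    d |= (uint64_t)(flags & 0xFu) << 52;            /* G, D/B, L, AVL */
    d |= (uint64_t)(base >> 24) << 56;              /* base 31:24  */
    return d;
}

/* Byte offset of a selector's descriptor inside its table:
 * drop the RPL and table-indicator bits, then multiply by 8. */
uint32_t descriptor_offset(uint16_t selector)
{
    return (uint32_t)(selector >> 3) * 8u;
}
```

A flat ring-0 code segment (base 0, limit 0xFFFFF, access 0x9A, flags 0xC) packs to 0x00CF9A000000FFFF, the value commonly seen in hand-written GDTs.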

Personally, although it's an interesting idea, I'd stick to using separate address spaces for each process through the paging mechanism. In 64-bit long mode, the base and limit of code segments are ignored.

Regards,
John.

Posted: **Mon Jun 11, 2007 4:34 am**

jnc100 wrote:
axilmar wrote:mov ECS:EDX, <current CS descriptor>
Please tell me you mean ECX:EDX? Or is this AT&T syntax?

By ECX:EDX I mean any 64-bit register pair (since descriptors are 64 bits wide).

Also, you can't just update CS by changing the relevant descriptor's entry in the GDT/LDT. You need to reload CS with a far jmp, far call, retf or iret. The reason is that the GDT isn't accessed every time the processor fetches an instruction; rather, an instruction that loads CS fills in a hidden part of the segment register with the base and limit.

Thanks, it was not clear to me whether such a restriction existed.

Personally, although it's an interesting idea, I'd stick to using separate address spaces for each process through the paging mechanism. In 64-bit long mode, the base and limit of code segments are ignored.

It would have been great if the paging mechanism were also a segmentation mechanism: each page could belong to a module, and jumping to a different module would require another set of jump instructions. But, as it is right now, we all have to live with the design constraints of current CPUs.

Posted: **Mon Jun 11, 2007 1:56 pm**

I believe you're talking about call gates. Go ahead and look them up.

The main troubles with call gates are:

1) They require small module spaces, and you can only call through a gate to a module sharing your page tables. No mechanism exists to change both the page directory and the code segment without dropping into the kernel.
2) They require segmentation, which extremely few processors support at all and no processors support well.
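For anyone who has not looked them up yet, a call gate is just another 8-byte descriptor naming a target selector and entry offset. The C sketch below packs a 32-bit call gate; `make_call_gate32` is a name invented for this example, and the field positions follow the standard IA-32 gate descriptor layout.

```c
#include <assert.h>
#include <stdint.h>

/* Pack a 32-bit call gate descriptor.
 * selector/offset name the entry point in the target code segment,
 * dpl is the lowest privilege allowed to call through the gate,
 * param_count is the number of dwords copied between stacks. */
uint64_t make_call_gate32(uint16_t selector, uint32_t offset,
                          uint8_t dpl, uint8_t param_count)
{
    assert(dpl <= 3 && param_count <= 31);
    uint64_t g = 0;
    g |= (uint64_t)(offset & 0xFFFFu);           /* offset 15:0        */
    g |= (uint64_t)selector << 16;               /* target selector    */
    g |= (uint64_t)(param_count & 0x1Fu) << 32;  /* parameter count    */
    g |= (uint64_t)0xCu << 40;                   /* type: 32-bit gate  */
    g |= (uint64_t)(dpl & 3u) << 45;             /* DPL                */
    g |= (uint64_t)1 << 47;                      /* present            */
    g |= (uint64_t)(offset >> 16) << 48;         /* offset 31:16       */
    return g;
}
```

The caller only supplies the gate's selector in a far call; the offset it provides is ignored and the entry point comes from the gate, which is what keeps callers from jumping into the middle of a module.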

If you have a lot of interest in inter-module calling stuff, I'll go ahead and put up the portal IPC thread I've been foaming at the brain for.

Posted: **Tue Jun 12, 2007 7:28 am**

Crazed123 wrote:I believe you're talking about call gates. Go ahead and look them up.

Thanks, I already know about call gates. I was talking about 'software' gates, i.e. constructing module information on the fly.

Posted: **Wed Oct 03, 2007 7:45 am**

So it's the "We hate segmentation!" bandwagon again ... joy.

Segmentation itself, is not flawed.
Segmentation is clean.

Intel's implementation of segmentation, which also includes virtual memory capabilities, is rather impressive, especially considering that it is entirely possible to execute programs written back in 1988 without changing a single byte...

People always ***** and whine when Segmentation is mentioned, usually like this:

a) 64k boundary limits, aArrGGhh.
- Yeah....... Okay. Legacy constraints, people. Segmentation can be very, very powerful if you give it a chance. That, or you actually pay attention to the Intel manuals and LEARN that hey, man, segments can be much larger than 64K...

It amuses me that so many people have such a knee-jerk reaction to segmentation, considering that they have to set up flat code/data/stack segments in the GDT, which commonly range from $0 to $FFFFFFFF.

Proof that 64K means jack **** to segments.

b) You can access the same byte in physical memory with a whole array of crazy, differing logical addresses.

- And you can't see the BENEFIT of that? Think of the power! While paging allows us to share code/data/whatever as we see fit, it has the downside that (for the most part), unless you have some kind of protocol, what you share will have to be page aligned (4K/2M/4M... whatever).

Segmentation doesn't suffer from that. If you want your segment to be 16 bytes in size, all right, no problem.

c) If you use segmentation you can't use paging.
- . . . . . rrrrriight. Read ze manual, grasshopper . . .
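Points a) and b) both come down to how the 20-bit limit field and the granularity (G) bit combine. A minimal C sketch, assuming an ordinary expand-up segment (`segment_size_bytes` is a name made up for this example):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Size in bytes of an expand-up segment, given its 20-bit limit field
 * and granularity bit. With G clear the limit counts bytes; with G set
 * it counts 4 KiB units and the low 12 bits are treated as all ones. */
uint64_t segment_size_bytes(uint32_t limit20, bool granular)
{
    assert(limit20 <= 0xFFFFFu);
    uint64_t last_offset = granular
        ? (((uint64_t)limit20 << 12) | 0xFFFu)
        : (uint64_t)limit20;
    return last_offset + 1;  /* the limit is the last valid offset */
}
```

So `segment_size_bytes(0xF, false)` is a 16-byte segment, while `segment_size_bytes(0xFFFFF, true)` is the full 4 GiB flat segment that nearly every protected-mode GDT contains.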

I apologize in advance for this unfocused, slightly flamey rant, but it just bugs me that, like, 95% of people say SEGMENTATION BLOOWSSSS and very few of them really have decent reasons for saying so at all...

As for segmentation not being in long mode, that's not necessarily the be-all and end-all. 32-bit processors will be around for a long time, the IA-32 architecture will be around for a long time, and backwards compatibility (while we hate it) will for the most part live on. This means segmentation lives on too.

Also, don't forget the embedded space, where people can and do use older 486 chips for completely embedded purposes.

... anywho, sorry, rant... coffee... cigarette, need now . . .
~Z

Posted: **Wed Oct 03, 2007 1:05 pm**

I have never heard any of those complaints about segmentation, and I have read every message ever posted to this board and the MT board...

The major complaints about segmentation:
1) it can't use more than 4GB without being combined with paging

2) when you try to combine it with paging, it becomes very complicated and prone to hard-to-find bugs

3) porting to 64-bit will be trouble (and although there is no problem with using 32-bit for at least the next 100 years, there is no reason to limit yourself so much...)

4) paging virtual memory, while possible, is much harder than it first appears (because of the requirement to use consecutive addresses, it is completely impractical to use it without paging, and even with paging it becomes a very complicated mess... requiring you to 'defragment' your page tables at runtime...)

5) most compilers will refuse to compile code which is aware of segmentation -- this is actually the biggest complaint I have heard, and completely true -- you can forget using or porting most existing compilers (including the ultra-popular GCC) to an operating system which makes extensive use of segmentation...

6) segmentation doesn't give you anything you can't do with paging -- which is a reason not to use it, simply by being not a reason to use it... (KISS FTW!)

Posted: **Thu Oct 04, 2007 9:29 am**

JAAman wrote:6) segmentation doesn't give you anything you can't do with paging -- which is a reason not to use it, simply by being not a reason to use it... (KISS FTW!)

- potentially less memory usage: only ldt versus pagedirectory + several PTs
- no performance hits due to TLB misses (and especially wrt context switching)
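The memory claim can be put into rough numbers. The C sketch below assumes classic 32-bit, non-PAE paging with 4 KiB pages and a contiguous, 4 MiB-aligned region, and counts only the translation structures themselves; both function names are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes of paging structures needed to map `bytes` of contiguous,
 * 4 MiB-aligned address space: one 4 KiB page directory plus one
 * 4 KiB page table per 4 MiB mapped. */
uint64_t paging_overhead_bytes(uint64_t bytes)
{
    const uint64_t page_table_span = 4u * 1024u * 1024u;
    uint64_t tables = (bytes + page_table_span - 1) / page_table_span;
    return 4096u + tables * 4096u;
}

/* Bytes of LDT needed for `segments` descriptors, at 8 bytes each. */
uint64_t ldt_overhead_bytes(uint32_t segments)
{
    assert(segments <= 8192u);  /* a descriptor table holds at most 8192 */
    return (uint64_t)segments * 8u;
}
```

Under these assumptions a 1 MiB process costs 8 KiB of paging structures, against 24 bytes for an LDT with code, data and stack descriptors. The comparison leaves out what paging buys in return, such as non-contiguous physical memory and per-page swapping.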

and for the record: "why do we do it? because we can!"


manual segmentation solves the problem of modularity?
