Separate code segment and data segment

Congdm · Post by **Congdm** » Thu Sep 27, 2012 3:45 am

Hi everyone,

I'm working on my memory manager now. I divided memory into two segments, one for code and one for data. The code segment begins at physical address 0 and is expand-up. The data segment begins at the top of memory and is expand-down. They are fully separate and do not overlap. There is also a special data segment that spans 0 to FFFFFFFFh.

Normal programs use only the code and data segments. The special segment is used only to load programs into the code segment or for I/O.
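For reference, the three segments described above could be encoded as GDT descriptors along these lines. This is only a sketch: `gdt_encode` and the `ACCESS_*` / `FLAGS_*` names are made up for illustration, and the base and limit values a real kernel would pass depend on the RAM size, which the post does not give.

```c
#include <assert.h>
#include <stdint.h>

/* Access bytes: present, ring 0, code/data descriptor (hypothetical names). */
#define ACCESS_CODE_RX      0x9A  /* execute/read code segment        */
#define ACCESS_DATA_RW      0x92  /* read/write data, expand-up       */
#define ACCESS_DATA_RW_DOWN 0x96  /* read/write data, expand-down     */

/* Flags nibble: G=1 (4 KiB granularity), D/B=1 (32-bit segment). */
#define FLAGS_32BIT_4K      0xC

/* Pack base, 20-bit limit, access byte and flags nibble into the
 * 8-byte x86 segment descriptor layout. */
uint64_t gdt_encode(uint32_t base, uint32_t limit, uint8_t access, uint8_t flags)
{
    assert(limit <= 0xFFFFF);                      /* limit field is only 20 bits */

    uint64_t d = 0;
    d |= (uint64_t)(limit & 0xFFFF);               /* limit 15:0  -> bits 0..15  */
    d |= (uint64_t)(base & 0xFFFFFF) << 16;        /* base 23:0   -> bits 16..39 */
    d |= (uint64_t)access << 40;                   /* access byte -> bits 40..47 */
    d |= (uint64_t)((limit >> 16) & 0xF) << 48;    /* limit 19:16 -> bits 48..51 */
    d |= (uint64_t)(flags & 0xF) << 52;            /* flags       -> bits 52..55 */
    d |= (uint64_t)(base >> 24) << 56;             /* base 31:24  -> bits 56..63 */
    return d;
}
```

With these, the special 0 to FFFFFFFFh data segment would be `gdt_encode(0, 0xFFFFF, ACCESS_DATA_RW, FLAGS_32BIT_4K)`, which gives the familiar flat descriptor `0x00CF92000000FFFF`. The expand-down data segment would use `ACCESS_DATA_RW_DOWN`; for that type the valid offsets lie above the limit rather than below it.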

Also, paging is optional and not implemented yet.

I'm inexperienced, so I would like some advice about this memory model. Are there any problems I need to consider?

bluemoon · Post by **bluemoon** » Thu Sep 27, 2012 4:00 am

My advice is to implement paging and forget all these special segments, which only exist to fix issues caused by the lack of paging.

Antti · Post by **Antti** » Thu Sep 27, 2012 4:00 am

Congdm wrote:Are there any problems which I need to consider?

One thing worth considering is that memory segments themselves are pretty much deprecated. Paging should be enough.

Combuster · Post by **Combuster** » Thu Sep 27, 2012 4:02 am

Well, that particular system is only going to work decently with paging. After all, starting from 0 and going upwards pretty quickly includes the BIOS areas, and you'll get all the hardware devices in the data segment before the segment grows large enough to reach RAM.

I think I also missed where the kernel is supposed to go.

Other than that, using segmentation in a fashion like this is pretty much the oldest way possible to implement W^X security. (Paging is sufficient? It took the designers quite a while to get there, and on the OP's computer it's definitely insufficient.) Either way, it just seems to need a fair share of polish from where you are now.

Congdm · Post by **Congdm** » Thu Sep 27, 2012 4:32 am

Combuster wrote:Well, that particular system is only going to work decently with paging. After all, starting from 0 and going upwards pretty quickly includes the BIOS areas, and you'll get all the hardware devices in the data segment before the segment grows large enough to reach RAM.

Sorry, what I meant is that without paging the data segment begins at the top of RAM.
With paging it will work decently: the data segment begins at $C0000000, leaving the top for I/O devices. And with paging, who needs segments?

But without paging it is much trickier.

The data segment begins at the top of RAM, so I need to check the RAM size in order to create the segment. The code segment overlaps the BIOS areas, so I need to mark the BIOS areas as un-allocatable.
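Both steps (finding the top of RAM and keeping the BIOS areas out of the allocator) can be driven from a memory map. The sketch below assumes a hypothetical `struct mem_region` loosely modeled on a BIOS E820 record; the struct, the function names and the sample map values are all invented for illustration and do not describe any particular machine.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical memory-map entry, loosely modeled on an E820 record. */
struct mem_region {
    uint64_t base;
    uint64_t length;
    int      usable;   /* 1 = free RAM, 0 = reserved (BIOS, MMIO, ...) */
};

/* Made-up example map: 128 MiB of RAM with the usual hole below 1 MiB. */
const struct mem_region example_map[] = {
    { 0x00000000u, 0x0009FC00u, 1 },  /* conventional memory           */
    { 0x0009FC00u, 0x00060400u, 0 },  /* EBDA, video memory, BIOS ROM  */
    { 0x00100000u, 0x07F00000u, 1 },  /* extended memory up to 128 MiB */
    { 0xFEC00000u, 0x01400000u, 0 },  /* memory-mapped devices         */
};
#define EXAMPLE_MAP_LEN (sizeof example_map / sizeof example_map[0])

/* Highest end address of any usable region: the point an expand-down
 * data segment would grow down from. */
uint64_t ram_top(const struct mem_region *map, size_t n)
{
    uint64_t top = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t end = map[i].base + map[i].length;
        if (map[i].usable && end > top)
            top = end;
    }
    return top;
}

/* A range [start, start+len) is allocatable only if it lies entirely
 * inside a single usable region. */
int range_allocatable(const struct mem_region *map, size_t n,
                      uint64_t start, uint64_t len)
{
    for (size_t i = 0; i < n; i++) {
        uint64_t end = map[i].base + map[i].length;
        if (map[i].usable && start >= map[i].base && start + len <= end)
            return 1;
    }
    return 0;
}
```

The code segment allocator would then call `range_allocatable` before handing out any block, so a request that touches the reserved area below 1 MiB is refused.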

Combuster wrote:I think I also missed on where the kernel is supposed to go.

This is what bugs me. Of course kernel code stays in the code segment, but where will kernel data stay? For now I put some critical data in the code segment and the rest in the data segment. It isn't a good choice.

Note: There is no userspace; all programs run in ring 0. This minimal protection mechanism is meant to protect against accidental errors, not malicious users.

Combuster · Post by **Combuster** » Thu Sep 27, 2012 4:44 am

In that case, sticking to that system means you'd want the code and data segments not to meet, but instead to grow whenever an allocation for either domain is made. You will, however, risk serious problems with fragmentation when one object is holding the limit at some point and the other end is not allowed to pass it to get to the available memory.
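The "grow on allocation but never meet" rule can be sketched as two bump pointers moving toward each other. The names `struct two_ended`, `grow_code` and `grow_data` are hypothetical, failure is signalled with `UINT32_MAX`, and freeing is deliberately left out, which is exactly where the fragmentation problem mentioned above would show up.

```c
#include <stdint.h>

/* Hypothetical bookkeeping for the two segment boundaries. */
struct two_ended {
    uint32_t code_end;    /* first byte above the code segment (grows up)  */
    uint32_t data_start;  /* lowest byte of the data segment (grows down)  */
};

/* Grow the code segment upward by `size` bytes.
 * Returns the start of the new block, or UINT32_MAX if it would
 * run into the data segment. */
uint32_t grow_code(struct two_ended *m, uint32_t size)
{
    if (size > m->data_start - m->code_end)
        return UINT32_MAX;
    uint32_t start = m->code_end;
    m->code_end += size;
    return start;
}

/* Grow the data segment downward by `size` bytes.
 * Returns the start of the new block, or UINT32_MAX if it would
 * run into the code segment. */
uint32_t grow_data(struct two_ended *m, uint32_t size)
{
    if (size > m->data_start - m->code_end)
        return UINT32_MAX;
    m->data_start -= size;
    return m->data_start;
}
```

After each successful call the corresponding segment limit in the GDT would have to be updated to match the new boundary.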

A lack of paging also implies that you can't link processes to fixed addresses anymore, which means either losing performance by using PIC and offset tables on 32-bit x86, or loading everything with relocations, kernel included. Neither way is going to be a trivial task.
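For the relocation route, the core of the loader is small. The sketch below assumes a made-up relocation format, a plain list of byte offsets into the image where 32-bit absolute addresses are stored; real object formats such as ELF or PE carry more information per entry, and `apply_relocs` is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Add `delta` (actual load address minus link-time address) to every
 * 32-bit absolute address listed in the relocation table. */
void apply_relocs(uint8_t *image, const uint32_t *reloc_offsets,
                  size_t count, uint32_t delta)
{
    for (size_t i = 0; i < count; i++) {
        uint32_t value;
        /* memcpy keeps the access valid even if the slot is unaligned. */
        memcpy(&value, image + reloc_offsets[i], sizeof value);
        value += delta;
        memcpy(image + reloc_offsets[i], &value, sizeof value);
    }
}
```

The cost is paid once at load time, whereas PIC pays a small cost on every access through the offset table.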

Congdm · Post by **Congdm** » Thu Sep 27, 2012 5:46 am

Combuster wrote:In that case, sticking to that system means you'd want the code and data segments not to meet, but instead to grow whenever an allocation for either domain is made. You will, however, risk serious problems with fragmentation when one object is holding the limit at some point and the other end is not allowed to pass it to get to the available memory.

I hadn't thought carefully about this problem; it needs to be taken care of. Reallocation will not work nicely without paging, so I probably have to accept that limit.

Combuster wrote:A lack of paging also implies that you can't link processes to fixed addresses anymore, which means either losing performance by using PIC and offset tables on 32-bit x86, or loading everything with relocations, kernel included. Neither way is going to be a trivial task.

I have used an offset table in every module since the very beginning. Everything is dynamically linked. But I feel that programming without fixed addresses is awkward.

Besides, how much performance is lost when I use PIC in all modules, even the kernel?

Combuster · Post by **Combuster** » Thu Sep 27, 2012 6:38 am

PIC on x86_32 can sometimes add an additional 10% on top of the regular processing time due to losing one of the six remaining GPRs to a fixed task, but it seems quite dependent on the code at hand, resulting in a lack of consensus. IIRC the actual average would be between 2% and 5%.

Congdm · Post by **Congdm** » Thu Sep 27, 2012 7:07 am

Thanks for all the advice. Now I can return to coding; my basic system is nearly complete. Without the OSDev forum, it would have been impossible to finish in only 2 months.

NickJohnson · Post by **NickJohnson** » Thu Sep 27, 2012 8:33 am

Combuster wrote:PIC on x86_32 can sometimes add an additional 10% on top of the regular processing time due to losing one of the six remaining GPRs to a fixed task, but it seems quite dependent on the code at hand, resulting in a lack of consensus. IIRC the actual average would be between 2% and 5%.

There is also an overhead in acquiring the base address of the loaded object (from EIP) whenever you enter a PIC function from outside a shared library, which I think is pretty significant, but it doesn't apply when you're using PIC for an executable like a kernel. Also, there is pretty much zero overhead on x86_64 because of RIP-relative addressing modes.

qw · Post by qw » Wed Oct 03, 2012 3:50 am

I agree that having separate segments for code and data is the easiest way to implement non-execute protection. I have two questions though:

Are you sure that the addresses covered by your segments are actually present?
Why is the data segment expand-down?

linguofreak · Post by **linguofreak** » Sat Oct 06, 2012 12:59 pm

Hobbes wrote:Why is the data segment expand-down?

If he were using paging it would make sense: It would allow him to have both CS and DS be zero-based (which, as I understand, modern x86's make a special case of to improve performance, since pretty much everything uses a flat memory model nowadays) without having them overlap.

As it is, it runs into the "is the memory even present" issue that you mention.
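The limit rule for the two segment types can be sketched as follows, assuming a 32-bit segment (B=1) and a limit already expressed in bytes; the function names are made up. With the same limit in a zero-based CS and DS, every offset is valid in exactly one of them, which is why the two do not overlap.

```c
#include <stdint.h>

/* Expand-up segment: offsets 0 .. limit are valid. */
int expand_up_valid(uint32_t limit, uint32_t offset)
{
    return offset <= limit;
}

/* Expand-down segment (32-bit, B=1): offsets limit+1 .. 0xFFFFFFFF are valid. */
int expand_down_valid(uint32_t limit, uint32_t offset)
{
    return offset > limit;
}
```

Whether the addresses in the upper range are backed by RAM is a separate question that the limit check does not answer.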

rdos · Post by **rdos** » Sun Oct 14, 2012 5:58 am

I cannot see much use in this. It is possible to protect code (using paging) from writes, and most of the issues of flat memory models involve data corruption, not code corruption. To achieve the advantages of segmentation, it is necessary to use it properly, and map each object to distinct data & code. The other alternative is to go microkernel. Maybe even go 64-bit and let the upper 16 bits of the 48-bit address represent a "segment". However, long mode has no fine-grained limit checking, so it is not the same thing.

Griwes · Post by **Griwes** » Sun Oct 14, 2012 6:42 am

rdos wrote:and most of the issues of flat memory models involve data corruption

You are either talking about:
1) userspace apps - no problem, let them corrupt themselves, not solvable by os developer
2) device drivers - if you don't trust a driver, not allowing it into kernel space is the correct solution, not relying on a deprecated hardware feature. µkernels seem to fix the problem easily (good for you for mentioning it in that post)

rdos wrote:To achieve the advantages of segmentation

Which were what exactly, compared to paging?

rdos wrote:Maybe even go 64-bit

Why "even", looking at modern machines? (Don't mislead other people: someone just starting a project will only have it anywhere near usable around the time when everyone already has some IA-32e machine.)

rdos · Post by **rdos** » Mon Oct 15, 2012 12:25 am

Griwes wrote:
rdos wrote:To achieve the advantages of segmentation
Which were what exactly, compared to paging?

Precise limit checking, which cannot be done with paging. Paging typically has a 4k granularity, and thus cannot be used to enforce exact limits. Paging can detect limit violations after they have occurred, but it cannot detect them when they occur.
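The granularity difference can be illustrated with two toy checks. This is a simplified model, not how the hardware is programmed: it assumes a byte-granular expand-up segment (G=0) and an object that starts on a page boundary, and the function names are invented.

```c
#include <stdint.h>

#define PAGE_SIZE 4096u

/* Segment-style check: byte-granular, limit = object size - 1. */
int segment_allows(uint32_t limit, uint32_t offset)
{
    return offset <= limit;
}

/* Page-style check: the object occupies whole pages, so any offset
 * up to the end of its last page is accepted. */
int paging_allows(uint32_t object_size, uint32_t offset)
{
    uint32_t mapped = (object_size + PAGE_SIZE - 1) / PAGE_SIZE * PAGE_SIZE;
    return offset < mapped;
}
```

For a 100-byte object, `segment_allows(99, 100)` rejects the first out-of-bounds byte, while `paging_allows(100, 100)` still accepts it and only offsets from 4096 upward are refused.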

OSDev.org
