Segmentation in long mode and LMLE in EFER

lopidas · Post by **lopidas** » Wed Nov 13, 2013 10:54 am

In 32-bit mode I can use segment registers ds and ss to separate stack from heap.
Is it possible to use the DS, SS and ES registers in 64-bit mode with LMSLE enabled?

Combuster · Post by **Combuster** » Wed Nov 13, 2013 12:33 pm

Oh, look what I found in the Intel manual, Volume 1, Section 3.3.4

*ahem*

lopidas · Post by **lopidas** » Wed Nov 13, 2013 1:01 pm

Curious again

So, referring to Intel here on segmentation in long mode:

In 64-bit mode, segmentation is generally (but not completely) disabled, creating a flat 64-bit linear-address
space. The processor treats the segment base of CS, DS, ES, SS as zero, creating a linear address that is equal to
the effective address. The exceptions are the FS and GS segments, whose segment registers (which hold the
segment base) can be used as additional base registers in some linear address calculations.
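
As an illustration of the quoted rule (not taken from the manual), the address calculation can be modelled in a few lines of C. The names `seg_t`, `seg_state` and `linear_address_64` are made up for this sketch, and it models only the base handling; canonical-address checks and everything else are ignored.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names, for illustration only. */
typedef enum { SEG_CS, SEG_DS, SEG_ES, SEG_SS, SEG_FS, SEG_GS, SEG_COUNT } seg_t;

typedef struct {
    uint64_t base[SEG_COUNT];   /* base held in each segment register */
} seg_state;

/* Model of the 64-bit mode rule quoted above: CS, DS, ES and SS are
 * treated as having base 0; only FS and GS contribute their base. */
static uint64_t linear_address_64(const seg_state *s, seg_t seg, uint64_t effective)
{
    assert((unsigned)seg < SEG_COUNT);
    if (seg == SEG_FS || seg == SEG_GS)
        return s->base[seg] + effective;
    return effective;           /* whatever base was loaded is ignored */
}
```

With a DS base of 0x1000 stored in the state, `linear_address_64(&s, SEG_DS, 0x10)` still yields 0x10, while the same call through FS with a base of 0x2000 yields 0x2010.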

and here AMD:

In 64-bit mode, data reads and writes are not normally checked for segment-limit violations. When
EFER.LMSLE = 1, reads and writes in 64-bit mode at CPL > 0, using the DS, ES, FS, or SS segments,
have a segment-limit check applied.
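
A small sketch of the condition in that AMD sentence, written as a pure C function so it can be tried in user space. The EFER MSR number (0xC0000080) and the LMSLE bit position (bit 13) are given as I recall them from AMD's EFER layout, so check them against the manual; the names `seg_reg` and `limit_check_applies` are hypothetical. Reading EFER itself needs `rdmsr` at CPL 0, which is why the value is passed in as a parameter. How the limit is actually compared is not modelled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MSR_EFER    0xC0000080u          /* EFER MSR number (assumed, see lead-in) */
#define EFER_LMSLE  (UINT64_C(1) << 13)  /* Long Mode Segment Limit Enable (assumed bit 13) */

/* Hypothetical enum naming the segment an access goes through. */
typedef enum { R_CS, R_DS, R_ES, R_SS, R_FS, R_GS } seg_reg;

/* True if, per the AMD text quoted above, a data access in 64-bit mode
 * gets a segment-limit check: LMSLE set, CPL > 0, and DS, ES, FS or SS. */
static bool limit_check_applies(uint64_t efer, unsigned cpl, seg_reg seg)
{
    assert(cpl <= 3);
    if (!(efer & EFER_LMSLE) || cpl == 0)
        return false;
    return seg == R_DS || seg == R_ES || seg == R_FS || seg == R_SS;
}
```

Note that GS is absent from the quoted list, so the sketch returns false for it even with LMSLE set.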

So how do we generate instructions using DS, ES or SS if segmentation is disabled? Are the access checks applied?

P.S. I found that even SYSCALL refers to SS

Brendan · Post by **Brendan** » Wed Nov 13, 2013 5:46 pm

Hi,

lopidas wrote:In 32-bit mode I can use segment registers ds and ss to separate stack from heap.

In theory you can do this in 32-bit protected mode. In practice functions that accept pointers would have to know if the pointers point to something on the stack or on the heap (or somewhere else); and you end up having to use "fat pointers" everywhere ("48-bit pointers" consisting of a 16-bit segment and a 32-bit offset) and doing very slow segment register loads very often; which creates more code (e.g. segment register loads), more bloat (e.g. pointers take up twice the space due to alignment) and more bugs (e.g. right offset but wrong segment). Basically, there's no benefit that anyone cares about, and no sane person would consider crippling performance for no benefit.
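
To make the "twice the space" point concrete, here is a minimal sketch of a 16:32 fat pointer as a C struct. The type `far_ptr32` and the helper are hypothetical (no mainstream flat-model compiler provides such a type), and the sizes assume a common ABI where `uint32_t` is 4-byte aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout of a 16:32 "fat" pointer. */
struct far_ptr32 {
    uint32_t offset;    /* 32-bit offset within the segment */
    uint16_t selector;  /* 16-bit segment selector */
};

/* Bytes of information actually carried: 4 + 2 = 6. */
enum { FAR_PTR_PAYLOAD = sizeof(uint32_t) + sizeof(uint16_t) };

/* Padding the compiler adds so that arrays of far pointers stay aligned. */
static size_t far_ptr_padding(void)
{
    assert(sizeof(struct far_ptr32) >= FAR_PTR_PAYLOAD);
    return sizeof(struct far_ptr32) - FAR_PTR_PAYLOAD;
}
```

Under that assumption the struct occupies 8 bytes against 4 for a flat 32-bit pointer: 6 bytes of payload plus 2 bytes of padding.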

Instead; for 32-bit protected mode sane people set DS, ES and SS to "read/write, base is zero, limit is 4 GiB" so that segmentation is effectively disabled (and then use paging for protection).
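
For reference, a sketch of how such a "base is zero, limit is 4 GiB" data descriptor is packed into a GDT entry. The helper names are hypothetical; the access byte 0x92 (present, ring 0, data, writable) and flags 0xC (4 KiB granularity, 32-bit) are the values commonly used for a flat kernel data segment, not something stated in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Pack a legacy 8-byte segment descriptor (hypothetical helper). */
static uint64_t make_descriptor(uint32_t base, uint32_t limit,
                                uint8_t access, uint8_t flags)
{
    assert(limit <= 0xFFFFFu);  /* limit field is 20 bits */
    assert(flags <= 0xFu);      /* flags field is 4 bits  */

    uint64_t d = 0;
    d |= (uint64_t)(limit & 0xFFFFu);             /* limit 15:0  -> bits 15:0  */
    d |= (uint64_t)(base & 0xFFFFFFu) << 16;      /* base 23:0   -> bits 39:16 */
    d |= (uint64_t)access << 40;                  /* access byte -> bits 47:40 */
    d |= (uint64_t)((limit >> 16) & 0xFu) << 48;  /* limit 19:16 -> bits 51:48 */
    d |= (uint64_t)(flags & 0xFu) << 52;          /* G, D/B, L, AVL -> bits 55:52 */
    d |= (uint64_t)(base >> 24) << 56;            /* base 31:24  -> bits 63:56 */
    return d;
}

/* Flat data segment: base 0, limit 0xFFFFF pages of 4 KiB = 4 GiB. */
static uint64_t flat_data_descriptor(void)
{
    return make_descriptor(0, 0xFFFFF, 0x92, 0xC);
}
```

`flat_data_descriptor()` evaluates to 0x00CF92000000FFFF, the value seen in many tutorial GDTs.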

lopidas wrote:So how do we generate instruction using DS,ES or SS, if the segmentation is disabled? Are the access checks applied?

You generate instructions in the same way you always have; except that you can just use the default segment register and forget about segment override prefixes (because it makes no difference if the instruction uses DS, ES or SS). Basically; it's the same as what sane people have always done in 32-bit protected mode.

I can't remember which access checks are done when. Typically in 64-bit mode you'd load a "read/write data" descriptor into DS, ES and SS; and after that it's impossible to do anything that any "segmentation access check" won't allow anyway. Of course you'd use paging for the actual/useful access checks.

Cheers,

Brendan

rdos · Post by **rdos** » Mon Nov 18, 2013 8:07 am

Brendan wrote: In theory you can do this in 32-bit protected mode. In practice functions that accept pointers would have to know if the pointers point to something on the stack or on the heap (or somewhere else); and you end up having to use "fat pointers" everywhere ("48-bit pointers" consisting of a 16-bit segment and a 32-bit offset) and doing very slow segment register loads very often; which creates more code (e.g. segment register loads), more bloat (e.g. pointers take up twice the space due to alignment) and more bugs (e.g. right offset but wrong segment). Basically, there's no benefit that anyone cares about, and no sane person would consider crippling performance for no benefit.

More bugs? You cannot be serious. Most bugs are pointer overwrites (usually in the heap, where they corrupt the memory of something else or the memory allocation chain). These are bugs that flat kernels are usually full of, and which, in combination with multicore multitasking, create bugs that never get fixed.

Besides, smart C compilers will have calling conventions and code generation that avoid most of the segment register reloads. By having a single code segment, you can also omit far calls and treat all function pointers as near.

But this is impossible to do in long mode, because long mode has crippled the descriptor logic and uses a zero base regardless of the actual base in the descriptor cache (except for FS and GS). I don't think the limit checking works for FS and GS in long mode; only the base is used. You will have to use legacy mode or protected mode in order to use segmentation as it was meant to be used.

Combuster · Post by **Combuster** » Mon Nov 18, 2013 2:52 pm

rdos wrote:More bugs? You cannot be serious.

Yes, more bugs. Running with the whole load of instrumentation enabled also reveals more bugs.

Not that either has any demonstrated effect on the quality of the coder.

Brendan · Post by **Brendan** » Mon Nov 18, 2013 5:39 pm

Hi,

rdos wrote:
Brendan wrote: In theory you can do this in 32-bit protected mode. In practice functions that accept pointers would have to know if the pointers point to something on the stack or on the heap (or somewhere else); and you end up having to use "fat pointers" everywhere ("48-bit pointers" consisting of a 16-bit segment and a 32-bit offset) and doing very slow segment register loads very often; which creates more code (e.g. segment register loads), more bloat (e.g. pointers take up twice the space due to alignment) and more bugs (e.g. right offset but wrong segment). Basically, there's no benefit that anyone cares about, and no sane person would consider crippling performance for no benefit.
More bugs? You cannot be serious.

I'm sure I can be serious if I try hard enough!

Without segmentation, only the "offset" can be wrong. With segmentation, the offset can be wrong, the segment can be wrong, or both can be wrong. Obviously there are 3 times as many possible bugs with segmentation.

The theoretical benefit of segmentation is that it's more able to catch some bugs; so (in theory) even though there are more possible bugs to catch, fewer of them are meant to find their way into the final executable. Of course in practice it doesn't work like that, and if you really care about catching bugs then segmentation is extremely inferior to (e.g.) managed code (for both "ability to catch bugs" and performance).

Fortunately, AMD were smart enough to close off this particular "detour to the land of nonsense" by removing support for segmentation. We should all thank AMD for this.

Cheers,

Brendan

linguofreak · Post by **linguofreak** » Mon Nov 18, 2013 7:36 pm

Brendan wrote:Hi,

lopidas wrote:In 32-bit mode I can use segment registers ds and ss to separate stack from heap.
In theory you can do this in 32-bit protected mode. In practice functions that accept pointers would have to know if the pointers point to something on the stack or on the heap (or somewhere else); and you end up having to use "fat pointers" everywhere ("48-bit pointers" consisting of a 16-bit segment and a 32-bit offset) and doing very slow segment register loads very often; which creates more code (e.g. segment register loads), more bloat (e.g. pointers take up twice the space due to alignment) and more bugs (e.g. right offset but wrong segment).

I can imagine some architectural features that would alleviate some of the fat pointer issues, but they would completely break compatibility with any existing member of the x86 family. I think that a properly designed MMU architecture could make segmentation extremely useful, but I've not seen anything that quite fits the bill (the access register mechanism on IBM mainframes probably comes closest, from what I've seen).

rdos · Post by **rdos** » Tue Nov 19, 2013 2:56 am

The "fat" 48-bit pointers in long mode could work almost as well as segment + offset protection, if used in a sane way. However, compilers (GCC) still have serious bugs in their medium and large memory models, which means that popular code for long mode doesn't use this feature either, but rather packs all its data structures together and uses 32-bit relative addressing. That, of course, is just as bad as typical 32-bit flat code, which is usually full of bugs.

And it is easy to understand that proper segmentation with base + limit checking should be possible to implement much more efficiently than the 4-level page structure in long mode. If it wasn't for all this "portable" bloat code floating around, Intel and AMD could actually have done this properly.
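
For readers comparing the two mechanisms, here is a rough sketch of what each has to compute per access. All names are hypothetical. The segment check is simplified to a single-byte access on an expand-up segment, and the paging half only splits the address into its table indices (4 KiB pages assumed) without walking any tables. Real hardware caches translations in a TLB, so this does not settle the performance argument either way.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Segment-style translation: one compare and one add. */
static bool seg_translate(uint32_t base, uint32_t limit,
                          uint32_t offset, uint32_t *linear)
{
    assert(linear != 0);
    if (offset > limit)
        return false;           /* hardware would raise a fault here */
    *linear = base + offset;
    return true;
}

/* 4-level paging: four 9-bit table indices plus a 12-bit page offset. */
typedef struct {
    unsigned pml4, pdpt, pd, pt, offset;
} walk_indices;

static walk_indices split_virtual_address(uint64_t va)
{
    walk_indices w;
    w.pml4   = (unsigned)((va >> 39) & 0x1FF);  /* bits 47:39 */
    w.pdpt   = (unsigned)((va >> 30) & 0x1FF);  /* bits 38:30 */
    w.pd     = (unsigned)((va >> 21) & 0x1FF);  /* bits 29:21 */
    w.pt     = (unsigned)((va >> 12) & 0x1FF);  /* bits 20:12 */
    w.offset = (unsigned)(va & 0xFFF);          /* bits 11:0  */
    return w;
}
```

On a TLB miss each of the four indices selects an entry in a separate in-memory table, whereas the segment check works entirely from values already held in the segment register.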
