running 32-bit code in LM64

kerravon
Member
Posts: 278
Joined: Fri Nov 17, 2006 5:26 am

running 32-bit code in LM64

Post by kerravon »

I have had success in getting Win64 executables to
run under UEFI with a relatively small amount of
"glue" to switch formats. That's what the original
UCX64 at http://pdos.org is.

I decided to see if I could get Win32 executables
to run instead of Win64, using different - more
complicated - "glue".

Here is a simple call to puts, but written in assembler
(so I need a C compiler to generate assembler that
looks like this - a separate exercise):

https://sourceforge.net/p/pdos/gitcode/ ... demo32.asm

So this is dependent on msvcrt.dll, and then this loader code:

https://sourceforge.net/p/pdos/gitcode/ ... /exeload.c

(search for second occurrence of w32puts) will put in the stubs for the win32
executable, and the puts stub can be found here:

https://sourceforge.net/p/pdos/gitcode/ ... 32hack.asm

So anyway, with this in place, it means I can write a 32-bit
windows executable that works on anything from win 95
to win 11, and it also runs on an appropriate 64-bit UEFI-based
system (basically just UCX64 from http://pdos.org). Note that
this is running 32-bit code in long mode (not CM32). And note
that I don't exit boot services.

This has been tested under qemu and on real hardware.

There will be issues for something like printf, where I don't
know how many parameters there are. I could potentially get
around that by making it mandatory for (my) Win32 executables
to call getmainargs and if argc is (faked to be) greater than
x'80000000' (I did something similar in PDPCLIB for the Amiga)
then it means that argv is also faked, and is a structure, and
you need to go there to fetch the real values of argc and argv,
as well as a "global" variable that contains the number of
arguments when you call a variable-argument function. The
compiler would generate code to say that if that global pointer
is not NULL then set it to the argument count, otherwise, take
no action as you are running under a normal Win32 environment.
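
A minimal sketch of the argc sentinel idea described above, in C. The structure layout, the field names, and the helper `resolve_args` are all hypothetical; the post only says that argv becomes "a structure" holding the real argc and argv plus a global for the argument count. "Greater than x'80000000'" is read here as an unsigned comparison, which is also an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: the post does not specify the real one. */
struct fake_args {
    int    real_argc;      /* the genuine argument count */
    char **real_argv;      /* the genuine argument vector */
    int   *vararg_count;   /* global the compiler-generated code would set */
};

/* Turn the values reported at startup into the real ones.
   Returns the pointer to the vararg-count global, or NULL when the
   program is running under an ordinary Win32 environment. */
static int *resolve_args(int *argc, char ***argv)
{
    /* Sentinel check: a faked argc has the top bit set. */
    if ((uint32_t)*argc > 0x80000000u) {
        struct fake_args *fa = (struct fake_args *)(void *)*argv;
        *argc = fa->real_argc;
        *argv = fa->real_argv;
        return fa->vararg_count;
    }
    return NULL;
}
```

Before each call to a variable-argument function, the generated code would then store the argument count through the returned pointer only if it is not NULL.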

I've only done proof of concept for 2 functions so far. Maybe
there is some show-stopper I will find later?

Any thoughts/improvements?

You can download a disk image containing the system from http://pdos.org
at the bottom of the University Challenge x64 section.

Thanks. Paul.
Octocontrabass
Member
Posts: 5501
Joined: Mon Mar 25, 2013 7:01 pm

Re: running 32-bit code in LM64

Post by Octocontrabass »

So, it's a compiler that's restricted to the subset of x86 instructions that work (mostly) the same in both 32-bit and 64-bit mode, some stack manipulation trickery, and thunk functions to translate the ABI?

That's definitely an interesting way of doing things. It means you have to recompile your programs to make them work in this new environment, though.

I was expecting something that would involve switching the CPU to 32-bit compatibility mode, which UEFI allows for some reason.
iansjack
Member
Posts: 4683
Joined: Sat Mar 31, 2012 3:07 am
Location: Chichester, UK

Re: running 32-bit code in LM64

Post by iansjack »

I'm sure that I'm missing something, but - What's the point?
rdos
Member
Posts: 3269
Joined: Wed Oct 01, 2008 1:55 pm

Re: running 32-bit code in LM64

Post by rdos »

Octocontrabass wrote: I was expecting something that would involve switching the CPU to 32-bit compatibility mode, which UEFI allows for some reason.
Why wouldn't it, and how could it possibly stop it? My 64-bit EFI loader switches to compatibility mode, turns off long mode & paging and then turns on protected mode. Works perfectly well with all 64-bit EFI implementations I've tested it with.
Octocontrabass
Member
Posts: 5501
Joined: Mon Mar 25, 2013 7:01 pm

Re: running 32-bit code in LM64

Post by Octocontrabass »

rdos wrote:Why wouldn't it,
Because boot services are still running. There are some restrictions on what you can do before you exit boot services.
davmac314
Member
Posts: 121
Joined: Mon Jul 05, 2021 6:57 pm

Re: running 32-bit code in LM64

Post by davmac314 »

Octocontrabass wrote:... involve switching the CPU to 32-bit compatibility mode, which UEFI allows for some reason.
That's less of an allowance than it is a disallowance.

You can switch mode, but you can't call UEFI services without switching back, and you must disable interrupts or handle them and switch back before passing them through to the firmware handler. Which, logically, you don't need permission to do; the firmware couldn't easily stop you doing this anyway, and if you don't call back into it and don't have interrupts enabled then it wouldn't even be aware that you'd done it.

I.e. rather than reading it as "you are allowed to switch mode" I read it as "you are not allowed to call UEFI services while operating in a different mode, including allowing UEFI services to be entered via an interrupt handler".
Octocontrabass
Member
Posts: 5501
Joined: Mon Mar 25, 2013 7:01 pm

Re: running 32-bit code in LM64

Post by Octocontrabass »

davmac314 wrote:the firmware couldn't easily stop you doing this anyway,
The firmware could very easily stop you with a SMI handler that relies on the CPU being in 64-bit mode until you exit boot services.
davmac314
Member
Posts: 121
Joined: Mon Jul 05, 2021 6:57 pm

Re: running 32-bit code in LM64

Post by davmac314 »

Octocontrabass wrote:
davmac314 wrote:the firmware couldn't easily stop you doing this anyway,
The firmware could very easily stop you with a SMI handler that relies on the CPU being in 64-bit mode until you exit boot services.
I guess we're operating with a different meaning of "easily". Sure, it could perhaps go to certain lengths to do so, but there wouldn't be much to gain from doing it, and it'd be in contravention of the usual tack that UEFI takes in terms of restricting applications using protection mechanisms (i.e. it doesn't).

Edit: maybe I misunderstood you. But, I'm having difficulty thinking of any good reason to have the SMI handler behave differently pre- and post- exiting boot services. Isn't SMM generally meant to operate correctly independently of whatever mode the processor is otherwise in?
Octocontrabass
Member
Posts: 5501
Joined: Mon Mar 25, 2013 7:01 pm

Re: running 32-bit code in LM64

Post by Octocontrabass »

davmac314 wrote:But, I'm having difficulty thinking of any good reason to have the SMI handler behave differently pre- and post- exiting boot services.
The firmware could map a timer's interrupts to SMI to implement the watchdog timer. The UEFI watchdog timer only runs before exiting boot services, so the timer handler could make assumptions about the CPU state if those assumptions were allowed by the UEFI spec.

I don't know if any firmware actually does use SMI for the watchdog timer, but it's one possibility.
davmac314 wrote:Isn't SMM generally meant to operate correctly independently of whatever mode the processor is otherwise in?
That's the idea, but there are definitely some buggy SMI handlers out there.
kerravon
Member
Posts: 278
Joined: Fri Nov 17, 2006 5:26 am

Re: running 32-bit code in LM64

Post by kerravon »

Octocontrabass wrote:So, it's a compiler that's restricted to the subset of x86 instructions that work (mostly) the same in both 32-bit and 64-bit mode, some stack manipulation trickery, and thunk functions to translate the ABI?
Yes.
Octocontrabass wrote:That's definitely an interesting way of doing things. It means you have to recompile your programs to make them work in this new environment, though.
I personally don't consider that to be a problem, because I consider that I am still at the stage of "gathering PC coding guidelines" (a process that I started around 1987 and that is still not complete) as well as "gathering appropriate source code" (another process that isn't yet complete, and that didn't really start until the early 1990s). Once those things are gathered, plus an appropriate build process (not really even started, although I do have pdmake), I will do a once-off compile of everything and then that's it: it runs on the 80386 and everything subsequent. The applications (not interested in OS) may start breaking with 128-bit or 256-bit x128 and x256 computers 50 or 500 years from now, but ideally the technique would not require the applications to be rebuilt.
Octocontrabass wrote:I was expecting something that would involve switching the CPU to 32-bit compatibility mode, which UEFI allows for some reason.
Thanks for that. I think this is probably what I really want.

First though - I want to ask about a possible show-stopper. On the mainframe when I tried to run 32-bit code on a 64-bit system (to cut a long story short), I encountered a problem with negative indexes. Instead of wrapping at the 4 GiB mark as expected, it started accessing non-existent memory in the 4 - 8 GiB region. I was able to solve that problem by using paging and mapping the 4-8 GiB region to 0-4 GiB. It only occurred to me today that the same thing might apply to the x64. Negative indexes are a normal part of life and you would have to go to quite a lot of effort to change the code generator to test for this condition and use different instructions to do a manual truncation. So - when you index using [eax] in LM64, does it auto-wrap at 4 GiB or does it reach above 4 GiB? If the latter, I would presumably need changes to the UEFI page tables to overcome this problem, which is outside of my original plan (and also pointless because I may as well just use CM32 as I need privilege either way). My original plan allowed me to run my OS under UEFI even if UEFI made me run in ring 3 or whatever it is called until I exited boot services (and they could potentially make that the case, even though it isn't currently - this is in answer to the question posed elsewhere about it being "easy" to stop CM32 from being activated).
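
As a rough illustration of the remapping mentioned above (aliasing the 4-8 GiB region onto 0-4 GiB), here is a sketch in C that works on a plain array standing in for a page-directory-pointer table. It assumes 4-level x86-64 paging, where each PDPT entry covers 1 GiB, and that the table is the one referenced by PML4 entry 0. The function name is hypothetical. Whether firmware page tables are writable at all is not something the post establishes.

```c
#include <stdint.h>

#define PDPT_ENTRIES 512   /* each PDPT entry covers 1 GiB */

/* Make virtual 4-8 GiB an alias of virtual 0-4 GiB by copying the four
   PDPT entries for 0-4 GiB into the slots for 4-8 GiB. Whatever was
   mapped at 4-8 GiB before becomes unreachable through this table. */
static void alias_4_to_8_gib(uint64_t pdpt[PDPT_ENTRIES])
{
    for (int i = 0; i < 4; i++)
        pdpt[4 + i] = pdpt[i];
    /* On real hardware the stale TLB entries would then have to be
       flushed (for example by reloading CR3); omitted in this sketch. */
}
```

With such an alias in place, an access that lands just above 4 GiB reaches the same memory as the wrapped 32-bit address would have.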

Anyway, from the sounds of things, to meet my goals (running (certain) Win32 software universally - note that Win64 came out in 2005 so there are potentially still active patents for it, and I need to wait around 3 years more to be sure I'm clear), I want CM32. And I believe the process is this:

1. Disable interrupts ("cli").
2. Obtain a pointer to the current GDT using "sgdt" and save it.
3. "lgdt" a new GDT, which can be a copy of the old one, but the Long-mode code flag in the code segment must be clear and the Size flag set.
4. Set the code segment and jump to the CM32 code.
5. Now you are in CM32.
6. To return, reload the old GDT saved in step 2.
7. Set the code segment and jump to the LM64 code.
8. Re-enable interrupts ("sti").
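
For the new GDT in the list above, the descriptor bits involved can be sketched in C as follows. The helper names are hypothetical, and the flat base/limit values used in the usage note are the conventional ones rather than anything a particular firmware is known to use. The layout follows the standard 8-byte segment descriptor format, with the L flag at bit 53 and the Size (D/B) flag at bit 54.

```c
#include <stdint.h>

/* Bits of the flag nibble (descriptor bits 52-55). */
#define GDT_FLAG_L  0x2u   /* long-mode code segment */
#define GDT_FLAG_DB 0x4u   /* size flag: 32-bit default operand size */
#define GDT_FLAG_G  0x8u   /* granularity: limit counted in 4 KiB units */

/* Pack base, 20-bit limit, access byte and flag nibble into a descriptor. */
static uint64_t gdt_entry(uint32_t base, uint32_t limit,
                          uint8_t access, uint8_t flags)
{
    uint64_t d = 0;
    d |= (uint64_t)(limit & 0xFFFFu);              /* limit 15:0  */
    d |= (uint64_t)(base & 0xFFFFFFu) << 16;       /* base 23:0   */
    d |= (uint64_t)access << 40;                   /* access byte */
    d |= (uint64_t)((limit >> 16) & 0xFu) << 48;   /* limit 19:16 */
    d |= (uint64_t)(flags & 0xFu) << 52;           /* flag nibble */
    d |= (uint64_t)((base >> 24) & 0xFFu) << 56;   /* base 31:24  */
    return d;
}

/* Derive a compatibility-mode code descriptor from a 64-bit one:
   clear the Long-mode flag and set the Size flag. */
static uint64_t to_cm32_code(uint64_t desc64)
{
    desc64 &= ~((uint64_t)GDT_FLAG_L << 52);
    desc64 |= (uint64_t)GDT_FLAG_DB << 52;
    return desc64;
}
```

For example, `gdt_entry(0, 0xFFFFF, 0x9A, GDT_FLAG_G | GDT_FLAG_L)` gives the familiar flat 64-bit code descriptor `0x00AF9A000000FFFF`, and passing that through `to_cm32_code` gives `0x00CF9A000000FFFF`. A flat 32-bit data descriptor, which 32-bit code also needs for its data and stack segments, would be `gdt_entry(0, 0xFFFFF, 0x92, GDT_FLAG_G | GDT_FLAG_DB)`.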

https://wiki.osdev.org/Setting_Up_Long_ ... it_Submode
https://wiki.osdev.org/GDT#Segment_Descriptor

Correct?

Thanks. Paul.
Octocontrabass
Member
Posts: 5501
Joined: Mon Mar 25, 2013 7:01 pm

Re: running 32-bit code in LM64

Post by Octocontrabass »

kerravon wrote:So - when you index using [eax] in LM64, does it auto-wrap at 4 GiB or does it reach above 4 GiB?
If you use an address size override in 64-bit mode to index using EAX instead of RAX, the effective address is truncated to 32 bits. Or, to quote Intel, "a 32-bit address generated in 64-bit mode can access only the low 4 GBytes of the 64-bit mode effective addresses."
kerravon wrote:even if UEFI made me run in ring 3 or whatever it is called until I exited boot services
Fortunately, UEFI can't do that.
kerravon wrote:And I believe the process is this:
Looks correct to me, but you're missing a 32-bit data segment and a stack.
rdos
Member
Posts: 3269
Joined: Wed Oct 01, 2008 1:55 pm

Re: running 32-bit code in LM64

Post by rdos »

Octocontrabass wrote:
rdos wrote:Why wouldn't it,
Because boot services are still running. There are some restrictions on what you can do before you exit boot services.
I suppose so, but I switch to protected mode after exiting boot services. However, I can see a problem for people that want to run their applications under UEFI.
kerravon
Member
Posts: 278
Joined: Fri Nov 17, 2006 5:26 am

Re: running 32-bit code in LM64

Post by kerravon »

Octocontrabass wrote:
kerravon wrote:So - when you index using [eax] in LM64, does it auto-wrap at 4 GiB or does it reach above 4 GiB?
If you use an address size override in 64-bit mode to index using EAX instead of RAX, the effective address is truncated to 32 bits. Or, to quote Intel, "a 32-bit address generated in 64-bit mode can access only the low 4 GBytes of the 64-bit mode effective addresses."
I'm not using address size overrides - the design is for this to be valid 32-bit code. It first and foremost runs on an 80386. Then I'm hoping it will also work when I switch to LM64.

I've already demonstrated it can be done for a simple puts. Now my question is whether the C compiler can generate appropriate assembler code for more complicated C code.
Octocontrabass
Member
Posts: 5501
Joined: Mon Mar 25, 2013 7:01 pm

Re: running 32-bit code in LM64

Post by Octocontrabass »

kerravon wrote:I'm not using address size overrides
Then you're not indexing with EAX. In 64-bit mode, you need an address size override to index using a 32-bit register.
kerravon
Member
Posts: 278
Joined: Fri Nov 17, 2006 5:26 am

Re: running 32-bit code in LM64

Post by kerravon »

Octocontrabass wrote:
kerravon wrote:I'm not using address size overrides
Then you're not indexing with EAX. In 64-bit mode, you need an address size override to index using a 32-bit register.
Ok, so since this is unchanged binary code, that means what used to be indexing with eax is now indexing with rax.

But I have arranged for the high 32 bits of all registers to be 0, so normally everything is fine.

So - negative indexes will indeed be an issue then? And I either need to map 4-8 GiB to 0-4 GiB or I need to switch to CM32, right?
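
The difference in question can be modelled with plain integer arithmetic in C. This is only a model of the address calculation, not code that runs in either mode, and the function names are hypothetical. It assumes, as stated above, that the upper 32 bits of both registers are zero when the unchanged 32-bit binary is executed in 64-bit mode.

```c
#include <stdint.h>

/* Effective address with 32-bit address size (an 80386, CM32, or an
   address-size override in 64-bit mode): the sum wraps at 4 GiB. */
static uint64_t ea32(uint32_t base, int32_t index)
{
    return (uint32_t)(base + (uint32_t)index);
}

/* Effective address when the same bytes are decoded in 64-bit mode and
   the registers hold zero-extended 32-bit values: a "negative" index is
   just a large positive offset, so the sum does not wrap. */
static uint64_t ea64_zero_extended(uint32_t base, int32_t index)
{
    return (uint64_t)base + (uint64_t)(uint32_t)index;
}
```

With a base of `0x1000` and an index of -4, `ea32` gives `0xFFC` while `ea64_zero_extended` gives `0x100000FFC`, which lies in the 4-8 GiB region. The two results always agree in their low 32 bits, which is why aliasing 4-8 GiB onto 0-4 GiB would hide the difference. Negative constant displacements encoded in the instruction (such as an 8-bit -4) are sign-extended by the CPU in 64-bit mode, so the concern applies to negative values held in registers.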

I guess it would have been good if the former - mapping the 4-8 GiB region - was the default on UEFI unless you do a call to activate the high memory.

Actually - UEFI could still do that mapping, and could map the 8-12 GiB region to 4-8 GiB real memory, if it exists.

The unfortunate thing is this "preferred address" of executables being defaulted to somewhere in the 4-8 GiB region instead of either 8-12 (or even higher) or 0-4.

They used the exact address range that I need to be "dead" (remapped/duplicated).

In hindsight, this might have been the way to support 32-bit executables running in a 64-bit environment.

It's the exact thing I did for z/PDOS (but IBM didn't do that for z/OS).