running 32-bit code in LM64

devc1 · Post by **devc1** » Thu Sep 14, 2023 1:53 pm

kerravon wrote:
devc1 wrote:It's 2023 and you guys are still worried about 32 bit mode

Use 32 bit registers in 64 bit mode, it's as easy as it is.
Code written with such 32-bit overrides will end up doing 16-bit if run on an 80386. Not what I wish to do.

Recompile it in x64 architecture.

Octocontrabass · Post by **Octocontrabass** » Thu Sep 14, 2023 11:17 pm

kerravon wrote:If you want to insist on your worldview, so be it - why isn't the MIPS (alone in the world supposedly)

MIPS is the only architecture I know of that doesn't require any mode changes to run all 32-bit software on a 64-bit CPU, but that doesn't mean there aren't others.

kerravon wrote:the right way to do things for all processors?

MIPS was probably just lucky.

On IBM mainframes, addresses were smaller than registers, and the unused upper bits were ignored in address calculations. On MIPS, addresses were the same size as registers, and the CPU's internal cache used every address bit even if external hardware ignored some upper address bits.

On x86, tons of opcodes were already assigned, and adding new 64-bit instructions without changing any opcodes would make all 64-bit instructions extremely long. On MIPS, every instruction is 32 bits, so there were plenty of free opcodes for new 64-bit instructions alongside the existing 32-bit instructions.

kerravon wrote:Specifically I think the MSDOS (actually PDOS/86) executables need to not have any functions that cross a 64k boundary.

That seems like an unnecessary limitation. Not all pointers have to be huge pointers.

kerravon wrote:So that the PDOS/286 (also PDOS/386 running applications with the D bit set to activate PM16 for the applications so that more selectors are available) loader can load huge memory model programs and shuffle the data so that the exact same executable can address 512 MB of memory instead.

Microsoft's implementation for the huge memory model in Windows involves a special variable set by the loader. When performing arithmetic on a huge pointer, programs use that variable to adjust the segment portion of the huge pointer. In real mode, the variable is 0x1000. In protected mode, it's something like 0x8 or 0x10.
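A rough sketch of the scheme described above may help. The names `seg_increment`, `huge_ptr`, and `huge_add` are hypothetical and are not Microsoft's identifiers; the code only simulates the arithmetic on a host compiler and assumes the byte count is small enough that the 32-bit sum does not overflow.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical loader-set variable: the amount to add to the segment part
 * for every 64 KiB crossed. 0x1000 models real mode; 0x8 models protected
 * mode with consecutive descriptor-table entries. */
static uint16_t seg_increment = 0x1000;

typedef struct {
    uint16_t seg; /* real-mode segment or protected-mode selector */
    uint16_t off; /* 16-bit offset within that segment */
} huge_ptr;

/* Add a byte count to a huge pointer, carrying any offset overflow
 * into the segment part via seg_increment. */
static huge_ptr huge_add(huge_ptr p, uint32_t bytes)
{
    uint32_t sum = (uint32_t)p.off + bytes;   /* may exceed 0xFFFF */
    p.seg = (uint16_t)(p.seg + (sum >> 16) * seg_increment);
    p.off = (uint16_t)sum;                    /* keep the low 16 bits */
    return p;
}
```

With `seg_increment` set to 0x1000, adding 0x20 to 2000:FFF0 gives 3000:0010, which is the same linear address the real-mode formula would produce. With it set to 0x8, the same program steps to the next selector instead, without being rebuilt.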

That doesn't follow Intel's rules of treating segments as completely opaque, but Microsoft got away with it because the 286 is older than Windows.

kerravon wrote:I want exactly what you said above - follow Intel's rules and the definition of a selector suddenly changes, and your application doesn't care at all. Instead, your application says "whee - 16 MiB or 512 MiB instead of 640 KiB".

Huge pointer arithmetic is very simple if you follow Intel's rules: make the OS do it.

Unfortunately, I think that means you still need at least a little bit of conditional execution in your programs, since MS-DOS doesn't have a huge pointer arithmetic API.

rdos · Post by **rdos** » Fri Sep 15, 2023 12:16 pm

devc1 wrote:What memory protection are you talking about, long mode has paging which includes PAT (on all modern processors) which includes Write protect, write combine, cache disable, write back, write through. What protection are u talking about, and what more do u need ?

U got large pages, huge pages. Multiple cpus, NUMA and SIMD. ......

That's of little use when people build monolithic kernels which combine the whole kernel code and data into adjacent locations. Of course, paging cannot solve these issues. The combined kernel heap that is mapped in all address spaces is even worse.

One solution is to build a micro kernel instead, but this causes many address space switches just to provide isolation. Segmentation is more effective and only requires segment register loads.

devc1 · Post by **devc1** » Fri Sep 15, 2023 12:45 pm

That's of little use when people build monolithic kernels which combine the whole kernel code and data into adjacent locations. Of course, paging cannot solve these issues. The combined kernel heap that is mapped in all address spaces is even worse.

Of course paging can solve every issue. Have you heard about relocation? Or at least executable images that let you decide where each section should be in memory?

If you are worried about a driver screwing up your memory, just make a user mode driver, problem solved!

I personally let the drivers and the kernel all run in one address space, so I can benefit from the TLB extensions, which limit you to 4095 address spaces.
But you can also put them in separate user-mode processes, or in kernel-mode processes with different page tables.

rdos · Post by **rdos** » Fri Sep 15, 2023 2:48 pm

devc1 wrote:
That's of little use when people build monolithic kernels which combine the whole kernel code and data into adjacent locations. Of course, paging cannot solve these issues. The combined kernel heap that is mapped in all address spaces is even worse.
Of course paging can solve every issue. Have you heard about relocation? Or at least executable images that let you decide where each section should be in memory?

Tried it with GCC, and it didn't work. The linker padded the executable with a huge amount of zeros.

If the linker could handle this properly, then you could put each driver in its own 4G area, and implement the heap per driver. With RIP-relative addressing (which only covers 4G), you could obtain decent isolation, but at a pretty high cost in linear address space usage.
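As a minimal sketch of that layout (the base address and all names here are assumptions chosen for illustration, not taken from any real kernel), each driver index maps to its own 4 GiB window, and a pointer can be checked against the window it is supposed to stay in:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DRIVER_AREA_SIZE (UINT64_C(1) << 32)           /* 4 GiB per driver */
#define DRIVER_AREA_BASE UINT64_C(0xFFFF900000000000)  /* assumed, arbitrary */

/* Start of the linear address window reserved for driver number `index`. */
static uint64_t driver_area_start(unsigned index)
{
    return DRIVER_AREA_BASE + (uint64_t)index * DRIVER_AREA_SIZE;
}

/* True if `addr` lies inside driver `index`'s window. Addresses below the
 * window wrap around to a huge unsigned value and are rejected. */
static bool in_driver_area(unsigned index, uint64_t addr)
{
    return addr - driver_area_start(index) < DRIVER_AREA_SIZE;
}
```

This only describes the address layout. Any actual isolation would still depend on the driver code using nothing but RIP-relative and per-driver heap addresses, which is the part the linker has to cooperate with.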

devc1 wrote: If you are worried about a driver screwing up your memory, just make a user mode driver, problem solved!

That's a microkernel. I'm using that approach for filesystem drivers.

rdos · Post by **rdos** » Fri Sep 15, 2023 2:53 pm

Octocontrabass wrote:
rdos wrote:It's severely broken, mostly because compatibility mode trashes the upper 32 bits of registers.
Other architectures work this way too. Why is it only a problem for x86?

It's an interoperability problem. If long mode code calls compatibility mode code, then the compatibility mode code cannot save the registers it uses, since the upper halves are trashed.

Octocontrabass wrote:
rdos wrote:It's also broken because 64-bit registers cannot be used in protected mode, like 32-bit registers could be used in real mode and 16-bit protected mode.
That's intentional. Allowing 16-bit software to use 32-bit registers is a major design flaw. A 16-bit OS won't save and restore 32-bit registers when it switches tasks, so all running tasks share the same set of registers!

It's not. Mixed-bitness designs need this feature.

Also, some calculations in a 32-bit OS would benefit by using 64-bit integers, and if 64-bit registers were available, these calculations would become a lot faster than using two 32-bit registers.

devc1 · Post by **devc1** » Fri Sep 15, 2023 3:19 pm

rdos wrote:
devc1 wrote:
That's of little use when people build monolithic kernels which combine the whole kernel code and data into adjacent locations. Of course, paging cannot solve these issues. The combined kernel heap that is mapped in all address spaces is even worse.
Of course paging can solve every issue. Have you heard about relocation? Or at least executable images that let you decide where each section should be in memory?
Tried it with GCC, and it didn't work. The linker padded the executable with a huge amount of zeros.

If the linker could handle this properly, then you could put each driver in its own 4G area, and implement the heap per driver. With RIP-relative addressing (which only covers 4G), you could obtain decent isolation, but at a pretty high cost in linear address space usage.

devc1 wrote: If you are worried about a driver screwing up your memory, just make a user mode driver, problem solved!
That's a microkernel. I'm using that approach for filesystem drivers.

For me everything works fine with MSVC (Microsoft compiler). I also think it's better, because PE executables are generally better.

As for the zero padding, is it in initialized or uninitialized memory?

Octocontrabass · Post by **Octocontrabass** » Fri Sep 15, 2023 5:33 pm

rdos wrote:Tried it with GCC, and it didn't work. The linker padded the executable with a huge amount of zeros.

That sounds like a mistake in your linker script.

rdos wrote:It's an interoperability problem. If long mode code calls compatibility mode code, then the compatibility mode code cannot save the registers it uses, since the upper halves are trashed.

I still don't see how that's a problem. If long mode code calls compatibility mode code, then the long mode code is responsible for saving registers.

rdos wrote:It's not. Mixed-bitness designs need this feature.

In mixed designs, you could jump to 32-bit code to access 32-bit registers.

rdos wrote:Also, some calculations in a 32-bit OS would benefit by using 64-bit integers, and if 64-bit registers were available, these calculations would become a lot faster than using two 32-bit registers.

That's a 64-bit OS. We already have the x32 ABI if you want 64-bit registers with 32-bit addresses.

rdos · Post by **rdos** » Sat Sep 16, 2023 1:29 pm

Octocontrabass wrote:
rdos wrote:Tried it with GCC, and it didn't work. The linker padded the executable with a huge amount of zeros.
That sounds like a mistake in your linker script.

rdos wrote:It's an interoperability problem. If long mode code calls compatibility mode code, then the compatibility mode code cannot save the registers it uses, since the upper halves are trashed.
I still don't see how that's a problem. If long mode code calls compatibility mode code, then the long mode code is responsible for saving registers.

rdos wrote:It's not. Mixed-bitness designs need this feature.
In mixed designs, you could jump to 32-bit code to access 32-bit registers.

rdos wrote:Also, some calculations in a 32-bit OS would benefit by using 64-bit integers, and if 64-bit registers were available, these calculations would become a lot faster than using two 32-bit registers.
That's a 64-bit OS. We already have the x32 ABI if you want 64-bit registers with 32-bit addresses.

I don't want a 64-bit OS, since it's impossible to reuse protected mode drivers (primarily because of the issue with trashing the upper halves of registers). I would just like to occasionally use 64-bit registers in my 32-bit OS. For instance, sector numbers are 64-bit and so are file sizes, and I could pass them in RAX or RDX instead of using two 32-bit registers. I could also run the filesystem server processes in long mode. The latter might still work by letting the scheduler switch between protected mode and long mode.
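To make the register-pair cost concrete, here is a small sketch of what a 32-bit ABI has to do with a 64-bit sector number. The choice of EAX for the low half and EDX for the high half is an assumption for illustration (it mirrors the common EDX:EAX convention), and the type and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Two 32-bit registers standing in for one 64-bit value. */
typedef struct {
    uint32_t eax; /* low 32 bits */
    uint32_t edx; /* high 32 bits */
} reg_pair;

/* Caller side: split a 64-bit sector number across the pair. */
static reg_pair split_sector(uint64_t sector)
{
    reg_pair r;
    r.eax = (uint32_t)sector;
    r.edx = (uint32_t)(sector >> 32);
    return r;
}

/* Callee side: reassemble the 64-bit value from the pair. */
static uint64_t join_sector(reg_pair r)
{
    return ((uint64_t)r.edx << 32) | r.eax;
}
```

With a single 64-bit register, both helpers disappear and the value occupies one register instead of two, which is the saving being asked for.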

devc1 · Post by **devc1** » Sat Sep 16, 2023 3:35 pm

That makes me wonder: doesn't 64-bit Windows support 32-bit applications? So it must be possible.

I think you can either use VMX to create a virtual machine, run the 32-bit applications inside a 32-bit guest OS within the 64-bit host OS, and communicate through something like message pipes.
I think that's what Windows does.

Or use compatibility mode. It should work like a normal 32-bit program, but I think you need to link the drivers against 32-bit functions. That's why you should probably switch to DLLs and let your loader decide how to link the symbols. I took this idea from the NT kernel: I build my kernel as a library, which produces the .lib file, and when loading a driver I tell the loader to add the kernel to the library list, which contains the pointer to the export directory inside the kernel file (do this after relocating!).

I don't know much about either of those, so that's up to you to figure out. (My OS is 64-bit only.)

I think it's easier for you to perform little edits on your 32 bit drivers and recompile them as 64 bit ones, might take a week or more but it's definitely easier.

I don't know what's the issue with trashing upper halves of 64 bit registers since you are already using only the lower halves (you are in compatibility mode).

Octocontrabass · Post by **Octocontrabass** » Sat Sep 16, 2023 4:20 pm

rdos wrote:I don't want a 64-bit OS, since it's impossible to reuse protected mode drivers (primarily because of the issue with trashing the upper halves of registers).

You still haven't explained why it's a problem in the first place. It's your OS with your ABI! Why can't you make 64-bit code preserve the upper halves of 64-bit registers before calling 32-bit code?

devc1 wrote:I think that's what Windows does.

Windows uses compatibility mode.

rdos · Post by **rdos** » Sun Sep 17, 2023 7:52 am

devc1 wrote: I think it's easier for you to perform little edits on your 32 bit drivers and recompile them as 64 bit ones, might take a week or more but it's definitely easier.

Not so. A majority of the drivers are in assembly, and so cannot be recompiled to 64-bit. They also depend on the compact memory model, where each driver has its own code and data selector.

devc1 wrote: I don't know what's the issue with trashing upper halves of 64 bit registers since you are already using only the lower halves (you are in compatibility mode).

That's really simple. I have a register-based ABI, which defines which registers are used as input and output. That should be transparent to 64-bit code, but trashing the upper halves means 64-bit code must save all registers shared with 32-bit code, and then must load the output registers, if there are any.

nullplan · Post by **nullplan** » Sun Sep 17, 2023 9:14 am

rdos wrote:Not so. A majority of the drivers are in assembly, and so cannot be recompiled to 64-bit. They also depend on the compact memory model, where each driver has its own code and data selector.

So we finally get to the heart of the issue: You wrote your code in an unportable way, and now are mad that this limitation ended up mattering. This is a lesson I learned early on, when I dabbled in writing Windows programs in assembler and then noticed that for 64-bit mode, everything was different enough as to be a complete rewrite. What was a single option for even the C programmer would have been too much work to bother with in assembler. So for one, using assembler for the task was bloody worthless, since most Windows programs are just long sequences of function calls, for two you end up making mistakes the C compiler would have caught in a split second, and finally there's this issue.

This is why I swore off assembler. I obviously still use it, but only in very limited fashion, and the main logic I write in C. This is also the way most operating systems are written. Because they end up running into an architecture wall the same way as you. Your inability to anticipate an architecture change is not AMD's fault. You made your bed, now lie in it.

rdos wrote:That's really simple. I have a register-based ABI, which defines which registers are used as input and output. That should be transparent to 64-bit code, but trashing the upper halves means 64-bit code must save all registers shared with 32-bit code, and then must load the output registers, if there are any.

Most software that I know that uses mode changes like that actually defines trampoline routines to handle register loading and saving. Of course, this means your 64-bit code must know that the routines it wishes to call are in 32-bit mode. You could define stub functions for 64-bit mode that only call the trampoline to the 32-bit mode function, though.
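The trampoline idea can be modelled in plain C. This is only a simulation under stated assumptions: a real trampoline would be written in assembly and perform the far transfer between 64-bit and compatibility mode. The register file, `fake_driver32`, `call32_trampoline`, and the output mask are all hypothetical names, and the stand-in driver deliberately leaves junk in every upper half to mimic the halves not being preserved.

```c
#include <assert.h>
#include <stdint.h>

enum { RAX, RBX, RCX, RDX, NREGS };

typedef struct { uint64_t r[NREGS]; } regs64;

/* Stand-in for a 32-bit driver entry point: input in EBX, result in EAX.
 * It models the mode switch by leaving junk in every upper half. */
static void fake_driver32(regs64 *regs)
{
    uint32_t result = (uint32_t)regs->r[RBX] + 1;
    for (int i = 0; i < NREGS; i++)
        regs->r[i] = UINT64_C(0xDEADBEEF00000000) | (uint32_t)regs->r[i];
    regs->r[RAX] = UINT64_C(0xDEADBEEF00000000) | result;
}

/* Trampoline: save the caller's full 64-bit registers, run the 32-bit
 * routine, then restore everything except the registers named in out_mask,
 * which keep their 32-bit result zero-extended to 64 bits. */
static void call32_trampoline(regs64 *regs, void (*fn)(regs64 *),
                              unsigned out_mask)
{
    regs64 saved = *regs;
    fn(regs);
    for (int i = 0; i < NREGS; i++) {
        if (out_mask & (1u << i))
            regs->r[i] = (uint32_t)regs->r[i]; /* keep output, drop junk */
        else
            regs->r[i] = saved.r[i];           /* caller's value survives */
    }
}
```

A 64-bit stub per 32-bit entry point would then just call the trampoline with that entry point's output mask, so the rest of the 64-bit code never has to know which side of the mode boundary a routine lives on.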

devc1 · Post by **devc1** » Sun Sep 17, 2023 2:18 pm

I think the only way forward for you now is to implement compatibility mode. It shouldn't take long, right?
In the time since this thread was created, you could have implemented it LOL

rdos · Post by **rdos** » Sun Sep 17, 2023 11:53 pm

nullplan wrote:So we finally get to the heart of the issue: You wrote your code in an unportable way, and now are mad that this limitation ended up mattering.

When I started my OS (1988), there was no portable way to write 32-bit code. The 386 processor was pretty new, and the only tool I had was MASM targeting MS-DOS. I'm not mad at the current situation; rather, I have anticipated this for a while.

nullplan wrote: This is a lesson I learned early on, when I dabbled in writing Windows programs in assembler and then noticed that for 64-bit mode, everything was different enough as to be a complete rewrite. What was a single option for even the C programmer would have been too much work to bother with in assembler. So for one, using assembler for the task was bloody worthless, since most Windows programs are just long sequences of function calls, for two you end up making mistakes the C compiler would have caught in a split second, and finally there's this issue.

After the move from Borland C++ to OpenWatcom some 10-15 years ago, I have a way to write device drivers in C. Some complex device drivers are written in C, for instance ACPI, FreeType, HID and sound codecs. However, all of them require a stub that registers gates and interfaces with the rest of the kernel. Porting drivers to C has not been an attractive option, since the C compiler produces rather poor code for segmentation. I also have most of the file system servers and VFS in C/C++.

nullplan wrote: Your inability to anticipate an architecture change is not AMD's fault. You made your bed, now lie in it.

My design has worked for 35 years, which is pretty good in the software world. I can run on machines from the original 386 up to all modern x86-based processors without recompiling anything. That's not the case for today's 64-bit operating systems that won't run on older hardware.

I have extensive experience with C/C++ and flat memory models in the application area, and such code basically never comes even close to being bug-free. The most common problems are buffer overruns, use of objects after they are freed, and double frees, all of which can cause memory corruption in random areas. I don't want this in my OS kernel, so I'm absolutely not writing just another flat-memory-model long mode kernel. There has to be some method to avoid this to make it an interesting project. For protected mode, the method is segmentation.
