Brief update on Verbum: protected mode, at last
- Schol-R-LEA
- Member
- Posts: 1925
- Joined: Fri Oct 27, 2006 9:42 am
- Location: Athens, GA, USA
Brief update on Verbum: protected mode, at last
As a slight update, I have extended the previous Verbum code (now on a new Git branch) to switch into 32-bit protected mode, clear the screen, and print 'Kernel started' in blue text. This is a proof of concept rather than my endgame for this project, but it does show that the GDT I came up with works (though I did have to modify it to get there) and that p-mode code is running.
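For readers who have not built a GDT by hand, here is a minimal C sketch of how one 8-byte segment descriptor is packed. It is not the Verbum GDT (which is written in assembly and not shown in this thread); the function name gdt_entry is made up for illustration, and the field layout is the standard x86 one.

```c
#include <assert.h>
#include <stdint.h>

/* Pack one 8-byte segment descriptor from its logical fields.
 * base:   32-bit linear base address
 * limit:  20-bit limit (counted in 4 KiB units when the G flag is set)
 * access: access byte (present bit, DPL, segment type)
 * flags:  4-bit flags nibble (G, D/B, L, AVL)
 * gdt_entry is a hypothetical helper name, not taken from Verbum. */
static uint64_t gdt_entry(uint32_t base, uint32_t limit,
                          uint8_t access, uint8_t flags)
{
    assert(limit <= 0xFFFFF);
    assert(flags <= 0xF);

    uint64_t d = 0;
    d |= (uint64_t)(limit & 0xFFFF);             /* limit bits 15:0  */
    d |= (uint64_t)(base & 0xFFFFFF) << 16;      /* base bits 23:0   */
    d |= (uint64_t)access << 40;                 /* access byte      */
    d |= (uint64_t)((limit >> 16) & 0xF) << 48;  /* limit bits 19:16 */
    d |= (uint64_t)(flags & 0xF) << 52;          /* G, D/B, L, AVL   */
    d |= (uint64_t)((base >> 24) & 0xFF) << 56;  /* base bits 31:24  */
    return d;
}
```

A flat 4 GiB ring-0 code segment is gdt_entry(0, 0xFFFFF, 0x9A, 0xC) and the matching data segment uses access byte 0x92.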
I still have to figure out a) how to load a separate kernel file from the FAT12 file system, preferably an ELF file, b) how to pass the existing data structures (especially the high memory map) to said kernel, and c) how to proceed in creating the remaining core data structures such as a working IDT. I suspect that these three aspects will be at least as much work as everything I've done on this to date.
Still, I am pleased that - after two full decades of procrastination and waffling - I've finally managed to actually get going on this admittedly basic project.
I doubt that I will proceed much further beyond getting a basic ELF kernel going; at that point, I can take the work done so far as lessons learned, and apply myself to a more modern design booting from UEFI. I'll do a more complete conventional OS kernel, presumably in either C or Rust depending on what I decide later, and take what I learn doing that into my longer-term goals for Thelema and Kether. I may take a quasi-sabbatical from that at some point, to work on getting a deeper understanding of compiler development and language design, and may work on some other interim experiments with using various other languages in OS-dev along the way.
Rev. First Speaker Schol-R-LEA;2 LCF ELF JAM POEE KoR KCO PPWMTF
Ordo OS Project
Lisp programmers tend to seem very odd to outsiders, just like anyone else who has had a religious experience they can't quite explain to others.
Re: Brief update on Verbum: protected mode, at last
Well, protected mode is something, and something is better than nothing.
For the problem of handing the memory map to the kernel, I had the idea of writing the arguments to the stack, the same way it is done for userspace applications. I have this structure (although it is subject to revision):
Code: Select all
#include <stdint.h>

struct irkutsk_memblock {
    uint64_t start, len;
};
/* The actual arguments to the Irkutsk kernel */
struct irkutsk_args {
    uint32_t version;
    char cmdline[32];
    uint64_t physbase;
    uint16_t nRAM, nACPI, nRESERVED, dummy; /* entry counts per type */
    struct irkutsk_memblock memblocks[29];  /* sorted by type: RAM, ACPI reclaimable, reserved */
    uint64_t framebuffer_addr;
    uint32_t framebuffer_pitch;
    uint32_t fb_width, fb_height;
    uint8_t fb_bpp, fb_type, color_info[6];
};
and then the bootloader just fills it out. The kernel finds it at the initial stack pointer.
Yeah, for the memory map, I was dissatisfied with the usual format, two qwords and a type field hanging off them like an errant bogey. So I decided to just revamp it, and have a list of only the two qwords, sorted by type. So the first nRAM entries in the list are the RAM blocks, then nACPI entries for the ACPI reclaimable stuff, and finally nRESERVED for the reserved stuff (I have nothing for ACPI NVS; I just don't touch any memory not named here).
I think I am reconsidering the "physbase" member. It was supposed to be the physical base of the kernel, but the kernel is a virtually mapped executable, and doesn't need to have just a single base. It can be strewn about the entire memory for all I care. The bootloader maps it correctly, and the kernel inherits the mapping table. Other than that, all the places in physical memory that are occupied by the kernel are supposed to be added to the "reserved" list (as are the initial paging structure and the boot stack). Then the kernel only needs to modify the mappings to take advantage of NX or Global pages, and to get rid of the low memory trampoline.
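To make the "sorted by type" idea concrete, here is a small C sketch of how a kernel might consume such a list. The structure is copied from the post; the helper names memblock_kind and total_ram and the enum are invented for this example and are not part of the Irkutsk code.

```c
#include <assert.h>
#include <stdint.h>

struct irkutsk_memblock {
    uint64_t start, len;
};

/* Same fields as the structure posted above. */
struct irkutsk_args {
    uint32_t version;
    char cmdline[32];
    uint64_t physbase;
    uint16_t nRAM, nACPI, nRESERVED, dummy;
    struct irkutsk_memblock memblocks[29];
    uint64_t framebuffer_addr;
    uint32_t framebuffer_pitch;
    uint32_t fb_width, fb_height;
    uint8_t fb_bpp, fb_type, color_info[6];
};

/* Hypothetical names for the three categories, plus "not in the list". */
enum mem_kind { MEM_RAM, MEM_ACPI, MEM_RESERVED, MEM_UNLISTED };

/* No per-entry type field is needed: the type of entry i follows
 * from its position in the sorted list. */
static enum mem_kind memblock_kind(const struct irkutsk_args *a, unsigned i)
{
    unsigned ram_end = a->nRAM;
    unsigned acpi_end = ram_end + a->nACPI;
    unsigned reserved_end = acpi_end + a->nRESERVED;

    if (i < ram_end) return MEM_RAM;
    if (i < acpi_end) return MEM_ACPI;
    if (i < reserved_end) return MEM_RESERVED;
    return MEM_UNLISTED;
}

/* Total usable RAM is the sum over the first nRAM entries only. */
static uint64_t total_ram(const struct irkutsk_args *a)
{
    uint64_t sum = 0;
    assert(a->nRAM <= 29);
    for (unsigned i = 0; i < a->nRAM; i++)
        sum += a->memblocks[i].len;
    return sum;
}
```

The trade-off is that the loader has to sort the firmware's map once, in exchange for the kernel never having to inspect a type tag.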
Carpe diem!
- Schol-R-LEA
Re: Brief update on Verbum: protected mode, at last
nullplan wrote: For the problem of handing the memory map to the kernel, I had the idea of writing the arguments to the stack, the same way it is happening for userspace applications.
That's more or less what I had in mind myself, though I had some concerns about ensuring that the old stack itself was still accessible to the protected-mode code at that point. Good to hear that it would be.
Re: Brief update on Verbum: protected mode, at last
Schol-R-LEA wrote: That's more or less what I had in mind myself, though I had some concerns about ensuring that the old stack itself was still accessible to the protected-mode code at that point. Good to hear that it would be.
Well no. I'm mapping a new stack, and I'm writing the arguments to the new stack.
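As a rough illustration of writing the arguments to a new stack, the following C sketch copies an argument block to the top of a freshly allocated stack region and returns the address that would become the initial stack pointer. The function name, the 16-byte alignment, and the NULL-on-failure convention are assumptions made for this example; in a real loader the address handed to the kernel would also have to be translated into the kernel's virtual mapping of that stack.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy an argument block to the top of a new stack.
 * Returns the address the initial stack pointer should hold (the block
 * starts exactly there), or NULL if the block does not fit.
 * place_args_on_stack is a hypothetical helper, not from any real loader. */
static void *place_args_on_stack(void *stack_base, size_t stack_size,
                                 const void *args, size_t args_size)
{
    assert(stack_base != NULL && args != NULL);
    if (args_size + 15 > stack_size)
        return NULL;                                    /* would not fit */

    uintptr_t top = (uintptr_t)stack_base + stack_size; /* stacks grow down */
    uintptr_t sp  = (top - args_size) & ~(uintptr_t)15; /* round down to 16 bytes */
    memcpy((void *)sp, args, args_size);
    return (void *)sp;
}
```

The kernel entry point can then treat the stack pointer it starts with as a pointer to the argument block, which is the same convention a userspace process uses to find argc and argv.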
Carpe diem!
- Schol-R-LEA
Re: Brief update on Verbum: protected mode, at last
Schol-R-LEA wrote: That's more or less what I had in mind myself, though I had some concerns about ensuring that the old stack itself was still accessible to the protected-mode code at that point. Good to hear that it would be.
nullplan wrote: Well no. I'm mapping a new stack, and I'm writing the arguments to the new stack.
Ah, that makes more sense.
- Schol-R-LEA
Re: Brief update on Verbum: protected mode, at last
I have to confess, I am somewhat stuck on where to go next.
I am planning to use an ELF file as the kernel proper, but in order to load it I would need to write a floppy driver and a suitable loader, or more likely, a pager, with the kernel file code sections mapped into the higher half of memory.
While a simple floppy driver could work with PIO and polling, I would at some point need a more complete driver using DMA and IRQ 6. This means having a working IDT.
In order to have a working IDT, I would need to at least get paging set up enough that I can identity map the IDT. The alternative is to set up a rump IDT and then switch to a more complete one later.
All this implies that the next step is enabling paging and setting up at least certain pages as identity mapped, before I load the kernel proper.
Is this really the logical sequence to do this in, or am I missing something? Are there other equally reasonable directions I could take this in? I can't help but think that paging would be one of the later steps, not the first one upon setting up protected mode.
Re: Brief update on Verbum: protected mode, at last
Is it possible that you are overthinking this? I would load the kernel from real mode using the BIOS. That way, you already support everything the BIOS supports, and can load your kernel from floppy, HDD, USB stick, CD-ROM, whatever. In particular, I would look into a simple FS driver for whatever FS you are running on. It only needs read support, and typically not even that to the fullest extent (e.g. if you write an ext2 driver, you typically don't even need support for triple-indirect blocks).
Once you have that, you can use it to load your kernel file wherever you wish. Typical approach is to simply assume it is going to fit en bloc into the memory at 1MB, but for ELF files in particular, you can load each page separately into page-aligned blocks.
That means, the next thing you need is a memory allocator.
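A sketch of what "load each page separately" plus "a memory allocator" could look like in C. The program header struct follows the ELF64 field order; the bump allocator, its names, and the use of 0 as an out-of-memory marker are assumptions for illustration, not anyone's actual loader.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u
#define PT_LOAD   1u    /* ELF program header type: loadable segment */

/* Minimal ELF64 program header, fields in ELF64 order. */
struct elf64_phdr {
    uint32_t p_type, p_flags;
    uint64_t p_offset, p_vaddr, p_paddr, p_filesz, p_memsz, p_align;
};

/* Number of 4 KiB pages a loadable segment touches, allowing for a
 * start address that is not page-aligned. */
static uint64_t segment_pages(const struct elf64_phdr *ph)
{
    assert(ph != NULL);
    if (ph->p_type != PT_LOAD || ph->p_memsz == 0)
        return 0;
    uint64_t mask  = ~(uint64_t)(PAGE_SIZE - 1);
    uint64_t first = ph->p_vaddr & mask;
    uint64_t end   = (ph->p_vaddr + ph->p_memsz + PAGE_SIZE - 1) & mask;
    return (end - first) / PAGE_SIZE;
}

/* Hypothetical bump allocator over one block of free physical memory. */
struct bump {
    uint64_t next, limit;   /* next free address, end of the block */
};

/* Hand out one page-aligned physical page, or 0 when the block is used up. */
static uint64_t bump_alloc_page(struct bump *b)
{
    uint64_t page = (b->next + PAGE_SIZE - 1) & ~(uint64_t)(PAGE_SIZE - 1);
    if (page + PAGE_SIZE > b->limit)
        return 0;
    b->next = page + PAGE_SIZE;
    return page;
}
```

A loader would call bump_alloc_page once per page counted by segment_pages, copy p_filesz bytes from the file into those pages, zero the remainder up to p_memsz, and record each page in the page tables at the matching virtual address.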
Schol-R-LEA wrote: In order to have a working IDT, I would need to at least get paging set up enough that I can identity map the IDT. The alternative is to set up a rump IDT and then switch to a more complete one later.
You don't need paging for the IDT, and the kernel proper should load its own IDT anyway; it needs the code for that regardless once SMP enters the mix. Again, I would stay in BIOS mode to load the kernel file (you can copy it with function 87h of interrupt 15h).
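To see why the IDT does not depend on paging, it helps to look at what a gate actually stores: a code segment selector and an offset, nothing page-related. Below is a hedged C sketch of packing a 32-bit protected-mode gate; idt_gate is an invented helper name, and long mode uses a different 16-byte format that is not shown here.

```c
#include <assert.h>
#include <stdint.h>

/* Pack one 8-byte protected-mode gate descriptor.
 * handler:   offset of the handler within the code segment
 * selector:  code segment selector, e.g. 0x08 for the first GDT entry
 *            after the null descriptor
 * type_attr: present bit, DPL and gate type; 0x8E means present, ring 0,
 *            32-bit interrupt gate
 * idt_gate is a hypothetical helper name. */
static uint64_t idt_gate(uint32_t handler, uint16_t selector, uint8_t type_attr)
{
    assert(type_attr & 0x80);   /* a non-present gate faults when used */

    uint64_t g = 0;
    g |= (uint64_t)(handler & 0xFFFF);      /* offset bits 15:0  */
    g |= (uint64_t)selector << 16;          /* segment selector  */
    g |= (uint64_t)type_attr << 40;         /* byte 4 stays zero */
    g |= (uint64_t)(handler >> 16) << 48;   /* offset bits 31:16 */
    return g;
}
```

With a flat code segment at selector 0x08, idt_gate(handler_address, 0x08, 0x8E) works the same whether or not paging is enabled; paging only changes how the resulting linear address is translated.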
Indeed, this is the direction I am currently going with my ext2 VBR project. Load the kernel into memory using int 13h and int 15h, enable the A20 gate, enable protected mode, set up page tables, enable long mode, jump to kernel. Specifically, enable the A20 gate after the int 15h stuff is done. SeaBIOS tells me the int 15h stuff touches A20, and while they are doing everything right (saving and restoring its state with every interrupt), I would not assume that all BIOSes work that way.
Carpe diem!
- Schol-R-LEA
Re: Brief update on Verbum: protected mode, at last
Hmm, OK, fair point I suppose. I already have a rump FAT12 driver in my boot sector code, as a set of include files, and can easily include them in the second stage loader as well. Thing is, I want to map the kernel to the higher half of memory, which is why I was thinking in terms of setting up all the details of the p-mode setup first and writing a small p-mode FAT12 driver.
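For context, the core of any read-only FAT12 driver, in whatever mode it runs, is following the cluster chain through 12-bit FAT entries. The C sketch below shows that lookup; it is not the Verbum code (which is assembly), and the function names are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read the 12-bit FAT entry for 'cluster' from an in-memory copy of the FAT.
 * Two entries share three bytes, so entry n starts at byte n * 1.5. */
static uint16_t fat12_next(const uint8_t *fat, uint16_t cluster)
{
    assert(fat != NULL);
    size_t off = (size_t)cluster + cluster / 2u;
    uint16_t pair = (uint16_t)(fat[off] | (fat[off + 1] << 8));
    /* Odd entries use the upper 12 bits, even entries the lower 12. */
    return (cluster & 1) ? (uint16_t)(pair >> 4) : (uint16_t)(pair & 0x0FFF);
}

/* Values 0xFF8 through 0xFFF mark the last cluster of a file. */
static int fat12_is_end_of_chain(uint16_t entry)
{
    return entry >= 0xFF8;
}
```

Loading a file is then a loop: read the data cluster, call fat12_next, and stop when fat12_is_end_of_chain returns true.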
What you are proposing essentially amounts to a third stage loader, wherein I would put a small p-mode loader into the 1MiB+ HMA, which would allow me to write that final loader in C, I suppose. However, I was hoping to avoid using three stages. I'd rather have the second stage loader do the hand-off to the kernel, even if it means a lot more assembly coding.
I suppose I could use a real-mode C compiler and re-write the second stage in C, but I would rather avoid adding yet another dependency to the project. This isn't really an OS project, per se, just a boot loader project; I do intend to make a simple kernel to show that it all works, but it would be a very bare-bones one.
Any longer-term OS project will switch to using a UEFI loader instead, though, simply because that is what newer hardware supports.
I mostly wanted to finish the real mode boot loader just to show myself I can see a project through to some sort of completion. After 25+ years in programming, mostly on projects which either got cancelled or which I was taken off of before they released, actually finishing something would be a bit of a novelty for me. I sat on this loader for 20 years; I need to see it through before moving on to something else. I'm pretty close to saying, "yes, this is enough" at this point, but it wouldn't feel complete without actually loading some sort of kernel.
Well, that, and I feel I can still learn something from this process that would help with future projects.
I need to consider what you suggested, though. There's little point in making too much out of this, especially given what I just said. Maybe I just need to do the minimum necessary, rather than the maximum.
- Octocontrabass
- Member
- Posts: 5568
- Joined: Mon Mar 25, 2013 7:01 pm
Re: Brief update on Verbum: protected mode, at last
nullplan wrote: SeaBIOS tells me the int 15h stuff touches A20, and while they are doing everything right (saving and restoring its state with every interrupt), I would not assume that all BIOSes work that way.
The original IBM BIOS didn't save and restore the A20 state. You should assume INT 0x15 AH=0x87 will return with A20 disabled.
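A tiny C model of why the A20 state matters when loading a kernel at 1 MiB: with the gate disabled, address bit 20 is forced to zero, so odd megabytes alias the even megabyte below them. The function names are invented for this illustration; this models the address arithmetic only and does not touch hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Real-mode segment:offset to linear address (can exceed 1 MiB by up to
 * 65520 bytes, which is where the HMA comes from). */
static uint32_t real_mode_linear(uint16_t seg, uint16_t off)
{
    return ((uint32_t)seg << 4) + off;
}

/* Address that actually reaches memory: with A20 disabled, bit 20 is
 * masked off. effective_address is a hypothetical helper name. */
static uint32_t effective_address(uint32_t addr, int a20_enabled)
{
    assert(a20_enabled == 0 || a20_enabled == 1);
    return a20_enabled ? addr : (addr & ~(UINT32_C(1) << 20));
}
```

So a copy to 0x100000 made while A20 is off lands at address 0, on top of the IVT, which is one reason to re-check A20 after the last BIOS call that might have changed it.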
Schol-R-LEA wrote: However, I was hoping to avoid using three stages. I'd rather have the second stage loader do the hand-off to the kernel, even if it means a lot more assembly coding.
You don't need three stages. Your second stage can load the kernel into memory, prepare the higher-half mappings, switch from real mode to long mode, and jump to the kernel without a third stage.
Schol-R-LEA wrote: I suppose I could use a real-mode C compiler and re-write the second stage in C, but I would rather avoid adding yet another dependency to the project.
If you don't mind forcing your compiled code and your stack to live within the first 64kB of memory alongside the IVT and BDA, you can use GCC. You can install a #GP handler that extends your segments in unreal mode to access memory beyond that first 64kB, but you'll still have to pay attention to A20 to access memory above 1MB.
- Schol-R-LEA
Re: Brief update on Verbum: protected mode, at last
Schol-R-LEA wrote: However, I was hoping to avoid using three stages. I'd rather have the second stage loader do the hand-off to the kernel, even if it means a lot more assembly coding.
Octocontrabass wrote: You don't need three stages. Your second stage can load the kernel into memory, prepare the higher-half mappings, switch from real mode to long mode, and jump to the kernel without a third stage.
OK, that's a fair point. I keep thinking in terms of identity mapping for the kernel, but there's no reason to do that, is there? While there are advantages in identity mapping the IDT (at least according to the wiki), even that isn't strictly necessary. I will definitely need to read up more on virtual memory mapping.
Octocontrabass wrote: If you don't mind forcing your compiled code and your stack to live within the first 64kB of memory alongside the IVT and BDA, you can use GCC.
I don't really see that as being worth the effort, to be honest.
- Octocontrabass
Re: Brief update on Verbum: protected mode, at last
Schol-R-LEA wrote: I keep thinking in terms of identity mapping for the kernel, but there's no reason to do that, is there?
Correct. You can't identity-map a higher-half kernel anyway (at least not in long mode - in protected mode there are chipsets that might allow it).
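A short C sketch of the address arithmetic behind that remark, assuming 4-level paging with 48-bit virtual addresses. A higher-half address has all of its top bits set, while x86-64 physical addresses are architecturally limited to 52 bits, so the two can never be equal. The helper names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Index into one paging level for 4-level long-mode paging:
 * level 0 = page table, 1 = page directory, 2 = PDPT, 3 = PML4. */
static unsigned pt_index(uint64_t vaddr, unsigned level)
{
    assert(level < 4);
    return (unsigned)((vaddr >> (12 + 9 * level)) & 0x1FF);
}

/* Canonical means bits 63:47 are all equal (sign extension of bit 47). */
static int is_canonical48(uint64_t vaddr)
{
    uint64_t top = vaddr >> 47;             /* the upper 17 bits */
    return top == 0 || top == 0x1FFFF;
}

/* Higher half: canonical with the top bit set. */
static int is_higher_half(uint64_t vaddr)
{
    return is_canonical48(vaddr) && (vaddr >> 63) != 0;
}

/* An identity mapping needs virtual == physical, and physical addresses
 * are at most 52 bits wide on x86-64. */
static int could_be_identity_mapped(uint64_t vaddr)
{
    return vaddr < (UINT64_C(1) << 52);
}
```

For the common kernel base 0xFFFFFFFF80000000, the PML4 index is 511 and the PDPT index is 510, so the loader only has to fill the last PML4 slot and the second-to-last PDPT slot to map the first gigabyte of the kernel's range.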
Schol-R-LEA wrote: While there are advantages in identity mapping the IDT (at least according to the wiki)
Where does it say that? I can't think of any advantages.
Schol-R-LEA wrote: I don't really see that as being worth the effort, to be honest.
Maybe. It depends on how much x86 assembly you can write before you get lost in your own code. (Certainly a problem I've run into...)
Re: Brief update on Verbum: protected mode, at last
Octocontrabass wrote: The original IBM BIOS didn't save and restore the A20 state. You should assume INT 0x15 AH=0x87 will return with A20 disabled.
That's what I meant. That's why I'm not tampering with A20 until I'm done with BIOS. (Although one of the A20 methods IS BIOS, yes.)
Octocontrabass wrote: If you don't mind forcing your compiled code and your stack to live within the first 64kB of memory alongside the IVT and BDA, you can use GCC.
Well, the only thing you really need to use GCC in 16-bit mode is a code model in which exactly one segment is used. It doesn't necessarily have to be segment 0, but all segment registers must be equal. Also, even 16-bit GCC code requires at least a 386, so if you had any plans to at least detect and filter out older CPUs, you will have to do that in the loader. And you will need an assembly loader.
BTW, that code is pretty simple (this one expects CS and DS to be 0, though):
Code: Select all
cputooold:
    movw $cpuoldmsg, %si
    jmp error /* routine that prints the string at SI and then hangs in an infinite HLT loop */
cpuoldmsg: .asciz "CPU too old"
detect_cpu:
    /* 8086 and 80186 handle PUSH SP differently from more modern CPUs:
       they push the already-decremented value, so the compare fails */
    pushw %sp
    popw %ax
    cmpw %ax, %sp
    jnz cputooold
    /* If we get here, we are on a 286 at least. The 186 introduced the "invalid opcode" exception, so I just use that:
       point INT 6 at the error path, so CPUID lands there on CPUs that lack it */
    cli
    movw $cputooold, 6*4
    movw $0, 6*4+2
    sti
    movl $0x80000000, %eax /* query the highest extended CPUID leaf */
    cpuid
    cmpl $0x80000000, %eax
    jbe cputooold
    movl $0x80000001, %eax
    cpuid
    btl $29, %edx /* EDX bit 29 = long mode */
    jnc cputooold
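The final two CPUID checks in that routine can be restated as a pure C function, which may make the decision easier to follow. This is only a restatement of the logic under the assumption that the caller already obtained the two CPUID results; supports_long_mode is an invented name, and the PUSH SP and invalid-opcode tricks have no C equivalent.

```c
#include <assert.h>
#include <stdint.h>

/* max_ext_leaf: EAX returned by CPUID leaf 0x80000000
 * ext_edx:      EDX returned by CPUID leaf 0x80000001 (only meaningful
 *               when that leaf exists)
 * supports_long_mode is a hypothetical helper name. */
static int supports_long_mode(uint32_t max_ext_leaf, uint32_t ext_edx)
{
    /* Mirrors "cmpl $0x80000000, %eax; jbe": leaf 0x80000001 must exist. */
    if (max_ext_leaf <= 0x80000000u)
        return 0;
    /* Mirrors "btl $29, %edx; jnc": bit 29 is the long mode flag. */
    return (int)((ext_edx >> 29) & 1u);
}
```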
Schol-R-LEA wrote: I don't really see that as being worth the effort, to be honest.
What effort? Subtracting the BDA and IVT, you have 64000 bytes of RAM to do with what you want, and all you need to do is load a file into memory and do some ancillary stuff. The only question is what to do with the other 62000 bytes. And since you can assume a 386 at least, you can also slightly extend your reach by way of inline assembler using the FS and GS segments.
Octocontrabass wrote: Maybe. It depends on how much x86 assembly you can write before you get lost in your own code. (Certainly a problem I've run into...)
Yeah, I am getting there myself. I think I may want to rewrite parts of my ext2 loader in C. Because I am now at the stage where I need to do several loops over complicated data structures, and that is really not fun in assembler.
Carpe diem!
- Octocontrabass
Re: Brief update on Verbum: protected mode, at last
nullplan wrote: That's why I'm not tampering with A20 until I'm done with BIOS.
I wonder if you can rely on INT 0x13 not touching A20, since DOS can load part of itself to the HMA...
nullplan wrote: Well, the only thing you really need to use GCC in 16-bit mode is a code model in which exactly one segment is used. It doesn't necessarily have to be segment 0, but all segment registers must be equal.
True - I was specifically describing using GCC with a 4GiB address space, where it makes the most sense to use segment 0.
nullplan wrote: I never understood why some people want to muck about with reserved bits in the flags register.
For whatever reason, that's the way Intel says to do it.
nullplan wrote: And since you can assume a 386 at least, you can also slightly extend your reach by way of inline assembler using the FS and GS segments.
Or extend it a lot using a #GP handler to enter unreal mode.