[HOW TO] BIOS development: resources and information to share

ignus
Member
Posts: 26
Joined: Thu Jan 30, 2014 9:49 am

Post by ignus »

It's me again :) (Is this the right section? This thread is about neither OS development nor general programming...)
I was looking at the source code of SeaBIOS (coreboot's legacy BIOS) and thought it would be fun to develop a small one myself. I decided to set some goals:
- boot from floppy in QEMU
- VGA and serial initialization and output
- optional: read from HDD

The main problem is that, unlike OS development, there is no forum.biosdev.org, and much less documentation is available (and what exists is much less clear). An example: Intel CPUs should load the BIOS at the fixed address 000F FFF0 (or F000:FFF0, if you prefer), but that part of memory should already be taken.

I found some information in the SuperBios source code (a sort of open-source implementation that doesn't work on QEMU), in the book "Bios companion 2", and finally here, on the I/O ports and devices to initialize (DMA, PIC, timer...).

But I still don't know where to start.

Can someone give me some advice? Any information shared with me (us, 'cause we are legion) is really, really appreciated :)
Thanks guys

Post by ignus »

If anyone is interested, I found another great resource here, in the coreboot wiki (I had thought there wasn't anything interesting on their site).
It's also incredible how many things are still obscure in Intel's architecture.
Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!

Post by Brendan »

Hi,
ignus wrote:The main problem is that, unlike OS development, there is no forum.biosdev.org, and much less documentation is available (and what exists is much less clear). An example: Intel CPUs should load the BIOS at the fixed address 000F FFF0 (or F000:FFF0, if you prefer), but that part of memory should already be taken.
For ancient CPUs (80386 and older) the firmware's ROM actually was at 0x000F0000 and when the CPU started it started executing at 0x000FFFF0 (or "0xF000:0xFFF0" in real mode addressing). 64 KiB is not a lot of space, so when people started expecting the BIOS to handle things like PCI, USB, etc it wasn't enough. To fix this, the firmware's ROM was shifted to just below 0xFFFFFFFF so that it could be much larger (and the CPU started executing code at 0xFFFFFFF0 instead, which ends up as a strange "0xFFFFF000:0xFFF0" real mode address created from playing non-standard tricks with CS segment's base). This created a minor backward compatibility problem (software expected the BIOS to be 0x000F0000) that was solved by copying the "run-time" part of the BIOS into that area of RAM and then telling the memory controller to pretend that area of RAM is "read only".
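
As a rough illustration of the addresses mentioned above, here is a minimal Python sketch of the ordinary real-mode translation (segment * 16 + offset). The helper name `real_mode_address` is made up for this example; the constants are taken from the explanation above.

```python
def real_mode_address(segment: int, offset: int) -> int:
    """Ordinary real-mode translation: segment * 16 + offset."""
    return (segment << 4) + offset

LEGACY_BIOS_START = 0x000F0000   # first byte of the legacy BIOS area
LEGACY_BIOS_END = 0x000FFFFF     # last byte of the legacy BIOS area
RESET_VECTOR = 0xFFFFFFF0        # first instruction fetch with the ROM below 4 GiB

legacy_entry = real_mode_address(0xF000, 0xFFF0)
legacy_size = LEGACY_BIOS_END - LEGACY_BIOS_START + 1
bytes_below_4gib = (1 << 32) - RESET_VECTOR

print(hex(legacy_entry))      # 0xffff0
print(legacy_size // 1024)    # 64 (KiB)
print(bytes_below_4gib)       # 16
```

So only 16 bytes are available at the reset vector itself, which is why the first instruction there is normally just a jump.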

If you're creating your own BIOS ROM (for modern-ish CPUs) then you'd need to do the same - have the firmware's initialisation code, etc just below 0xFFFFFFFF, and copy a "run-time part" (possibly including decompression, and likely excluding initialisation code that isn't needed after POST) into the legacy area.

Of course it's much simpler (and faster) to forget about the BIOS and write a special bootloader for your OS, so that the OS can be installed in ROM and will boot from ROM with no BIOS (and no disk IO, etc) at all.

Also note that for a real computer there's a lot of tricky stuff that needs to happen (e.g. starting with detecting RAM chip sizes and initialising the memory controller, before you can even use any RAM for stack, etc). For emulators (Bochs, Qemu) it's much much easier, as (unlike real hardware) RAM (and a bunch of other things that the firmware is supposed to initialise) are setup/working before the firmware is started.
ignus wrote:But still I don't know where to start
The best place to start would be the same place the CPU starts - the code at 0xFFFFFFF0 (or "0xFFFFF000:0xFFF0"). This mostly has to be a JMP instruction to some code that should switch to protected mode (without using any RAM), followed by code to detect and initialise RAM chips (which can be a "do nothing" stub for Qemu or Bochs), followed by code to setup other parts of the chipset (if/where necessary for the specific chipset), followed by code to decompress/copy the "run-time" part into the legacy BIOS area.

Once that's all done you'd add code to create the IVT; then start working on code to scan PCI buses, find "option ROMs" (e.g. video card's ROM) and initialise them; plus code for the BIOS's own interrupt handlers.


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.

Post by ignus »

Brendan wrote:Of course it's much simpler (and faster) to forget about the BIOS and write a special bootloader for your OS, so that the OS can be installed in ROM and will boot from ROM with no BIOS (and no disk IO, etc) at all.
"Wonderful" is not enough to describe your idea :) That will be my goal.
Brendan wrote:For emulators (Bochs, Qemu) it's much much easier, as (unlike real hardware) RAM (and a bunch of other things that the firmware is supposed to initialise) are setup/working before the firmware is started.
I will certainly use QEMU. I don't trust myself enough to flash that 'thing' onto my lovely computer :)
Brendan wrote:CPU starts - the code at 0xFFFFFFF0 (or "0xFFFFF000:0xFFF0")
Isn't that just 16 bytes below the 4 GB limit? What happens if my system doesn't have that much memory? (My QEMU is configured to use just 32 MB of memory, so how can it access 0xFFFFFFF0? :shock: )
On StackOverflow I found this:
If you follow the normal real-mode addressing scheme, the physical address should be CS.Selector*16+IP, or, with the values substituted, 0xFFFF0. However, the CPU actually calculates the address using CS.Base+(E)IP (in the real and 16/32-bit protected mode, but not in virtual 8086 or 64-bit protected mode), hence the first address that the CPU requests from the memory is going to be 0xFFFFFFF0. Your inability to use far jumps to code within the ROM at that high address may be due to the fact that loading into CS will reset CS.Base to 16 * the new value of CS.Selector. So, jumping to, say, 0xF000:0xFFF0 will transfer control to 0xFFFF0 instead of 0xFFFFFFF0 and unless the ROM is also mapped at that low location in the memory and the code in it is suited for running with CS(.Selector)=0xF000, it's not going to run.
And this from Wikipedia:
The reset vector for the 80386 and later x86 processors is physical linear address FFFFFFF0h. The value of the selector portion of the CS register at reset is F000h, the value of the base portion of the CS register is FFFF0000h, and the value of the IP register at reset is FFF0h to form the segmented address FFFFF000h:FFF0h in real mode.
So the firmware inside the ROM is loaded twice, just below the 4 GB limit and at address 0xF000:0xFFF0?
Brendan wrote:The best place to start
Should my code be similar to this?

Code:

[ORG 0xFFF0]
[BITS 16]

    mov ax, 0xF000  ; I don't need these two lines, right?
    mov cs, ax

    ; go to protected mode
    ; optional: initialize RAM
    ; set up devices
        ; DMA
        ; PIC
        ; PS/2 controller
        ; VGA (argh!)

    ; set up the interrupt vector table
    ; LOAD your OS HERE
Really, really, thanks man :) Your help is sincerely appreciated.
If you want to share any other information, I'm here :)

Post by ignus »

The mystery is unveiled:
The value of 0xfffffff0 is slightly less than 4 GB, so unless the machine has 4 GB of physical memory, it cannot point to a valid memory address. The computer's hardware translates this address so that it points to a BIOS memory block.
BIOS stands for Basic Input Output System, and it is a chip on the motherboard that has a relatively small amount of read-only memory (ROM). This memory contains various low-level routines that are specific to the hardware supplied with the motherboard. So, the processor will first jump to the address 0xfffffff0, which really resides in the BIOS's memory. Usually this address contains a jump instruction to the BIOS's POST routines.
So, while I'm writing my firmware, I shouldn't care about this detail: I just have to write my BIOS, it will be loaded at address 0xF000:0xFFF0, and it will be executed from the first instruction. Is that right?

Post by Brendan »

Hi,
ignus wrote:
CPU starts - the code at 0xFFFFFFF0 (or "0xFFFFF000:0xFFF0")
Isn't that just 16 bytes below the 4 GB limit?
Yes, this is why you'd need to start with some sort of JMP (e.g. maybe a near jump to 0x0000 in the same code segment).
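
To make the "near jump to 0x0000 in the same code segment" idea concrete, here is a small Python sketch that encodes such a jump by hand. It assumes the 16-bit near JMP form (opcode E9 followed by a 16-bit relative displacement); the helper name `near_jmp_rel16` is invented for this example, and a real build would normally let the assembler produce these bytes.

```python
def near_jmp_rel16(jmp_ip: int, target_ip: int) -> bytes:
    """Encode a 16-bit near JMP (opcode E9 + rel16) located at jmp_ip."""
    next_ip = (jmp_ip + 3) & 0xFFFF          # the instruction is 3 bytes long
    rel16 = (target_ip - next_ip) & 0xFFFF   # displacement wraps inside the 64 KiB segment
    return bytes([0xE9]) + rel16.to_bytes(2, "little")

stub = near_jmp_rel16(0xFFF0, 0x0000)
print(stub.hex())   # e90d00
```

The three bytes fit comfortably in the 16 bytes between the reset vector and the end of the ROM.
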
ignus wrote:What happens if my system doesn't have that much memory? (My QEMU is configured to use just 32 MB of memory, so how can it access 0xFFFFFFF0? :shock: )
It's ROM and not RAM. I'm not sure about Qemu, but for Bochs the firmware's ROM can be any size from about 64 KiB up to 4 MiB.
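
A quick Python sketch of why the amount of RAM doesn't matter here. It assumes, as described earlier in this thread, that the ROM is mapped so that its last byte sits at 0xFFFFFFFF; the function names are made up for illustration and the sizes are just examples.

```python
FOUR_GIB = 1 << 32

def rom_window(rom_size: int):
    """Return (first, last) physical address of a ROM that ends at 4 GiB."""
    return FOUR_GIB - rom_size, FOUR_GIB - 1

def reset_vector_offset(rom_size: int) -> int:
    """Offset inside the ROM image of the byte fetched from 0xFFFFFFF0."""
    return rom_size - 16

for size in (64 * 1024, 1024 * 1024, 4 * 1024 * 1024):
    first, last = rom_window(size)
    print(f"{size // 1024:5d} KiB ROM: {first:#010x}-{last:#010x}, "
          f"reset vector at image offset {reset_vector_offset(size):#x}")
```

Whatever the ROM size, the reset vector is always the 16th byte from the end of the image.
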
ignus wrote:On StackOverflow I found this:
If you follow the normal real-mode addressing scheme, the physical address should be CS.Selector*16+IP, or, with the values substituted, 0xFFFF0. However, the CPU actually calculates the address using CS.Base+(E)IP (in the real and 16/32-bit protected mode, but not in virtual 8086 or 64-bit protected mode), hence the first address that the CPU requests from the memory is going to be 0xFFFFFFF0. Your inability to use far jumps to code within the ROM at that high address may be due to the fact that loading into CS will reset CS.Base to 16 * the new value of CS.Selector. So, jumping to, say, 0xF000:0xFFF0 will transfer control to 0xFFFF0 instead of 0xFFFFFFF0 and unless the ROM is also mapped at that low location in the memory and the code in it is suited for running with CS(.Selector)=0xF000, it's not going to run.
And this from Wikipedia:
The reset vector for the 80386 and later x86 processors is physical linear address FFFFFFF0h. The value of the selector portion of the CS register at reset is F000h, the value of the base portion of the CS register is FFFF0000h, and the value of the IP register at reset is FFF0h to form the segmented address FFFFF000h:FFF0h in real mode.
It would've been faster to read the relevant section of Intel's manual. ;)

Note that this is why I said you'd want to switch to protected mode (with a sane CS, rather than a slightly "unreal" CS) as soon as possible.
ignus wrote:So the firmware inside the ROM is loaded twice, just below the 4 GB limit and at address 0xF000:0xFFF0?
Erm.

Imagine you create a 64 KiB "legacy BIOS" binary that contains the "run-time" part of the BIOS (real mode code) and none of the initialisation code, that is intended to be in the legacy BIOS area (from 0x000F0000 to 0x000FFFFF); and you call this binary file "runtime.bin". Now imagine that you compress this 64 KiB binary down to 32 KiB of compressed data and call the resulting file "runtime.bin.gz".

Also imagine that you've got a 1 MiB ROM image that contains all of the (mostly 32-bit protected mode) initialisation code; and (using something like NASM's "incbin") this 1 MiB ROM image includes the "runtime.bin.gz" file; plus some code to do decompression.

Now, the computer starts and your ROM is mapped by the chipset into the 1 MiB area from 0xFFF00000 to 0xFFFFFFFF. Your code starts and initialises a bunch of things (including RAM), and then decompresses that "runtime.bin.gz" file that you included in the ROM so that the decompressed code and data (that was "runtime.bin") ends up stored in the 64 KiB area of RAM from 0x000F0000 to 0x000FFFFF. After this you tell the chipset/memory controller to pretend the 64 KiB area of RAM from 0x000F0000 to 0x000FFFFF is "read only", so that it looks like ROM (even though it's not).
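
The data flow described above can be modelled in a few lines of Python. This is only a toy simulation under stated assumptions: the "runtime.bin" contents are placeholder bytes, the compressed payload is simply placed at the start of the ROM image, and the chipset-specific "make it read only" step is not modelled at all.

```python
import gzip

ROM_SIZE = 1024 * 1024        # 1 MiB ROM image, as in the example above
LEGACY_BASE = 0x000F0000      # start of the legacy BIOS area
LEGACY_SIZE = 64 * 1024

# Stand-in for "runtime.bin": 64 KiB of placeholder bytes (not real BIOS code).
runtime_bin = bytes(i % 7 for i in range(LEGACY_SIZE))
runtime_gz = gzip.compress(runtime_bin)      # plays the role of "runtime.bin.gz"

# Build the ROM image: compressed payload first, the rest padded with 0xFF.
rom = bytearray(b"\xFF" * ROM_SIZE)
rom[:len(runtime_gz)] = runtime_gz

# Model the first 1 MiB of RAM, then copy the run-time part into the legacy area.
ram = bytearray(1024 * 1024)
payload = gzip.decompress(bytes(rom[:len(runtime_gz)]))
ram[LEGACY_BASE:LEGACY_BASE + LEGACY_SIZE] = payload

print(len(runtime_gz) < LEGACY_SIZE)                               # True
print(ram[LEGACY_BASE:LEGACY_BASE + LEGACY_SIZE] == runtime_bin)   # True
```

In real firmware the decompression would of course be done by code running from the ROM itself, not by a script.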

Of course this is only needed for a BIOS. If you're doing your own "boot the OS directly from ROM" code (or for UEFI) there's no reason to bother with the legacy BIOS area from 0x000F0000 to 0x000FFFFF and it's easier to leave that area as "usable RAM" and just do everything with the code in ROM in the area from 0xFFF00000 to 0xFFFFFFFF.

Does this make sense?
ignus wrote:Should my code be similar to this?

Code:

[ORG 0xFFF0]
[BITS 16]

    mov ax, 0xF000  ; I don't need these two lines, right?
    mov cs, ax

    ; go to protected mode
    ; optional: initialize RAM
    ; set up devices
        ; DMA
        ; PIC
        ; PS/2 controller
        ; VGA (argh!)

    ; set up the interrupt vector table
    ; LOAD your OS HERE
That's (very roughly) the general idea, yes.


Cheers,

Brendan

Post by ignus »

Brendan wrote:your ROM is mapped by the chipset into the 1 MiB area from 0xFFF00000 to 0xFFFFFFFF
Ehm, that's quite strange (1 MB - 1 = address 0xFFFFF).

There's something I'm missing. Intel's 80386 manual says:
Execution begins with the instruction addressed by the initial contents of the CS and IP registers. To allow the initialization software to be placed in a ROM at the top of the address space, the high 12 bits of addresses issued for the code segment are set, until the first instruction which loads the CS register, such as a far jump or call. As a result, instruction fetching begins from address 0FFFFFFF0H. Because the size of the ROM is unknown, the first instruction is intended to be a jump to the beginning of the initialization software. Only near jumps may be performed within the ROM-based software. After a far jump is executed, addresses issued for the code segment are clear in their high 12 bits.
But how can my ROM be mapped at a 32-bit address if I can only access 20-bit addresses? I'm quite sure that the CS segment is 16 bits; shifted left and added to IP, it makes a 20-bit address.

EDIT: I hope this is true
cs.stackexchange wrote:32-bit CPUs start up in what is colloquially known as unreal mode.
I think it is something like this: when the machine starts up, it starts at 0xFFFF FFF0, which points not to a RAM address but to ROM.
In these last 16 bytes of my ROM firmware, I should jump back to the beginning of my ROM (is the ROM mapped 64 kB below the 4 GB limit?) at address 0xFFFF 0000 (can I jump to a 32-bit address in unreal mode?)
Then, option 1: I run my software from address 0xFFFF 0000 to 0xFFFF FFFF if it fits into 64 kB, or
option 2: I compress my software with something, then decompress it to some address in RAM (still, how can I access the first part of RAM?)

Quite complicated, isn't it?
Combuster
Member
Posts: 9301
Joined: Wed Oct 18, 2006 3:45 am
Libera.chat IRC: [com]buster
Location: On the balcony, where I can actually keep 1½m distance

Post by Combuster »

Are you sure you read everything?
Brendan wrote:The best place to start would be the same place the CPU starts - the code at 0xFFFFFFF0 (or "0xFFFFF000:0xFFF0"). This mostly has to be a JMP instruction to some code that should switch to protected mode (without using any RAM), followed by code to detect and initialise RAM chips (which can be a "do nothing" stub for Qemu or Bochs), followed by code to setup other parts of the chipset (if/where necessary for the specific chipset), followed by code to decompress/copy the "run-time" part into the legacy BIOS area.
However, the CPU actually calculates the address using CS.Base+(E)IP
"Certainly avoid yourself. He is a newbie and might not realize it. You'll hate his code deeply a few years down the road." - Sortie
[ My OS ] [ VDisk/SFS ]

Post by ignus »

Combuster wrote:Are you sure you read everything?
However, the CPU actually calculates the address using CS.Base+(E)IP
But if CS is 16 bits + IP 16 bits (EIP at startup is 0x0000 FFF0, so it points into a 16-bit area), how can it point to 0xFFFF FFF0?
EDIT: is CS.Base something different from CS (the 16-bit code segment)?

Post by Combuster »

Combuster wrote:Are you sure you read everything?
Obviously not, because this is answered where that quote came from.
"Certainly avoid yourself. He is a newbie and might not realize it. You'll hate his code deeply a few years down the road." - Sortie
[ My OS ] [ VDisk/SFS ]

Post by ignus »

Combuster wrote:Obviously not, because this is answered where that quote came from.
Obviously I didn't understand it well, because I usually read other people's advice before trying to understand the 80386 manual.
Maybe you can add something if you have time to waste :)
bluemoon
Member
Posts: 1761
Joined: Wed Dec 01, 2010 3:41 am
Location: Hong Kong

Post by bluemoon »

ignus wrote:
Combuster wrote:Obviously not, because this is answered where that quote came from.
Obviously I didn't understand it well, because I usually read other people's advice before trying to understand the 80386 manual.
Maybe you can add something if you have time to waste :)
I suggest the opposite: read the manual and know it well before you start something serious. It is one of the Required_Knowledge.

Post by Brendan »

Hi,
ignus wrote:
Combuster wrote:Are you sure you read everything?
However, the CPU actually calculates the address using CS.Base+(E)IP
But if CS is 16 bits + IP 16 bits (EIP at startup is 0x0000 FFF0, so it points into a 16-bit area), how can it point to 0xFFFF FFF0?
EDIT: is CS.Base something different from CS (the 16-bit code segment)?
For modern CPUs, regardless of what mode the CPU is operating in, each segment register has 4 parts:
  • The visible value that you can see (e.g. the value that would be loaded into AX by an "mov ax, cs" instruction)
  • A hidden "base address"
  • A hidden "limit"
  • A hidden set of attributes (CPL, read, write, execute, etc)
One of the main differences between protected mode (and long mode) and real mode is how a segment register's hidden parts are affected whenever the visible part is loaded. For protected mode (and long mode), when you load the visible part (e.g. "mov cs, ax") the hidden parts are loaded from data in the GDT or LDT (and the visible part is used as an index into the GDT or LDT). For real mode, when you load the visible part the hidden "base address" part is set to "visible part * 16" and the other hidden parts (limit and attributes) are left unchanged.

Now; when a modern CPU (80486 or later) is started, the CPU sets the visible part of CS to 0xF000, but the hidden "base address" part of this segment register is not set to 0x000F0000 like it would be for a normal load, and is actually set to 0xFFFF0000 instead. IP is set to 0xFFF0; so this means that the first instruction that the CPU executes will be at "CS.baseAddress + IP", which is "0xFFFF0000 + 0x0000FFF0", which is the physical address 0xFFFFFFF0.
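
The visible/hidden split can be sketched as a tiny Python model. This is only an illustration: the class and function names are invented, the hidden limit and attributes are left out, and only the real-mode load rule described above is modelled.

```python
from dataclasses import dataclass

@dataclass
class SegmentRegister:
    visible: int   # the 16-bit value software can read
    base: int      # hidden base address

    def load_real_mode(self, value: int) -> None:
        """Real-mode load: the hidden base becomes visible value * 16."""
        self.visible = value & 0xFFFF
        self.base = self.visible << 4

def fetch_address(cs: SegmentRegister, ip: int) -> int:
    """Address of the next instruction fetch: CS.base + IP."""
    return (cs.base + ip) & 0xFFFFFFFF

cs = SegmentRegister(visible=0xF000, base=0xFFFF0000)   # state at reset, per the text above
print(hex(fetch_address(cs, 0xFFF0)))   # 0xfffffff0

cs.load_real_mode(0xF000)               # what a far jump to 0xF000:... does to CS
print(hex(fetch_address(cs, 0xFFF0)))   # 0xffff0
```

The visible value of CS is 0xF000 in both cases; only the hidden base differs.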

Note that "a modern CPU (80486 or later)" does not include older CPUs (e.g. 80386) and therefore Intel's 80386 Programmer's Reference Manual will not include this information. More recent Intel manuals do describe this oddity appropriately in the relevant section of the "Processor Management and Initialisation" chapter.


Cheers,

Brendan

Post by ignus »

bluemoon wrote:I suggest the opposite, read the manual and know it well before you start something serious
I agree with you.
I just wanted to write a small snippet of code that does something (even a beep would be enough) to see how difficult the goal is, which concepts I still need to learn, and how many files to split the project into.
I know how important the manual is, but it can't be my only reference (it's really too compact and short, so I can't rely on it completely; I've found other readings that are more specific on some topics, e.g. memory). However, thanks for the feedback :)
Brendan wrote:Note that "a modern CPU (80486 or later)" does not include older CPUs (e.g. 80386) and therefore Intel's 80386 Programmer's Reference Manual will not include this information. More recent Intel manuals do describe this oddity appropriately in the relevant section of the "Processor Management and Initialisation" chapter.
Tah-dah! :shock:

PS: Brendan, I really love you. It's a sort of school project (high school thesis), but fortunately I still have enough time (to complete it).
I want you as my teacher, ahah
Octocontrabass
Member
Posts: 5633
Joined: Mon Mar 25, 2013 7:01 pm

Post by Octocontrabass »

Brendan wrote:Note that "a modern CPU (80486 or later)" does not include older CPUs (e.g. 80386) and therefore Intel's 80386 Programmer's Reference Manual will not include this information.
I'm not sure where you heard that the 486 was the first CPU to behave this way, because it's not. Take a look at the Intel 80386 Programmer's Reference Manual, chapter 10, section 10.2.3:
After RESET, address lines A{31-20} are automatically asserted for instruction fetches. This fact, together with the initial values of CS:IP, causes instruction execution to begin at physical address FFFFFFF0H. Near (intrasegment) forms of control transfer instructions may be used to pass control to other addresses in the upper 64K bytes of the address space. The first far (intersegment) JMP or CALL instruction causes A{31-20} to drop low, and the 80386 continues executing instructions in the lower one megabyte of physical memory. This automatic assertion of address lines A{31-20} allows systems designers to use a ROM at the high end of the address space to initialize the system.
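
The behaviour in that quote can be modelled with a small Python sketch. It is a simplification for illustration only (the function name and the boolean flag are invented): the address is computed the ordinary real-mode way, and the top 12 address lines are forced high until the first far jump or call.

```python
HIGH_ADDRESS_LINES = 0xFFF00000   # A31..A20 forced high

def i386_fetch_address(cs: int, ip: int, far_jump_done: bool) -> int:
    """Physical fetch address under the behaviour quoted from section 10.2.3."""
    address = (cs << 4) + ip                   # ordinary real-mode address
    if far_jump_done:
        return address                         # A31-20 have dropped low
    return address | HIGH_ADDRESS_LINES        # A31-20 still asserted

print(hex(i386_fetch_address(0xF000, 0xFFF0, far_jump_done=False)))  # 0xfffffff0
print(hex(i386_fetch_address(0xF000, 0xFFF0, far_jump_done=True)))   # 0xffff0
```

Either description (a hidden CS base of 0xFFFF0000, or forced address lines) gives the same first fetch at 0xFFFFFFF0.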