Writing an Assembler - How?

mioline
Posts: 3
Joined: Sun Mar 09, 2008 7:26 am

Writing an Assembler - How?

Post by mioline »

Hi there!

This is my first post here, so if I ask something lame... Do whatever you want with me. :D

I want to write an Assembler for the i8086 CPU, the parsing is OK, but I don't understand, how the machine-code instructions and a plain (for example a *.COM) binary file seem. I would appreciate if somebody could help me.

Thanks for the answers in advance!

mioline
DavidCooper
Member
Posts: 1150
Joined: Wed Oct 27, 2010 4:53 pm
Location: Scotland

Re: Writing an Assembler - How?

Post by DavidCooper »

mioline wrote:I want to write an Assembler for the i8086 CPU, the parsing is OK, but I don't understand, how the machine-code instructions and a plain (for example a *.COM) binary file seem.
You don't understand how they seem? You need to state that more clearly so that people have a chance of working out what you mean. A .com file (http://en.wikipedia.org/wiki/COM_file) has no header, so all you have to do is fill it with binary code; DOS loads the first byte at offset 100h within a 64K segment, just after the 256-byte Program Segment Prefix. If you're capable of writing an assembler, I can't imagine what you're now having difficulty with, so you need to spell out the problem clearly.
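To make the "no header" point concrete, here is a minimal sketch in Python (not the language of any assembler discussed in this thread) that builds a complete .COM image by hand. The five bytes are the standard 8086 encodings of mov ax, 4C00h and int 21h; the names ORG, address_of and write_com, and any file name passed to write_com, are just illustrative assumptions.

```python
# Hand-assembled 8086 machine code for a program that exits immediately.
ORG = 0x100  # DOS loads a .COM image at this offset, right after the 256-byte PSP

code = bytes([
    0xB8, 0x00, 0x4C,  # mov ax, 0x4C00  (AH=0x4C: terminate, AL=0: exit code)
    0xCD, 0x21,        # int 0x21        (DOS service call)
])


def address_of(offset_in_file: int) -> int:
    """Offset within the segment at which a file byte ends up once loaded."""
    return ORG + offset_in_file


def write_com(path: str, image: bytes) -> None:
    # No header and no relocation table: the file is just the raw bytes.
    with open(path, "wb") as f:
        f.write(image)
```

Calling write_com("exit.com", code) would produce a five-byte file. The assembler only has to remember ORG when it computes the addresses of labels, because a label at file offset n lives at address_of(n) at run time.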
Help the people of Laos by liking - https://www.facebook.com/TheSBInitiative/?ref=py_c

MSB-OS: http://www.magicschoolbook.com/computing/os-project - direct machine code programming
mioline
Posts: 3
Joined: Sun Mar 09, 2008 7:26 am

Re: Writing an Assembler - How?

Post by mioline »

DavidCooper wrote:You don't understand how they seem? You need to state that more clearly so that people have a chance of working out what you mean. A .com file (http://en.wikipedia.org/wiki/COM_file) has no header, so all you have to do is fill it with binary code; DOS loads the first byte at offset 100h within a 64K segment, just after the 256-byte Program Segment Prefix. If you're capable of writing an assembler, I can't imagine what you're now having difficulty with, so you need to spell out the problem clearly.
OK, I was wrong, sorry... I should have leave the "COM-part" out from my previous message... So, the main question is: could you tell me, how do the structure of the i8086 instruction codes seem?
Combuster
Member
Posts: 9301
Joined: Wed Oct 18, 2006 3:45 am
Libera.chat IRC: [com]buster
Location: On the balcony, where I can actually keep 1½m distance

Re: Writing an Assembler - How?

Post by Combuster »

Tried the processor manual? The current one shows exactly how instructions are encoded. The only thing you need beyond that is a list of instructions valid on the 8086/8088 since it will be far less than the current spectrum of instructions.
"Certainly avoid yourself. He is a newbie and might not realize it. You'll hate his code deeply a few years down the road." - Sortie
[ My OS ] [ VDisk/SFS ]
DavidCooper
Member
Posts: 1150
Joined: Wed Oct 27, 2010 4:53 pm
Location: Scotland

Re: Writing an Assembler - How?

Post by DavidCooper »

mioline wrote:OK, I was wrong, sorry... I should have leave the "COM-part" out from my previous message...
No, you weren't wrong and have nothing to apologise for. The problem is that you don't seem to know what the word "seem" means, so it's very hard to make sense of your question. If I try to answer it literally, this happens:-
mioline wrote:So, the main question is: could you tell me, how do the structure of the i8086 instruction codes seem?
They seem very attractive to me. I love the way they've been organised into gorgeous groups of similar numbers with similar functions.

Are you trying to ask for a map of them showing how they are arranged?

Edit: I've had a look round, and it's hard to find anything that sets it out in a friendly way. This site isn't too bad as an introduction: http://courses.engr.illinois.edu/ece390 ... codes.html - I started out with this document over a decade ago and built up my own map of the instructions from it. Another version of it can be found here (http://www.scribd.com/doc/67624438/8086-OPCODE), but with the addition of an incomplete map at the top. If you need more detail than that document provides, that's when you should turn to the Intel processor manuals.
Last edited by DavidCooper on Fri Nov 18, 2011 5:42 pm, edited 1 time in total.
mioline
Posts: 3
Joined: Sun Mar 09, 2008 7:26 am

Re: Writing an Assembler - How?

Post by mioline »

DavidCooper wrote:
So, the main question is: could you tell me, how do the structure of the i8086 instruction codes seem?
They seem very attractive to me. I love the way they've been organised into gorgeous groups of similar numbers with similar functions.

Are you trying to ask for a map of them showing how they are arranged?
Oh my God! I should learn more English grammar. (And maybe semantics?) :D

So, I used the google and I found a lot of x86 code tables and I saw the official manual too, but I don't understand how the machine code instructions are built (for example: I can't understand what the SIB/Mod/RM byte is).
gerryg400
Member
Posts: 1801
Joined: Thu Mar 25, 2010 11:26 pm
Location: Melbourne, Australia

Re: Writing an Assembler - How?

Post by gerryg400 »

The Intel manuals have it all.
If a trainstation is where trains stop, what is a workstation ?
ACcurrent
Member
Posts: 125
Joined: Thu Aug 11, 2011 12:04 am
Location: Watching You

Re: Writing an Assembler - How?

Post by ACcurrent »

It might be a good idea for you to start small and implement two or three instructions first. The Intel manuals can be difficult to read if English is not your first language.
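As a rough idea of what "two or three instructions" could look like, here is a small Python sketch covering only nop, ret, int imm8 and mov r16, imm16. The opcodes are the standard 8086 ones; the function names and the choice of Python-style 0x literals for numbers are assumptions made for this example, not anything from the original poster's parser.

```python
# Register numbers as used in 8086 opcodes (AX=0 ... DI=7).
REG16 = {"ax": 0, "cx": 1, "dx": 2, "bx": 3, "sp": 4, "bp": 5, "si": 6, "di": 7}


def assemble_line(line: str) -> bytes:
    """Assemble one line of a tiny subset: nop, ret, int imm8, mov r16, imm16."""
    parts = line.replace(",", " ").split()
    if not parts:
        return b""  # blank line
    mnemonic, operands = parts[0].lower(), [p.lower() for p in parts[1:]]

    if mnemonic == "nop" and not operands:
        return bytes([0x90])
    if mnemonic == "ret" and not operands:
        return bytes([0xC3])  # near return
    if mnemonic == "int" and len(operands) == 1:
        return bytes([0xCD, int(operands[0], 0) & 0xFF])
    if mnemonic == "mov" and len(operands) == 2 and operands[0] in REG16:
        # MOV r16, imm16 is 0xB8 + register number, then the value low byte first.
        # A non-numeric source (e.g. another register) makes int() raise ValueError.
        imm = int(operands[1], 0) & 0xFFFF
        return bytes([0xB8 + REG16[operands[0]]]) + imm.to_bytes(2, "little")
    raise ValueError(f"unsupported instruction: {line!r}")


def assemble(source: str) -> bytes:
    """Assemble a multi-line source string into one flat byte string."""
    return b"".join(assemble_line(line) for line in source.splitlines())
```

None of these four forms needs a mod/rm byte, which is why they make a comfortable first step before tackling register-to-memory instructions.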
Get back to work!
Github
DavidCooper
Member
Posts: 1150
Joined: Wed Oct 27, 2010 4:53 pm
Location: Scotland

Re: Writing an Assembler - How?

Post by DavidCooper »

mioline wrote:So, I used the google and I found a lot of x86 code tables and I saw the official manual too, but I don't understand how the machine code instructions are built (for example: I can't understand what the SIB/Mod/RM byte is).
I added a bit to the end of my previous post pointing towards a particular site which probably gives you the easiest way in: http://courses.engr.illinois.edu/ece390 ... codes.html. Even so, it takes a lot of re-reading to understand it all. If you're using the 32-bit addressing mode, most oorrrmmm bytes (oo = mod, rrr = reg, mmm = r/m) use the mmm part to select the register which is pointing to the memory to be used. When the mmm part is set to 100 and the oo part is less than 11, though, an extra instruction byte has to follow to specify which register is used to point at memory, and there is room left over in that byte for another register to be used as a scaled index, as well as a scale factor. The top two bits are the scale factor (00=x1, 01=x2, 10=x4, 11=x8), the next three bits are the register containing the index to be multiplied by 1/2/4/8, and the lowest three bits indicate the register which would have been selected by the mmm part of oorrrmmm if it hadn't been set to 100 instead.

Eg: mov EAX,[EBX+2*ECX] becomes the three bytes 139, 4, 75 (decimal values - sorry, but that's the way I write machine code). The 75 in binary is 01,001,011, so the scale factor is 2 (the 01 part), the index to be scaled by that is ECX (the 001), and the result has to be added to the value in EBX (the 011) to get the required memory address.

[Edit: crucial typing error corrected - I'd typed EBX instead of ECX at the top.]

And SIB stands for Scale-Index-Base, after the three fields packed into that extra byte.
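The three bytes in the example above can be reproduced with a few shifts and ORs. This is only a sketch in Python: the helper names are made up for illustration, and it handles just the simplest case (mod = 00, no displacement), rejecting the two register choices that have special meanings in that case.

```python
# 32-bit register numbers as they appear in the reg, index and base fields.
REG32 = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3, "esp": 4, "ebp": 5, "esi": 6, "edi": 7}
SCALE_BITS = {1: 0b00, 2: 0b01, 4: 0b10, 8: 0b11}


def modrm(mod: int, reg: int, rm: int) -> int:
    """Pack the oo rrr mmm fields into one byte."""
    return (mod << 6) | (reg << 3) | rm


def sib(scale: int, index: str, base: str) -> int:
    """Pack scale factor, index register and base register into one byte."""
    return (SCALE_BITS[scale] << 6) | (REG32[index] << 3) | REG32[base]


def mov_reg_from_base_index(dest: str, base: str, index: str, scale: int) -> bytes:
    """Encode mov dest, [base + scale*index] using 32-bit addressing."""
    if index == "esp":
        raise ValueError("index field 100 means 'no index', so ESP cannot be scaled")
    if base == "ebp":
        raise ValueError("with mod=00, base field 101 means 'disp32, no base'")
    # 0x8B is MOV r32, r/m32; mod=00 (no displacement), mmm=100 (SIB byte follows).
    return bytes([0x8B, modrm(0b00, REG32[dest], 0b100), sib(scale, index, base)])


encoded = mov_reg_from_base_index("eax", "ebx", "ecx", 2)
print(list(encoded))  # [139, 4, 75]
```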
neon
Member
Posts: 1567
Joined: Sun Feb 18, 2007 7:28 pm

Re: Writing an Assembler - How?

Post by neon »

This is what we found useful with ours:

http://www.sandpile.org/
http://ref.x86asm.net/

Recognize the patterns in the ModRM and SIB tables and you will see that they really aren't that complex. The ModRM byte encodes the operands and the addressing mode. When a SIB byte follows, the ModRM addressing mode is [sib+displacement], which lets you combine a SIB addressing scheme with a displacement, if any. SIB only exists with 32-bit addressing (386 and later), however, so a pure 8086 assembler never emits it.
OS Development Series | Wiki | os | ncc
char c[2]={"\x90\xC3"};int main(){void(*f)()=(void(__cdecl*)(void))(void*)&c;f();}
azblue
Member
Posts: 147
Joined: Sat Feb 27, 2010 8:55 pm

Re: Writing an Assembler - How?

Post by azblue »

mioline wrote: So, I used the google and I found a lot of x86 code tables and I saw the official manual too, but I don't understand how the machine code instructions are built (for example: I can't understand what the SIB/Mod/RM byte is).
When the processor encounters a command like "ADD", it needs to know "add what?". That's what the mod/rm byte is; it specifies what data the preceding command works on.

The SIB byte does the same thing but if you're just writing for the 8086 you can ignore it, as that's only 386+.
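For the 8086 itself, the mod/rm byte uses the 16-bit addressing table, where the low three bits pick one of eight fixed base/index combinations. The following Python sketch shows one way to build it; the function names and the "bx+si" style keys are assumptions made for this example, and direct addressing such as [1234h] is left out.

```python
REG16 = {"ax": 0, "cx": 1, "dx": 2, "bx": 3, "sp": 4, "bp": 5, "si": 6, "di": 7}
# The r/m field for memory operands with 16-bit addressing.
RM16 = {"bx+si": 0, "bx+di": 1, "bp+si": 2, "bp+di": 3, "si": 4, "di": 5, "bp": 6, "bx": 7}


def modrm_reg_reg(reg: str, rm_reg: str) -> int:
    """mod=11: both operands are registers."""
    return 0xC0 | (REG16[reg] << 3) | REG16[rm_reg]


def modrm_reg_mem(reg: str, base: str, disp: int = 0) -> bytes:
    """mod/rm byte plus displacement bytes for a [base + disp] memory operand."""
    rm = RM16[base]
    if disp == 0 and base != "bp":
        # mod=00 with r/m=110 means a bare 16-bit address, so [bp] needs a disp8 of 0.
        mod, tail = 0b00, b""
    elif -128 <= disp <= 127:
        mod, tail = 0b01, disp.to_bytes(1, "little", signed=True)
    else:
        mod, tail = 0b10, (disp & 0xFFFF).to_bytes(2, "little")
    return bytes([(mod << 6) | (REG16[reg] << 3) | rm]) + tail


# add ax, bx   -> opcode 0x01 (ADD r/m16, r16): reg = source BX, r/m = destination AX
add_ax_bx = bytes([0x01, modrm_reg_reg("bx", "ax")])
# mov cx, [bp+4] -> opcode 0x8B (MOV r16, r/m16)
mov_cx_bp4 = bytes([0x8B]) + modrm_reg_mem("cx", "bp", 4)
print(add_ax_bx.hex(), mov_cx_bp4.hex())  # 01d8 8b4e04
```

Note how the opcode decides which operand goes in the reg field and which in the r/m field; the mod/rm byte on its own does not say which one is the destination.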

You may want to try using an existing assembler and viewing the output from a few simple commands with a hex editor. Look at how it's making the mod/rm byte in relation to the commands used and compare it to the information on the mod/rm byte. Doing it like that kinda helps it all make sense.
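If no hex editor is at hand, a few lines of Python can do the same job. This is only a sketch with made-up helper names: hexdump prints the bytes of an assembled file, and modrm_fields splits a byte into its mod, reg and r/m bit groups so that it can be compared against the tables.

```python
def hexdump(data: bytes, width: int = 16) -> str:
    """Format bytes as offset-prefixed rows of two-digit hex values."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        lines.append(f"{offset:04x}: " + " ".join(f"{b:02x}" for b in chunk))
    return "\n".join(lines)


def modrm_fields(byte: int) -> str:
    """Show a mod/rm byte as its three bit fields: mod, reg, r/m."""
    return f"{byte >> 6:02b} {(byte >> 3) & 7:03b} {byte & 7:03b}"


# The two bytes an assembler emits for "add ax, bx" in 16-bit code.
sample = bytes([0x01, 0xD8])
print(hexdump(sample))          # 0000: 01 d8
print(modrm_fields(sample[1]))  # 11 011 000
```

With NASM, for instance, nasm -f bin test.asm -o test.bin produces a flat binary whose contents can be read with open("test.bin", "rb").read() and passed to hexdump (the file names here are placeholders).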
Solar
Member
Posts: 7615
Joined: Thu Nov 16, 2006 12:01 pm
Location: Germany

Re: Writing an Assembler - How?

Post by Solar »

I recommend The Art of Assembly, Chapter 4.7, The 80x86 MOV Instruction.

Actually, I recommend the whole book, as its introductory chapters go into great detail about the underlying logic and how the x86 opcodes came to be.
Every good solution is obvious once you've found it.
bitshifter
Member
Posts: 50
Joined: Sun Sep 20, 2009 4:03 pm

Re: Writing an Assembler - How?

Post by bitshifter »

I put my reply in this attachment due to formatting problems...
Attachment: encode.txt (1.2 KiB)
Love4Boobies
Member
Posts: 2111
Joined: Fri Mar 07, 2008 5:36 pm
Location: Bucharest, Romania

Re: Writing an Assembler - How?

Post by Love4Boobies »

bitshifter wrote:I put my reply in this attachment due to formatting problems...
You should have used the [code] tag.

(This thread was too long for me to read right now so that's all I have to say :-))
"Computers in the future may weigh no more than 1.5 tons.", Popular Mechanics (1949)
[ Project UDI ]