Machine language questions

OldGuy63
Posts: 8
Joined: Fri Jul 12, 2013 5:23 pm

Machine language questions

Post by OldGuy63 »

As I understand it, there is something called machine language, which exists at a level below assembly and is wholly binary. I have two questions about it.

1. Does machine language exist?
2. Are there any examples of OSs that were/are written entirely in machine language?
Mikemk
Member
Posts: 409
Joined: Sat Oct 22, 2011 12:27 pm

Re: Machine language questions

Post by Mikemk »

Programming is 80% Math, 20% Grammar, and 10% Creativity <--- Do not make fun of my joke!
If you're new, check this out.
iansjack
Member
Posts: 4724
Joined: Sat Mar 31, 2012 3:07 am
Location: Chichester, UK

Re: Machine language questions

Post by iansjack »

1. Yes, machine code does exist. It is a sequence of hexadecimal numbers.
2. No-one actually programs in machine code. Assembly language is essentially just a way of representing machine code as mnemonics that are easier to read and remember.

In the early days of computers, programmers did work directly with machine code, but there is no reason to do so nowadays. You only really need to know it if you are writing an assembler to convert assembly language to machine code.
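
To make the "assembler converts mnemonics to machine code" point concrete, here is a minimal sketch of a toy assembler. It is only an illustration: the function names are made up for this example, and it assumes just three x86 instruction forms (`nop`, `ret`, and `add al, imm8`) rather than anything resembling a real assembler. It needs Python 3.8 or later for `bytes.hex(" ")`.

```python
# Toy assembler: maps a few x86 mnemonics to their machine-code bytes.
# Hypothetical helper names; only three instruction forms are supported.

def assemble_line(line):
    """Encode one line of (very restricted) x86 assembly as bytes."""
    text = line.strip().lower()
    if text == "nop":
        return bytes([0x90])          # opcode 90 = NOP
    if text == "ret":
        return bytes([0xC3])          # opcode C3 = near RET
    if text.startswith("add al,"):
        # Base 0 lets int() accept 5, 0x05 or 0b101.
        imm = int(text.split(",", 1)[1].strip(), 0)
        if not 0 <= imm <= 0xFF:
            raise ValueError("immediate must fit in one byte")
        return bytes([0x04, imm])     # opcode 04 = ADD AL, imm8
    raise ValueError(f"unsupported instruction: {line!r}")


def assemble(source):
    """Assemble a multi-line program, skipping blank lines."""
    lines = [l for l in source.splitlines() if l.strip()]
    return b"".join(assemble_line(l) for l in lines)


program = assemble("add al, 5\nnop\nret")
print(program.hex(" "))  # 04 05 90 c3
```

Each mnemonic stands for a fixed byte pattern, so going from assembly to machine code is mostly a table lookup plus operand encoding.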
Casm
Member
Posts: 221
Joined: Sun Oct 17, 2010 2:21 pm
Location: United Kingdom

Re: Machine language questions

Post by Casm »

Machine code is that which is directly executed by the processor. I only know of one person who writes programs in machine code. Any sane person who needs to program at that low level uses assembly language, which is basically a human-readable version of machine code.

All computer programs, including operating systems, are eventually compiled into machine code so that the processor can execute them, but no operating system has ever been written in machine code. The only exception to that last remark might have been the very first computers, but the programs which ran on them could hardly be graced with the name "operating system", because they only had a few hundred bytes of memory.
dozniak
Member
Posts: 723
Joined: Thu Jul 12, 2012 7:29 am
Location: Tallinn, Estonia

Re: Machine language questions

Post by dozniak »

All programmers of the past were able to enter machine language using 8 switches and 1 button. The art is nearly lost, but there's at least one person on this forum still able to do so for x86.
Learn to read.
DavidCooper
Member
Posts: 1150
Joined: Wed Oct 27, 2010 4:53 pm
Location: Scotland

Re: Machine language questions

Post by DavidCooper »

(1) Machine language on most computers takes the form of instructions which are held in one or more bytes and which may be followed by further bytes which will be involved in the operations triggered by those instructions (e.g. on the PC, 04 05 is an instruction to add the value 5 to whatever value is in the CPU register AL, so it's only the 04 there that's an instruction byte). These numbers held in bytes are not specifically hexadecimal, but can be typed in as binary, hexadecimal, decimal, base 256, base one, or any other base you care to use. It's sometimes best to think of it as binary since that's how the bytes are physically stored, but from the instruction point of view it may be more accurate to think of them as base 256 (on the PC at least) due to the arbitrary manner in which the numbers are assigned to the functionality. It's a lot more complex than that, though, as can be seen most clearly if you look at the way instructions are constructed on ARM and Itanium processors.

In all these cases, there will be a part of the byte or group of bytes which is checked first and which determines how the rest of the bits are to be interpreted by the processor, and there will often be a series of further checks where other bits are tested next to narrow down further how the remaining bits are to be interpreted. None of these systems is perfectly neat and the assignments of bits are highly arbitrary by design, although wherever possible, groups of bits will have functionality which ties in neatly with the values they hold. This can be seen in the oorrrmmm bytes on the PC, which often occur as the second byte of an instruction: the rrr and mmm fields hold values from 0 to 7 which map neatly onto different CPU registers.

(2) A PC OS written in machine code can be found at the link in my sig., though it's fairly primitive. You can run it in Bochs (or on a real machine if you're brave). [A much more advanced version exists (booting from flash drive and with a proper GUI) but it isn't ready for release just yet because it may fry a monitor if someone were to press the wrong key at a bad time. It also needs some work to make it compatible with an emulator again: it half works on QEMU and half works on Bochs, so it is currently useful on neither, but that will be fixed when time allows.]
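
As a rough illustration of point (1), the sketch below decodes the `04 05` example and splits an oorrrmmm byte into its three fields. It is a minimal sketch under stated assumptions: 32-bit operand size, only two opcodes (`04` and `01`), and only the register-to-register form (oo = 11) of the second one. The names `split_modrm` and `decode` are hypothetical helpers invented for this example.

```python
# Register numbering used by the rrr and mmm fields (32-bit registers).
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def split_modrm(byte):
    """Split an oorrrmmm byte into its (oo, rrr, mmm) bit fields."""
    oo = (byte >> 6) & 0b11      # top two bits: addressing mode
    rrr = (byte >> 3) & 0b111    # middle three bits: register operand
    mmm = byte & 0b111           # low three bits: register or memory operand
    return oo, rrr, mmm


def decode(code):
    """Decode one instruction from a tiny, assumed subset of x86."""
    opcode = code[0]
    if opcode == 0x04:           # ADD AL, imm8: the next byte is data
        return f"add al, {code[1]:#04x}"
    if opcode == 0x01:           # ADD r/m32, r32: the next byte is oorrrmmm
        oo, rrr, mmm = split_modrm(code[1])
        if oo != 0b11:
            raise NotImplementedError("only the register form is handled")
        return f"add {REG32[mmm]}, {REG32[rrr]}"
    raise NotImplementedError(f"opcode {opcode:#04x} not in this sketch")


print(decode(bytes([0x04, 0x05])))     # add al, 0x05
print(split_modrm(0xD8))               # (3, 3, 0)
print(decode(bytes([0x01, 0xD8])))     # add eax, ebx
```

The first byte is checked first and decides whether the second byte is plain data (as with `04`) or an oorrrmmm byte that needs further splitting (as with `01`).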
Help the people of Laos by liking - https://www.facebook.com/TheSBInitiative/?ref=py_c

MSB-OS: http://www.magicschoolbook.com/computing/os-project - direct machine code programming
NickJohnson
Member
Posts: 1249
Joined: Tue Mar 24, 2009 8:11 pm
Location: Sunnyvale, California

Re: Machine language questions

Post by NickJohnson »

Understanding machine code can be useful in some circumstances. In many cases, code injected into a buffer overflow must not contain zero bytes; if you know some tricks involving machine code representations, you can get rid of them. Similarly, many pieces of malware use code obfuscation techniques that involve injecting garbage into machine code (or, in the case of the x86, sometimes jumping into the middle of instructions).
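
One well-known instance of the zero-byte trick: on 32-bit x86, `mov eax, 0` and `xor eax, eax` both leave 0 in EAX (the `xor` form also changes the flags), but only the first contains zero bytes. The sketch below simply compares the two encodings; `has_zero_byte` is a hypothetical helper named for this example.

```python
# Two ways to zero EAX on 32-bit x86, written as raw machine code.
MOV_EAX_0 = bytes.fromhex("b800000000")   # mov eax, 0   (B8 + 32-bit immediate)
XOR_EAX_EAX = bytes.fromhex("31c0")       # xor eax, eax (31 + oorrrmmm byte C0)


def has_zero_byte(code):
    """Return True if the byte string contains a 0x00 byte."""
    return 0 in code


for name, code in [("mov eax, 0", MOV_EAX_0), ("xor eax, eax", XOR_EAX_EAX)]:
    print(f"{name:14} {code.hex():12} zero bytes: {has_zero_byte(code)}")
```

A string-copying bug that stops at the first zero byte would truncate the `mov` version but copy the `xor` version intact.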

To prove my point about obfuscation, take a look at this perfectly valid cdecl function, and see if you can figure out what parameter will make it return 0:

mov edx, [esp+0x4]
call dword 0x9
pop eax
add eax, 0xc180c031
add eax, 0xe18089b0
add eax, 0xf180b5b4
add eax, 0xb110e0c1
add eax, 0xe98069b0
add eax, 0xe1c0bfb4
add eax, 0xb1c2af0f
add eax, 0xc329e883
add eax, 0x7a3f5eb6
jmp eax
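
Spoiler for the puzzle above. The following is a sketch of one way to analyse the listing, not something stated in the post itself. It assumes 32-bit code assembled exactly as shown (`mov edx, [esp+0x4]` as 4 bytes, `call` as 5 bytes, each `add eax, imm32` as opcode `05` plus a little-endian immediate). The hidden instruction stream, the multiplier `0xb589bf69`, and the subtracted constant `0x29` come from decoding the bytes by hand, so treat them as an assumption to check against a disassembler. Python 3.8 or later is needed for `pow(a, -1, m)`.

```python
import struct

MASK = 0xFFFFFFFF
IMMEDIATES = [
    0xc180c031, 0xe18089b0, 0xf180b5b4, 0xb110e0c1, 0xe98069b0,
    0xe1c0bfb4, 0xb1c2af0f, 0xc329e883, 0x7a3f5eb6,
]

# Rebuild the byte image of the function as listed.
code = bytes.fromhex("8b542404")        # mov edx, [esp+0x4]
code += bytes.fromhex("e800000000")     # call to the very next instruction
code += bytes.fromhex("58")             # pop eax (at offset 9)
for imm in IMMEDIATES:
    code += b"\x05" + struct.pack("<I", imm)   # add eax, imm32
code += bytes.fromhex("ffe0")           # jmp eax

# After "pop eax", EAX holds the address of offset 9. The adds only matter
# through their sum modulo 2**32, which decides where "jmp eax" lands.
POP_EAX_OFFSET = 9
total = sum(IMMEDIATES) & MASK
target = POP_EAX_OFFSET + total
print(hex(total), hex(target))          # 0x2 0xb

# Offset 0xb is the first immediate, i.e. the middle of the first add.
print(code[target:target + 2].hex())    # 31c0, which decodes as xor eax, eax

# Hand-decoded effect of the hidden stream (assumption):
#   eax = 0xb589bf69 * parameter - 0x29   (mod 2**32)
MULTIPLIER = 0xb589bf69
answer = (0x29 * pow(MULTIPLIER, -1, 1 << 32)) & MASK
print(hex(answer))
```

The immediates of the visible `add` instructions double as the real code, and the final immediate exists only to make the sum come out to 2.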
Mikemk
Member
Posts: 409
Joined: Sat Oct 22, 2011 12:27 pm

Re: Machine language questions

Post by Mikemk »

Often, to prevent reverse engineering, companies will add one or two random bytes to the beginning of functions, and adjust the references to call past those. This causes disassemblies to be wrong.
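
A small sketch of why a junk byte can throw a disassembler off. This is a toy linear-sweep decoder, not a real disassembler: the opcode table is a hand-picked assumption covering five 32-bit x86 opcodes, with `31` assumed to be in its two-byte register form, and all names are made up for the example.

```python
# opcode -> (description, total instruction length in bytes)
TABLE = {
    0x90: ("nop", 1),
    0xC3: ("ret", 1),
    0x40: ("inc eax", 1),
    0x31: ("xor r/m32, r32", 2),    # register form only (assumption)
    0xB8: ("mov eax, imm32", 5),
}


def linear_sweep(code, start=0):
    """Decode instructions one after another, starting at offset `start`."""
    listing = []
    pos = start
    while pos < len(code):
        name, length = TABLE.get(code[pos], ("(unknown)", 1))
        listing.append((pos, name))
        pos += length
    return listing


function = bytes.fromhex("31c040c3")    # xor eax, eax; inc eax; ret
padded = b"\xb8" + function             # one junk byte in front

print(linear_sweep(padded))             # starts at the junk byte
print(linear_sweep(padded, start=1))    # starts where callers actually jump
```

Starting at the junk byte, the sweep reads `B8` as the start of a five-byte `mov eax, imm32` and swallows the whole function as its immediate. Starting one byte later, where the adjusted calls land, recovers the real three instructions.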