Direct memory addressing VS. variable addressing in C.
I have been working on a way to pack my driver data into memory areas I call apartments and buildings. There is one building per device type, but there can be multiple apartments within a building (multiple devices of the same type). Each driver 'type' allocates the amount of memory that I have calculated it needs, and if multiple devices of the same 'type' are found, there are then two apartments within one building, so to speak.
I have also specified the location of the variables within each apartment, aligned to the amount of space allocated per apartment.
Say each device requires 16k of memory: the second device would then start initializing its variables at offset 16k and allocate another 16k for its needs.
My main question is: with this kind of variable assignment that must be aligned and precise, should I just access that memory via something like *(unsigned char*)0x12345678, or would it be just as efficient to declare a variable, then re-assign its memory position to suit the memory map?
I personally think cutting out the variables would cut out some overhead.
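The layout described above can be sketched in C roughly as follows. This is only an illustration under assumptions: the 16 KiB apartment size comes from the example in the post, while the `struct apartment` fields and the helper names `apartment_at` and `raw_status_address` are hypothetical. An ordinary array stands in for the real building so the sketch can run in a hosted program.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 16 KiB per apartment, as in the example in the post. */
#define APARTMENT_SIZE ((size_t)16 * 1024)

/* Hypothetical per-device variables; a real driver defines its own layout. */
struct apartment {
    uint8_t  status;
    uint8_t  irq;
    uint16_t io_base;
    uint8_t  scratch[APARTMENT_SIZE - 4]; /* pads the struct to exactly 16 KiB */
};

_Static_assert(sizeof(struct apartment) == APARTMENT_SIZE,
               "one apartment must occupy exactly 16 KiB");

/* Apartment number `index` inside the building starting at `building_base`. */
static struct apartment *apartment_at(void *building_base, size_t index)
{
    return (struct apartment *)((uint8_t *)building_base + index * APARTMENT_SIZE);
}

/* The same location computed by hand, as a raw-address access would do it. */
static uint8_t *raw_status_address(void *building_base, size_t index)
{
    return (uint8_t *)building_base + index * APARTMENT_SIZE
           + offsetof(struct apartment, status);
}

/* Small self-check: both ways of addressing reach the same byte. */
static void apartment_demo(void)
{
    static struct apartment building[2]; /* stand-in for the real memory area */

    apartment_at(building, 1)->status = 0x5A;
    assert(*raw_status_address(building, 1) == 0x5A);
}
```

In a kernel, `building_base` would be the fixed address from the memory map rather than an array. Either way the address arithmetic is the same, so the struct form mainly buys readability.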
Website: https://joscor.com
Barring optimization and other goofy things I can't address without seeing all the code:
*(unsigned char*)0xABCDEF = 12345;
will result in the same general machine instructions as:
unsigned char* address = 0xABCDEF;
*address = 12345;
You may save yourself some headaches if you stick things in variables. A compiler, even with optimizations turned off, will handle both of these in very similar (or identical) ways.
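A minimal sketch of the two forms side by side, for anyone who wants to compare the compiler output themselves (for example with `gcc -S`). The names `fake_memory`, `TARGET_ADDRESS`, `store_direct` and `store_via_pointer` are made up for illustration, and a small array stands in for the fixed address so the code can run outside a kernel. Note that a value like 12345 does not fit in an `unsigned char`; it would be truncated to its low byte (0x39), so the sketch uses byte-sized values.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for a fixed address such as 0xABCDEF (assumption for a hosted demo). */
static unsigned char fake_memory[16];
#define TARGET_ADDRESS ((uintptr_t)&fake_memory[4])

/* Form 1: cast the address and dereference it in one expression. */
static void store_direct(unsigned char value)
{
    *(volatile unsigned char *)TARGET_ADDRESS = value;
}

/* Form 2: keep the address in a pointer variable first. */
static void store_via_pointer(unsigned char value)
{
    volatile unsigned char *address = (volatile unsigned char *)TARGET_ADDRESS;
    *address = value;
}

/* Small self-check: both forms write the same byte. */
static void store_demo(void)
{
    store_direct(0x12);
    assert(fake_memory[4] == 0x12);
    store_via_pointer(0x34);
    assert(fake_memory[4] == 0x34);
}
```

The `volatile` qualifier is an addition here, not part of the original snippets; it is the usual way to stop the compiler from dropping or merging stores to device memory.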
I hope I got the gist of the problem..
Edit: kind of a silly mistake:
mov [0xABCDEF], %rax
vs.
lea %rax, [address wrt %rip]
mov [%rax], 12345
The first line would be possible (and 1 cycle faster I believe), but fixing the address of everything in your kernel seems like a bad idea. You'd end up with the second example if you want to support a variable number of devices (like an array), and I assume you do.
Neptune - 64 bit microkernel OS in D - www.devlime.com
speal wrote:
mov [0xABCDEF], %rax
vs.
lea %rax, [address wrt %rip]
mov [%rax], 12345
In LongMode:
Code: Select all
mov [imm], r0-r15 ;7 bytes, write to mem on static addr
;you'll probably need to declare an additional variable:
;15 bytes, 11 bytes for dword
var dq 0
mov [var], rax
;7bytes also
lea rax, [rcx+127]
mov [rax], rax
;6 bytes
lea eax, [rcx+127] ;if rcx replaced with r8-r15 then +1 byte
mov [rax], rax
Using r8-r15 in "mov [rax], reg64" instead of rax will add 1 more byte,
so lea+mov takes 11 bytes in the worst-case scenario.
I think doing such optimization in C is pointless.
Also, this may be a very basic ASM question, but when performing a mov such as:
Code: Select all
mov byte [0x00500000], byte 0xFF;
no registers are altered, correct?
Wouldn't that be another advantage over the other method using variables or registers to store the value?
Quote: I personally think, cutting out the variables would cut out some overhead.
Let the compiler deal with it. Internally the compiler creates temporary variables all over the shop, for example:
Code: Select all
a = *(unsigned int*)0x1000;
becomes:
Code: Select all
unsigned int *tmp = 0x1000;
a = *tmp;
and:
Code: Select all
unsigned int a = b + c - (d+e);
becomes:
Code: Select all
unsigned int a = b + c;
unsigned int tmp = d + e;
a = a - tmp;
Once that form has been created, assembly code is created and optimisations take place - where to store each temporary - register? stack? can two instructions be merged because of complex addressing modes? etc.
So really the variable declarations in your C code have absolutely no correlation with what the compiler outputs (on anything over -O0).
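As a rough illustration of that point, the two functions below spell out the same computation once as a single expression and once with explicit temporaries. The function names are made up for this sketch. At the C level they must return the same value, and with optimisation enabled a compiler will typically emit the same code for both, though that is worth checking in the generated assembly rather than taking on faith.

```c
#include <assert.h>

/* The expression as the programmer wrote it. */
static unsigned int compute_direct(unsigned int b, unsigned int c,
                                   unsigned int d, unsigned int e)
{
    return b + c - (d + e);
}

/* The same computation split into the temporaries a compiler might create. */
static unsigned int compute_with_temporaries(unsigned int b, unsigned int c,
                                             unsigned int d, unsigned int e)
{
    unsigned int a = b + c;
    unsigned int tmp = d + e;
    a = a - tmp;
    return a;
}

/* Small self-check, including a case where unsigned arithmetic wraps around. */
static void temporaries_demo(void)
{
    assert(compute_direct(10, 5, 3, 4) == 8);
    assert(compute_with_temporaries(10, 5, 3, 4) == 8);
    assert(compute_direct(1, 2, 3, 4) == compute_with_temporaries(1, 2, 3, 4));
}
```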
Quote: Also, this may be a very basic ASM question, but when performing a mov such as:
Code: Select all
mov byte [0x00500000], byte 0xFF;
no registers are altered correct? wouldn't that be another advantage over the other method using variables or registers to store the value?
Whether the code you give is better or worse than two instructions that achieve the same result depends wholly on the processor in question.
On the one hand, the entire operation is achieved in one instruction. Which is good.
On the other hand, that instruction might be heavily microcoded, which slows things down (remember that not all instructions in CISC architectures are as heavily optimised - ones that compilers use get optimised more).
The instruction you give (a store immediate) is so common that I would personally consider it more efficient than a register move followed by a register store. However, also bear in mind that a store immediate instruction takes up more space than a register store, so if you're using the same constant over and over I would recommend storing it temporarily somewhere.
Yes, basically you are turning variables into #define or EQU statements. It is certainly best to let the compiler/linker deal with the details. This only works in virtual memory, of course, and doesn't work well at all in physical mem. But when you put things at known memory locations, it DOES create more opportunities for tightening up the assembler code. Heck, the entire *concept* of assembler SIB byte addressing [base + offset + index*size] assumes that either "base" or "offset" is a known *fixed constant* memory address. Without known fixed constant memory addresses, that entire CPU feature becomes much less useful, and much less of an enhancement to your code.
JamesM wrote: However also bear in mind that a store immediate instruction takes up more space than a register store, so if you're using the same constant over again I would recommend storing it temporarily somewhere.
In ProtectedMode x86: savings start when you "mov" the same constant (a byte, like 01000101 wants) 6 or more times.
'mov cl, 3' is not considered because of its slight hit on performance in most cases.
Same goes for LongMode x86-64 if r8-r15 are not used.
;29bytes
mov ecx, 3
mov [0723872h], cl
mov [0723872h], cl
mov [0723872h], cl
mov [0723872h], cl
;28 bytes
mov byte [0723872h], 3
mov byte [0723872h], 3
mov byte [0723872h], 3
mov byte [0723872h], 3
However x86 is optimized for eax reg:
;25 bytes in ProtectedMode, same 29byte in LongMode
mov eax, 5
mov [0723872h], al
mov [0723872h], al
mov [0723872h], al
mov [0723872h], al
exkor wrote: In ProtectedMode x86: savings start when you "mov" the same constant (a byte, like 01000101 wants) 6 or more times
Wouldn't it be better to do:
Code: Select all
mov dword[0x723872],0x05050505
Also, why copy the same value several times to the same memory location?
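A C-level sketch of the dword idea, assuming the four bytes are meant to land in four consecutive locations (the quoted listings store to one address four times, which is a different thing). The helper names `fill_bytewise` and `fill_dword` are invented for this example. `memcpy` is used for the 32-bit store so the code has no alignment or aliasing problems; compilers commonly turn it into a single store, but that is not guaranteed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Four separate byte stores to consecutive locations. */
static void fill_bytewise(uint8_t *dst)
{
    dst[0] = 5;
    dst[1] = 5;
    dst[2] = 5;
    dst[3] = 5;
}

/* One 32-bit store covering the same four bytes. */
static void fill_dword(uint8_t *dst)
{
    uint32_t packed = 0x05050505u; /* all bytes equal, so endianness does not matter */
    memcpy(dst, &packed, sizeof packed);
}

/* Small self-check: both approaches leave identical bytes behind. */
static void fill_demo(void)
{
    uint8_t a[4] = {0};
    uint8_t b[4] = {0};

    fill_bytewise(a);
    fill_dword(b);
    assert(memcmp(a, b, sizeof a) == 0);
}
```

For real device registers the two are not interchangeable: hardware may require a specific access width, so merging byte writes into a dword is only safe for plain memory.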
YouTube:
http://youtube.com/@AltComp126
My x86 emulator/kernel project and software tools/documentation:
http://master.dl.sourceforge.net/projec ... ip?viasf=1
Even though I was talking about moving bytes of data, if what is said above is true, wouldn't moving dwords be more efficient and better optimised in terms of machine instructions per asm line?
speal wrote: Barring optimization and other goofy things I can't address without seeing all the code:
*(unsigned char*)0xABCDEF = 12345;
will result in the same general machine instructions as:
unsigned char* address = 0xABCDEF;
*address = 12345;
I think you mean:
Code: Select all
static unsigned char * const address = (unsigned char *)0xCAFEBABE;
*address = 12345;
That said, what about the following?
Code: Select all
struct Memory_Mapped_Device {
uint32_t reg_foo;
uint32_t reg_bla;
uint32_t reg_blub;
...
} *device = (struct Memory_Mapped_Device *)0xCAFEBAB0;
device->reg_foo = 17;
device->reg_bla = 23;
instead of:
Code: Select all
*(uint32_t*)0xCAFEBAB0 = 17;
*(uint32_t*)0xCAFEBAB4 = 23;
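A hosted sketch of that comparison, under a few assumptions: only the three register names from the snippet are used, the `volatile` qualifiers and the helper names `write_via_struct` and `write_via_offsets` are additions for illustration, and an ordinary array stands in for the device at 0xCAFEBAB0 so the code can actually run.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Register block; only the three fields named in the post are included. */
struct Memory_Mapped_Device {
    volatile uint32_t reg_foo;
    volatile uint32_t reg_bla;
    volatile uint32_t reg_blub;
};

/* The struct layout must match the hand-written offsets used below. */
_Static_assert(offsetof(struct Memory_Mapped_Device, reg_foo) == 0, "reg_foo offset");
_Static_assert(offsetof(struct Memory_Mapped_Device, reg_bla) == 4, "reg_bla offset");

/* Access through a struct pointer placed at the device base. */
static void write_via_struct(uintptr_t base)
{
    struct Memory_Mapped_Device *device = (struct Memory_Mapped_Device *)base;
    device->reg_foo = 17;
    device->reg_bla = 23;
}

/* Access through raw addresses with the offsets written out by hand. */
static void write_via_offsets(uintptr_t base)
{
    *(volatile uint32_t *)(base + 0) = 17;
    *(volatile uint32_t *)(base + 4) = 23;
}

/* Small self-check: both forms hit the same two words. */
static void mmio_demo(void)
{
    static uint32_t backing[3]; /* stand-in for the device registers */

    write_via_struct((uintptr_t)backing);
    assert(backing[0] == 17 && backing[1] == 23);

    backing[0] = backing[1] = 0;
    write_via_offsets((uintptr_t)backing);
    assert(backing[0] == 17 && backing[1] == 23);
}
```

The struct version lets the compiler work out the offsets, and the static assertions catch any padding surprises at build time.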
MfG
Goswin
Life - Don't talk to me about LIFE!
So long and thanks for all the fish.
I posted a few code snippets in an earlier post about relocating a struct. I think that would also be a fair approach to make things more readable while still controlling the memory allocation process for variables.
Basically, fill a struct with variables, then move the entire struct, and once it is in place, the variables stack up from the base of the struct.