Stupid assembly question
Is [ss:esp] the same as ss:[esp] in NASM syntax?
Currently developing Lithium OS (LiOS).
Recursive paging saves lives.
"I want to change the world, but they won't give me the source code."
Re: Stupid assembly question
Hi,
BMW wrote: Is [ss:esp] the same as ss:[esp] in NASM syntax?
Yes. I think you can also do (e.g.) "ss mov eax,[esp]" if you want (but I wouldn't recommend this as it's less obvious what the "ss" is for).
Of course an SS segment override prefix is unnecessary when ESP or EBP is involved because SS is the default in those cases (in the same way that you could use "[esi]" instead of "[ds:esi]"). To avoid needing to remember the rules that determine which segment is the default segment you could just put segment override prefixes everywhere, or set SS=DS=ES and forget them.
Cheers,
Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
Re: Stupid assembly question
I agree with your self-assessment that this is a stupid question.
Stupid not because you should be expected to know the answer but because you could have tested it so easily. It would have taken no time to write the code, assemble it, and check the opcodes produced.
An unwillingness to do a little experimentation and test things out doesn't bode well for OS development. It's (IMO) similar to those who ask "what is wrong with this code" without first doing a little elementary debugging for themselves.
Re: Stupid assembly question
If we had to align code on a certain boundary, would it be better to use NOPs or unnecessary segment prefixes? This is a micro-optimization issue. For example, the start of each loop could be "adjusted" so that it always sits on e.g. a 16-byte boundary. Unnecessary segment prefixes could be one tool for adding that padding.
Whether this alignment has any impact on efficiency is another matter. In short: which does the CPU ignore more cheaply, a NOP or an unnecessary segment prefix? Intuitively, it feels that NOPs are better.
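For reference, this is roughly how loop alignment is usually requested in NASM. It is only a sketch: the label names and the loop body are made up for illustration, and whether the padding helps at all depends on the CPU. ALIGN pads with NOPs; the smartalign macro package lets it use multi-byte NOPs, which older CPUs may not support.

```nasm
; Sketch only: labels and loop body are hypothetical.
; Assemble with: nasm -f bin align.asm -l align.lst
bits 32
section .text

%use smartalign           ; NASM standard macro package
ALIGNMODE p6              ; allow ALIGN to pad with long (multi-byte) NOPs

start:
    mov ecx, 1000
    ALIGN 16              ; pad up to the next 16-byte boundary
.loop:
    dec ecx
    jnz .loop
    ret
```

The listing file shows which padding bytes were actually emitted before .loop.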
Re: Stupid assembly question
My intuition says otherwise, I believe prefixes should involve less overhead, especially in comparison to multi-byte nops, which are real full instructions that just happen to have no effect. But it's still just a guess.
In a quick look at the Intel Optimization Reference Manual I found their suggestions on what instructions to use for nop, and that prefixes tend to make things slower, but I couldn't find a comparison between the methods for padding. Perhaps they just didn't consider prefixes for padding?
Re: Stupid assembly question
Hi,
I agree - it would take no time to write the code, assemble it, and check the opcodes produced, if I were running my development setup. However, I am not in my development setup (I am in Windows and do not have NASM installed). I'm sure asking a simple question on here and getting a decent answer from someone like Brendan would be quicker than finding my linux hard drive, rebooting and testing.
Also I was lucky enough for Brendan to provide an excellent answer which offers extra information, which would not have been gleaned from experimenting. Any other people who may not have known about the minor details of the NASM syntax in question and have read this thread will now also know, which is an added benefit.
iansjack wrote: Stupid not because you should be expected to know the answer but because you could have tested it so easily. It would have taken no time to write the code, assemble it, and check the opcodes produced.
The only reason I labelled it stupid was to try to avoid responses like yours; it obviously hasn't worked.
Re: Stupid assembly question
Hi,
Kevin wrote: My intuition says otherwise, I believe prefixes should involve less overhead, especially in comparison to multi-byte nops, which are real full instructions that just happen to have no effect. But it's still just a guess.
In a quick look at the Intel Optimization Reference Manual I found their suggestions on what instructions to use for nop, and that prefixes tend to make things slower, but I couldn't find a comparison between the methods for padding. Perhaps they just didn't consider prefixes for padding?
Intel's latest optimisation guide suggests an operand size override prefix for a 2-byte NOP (but not for larger NOPs). AMD's optimisation manual recommends redundant operand size override prefixes or larger NOPs.
I'd be tempted to suspect that the fastest way (where possible) may be to increase the size of previous instructions (e.g. by using 8-bit or 32-bit "zero displacements") rather than inserting additional instructions of any kind; especially if the previous instructions aren't within a loop.
Cheers,
Brendan
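As an illustration of the "zero displacement" idea, here is a small NASM sketch (the choice of registers is arbitrary). It relies on NASM's documented special case that a BYTE or DWORD keyword inside the brackets forces a displacement field even when the offset is zero. All three lines perform the same load and differ only in length.

```nasm
; Sketch only: same load, three encodings of different length.
; Assemble with: nasm -f bin disp.asm -l disp.lst
bits 32

    mov eax, [ebx]          ; no displacement:          8B 03              (2 bytes)
    mov eax, [byte ebx]     ; 8-bit zero displacement:  8B 43 00           (3 bytes)
    mov eax, [dword ebx]    ; 32-bit zero displacement: 8B 83 00 00 00 00  (6 bytes)
```

Picking the longer form for an instruction ahead of the loop grows the code by 1 or 4 bytes without adding an instruction.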
- Love4Boobies
Re: Stupid assembly question
While the first rule of assembly code is that it's inherently unportable across architectures, the first rule of assembly micro-optimization is that it's not portable across microarchitectures. Thus, such questions become meaningless in the general case. As far as coding goes, I would generally recommend not tuning for any particular microarchitecture, except perhaps as an aside to a generic version which the build system should link by default.
That being said, here's a summary of the situation so far:
- An instruction's encoding, except sometimes for prefixes (see below), has no effect on its execution time. Also true for prefix encodings.
- NOP's do take cycles to execute so they should be used when all else fails.
- The multibyte versions of NOP are preferred to multiple individual NOP's, on CPU's which support them.
- As long as the maximum instruction length of 15 bytes is not exceeded, there is no limit on the number of prefixes.
- Although not future-proof, meaningless segment override prefixes are currently ignored. Redundant ones are future-proof, though.
- Except for VEX, any prefix may be reused within a given instruction.
- Address size prefixes slow down the decoding process.
- AMD CPU's impose a performance penalty on instructions with more than 3 prefixes.
- Executing the instruction both with dummy prefixes, used for alignment, and without, by branching, may confuse the CPU's optimizer.
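To make the padding options in the list above concrete, here is a small NASM sketch. The prefix bytes are emitted by hand with DB so that the encoding does not depend on how the assembler treats a redundant override. The registers are arbitrary, and nothing here says which variant is faster on a given CPU.

```nasm
; Sketch only: ways of spending extra bytes on padding.
; Assemble with: nasm -f bin pad.asm -l pad.lst
bits 32

    nop                     ; single-byte NOP (90)
    db 0x66, 0x90           ; 2-byte NOP: operand-size override prefix + NOP

    db 0x3E                 ; DS segment override prefix, emitted by hand
    mov eax, [esi]          ; DS is already the default here, so the prefix is redundant

    mov eax, [esi]          ; same load without the prefix, one byte shorter
```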
BMW wrote: quicker than finding my linux hard drive, rebooting and testing.
But quicker than downloading around 1 MiB, which is the size of NASM's Windows port, which can be run out of the box?
"Computers in the future may weigh no more than 1.5 tons.", Popular Mechanics (1949)
[ Project UDI ]
Re: Stupid assembly question
BMW wrote: I'm sure asking a simple question on here and getting a decent answer from someone like Brendan would be quicker than finding my linux hard drive, rebooting and testing.
Or even taking the trouble to install the Windows version of NASM. I agree; it's much easier to just let someone else do the work for you. But is it a good grounding for OS development? - I don't think so.
In exactly the same way, as I said, it's easier to ask someone else to debug your code for you than to do the work yourself.
- Combuster
Re: Stupid assembly question
Love4Boobies wrote: NOP's do take cycles to execute so they should be used when all else fails.
As far as I recall, the one-byte NOP only gets to a special case in the decoder which in turn decodes it to zero µops, which means it might not even take a cycle.
Re: Stupid assembly question
Code:
entry1:
; Code
; Fall through
entry2:
; Code
; Fall through
entry3:
; Code
ret
Brendan wrote: I'd be tempted to suspect that the fastest way (where possible) may be to increase the size of previous instructions
Perhaps the preferable one. At least more elegant.
Re: Stupid assembly question
BMW wrote: I'm sure asking a simple question on here and getting a decent answer from someone like Brendan would be quicker than finding my linux hard drive, rebooting and testing.
The difference between the two is one gives belief whereas the other gives knowledge.
Every universe of discourse has its logical structure --- S. K. Langer.
- Love4Boobies
Re: Stupid assembly question
Love4Boobies wrote: NOP's do take cycles to execute so they should be used when all else fails.
Combuster wrote: As far as I recall, the one-byte NOP only gets to a special case in the decoder which in turn decodes it to zero µops, which means it might not even take a cycle.
There is indeed special support for single-byte NOP's (but not multibyte ones), but you're misremembering the details: It's only implemented on Intel CPU's and what it does is to remove the dependency on EAX on the pipeline, since NOP is really "xchg eax, eax." They still have 1 µop and, depending on the CPU, the latency is 0--1 cycles, and the throughput is always more than 0 cycles but never more than 1.
Re: Stupid assembly question
BMW wrote: Also I was lucky enough for Brendan to provide an excellent answer which offers extra information, which would not have been gleaned from experimenting. Any other people who may not have known about the minor details of the NASM syntax in question and have read this thread will now also know, which is an added benefit.
Except for the fact that Brendan's answer is incorrect. One of the forms you listed works; the other results in a syntax error.
Those who understand Unix are doomed to copy it, poorly.
Re: Stupid assembly question
Hi,
BMW wrote: Also I was lucky enough for Brendan to provide an excellent answer which offers extra information, which would not have been gleaned from experimenting. Any other people who may not have known about the minor details of the NASM syntax in question and have read this thread will now also know, which is an added benefit.
Minoto wrote: Except for the fact that Brendan's answer is incorrect. One of the forms you listed works; the other results in a syntax error.
I tested - you're right.
Cheers,
Brendan
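For anyone who wants to repeat the test, a minimal sketch follows (the file name is arbitrary). NASM expects the segment override inside the brackets; the form with the override outside the brackets is MASM-style, and it is left commented out here so the file still assembles.

```nasm
; Sketch only. Assemble with: nasm -f bin test.asm -l test.lst
; then read the opcode bytes in test.lst.
bits 32

    mov eax, [esp]        ; SS is already the default segment for ESP-based addresses
    mov eax, [ss:esp]     ; NASM form: the override goes inside the brackets
    ss mov eax, [esp]     ; segment register written as an instruction prefix
;   mov eax, ss:[esp]     ; MASM-style form; uncomment to see NASM reject it
```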