Clang emits code making ESP unaligned. Compiler bug?

Posted: Sat Apr 03, 2021 12:01 pm
by vvaltchev
Guys,
I've just discovered that the following code:

Code: Select all

#include <inttypes.h>

static __attribute__((always_inline)) inline void
bar(uint8_t a) {
    /* do nothing */
}

void foo(void)
{
   bar(0);
}
When compiled with clang, ANY version, from 3.x to 11.x, using the options: -m32 -O0 -ffreestanding, generates:

Code: Select all

foo:                                    # @foo
        push    ebp
        mov     ebp, esp
        sub     esp, 1                        # What? Why are you making the stack unaligned?
        mov     byte ptr [ebp - 1], 0
        add     esp, 1
        pop     ebp
        ret
Check the example on Compiler Explorer: https://gcc.godbolt.org/z/KPeqaPqjY
Is this a compiler bug? Notes:

1. It doesn't happen with any other compiler.
2. -O0 and -ffreestanding change nothing; I added them just to make explicit that this is a build without optimizations and that -ffreestanding has no effect.
3. It happens ONLY when __attribute__((always_inline)) is used
4. It happens ONLY when bar() takes an argument that is < sizeof(void *)
5. It happens ONLY with -m32

As you can imagine, I discovered it while debugging my project after enabling UBSAN. The real example generates much more code,
but that doesn't matter: this is the shortest and simplest code that reproduces the problem.

In my opinion this is definitely a bug, but what do you think about it?

Re: Clang emits code making ESP unaligned. Compiler bug?

Posted: Sat Apr 03, 2021 12:29 pm
by thewrongchristian
vvaltchev wrote:Guys,
In my opinion this is definitely a bug, but what do you think about it?
I don't think so. $esp isn't being dereferenced, so it's not being used as an invalid or unaligned pointer.

It's probably just boilerplate code that has not yet been optimised away.

Re: Clang emits code making ESP unaligned. Compiler bug?

Posted: Sat Apr 03, 2021 12:36 pm
by qookie
I don't see the problem with this though? foo is a leaf function since it doesn't call anything else. As such, nothing depends on what the stack pointer is aligned to inside of it, and I assume the compiler is smart enough to order things in such a way to respect alignment requirements of the data it puts on the stack.

For example, if inside of bar you add a call to a third function, baz, declared simply like so:

Code: Select all

void baz(uint8_t);
the compiler properly allocates 8 bytes on the stack to keep the proper alignment needed by the ABI.
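
A minimal sketch of the variant described above, for anyone who wants to try it. In the post baz is only declared; the definition and the baz_calls counter below are assumptions added so that the example links and the call has an observable effect. Once bar is inlined, foo is no longer a leaf function, which is the situation where the compiler has to care about the alignment at the call site.

```c
#include <stdint.h>

/* Hypothetical counter, not part of the original example. */
unsigned baz_calls;

/* Only declared in the post; defined here so that the sketch links. */
void baz(uint8_t value)
{
   (void)value;
   baz_calls++;
}

static __attribute__((always_inline)) inline void
bar(uint8_t a) {
   baz(a);   /* after inlining, foo() contains a real call */
}

void foo(void)
{
   bar(0);
}
```

Compiling this with the same options as the original example (-m32 -O0 -ffreestanding) should make it possible to compare the stack adjustment in foo against the leaf version.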

Re: Clang emits code making ESP unaligned. Compiler bug?

Posted: Sat Apr 03, 2021 12:40 pm
by vvaltchev
thewrongchristian wrote:I don't think so. $esp isn't being dereferenced, so it's not being used as an invalid or unaligned pointer.
It's probably just boilerplate code that has not yet been optimised away.
Well, the problem is: what happens when an interrupt occurs while the function is running? The handler will run with the whole stack unaligned. That works fine on x86, but it's inefficient and unaligned access is undefined behavior in C. I found this after enabling UBSAN: I got an unaligned access failure in an IRQ, while attempting to read a pointer-sized integer from the stack. It happens every single time we get an IRQ while a specific function is on the stack, and that's how I discovered that the function is making ESP misaligned. I don't think that's fine, in particular because I'm compiling with -ffreestanding. An interrupt might occur: what then?
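
To make the failing check concrete, here is a rough sketch of the kind of test an alignment sanitizer performs before a pointer-sized load. The helper name is_aligned is hypothetical and this is not UBSAN's actual implementation; it only illustrates why a 4-byte read from the stack is flagged once ESP is off by one.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if addr is a multiple of align.
   align is assumed to be a power of two. */
bool is_aligned(uintptr_t addr, size_t align)
{
   return (addr & (align - 1)) == 0;
}
```

For example, if ESP is 0x1000 on entry to foo, then "sub esp, 1" leaves it at 0x0FFF, and is_aligned(0x0FFF, 4) is false while is_aligned(0x1000, 4) is true.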

Re: Clang emits code making ESP unaligned. Compiler bug?

Posted: Sat Apr 03, 2021 12:46 pm
by vvaltchev
qookie wrote:I don't see the problem with this though? foo is a leaf function since it doesn't call anything else. As such, nothing depends on what the stack pointer is aligned to inside of it, and I assume the compiler is smart enough to order things in such a way to respect alignment requirements of the data it puts on the stack.

For example, if inside of bar you add a call to a third function, baz, declared simply like so:

Code: Select all

void baz(uint8_t);
the compiler properly allocates 8 bytes on the stack to keep the proper alignment needed by the ABI.
The problem is: what happens when an interrupt occurs?
Btw, that was the simplest code reproducing the bug. The actual code was:

Code: Select all

static void textmode_enable_cursor(void)
{
   const u8 s_start = 0; /* scanline start */
   const u8 s_end = 15;  /* scanline end */

   outb(0x3D4, 0x0A);
   outb(0x3D5, (inb(0x3D5) & 0xC0) | s_start);  // Note: mask with 0xC0
                                                // which keeps only the
                                                // higher 2 bits in order
                                                // to set bit 5 to 0.

   outb(0x3D4, 0x0B);
   outb(0x3D5, (inb(0x3D5) & 0xE0) | s_end);    // Mask with 0xE0 keeps
                                                // the higher 3 bits.
}
This function does real stuff. outb() and inb() are functions with inline assembly defined as always_inline. But, as you can see, the behavior does not depend on the inline assembly at all.
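
The two masks mentioned in the comments can be checked in isolation. The sketch below pulls the bit arithmetic into two helper functions (the names are hypothetical, they do not appear in the original code) so it can be exercised without touching any I/O port.

```c
#include <stdint.h>

/* Value written back to the cursor start register: keep the upper 2 bits
   of the old value (mask 0xC0), which also clears bit 5, then OR in the
   scanline start. */
uint8_t cursor_start_value(uint8_t old, uint8_t scanline_start)
{
   return (uint8_t)((old & 0xC0) | scanline_start);
}

/* Value written back to the cursor end register: keep the upper 3 bits
   of the old value (mask 0xE0), then OR in the scanline end. */
uint8_t cursor_end_value(uint8_t old, uint8_t scanline_end)
{
   return (uint8_t)((old & 0xE0) | scanline_end);
}
```

With an old register value of 0xFF, cursor_start_value(0xFF, 0) gives 0xC0 and cursor_end_value(0xFF, 15) gives 0xEF.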

I cannot believe the compiler is allowed to do that, in particular if unaligned access is UB, even on architectures such as x86 where that's fine. We already had a long discussion about that in another thread.

Re: Clang emits code making ESP unaligned. Compiler bug?

Posted: Sat Apr 03, 2021 12:55 pm
by qookie
vvaltchev wrote: The problem is: what happens when an interrupt occurs?
Hm, that is a good point. Originally I was thinking the compiler can't be aware of interrupts, but it still has to be aware of signals, which behave basically like interrupts. I'm not too sure if the standard allows this, and requires that signal handlers perform additional alignment if needed, or if this is indeed a bug in LLVM (since I assume it affects more than just clang).

BTW, I noticed the same happens without -m32, except that you also need -mno-red-zone, since otherwise the compiler just places it in the red zone below the stack (which signals must carefully avoid writing into).

Re: Clang emits code making ESP unaligned. Compiler bug?

Posted: Sat Apr 03, 2021 1:46 pm
by Korona
Use -mstack-alignment=8 in your kernel to make sure that this does not bite you.

Re: Clang emits code making ESP unaligned. Compiler bug?

Posted: Sat Apr 03, 2021 2:26 pm
by vvaltchev
Korona wrote:Use -mstack-alignment=8 in your kernel to make sure that this does not bite you.
Just tried: https://gcc.godbolt.org/z/MGo7f7a8P
Unfortunately, it has no effect :-(

Re: Clang emits code making ESP unaligned. Compiler bug?

Posted: Sat Apr 03, 2021 2:56 pm
by vvaltchev
Update: the problem can be reproduced even by just using _Atomic(bool):
This code:

Code: Select all

#include <inttypes.h>
#include <stdatomic.h>
#include <stdbool.h>

extern _Atomic(bool) var;

void foo(void) {
    atomic_store_explicit(&var, false, memory_order_relaxed);
}
Generates:

Code: Select all

foo:                                    
        push    ebp
        mov     ebp, esp
        sub     esp, 1                          # Unaligned stack ptr!
        mov     byte ptr [ebp - 1], 0
        mov     al, byte ptr [ebp - 1]
        mov     byte ptr [var], al
        add     esp, 1
        pop     ebp
        ret
Check on Compiler Explorer: https://gcc.godbolt.org/z/d891bPjYh

Re: Clang emits code making ESP unaligned. Compiler bug?

Posted: Sat Apr 03, 2021 3:07 pm
by kzinti
I guess the easiest thing to do here then is to align your stack when entering interrupt handlers... It's one more instruction... Not great, but sounds like it is necessary.

Re: Clang emits code making ESP unaligned. Compiler bug?

Posted: Sat Apr 03, 2021 3:11 pm
by thewrongchristian
kzinti wrote:I guess the easiest thing to do here then is to align your stack when entering interrupt handlers... It's one more instruction... Not great, but sounds like it is necessary.
It won't help. The CPU has already pushed eflags, cs and eip onto the stack with the dodgy alignment before your interrupt code even has a chance to run.
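
As a rough illustration of that point: assuming a 32-bit interrupt taken without a privilege change, the CPU pushes three 4-byte values (EFLAGS, CS, EIP) starting from whatever ESP happens to be at that moment. The helper below is hypothetical and only does the address arithmetic.

```c
#include <stdint.h>

/* Address of the n-th 4-byte value pushed by the CPU (n = 0 is the first
   push), given the value of ESP at the moment the interrupt is taken. */
uint32_t pushed_slot_addr(uint32_t esp_at_interrupt, unsigned n)
{
   return esp_at_interrupt - 4u * (n + 1u);
}
```

If ESP is 0x0FFF when the interrupt arrives, the three slots land at 0x0FFB, 0x0FF7 and 0x0FF3, none of which is a multiple of 4.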

Re: Clang emits code making ESP unaligned. Compiler bug?

Posted: Sat Apr 03, 2021 3:13 pm
by kzinti
thewrongchristian wrote:It won't help. The CPU has already pushed eflags, cs and eip onto the stack with the dodgy alignment before your interrupt code even has a chance to run.
But that doesn't matter. x86 is perfectly happy working with unaligned addresses.

The concerns as I understand it are:

1) Respecting the ABI
2) Performance when executing kernel code

Both are addressed by aligning the stack before entering C code.

Re: Clang emits code making ESP unaligned. Compiler bug?

Posted: Sat Apr 03, 2021 3:13 pm
by Korona
Hm, that looks quite scary indeed. I'd suggest reporting it as a Clang bug (mention the interrupt use case).

Re: Clang emits code making ESP unaligned. Compiler bug?

Posted: Sat Apr 03, 2021 4:01 pm
by vvaltchev
I reported the bug: https://bugs.llvm.org/show_bug.cgi?id=49828
But later I've also found this: https://groups.google.com/g/llvm-dev/c/ ... O7J0baAAAJ

It looks like @kzinti might be right, but I hope the clang guys will suggest something else (e.g. an option).
The problem is that adjusting the stack pointer doesn't look so simple to me. Maybe I'm not thinking clearly at the moment since it's late, but let's consider my low-level IRQ handler:

Code: Select all

FUNC(asm_irq_entry):

   kernel_entry_common
   push_custom_flags (0)

   push offset .irq_resume
   mov eax, esp
   cld            # Set DF = 0, as C compilers by default assume that.
   push eax
   call irq_entry

   add esp, 8     # Discard the previously-pushed 'eax' and .irq_resume

.irq_resume:
   pop_custom_flags
   kernel_exit_common

END_FUNC(asm_irq_entry)
Sorry for the assembly macros, but I suppose that with them the code is still understandable.
Now, between "push offset .irq_resume" and "mov eax, esp" I could do something like: "and esp, ~3", and the stack pointer will be aligned but the "struct regs" will be unreadable because now I've moved 0..3 bytes away from it. So, I not only have to re-align the stack pointer, but also shift the whole struct by 0..3 bytes. That's a nightmare.
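
The 0..3 byte displacement can be written out as plain arithmetic. This is only a sketch of what "and esp, ~3" does to the address; align_down and align_gap are hypothetical helper names, and align is assumed to be a power of two.

```c
#include <stdint.h>

/* Round addr down to a multiple of align, like "and esp, ~(align - 1)". */
uint32_t align_down(uint32_t addr, uint32_t align)
{
   return addr & ~(align - 1u);
}

/* Number of bytes the stack pointer moves away from the data that was
   already pushed when it gets re-aligned. */
uint32_t align_gap(uint32_t addr, uint32_t align)
{
   return addr - align_down(addr, align);
}
```

For an ESP of 0x0FF3, align_down(0x0FF3, 4) is 0x0FF0 and align_gap(0x0FF3, 4) is 3: the saved registers still start at 0x0FF3, so anything that locates them relative to the new ESP is off by those 3 bytes.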

Btw, do you know my theory about why the Linux guys haven't hit this problem yet? Linux compiles with clang (as far as I know) and it has support for UBSAN as well, but "-fsanitize=alignment" is not turned on for architectures supporting unaligned access. And, even if you turn it on, it will just log a ton of unaligned access warnings. Not panic handling like in my case.

I feel like I don't have much choice other than not enabling "-fsanitize=alignment" for clang. It's just that I'm tired of having to handle an endless number of IFs in the build system and #ifdefs in the code for supporting stuff :-(

Re: Clang emits code making ESP unaligned. Compiler bug?

Posted: Sat Apr 03, 2021 5:12 pm
by thewrongchristian
vvaltchev wrote: Btw, do you know my theory about why the Linux guys haven't hit this problem yet? Linux compiles with clang (as far as I know) and it has support for UBSAN as well, but "-fsanitize=alignment" is not turned on for architectures supporting unaligned access. And, even if you turn it on, it will just log a ton of unaligned access warnings. Not panic handling like in my case.
Your example is not optimized. As soon as you enable even minimal optimization (-O), the problem disappears:

https://gcc.godbolt.org/z/Y4xc66MqT