Kernel Having weird UB issues i don't understand whats going on
-
- Member
- Posts: 59
- Joined: Tue Jul 05, 2022 12:37 pm
Sometimes it's a 0xD (general protection fault), sometimes it's a 0xE (page fault) at memcpy, and sometimes it 0xE's infinitely at the scheduler. I don't know what to do.
Try it yourself; I don't understand what's happening.
https://github.com/rayanmargham/NyauxKC
-
- Member
- Posts: 5568
- Joined: Mon Mar 25, 2013 7:01 pm
Re: Kernel Having weird UB issues i don't understand whats going on
How do you know it's undefined behavior if you haven't found the code responsible for the exception?
Anyway, you can start by sharing more information about the CPU state when the first exception occurs. Perhaps the output from QEMU's "-d int" log?
-
- Member
- Posts: 59
- Joined: Tue Jul 05, 2022 12:37 pm
Re: Kernel Having weird UB issues i don't understand whats going on
It's UB because different exceptions happen on every QEMU boot.
Here are pastebins of some of the exceptions that can happen per boot:
https://pastebin.com/JXErQRpb
https://pastebin.com/s4LAvey6
The exceptions are very random and make no sense at all.
-
- Member
- Posts: 59
- Joined: Tue Jul 05, 2022 12:37 pm
Re: Kernel Having weird UB issues i don't understand whats going on
I've been debugging this for 6 hours.
I can tell you it's very much UB. I stared at the disassembly for so long and nothing is making sense.
-
- Member
- Posts: 59
- Joined: Tue Jul 05, 2022 12:37 pm
Re: Kernel Having weird UB issues i don't understand whats going on
This is an issue that most likely won't be solved for weeks, as there is some really hard-to-track bug somewhere in the code that's causing this.
-
- Member
- Posts: 5568
- Joined: Mon Mar 25, 2013 7:01 pm
Re: Kernel Having weird UB issues i don't understand whats going on
What happens if two CPUs call kmalloc() at exactly the same time?
-
- Member
- Posts: 59
- Joined: Tue Jul 05, 2022 12:37 pm
Re: Kernel Having weird UB issues i don't understand whats going on
I don't have a lock, so I don't know.
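To make the question concrete, here is a minimal single-threaded sketch that replays one bad interleaving of two unlocked allocations. The free list, `struct free_block`, and the two-step pop are hypothetical stand-ins, not code from the NyauxKC repository; the point is only that when both CPUs read the list head before either one updates it, both walk away with the same block.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical free-list node; not taken from the NyauxKC source. */
struct free_block {
    struct free_block *next;
};

static struct free_block blocks[2];
static struct free_block *free_list;

/* Build a two-entry free list: blocks[0] -> blocks[1] -> NULL. */
static void free_list_init(void)
{
    blocks[0].next = &blocks[1];
    blocks[1].next = NULL;
    free_list = &blocks[0];
}

/* Step 1 of an unlocked pop: read the current head. */
static struct free_block *pop_read_head(void)
{
    return free_list;
}

/* Step 2 of an unlocked pop: unlink the head that was read earlier. */
static void pop_write_head(struct free_block *head)
{
    free_list = head->next;
}

/*
 * Replays the interleaving in which both CPUs finish step 1 before
 * either runs step 2. Returns 1 if the same block was handed out twice.
 */
static int simulate_race(void)
{
    free_list_init();
    struct free_block *cpu0 = pop_read_head();
    struct free_block *cpu1 = pop_read_head();
    pop_write_head(cpu0);
    pop_write_head(cpu1);
    return cpu0 == cpu1;
}
```

Two owners of one block then overwrite each other's data, which would be consistent with seeing a different exception on every boot, since the outcome depends on timing.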
-
- Member
- Posts: 59
- Joined: Tue Jul 05, 2022 12:37 pm
Re: Kernel Having weird UB issues i don't understand whats going on
Code: Select all
spinlock_t mem_lock; /* serializes every allocator entry point */

/* Requests above 1024 bytes get a VMM region; smaller ones come from the slab allocator. */
void* kmalloc(uint64_t amount)
{
    spinlock_lock(&mem_lock);
    if (amount > 1024)
    {
        void* him = kvmm_region_alloc(amount, PRESENT | RWALLOWED);
        memset(him, 0, amount);
        spinlock_unlock(&mem_lock);
        return him;
    }
    else
    {
#ifdef __SANITIZE_ADDRESS__
        /* ASAN build: append a 256-byte 0xFD redzone after the allocation. */
        void* him = slaballocate(amount + 256);
        memset((char*)him + amount, 0xFD, 256);
        spinlock_unlock(&mem_lock);
        return him;
#else
        void* him = slaballocate(amount);
        memset(him, 0, amount);
        spinlock_unlock(&mem_lock);
        return him;
#endif
    }
}

/* The caller passes the original size so the matching allocator can be chosen. */
void kfree(void* addr, uint64_t size)
{
    spinlock_lock(&mem_lock);
    /* A size with the top bit set is implausibly large, so treat it as corruption. */
    if (size >> 63)
    {
        kprintf("kfree: memory corruption detected\n");
        spinlock_unlock(&mem_lock);
        __builtin_trap();
    }
    if (size > 1024)
    {
        kvmm_region_dealloc(addr);
        spinlock_unlock(&mem_lock);
    }
    else
    {
        slabfree(addr);
        spinlock_unlock(&mem_lock);
    }
}
Still having UB issues; nothing really changed.
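The thread never shows how `spinlock_t`, `spinlock_lock` and `spinlock_unlock` are defined, so the following is only a sketch of one common way to write them, assuming C11 `<stdatomic.h>` is available in the kernel build. The struct layout, `SPINLOCK_INIT`, and `spinlock_trylock` are assumptions added for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Assumed layout: the thread never shows the real spinlock_t. */
typedef struct {
    atomic_flag locked;
} spinlock_t;

/* Hypothetical initializer for an unlocked spinlock. */
#define SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* Try once; returns true if the caller now owns the lock. */
static bool spinlock_trylock(spinlock_t *lock)
{
    /* test_and_set returns the previous value: false means it was free. */
    return !atomic_flag_test_and_set_explicit(&lock->locked,
                                              memory_order_acquire);
}

/* Spin until the lock is acquired. */
static void spinlock_lock(spinlock_t *lock)
{
    while (!spinlock_trylock(lock)) {
        /* busy-wait; a kernel would usually add a CPU pause hint here */
    }
}

/* Release the lock so another CPU's test_and_set can succeed. */
static void spinlock_unlock(spinlock_t *lock)
{
    atomic_flag_clear_explicit(&lock->locked, memory_order_release);
}
```

The acquire/release pairing is what makes writes done inside the critical section visible to the next CPU that takes the lock. If an interrupt handler on the same CPU can also call `kmalloc`, a lock like this would additionally need interrupts disabled while it is held, otherwise the handler spins forever on a lock its own CPU owns.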
-
- Member
- Posts: 59
- Joined: Tue Jul 05, 2022 12:37 pm
Re: Kernel Having weird UB issues i don't understand whats going on
Not UB issues anymore; adding a lock has made the behaviour consistent!
Code: Select all
arch_late_init(): CPU 9 is Online!
arch_late_init(): CPU 12 is Online!UBSAN: type_mismatch @ src/sched/sched.c:95:11 (member access within NULL pointer of type 'struct per_cpu_data')
arch_late_init(): CPU 1 is Online!UBSAN: type_mismatch @ src/sched/sched.c:95:11 (member access within NULL pointer of type 'struct per_cpu_data')
arch_late_init(): CPU 2 is Online!UBSAN: type_mismatch @ src/sched/sched.c:95:11 (member access within NULL pointer of type 'struct per_cpu_data')
arch_late_init(): CPU 5 is Online!UBSAN: type_mismatch @ src/sched/sched.c:95:11 (member access within NULL pointer of type 'struct per_cpu_data')
arch_late_init(): CPU 4 is Online!UBSAN: type_mismatch @ src/sched/sched.c:95:11 (member access within NULL pointer of type 'struct per_cpu_data')
Page Fault! CR2 0xaUBSAN: type_mismatch @ src/sched/sched.c:95:11 (member access within NULL pointer of type 'struct per_cpu_data')
arch_late_init(): CPU 17 is Online!
arch_late_init(): CPU 8 is Online!UBSAN: type_mismatch @ src/sched/sched.c:95:11 (member access within NULL pointer of type 'struct per_cpu_data')
arch_late_init(): CPU 16 is Online!UBSAN: type_mismatch @ src/sched/sched.c:95:11 (member access within NULL pointer of type 'struct per_cpu_data')
RIP is 0xffffffff800318ef. Error Code 0x0UBSAN: type_mismatch @ src/sched/sched.c:95:11 (member access within NULL pointer of type 'struct per_cpu_data')
Page Fault! CR2 0xa
RIP is 0xffffffff800318ef. Error Code 0x0
arch_late_init(): CPU 11 is Online!
Page Fault! CR2 0xaUBSAN: type_mismatch @ src/sched/sched.c:95:11 (member access within NULL pointer of type 'struct per_cpu_data')
arch_late_init(): CPU 3 is Online!
-> Function: schedd() -- 0xffffffff80031848UBSAN: type_mismatch @ src/sched/sched.c:95:11 (member access within NULL pointer of type 'struct per_cpu_data')
Page Fault! CR2 0xa
RIP is 0xffffffff800318ef. Error Code 0x0
arch_late_init(): CPU 18 is Online!
arch_late_init(): CPU 14 is Online!UBSAN: type_mismatch @ src/sched/sched.c:95:11 (member access within NULL pointer of type 'struct per_cpu_data')
-> Function: schedd() -- 0xffffffff80031848UBSAN: type_mismatch @ src/sched/sched.c:95:11 (member access within NULL pointer of type 'struct per_cpu_data')
Page Fault! CR2 0xa
RIP is 0xffffffff800318ef. Error Code 0x0
Page Fault! CR2 0xa
RIP is 0xffffffff800318ef. Error Code 0x0
Page Fault! CR2 0xa
RIP is 0xffffffff800318ef. Error Code 0x0
Page Fault! CR2 0xa
RIP is 0xffffffff800318ef. Error Code 0x0
Page Fault! CR2 0xa
RIP is 0xffffffff800318ef. Error Code 0x0
-> Function: schedd() -- 0xffffffff80031848
Page Fault! CR2 0xa
RIP is 0xffffffff800318ef. Error Code 0x0
-> Function: schedd() -- 0xffffffff80031848
Page Fault! CR2 0xa
RIP is 0xffffffff800318ef. Error Code 0x0
arch_late_init(): CPU 10 is Online!
-> Function: sched() -- 0xffffffff80002ed8UBSAN: type_mismatch @ src/sched/sched.c:95:11 (member access within NULL pointer of type 'struct per_cpu_data')
-
- Member
- Posts: 5568
- Joined: Mon Mar 25, 2013 7:01 pm
Re: Kernel Having weird UB issues i don't understand whats going on
Are there any other places where two CPUs might access the same data structure at the same time?
-
- Member
- Posts: 59
- Joined: Tue Jul 05, 2022 12:37 pm
Re: Kernel Having weird UB issues i don't understand whats going on
I am unsure; they usually would not.
But I don't know what is causing this, as it is NOT null??
-
- Member
- Posts: 5568
- Joined: Mon Mar 25, 2013 7:01 pm
Re: Kernel Having weird UB issues i don't understand whats going on
Clearly it is null, otherwise UBSAN wouldn't be complaining. So when does it become null?
-
- Member
- Posts: 59
- Joined: Tue Jul 05, 2022 12:37 pm
Re: Kernel Having weird UB issues i don't understand whats going on
It shouldn't be, though, because in arch_late_init I create it and put it in the GS segment register.
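One way to narrow down when the pointer becomes null is to check it at the scheduler's entry point instead of dereferencing it blindly. The sketch below is a hosted stand-in, not the kernel's code: `fake_gs_base`, `this_cpu()`, `install_per_cpu()`, `sched_tick()`, and the fields of `struct per_cpu_data` are all hypothetical, and a plain variable plays the role that the GS base would play in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical fields; the real struct per_cpu_data is not shown in the thread. */
struct per_cpu_data {
    int cpu_id;
    unsigned long ticks;
};

/* Hosted stand-in for the GS base; a kernel would read the pointer through GS. */
static struct per_cpu_data *fake_gs_base = NULL;

/* Returns the current CPU's data, or NULL if none has been installed. */
static struct per_cpu_data *this_cpu(void)
{
    return fake_gs_base;
}

/* Stand-in for the step in arch_late_init that publishes the per-CPU pointer. */
static void install_per_cpu(struct per_cpu_data *data)
{
    fake_gs_base = data;
}

/* Returns false instead of dereferencing when per-CPU data is missing. */
static bool sched_tick(void)
{
    struct per_cpu_data *cpu = this_cpu();
    if (cpu == NULL)
        return false; /* a kernel could log the CPU and call site here */
    cpu->ticks++;
    return true;
}
```

With a guard like this, a tick that arrives before the pointer is installed, or after something has reset it, shows up as a clean early return or log line rather than a page fault at a small address.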
-
- Member
- Posts: 59
- Joined: Tue Jul 05, 2022 12:37 pm
Re: Kernel Having weird UB issues i don't understand whats going on
It was a far return bug in the asm code.
Solved now!!