Kernel Having weird UB issues i don't understand whats going on
Posted: Mon Dec 23, 2024 7:13 pm
by RayanMargham
Sometimes it's a 0xd (general protection fault), sometimes it's a 0xe (page fault) at memcpy, and sometimes it 0xe's infinitely at the scheduler. I don't know what to do.
Try it yourself, I don't understand what's happening:
https://github.com/rayanmargham/NyauxKC
Re: Kernel Having weird UB issues i don't understand whats going on
Posted: Mon Dec 23, 2024 7:19 pm
by Octocontrabass
How do you know it's undefined behavior if you haven't found the code responsible for the exception?
Anyway, you can start by sharing more information about the CPU state when the first exception occurs. Perhaps the output from QEMU's "-d int" log?
Re: Kernel Having weird UB issues i don't understand whats going on
Posted: Mon Dec 23, 2024 7:26 pm
by RayanMargham
It's UB because different exceptions happen on every QEMU boot.
Here are pastebins of some of the exceptions that can happen on a given boot:
https://pastebin.com/JXErQRpb
https://pastebin.com/s4LAvey6
The exceptions are very random and make no sense at ALL.
Re: Kernel Having weird UB issues i don't understand whats going on
Posted: Mon Dec 23, 2024 7:26 pm
by RayanMargham
I've been debugging this for 6 hours.
I can tell you it's very much UB. I stared at the disassembly for so long and nothing makes sense.
Re: Kernel Having weird UB issues i don't understand whats going on
Posted: Mon Dec 23, 2024 7:32 pm
by RayanMargham
This is an issue that most likely won't be solved for weeks, as there is some really hard-to-track bug somewhere in the code that's causing this.
Re: Kernel Having weird UB issues i don't understand whats going on
Posted: Mon Dec 23, 2024 7:52 pm
by Octocontrabass
What happens if two CPUs call kmalloc() at exactly the same time?
Re: Kernel Having weird UB issues i don't understand whats going on
Posted: Mon Dec 23, 2024 7:54 pm
by RayanMargham
I don't have a lock, so I don't know.
Re: Kernel Having weird UB issues i don't understand whats going on
Posted: Mon Dec 23, 2024 7:56 pm
by RayanMargham
Code:
spinlock_t mem_lock;

void* kmalloc(uint64_t amount)
{
    spinlock_lock(&mem_lock);
    if (amount > 1024)
    {
        void* him = kvmm_region_alloc(amount, PRESENT | RWALLOWED);
        memset(him, 0, amount);
        spinlock_unlock(&mem_lock);
        return him;
    }
    else
    {
#ifdef __SANITIZE_ADDRESS__
        void* him = slaballocate(amount + 256);
        memset(him + amount, 0xFD, 256);
        spinlock_unlock(&mem_lock);
        return him;
#else
        void* him = slaballocate(amount);
        memset(him, 0, amount);
        spinlock_unlock(&mem_lock);
        return him;
#endif
    }
}

void kfree(void* addr, uint64_t size)
{
    spinlock_lock(&mem_lock);
    if (size >> 63)
    {
        kprintf("kfree: memory corruption detected\n");
        spinlock_unlock(&mem_lock);
        __builtin_trap();
    }
    if (size > 1024)
    {
        kvmm_region_dealloc(addr);
        spinlock_unlock(&mem_lock);
    }
    else
    {
        slabfree(addr);
        spinlock_unlock(&mem_lock);
    }
}
I added a lock like this.
Still having UB issues, nothing really changed.
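The post above calls spinlock_lock() and spinlock_unlock() but never shows how spinlock_t is defined. For readers following along, here is a minimal sketch of a test-and-set spinlock built on C11 atomics. The struct layout, SPINLOCK_INIT and spinlock_trylock() are hypothetical names chosen for illustration and are not taken from NyauxKC.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical definition: the thread does not show the real spinlock_t. */
typedef struct {
    atomic_flag held;
} spinlock_t;

#define SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* Spin until the flag was previously clear, i.e. until we are the owner. */
static inline void spinlock_lock(spinlock_t *lock)
{
    while (atomic_flag_test_and_set_explicit(&lock->held, memory_order_acquire)) {
        /* busy-wait; an x86 kernel would typically execute "pause" here */
    }
}

/* Try once; returns true only if the lock was free and is now ours. */
static inline bool spinlock_trylock(spinlock_t *lock)
{
    return !atomic_flag_test_and_set_explicit(&lock->held, memory_order_acquire);
}

/* Release the lock; earlier writes become visible to the next owner. */
static inline void spinlock_unlock(spinlock_t *lock)
{
    atomic_flag_clear_explicit(&lock->held, memory_order_release);
}
```

The acquire/release pairing is what makes writes done inside kmalloc() on one CPU visible to the next CPU that takes the lock. A kernel that also takes the same lock from interrupt handlers would additionally need to disable interrupts while holding it, which this sketch does not do.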
Re: Kernel Having weird UB issues i don't understand whats going on
Posted: Mon Dec 23, 2024 7:59 pm
by RayanMargham
Not UB issues anymore, adding a lock has made the behaviour consistent!
Code:
arch_late_init(): CPU 9 is Online!
arch_late_init(): CPU 12 is Online!UBSAN: type_mismatch @ src/sched/sched.c:95:11 (member access within NULL pointer of type 'struct per_cpu_data')
arch_late_init(): CPU 1 is Online!UBSAN: type_mismatch @ src/sched/sched.c:95:11 (member access within NULL pointer of type 'struct per_cpu_data')
arch_late_init(): CPU 2 is Online!UBSAN: type_mismatch @ src/sched/sched.c:95:11 (member access within NULL pointer of type 'struct per_cpu_data')
arch_late_init(): CPU 5 is Online!UBSAN: type_mismatch @ src/sched/sched.c:95:11 (member access within NULL pointer of type 'struct per_cpu_data')
arch_late_init(): CPU 4 is Online!UBSAN: type_mismatch @ src/sched/sched.c:95:11 (member access within NULL pointer of type 'struct per_cpu_data')
Page Fault! CR2 0xaUBSAN: type_mismatch @ src/sched/sched.c:95:11 (member access within NULL pointer of type 'struct per_cpu_data')
arch_late_init(): CPU 17 is Online!
arch_late_init(): CPU 8 is Online!UBSAN: type_mismatch @ src/sched/sched.c:95:11 (member access within NULL pointer of type 'struct per_cpu_data')
arch_late_init(): CPU 16 is Online!UBSAN: type_mismatch @ src/sched/sched.c:95:11 (member access within NULL pointer of type 'struct per_cpu_data')
RIP is 0xffffffff800318ef. Error Code 0x0UBSAN: type_mismatch @ src/sched/sched.c:95:11 (member access within NULL pointer of type 'struct per_cpu_data')
Page Fault! CR2 0xa
RIP is 0xffffffff800318ef. Error Code 0x0
arch_late_init(): CPU 11 is Online!
Page Fault! CR2 0xaUBSAN: type_mismatch @ src/sched/sched.c:95:11 (member access within NULL pointer of type 'struct per_cpu_data')
arch_late_init(): CPU 3 is Online!
-> Function: schedd() -- 0xffffffff80031848UBSAN: type_mismatch @ src/sched/sched.c:95:11 (member access within NULL pointer of type 'struct per_cpu_data')
Page Fault! CR2 0xa
RIP is 0xffffffff800318ef. Error Code 0x0
arch_late_init(): CPU 18 is Online!
arch_late_init(): CPU 14 is Online!UBSAN: type_mismatch @ src/sched/sched.c:95:11 (member access within NULL pointer of type 'struct per_cpu_data')
-> Function: schedd() -- 0xffffffff80031848UBSAN: type_mismatch @ src/sched/sched.c:95:11 (member access within NULL pointer of type 'struct per_cpu_data')
Page Fault! CR2 0xa
RIP is 0xffffffff800318ef. Error Code 0x0
Page Fault! CR2 0xa
RIP is 0xffffffff800318ef. Error Code 0x0
Page Fault! CR2 0xa
RIP is 0xffffffff800318ef. Error Code 0x0
Page Fault! CR2 0xa
RIP is 0xffffffff800318ef. Error Code 0x0
Page Fault! CR2 0xa
RIP is 0xffffffff800318ef. Error Code 0x0
-> Function: schedd() -- 0xffffffff80031848
Page Fault! CR2 0xa
RIP is 0xffffffff800318ef. Error Code 0x0
-> Function: schedd() -- 0xffffffff80031848
Page Fault! CR2 0xa
RIP is 0xffffffff800318ef. Error Code 0x0
arch_late_init(): CPU 10 is Online!
-> Function: sched() -- 0xffffffff80002ed8UBSAN: type_mismatch @ src/sched/sched.c:95:11 (member access within NULL pointer of type 'struct per_cpu_data')
Re: Kernel Having weird UB issues i don't understand whats going on
Posted: Mon Dec 23, 2024 8:00 pm
by Octocontrabass
Are there any other places where two CPUs might access the same data structure at the same time?
Re: Kernel Having weird UB issues i don't understand whats going on
Posted: Mon Dec 23, 2024 8:01 pm
by RayanMargham
I am unsure; usually they would not.
But I don't know what is causing this, as it is NOT null??
Re: Kernel Having weird UB issues i don't understand whats going on
Posted: Mon Dec 23, 2024 8:08 pm
by Octocontrabass
Clearly it is null, otherwise UBSAN wouldn't be complaining. So when does it become null?
Re: Kernel Having weird UB issues i don't understand whats going on
Posted: Mon Dec 23, 2024 8:14 pm
by RayanMargham
It shouldn't be, though, because in arch_late_init I create it and put it in the segment register gs.
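For context, on x86-64 "putting it in gs" usually means writing the pointer to the GS base MSR, which is separate from the gs selector itself. Anything that later reloads the gs selector or executes swapgs can replace that base, so the pointer can read back as null even though it was set earlier. The sketch below shows one way this is commonly done. The thread does not show arch_late_init, so the struct layout and the helper names (per_cpu_install, per_cpu_get, msr_low, msr_high, msr_join) are assumptions; only the MSR numbers are architectural.

```c
#include <stdint.h>

/* Architectural MSR numbers for the GS base registers (x86-64). */
#define MSR_GS_BASE        0xC0000101u  /* base used for gs-relative accesses */
#define MSR_KERNEL_GS_BASE 0xC0000102u  /* value swapgs exchanges with GS base */

/* Hypothetical layout; the real struct per_cpu_data is not shown in the thread. */
struct per_cpu_data {
    uint32_t cpu_id;
};

/* rdmsr/wrmsr move a 64-bit value through the EDX:EAX register pair. */
static inline uint32_t msr_low(uint64_t value)  { return (uint32_t)value; }
static inline uint32_t msr_high(uint64_t value) { return (uint32_t)(value >> 32); }
static inline uint64_t msr_join(uint32_t lo, uint32_t hi)
{
    return ((uint64_t)hi << 32) | lo;
}

#if defined(__x86_64__)
/* Both instructions are privileged: calling these outside ring 0 faults. */
static inline void wrmsr(uint32_t msr, uint64_t value)
{
    __asm__ volatile("wrmsr"
                     :
                     : "c"(msr), "a"(msr_low(value)), "d"(msr_high(value))
                     : "memory");
}

static inline uint64_t rdmsr(uint32_t msr)
{
    uint32_t lo, hi;
    __asm__ volatile("rdmsr" : "=a"(lo), "=d"(hi) : "c"(msr));
    return msr_join(lo, hi);
}

/* Publish this CPU's structure through the GS base. */
static inline void per_cpu_install(struct per_cpu_data *data)
{
    wrmsr(MSR_GS_BASE, (uint64_t)(uintptr_t)data);
}

/* Read the pointer back; a null result means the base was never set
 * on this CPU or was overwritten after per_cpu_install() ran. */
static inline struct per_cpu_data *per_cpu_get(void)
{
    return (struct per_cpu_data *)(uintptr_t)rdmsr(MSR_GS_BASE);
}
#endif
```

Checking per_cpu_get() for null right after each step of early CPU bring-up is one way to narrow down which step clears the base.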
Re: Kernel Having weird UB issues i don't understand whats going on
Posted: Mon Dec 23, 2024 10:16 pm
by RayanMargham
It was a far return meme in the asm code.
Solved now!!