Your how-the-heck-did-it-worked moments?

Posted: Wed May 11, 2011 3:36 am
by Artlav
Have you ever found a bug so evil that you wondered how the heck it all worked at all?
I guess you have, so tell the story.

To start: I was recently debugging the filesystem handling code, looking for a weird tiny problem that popped up in a multithreaded write test. File contents got swapped between threads every now and then.
Eventually I got down to the thread handling, where I found that the thread-private areas of all threads were reallocated on each thread creation! Thus, any thread-private variable could get randomized.
Fixing that made everything collapse.
Luck of the memory manager until now? Not quite. It turned out that call gates were not thread-safe, and the reply could be returned to any thread that had called the same module at the same time. Thus, threads merged and swapped their data every now and then, with the two problems somehow cancelling each other out most of the time, at the cost of rare random crashes.

And an entire damn OS was working fine on top of that!
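
The reply mix-up described above can be illustrated with a small sketch. This is not the actual call gate code of the OS in question, which is not shown in the post; every name here (shared_reply, call_ctx, unsafe_call, safe_call) is made up for illustration, and the "threads" are simulated by replaying one bad interleaving in a single thread so the outcome is deterministic.

```c
#include <assert.h>

/* Hypothetical module call: it simply echoes the caller's id as the reply. */

/* Unsafe variant: a single reply slot shared by every caller of the module. */
static int shared_reply;

static inline void unsafe_call(int caller_id) { shared_reply = caller_id; }
static inline int unsafe_read_reply(void) { return shared_reply; }

/* Safer variant: each caller owns its reply slot and passes it in. */
struct call_ctx {
    int reply;
};

static inline void safe_call(struct call_ctx *ctx, int caller_id)
{
    ctx->reply = caller_id;
}

/* Replay the interleaving "A calls, B calls, A reads its reply"
 * and return the reply that caller A (id 1) ends up seeing. */
static inline int reply_seen_by_a_unsafe(void)
{
    unsafe_call(1);             /* caller A enters the module */
    unsafe_call(2);             /* caller B enters before A has read its reply */
    return unsafe_read_reply(); /* A picks up whatever is in the shared slot */
}

static inline int reply_seen_by_a_safe(void)
{
    struct call_ctx a = {0};
    struct call_ctx b = {0};
    safe_call(&a, 1);
    safe_call(&b, 2);
    (void)b;                    /* B's reply stays in B's own slot */
    return a.reply;
}
```

With the shared slot, caller A reads the reply meant for caller B; with a caller-owned slot, each caller gets its own reply back no matter how the calls interleave.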

Re: Your how-the-heck-did-it-worked moments?

Posted: Wed May 11, 2011 5:27 pm
by TylerH
Next time I have a problem, I'll just delete a random "volatile". (Thus making the number of bugs even.) That should fix it.

Re: Your how-the-heck-did-it-worked moments?

Posted: Thu May 12, 2011 1:07 pm
by Artlav
Interference pattern of bugs.
You disentangle them, the pattern breaks, and you see black spots.
So, just add a bug that would light them up. :)

Re: Your how-the-heck-did-it-worked moments?

Posted: Thu May 12, 2011 1:22 pm
by Solar
A couple of days ago I asked a customer from another department (who's using one of our libs in his product) what debugger he's using on the target machine (AIX).

His answer:

"None. We do development on Windows, and when it works there we just recompile everything on AIX, and it should work there, too."

(Note that we didn't deliver a Windows version of our libs - I assume he stubbed them for his "debugging".)

Here I ask myself, how the heck does this whole thing work? I mean, as a company? :roll:

Re: Your how-the-heck-did-it-worked moments?

Posted: Fri May 13, 2011 12:57 am
by xenos
I just had one of these moments...

While going through some forum posts, I found this post and wondered what Bochs' debugger would tell me if I entered "info idt" to see the IDT contents of the 64-bit port of my kernel. Most of it looked fine (exception handlers, IRQs mapped to 0x20 - 0x2f...), but for interrupts 0x80 and above the table contained nonsense that looked like GDT entries for code and data segments. In other words, the GDT was located right in the middle of the IDT.

It turned out that I had simply copied parts of my 32-bit ldscript into the 64-bit ldscript, including the part where I reserve some space for the IDT and GDT. In 32-bit mode, I reserve 256 * 8 = 2k for the IDT, so I just did the same in 64-bit mode - completely forgetting that IDT descriptors are now 16 bytes long instead of 8, so the table needs 256 * 16 = 4k. So I changed the 2k to 4k in my ldscript, and now the result of "info idt" makes more sense. Fortunately I found this bug before using interrupts 0x80 and above...
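
The arithmetic can be checked with a small sketch. The struct and function names below are made up for illustration and are not taken from the kernel in question; the field layouts follow the usual x86 gate descriptor formats (8 bytes in protected mode, 16 bytes in long mode). A compile-time size check like this is one way to catch a copied-over reservation early.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit protected-mode gate descriptor: 8 bytes. */
struct idt_gate32 {
    uint16_t offset_low;   /* handler address bits 0..15 */
    uint16_t selector;     /* code segment selector */
    uint8_t  reserved;
    uint8_t  type_attr;    /* gate type, DPL, present bit */
    uint16_t offset_high;  /* handler address bits 16..31 */
};

/* 64-bit long-mode gate descriptor: 16 bytes. */
struct idt_gate64 {
    uint16_t offset_low;   /* handler address bits 0..15 */
    uint16_t selector;     /* code segment selector */
    uint8_t  ist;          /* interrupt stack table index */
    uint8_t  type_attr;    /* gate type, DPL, present bit */
    uint16_t offset_mid;   /* handler address bits 16..31 */
    uint32_t offset_high;  /* handler address bits 32..63 */
    uint32_t reserved;
};

#define IDT_VECTORS 256

/* Fail the build if the structs do not match the descriptor sizes. */
static_assert(sizeof(struct idt_gate32) == 8, "32-bit gate must be 8 bytes");
static_assert(sizeof(struct idt_gate64) == 16, "64-bit gate must be 16 bytes");

/* Bytes needed for a full 256-entry IDT with the given gate size. */
static inline size_t idt_bytes(size_t gate_size)
{
    return IDT_VECTORS * gate_size;
}

/* First vector whose descriptor lies beyond the reserved space. */
static inline size_t first_vector_past(size_t reserved_bytes, size_t gate_size)
{
    return reserved_bytes / gate_size;
}
```

With only 2k reserved and 16-byte descriptors, first_vector_past(2048, 16) is 128, that is 0x80, which matches the vector where the GDT entries started showing up.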

And now I'm struggling with the next problem - GRUB2 doesn't boot from an ISO image in Bochs 2.4.6, giving me some error message about an "Unaligned pointer"... The same ISO image boots fine in QEMU, VBox and SimNow!, so I guess it will take some time to figure out where the problem is...