Your how-the-heck-did-it-worked moments?

Artlav
Member
Posts: 178
Joined: Fri Aug 21, 2009 5:54 am
Location: Moscow, Russia

Your how-the-heck-did-it-worked moments?

Post by Artlav »

Have you ever found a bug so evil that you wondered how the heck it ever worked at all?
I guess you have, so tell the story.

To start: I was recently debugging the filesystem handling code, looking for a weird little problem that popped up in a multithreaded write test. Every now and then, file contents were swapped between threads.
Eventually I got down to the thread handling, where I found that the thread-private areas of all threads were reallocated on each thread creation! Thus, any of the thread-private variables could get randomized.
Fixing that made everything collapse.
Luck of the memory manager until now? Not quite. It turned out that call gates were not thread-safe, and a reply could be returned to any thread that happened to be calling the same module at the same time. Thus, threads merged and swapped their data every now and then, with the two problems somehow cancelling each other out most of the time, at the cost of rare random crashes.

And an entire damn OS was working fine on top of that!
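
The reply mix-up described above can be modelled roughly as follows. This is only a single-threaded sketch under loud assumptions: `shared_gate`, `per_thread_gate`, `module_handle` and the thread IDs are hypothetical names invented for illustration, not the actual OS code, and the "threads" are simulated by plain sequential calls so the interleaving is deterministic.

```c
#include <assert.h>

#define MAX_THREADS 4  /* hypothetical limit, for illustration only */

/* Buggy model: one reply slot shared by every caller of the module. */
struct shared_gate {
    int reply;
};

/* Fixed model: one reply slot per calling thread. */
struct per_thread_gate {
    int reply[MAX_THREADS];
};

/* Stand-in for the called module: it just doubles the request. */
static int module_handle(int request)
{
    return request * 2;
}

/* Buggy call: the reply overwrites whatever an earlier caller left there. */
static void shared_call(struct shared_gate *g, int request)
{
    g->reply = module_handle(request);
}

/* Buggy collect: returns the most recent reply, whoever asked for it. */
static int shared_collect(const struct shared_gate *g)
{
    return g->reply;
}

/* Fixed call: the reply is stored in the slot of the calling thread. */
static void per_thread_call(struct per_thread_gate *g, int tid, int request)
{
    g->reply[tid] = module_handle(request);
}

/* Fixed collect: each thread only ever reads its own slot. */
static int per_thread_collect(const struct per_thread_gate *g, int tid)
{
    return g->reply[tid];
}
```

If simulated thread A calls with 1 and thread B calls with 10 before A collects its reply, the shared gate hands A the value 20 (B's reply), while the per-thread gate still gives A its own 2.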
TylerH
Member
Posts: 285
Joined: Tue Apr 13, 2010 8:00 pm

Re: Your how-the-heck-did-it-worked moments?

Post by TylerH »

Next time I have a problem, I'll just delete a random "volatile". (Thus making the number of bugs even.) That should fix it.
Artlav
Member
Posts: 178
Joined: Fri Aug 21, 2009 5:54 am
Location: Moscow, Russia

Re: Your how-the-heck-did-it-worked moments?

Post by Artlav »

Interference pattern of bugs.
You disentangle them, the pattern breaks, and you see black spots.
So, just add a bug that would light them up. :)
Solar
Member
Posts: 7615
Joined: Thu Nov 16, 2006 12:01 pm
Location: Germany

Re: Your how-the-heck-did-it-worked moments?

Post by Solar »

A couple of days ago I asked a customer from another department (who's using one of our libs in his product) what debugger he's using on the target machine (AIX).

His answer:

"None. We do development on Windows, and when it works there we just recompile everything on AIX, and it should work there, too."

(Note that we didn't deliver a Windows version of our libs - I assume he stubbed them for his "debugging".)

Here I ask myself, how the heck does this whole thing work? I mean, as a company? :roll:
Every good solution is obvious once you've found it.
xenos
Member
Posts: 1118
Joined: Thu Aug 11, 2005 11:00 pm
Libera.chat IRC: xenos1984
Location: Tartu, Estonia

Re: Your how-the-heck-did-it-worked moments?

Post by xenos »

I just had one of these moments...

While going through some forum posts, I found this post and wondered what Bochs' debugger would tell me if I entered "info idt" to see the IDT contents of the 64-bit port of my kernel. Most of it looked fine (exception handlers, IRQs mapped to 0x20 - 0x2f...), but for interrupts 0x80 and above the table contained nonsense which looked like GDT entries for code and data segments. In other words, the GDT was located right in the middle of the IDT.

It turned out that I had simply copied parts of my 32-bit ldscript to the 64-bit ldscript, including the part where I reserve some space for the IDT and GDT. In 32-bit mode I reserve 256 * 8 = 2k for the IDT, so I just did the same in 64-bit mode, completely forgetting that IDT descriptors are now 16 bytes long instead of 8. So I changed the 2k to 4k (256 * 16) in my ldscript, and now the result of "info idt" makes more sense. Fortunately I found this bug before using interrupts 0x80 and above...
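
The size arithmetic can be sketched in C like this. It is a minimal illustration, not the actual kernel code: the struct and function names are hypothetical, and the field layouts follow the usual x86 gate descriptor formats. The compile-time asserts are one possible way to catch a stale 2k reservation early.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IDT_VECTORS 256  /* x86 defines 256 interrupt vectors */

/* 32-bit protected-mode gate descriptor: 8 bytes. */
struct idt_entry32 {
    uint16_t offset_low;   /* handler address bits 0..15 */
    uint16_t selector;     /* code segment selector */
    uint8_t  reserved;     /* must be zero */
    uint8_t  type_attr;    /* gate type, DPL, present bit */
    uint16_t offset_high;  /* handler address bits 16..31 */
};

/* 64-bit long-mode gate descriptor: 16 bytes. */
struct idt_entry64 {
    uint16_t offset_low;   /* handler address bits 0..15 */
    uint16_t selector;     /* code segment selector */
    uint8_t  ist;          /* interrupt stack table index */
    uint8_t  type_attr;    /* gate type, DPL, present bit */
    uint16_t offset_mid;   /* handler address bits 16..31 */
    uint32_t offset_high;  /* handler address bits 32..63 */
    uint32_t reserved;     /* must be zero */
};

/* Fail the build if the layouts do not have the expected sizes. */
static_assert(sizeof(struct idt_entry32) == 8, "32-bit gate must be 8 bytes");
static_assert(sizeof(struct idt_entry64) == 16, "64-bit gate must be 16 bytes");

/* Bytes needed for a full IDT with entries of the given size. */
static size_t idt_bytes(size_t entry_size)
{
    return IDT_VECTORS * entry_size;
}

/* Number of whole entries that fit into a reserved region. */
static size_t vectors_that_fit(size_t region_bytes, size_t entry_size)
{
    return region_bytes / entry_size;
}
```

With a 2k region and 16-byte entries, `vectors_that_fit` gives 128 (0x80), which matches where the GDT contents started to show up in the table.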

And now I'm struggling with the next problem - GRUB2 doesn't boot from an iso image in Bochs 2.4.6, giving me some error message about an "Unaligned pointer"... The same iso image boots fine in QEMU, VBox and SimNow!, so I guess it will take some time to figure out where the problem is...
Programmers' Hardware Database // GitHub user: xenos1984; OS project: NOS