SMP and the lost + wake-up problem

kzinti

Re: SMP and the lost + wake-up problem

Post by kzinti »

Sounds reasonable. Essentially you are adding a spinlock in the TCB to say that the thread is actually running.
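
To make the idea concrete, here is a minimal sketch of such a flag using C11 atomics. The `struct tcb` layout and the names `on_cpu`, `tcb_try_claim`, `tcb_claim` and `tcb_release` are made up for illustration and are not taken from either kernel in this thread. The assumption is that a CPU must claim a thread before switching to it, and that the previous CPU releases the thread only once it is completely off that thread's stack.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical TCB: only the field relevant to this race is shown. */
struct tcb {
    atomic_bool on_cpu;   /* true while some CPU is still using this thread's stack */
};

/* Try to claim a thread for this CPU. Fails while the thread is still
 * marked as running elsewhere, i.e. its last switch-out has not completed. */
static bool tcb_try_claim(struct tcb *t)
{
    bool expected = false;
    return atomic_compare_exchange_strong_explicit(
        &t->on_cpu, &expected, true,
        memory_order_acquire, memory_order_relaxed);
}

/* Spin until the thread can be claimed. */
static void tcb_claim(struct tcb *t)
{
    while (!tcb_try_claim(t)) {
        /* a real kernel would put a CPU pause/relax hint here */
    }
}

/* Called by the old CPU once it has fully switched away from t's stack. */
static void tcb_release(struct tcb *t)
{
    assert(atomic_load_explicit(&t->on_cpu, memory_order_relaxed));
    atomic_store_explicit(&t->on_cpu, false, memory_order_release);
}
```

The release store pairs with the acquire in the claim, so the CPU that picks the thread up should see everything the old CPU wrote while saving the thread's context.
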
thewrongchristian

Re: SMP and the lost + wake-up problem

Post by thewrongchristian »

kzinti wrote: I might have found another problem... You could end up with a CPU picking up a "suspended" thread that is not yet really suspended:


CPU 1 thread A                  CPU 2 thread B

A: enter monitor
A: try to get mutex (fail)
A: queue in wait list
A: leave monitor
                                 B: enter monitor
                                 B: release mutex
                                 B: wakeup thread A
                                 B: leave monitor
                                 B: schedule()
                                 A: thread A starts to run
A: schedule()
C: thread C starts to run
How would your code prevent the above from happening? This is basically revisiting something I previously mentioned: when you add the current thread to the monitor's wait list, it is still running. The CPU is still using its stack and address space at a minimum. That stack should not be used by another CPU until your thread switch completes on the current CPU.

PS: I am not trying to point out flaws in your implementation, I am just trying to fix mine and understand how to do it :).
I'm just glad I've got other eyes on my code. Much of this is virgin code that has not seen the light of day in the real world, since it is still unreleased.

I think you're right, and your later suggestion of a spin lock in the TCB itself would be a perfect solution to this.

Thanks!
kzinti

Re: SMP and the lost + wake-up problem

Post by kzinti »

Of course the problem with this spinlock or flag inside the TCB is that after the switch, you need to know which task was running before the switch. I am not sure I like this.
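
One way to carry that information across the switch is to stash the outgoing thread in per-CPU state just before switching and to clear its flag as the first thing done afterwards. The sketch below only simulates this in plain C: `struct cpu`, `prev`, `switch_to` and `finish_switch` are hypothetical names, and the actual register and stack switch is reduced to a comment. It also assumes the incoming thread has already been claimed (its `on_cpu` is set) before `switch_to` is called.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct tcb {
    atomic_bool on_cpu;   /* true while a CPU is still using this thread's stack */
};

/* Hypothetical per-CPU state. */
struct cpu {
    struct tcb *current;  /* thread this CPU is running */
    struct tcb *prev;     /* thread we just switched away from, if any */
};

/* Runs on the NEW thread's stack right after the switch. Only now is it
 * safe for another CPU to pick up the previous thread. */
static void finish_switch(struct cpu *c)
{
    struct tcb *prev = c->prev;
    if (prev != NULL) {
        c->prev = NULL;
        atomic_store_explicit(&prev->on_cpu, false, memory_order_release);
    }
}

static void switch_to(struct cpu *c, struct tcb *next)
{
    /* Assumption: the caller already claimed next. */
    assert(atomic_load_explicit(&next->on_cpu, memory_order_acquire));
    c->prev = c->current;     /* remember who was running before the switch */
    c->current = next;
    /* the architecture-specific stack/register switch would happen here */
    finish_switch(c);
}
```

In a real kernel a brand-new thread never returns from `switch_to`, so its entry trampoline would also have to call `finish_switch`. As far as I know, Linux handles this in a broadly similar way by passing the previous task to a post-switch hook.
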

I've also looked into using futexes and I am not sure they solve the problem. But in light of the above, I might try to implement futexes nevertheless since I want them in the end for my user space locking needs.