Cond_wait() function rationale

Hi,

While reviewing your implementation of condition variables I didn’t understand the rationale behind when the mutex is given back (or unlocked, as you call it):

void cond_wait(cond_t *cond, mutex_t *mutex)
{
    unsigned irqstate = irq_disable();
    thread_t *me = thread_get_active();

    mutex_unlock(mutex);                       //// Why did you give the mutex here...
    sched_set_status(me, STATUS_COND_BLOCKED);
    thread_add_to_list(&cond->queue, me);      
    irq_restore(irqstate);                     //// ... and not here ...
    thread_yield_higher();
                                               //// ... or here?
    /*
     * Once we reach this point, the condition variable was signalled,
     * and we are free to continue.
     */
    mutex_lock(mutex);
}

If you release the mutex before you insert the thread into the queue, there’s a risk of corrupting the queue if another thread acquires the mutex and inserts or retrieves an item before that insertion happens.

Thank you!

Welcome @fjrg76. First a disclaimer: I am not the implementer of that function, so what I state is only conjecture. But due to irq_disable() no context/thread switch can happen, as context switching in RIOT is handled in an ISR. So as soon as irq_disable() is called, there is no other thread able to acquire the mutex. As far as I can see it is even safer to unlock the mutex within the IRQ-disabled section (which ends with irq_restore(irqstate)), as doing it outside could cause a race condition.

This is generally true, but mutex_unlock() is an exception: it actively calls sched_switch() with the prio of the next waiting thread in the “waits for this mutex” list (which is sorted by prio).

It seems worth validating that this is indeed correct and, if so, adding some additional comments.

@miri64 and @maribu, thanks for your prompt reply.

Let’s see the mutex_unlock() code:

void mutex_unlock(mutex_t *mutex)
{
    unsigned irqstate = irq_disable();

    DEBUG("PID[%" PRIkernel_pid "] mutex_unlock(): queue.next: %p\n",
          thread_getpid(), (void *)mutex->queue.next);

    if (mutex->queue.next == NULL) {
        /* the mutex was not locked */
        irq_restore(irqstate);
        return;
    }

    if (mutex->queue.next == MUTEX_LOCKED) {
        mutex->queue.next = NULL;
        /* the mutex was locked and no thread was waiting for it */
        irq_restore(irqstate);
        return;
    }

    list_node_t *next = list_remove_head(&mutex->queue);

    thread_t *process = container_of((clist_node_t *)next, thread_t, rq_entry);

    DEBUG("PID[%" PRIkernel_pid "] mutex_unlock(): waking up waiting thread %"
          PRIkernel_pid "\n", thread_getpid(),  process->pid);
    sched_set_status(process, STATUS_PENDING);

    if (!mutex->queue.next) {
        mutex->queue.next = MUTEX_LOCKED;
    }

    uint16_t process_priority = process->priority;

    irq_restore(irqstate);
    sched_switch(process_priority);
}

The first instruction saves the IRQ state, which, for the analysis we’re doing, happens to be disabled, so IRQs will still be disabled when leaving the function. I guess the last instruction might signal that a context switch must take place at the next available system tick. So what @miri64 pointed out seems to be correct: as IRQs were disabled in cond_wait(), there is no chance that the CPU is preempted while calling mutex_unlock(), until cond_wait() enables them again.
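
To illustrate what I mean by the saved state, here is how I read the irq_disable()/irq_restore() contract (this snippet is just my own illustration, not code from the tree): a nested disable/restore pair inside an already IRQ-disabled section leaves IRQs off for the whole outer section.

unsigned outer = irq_disable();  /* cond_wait(): IRQs are now disabled             */
unsigned inner = irq_disable();  /* mutex_unlock(): already off, 'inner' records   */
                                 /* the "disabled" state                           */
irq_restore(inner);              /* restores "disabled": IRQs stay off             */
irq_restore(outer);              /* cond_wait(): IRQs go back to the original state */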

Note that even though the CPU is not interrupted by IRQs, the calling thread can still be preempted, as mutex_unlock() explicitly calls sched_switch() (last line of the function). Without any closer investigation, I think it is possible that in this corner case the higher-priority thread could indeed signal the condition variable before thread_add_to_list(&cond->queue, me) is called. It would be nice to extend the unit tests to construct exactly this corner case.
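
Something along these lines might already construct it. This is a completely untested sketch; the thread setup, priorities, and flags are my own assumptions, not an existing test:

#include <stdio.h>

#include "thread.h"
#include "mutex.h"
#include "cond.h"

static mutex_t lock = MUTEX_INIT;
static cond_t cv = COND_INIT;
static char stack_high[THREAD_STACKSIZE_DEFAULT];

/* Higher-priority thread: blocks on the mutex and signals as soon as
 * mutex_unlock() inside cond_wait() hands the lock over. */
static void *high_prio_signaller(void *arg)
{
    (void)arg;
    mutex_lock(&lock);   /* blocks until main enters cond_wait()             */
    cond_signal(&cv);    /* may fire before the waiter is on the cond queue  */
    mutex_unlock(&lock);
    return NULL;
}

int main(void)
{
    mutex_lock(&lock);
    thread_create(stack_high, sizeof(stack_high),
                  THREAD_PRIORITY_MAIN - 1,   /* higher prio than main */
                  THREAD_CREATE_STACKTEST,
                  high_prio_signaller, NULL, "signaller");

    /* If the corner case exists, the signal is lost and this never returns. */
    cond_wait(&cv, &lock);
    puts("woke up - no lost wakeup in this run");
    mutex_unlock(&lock);
    return 0;
}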

I also think that reordering the instructions would make it possible to get rid of disabling IRQs altogether, as only the owner of the mutex is allowed to operate on the cond variable, right? Hence, no concurrent accesses are done anyway.

Hi,

You’re right. sched_switch() calls thread_yield_higher() whenever a regular thread made the call (i.e., not from inside an ISR). thread_yield_higher() in turn calls architecture_specific_context_save(), architecture_specific_sched_run() and architecture_specific_context_restore(), in that specific order. That makes me think that mutex_unlock() actually performs a context switch, because architecture_specific_sched_run() retrieves a new context.

Let me insert the original code again:

void cond_wait(cond_t *cond, mutex_t *mutex)
{
    unsigned irqstate = irq_disable();
    thread_t *me = thread_get_active();

    mutex_unlock(mutex);                       //// Potential context switch!!!
    sched_set_status(me, STATUS_COND_BLOCKED);
    thread_add_to_list(&cond->queue, me);      
    irq_restore(irqstate);
    thread_yield_higher();

    /*
     * Once we reach this point, the condition variable was signalled,
     * and we are free to continue.
     */
    mutex_lock(mutex);
}

I can’t suggest any new order, because you are the ones who wrote the kernel and you have the testbeds, but mutex_unlock() might be placed after irq_restore(irqstate) (oops, I did it!).
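
Just to make that hint concrete, this is the order I have in mind. It is a completely untested sketch, only meant to illustrate the placement; whether it is actually safe is exactly what you would have to judge:

void cond_wait(cond_t *cond, mutex_t *mutex)
{
    unsigned irqstate = irq_disable();
    thread_t *me = thread_get_active();

    sched_set_status(me, STATUS_COND_BLOCKED);
    thread_add_to_list(&cond->queue, me);
    irq_restore(irqstate);
    mutex_unlock(mutex);    //// Potential context switch, but we are already
                            //// queued on the condition variable
    thread_yield_higher();

    /*
     * Once we reach this point, the condition variable was signalled,
     * and we are free to continue.
     */
    mutex_lock(mutex);
}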