Current best practice for sleeping in a device driver?

Hi,

Sorry for my ignorance, but what is the current best practice to sleep in a blocking device driver, waiting for an interrupt to take place? I couldn't find anything reasonable.

In many other contexts I've seen people use `sleep` and `wakeup`, and I even went so far as to try to introduce a variant of `thread_sleep` that can be entered with interrupts disabled (see below), but I couldn't get it working.

Here is an excerpt of my current code:

void isr_saadc(void) {
    NRF_SAADC->EVENTS_END = 0;
    NRF_SAADC->INTENCLR = SAADC_INTEN_DONE_Msk;
    thread_wakeup(sleeper);
}

int adc_sample(adc_t line, adc_res_t res)
{
    ...
    /* enable EVENTS_END interrupt */
    const uint32_t irqs = irq_disable();
    sleeper = thread_getpid();
    NRF_SAADC->INTENSET = SAADC_INTEN_END_Msk; /* Disabled at ISR handler */

    ...

    /* Wait for the interrupt */
    thread_sleep_with_interrupts_disabled(irqs);

This doesn't work. The thread doesn't wake up, even though the interrupt does take place and `thread_wakeup` states that the thread _is_ sleeping when called:

Created thread idle. PID: 1. Priority: 15.
Created thread main. PID: 2. Priority: 7.
main(): This is RIOT! (Version: 2018.10-devel-585-gd2de4-dhcp-85-86.eduroam.aalto.fi-feature-refactor-ble-core)
Entering main function.
Entering main loop.
thread_wakeup: Trying to wakeup PID 2...
thread_wakeup: Thread is sleeping.
thread_wakeup: Trying to wakeup PID 2...
thread_wakeup: Thread is sleeping.

And then nothing happens.

Here is the tentative patch to introduce `thread_sleep_with_interrupts_disabled`:

```
diff --git a/core/include/thread.h b/core/include/thread.h
index 64a828ab3..4abdb3d25 100644
--- a/core/include/thread.h
+++ b/core/include/thread.h
@@ -399,6 +399,20 @@ int thread_getstatus(kernel_pid_t pid);
  */
 void thread_sleep(void);
 
+/**
+ * @brief Puts the current thread into sleep mode with interrupts already disabled.
+ *
+ * @param[in] interrupt_state Saved interrupt state from irq_disable().
+ *
+ * Puts the calling thread to sleep and restores interrupts to
+ * `interrupt_state` before calling thread_yield_higher().
+ * Meant to be called from drivers that want to be woken up
+ * from an interrupt routine.
+ * The calling thread is expected to be woken up externally.
+ */
+void thread_sleep_with_interrupts_disabled(int interrupt_state);
```

Hi Pekka,

What we have done in several other places is to use a mutex. The device driver blocking code will attempt to lock the mutex twice, and the ISR will unlock it to let the thread continue. Does that work in your use case?

Best regards, Joakim

Den tis 11 sep. 2018 16:40Nikander Pekka <pekka.nikander@aalto.fi> skrev:

Hi Joakim,

I tried exactly that as well, but for some reason I couldn’t make it work. When I tried to unlock it in the ISR, the thread waiting for the mutex for the second time still didn’t acquire it, but was blocked again in the scheduler. I didn’t dive deep into the scheduler to see why. Maybe I made a mistake.

Could you please point to some example code?

–Pekka

Hi again,

Are you on a Cortex-M platform? Did you remember to add cortexm_isr_end() at the end of the ISR function?

/Joakim

Den tis 11 sep. 2018 20:21Nikander Pekka <pekka.nikander@aalto.fi> skrev:

Thanks Joakim,

No, I didn’t know about cortexm_isr_end(). (See also at the end.) Adding it, locking the mutex twice and releasing it in the ISR handler works now.

However, if I use the same mutex as the driver is using anyway, there is a potential conflict window:

    void isr(void) {
        mutex_unlock(&lock);
        cortexm_isr_end();
    }

    int blocking_driver_function(...) {
        /* Block the driver from other threads */
        mutex_lock(&lock);
        ....
        /* Prevent the interrupt from happening too early */
        uint32_t irq_state = irq_disable();
        ....
        /* Wait for an interrupt */
        irq_restore(irq_state);
        // NOTE: Potential conflict window here
        mutex_lock(&lock);
        ...
        /* Release the driver for other threads */
        mutex_unlock(&lock);
    }

If the IRQ takes place early, it will be served between irq_restore() and mutex_lock(). The driver appears to work, since the ISR handler releases the mutex and the blocking function just reacquires the mutex without blocking.

However, if a HIGHER priority thread is also ready to run and accesses the driver, it will get the mutex before the waiting thread, allowing two threads into the driver at the same time. Not good.

Of course, I can use a separate mutex and lock it twice. But that is not beautiful.

Furthermore, locking a mutex twice from the same thread is not considered standard practice. Some mutex implementations even panic if you do that. Hence, I suspect not many people think of it as a potential solution.

Hence, I would suggest that we add a couple of new functions, for example:

   void thread_sleep_keep_mutex(mutex_t *m);
   void thread_wakeup_mutex(mutex_t *);

The first one would basically just lock the mutex a second time, but it could be called with interrupts already disabled. It would enable interrupts while the thread is sleeping, and disable them again before returning, if they were disabled when called. As an additional safeguard, it would check that the calling thread already possesses the mutex.

The second one would wake up one thread sleeping on the mutex. In practice, it would be just syntactic sugar over mutex_unlock().

If that makes sense, I can make an initial implementation and open a PR.

Unfortunately I don’t have time to properly test it or to write test cases. For that I would need some help.

Now, looking at _mutex_lock() in this light, I started to wonder if there is a similar potential conflict window:


else if (blocking) {
thread_t *me = (thread_t*)sched_active_thread;
DEBUG("PID[%" PRIkernel_pid "]: Adding node to mutex queue: prio: %"
PRIu32 "\n", sched_active_pid, (uint32_t)me->priority);
sched_set_status(me, STATUS_MUTEX_BLOCKED);
if (mutex->queue.next == MUTEX_LOCKED) {
mutex->queue.next = (list_node_t*)&me->rq_entry;
mutex->queue.next->next = NULL;
}
else {
thread_add_to_list(&mutex->queue, me);
}
irq_restore(irqstate);
------>
thread_yield_higher();
/* We were woken up by scheduler. Waker removed us from queue.
* We have the mutex now. */
return 1;

Could bad things happen there between irq_restore() and thread_yield_higher(), as the interrupts may be enabled before PendSV is set and the thread stack changed (on Cortex-M)? I am not sure, but I suspect so.

AFAICS, there is at least a potential performance issue here, since if there is an interrupt there that does not send PendSV, the usual PendSV interrupt optimisation will not take place.

BTW, w.r.t. cortexm_isr_end(), I didn’t even think of needing to set PendSV there, since in my Cortex-M scheduler (several years ago) I set the PendSV flag directly in the scheduler and synchronisation primitives themselves, whenever needed.

Indeed, looking at the code, I think the sched_context_switch_request should be replaced by a suitable inline function.

Should I open an issue on replacing sched_context_switch_request with inline functions for setting, clearing, and testing it? Once the inline functions are in place, the code could then be optimised for various CPUs, allowing e.g. the Cortex-M support to get rid of cortexm_isr_end().

–Pekka

Hi Joakim, all,

It turns out that locking the mutex twice sometimes blocks the thread. Or at least it does that in my implementation, when running with multiple other threads.

My updated nRF52 ADC driver can be found here:

   https://github.com/AaltoNEPPI/RIOT/blob/feature-nrf52-adc-interrupts/cpu/nrf52/periph/adc.c#L171

The commit that changes from polling to interrupts is here:

  https://github.com/AaltoNEPPI/RIOT/commit/4eaf9ff275b903bf2eab97874cc6171e2cbbc974

I'm currently working on this on our own custom board (made by Aalto students) with custom peripherals, so there isn't any easy way I could provide a test program. But I'll make one unless I manage to fix this.

When running just one thread, the ADC driver works just fine. However, when I combine it with my main program, with multiple threads, the "CT" thread that calls the ADC driver blocks at some point. All other threads continue running and the system is stable.

ps shows as follows:

	pid | name                 | state    Q | pri | stack  ( used) | base addr  | current     
	  - | isr_stack            | -        - |   - |   8192 (  712) | 0x200022b8 | 0x20004268
	  1 | idle                 | pending  Q |  15 |   1024 (  192) | 0x20004abc | 0x20004dfc 
	  2 | main                 | bl rx    _ |   7 |   1536 (  612) | 0x200044bc | 0x2000499c 
	  3 | BLE                  | bl rx    _ |   2 |   2048 ( 2044) | 0x20006080 | 0x20006794 
	  4 | led_thread           | bl rx    _ |   6 |   2048 ( 2044) | 0x200068fc | 0x2000700c 
	  5 | MPU                  | bl rx    _ |   5 |   2048 ( 2044) | 0x20004f4c | 0x20005644 
	  6 | CT                   | bl mutex _ |   8 |   2048 ( 2044) | 0x20007114 | 0x200077cc 
	  7 | shell                | running  Q |   8 |   2048 ( 2044) | 0x200057fc | 0x20005e14 
	    > SUM                  |            |     |  20992 (11736)

There are no other mutexes in the CT thread; hence, I'm 99.9% sure that it blocks at the ADC driver mutex.

I'll proceed next to implement what I suggested earlier:

Hence, I would suggest that we add a couple of new functions, for example:

   void thread_sleep_keep_mutex(mutex_t *m);
   void thread_wakeup_mutex(mutex_t *);

The first one would basically just lock the mutex a second time, but it could be called with interrupts already disabled. It would enable interrupts while the thread is sleeping, and disable them again before returning, if they were disabled when called. As an additional safeguard, it would check that the calling thread already possesses the mutex.

The second one would wake up one thread sleeping on the mutex. In practice, it would be just syntactic sugar over mutex_unlock().

I'll report what happens.

--Pekka

Hi Pekka,

It turns out that locking the mutex twice sometimes blocks the thread. Or at least it does that in my implementation, when running with multiple other threads.

If I understand correctly, you'd like to block a user thread until an ISR arrives, and there might be multiple user threads, right?

// initialize mutex locked
mutex_t isr_lock = MUTEX_INIT_LOCKED;

isr() {
    // to wake thread:
    mutex_unlock(&isr_lock);
}

user_thread() {
    // I guess only one user thread should interact with the device
    mutex_lock(&driver_mutex);

    // this blocks until ISR unlocks
    mutex_lock(&isr_lock);

    // possibly disable ISR here

    ... // handle ISR

    if (mutex_trylock(&isr_lock)) {
        // mutex was unlocked since last lock
        // handle if necessary
    }

    // possibly restore ISR here

    // release driver
    mutex_unlock(&driver_mutex);
}

Does this capture your use case?

Kaspar

Hi Kaspar,

Well, that has the drawback of needing two mutexes per driver. But I'll try that first, as the RIOT mutexes don't seem to keep track of who is holding them, only of which threads are waiting on them.

--Pekka

Hi again Pekka, see my replies inline below.

Thanks Joakim,

No, I didn't know about cortexm_isr_end(). (See also at the end.) Adding it, locking the mutex twice and releasing it in the ISR handler works now.

However, if I use the same mutex as the driver is using anyway, there is a potential conflict window:

   void isr(void) {
       mutex_unlock(&lock);
       cortexm_isr_end();
   }

   int blocking_driver_function(...) {
       /* Block the driver from other threads */
       mutex_lock(&lock);
       ....
       /* Prevent the interrupt from happening too early */
       uint32_t irq_state = irq_disable();
       ....
       /* Wait for an interrupt */
       irq_restore(irq_state);
       // NOTE: Potential conflict window here
       mutex_lock(&lock);
       ...
       /* Release the driver for other threads */
       mutex_unlock(&lock);
   }

If the IRQ takes place early, it will be served between irq_restore() and mutex_lock(). The driver appears to work, since the ISR handler releases the mutex and the blocking function just reacquires the mutex without blocking.

However, if a HIGHER priority thread is also ready to run and accesses the driver, it will get the mutex before the waiting thread, allowing two threads into the driver at the same time. Not good.

Of course, I can use a separate mutex and lock it twice. But that is not beautiful.

A separate mutex to ensure mutual exclusion for the driver calls would be preferable IMO. The ISR can only serve one thread at a time anyway, so it does not seem like your driver will be usable from parallel threads.

Furthermore, locking a mutex twice from the same thread is not considered standard practice. Some mutex implementations even panic if you do that. Hence, I suspect not many people think of it as a potential solution.

The mutex here is not used as a traditional mutex, but rather as a binary semaphore. Right now there is no semaphore API in RIOT, but there may be a need for it. A different approach to achieve a binary semaphore in RIOT is to use thread flags (http://api.riot-os.org/group__core__thread__flags.html#gab3113f4c21484922b730e372eae6dc0a). Then you will need to pass the PID number to the ISR instead of a mutex pointer. See the Kinetis I2C driver for one example of using thread flags.

Hence, I would suggest that we add a couple of new functions, for example:

   void thread_sleep_keep_mutex(mutex_t *m);
   void thread_wakeup_mutex(mutex_t *);

The first one would basically just lock the mutex a second time, but it could be called with interrupts already disabled. It would enable interrupts while the thread is sleeping, and disable them again before returning, if they were disabled when called. As an additional safeguard, it would check that the calling thread already possesses the mutex.

The second one would wake up one thread sleeping on the mutex. In practice, it would be just syntactic sugar over mutex_unlock().

If that makes sense, I can make an initial implementation and open a PR.

Unfortunately I don't have time to properly test it or to write test cases. For that I would need some help.

I too have far too little time to spend on these things.

Now, looking at _mutex_lock() in this light, I started to wonder if there is a similar potential conflict window:

    else if (blocking) {
        thread_t *me = (thread_t*)sched_active_thread;
        DEBUG("PID[%" PRIkernel_pid "]: Adding node to mutex queue: prio: %"
              PRIu32 "\n", sched_active_pid, (uint32_t)me->priority);
        sched_set_status(me, STATUS_MUTEX_BLOCKED);
        if (mutex->queue.next == MUTEX_LOCKED) {
            mutex->queue.next = (list_node_t*)&me->rq_entry;
            mutex->queue.next->next = NULL;
        }
        else {
            thread_add_to_list(&mutex->queue, me);
        }
        irq_restore(irqstate);
------>
        thread_yield_higher();
        /* We were woken up by scheduler. Waker removed us from queue.
         * We have the mutex now. */
        return 1;

Could bad things happen there between irq_restore() and thread_yield_higher(), as the interrupts may be enabled before PendSV is set and the thread stack changed (on Cortex-M)? I am not sure, but I suspect so.

I did not have time to investigate this, but will there really be an issue when the current thread is already blocked inside the critical section with IRQs disabled?

AFAICS, there is at least a potential performance issue here, since if there is an interrupt there that does not send PendSV, the usual PendSV interrupt optimisation will not take place.

Not sure what you mean by the "usual PendSV interrupt optimization". It was a while since I looked at the scheduler code.

BTW, w.r.t. cortexm_isr_end(), I didn't even think of needing to set PendSV there, since in my Cortex-M scheduler (several years ago) I set the PendSV flag directly in the scheduler and synchronisation primitives themselves, whenever needed.

Platform specific optimized synchronization primitives could possibly yield a performance boost, but we need to make sure that there are comprehensive regression tests that can detect any deviations from the generic implementation behavior. This may take significant effort to implement...

Indeed, looking at the code, I think the `sched_context_switch_request` should be replaced by a suitable inline function.

Should I open an issue on replacing `sched_context_switch_request` with inline functions for setting, clearing, and testing it? Once the inline functions are in place, the code could then be optimised for various CPUs, allowing e.g. the Cortex-M support to get rid of cortexm_isr_end().

Getting rid of cortexm_isr_end would help reduce hard-to-spot mistakes in the periph drivers, but this is a very important piece of code and any changes must be made carefully and with confidence. I am not sure how to create an automatic test for this, since there are so many different drivers which affect this flag and all are related to specific hardware modules in the CPU.

/Joakim