Diagnose Error with two threads blocking on receive

Hi,

I am new to RIOT and really impressed with the Getting Started guide. My project is to build Tony Hoare's CSP (Communicating Sequential Processes) on RIOT, and I am trying to get to grips with the behaviour of RIOT threads and message passing. Classic CSP needs threads to block on both msg_send and msg_receive.

I am surprised by the behaviour of this code in RIOT (see below)

What I expect is to have:

  • proc1 blocks on msg_receive
  • proc2 msg_send to proc1 succeeds and both procs are now scheduled
  • proc1 blocks on msg_send -or- proc2 blocks on msg_receive (doesn’t matter, the other one proceeds)
  • when the other proc gets there, the reply message (msg_bak) succeeds
  • and so it goes on

What I see is actually:

  • proc1 (p4) does msg_receive & blocks
  • proc2 (p6) does msg_send, succeeds & continues
  • proc2 does msg_receive & blocks

So if I run ps in the terminal, I see that both proc1 and proc2 are blocked on receive.

It looks like proc1 doesn’t get (re)scheduled after the first loop.

I have tried swapping priorities and setting to the same priority with no change.

Please can someone help me get a mental model for what is happening?

~librasteve

#include <stdio.h>
#include <string.h>
#include <inttypes.h>   /* PRIu32 */

#include "shell.h"
#include "thread.h"
#include "msg.h"
#include "xtimer.h"

static kernel_pid_t proc1_pid;
static char proc1_stack[THREAD_STACKSIZE_DEFAULT];

static kernel_pid_t proc2_pid;
static char proc2_stack[THREAD_STACKSIZE_DEFAULT];

static void *proc1(void *arg)
{
    (void)arg;

    msg_t msg_out, msg_bak;

    while (1) {
        msg_receive(&msg_out);
        printf("Received by proc1 %" PRIu32 "\n", msg_out.content.value);

        msg_bak.content.value = msg_out.content.value * 2;
        msg_send(&msg_bak, proc2_pid);
        printf("Sent by proc1 %" PRIu32 "\n", msg_bak.content.value);
    }

    return NULL;
}

static void *proc2(void *arg)
{
    (void)arg;

    msg_t msg_out, msg_bak;

    msg_out.content.value = 0;

    while(1) {
        msg_send(&msg_out, proc1_pid);
        printf("Sent by proc2 %" PRIu32 "\n", msg_out.content.value);
        msg_out.content.value++;

        msg_receive(&msg_bak);
        printf("Received by proc2 %" PRIu32 "\n", msg_bak.content.value);

        xtimer_sleep(2);   /* optional, just to see what is happening */
    }

    return NULL;
}

int main(void)
{
    puts("This is mimod7");

    proc1_pid = thread_create(proc1_stack, sizeof(proc1_stack),
                            THREAD_PRIORITY_MAIN - 3, 0, proc1, NULL, "proc1");
    proc2_pid = thread_create(proc2_stack, sizeof(proc2_stack),
                            THREAD_PRIORITY_MAIN - 1, 0, proc2, NULL, "proc2");

    char line_buf[SHELL_DEFAULT_BUFSIZE];
    shell_run(NULL, line_buf, SHELL_DEFAULT_BUFSIZE);

    return 0;
}

Hi @librasteve and welcome to the RIOT community!

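I think there’s a data race on the globals used to pass the PIDs. Mind you, the first thread_create() starts a thread with a higher priority (lower value) than the main thread, so that thread starts right away (before returning to main). The same holds for the second thread_create(): proc2 runs, and its message wakes proc1, before the call has returned and proc2_pid has been assigned. At that point, proc2_pid is unset, so the reply from proc1 has no valid target. The compiler is also free to read the global within proc1() at any time, so it might read garbage.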
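
One way to check this is to look at the return value of msg_send() in proc1, which the original code ignores. The fragment below is only a sketch: it is a drop-in variant of proc1() from the original post and relies on that program's includes and globals (plus <inttypes.h> for PRIu32). It assumes that msg_send() returns a negative value when the target PID does not refer to an existing thread, which is my reading of the msg.h documentation; please verify it against your RIOT version.

```c
/* Variant of proc1() from the original post; proc2_pid is the file-level
 * global defined there. */
static void *proc1(void *arg)
{
    (void)arg;

    msg_t msg_out, msg_bak;

    while (1) {
        msg_receive(&msg_out);
        printf("Received by proc1 %" PRIu32 "\n", msg_out.content.value);

        msg_bak.content.value = msg_out.content.value * 2;

        /* Assumption: a negative result means the target PID is invalid,
         * e.g. because proc2_pid has not been assigned yet. */
        int res = msg_send(&msg_bak, proc2_pid);
        if (res < 0) {
            printf("proc1: msg_send to PID %d failed (%d)\n",
                   (int)proc2_pid, res);
        }
        else {
            printf("Sent by proc1 %" PRIu32 "\n", msg_bak.content.value);
        }
    }

    return NULL;
}
```

If the failure branch is printed on the first pass through the loop, proc1 never delivered its reply, which would match both threads ending up blocked on receive.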

When you add THREAD_CREATE_WOUT_YIELD to the flags field of the thread_create() calls, and add a manual thread_yield() after them, your code works as expected.
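
As a rough sketch of that change, here is main() from the original post with only the flags argument and the extra thread_yield() altered. proc1, proc2, the stacks and the PID globals are assumed to be defined exactly as in the original program, and the behaviour described in the comments is my understanding of the scheduler rather than something taken from the documentation.

```c
#include <stdio.h>

#include "shell.h"
#include "thread.h"

/* Replacement for main() in the original program. */
int main(void)
{
    puts("This is mimod7");

    /* THREAD_CREATE_WOUT_YIELD: create the thread but do not switch to it
     * yet, so thread_create() returns and the PID is stored first. */
    proc1_pid = thread_create(proc1_stack, sizeof(proc1_stack),
                              THREAD_PRIORITY_MAIN - 3,
                              THREAD_CREATE_WOUT_YIELD,
                              proc1, NULL, "proc1");
    proc2_pid = thread_create(proc2_stack, sizeof(proc2_stack),
                              THREAD_PRIORITY_MAIN - 1,
                              THREAD_CREATE_WOUT_YIELD,
                              proc2, NULL, "proc2");

    /* Both PID globals are assigned now; hand the CPU to the new threads. */
    thread_yield();

    char line_buf[SHELL_DEFAULT_BUFSIZE];
    shell_run(NULL, line_buf, SHELL_DEFAULT_BUFSIZE);

    return 0;
}
```

The point of the ordering is that neither thread can run until both assignments in main() have completed, so proc1 sees a valid proc2_pid when it sends its first reply.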

Hope this helps.

Ah - thanks @Kaspar that is very helpful and all is working now.