To global seed or not to global seed

Cenk_Gundogan1 · 7 March 2017 23:18

Dear RIOTers,

for quite some time now we have had semi-active discussions (mostly on GitHub) about various PRNG and seeding concepts. In this mail I merely want to focus on the scope of our PRNG implementations' seeds/states.

Currently, the function calls in our PRNG implementations and in the abstraction layer `sys/random` do *not* allow passing a custom state. This means there is *one* global state that is overwritten by successive calls to `random_init()`, not necessarily from the same thread. At the moment, `auto_init` is the only place that calls `random_init()` (albeit with `0`, but this is another story). So everything seems to be OK for now (?), but will it stay this way? In the current state, to guarantee deterministic PRNG sequences, all calls to `random_init()` except the first *must* be ignored. Of course, this can be done in various ways:

1) We can define it as BCP to *not* call `random_init()` if `auto_init` is used. => It's hard to guarantee a one-time call to `random_init()`, as humans surely do err (especially if several nested modules are involved).

2) Check within `random_init()` for an uninitialized state and only initialize if the state is indeed uninitialized. Ignore the call to `random_init()` if the PRNG was initialized before. => This introduces a further check that is done with each call to `random_init()`, and it feigns a freshly initialized PRNG to the user.
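A minimal sketch of option 2, under stated assumptions: the names `guarded_random_init()` and `guarded_random_uint32()` are hypothetical and not part of `sys/random`, and xorshift32 only stands in for whichever PRNG is configured. Returning a flag is one possible way to avoid silently feigning a fresh PRNG.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the global PRNG state; not RIOT's actual code. */
static uint32_t prng_state = 1u;      /* nonzero default, xorshift32 gets stuck at 0 */
static bool prng_initialized = false;

/* Seeds the global state only on the first call; later calls are ignored.
 * Returns true if the seed was applied, false if the call was dropped. */
bool guarded_random_init(uint32_t seed)
{
    if (prng_initialized) {
        return false;
    }
    prng_state = seed ? seed : 1u;    /* map the invalid seed 0 to 1 */
    prng_initialized = true;
    return true;
}

/* One xorshift32 step on the global state (example generator only). */
uint32_t guarded_random_uint32(void)
{
    uint32_t x = prng_state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    prng_state = x;
    return x;
}
```

With this guard, a second `guarded_random_init(7)` after `guarded_random_init(42)` returns `false` and leaves the running sequence untouched; the cost is one extra flag and one branch per call.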

In contrast to the current procedure of having a global state, we rather should opt to allow local states for each thread (not excluding a global state). Although I have no special use case in mind at the moment, I believe that it should be possible to let user applications initialize "their own PRNG" to produce deterministic sequences. With the current approach, the sequence of our PRNG is shared with all available threads. This means, a user application would share the sequence with the network stack's NDP, RPL, TCP, ... implementations.

IMO, our random API needs to be extended to provide function definitions that also allow passing local seed states. However, different PRNGs use different kinds / different numbers of seeds. A "local scope" seed struct would need to encapsulate enough information to be usable with all PRNGs; and that's the part that deserves some thinking.

What is your opinion on having / allowing local scope seed states?

Cheers, Cenk

Kaspar · 8 March 2017 08:09

Hey,

> we rather should opt to allow local states for each thread (not excluding a global state).

Interesting. Up to now our trouble with RNGs was mostly on how to make them more random. Now we're trying to make them predictable. What's your use case for that?

How about an interface a la

    rand_init_<RNGname>(rnd_<RNGname>_t *rnd);
    rand_seed32(rnd_t *rnd, uint32_t);
    rand_seed(rnd_t *rnd, const uint8_t *in, size_t len);
    rand_get(rnd_t *rnd, uint8_t *out, size_t n);
    rand_get32(rnd_t *rnd);

    typedef struct { <seed, get function pointer> } rnd_t;

    typedef struct { rnd_t rnd; <tinymt32-state>; } rnd_tinymt32_t;

That way we'd have:

- user controlled state
- the ability to overload (e.g., combine hwrng, collected entropy, prng but with the same interface)
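The proposed interface could be fleshed out roughly as follows. This is only a sketch under assumptions: the function pointer members `seed32` and `get32` are one guess at what "seed, get function pointer" might mean, and xorshift32 (`rnd_xorshift32_t`) is used as the concrete generator instead of tinymt32 to keep the example short.

```c
#include <stddef.h>
#include <stdint.h>

/* Base type: every generator embeds this as its first member. */
typedef struct rnd rnd_t;
struct rnd {
    void (*seed32)(rnd_t *rnd, uint32_t seed);
    uint32_t (*get32)(rnd_t *rnd);
};

/* Example generator (assumption: xorshift32 stands in for tinymt32). */
typedef struct {
    rnd_t rnd;        /* must stay the first member so the casts below are valid */
    uint32_t state;
} rnd_xorshift32_t;

static void xorshift32_seed32(rnd_t *rnd, uint32_t seed)
{
    rnd_xorshift32_t *self = (rnd_xorshift32_t *)rnd;
    self->state = seed ? seed : 1u;   /* xorshift32 must not start at 0 */
}

static uint32_t xorshift32_get32(rnd_t *rnd)
{
    rnd_xorshift32_t *self = (rnd_xorshift32_t *)rnd;
    uint32_t x = self->state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    self->state = x;
    return x;
}

/* Wires up the function pointers and sets a default state. */
void rand_init_xorshift32(rnd_xorshift32_t *rnd)
{
    rnd->rnd.seed32 = xorshift32_seed32;
    rnd->rnd.get32 = xorshift32_get32;
    rnd->state = 1u;
}

/* Generic entry points: callers only ever see rnd_t. */
void rand_seed32(rnd_t *rnd, uint32_t seed) { rnd->seed32(rnd, seed); }
uint32_t rand_get32(rnd_t *rnd) { return rnd->get32(rnd); }

/* Fills `out` with n bytes, consuming one 32-bit word per 4 bytes. */
void rand_get(rnd_t *rnd, uint8_t *out, size_t n)
{
    while (n > 0) {
        uint32_t word = rand_get32(rnd);
        for (unsigned i = 0; i < 4 && n > 0; i++, n--) {
            *out++ = (uint8_t)(word >> (8 * i));
        }
    }
}
```

A caller would declare its own `rnd_xorshift32_t`, call `rand_init_xorshift32()` once, and then pass `&my_rng.rnd` to the generic functions. A hardware RNG or an entropy-mixing generator could provide its own struct and function pointers behind the same `rnd_t`.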

Kaspar

Cenk_Gundogan1 · 8 March 2017 09:06

Heyho,

> Hey,
>
> > we rather should opt to allow local states for each thread (not excluding a global state).
>
> Interesting. Up to now our trouble with RNGs was mostly on how to make them more random. Now we're trying to make them predictable. What's your use case for that?

Well .. PRNGs are predictable with enough knowledge (algorithm used + seed), and an enthusiastic observer (or someone who reached divine omniscience) can even derive numbers by looking at the sequence across several runs with constant seeds. The trouble, IMO, is seeding those PRNGs randomly, but that's not what I wanted to focus on in this mail. Sorry if I gave that impression.

My point is: a user application may need a specific seed to have a deterministic sequence (especially interesting for simulation/testing). And such determinism is removed from the application's POV, if NDP, RPL, or TCP are also advancing the pointer in the sequence.
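To illustrate the point, here is a small sketch. Everything in it is hypothetical: `stack_activity()` merely models a stack component (NDP, RPL, TCP, ...) drawing a few numbers, and xorshift32 is used as an example generator.

```c
#include <stdint.h>

/* One xorshift32 step on caller-provided state (example generator only). */
static uint32_t prng_next(uint32_t *state)
{
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/* Hypothetical stand-in for a stack component that draws `draws` numbers,
 * e.g. for timer jitter or sequence numbers. */
static void stack_activity(uint32_t *state, unsigned draws)
{
    while (draws--) {
        (void)prng_next(state);
    }
}

/* Returns the third number the application draws from `app_state` while the
 * stack draws from `stack_state` in between. Passing the same pointer for
 * both arguments models today's single global state. */
uint32_t app_third_number(uint32_t *app_state, uint32_t *stack_state,
                          unsigned stack_draws)
{
    (void)prng_next(app_state);
    stack_activity(stack_state, stack_draws);
    (void)prng_next(app_state);
    return prng_next(app_state);
}
```

With a shared state, the result depends on `stack_draws`, so the application cannot reproduce its sequence across runs. With separate states, the application's third number is the same no matter how often the stack draws.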

> How about an interface a la
>
>     rand_init_<RNGname>(rnd_<RNGname>_t *rnd);
>     rand_seed32(rnd_t *rnd, uint32_t);
>     rand_seed(rnd_t *rnd, const uint8_t *in, size_t len);
>     rand_get(rnd_t *rnd, uint8_t *out, size_t n);
>     rand_get32(rnd_t *rnd);
>
>     typedef struct { <seed, get function pointer> } rnd_t;
>
>     typedef struct { rnd_t rnd; <tinymt32-state>; } rnd_tinymt32_t;
>
> That way we'd have:
>
> - user controlled state
> - the ability to overload (e.g., combine hwrng, collected entropy, prng but with the same interface)

Looks good at first sight. We also would need some sort of synchronization for concurrent access, e.g. a mutex in the `rnd_t` struct, if two threads should use the same local state.

Cheerio! Cenk

Kaspar · 8 March 2017 09:13

Hey,

> > How about an interface a la
>
> Looks good at first sight. We also would need some sort of synchronization for concurrent access, e.g. a mutex in the `rnd_t` struct, if two threads should use the same local state.

Do we need that kind of synchronization if all state is localized in the overloaded rnd_t, at that level?

Applications *seeding* the same state in multiple threads will be rare and can synchronize externally.

Whether "reading" from the rng needs to be synchronized should be up to the RNG implementation.
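One way to read this: a generator that needs locking carries the lock in its own state struct, while generators that do not need it pay nothing. The sketch below assumes a hypothetical `rnd_locked_t` type and uses a POSIX `pthread_mutex_t` only as a stand-in so the example compiles on a host; in RIOT, the kernel's own mutex type would take its place.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical generator that brings its own lock. */
typedef struct {
    uint32_t state;
    pthread_mutex_t lock;   /* stand-in for a RIOT mutex */
} rnd_locked_t;

void rnd_locked_init(rnd_locked_t *rnd, uint32_t seed)
{
    rnd->state = seed ? seed : 1u;   /* xorshift32 must not start at 0 */
    pthread_mutex_init(&rnd->lock, NULL);
}

/* Draws one number; the read-modify-write of the state is serialized,
 * so two threads sharing `rnd` never observe a half-updated state. */
uint32_t rnd_locked_get32(rnd_locked_t *rnd)
{
    pthread_mutex_lock(&rnd->lock);
    uint32_t x = rnd->state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    rnd->state = x;
    pthread_mutex_unlock(&rnd->lock);
    return x;
}
```

Note that locking only protects the state from corruption. If two threads share one state, the order in which they receive numbers still depends on scheduling, so per-thread determinism requires per-thread state.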

Kaspar

Oleg_Hahm · 8 March 2017 09:21

Dear Cenk,

thanks for bringing up this discussion.

> 1) we can define it as BCP to *not* use `random_init()` if `auto_init` is used => it's hard to guarantee a one-time call to `random_init()` as humans surely do err (especially if several nested modules are involved).

Basically, we have similar problems for other modules that should not (or even must not) be initialized twice. So far, my/our take on this was to document this rather than to prevent it programmatically. The memory overhead of such a check is small but nonzero; the runtime overhead is probably negligible for a function that should not be called more than once.

> In contrast to the current procedure of having a global state, we rather should opt to allow local states for each thread (not excluding a global state).

Is testing and simulation the only use case you can imagine? I'm somewhat reluctant to add code just for non-production purposes.

Cheers, Oleg

Ludwig_Knupfer · 8 March 2017 09:25

Hi,

Ludwig_Knupfer · 8 March 2017 09:28

Hi,

Oleg_Hahm · 8 March 2017 09:30

Hi Ludwig!

tcschmidt · 8 March 2017 09:51

Hi Oleg, Ludwig,

> Hi Ludwig!
>
> Is testing and simulation the only use case you can imagine? I'm somewhat reluctant to add code just for non-production purposes.
>
> Since we outspokenly target researchers with RIOT this is a production feature.
>
> However we might want to move this feature into a dedicated implementation of the same interface.
>
> Thanks, that was more or less what I meant.

I'm not sure whether we 'over-complicate' the issue.

There are good PRNGs whose state space is one (or very few) ints; call it the seed. So if this function is called as 'rand(seed)', rand can be stateless ... and the seed is memory allocated by the application. Seeds could be initially generated by 'random_init(seed)'.

At the same time, one can overload this with a stateful 'rand()' which is auto-initialized and convenient ... for those who don't care for more control. IMO, this would not need a 'random_init()' user call.
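A rough sketch of this two-variant idea, under assumptions: C has no overloading, so the hypothetical names `rand_seeded()` and `rand_auto()` are used instead of two functions called `rand`; xorshift32 is the example of a PRNG whose whole state is a single 32-bit value; and the fixed default seed is a placeholder for whatever entropy source a real system would use.

```c
#include <stdint.h>

/* Stateless variant: all state lives in the caller's seed variable,
 * which is advanced in place on every call. */
uint32_t rand_seeded(uint32_t *seed)
{
    uint32_t x = *seed ? *seed : 1u;   /* xorshift32 must not start at 0 */
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *seed = x;
    return x;
}

/* Convenience variant: hidden state, no explicit init call needed.
 * The constant is an arbitrary nonzero placeholder, not real entropy. */
uint32_t rand_auto(void)
{
    static uint32_t auto_seed = 2463534242u;
    return rand_seeded(&auto_seed);
}
```

An application that wants a reproducible sequence keeps its own `uint32_t seed` and calls `rand_seeded(&seed)`; calls to `rand_auto()` elsewhere in the system do not disturb it.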

Cheers, Thomas

Btw: David Jones (UCL Bioinformatics) writes "Rule #1: Do not use system generators. ... Almost all of these generators are badly flawed. Even when they are not, there is no guarantee that they were not flawed in earlier releases of the library."

Mathias_Tausig · 8 March 2017 14:39

Hey,

> > we rather should opt to allow local states for each thread (not excluding a global state).
>
> Interesting. Up to now our trouble with RNGs was mostly on how to make them more random. Now we're trying to make them predictable. What's your use case for that?

Using the random numbers for a stream cipher, for instance.

> How about an interface a la
>
>     rand_init_<RNGname>(rnd_<RNGname>_t *rnd);
>     rand_seed32(rnd_t *rnd, uint32_t);
>     rand_seed(rnd_t *rnd, const uint8_t *in, size_t len);
>     rand_get(rnd_t *rnd, uint8_t *out, size_t n);
>     rand_get32(rnd_t *rnd);
>
>     typedef struct { <seed, get function pointer> } rnd_t;
>
>     typedef struct { rnd_t rnd; <tinymt32-state>; } rnd_tinymt32_t;
>
> That way we'd have:
>
> - user controlled state
> - the ability to overload (e.g., combine hwrng, collected entropy, prng but with the same interface)

Looks good, IMO.

cheers Mathias

Daniel_Krebs · 8 March 2017 18:34

> > In contrast to the current procedure of having a global state, we rather should opt to allow local states for each thread (not excluding a global state).
>
> Is testing and simulation the only use case you can imagine? I'm somewhat reluctant to add code just for non-production purposes.

Deterministic PRNGs can also be used in the implementation of certain MAC protocols, for instance for channel assignment (e.g. frequency hopping schedules, time slot assignment).
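As an illustration of the frequency hopping case, here is a sketch in which two nodes derive the same channel sequence from a shared seed. The function `hopping_schedule()` and the constant `NUM_CHANNELS` are hypothetical (16 matches the channel count of IEEE 802.15.4 in the 2.4 GHz band), and xorshift32 is only an example generator, not what any particular MAC protocol specifies.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_CHANNELS 16u   /* assumption: e.g. IEEE 802.15.4 at 2.4 GHz */

/* Fills `channels` with one channel index (0 .. NUM_CHANNELS-1) per slot.
 * The PRNG state is local to this call, so other users of random numbers
 * on the node cannot shift the schedule. */
void hopping_schedule(uint32_t shared_seed, uint8_t *channels, size_t slots)
{
    uint32_t state = shared_seed ? shared_seed : 1u;   /* avoid the 0 state */
    for (size_t i = 0; i < slots; i++) {
        state ^= state << 13;
        state ^= state >> 17;
        state ^= state << 5;
        channels[i] = (uint8_t)((state >> 16) % NUM_CHANNELS);
    }
}
```

Two nodes that call `hopping_schedule()` with the same seed and slot count obtain identical schedules. With a single global PRNG state, this would only hold if both nodes had consumed exactly the same number of random values beforehand.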

Cheers, Daniel