at86rf2xx radio driver -- potential severe performance issues for multi-hop communication

Andreas_Weigel · 15 October 2015 11:57

Hi everyone,

I've recently run some experiments on the performance of Atmel's extended operating mode on the ATmega256RFR2 (which afaik is basically the same as the AT86RF2xx transceivers) in multihop collection traffic scenarios, with unslotted CSMA/CA. For those, we used our own framework for WSNs (CometOS). The result is that the extended operating mode can severely impact performance (in terms of PRR) in the presence of large fragmented datagrams sent over 6LoWPAN. However, I expect that there will be an impact in any traffic scenario with "enough" traffic to get some node's queue filled to some extent (resulting in repeated transmissions towards the same node).

The problem is that the TX_ARET mode discards any incoming transmissions during the backoff phase of the CSMA mechanism. This leads to a large number of lost frames on longer paths (4 hops are enough to produce a large number of losses), even with rather high backoff exponents and the maximum number of retries. This problem caused me to reimplement our MAC driver for the ATmega256RFR2 using the basic operating mode.

As far as I can see, the at86rf2xx driver in the RIOT repo uses the same extended operating mode and therefore will very likely have the very same weakness. If time permits, it would probably be a good idea to change the driver to use the basic operating mode, or alternatively to include a note for users explaining the impact of the extended operating mode for said traffic scenarios. The frame loss caused solely by the extended operating mode can render results of experiments virtually useless (speaking from my own, sad experience ;-/ )

Kind Regards, Andreas Weigel

PS: The results of my experiments are published as "Hardware-Assisted 802.15.4 Transmissions and Why to Avoid Them" (http://link.springer.com/chapter/10.1007%2F978-3-319-23237-9_20). If anyone without free access to the resource is interested in the paper, please let me know and I can provide you the whole article.

Alexander_Aring1 · 15 October 2015 12:13

Hi,

I am not a riot-dev but did you try to turn off CSMA-CA handling?

You can do that, and I think all of them (except at86rf230) support it, by setting MAX_CSMA_RETRIES to 7, which means no CSMA-CA handling.
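As a minimal sketch of what that setting amounts to at register level: the helper below rewrites only the MAX_CSMA_RETRIES field of an XAH_CTRL_0 register value. The bit layout (bits 7:4 MAX_FRAME_RETRIES, bits 3:1 MAX_CSMA_RETRIES, bit 0 SLOTTED_OPERATION) is an assumption taken from the AT86RF231 datasheet as I remember it, and the macro and function names are made up for illustration; they are not the RIOT or linux driver API. Reading and writing the register over SPI is left out.

```c
#include <stdint.h>

/* Assumed XAH_CTRL_0 layout (please verify against the datasheet):
 * bits 7:4 MAX_FRAME_RETRIES, bits 3:1 MAX_CSMA_RETRIES, bit 0 SLOTTED_OPERATION */
#define XAH_CTRL_0_MAX_CSMA_RETRIES_MASK   0x0Eu
#define XAH_CTRL_0_MAX_CSMA_RETRIES_SHIFT  1u
/* Field value 7: transmit immediately, without CSMA-CA */
#define CSMA_RETRIES_NO_CSMA               7u

/* Return the register value with only the MAX_CSMA_RETRIES field replaced;
 * the frame retry count and the slotted-operation bit are left untouched. */
static uint8_t xah_ctrl_0_set_csma_retries(uint8_t reg, uint8_t retries)
{
    reg &= (uint8_t)~XAH_CTRL_0_MAX_CSMA_RETRIES_MASK;
    reg |= (uint8_t)((retries << XAH_CTRL_0_MAX_CSMA_RETRIES_SHIFT)
                     & XAH_CTRL_0_MAX_CSMA_RETRIES_MASK);
    return reg;
}
```

A driver would read the current register value, pass it through this helper with CSMA_RETRIES_NO_CSMA, and write the result back, so the ACK retry configuration in the upper bits survives.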

I would _not_ use TX_ON mode only, because in this mode the ack request bit in the 802.15.4 MAC header will always be ignored, and sometimes you need the tx trac status result to know whether an ACK was received or not (as in possible mlme-ops).

TX_ARET_ON will automatically wait for an ACK if the ack request bit is set, and will not wait if it isn't set.

A possible solution (maybe also in linux, because I do a lot of 802.15.4 linux stuff) would be to add a "no CSMA-CA" setting which performs no CSMA-CA handling but still supports the ack handling.

I was already thinking about something like that (in linux) for such cases, because several transceivers support disabling CSMA-CA handling.

- Alex

Andreas_Weigel · 15 October 2015 12:38

Hi again,

indeed this should solve the issue -- I have to admit, that I hadn't considered this possibility for our driver, because I adapted the TinyOS MAC layer (using the basic operating mode) and had everything there already, including a handling of the ACKs in software, which works reasonably well.

You will obviously still have to implement the CSMA-CA mechanism then if you need it. Is there already usable (for the at86rf2xx) code available for a CSMA-CA in Riot? Then probably combining it with the existing driver (with deactivated CSMA-CA) would be a good solution.

Regards, Andreas

Alexander_Aring1 · 15 October 2015 12:48

Hi,

Hi again,

indeed this should solve the issue -- I have to admit, that I hadn't considered this possibility for our driver, because I adapted the TinyOS MAC layer (using the basic operating mode) and had everything there already, including a handling of the ACKs in software, which works reasonably well.

Sorry, this sounds awful. Ack handling in software is a mess because you have very timing-critical things there. That's why the at86rf2xx supports the RX_AACK states, which do automatic ACK handling on the transceiver side.

You will obviously still have to implement the CSMA-CA mechanism then if you need it. Is there already usable (for the at86rf2xx) code available for a CSMA-CA in Riot? Then probably combining it with the existing driver (with deactivated CSMA-CA) would be a good solution.

CSMA-CA handling is the same as above; in my opinion you cannot do this in software. Okay, you are working in a small MCU world, and yes, you can maybe do this by calculating all the factors (SPI bus speed, max interrupt latency, etc.) inside the MCU world. If you haven't done that and checked that it fits, then don't do it.

CSMA-CA handling is also done on the transceiver side (no software MAC implementation required) in TX_ARET mode (unless you disable it by changing csma retries to 7).

I have to admit, I don't know how RIOT handles timing-critical 802.15.4 MAC things. On linux, we don't support it in the MAC implementation and we never will do that, because we can't.

- Alex

Daniel_Krebs · 15 October 2015 13:05

Hi Andreas, hi Alexander,

I'm implementing a new MAC layer [1] for RIOT at the moment. It's inspired by TinyOS LPL and ContikiMAC, but not compatible (at least not yet). I'm developing against our at86rf2xx driver as a reference. You can use it afaict the same as if it were using Basic Mode by disabling NETOPT_AUTOACK, so no need to rewrite the driver. Concerning CSMA you can also easily disable CSMA/CA via NETOPT_CSMA if you wish.

I also kind of implemented my own ACKs (called wakeup request - WR) that will be sent with CSMA/CA enabled, but without automatic resending (CSMA_RETRIES = 0). On the other hand I send the payload using AUTOACK so that I can tell if it really arrived at the destination. I couldn't test it on a bigger testbed yet, let alone multi-hop, but I can not yet imagine that performance would suffer that much. Still have to read your article though.

Cheers, Daniel

[1] https://github.com/RIOT-OS/RIOT/pull/3730

Andreas_Weigel · 15 October 2015 18:08

Hi again,

Sorry, this sounds awful. Ack handling in software is a mess because you have very timing critical things there.

It still works and I would not call it a mess. With the software implementation, you can easily use the specified macAckWaitDuration of 864 us (the same is used by the hardware mechanism) and still have a reliably working network.
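For reference, the 864 us figure can be reproduced from the PHY/MAC constants. The sketch below assumes the 2.4 GHz O-QPSK PHY and uses the constant values as I recall them from IEEE 802.15.4-2006; the macro and function names are only illustrative.

```c
#include <stdint.h>

/* Assumed constants for the 2.4 GHz O-QPSK PHY (check against the standard) */
#define A_UNIT_BACKOFF_PERIOD    20u  /* symbols */
#define A_TURNAROUND_TIME        12u  /* symbols */
#define PHY_SHR_DURATION         10u  /* symbols: 8 preamble + 2 SFD */
#define PHY_SYMBOLS_PER_OCTET     2u
#define ACK_PHR_AND_PSDU_OCTETS   6u  /* 1 octet PHR + 5 octet ACK frame */
#define SYMBOL_DURATION_US       16u  /* 62.5 ksymbols/s */

/* macAckWaitDuration in symbols */
static uint32_t mac_ack_wait_duration_symbols(void)
{
    return A_UNIT_BACKOFF_PERIOD + A_TURNAROUND_TIME + PHY_SHR_DURATION
           + ACK_PHR_AND_PSDU_OCTETS * PHY_SYMBOLS_PER_OCTET;
}

/* macAckWaitDuration in microseconds */
static uint32_t mac_ack_wait_duration_us(void)
{
    return mac_ack_wait_duration_symbols() * SYMBOL_DURATION_US;
}
```

Under these assumptions the sum is 20 + 12 + 10 + 12 = 54 symbols, i.e. 54 x 16 us = 864 us.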

CSMA-CA handling is the same like above, in my opinion you can not do this in software.

On most transceiver chips you do not have any option but to implement CSMA in software, because they simply do not provide any hardware mechanism for backoff->CCA->backoff again/send -- the only ones I found that DO have such a mechanism in hardware are Atmel's AT86RF2xx series (including the ATmega128RFA1/256RFR2 series) and some Microchip transceivers.
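To make the backoff->CCA->backoff again/send loop concrete, here is a rough sketch of unslotted CSMA-CA in software. It is not taken from CometOS, TinyOS or RIOT; the hook and parameter types are hypothetical, and a real driver would map them onto its own CCA, timer and random-number primitives. The stub hooks at the end only exist so that the loop can be exercised without hardware.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical radio hooks (names are assumptions, not a real driver API) */
typedef struct {
    bool     (*channel_clear)(void);              /* perform one CCA */
    void     (*wait_backoff_periods)(uint32_t n); /* radio may stay in RX here */
    uint32_t (*random_u32)(void);
} csma_hooks_t;

typedef struct {
    uint8_t min_be;        /* macMinBE, e.g. 3 */
    uint8_t max_be;        /* macMaxBE, e.g. 5 */
    uint8_t max_backoffs;  /* macMaxCSMABackoffs, e.g. 4 */
} csma_params_t;

/* Returns true if a CCA found the channel idle (caller may transmit),
 * false on channel access failure after max_backoffs + 1 attempts. */
static bool csma_unslotted(const csma_hooks_t *hooks, const csma_params_t *p)
{
    uint8_t be = p->min_be;

    for (uint8_t nb = 0; nb <= p->max_backoffs; nb++) {
        uint32_t window = 1u << be;                 /* 2^BE */
        /* random delay of 0 .. 2^BE - 1 backoff periods */
        hooks->wait_backoff_periods(hooks->random_u32() % window);
        if (hooks->channel_clear()) {
            return true;
        }
        if (be < p->max_be) {
            be++;                                   /* widen the window */
        }
    }
    return false;
}

/* Stub hooks for trying the loop on a PC: the channel reports "busy"
 * for the first stub_busy_count CCAs, then "clear". */
static unsigned stub_busy_count;
static unsigned stub_cca_calls;
static uint32_t stub_max_wait;

static bool stub_channel_clear(void)
{
    stub_cca_calls++;
    return stub_cca_calls > stub_busy_count;
}

static void stub_wait(uint32_t n)
{
    if (n > stub_max_wait) {
        stub_max_wait = n;   /* record the longest requested backoff */
    }
}

static uint32_t stub_random(void)
{
    return 0xFFFFFFFFu;      /* always selects the largest slot */
}
```

The relevant point for this thread is the wait step: in a software implementation the radio can remain in a receive state while the backoff timer runs, which is what the hardware-driven TX_ARET procedure does not allow.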

On linux, we don't support it in the MAC implementation and we never will do that, because we can't.

I agree. Neither do I say you should.

I just wanted to point out a performance issue -- which occurs with a certain configuration and in certain traffic scenarios -- to enable other people to avoid having the same trouble I had with interpreting the strongly biased results from a testbed.

Kind Regards, Andreas

Thomas_Eichinger1 · 15 October 2015 19:43

Hi Andreas,

thank you very much for pointing us to your work. I haven't found time to read it yet, but I will for sure. From what you have shared with us so far, I think this is very useful information that we should definitely consider for our driver.

I have to admit when I once started it, the promises extended operating mode made were too auspicious to not use it, and I never came to run actual multi hop tests on it yet.

At least with the possibility of disabling CSMA-CA chances are we don't have to rewrite it again.

Taken from your description below, the main problem is that the transceiver drops any packets, right? Does IEEE802.15.4 specify anything on this?

Anyhow, I will try to get your article tomorrow and thanks again for sharing this with us.

Best, Thomas

Andreas_Weigel · 16 October 2015 10:19

Hi Thomas,

Taken from your description below, the main problem is that the transceiver drops any packets, right?

as soon as the transceiver enters TX_ARET, it will not receive any incoming packets (I guess this is due to having only one buffer available for TX and RX, and the requirement that the data is present in the buffer before entering TX_ARET), and I do not know of any way to change this*.

A --> B --> C

During the automated backoff, the receiver is so to say "deaf". In a multihop scenario as above, when a node has to transmit several frames along the same path (e.g., because a larger datagram is fragmented by 6LoWPAN, as in my case), you repeatedly get a situation where B is already trying to forward to C, while A is trying to send to B its next fragment. But B cannot hear what A is sending as long as it is in its own backoff phase. Factoring in C, which also tries to access the channel, the probability for A to finally give up after seven retries, is highly increased.
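A rough back-of-the-envelope calculation shows how long such a "deaf" window can be compared to a frame on air. The sketch assumes the 2.4 GHz O-QPSK PHY (16 us per symbol, 2 symbols per octet, backoff period of 20 symbols, 6 octets of preamble/SFD/PHR per frame); the function names are only illustrative, and the numbers are not measurements from the experiments.

```c
#include <stdint.h>

#define BACKOFF_PERIOD_US    320u  /* aUnitBackoffPeriod: 20 symbols x 16 us */
#define OCTET_DURATION_US     32u  /* 2 symbols per octet */
#define PHY_OVERHEAD_OCTETS    6u  /* 4 preamble + 1 SFD + 1 PHR */

/* Longest single random backoff for a given backoff exponent BE */
static uint32_t max_backoff_us(uint8_t be)
{
    return ((1u << be) - 1u) * BACKOFF_PERIOD_US;
}

/* Time on air of a frame with the given PSDU length in octets */
static uint32_t frame_airtime_us(uint8_t psdu_len)
{
    return (PHY_OVERHEAD_OCTETS + psdu_len) * OCTET_DURATION_US;
}
```

Under these assumptions, a single backoff with BE = 5 can last up to 9920 us, while a maximum-length 127-octet frame takes 4256 us on air, so a complete frame from A can fall entirely inside one backoff of B.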

The nasty thing about this is that, depending on the traffic scenario (I used really extremely large datagrams of 1200 bytes, which resulted in 20% PRR (extended operating mode) vs. 97% PRR (software MAC)), the effect will range from minor to dramatic, but the probability of additional retries will certainly always increase.

Does IEEE802.15.4 specify anything on this?

AFAIK, 802.15.4 does not specify anything about the capability of a sender to receive during its backoff phase, but I'm quite sure that it does not forbid it in any way.

Regards, Andreas

*if you find any method I have overlooked in the datasheet, please let me know

Oleg1 · 16 October 2015 17:25

Hi Alex, hi all!

First of all, I have to apologize for possible spelling and formatting mistakes. I'm not used to writing mails on the phone, but currently I don't have my laptop.

CSMA-CA handling is also done on the transceiver side (no software mac implementation required) in TX_ARET mode (unless you disable it by changing csma retries to 7).

As Andreas wrote in the other mail: it is quite common in the area of WSNs and IoT to implement ACKs and CSMA (and sometimes even address recognition) in software, because many transceivers (particularly older and cheaper ones) simply don't support this in hardware. However, I agree with you that handling this in hardware, if available, should usually be preferred for efficiency and performance. But of course only if the hardware support is working well.

Cheers, Oleg

Oleg1 · 16 October 2015 17:32

Hi Daniel!

I couldn't test it on a bigger testbed yet, let alone multi-hop, but I can not yet imagine that performance would suffer that much. Still have to read your article though.

I hope you are aware of the great and open IoT-LAB testbed where everyone can conduct their experiments. The M3 nodes there use the at86rf231 radio and should be one of the best supported platforms in RIOT. Plus, the testbed is integrated into our build system, which makes running the experiments very simple.

Cheers, Oleg

Oleg1 · 16 October 2015 17:38

Hi Thomas!

I have to admit when I once started it, the promises extended operating mode made were too auspicious to not use it, and I never came to run actual multi hop tests on it yet.

I did run various experiments on the testbed, both with multiple hops and payloads of more than 1kB. However, I'm not quite sure if I also tested the combination over more than two hops. We should try this asap on IoT-LAB using some static routes.

Cheers, Oleg