CoAP remote shell?

As I think it’d distract too much from the CoAP security thread I’d like to pull the CoAP remote shell topic out; @Kaspar and @miri64 made very good points there, and it’s not the first time this comes up.


My (also historical, but largely maintained) position on the shell over network is shaped by two observations:

  1. I’m personally not a heavy shell user on RIOT (I am on Linux, but that’s a different story).
  2. Shell commands are diverse beasts. Many of them behave as “single input, single output” (with the output being practically immediate / one could block everything but interrupts until it’s here), but there are progressive-output commands (ping), commands with delayed output (CoAP client) and possibly interactive commands.

Their diversity makes it hard to abstract over them, and makes any sufficiently powerful CoAP interface either complex (different interaction patterns for different beasts of commands) or ugly (emulating a stream of input and output).

The road I’d like to take on the shell is to provide idiomatic remote replacements for shell commands. Some work on that I’ve already started in the Rust riot-coap-handler-demos. These already provide a simple ps, an even more minimal netif (like ifconfig; ugly addresses shown here because CBOR tag support is broken in my implementation):


Transcript (because I can't copy-paste the colorful HTML my shell creates in here)
$ ./aiocoap-client 'coap://[2a02:b18:c13b:8014:a8a8:6858:5197:8001]/ps/'
CBOR message shown in naïve Python decoding
[[1, 'Sleeping', 'main'],
 [2, 'ReceiveBlocked', '6lo'],
 [3, 'ReceiveBlocked', 'ipv6'],
 [4, 'ReceiveBlocked', 'udp'],
 [5, 'Running', 'coap'],
 [6, 'ReceiveBlocked', 'nrf802154']]
$ ./aiocoap-client 'coap://[2a02:b18:c13b:8014:a8a8:6858:5197:8001]/netif/'
CBOR message shown in naïve Python decoding
[[6,
  b'\xaa\xa8hXQ\x97\x80\x01',
  [b'\xfe\x80\x00\x00\x00\x00\x00\x00\xa8\xa8hXQ\x97\x80\x01',
   b'*\x02\x0b\x18\xc1;\x80\x14\xa8\xa8hXQ\x97\x80\x01']]]
$ 
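
The byte strings in the netif output are binary addresses, so until tag support works a client can render them itself. Here is a minimal sketch using only Python's standard library. It assumes each entry is laid out as [interface id, link-layer address, list of IPv6 addresses], which is a reading of the transcript rather than a documented format, and `pretty_netif` is a made-up helper name.

```python
import ipaddress

# Decoded CBOR copied from the /netif/ transcript above.
# Assumed layout: [interface id, link-layer address, [IPv6 addresses]]
netif = [[6,
          b'\xaa\xa8hXQ\x97\x80\x01',
          [b'\xfe\x80\x00\x00\x00\x00\x00\x00\xa8\xa8hXQ\x97\x80\x01',
           b'*\x02\x0b\x18\xc1;\x80\x14\xa8\xa8hXQ\x97\x80\x01']]]

def pretty_netif(entries):
    """Render each interface as (id, l2 address, [IPv6 address strings])."""
    return [(ifindex,
             l2addr.hex(':'),  # bytes.hex() takes a separator since Python 3.8
             [str(ipaddress.IPv6Address(a)) for a in addrs])
            for ifindex, l2addr, addrs in entries]

for entry in pretty_netif(netif):
    print(entry)
```

The second address comes out as the one used in the request URI above.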

It also provides minimal support for GPIOs (no setup yet, just reading and writing), graphics (setting images on supported displays like the microbit one) and some i2p. Also for a ping I have plans, and (still in C) also for file system access. All these can be compact on the air, and support block-wise without any additional buffers.

This is how I think things should be in the long run. Enhancements to the tooling on the CoAP side could help make the experience more shell-like. Access control can be done on a per-command basis or even in more detail. (Plus, some of these might be easy to turn into shell commands with a more readable serialization, just not the other way 'round.)


Now that leaves the issue of transition, and of cases where we do want interactive commands.

That case I’ve so far left out of consideration because of observation 1.

One thing that’s already included in the above demos (but not executed well yet) is the “other” side of shell: Interacting with stdio in the role of a program (ie. making RIOT put characters out on the stdio UART, and reading from them). But that could just as well be short-circuited into a buffered stdio, so that any existing application with printf etc could be run. The USB and semihosting stdio implementations are probably good templates.

The downside of this generality is that it’s not a pretty interface, CoAP-wise: There’d need to be POSTs to send input with checks for lost messages (or full lock-stepping) and manual deduplication (no RIOT CoAP library deduplicates), and a GET+Observe for output (with some trickery to ensure a continuous flow and recovery). Using this would need tooling we don’t currently have on the client side – I can probably come up with something in aiocoap, if that’s sufficiently universal.

On the side, most of this can probably double as remote UART – either way, it’s Telnet over CoAP.


To tie everything back to the original topic, most of this (and definitely the second half) is only viable if we have a good security story for overall CoAP use with RIOT; unless that’s set up, a remote shell should nowhere be default, and only activated after having clicked through a very explicit warning about how wide open the system is.

</ted-talk>

Isn’t ping progressive output just a case of CoAP Observe?

What I see as the challenge is finding the sweet spot between a shell and a CoAP based CORECONF interface to the device.

This seems awesome. I think, to make it really sing, “aiocoap-client” needs to be a kind of shell, with libreadline, command line history (all on the host), and tab-completion. One way to do this is to just use the Unix shell. zsh has programmable command completion, and has the history, etc. It just needs to establish the address of the target in some place so that it doesn’t need to be repeated each time.

You are probably too young to have ever used VMS, but rather than “telnet” or “rlogin” to some host, one would do “set host remotehost”, and the shell would then essentially rsh everything to that host. The underlying security was shit compared to today, but it was pretty cool.

I have a number of experiences with hardware designers who do bring-up and debugging of new designs. They mostly (due to ignorance) ignored our software drivers with the GPIO testing code in them, and would often write their own code to debug. The couple of times I showed them a Tcl (state of the art at one point) or Python interface to toggle GPIO pins made them really happy.

Being able to do GPIO bit bashing is really useful to the hardware side of a team.

Being able to do this via USB is also a win. I wrote in another thread about SLIPmux vs having a PPPmux. But, being able to have an entire stack in a host application that just happened to speak PPP+IPv6LL+CoAP to run your aiocoap-client and worked on Winblows… would really rock… [or VirtualBox + USB passthrough]

On the side, most of this can probably double as remote UART – either way, it’s Telnet over CoAP.

:slight_smile:

To tie everything back to the [original topic](https://forum.riot-os.org/t/whole-stack-secure-coap-operations-and-deployment-story/3337/4), most of this (and definitely the second half) is only viable if we have a good security story for overall CoAP use with RIOT; unless that’s set up, a remote shell should nowhere be default, and only activated after having clicked through a very explicit warning about how wide open the system is.

You know where I feel this way. We need to provision the security we want.

It is. But it’s also a request with side effects, requiring POST, so that’d be “POST here, get a Location, observe that”. (Which is exactly what I want for the explicitly CoAP ping). But for other commands that’d be a bad pattern, because they’re side-effect free, like a GET for ps.

CORECONF

That’d be great for some of these things. Unfortunately I’ve yet to see any simple application by which to approach this – all examples seem to be configuring five-digit model number routers. Can you point me to any reasonably low-threshold implementation?

“aiocoap-client” needs to be a kind of shell, with libreadline

aiocoap-client has an interactive mode; it’s just not very mature. (And TBH a bit weird syntax-wise, because it’s all just calling aiocoap-client again just w/o the name).

What I think would be the most important feature to add there is link following: The interactive mode should implement URI references (so after GET coap://foo/bar/ a GET 0 should go to coap://foo/bar/0), and ideally some kind of URI recognition (I already parse link-format, CoRIs would be next) so that you can also use some kind of short-hand, a bit like

> GET coap://foo/.well-known/core
</config/>{0};rt=...,
</sensors/1>{1};rt=...,
> GET 0
Getting <coap://foo/config/>
...
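
The link-following idea can be modelled in a few lines of Python. This is only a sketch, not how aiocoap-client does or will do it: `resolve` and `link_targets` are hypothetical helpers, the regular expression is a naive stand-in for a real link-format parser, and the `rt` values in the usage lines are invented. Since `urllib.parse.urljoin` only resolves references for schemes it knows, the sketch borrows `http` for the join and restores the original scheme afterwards.

```python
import re
from urllib.parse import urljoin, urlsplit, urlunsplit

def resolve(base, ref):
    """Resolve a URI reference against a base URI such as coap://foo/bar/."""
    if urlsplit(ref).scheme:
        return ref  # already an absolute URI, nothing to resolve
    parts = urlsplit(base)
    # Borrow "http" so that urljoin applies the usual resolution rules
    stand_in = urlunsplit(("http",) + tuple(parts[1:]))
    joined = urlsplit(urljoin(stand_in, ref))
    # Put the original scheme (e.g. "coap") back
    return urlunsplit((parts.scheme,) + tuple(joined[1:]))

def link_targets(link_format):
    """Naively extract the <target> of every link in a link-format payload."""
    return re.findall(r"<([^>]*)>", link_format)

# Usage, following the example above (rt values are invented)
base = "coap://foo/.well-known/core"
links = link_targets('</config/>;rt="a",</sensors/1>;rt="b"')
print(resolve(base, links[0]))            # shorthand "GET 0" after discovery
print(resolve("coap://foo/bar/", "0"))    # plain relative reference
```

Whether a bare `0` means "link number 0" or "the relative reference 0" is a design question the sketch leaves open.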

As for integrating into the shell, while that’s a neat trick (too young for VMS, but in the 90s I wrote lots of DOS batch based menu systems that would abuse the shell a lot), but setting these up portably … yikes.

being able to have an entire stack in a host application

I’m not sure I can quite follow here, can you elaborate?

CORECONF

> That'd be great for some of these things. Unfortunately I've yet to see
> any simple application by which to approach this -- all examples seem
> to be configuring five-digit model number routers. Can you point me to
> any reasonably low-threshold implementation?

I don’t have an implementation, sorry. I think there are three aspects to the sense of intimidation:

  1. security
  2. URL & namespace
  3. very verbose YANG models inherited from SNMP.

(1) security is actually identical regardless. I think that (2) is psychological.

(3) is a question of picking and choosing the right pieces.

>> “aiocoap-client” needs to be a kind of shell, with libreadline

> aiocoap-client *has* an interactive mode; it's just not very
> mature. (And TBH a bit weird syntax-wise, because it's all just calling
> aiocoap-client again just w/o the name).

Cool.

> What I think would be the most important feature to add there is link
> following: The interactive mode should implement URI references (so
> after `GET coap://foo/bar/` a `GET 0` should go to `coap://foo/bar/0`),
> and ideally some kind of URI recognition (I already parse link-format,
> CoRIs would be next) so that you can also use some kind of short-hand,
> a bit like

so, “cwd” and “ls” kind of thing?

> As for integrating into the shell, while that's a neat trick (too young
> for VMS, but in the 90s I wrote lots of DOS batch based menu systems
> that would abuse the shell a lot), but setting these up portably
> ... yikes.

For practical purposes, DOS doesn’t have a shell. VyOS has done some interesting things such that it has turned a shell into something that works a lot like Cisco/Juniper/etc. router interaction, but I haven’t figured what they did.

>> being able to have an entire stack in a host application

> I'm not sure I can quite follow here, can you elaborate?

The application could talk SLIP or PPP to the (USB) serial port directly, sending what look like IPv6 packets, but not using the host stack at all.

Thinking of a general UARTish-over-CoAP protocol, it could look like this – provided we do need live interactivity and not line-buffered operation (it can do the latter but is overkill):

  • Separate resources for send and receive. They’re somewhat linked, but could conceptually operate standalone and the same way (but it might be that the RIOT side only implements server mode for both).
  • Receiving output from a server happens by GETting or observing the output resource. The output is typically stored in a ring buffer of chunks, and the specified behavior is to send at least one chunk on a GET (but more might be sent at the sender’s discretion). Payload is a CBOR sequence of information chunks with [index, bytecontent] data.
    • Indices are into the infinite stream and not the ring buffer (whose size need not be agreed on), and should probably wrap in a defined way similar to observe numbers.
    • If the receiver notices a gap, it will just know (and can display) that output was lost.
  • Sending to an input of a server happens by POSTing in the same format. Messages don’t need to be confirmable; if there’s loss the server will see that things start beyond the known data and go 4.xx. The client can send NON No-Response messages while the user is typing fast, and even send older content along with the latest one like mosh does.
  • As echoing is expected, the client can No-Response quite often; if send and receive side are somehow linked (static configuration on the constrained device, discovered by the “terminal client”), it’ll know which latest seen send event is co-acked by the incoming data in a CBOR tag (or just some record that’s not content).

This has some extensibility not needed for a first version:

  • A receiver that sees a gap in the observed values (which is perfectly legal due to eventual consistency) can FETCH the missing section – of course that can fail if the server has already forgotten.
  • Tagged CBOR data can be used to do what’s done in escape sequences in telnet, like setting baud rate, sending breaks or receiving framing errors.
  • dynlink can be used to make the device POST the output somewhere instead.
  • Either party can nagle as they like.
  • Timing information can be added through tagged CBOR data, in case the receiver needs to know even over a slowly polled connection.
  • The main mode of operation here is best-effort delivery of output, and a buffer overrun will never slow down the application (just lose content). A different mode (blocking stdout – is that even a thing in RIOT? A UART might do XOFF or hardware flow control…) can be implemented by configuration, which then needs the POSTed clear-buffer-up-to acks that are already used in the other direction.
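
A best-effort output buffer addressed by stream index could look roughly like this. It is a Python model under stated assumptions, not RIOT code: `OutputRing` is a made-up name, indices count bytes and never wrap, and a `read` for a forgotten range simply returns `None` where a real server would pick an error response.

```python
class OutputRing:
    """Best-effort output buffer addressed by stream index, not buffer offset."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = bytearray()  # retained tail of the output stream
        self.start = 0           # stream index of data[0]

    def write(self, chunk):
        """Append output; never blocks, silently forgets the oldest bytes."""
        self.data += chunk
        overflow = len(self.data) - self.capacity
        if overflow > 0:
            del self.data[:overflow]
            self.start += overflow

    def read(self, begin, end=None):
        """Return one [index, bytes] chunk for [begin, end), or None if forgotten."""
        stop = self.start + len(self.data)
        end = stop if end is None else min(end, stop)
        if begin < self.start:
            return None  # the requested range has already been overwritten
        return [begin, bytes(self.data[begin - self.start:end - self.start])]

ring = OutputRing(capacity=16)
ring.write(b"0123456789")
ring.write(b"abcdefghij")   # overruns: the first 4 bytes are forgotten
print(ring.read(10, 14), ring.read(0))
```

A blocking variant would make `write` wait instead of deleting, until the peer's acknowledgement allows `start` to advance.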

Concrete flow example:

> GET /.well-known/core
<     </stdio/out>;if=tag:riot.org,2021:ser-out;rt=stdout,
<     </stdio/in>;if=tag:riot.org,2021:ser-in;rt=stdin,
<     </stdio/in>;rel=pair;anchor="/stdio/out"

> GET (Observe:0) /stdio/out
<     [[0, b"This is RIOT version blah\n> "]]
(later notifications indicated by <~~~~ signs)

> POST (NON; No-Response:2) /stdio/in [[0, b"he"]]
(message lost) <~~~~ [[21, b"blah\n> he"], PairAck(2)]
> POST (NON, No-Response:2) /stdio/in [[0, b"helx\n"]]
<~~~~ [[21, b"blah\n> helx\nCommand not found\n"], PairAck(5)]

> POST (NON, No-Response:2) /stdio/in [[5, b"help\n"]] # user typing really fast now ;-)
<~~~~ [[51, b"Table of contents:\n"...], PairAck(10)]
<~~~~ [[105, b"nformation.\nztimer     Introspect what ztimer is doing\n> "], PairAck(10)]

This is the minimal (first items) version. Note that all pairs so far chose to send contiguous chunks instead of multiple ones. On the leg from the console to the constrained device, this may be good practice to avoid the need for the constrained device to seek through the message multiple times; on the other leg that’d allow some optimizations, like sending the current content now and later content if there’s still room in the message, or to store data as memory slices when large litanies like the help content are printed. (That’ll only work with some abstractions in the stdio writing systems, but that can be arranged as things progress). Also, we haven’t lost critical content (only the first part of the interactive session, which was easily corrected by the client sending as much backlog as practical); if one of the observation results had gotten lost or swallowed by a proxy, only a FETCH would have recovered them if still in the backlog.

The link between in and out is for the PairAcks to gain their semantics.
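
On the terminal side, merging the overlapping chunks from the flow above and spotting gaps could be sketched like this. The class name `OutputReceiver` is invented, only the [index, bytes] items are modelled (PairAck records would be filtered out before this step), and index wrapping is ignored.

```python
class OutputReceiver:
    """Client-side model: reassembles observed [index, bytes] chunks."""

    def __init__(self):
        self.text = bytearray()  # output reconstructed so far
        self.next_index = 0      # stream index of the next expected byte
        self.gaps = []           # (first missing, first present) index pairs

    def notify(self, chunks):
        """Process the content chunks of one notification."""
        for index, data in chunks:
            if index > self.next_index:
                # Output was lost; remember the range for display or a FETCH
                self.gaps.append((self.next_index, index))
                self.next_index = index
            fresh = data[self.next_index - index:]  # skip bytes already seen
            self.text += fresh
            self.next_index += len(fresh)

rx = OutputReceiver()
rx.notify([[0, b"This is RIOT version blah\n> "]])
# The notification [[21, b"blah\n> he"]] is lost and never arrives
rx.notify([[21, b"blah\n> helx\nCommand not found\n"]])
print(rx.text.decode(), rx.next_index, rx.gaps)
```

Because the server resent the tail starting at index 21, the lost notification leaves no gap; a later chunk starting at 105 would be recorded as the gap (51, 105).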

Random selection of later events:

# Hey we're doing DMX without hardware driver (well, almost)
> POST (NON, No-Response:2) /stdio/in [Baud(250000), Break, PauseMs(20), h"ff0000808080"]

# Buffer recovery
> POST (NON, No-Response:2) /stdio/in [[5, b"help\n"]] # user typing really fast now ;-)
(message lost) <~~~~ [[51, b"Table of contents:\n"...], PairAck(10)]
<~~~~ [[105, b"nformation.\nztimer     Introspect what ztimer is doing\n> "], PairAck(10)]
> FETCH /stdio/out [51, 105]
# If that didn't fit in a single message, it might trigger block-wise,
# although the server could just as well send what fits and maybe
# the client fetches more later.
< [[51, b"Table of contents:\n"...]]

# Blocking mode
> POST (NON, No-Response:2) /stdio/in [[124, b"help\n"], PairAck(325)]
# Server nagles until ring buffer is full, then sends it as two
# halves because its CBOR serializer doesn't like scatter-gather stuff
<~~~~ [[325, b"Table of contents:\n"...], [475, b"\nifconfig    alias for ip\n"....], PairAck(129)]
# Server blocks any printf calls until this comes in
> POST (NON, No-Response:2) /stdio/in [PairAck(425)]
<~~~~ [[425, b"nformation.\nztimer     Introspect what ztimer is doing\n> "]]

# I'm just the keyboard, please deliver the output somewhere else
# (that's the most remote use case)
> POST /dynlink <coap://[fe80::42]/lpr>;rel=bind;anchor="/stdio/out"
# Guillemets indicating traffic between RIOT device and line printer
» POST /lpr [[0, b"> "]]
« ACK
> POST /stdio/in [[0, b"help\n"]]
< ACK
» POST /lpr [[2, b"help\nTable of contents..."]]
« ACK

Probably one of the most useful initial uses might be network console. (No input, just output) This could involve sending to a multicast address, but I’m not sure how we’d arrange that from a CoAP point of view. Well, perhaps I don’t mean multicast, so much as anycast.

Technically that would be pretty easy (at least on paper): Just send NON messages to the multicast address. The receivers will eventually get one of the messages and (due to the format) be aware that they have lost data in between, which they then can fetch via unicast (provided that the ring buffer still has them).

Multicasting out is easy (did that in an old project, albeit on plain UDP), but the three major downsides are so bad that I wouldn’t see it as part of a road towards something recommendable: it needs an explicitly configured multicast address; it’s terrible on performance to keep around on anything approaching deployments; and there is no practical transition towards secure access because multicast sending involves asymmetric signature operations.

But there’s a middle ground: POST your console’s network address to the device, and then it does the same over unicast. If you wrap your address in <> characters and put a bind to it, it becomes almost exactly the dynlink example above (the POST /lpr one; I implied CON rather than NON). You can probably even put a multicast address in there (resolving downside 1), but at this point why bother.

I like @chrysn’s proposal. It should provide a generalized secure access to a device (assuming CoAPS/OSCORE and given there is also some authentication happening at some point, but I guess this could be done with some additional link) while also fitting well into the constrained use case.

On the RIOT-side, there are several open threads that IMHO need to be addressed first:

  • Our CoAP API needs to be able to handle such things (blatant advert for the breakout session at RIOT summit 2021 by @chrysn and me)
  • Our stdio needs some heavy clean-up to be able to be multiplexed (see #13469 regarding this [somewhat stale] discussion)

Thanks!

I’m not sure all that multiplexing (while important on its own, and orthogonal to some extent) is necessary for this – a remote-console in the first iteration could just be a stdio_ringbuffer module that’d replace the others. (But probably the clean-up will predate any actual work on it anyway.)

I agree on this. If we’d have proper fd handling/handlers, slipping in a multiplexing handler is an extra step that is orthogonal.

While I agree that multiplexing is orthogonal, we already have a precedent for what happens if multiplexing is not in place when implementing a remote shell :wink:.

The more I think about it, the more I like the idea that one can retrieve boot messages from the console log via the network. As a read-only process, I am okay with having no application security. Maybe two ACLs would be desirable: a) IPv6-LL only, b) some IPv6/64 only (as #define), c) the IPv6/64 that the RPL PIO specified.
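
The kind of source-address check meant here can be sketched with Python's `ipaddress` module. This is only a model of the policy, not RIOT code; `read_allowed` is a hypothetical name, and the prefixes in the usage lines are documentation addresses standing in for a configured or RPL-provided /64.

```python
import ipaddress

def read_allowed(source, prefixes=()):
    """Allow link-local sources, plus any source inside a listed prefix."""
    addr = ipaddress.IPv6Address(source)
    if addr.is_link_local:
        return True  # option a): IPv6 link-local is always accepted
    # options b) and c): a configured /64, or the one learned from the RPL PIO
    return any(addr in ipaddress.IPv6Network(p) for p in prefixes)

allowed = ["2001:db8:0:1::/64"]  # stand-in prefix, not from the thread
print(read_allowed("fe80::1"),
      read_allowed("2001:db8:0:1::5", allowed),
      read_allowed("2001:db8:0:2::5", allowed))
```

Source addresses can be spoofed on many links, so a check like this only narrows who can read the log; it is not a replacement for application security.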

retrieve boot messages from the console log via the network.

Right now these are indistinguishable from console output – both go through stdout.

What goes through the debug logging mechanism should be pretty easy to split out into a separate stream, though.

I’m happy if I see all the console messages via read-only view.

Two brief updates here:

  • stdio_coap is now a pull request – no OSCORE yet, but it works just fine with DTLS, so hey, it could be usable without having to define I_KNOW_TELNET_IS_INSECURE_AND_WILL_NOT_TO_USE_THIS_INPRODUCTION=pinkypromise (but I still may want to add a module for commissioning credentials, b/c right now when using this with the DTLS example the DTLS PSK is literally “secretKey”, and that’s no better than telnet).
  • There might be some interest in formally describing how this works, especially given that this, in passing, describes an alternative to block-wise transfer. Keep an eye open for T2TRG documents.