So I’m building a system with a “bootloader” and an “app”. I want to online-upgrade the latter.
The problem: I’d like to find a way to do that without transmitting a complete image. There are some binary delta algorithms out there, but AFAICT none of them allow the target to be overwritten in-place, much less in Flash-page-sized chunks.
Any pointers to preferably-not-too-complex code that does something like this?
RIOT uses an active/passive (two-slot) scheme to reduce complexity. To update in-place, you’d need the bootloader to do the swap update, which brings all sorts of complexity-increasing issues: the bootloader would need to actually fetch the image (implying it has a network stack), do the upgrade, handle errors, security, …
Doing the swap in a live image (updating the flash that the cpu is executing from) would be an interesting challenge.
If I were to implement this, I’d probably try to change the split between RIOT’s slot0 and slot1, so one only contains the absolute minimum to do the upgrade. Then, if the application image figures out there’s an update, boot into the update slot, and have the update logic there.
I’m not trying to do any live update of the “app” part while it’s running. The “boot” part is self-sufficient and contains enough chunks of the “network” stack to do the download and install of the “app” code.
I’m also not going to do slot-based booting; there’s not going to be enough flash space to do that. My boot code asks the network what to do before starting the “app” (which is more like a loadable library) and if it really needs to fall back to a previous version then I’ll have to re-flash that.
I have thought a lot about this in years past when flash was smaller.
But, more importantly: no matter how big flash is, at some point product management will be persuaded by a Very Important Customer to use 55% of it because of a “new feature”, and then what? Maybe the extra 5% is for debugging a critical customer situation.
What I thought about was, yes, having the bootloader be able to fetch the image over CoAP block mode. But all of DTLS/EDHOC and routing would be set up by the main image and stored to flash, and the bootloader would read that info. This results in serializing the DTLS/EDHOC/CoAP state into flash.
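To make the handoff idea a bit more concrete, here is a minimal sketch of a record the main image could write and the bootloader could validate before trusting it. Everything in it is an assumption for illustration: the `handoff_t` layout, the magic value, the 64-byte limit and the function names are hypothetical, and the serialized session state is treated as an opaque blob. The CRC-32 only catches corruption or a half-written record; it is not a security measure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HANDOFF_MAGIC     0x48414E44u  /* arbitrary marker (hypothetical) */
#define HANDOFF_MAX_STATE 64u          /* assumed upper bound for the blob */

/* All members are 4-byte aligned, so the struct has no padding bytes. */
typedef struct {
    uint32_t magic;
    uint32_t state_len;
    uint8_t  state[HANDOFF_MAX_STATE]; /* opaque serialized session state */
    uint32_t crc;                      /* CRC-32 over all preceding bytes */
} handoff_t;

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320), no lookup table. */
static uint32_t crc32_ieee(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}

static uint32_t handoff_crc(const handoff_t *rec)
{
    return crc32_ieee((const uint8_t *)rec, offsetof(handoff_t, crc));
}

/* Main image: fill the record before rebooting into the bootloader. */
bool handoff_write(handoff_t *rec, const uint8_t *state, size_t len)
{
    assert(rec != NULL);
    if (len > HANDOFF_MAX_STATE || (state == NULL && len > 0)) {
        return false;
    }
    memset(rec, 0, sizeof(*rec));
    rec->magic = HANDOFF_MAGIC;
    rec->state_len = (uint32_t)len;
    if (len > 0) {
        memcpy(rec->state, state, len);
    }
    rec->crc = handoff_crc(rec);
    return true;
}

/* Bootloader: accept the record only if magic, length and CRC all match. */
bool handoff_read(const handoff_t *rec, const uint8_t **state, size_t *len)
{
    assert(rec != NULL && state != NULL && len != NULL);
    if (rec->magic != HANDOFF_MAGIC ||
        rec->state_len > HANDOFF_MAX_STATE ||
        rec->crc != handoff_crc(rec)) {
        return false;
    }
    *state = rec->state;
    *len = rec->state_len;
    return true;
}
```

On a real target the record would live in a reserved flash page, and the bootloader would fall back to some safe default when `handoff_read()` returns false.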
I am not too familiar with the current situation, even though I should be.
Anyways. Wouldn’t it make more sense to download the image in the app code, store it to some kind of persistent storage (SD, external flash etc.), and then boot into just the bootloader, which does some verification and flashes the new firmware?
That reduces the complexity of the bootloader, which saves some space. Also, it makes it possible to move to another protocol when necessary. From my understanding that would be more complicated if you do that all in the bootloader.
That’s what #17379 implements. (Currently it only does the ‘flashing raw images from external storage’ step, downloading/verifying the firmware via SUIT would be a second PR that builds on top of that.)
I would also keep the signature validation out of the bootloader to reduce complexity there. Instead the application firmware will verify the signature and only set the update flag for the bootloader if the signature is valid.
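Roughly, the application-side gate could look like the following. This is a minimal sketch, not the code from the PR: `update_flag_t`, `app_stage_update()` and the verifier callback type are hypothetical names, and `demo_verify()` is only a stand-in that compares a one-byte XOR checksum at the spot where a real signature check would go.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical persistent record read by the bootloader on every boot. */
typedef struct {
    bool pending;      /* true: flash the image named in `source` */
    char source[32];   /* file name of the image on external storage */
} update_flag_t;

/* Signature check supplied by the application (hypothetical signature). */
typedef bool (*verify_fn_t)(const uint8_t *image, size_t image_len,
                            const uint8_t *sig, size_t sig_len);

/* Set the update flag only if the image verifies; otherwise leave the
 * flag exactly as it was. */
bool app_stage_update(update_flag_t *flag, const char *path,
                      const uint8_t *image, size_t image_len,
                      const uint8_t *sig, size_t sig_len,
                      verify_fn_t verify)
{
    assert(flag != NULL && path != NULL && verify != NULL);

    size_t path_len = strlen(path);
    if (path_len >= sizeof(flag->source)) {
        return false;                       /* name would not fit */
    }
    if (!verify(image, image_len, sig, sig_len)) {
        return false;                       /* bootloader never sees it */
    }
    memcpy(flag->source, path, path_len + 1);
    flag->pending = true;
    return true;
}

/* Stand-in verifier for the example only. This is NOT a signature check:
 * it accepts the image if `sig` is the XOR of all image bytes. */
static bool demo_verify(const uint8_t *image, size_t image_len,
                        const uint8_t *sig, size_t sig_len)
{
    uint8_t sum = 0;
    for (size_t i = 0; i < image_len; i++) {
        sum ^= image[i];
    }
    return sig_len == 1 && sig[0] == sum;
}
```

With this split the bootloader only has to look at `pending` and `source`; it never needs any crypto code.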
I just thought about that as well after I sent the message. Sounds like a solid design. Have to check that one out. Thanks.
Edit: Thinking about it, wouldn’t that create an attack vector? Someone could just write into the flash from the outside and skip the verification this way.
Well, it depends on what your attack scenario is. Obviously you can’t have random users put arbitrary files on your storage remotely, but allowing that is generally not a good idea anyway. And if the attacker has physical access, they might as well re-flash the main MCU directly (which on some MCUs you can prevent, but for most applications it’s not worth the hassle).
So you would only have the signature check in the bootloader as a mitigation against buggy application code that permits arbitrary flash writes.
However the main reason for doing the signature check outside the bootloader is that it makes rollbacks really simple: Before flashing the new firmware, dump the old firmware to a file and set the flag to use that file for the next firmware update.
If the new firmware fails to remove the flag (because it crashed on init) the bootloader will automatically restore the old, working firmware.
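The rollback flow can be modelled on a host in a few lines. This is a sketch under assumptions, not the bootloader's actual code: the file names, `device_t` and the function names are hypothetical, short strings stand in for image contents, and I assume that restoring from the backup clears the flag rather than creating yet another backup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define NAME_LEN     32
#define NEW_IMAGE    "new.bin"      /* hypothetical file names */
#define BACKUP_IMAGE "backup.bin"

typedef struct {
    bool pending;                   /* true: flash `source` on next boot */
    char source[NAME_LEN];
} update_flag_t;

/* Host-side model of the device; strings stand in for image contents. */
typedef struct {
    char flash[NAME_LEN];           /* firmware in internal flash */
    char new_file[NAME_LEN];        /* contents of NEW_IMAGE */
    char backup_file[NAME_LEN];     /* contents of BACKUP_IMAGE */
    update_flag_t flag;
} device_t;

static void copy_str(char *dst, const char *src)
{
    snprintf(dst, NAME_LEN, "%s", src);
}

/* Application: called once the downloaded image has been verified. */
void app_request_update(device_t *dev)
{
    assert(dev != NULL);
    copy_str(dev->flag.source, NEW_IMAGE);
    dev->flag.pending = true;
}

/* New firmware: called after a successful init to keep itself installed. */
void app_confirm_boot(device_t *dev)
{
    assert(dev != NULL);
    dev->flag.pending = false;
}

/* Bootloader: runs on every reset before the application starts. */
void bootloader_run(device_t *dev)
{
    assert(dev != NULL);
    if (!dev->flag.pending) {
        return;                                   /* nothing to do */
    }
    if (strcmp(dev->flag.source, NEW_IMAGE) == 0) {
        copy_str(dev->backup_file, dev->flash);   /* 1. dump old firmware */
        copy_str(dev->flag.source, BACKUP_IMAGE); /* 2. arm the rollback */
        copy_str(dev->flash, dev->new_file);      /* 3. flash the update */
    } else {
        /* The flag still points at the backup, so the new firmware never
         * confirmed its boot: restore the old image (assumed: then clear). */
        copy_str(dev->flash, dev->backup_file);
        dev->flag.pending = false;
    }
}
```

The order of the three steps matters: the rollback is armed before the new image is written, so a reset in the middle of flashing still ends up restoring the backup.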
Of course this could be solved by appending the signature to the firmware image when flashing it to ROM, but so far there was no application to justify the added complexity.
Well, in my case the network is a two-to-four-wire serial bus, so TLS, CoAP etc. are not really an issue. (Also I don’t want to require external storage.) What is an issue is the bus bandwidth.
I’m not in the “I want the boot loader to be as small as possible” camp. If the boot loader fills 60% of the flash that’s perfectly fine as long as a decompressor+flasher plus a compressed new boot loader can be squeezed into the other 40%.
I’m thus re-using the boot loader as a shared library which I simply link the app to. One reason for doing this is that my serial bus handling is a bit complicated so that it’d be a waste of space to have the code in there twice.
I’ll probably handle this by uploading to mostly-identical devices in parallel.
It might be worth pointing out that we already do have riotboot variants that obtain-and-flash rather than select-the-image: riotboot-dfu (which I use a lot) and riotboot-serial (which I haven’t used yet). They could serve as templates with the one limitation that they still use riotboot’s split-image (although technically they’d be fine with a single application image).
I didn’t find in the riotboot-vfs PR whether that adds the possibility to have fewer than two application images or whether that’s something yet to be added (but we should have that, also for riotboot-dfu and -serial).