VM/interpreter syscall interface for bindings

The intention behind the descriptive syscalls would be precisely that: a VM-specific program derives a suitable ABI for the specified API, and expresses that transition in platform-specific code. The vm_push/pop was imprecise; its intention was already to be specific to the example VM (embedvm has a stack-based ABI, and would thus generate such calls).

The hedgedoc linked in the main post has more precision and better examples.

Limiting what can be in signatures is certainly a good thing. Trouble arises when accessing struct-based APIs like SAUL's phydat: one either sends whole structs through the API (thus necessitating a concept of structs, or dicts in uniffi parlance), or one provides accessors for their fields.
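For illustration, the two options could look roughly like the declarations below, assuming RIOT's phydat_t layout (three int16_t values plus unit and scale); the handle type and all vm_* names are invented for this sketch, not part of any existing binding.

```c
#include <stdint.h>
#include "phydat.h"   /* RIOT's phydat_t: int16_t val[3]; uint8_t unit; int8_t scale */

/* Option A: signatures may contain structs ("dicts" in uniffi terms), so a
 * whole phydat_t is copied across the syscall boundary. */
int vm_saul_read_struct(uint32_t dev_handle, phydat_t *out);       /* hypothetical */

/* Option B: the struct stays on the host side behind an opaque handle and
 * the VM reads it field by field through accessor syscalls. */
typedef uint32_t phydat_handle_t;                                   /* hypothetical */

int     vm_saul_read(uint32_t dev_handle, phydat_handle_t *out);    /* hypothetical */
int16_t vm_phydat_get_val(phydat_handle_t h, unsigned dim);         /* hypothetical */
uint8_t vm_phydat_get_unit(phydat_handle_t h);                      /* hypothetical */
int8_t  vm_phydat_get_scale(phydat_handle_t h);                     /* hypothetical */
```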

Limiting syscalls to N flavors sounds intriguing, but I fear N would become large; looking at the hedgedoc's list of system calls, I barely see two with compatible signatures (even considering all handles the same, which makes sense here). But this may only be because the list is intentionally broad.

Preferably, it would be possible to generate these bindings from a template for each VM. At least for rBPF, it would be trivial to generate the C call for the VM-specific parts from a YAML description containing the name, the arguments, and which function to call in RIOT.
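To make that concrete, here is a rough sketch of what such a YAML entry and the glue a generator could emit from it might look like. The YAML schema and the five-register hook signature are assumptions for illustration, not the conventions of RIOT's actual rBPF integration; only random_uint32_range() is a real RIOT function.

```c
#include <stdint.h>
#include "random.h"   /* RIOT: uint32_t random_uint32_range(uint32_t a, uint32_t b) */

/*
 * Hypothetical YAML entry such a generator could consume (illustrative only):
 *
 *   - name: random_uint32_range
 *     target: random_uint32_range
 *     args: [ {name: a, type: u32}, {name: b, type: u32} ]
 *     returns: u32
 *
 * And the C glue it could emit for rBPF: the VM registers arrive as plain
 * integers, get converted to the host types, and the result is passed back.
 */
uint32_t rbpf_syscall_random_uint32_range(uint64_t r1, uint64_t r2, uint64_t r3,
                                          uint64_t r4, uint64_t r5)
{
    (void)r3; (void)r4; (void)r5;
    return random_uint32_range((uint32_t)r1, (uint32_t)r2);
}
```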

To me the vm_push/pop sounds like pushing a lot of the syscall burden into a potentially slow VM, whereas with rBPF, syscalls can be used to speed up processing by calling 'native' code. Adding the vm_push/pop appears to me as extra overhead inside the VM that I'd rather avoid.

To me the vm_push/pop sounds like pushing a lot of the syscall burden

I hope I've said it before, but this thread is getting long: the vm_push was an example for a concrete VM; so whenever you read vm_push in earlier posts, please s/vm/embedvm/ (but AIUI I can't edit the old posts). It should be up to the VM glue generator whether arguments are passed in registers, in static memory, on the stack, or however else the VM works.

I've pushed the main bits of my PoC to a branch.

The current architecture consists of a generic call module that handles the complex bits of the interface: checking access and memory permissions, and requesting allocations for objects. The VM implementation provides a set of functions to check access and allocate objects, and the virt_syscall module uses these to enforce permissions. This way the error-prone aspects are grouped in a single module and can be shared by multiple VM implementations.

The VMs themselves only have to convert the VM-specific calling conventions into the host calling conventions and provide a way to check and allocate memory. This code is usually simple enough that it could be generated from a {JSON,YAML}-based description, as sketched below.
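As a very rough sketch of that split (all names invented here, not the API on the branch): the per-VM code supplies memory checking and object allocation, and the generic module builds its permission checks on top of those.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdbool.h>

/* Per-VM hooks the generic call module could rely on (illustrative names). */
typedef struct {
    /* Is [addr, addr+len) readable/writable from the VM program's view? */
    bool  (*mem_check)(void *vm, const void *addr, size_t len, bool write);
    /* Allocate VM-visible memory for a host object and return its VM address. */
    void *(*obj_alloc)(void *vm, size_t size);
} virt_syscall_ops_t;

/* The generic module validates a buffer argument once, for every VM,
 * before the host function is allowed to touch it. */
static inline int virt_syscall_check_buf(const virt_syscall_ops_t *ops, void *vm,
                                         const void *addr, size_t len, bool write)
{
    return ops->mem_check(vm, addr, len, write) ? 0 : -1;
}
```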

The memory allocator is required to prevent sharing memory between the VM application itself and the host RIOT side, as the two memory spaces might be completely different. The virt_syscall module requests memory and a handle for an object, and the handle is returned to the VM application. All access to the structs has to go through calls. The downside here is that all VMs need some sort of heap allocator in them, but most of the VMs already have one.
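A minimal sketch of what such a handle indirection could look like (again with illustrative names, not the branch's code): host pointers never cross into the VM; the VM only ever holds small integer handles and must use accessor syscalls to read the object.

```c
#include <stdint.h>
#include <stddef.h>

#define VIRT_HANDLE_MAX 8

static void *handle_table[VIRT_HANDLE_MAX];

/* Store a host object and hand a handle back to the VM (0 = failure). */
static uint32_t virt_handle_create(void *obj)
{
    for (unsigned i = 0; i < VIRT_HANDLE_MAX; i++) {
        if (handle_table[i] == NULL) {
            handle_table[i] = obj;
            return i + 1;          /* reserve 0 as the invalid handle */
        }
    }
    return 0;
}

/* Accessor syscalls resolve the handle back to the host object. */
static void *virt_handle_get(uint32_t handle)
{
    if (handle == 0 || handle > VIRT_HANDLE_MAX) {
        return NULL;
    }
    return handle_table[handle - 1];
}
```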

So far the overall code looks simple enough and adaptable between different VM implementations.

Currently I have rough bindings for:

  • Random
  • SAUL
  • CoAP

For both WAMR and rBPF.