True static partitioning with Xen Dom0-less
The Xen Project hypervisor has relied on a special virtual machine, Dom0, to perform privileged operations since the early days of the project. Dom0 has always been the very first environment to come up at boot time, providing a familiar Linux command line interface for the users. With Dom0-less, we introduced a radical new way of using the Xen hypervisor aimed at embedded Arm systems: it makes it possible to have a Xen deployment with multiple virtual machines, without any Dom0, and without any Xen userspace tools.
The main reason for using Xen Dom0-less is to shorten the boot time of critical applications. In a traditional Xen and Dom0-based system, one has to wait for Dom0 to be fully booted before creating any other virtual machines. Xen was originally written with the datacenter in mind, where short boot times are nice to have but not a strict requirement. Without Dom0-less, in the best of cases, it takes several seconds to boot a VM, measured from system bring-up: Xen needs to boot first, then the Dom0 kernel, then Dom0 userspace; only then does `xl` become available so that other VMs can be started. At Xilinx, we focus on embedded deployments: often, our users need to have several VMs up and running in less than a second, which is an order of magnitude less than what we were able to do until now.
With Dom0-less, Xen boots selected VMs in parallel on different physical CPU cores directly from the hypervisor at boot time. That means that a VM carrying a motor control application only has to wait for Xen to be ready before starting. This drastically reduces boot times: booting in less than a second is certainly possible now.
Hardware resources can be directly assigned to Dom0-less VMs. Xen Dom0-less is a natural fit for static partitioning, where a user splits the platform into multiple isolated domains and runs a different operating system on each domain. The configuration (both the domains to boot and the device assignments) is done via device tree. U-Boot is responsible for loading all the required binaries, such as the domains' kernels and ramdisks. It is possible to issue the U-Boot commands and add the device tree configuration by hand, but it is best to have it done automatically for you. A set of tools recently published by the community (see gitlab.com/ViryaOS/imagebuilder and wiki.xenproject.org/wiki/ImageBuilder) can be used to generate a U-Boot script that loads everything necessary and modifies an existing device tree automatically at boot time, so that the system boots under Xen without any manual intervention. These tools make the task of configuring a Xen Dom0-less system quick and easy.
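To make the device tree configuration more concrete, here is a minimal sketch in Python that prints the kind of node Xen looks for under /chosen when it creates a domain by itself. The property names ("xen,domain", memory, cpus, vpl011, "multiboot,kernel", "multiboot,ramdisk") follow my reading of the Xen document docs/misc/arm/device-tree/booting.txt, so please check them against the documentation for your release. The DomU class, the helper functions, and the load addresses are hypothetical and only serve the illustration. This is not how ImageBuilder is implemented.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class DomU:
    """Hypothetical description of one domain created directly by Xen."""
    name: str
    memory_kb: int                 # guest RAM, in kilobytes
    vcpus: int
    kernel_addr: int               # address where U-Boot loaded the kernel
    kernel_size: int
    bootargs: str = ""
    ramdisk_addr: Optional[int] = None
    ramdisk_size: int = 0
    vpl011: bool = True            # expose an emulated PL011 UART to the guest


def cells64(value: int) -> str:
    """Split a 64-bit value into two 32-bit device tree cells."""
    high = value >> 32
    low = value & 0xFFFFFFFF
    return f"{high:#x} {low:#x}"


def module(kind: str, addr: int, size: int, bootargs: str = "") -> list[str]:
    """Render a kernel or ramdisk sub-node (2 address cells, 1 size cell)."""
    lines = [
        f"        module@{addr:x} {{",
        f'            compatible = "multiboot,{kind}", "multiboot,module";',
        f"            reg = <{cells64(addr)} {size:#x}>;",
    ]
    if bootargs:
        lines.append(f'            bootargs = "{bootargs}";')
    lines.append("        };")
    return lines


def render_domu(d: DomU) -> str:
    """Render the node that would sit under /chosen for one domain."""
    lines = [
        f"    {d.name} {{",
        '        compatible = "xen,domain";',
        "        #address-cells = <0x2>;",
        "        #size-cells = <0x1>;",
        f"        memory = <{cells64(d.memory_kb)}>;",
        f"        cpus = <{d.vcpus}>;",
    ]
    if d.vpl011:
        lines.append("        vpl011;")
    lines += module("kernel", d.kernel_addr, d.kernel_size, d.bootargs)
    if d.ramdisk_addr is not None:
        lines += module("ramdisk", d.ramdisk_addr, d.ramdisk_size)
    lines.append("    };")
    return "\n".join(lines)


if __name__ == "__main__":
    # Example values only: 128 MB of RAM, one vCPU, made-up load addresses.
    print(render_domu(DomU(
        name="domU1",
        memory_kb=131072,
        vcpus=1,
        kernel_addr=0x4A000000,
        kernel_size=0x1000000,
        bootargs="console=ttyAMA0",
        ramdisk_addr=0x4B000000,
        ramdisk_size=0x800000,
    )))
```

Each additional domain would get its own node of this shape, and the addresses in the reg properties have to match where U-Boot actually loaded the binaries. Keeping those two in sync is the bookkeeping that a generated boot script takes off your hands.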
Despite the name “Dom0-less”, Xen still starts a Dom0 VM at boot, but the hypervisor creates additional VMs in parallel without any interaction with or help from Dom0. Having a Dom0 environment can be convenient because it allows users to monitor the system, start/stop additional VMs, reboot the platform, etc. However, since Dom0 is not required to start or run VMs, it becomes possible to get rid of it, leading to “true Dom0-less” systems where only regular unprivileged guests are running. True Dom0-less is desirable for security reasons, to reduce the attack surface, to build a system without any privileged VMs, or simply to save on resource utilization.
One interesting side effect of not having Dom0 is that the Xen userspace tools are not necessary anymore. Especially in cross-building environments, it is not easy to compile the Xen tools: typically a fully-featured build system such as Yocto is required. With Dom0-less, only the Xen hypervisor binary is necessary to start multiple VMs. Thus, users just build the hypervisor binary, which takes far less time and can be accomplished with a single `make`. In addition to shortening the actual boot time, Dom0-less also reduces the overall time to build, configure, and assemble a multi-domain system.
Dom0-less, including device assignment, is fully upstream in the upcoming Xen v4.13 release. User feedback has started to come in and has been very positive so far. I want to encourage readers to give it a try and explore this new way of bringing virtualization to embedded systems.