Mixed-Criticality Real-time Systems
Aim
Develop a trustworthy platform for building mixed-criticality real-time systems.
Overview
Many trusted systems (i.e. systems whose safety, security or reliability we trust) must deliver a response to an external stimulus within a certain time period, and are therefore critical real-time systems. These include cyber-physical systems, such as medical implants (e.g. pacemakers), avionic systems and industrial controllers. A failure to perform a computation or operation within a given time may have catastrophic consequences in these environments.
Such systems are increasingly complex. An example are wearable or implanted medical devices, which combine life-supporting functionality with user interfaces to allow monitoring by the patient or network drivers and protocol stacks for remote monitoring. Other devices contain multiple real-time functions with different levels of criticality, such as automotive safety and control systems.
Subsystems of different criticality must be strongly isolated, in order to prevent malfunction of a less critical function unduly affecting a highly-critical one. This isolation is often achieved by physical separation, however, this is frequently impossible, due to space and energy constraints (e.g. in medical implants) or where the number of different functions is becoming too large (e.g. in cars).
The figure on the left shows a representative example from a cyberphysical system, such as an autonomous vehicle. The critical subsystem is the flight control, which operates at the time frames of the physical world, meaning it has deadlines in the millisecond range. But the system also has an internal network (eg Ethernet) that requires servicing at much tighter time frames (microseconds).
To avoid packet loss, the network driver must be able to preempt the controller. The driver is too complex to be trustworthy, hence the system must provide mechanisms to prevent the high-priority driver from monopolising the processor.
Trustworthiness of such a system can only be achieved if strong functional as well as temporal isolation (with tightly-controlled communication) can be guaranteed between subsystems. This puts stringent requirements on the underlying operating system, whose job it is to provide this isolation. Our real-time systems work aims at providing this foundation for trustworthiness. We focus primarily on the seL4 microkernel, as its formal verification already eliminates many sources of failure. Specifically, we focus on providing kernel mechanisms for enforcing temporal isolation.
Kernel mechanisms for real-time systems
We address the MCS challenge by evolving the seL4 API to treat time as a first-class resource. Specifically, we introduce a notion of scheduling contexts, which are capability-protected kernel objects that represent the right to access the CPU and that can get passed along IPC channels. A thread can only execute if it is associated with a scheduling context.
Scheduling contexts define a thread's time budget and a period, and they guarantee that over the period the thread cannot consume more than its budget. This allows the construction of responsive systems, where less critical but latency-sensitive threads can preempt critical threads operating at longer time scales, while still guaranteeing that the critical threads meet their deadlines.
In the example shown in the figure, the driver is given a high priority but a limited budget of 2Μs every 3Μs, which limits its maximum CPU bandwidth to 67%. The control system is given a budget of 25ms every 100 ms. Provided the WCET of the controller is no more than 25ms, this will guarantee that it meets its deadline, no matter what the driver does. At the same time, the driver, as long is it not malfunctioning, will be able to handle all network traffic.
The ability to guarantee deadlines can even be maintained where threads of different criticalities share a service, time-out exceptions allow recovering from misbehaving low-criticality threads.
The new MCS variant of seL4 is presently undergoing verification. As it is backward compatible with the previously verified base seL4 kernel, it will become the new standard once its verification is complete.
Related TS Research
The MCS kernel supports combining highly critical with less critical components irrespective of scheduling policies. In order to conduct a schedulability analysis, it needs knowlege of the kernel's worst-case execution-times. Determining those is the aim of the WCET Analysis project.
People
Past
|
Contact
Gernot Heiser, gernot@unsw.edu.au
Publications
|
Gernot Heiser The formally verified seL4 microkernel – a high-assurance foundation for MCS Keynote at IEEE Conference on Embedded and Real-Time Computing and Applications, August, 2020 | |
|
Gernot Heiser Verified seL4 on secure RISC-V processors at linux.conf.au, Gold Coast, January, 2020 | |
Anna Lyons Mixed-criticality scheduling and resource sharing for high-assurance operating systems PhD Thesis, UNSW, August, 2018 | ||
Anna Lyons, Kent Mcleod, Hesham Almatary and Gernot Heiser Scheduling-context capabilities: A principled, light-weight OS mechanism for managing time EuroSys Conference, Porto, Portugal, April, 2018 | ||
Gernot Heiser For safety's sake: we need a new hardware-software contract! IEEE Design and Test, Volume 35, Issue 2, pp. 27-30, March, 2018 | ||
|
Gernot Heiser Flying autonomous aircraft: Mixed-criticality support in seL4 at linux.conf.au, Sydney, January, 2018 | |
|
Anna Lyons and Gernot Heiser Mixed-criticality support in a high-assurance, general-purpose microkernel Workshop on Mixed Criticality Systems, pp. 9–14, Rome, Italy, December, 2014 | |
Stefan M. Petters, Martin Lawitzky, Ryan Heffernan and Kevin Elphinstone Towards real multi-criticality scheduling IEEE Conference on Embedded and Real-Time Computing and Applications, pp. 155–164, Beijing, China, August, 2009 | ||
Gernot Heiser Hypervisors for consumer electronics IEEE Consumer Communications and Networking Conference, pp. 1–5, Las Vegas, NV, USA, January, 2009 |