Mixed-criticality scheduling and resource sharing for high-assurance operating systems
Authors
Anna Lyons
CSIRO's Data61, Australia
UNSW, Australia
Abstract
The criticality of a software system refers to the severity of the impact of its failure. In a high-criticality system, failure risks significant loss of life or damage to the environment. In a low-criticality system, failure may risk a downgrade in user experience. As the criticality of a software system increases, so too does the cost and time to develop that software: raising the criticality also raises the assurance level, with the highest levels requiring extensive, expensive, independent certification. For modern cyber-physical systems, including autonomous aircraft and other vehicles, the traditional approach of isolating systems of different criticality on completely separate physical hardware is no longer practical, being both restrictive and inefficient. The result is mixed-criticality systems, where software applications with different criticalities execute on the same hardware. Mechanisms are required to ensure that software in mixed-criticality systems is sufficiently isolated; otherwise, all software on that hardware is promoted to the highest criticality level, driving up costs to impractical levels. For mixed-criticality systems to be viable, both spatial and temporal isolation are required. Current aviation standards allow for mixed-criticality systems where resources are strictly and statically partitioned in time and space, allowing some improvement over fully isolated hardware. However, further improvements are not only possible, but required for future innovation in cyber-physical systems. This thesis explores further operating system mechanisms to allow mixed-criticality software to share resources in far less restrictive ways, opening further possibilities in cyber-physical system design without sacrificing assurance properties.
Two key properties are required. First, time must be managed as a central resource of the system, while allowing for overbooking with asymmetric protection without increasing certification burdens. Second, components of different criticalities should be able to safely share resources without suffering undue utilisation penalties. We present a model for capability-controlled access to processing time, including processing time in shared resources, that neither incurs overhead-related capacity loss nor restricts user policy. This is achieved by combining the idea of resource reservations, from resource kernels, with the concept of resource overbooking, which is central to policy freedom. The result is the core mechanisms of scheduling contexts, scheduling context donation over IPC, and timeout exceptions, which allow system designers to designate their own resource allocation policies. We follow with an implementation of the model in the seL4 microkernel, a high-assurance, high-performance platform. Our final contribution is a thorough evaluation, including microbenchmarks, whole-system benchmarks, and isolation benchmarks. The micro- and system benchmarks show that our implementation has low overhead, while the user-level scheduling benchmark demonstrates that scheduling policy freedom is retained. Finally, our isolation benchmarks show that our processor temporal isolation mechanisms are effective, even in the case of shared resources.
BibTeX Entry
@phdthesis{Lyons:phd,
  author   = {Anna Lyons},
  month    = aug,
  note     = {Available from publications page at \url{http://ts.data61.csiro.au/}},
  paperurl = {https://trustworthy.systems/publications/papers/Lyons%3Aphd.pdf},
  school   = {UNSW},
  title    = {Mixed-Criticality Scheduling and Resource Sharing for High-Assurance Operating Systems},
  year     = {2018}
}