

# **How to Build Truly Trustworthy Systems**

**Gernot Heiser NICTA and University of New South Wales** Sydney, Australia



Australian Government

**Department of Broadband, Communications** and the Digital Economy

**Australian Research Council** 





















THE UNIVERSITY OF QUEENSLAND AUSTRALIA

#### Windows

An exception 06 has occured at 0028:C11B3ADC in VxD DiskTSD(03) + 00001660. This was called from 0028:C11B40C8 in VxD voltrack(04) + 00000000. It may be possible to continue normally.

Press any key to attempt to continue.

 Press CTRL+ALT+RESET to restart your computer. You will lose any unsaved information in all applications.

Press any key to continue

#### Present Systems are NOT Trustworthy!













Corollary [with apologies to Dijkstra]:

Testing, code inspection, etc. can only show *lack of trustworthiness*!





#### **Dealing with Complexity: Physical Isolation**



#### **How About Logical Isolation?**





### Isolation is Key!

Identify, minimise and

isolate critical





©2012 Gernot Heiser NICTA






### **NICTA Trustworthy Systems Agenda**



- 1. Dependable microkernel (seL4) as a rock-solid base
  - Formal specification of functionality
  - Proof of functional correctness of implementation
  - Proof of safety/security properties
- 2. Lift microkernel guarantees to whole system
  - Use kernel correctness and integrity to guarantee critical functionality
  - Ensure correctness of balance of trusted computing base
  - Prove dependability properties of complete system
    - despite 99 % of code untrusted!





### Agenda

- Motivation
- What is a microkernel, and what is L4?
- seL4 – designed for trustworthiness
- Establishing trustworthiness
- From kernel to system
- Sample system 1: Secure access controller
- Sample system 2: RapiLog

### **Monolithic Kernels vs Microkernels**

- Idea of microkernel:
  - Flexible, minimal platform, extensible
  - Mechanisms, not policies
  - Goes back to Nucleus [Brinch Hansen, CACM'70]







#### **First generation**

- E.g. Mach ('87)
- In-kernel functionality: memory objects; low-level FS and swapping; devices; kernel memory; scheduling; IPC and MMU abstraction
- 180 syscalls
- 100 kLOC
- 100 µs IPC

#### Second generation

- E.g. L4 ('95)
- In-kernel functionality: kernel memory; scheduling; IPC and MMU abstraction
- ~7 syscalls
- ~10 kLOC
- ~1 µs IPC

#### **Third generation**

- seL4 ('09)
- ~3 syscalls
- 9 kLOC
- < 1 µs IPC

#### 2<sup>nd</sup>-Generation Microkernels



- 1<sup>st</sup>-generation kernels (Mach, Chorus) were a failure
  - Complex, inflexible, slow
- L4 was first 2<sup>nd</sup>-G microkernel [Liedtke, SOSP'93, SOSP'95]
  - Radical simplification & manual micro-optimisation, fast IPC

"A concept is tolerated inside the microkernel only if moving it outside the kernel, i.e. permitting competing implementations, would prevent the implementation of the system's required functionality." [Liedtke, SOSP'95]

- Family of L4 kernels:
  - Original GMD assembler kernel ('95)
  - Fiasco (Dresden '98), Hazelnut (Karlsruhe '99), Pistachio (Karlsruhe/UNSW '02), L4-embedded (NICTA '04)
    - L4-embedded commercialised as OKL4 by Open Kernel Labs
    - Deployed in >1.5 billion phones
  - Commercial clones (PikeOS, P4, CodeZero, ...)
  - Approach adopted e.g. in QNX ('82) and Green Hills Integrity ('90s)

### **Microkernel Principles: Minimality**

Strict adherence to minimality leads to a very small kernel

#### Advantages:

- Easy to implement, port?
  - in practice limited architecture-specific micro-optimization
- Less code to optimise
- Hopefully enables a minimal *trusted computing base* (TCB)
  - small attack surface, fewer failure modes
- Easier to debug, maybe even *prove* correct?

#### Challenges:

- API design: generality with small code base
- Kernel design and implementation for high performance
  - ... and correctness!



### **Consequence of Minimality: User-level Services**





- Kernel provides no services, only mechanisms
- Strongly dependent on fast IPC and exception handling

#### **Microkernel Principles: Policy Freedom**





#### **Policies limit**

- May be good for many cases, but always bad for some
- Example: disk pre-fetching

#### "General" policies lead to bloat

- Implementing combination of policies
- Try to determine most appropriate one at run-time



- Kernel determines layout, knows executable format, allocates stack
  - limits ability to import from other OSes
  - cannot change layout
    - small non-overlapping address spaces beneficial on some archs
  - kernel loads apps, sets up mappings, allocates stack
    - requires file system in kernel or interfaced to kernel
    - bookkeeping for revocation & resource management
    - heavyweight processes
  - memory-mapped file API

## **Policy-Free Address-Space Management**





- mapping may be side effect of IPC
- kernel may expose data structure
- kernel mechanism for forwarding page-fault exception
- "External pagers" first appeared in Mach [Rashid et al, '88]
  - ... but were optional
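The split between kernel mechanism and user-level policy can be illustrated with a minimal sketch. It is a toy model, not any real L4 or Mach API: the names `Kernel`, `Pager`, `create_space` and `touch` are hypothetical, and the "first free frame" policy is an assumption chosen for brevity. The kernel only stores mappings and forwards page faults; the pager decides which frame backs a page.

```python
# Toy model of policy-free address-space management (all names hypothetical).
# The kernel owns the mechanism (page table, fault forwarding);
# the user-level pager owns the policy (which frame backs which page).

class Pager:
    """User-level pager: decides how to resolve a fault."""

    def __init__(self, free_frames):
        self.free_frames = list(free_frames)
        self.faults_handled = 0

    def handle_fault(self, space_id, page):
        # Policy lives here: this pager simply hands out the next free frame.
        self.faults_handled += 1
        if not self.free_frames:
            raise MemoryError("pager has no frames left")
        return self.free_frames.pop(0)


class Kernel:
    """Kernel: stores mappings and forwards faults, nothing else."""

    def __init__(self):
        self.page_tables = {}   # space_id -> {page: frame}
        self.pagers = {}        # space_id -> Pager

    def create_space(self, space_id, pager):
        self.page_tables[space_id] = {}
        self.pagers[space_id] = pager

    def touch(self, space_id, page):
        """Simulate a memory access; returns the frame that backs the page."""
        table = self.page_tables[space_id]
        if page not in table:
            # Page fault: forward it to the pager as a message, then install
            # the mapping the pager replies with.
            frame = self.pagers[space_id].handle_fault(space_id, page)
            table[page] = frame
        return table[page]


kernel = Kernel()
pager = Pager(free_frames=[100, 101, 102])
kernel.create_space("app", pager)

first = kernel.touch("app", page=7)    # faults, pager supplies a frame
again = kernel.touch("app", page=7)    # already mapped, no fault
other = kernel.touch("app", page=8)    # faults again
```

Replacing `Pager.handle_fault` changes the paging policy without touching the kernel class, which is the point of keeping policy out of the kernel.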

### What Mechanisms?



- Fundamentally, the microkernel must abstract
  - Physical memory
  - CPU
  - Interrupts/Exceptions
- Unfettered access to any of these bypasses security
  - No further abstraction needed for devices
    - memory-mapping device registers and interrupt abstraction suffices
    - ...but some generalised memory abstraction needed for I/O space
- Above isolates execution units, hence microkernel must also provide
  - Communication (traditionally referred to as IPC)
  - Synchronization



#### Traditional hypervisor vs microkernel abstractions

| Resource        | Hypervisor         | Microkernel                    |
|-----------------|--------------------|--------------------------------|
| Memory          | Virtual MMU (vMMU) | Address space                  |
| CPU             | Virtual CPU (vCPU) | Thread or scheduler activation |
| Interrupt       | Virtual IRQ (vIRQ) | IPC message or signal          |
| Communication   | Virtual NIC        | Message-passing IPC            |
| Synchronization | Virtual IRQ        | IPC message                    |

### **Issues of 2G L4 Kernels**



- L4 solved performance issue [Härtig et al, SOSP'97]
  - ... but left a number of security issues unsolved
- Problem: ad-hoc approach to protection and resource management
  - Global thread name space  $\Rightarrow$  covert channels
  - Threads as IPC targets  $\Rightarrow$  insufficient encapsulation
  - Single kernel memory pool  $\Rightarrow$  DoS attacks
  - Insufficient delegation of authority  $\Rightarrow$  limited flexibility, performance
- Addressed by seL4
  - Designed to support safety- and security-critical systems

### Agenda



- Motivation
- What is a microkernel, and what is L4?
- seL4 designed for trustworthiness
- Establishing trustworthiness
- From kernel to system
- Sample system 1: Secure access controller
- Sample system 2: RapiLog



#### seL4 Design Goals





#### **Fundamental Design Decisions for seL4**



- 1. Memory management is user-level responsibility
  - Kernel never allocates memory (post-boot)
  - Kernel objects controlled by user-mode servers
- 2. Memory management is fully delegatable
  - Supports hierarchical system design
  - Enabled by capability-based access control
- 3. "Incremental consistency" design pattern
  - Fast transitions between consistent states
  - Restartable operations with progress guarantee
- 4. No concurrency in the kernel
  - Interrupts never enabled in kernel
  - Interruption points to bound latencies
  - Clustered multikernel design for multicores

Design goals addressed: isolation, performance, real-time.
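The "incremental consistency" pattern with restartable operations can be sketched as follows. This is a toy model under stated assumptions, not seL4 code: `delete_all`, `run_with_interrupts` and the `interrupt_every` parameter are hypothetical names, and the simulated interrupt source is invented for illustration. A long-running operation is split into short steps, each of which leaves the state consistent, so the operation can stop at a preemption point and be restarted later.

```python
# Toy model of the "incremental consistency" pattern (names hypothetical).
# A long-running operation is split into short steps; after each step the
# state is consistent, so the operation can be abandoned at a preemption
# point and simply restarted later without losing progress.

def delete_all(objects, interrupt_pending):
    """Delete objects one at a time.

    Returns "done" when the list is empty, or "preempted" if an interrupt
    became pending at a preemption point. Every call removes at least one
    object (if any remain), which gives the progress guarantee.
    """
    while objects:
        objects.pop()                 # one short step, state stays consistent
        if objects and interrupt_pending():
            return "preempted"        # preemption point: stop, restart later
    return "done"


def run_with_interrupts(objects, interrupt_every):
    """Drive delete_all to completion, counting how often it was restarted."""
    steps = {"n": 0}

    def interrupt_pending():
        # Simulated interrupt source: fires on every n-th poll.
        steps["n"] += 1
        return steps["n"] % interrupt_every == 0

    restarts = 0
    while delete_all(objects, interrupt_pending) == "preempted":
        restarts += 1                 # the interrupt would be handled here
    return restarts


pending = list(range(10))
restarts = run_with_interrupts(pending, interrupt_every=3)
```

Because each call makes progress before it can be preempted, the loop terminates however often interrupts arrive, and interrupt latency is bounded by the length of one step rather than by the whole operation.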



### seL4 User-Level Memory Management



#### seL4 Memory Management Mechanics: Retype





### **Example: Destroying IPC Endpoint**





### **Difficult Example: Revoking IPC "Badge"**





#### **Approaches for Multicore Kernels**







| Property              | Big lock    | Fine-grained locking | Multikernel |
|-----------------------|-------------|----------------------|-------------|
| Data structures       | shared      | shared               | distributed |
| Scalability           | poor        | good                 | excellent   |
| Concurrency in kernel | zero        | high                 | zero        |
| Kernel complexity     | low         | high                 | low         |
| Resource management   | centralised | centralised          | distributed |

# **Microkernel Principle: Policy Freedom**

![](_page_37_Picture_1.jpeg)

Kernel must not dictate policy

Kernel must not introduce avoidable overhead

![](_page_37_Figure_4.jpeg)

### **Performance of Big Kernel Lock**

![](_page_38_Picture_1.jpeg)

![](_page_38_Figure_2.jpeg)

# **Resulting Design: Clustered Multikernel**

![](_page_39_Picture_1.jpeg)

![](_page_39_Figure_2.jpeg)

#### L3 cache / Main memory

# Agenda


- Motivation
- What is a microkernel, and what is L4?
- seL4 designed for trustworthiness
- Establishing trustworthiness
- From kernel to system
- Sample system 1: Secure access controller
- Sample system 2: RapiLog

![](_page_41_Figure_0.jpeg)

## **Proving Functional Correctness**

![](_page_42_Picture_1.jpeg)

![](_page_42_Figure_2.jpeg)

[Figure: overlapping code excerpts from the layers of the proof: the Isabelle/HOL abstract specification (`datatype rights = Read | Write | Grant | Create`, `record cap`, `schedule :: "unit s_monad"`, `lemma isolation`), the Haskell executable model (`schedule :: Kernel ()`), and the C implementation (`void setPriority(tcb_t *tptr, prio_t prio)`).]

# Why So Long for 9,000 LOC?

![](_page_44_Picture_1.jpeg)

![](_page_44_Figure_2.jpeg)

![](_page_45_Picture_1.jpeg)

| Activity                | Effort (py = person-years) |
|-------------------------|----------------------------|
| Haskell design          | 2 py                       |
| C implementation        | 2 weeks                    |
| Debugging/Testing       | 2 months                   |
| Kernel verification     | 12 py                      |
| Formal frameworks       | 10 py                      |
| Total                   | 25 py                      |
|                         |                            |
| Repeat (estimated)      | 6 py                       |
| Traditional engineering | 4–6 py                     |

#### Did you find bugs???

- During (very shallow) testing: 16
- During verification: 460
  - 160 in C, ~150 in design, ~150 in spec

#### Kinds of properties proved

- Behaviour of C code is fully captured by abstract model
- Behaviour of C code is fully captured by executable model
- Kernel never fails, behaviour is always well-defined
  - assertions never fail
  - will never de-reference null pointer
  - cannot be subverted by malformed input
- All syscalls terminate, reclaiming memory is safe, ...
- Well typed references, aligned objects, kernel always mapped...
- Access control is decidable

Can prove further properties on abstract level!

![](_page_46_Picture_14.jpeg)

![](_page_47_Figure_0.jpeg)

# **Integrity: Limiting Write Access**

![](_page_48_Picture_1.jpeg)

![](_page_48_Figure_2.jpeg)

#### To prove:

- Domain-1 doesn't have write *capabilities* to Domain-2 objects
   ⇒ no action of Domain-1 agents will modify Domain-2 state
- Specifically, *kernel does not modify on Domain-1's behalf!* 
  - Prove kernel only allows write upon capability presentation
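The shape of this integrity property can be sketched on a toy model. It is an illustration only, not the seL4 proof or API: `Cap`, `ToyKernel` and the object names are hypothetical, and the check below is an exhaustive test over a tiny example rather than a theorem. The kernel performs a write only when the caller presents a capability for that object carrying the Write right, so a domain without such a capability cannot change the other domain's state.

```python
# Toy model of capability-checked writes (all names hypothetical).
# An agent can only write an object if it presents a capability
# with the Write right for that object.

from dataclasses import dataclass


@dataclass(frozen=True)
class Cap:
    obj: str
    rights: frozenset   # subset of {"Read", "Write", "Grant"}


class ToyKernel:
    def __init__(self, objects):
        self.state = dict(objects)          # object name -> value

    def write(self, cap, obj, value):
        """Perform the write only upon presentation of a matching Write cap."""
        if cap.obj != obj or "Write" not in cap.rights:
            return False                    # refused: state is untouched
        self.state[obj] = value
        return True


kernel = ToyKernel({"d1_buf": 0, "d2_buf": 0})

# Domain-1 holds a write cap to its own buffer and only a read cap
# to Domain-2's buffer.
domain1_caps = [
    Cap("d1_buf", frozenset({"Read", "Write"})),
    Cap("d2_buf", frozenset({"Read"})),
]

# Let Domain-1 try every capability it holds against every object.
before = kernel.state["d2_buf"]
results = [
    kernel.write(cap, obj, 42)
    for cap in domain1_caps
    for obj in ("d1_buf", "d2_buf")
]
after = kernel.state["d2_buf"]
```

Only the attempt that pairs the Write capability with its own object succeeds; Domain-2's buffer is unchanged whatever Domain-1 tries.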

![](_page_49_Figure_0.jpeg)

# **Availability: Ensuring Resource Access**

![](_page_50_Picture_1.jpeg)

![](_page_50_Figure_2.jpeg)

- Strict separation of kernel resources
  - $\Rightarrow$  agent cannot deny access to another domain's resources

![](_page_51_Figure_0.jpeg)

![](_page_52_Figure_0.jpeg)

#### To prove:

Domain-1 doesn't have read capabilities to Domain-2 objects
 ⇒ no action of any agents will reveal Domain-2 state to Domain-1

#### **Non-interference proof in progress:**

- Evolution of Domain-1 does not depend on Domain-2 state
- Presently covers only overt information flow
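What "evolution of Domain-1 does not depend on Domain-2 state" means can be shown on a toy state machine. This is a sketch under heavy assumptions: the two-variable state, the step functions and the helper `noninterfering` are all hypothetical, and the check is an exhaustive test over a four-value state space, whereas the actual result is a proof about the kernel. For every schedule and every Domain-1 starting state, the final Domain-1 state must be the same regardless of the Domain-2 starting state.

```python
# Toy non-interference check (all names hypothetical; exhaustive over a tiny
# state space, whereas the real result is a theorem about the kernel).

from itertools import product


def secure_step(state, who):
    d1, d2 = state
    if who == 1:
        return ((d1 + 1) % 4, d2)          # Domain-1 evolves from its own state
    return (d1, (d2 * 2 + 1) % 4)          # Domain-2 evolves from its own state


def leaky_step(state, who):
    d1, d2 = state
    if who == 1:
        return ((d1 + d2) % 4, d2)         # Domain-1 reads Domain-2 state: a leak
    return (d1, (d2 * 2 + 1) % 4)


def run(step, state, schedule):
    for who in schedule:
        state = step(state, who)
    return state


def noninterfering(step, schedule_len=3):
    """True if Domain-1's final state never depends on Domain-2's initial state."""
    for schedule in product((1, 2), repeat=schedule_len):
        for d1 in range(4):
            outcomes = {run(step, (d1, d2), schedule)[0] for d2 in range(4)}
            if len(outcomes) > 1:
                return False
    return True


secure_ok = noninterfering(secure_step)
leaky_ok = noninterfering(leaky_step)
```

Like the proof status described above, this check only captures overt flows through the state; timing and other covert channels are outside the model.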

![](_page_53_Figure_0.jpeg)

# **Timeliness**

![](_page_54_Figure_1.jpeg)

![](_page_55_Figure_1.jpeg)

## Result

![](_page_56_Picture_1.jpeg)

![](_page_56_Figure_2.jpeg)

- WCET presently limited by verification practicalities
- 10 µs seem achievable

![](_page_57_Figure_0.jpeg)

![](_page_58_Figure_0.jpeg)

## **Proving seL4 Trustworthiness**

![](_page_59_Picture_1.jpeg)

![](_page_59_Figure_2.jpeg)

## seL4 – the Next 24 Months

![](_page_60_Picture_1.jpeg)

![](_page_60_Figure_2.jpeg)


UPMARC SS, June'12

![](_page_61_Figure_0.jpeg)

# **Multikernel Verification**

![](_page_62_Picture_1.jpeg)

- By definition, multikernel images execute independently
  - except for explicit messaging

![](_page_62_Figure_4.jpeg)

- To prove:
  - isolated images are initialised correctly
  - images maintain isolation at run time

Essentially noninterference

# Agenda

![](_page_63_Picture_2.jpeg)

- Motivation
- What is a microkernel, and what is L4?
- seL4 designed for trustworthiness
- Establishing trustworthiness
- From kernel to system
- Sample system 1: Secure access controller
- Sample system 2: RapiLog

# **Phase Two: Full-System Guarantees**

![](_page_64_Picture_1.jpeg)

 Achieved: Verification of microkernel (8,700 LOC)

 Next step: Guarantees for real-world systems (1,000,000 LOC)

![](_page_64_Picture_4.jpeg)

## **Overview of Approach**

![](_page_65_Picture_1.jpeg)

![](_page_65_Figure_2.jpeg)

- Build system with minimal TCB
- Formalize and prove security properties about architecture
- Prove correctness of trusted components
- Prove correctness of setup
- Prove temporal properties (isolation, WCET, ...)
- Maintain performance

# **Specifying Security Architecture**

![](_page_66_Figure_1.jpeg)

#### **Device Drivers**

![](_page_67_Picture_1.jpeg)

![](_page_67_Figure_2.jpeg)

#### **Driver Development**

![](_page_68_Picture_1.jpeg)

![](_page_68_Figure_2.jpeg)

#### **Driver Development**

![](_page_69_Picture_1.jpeg)

![](_page_69_Figure_2.jpeg)

# **Driver Synthesis as Controller Synthesis**

- OS requests = control objective
  - e.g. `send()`: send a network packet
- Driver = controller
- Device: packet has been sent

## Synthesis Algorithm (Main Idea)

![](_page_71_Picture_1.jpeg)

![](_page_71_Figure_2.jpeg)
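The slide's figure is not recoverable here, so the following is a sketch of the standard backward fixpoint for two-player reachability games, which is one common way to do controller synthesis and is assumed rather than taken from the talk. The device model, state names, action names and the function `synthesise` are all hypothetical. The driver picks an action, the device picks any of the possible outcomes, and a state is winning if some action leads only to winning states.

```python
# Toy controller synthesis as a reachability game (all names hypothetical).
# trans[state][action] is the set of states the device may move to when the
# driver performs that action; the device resolves the nondeterminism.

def synthesise(trans, goal):
    """Return (winning states, strategy) for reaching `goal`.

    A state is added to the winning set once some driver action exists whose
    every possible device outcome is already winning (backward fixpoint).
    """
    winning = set(goal)
    strategy = {}
    changed = True
    while changed:
        changed = False
        for state, actions in trans.items():
            if state in winning:
                continue
            for action, outcomes in sorted(actions.items()):
                if outcomes and outcomes <= winning:
                    winning.add(state)
                    strategy[state] = action
                    changed = True
                    break
    return winning, strategy


# Hypothetical network-device model for the objective "packet has been sent".
device = {
    "idle":    {"load_fifo": {"loaded"}, "start_tx": {"error"}},
    "loaded":  {"start_tx": {"sending"}, "poll": {"loaded", "error"}},
    "sending": {"wait_irq": {"sent"}, "abort": {"idle", "error"}},
    "sent":    {},
    "error":   {},
}

winning, strategy = synthesise(device, goal={"sent"})
```

The resulting `strategy` maps each winning state to the action the synthesised driver would take there; states such as `error`, from which the objective cannot be forced, are left out.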

# **Drivers Synthesised (To Date)**





LEDICER Rev \*B Cypress Semiconductor

SD host controller









#### From Drivers to File Systems?



# **Building Secure Systems: Long-Term View**





# Agenda


- Motivation
- What is a microkernel, and what is L4?
- seL4 designed for trustworthiness
- Establishing trustworthiness
- From kernel to system
- Sample system 1: Secure access controller
- Sample system 2: RapiLog

# **Proof of Concept: Secure Access Controller**

[Figure labels: SAC; networks AUS, NATO, SIN, US, WWW.]

# **Logical Function**









# **Minimal TCB**





# Implementation






# Agenda


- Motivation
- What is a microkernel, and what is L4?
- seL4 designed for trustworthiness
- Establishing trustworthiness
- From kernel to system
- Sample system 1: Secure access controller
- Sample system 2: RapiLog

# **Database Transactions**

Various approaches, but today usually *write-ahead logging*:
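The write-ahead rule can be sketched as follows. This is a toy model, not the design of any particular database or of RapiLog: the class `ToyDB` and its methods are hypothetical, and "stable" storage is simulated by attributes that survive `crash()`. The log record is written before the update is applied, so committed updates can be redone after a crash even if the data pages never reached the disk.

```python
# Toy write-ahead log (all names hypothetical). "Stable" storage is modelled
# by attributes that survive crash(); everything else is volatile.

class ToyDB:
    def __init__(self):
        self.log = []            # stable: append-only log of committed updates
        self.disk = {}           # stable: data pages, updated lazily
        self.cache = {}          # volatile: in-memory view of the data

    def commit(self, txn_id, updates):
        # 1. Write the log record first (the synchronous step).
        self.log.append((txn_id, dict(updates)))
        # 2. Only then apply the updates in memory; disk pages follow later.
        self.cache.update(updates)

    def checkpoint(self):
        # Lazily propagate the in-memory state to the data pages.
        self.disk.update(self.cache)

    def crash(self):
        self.cache = {}          # volatile state is lost

    def recover(self):
        # Redo every logged update on top of what reached the disk.
        self.cache = dict(self.disk)
        for _txn_id, updates in self.log:
            self.cache.update(updates)


db = ToyDB()
db.commit(1, {"a": 1})
db.checkpoint()
db.commit(2, {"a": 2, "b": 5})   # logged, but data pages not yet on disk
db.crash()
db.recover()
```

After recovery the second transaction's updates are visible although only the first one had been checkpointed, which is the durability guarantee the log provides.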




















# **RapiLog: Use Virtualization**



#### Performance



#### Also maintain durability on power failure!

# **Trustworthy Systems – We've Made a Start!**





# **Thank You!**

- Email: mailto:gernot@nicta.com.au
- Twitter: @GernotHeiser
- Google: "nicta trustworthy systems"