Trustworthy Systems

Interrupts considered harmful

Authors

Peter Chubb and Yang Song

NICTA

UNSW

Abstract

Interrupts are a way for a piece of hardware to tell the operating system, `I want some attention, please'. Linux deals with them by stealing time from the currently running process and giving the stolen time to the device driver for the interrupting device.

While the interrupt handler is running, other interrupts are usually blocked. Device driver writers jump through hoops to ensure that interrupts are not held off too long --- and Linux provides a plethora of mechanisms to defer work after the device is told to stop interrupting. However, it is up to each individual device driver to `do the right thing' --- and not all do. In addition, there can be legitimate reasons for `interrupt storms' --- for example, a gigabit network adapter on a busy network --- that cause much time to be stolen from other processes.

This architecture has a number of problems. The main one is that other processes, both real-time and interactive, become sluggish; the problem is of course more severe for real-time processes.

One approach used by other operating systems (and also in Linux with Ingo Molnar's PREEMPT_RT patch) is to make interrupt handlers first-class threads, so that the time spent in them can be controlled relative to other processes' requirements. However, Ingo's patch does not go far enough --- the mechanisms for deferred work and for interrupt mitigation are still used by the drivers, resulting in poor performance, and, under heavy load, real-time processes still miss their deadlines.

The user-level driver work I did (and reported at a previous LCA) suggests that restructuring drivers around a model in which interrupts are just one event among many that a driver copes with can eliminate most of the complexity of deferred work and interrupt processing, and can provide performance at least as good as the current Linux implementation --- and better than the PREEMPT_RT approach.
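To make the `one event among many' idea concrete, here is a minimal sketch in plain C. It is not the driver model from the talk: the event kinds, the driver_state structure, and the handle_event and driver_run functions are all hypothetical names chosen for illustration, and a fixed array of events stands in for whatever blocking wait a real driver thread would use.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event kinds: an interrupt is just one of them. */
enum event_kind { EV_INTERRUPT, EV_TX_REQUEST, EV_TIMEOUT, EV_SHUTDOWN };

struct event {
    enum event_kind kind;
    int payload;            /* meaning depends on kind; unused in this sketch */
};

/* Hypothetical per-device state owned by the single driver thread. */
struct driver_state {
    int rx_handled;         /* interrupts serviced */
    int tx_queued;          /* client transmit requests accepted */
    int timeouts;           /* timer events seen */
    int running;            /* cleared by EV_SHUTDOWN */
};

/* Handle one event to completion in ordinary thread context.
 * There is no separate deferred-work stage: whatever the event
 * needs is done here, before the next event is looked at. */
static void handle_event(struct driver_state *st, const struct event *ev)
{
    assert(st != NULL && ev != NULL);

    switch (ev->kind) {
    case EV_INTERRUPT:
        st->rx_handled++;
        break;
    case EV_TX_REQUEST:
        st->tx_queued++;
        break;
    case EV_TIMEOUT:
        st->timeouts++;
        break;
    case EV_SHUTDOWN:
        st->running = 0;
        break;
    }
}

/* Dispatch events in arrival order until shutdown or the array
 * is exhausted. Returns the number of events processed. */
static size_t driver_run(struct driver_state *st,
                         const struct event *evs, size_t n)
{
    size_t i = 0;

    st->running = 1;
    while (st->running && i < n) {
        handle_event(st, &evs[i]);
        i++;
    }
    return i;
}
```

Because the loop is an ordinary thread, the time it consumes can in principle be accounted for and scheduled like that of any other thread, which is the property the abstract argues for. The sketch deliberately leaves out how events are delivered to the thread and how the device is acknowledged.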

Currently, I have some preliminary results showing real-time processes missing their deadlines with PREEMPT_RT under heavy interrupt load; by the time of the conference I expect to have a GigE driver with the new model, and some hard benchmark results to prove my point.

BibTeX Entry

  @misc{Chubb_Song_10,
    address          = {Wellington, NZ},
    author           = {Chubb, Peter and Song, Yang},
    booktitle        = {Linux.conf.au},
    month            = jan,
    paperurl         = {https://trustworthy.systems/publications/nicta_full_text/3460.pdf},
    title            = {Interrupts considered harmful},
    year             = {2010}
  }
