Mark Smotherman
Last updated: November 2015

Summary: Interrupts are a vital part of sequencing a modern computer. They were developed for exception handling and were later applied to I/O events. Today, they also play a role in protection and memory management. However, interrupts can be disruptive both to program design and to high performance.

.. under construction ..


Machines discussed:
Univac-I (1951) - overflow trap
NBS DYSEAC (1954) - I/O interrupt
IBM 704 (1955) - transfer trap (for debugging)
Univac 1103/1103A (1956 innovation) - wind tunnel interrupt
Lincoln Labs TX-2 (1957 paper) - multiple-sequence system
IBM Stretch (1957 paper) - vectored interrupt system
Electrologica X-1 (1957-1958 innovation) - vectored interrupt system
Manchester Atlas (1961) - page fault

Comments on interrupts
Turner and Rawlings

Design tree

Performance techniques

Deferred interrupt handler structure


In my 1989 paper, "A Sequencing-Based Taxonomy of I/O Systems and Review of Historical Machines," Computer Architecture News, vol. 17, no. 5, pp. 5-15, September 1989 (reprinted as Paper 1 of Chapter 7 in M. Hill, N. Jouppi, and G. Sohi (eds.), Readings in Computer Architecture, Morgan Kaufmann, San Francisco, 2000), I presented a sequencing-based taxonomy of I/O systems and reviewed how a number of historical machines handled I/O, including their interrupt facilities.

Short Review of the History of Interrupts

However, the rest of the story includes

Turner and Rawlings' comments on interrupts

From their 1958 paper cited above.

It is necessary to interlace computing and information transfers to achieve required speeds. It is well known that almost all the cycle time of input-output devices is available for computing if coding which is clever enough can be prepared. In practice, the difficulty of such coding is so great that little use is usually made of this theoretically available time.
[Discussing possible coding and hardware errors when interrupts are used] The heart of these difficulties is a subtle difference between interrupt and noninterrupt modes of operation. In the conventional mode, not only does the result of calculation depend only on the initially stored code and on the data, but so does every intermediate state of the computer. With interrupt, because of the random timing of interruption, this is no longer true; in fact, even with a correct program, only the answers are expected to be predictable. One of the startling effects of this variability is that a programming or coding error, by producing a variable error or sometimes none at all, may be indistinguishable from a random computer malfunction. Many hours were spent in discovering this peculiarity.
... The nature of INTERRUPT with randomly-timed devices is such that the thin line which has separated program difficulties from hardware difficulties is now invisible. A problem with a program error may be successfully run many times before the peculiar circumstances which produce a failure show up. When they do show up there is little or no evidence to be analyzed.
[capitalization in original]

Strachey's comments on interrupts

A widely-cited paper that described the use of interrupts is Christopher Strachey, "Timesharing in Large Fast Computers," International Conference on Information Processing, June 1959, Paris, pp. 336-341 [IFIP conference; the proceedings were published by UNESCO in 1960].

In the paper, he first describes a simple interruption scheme for handling magnetic tape transfers and then argues for a more sophisticated design:

This plan leaves several difficulties unresolved; it makes, for example, no provision for lock-out during transfers. Another difficulty is the possibility of several "interrupt program" signals coming at the same time (from different units) or of one special sequence of orders being interrupted by another (or even worse, by itself).
A fairly straightforward way out of these difficulties is to divide all possible interrupt signals into a small number of classes. The classes are allotted in order of priority, and any interrupt signal can only interrupt the program immediately if this is of a lower priority class. If the program being executed is of the same or higher priority class, the interrupt signal is merely stored. At the end of a sequence of orders of one priority, the machine scans the waiting list and starts to operate on the call with the highest remaining priority.
The top priority would contain operations which cannot afford to wait e.g. transfer of words to or from magnetic tape. The lowest priority would be the main program. There might have to be one or two intermediate classes, the principle of classification being that those which have inertia must have a higher priority than those that have none.
... it may be necessary to store the content of various arithmetic and index registers before proceeding with the new program. If this is necessary the general rule should be to store them in locations which are associated with the interrupting program and to restore them again at the end of that program. The amount of information to be stored is determined by what use the interrupting program makes of these common registers. In this way no time will be wasted storing more information than is necessary.

By "lock-out" he means preventing access to an I/O buffer by the main program until the I/O transfer is complete.

He goes on to describe a time-shared machine that could support a long-running "base load" production program, a computer operator starting "short-run" programs as needed, a programmer debugging a new program, and an engineer who could be maintaining and testing the peripherals.

He describes a software "Director" that would manage the transitions between the different program sequences and have access to special instructions, memory protection limit registers, and a facility for limiting execution time. Today, we would recognize his description of a Director as an operating system running in a protected execution mode with access to memory base and bounds registers, privileged I/O instructions, and a watchdog timer.
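
As a small illustration of the base-and-bounds protection implied here (the structure and names are my own, not Strachey's), the Director would load a pair of limit registers before handing the machine to a user program, and the hardware would check every user-mode memory reference against them, trapping back into the Director on a violation:

  #include <stdbool.h>
  #include <stdint.h>

  /* Hypothetical per-program limit registers, loaded by the Director. */
  struct limits {
      uint32_t base;     /* lowest address the program may reference */
      uint32_t length;   /* size of the region it may reference      */
  };

  /* Check performed by hardware on every user-mode memory reference;
     a false result would trap back into the Director.                */
  static bool reference_ok(const struct limits *lim, uint32_t addr)
  {
      return addr >= lim->base && addr - lim->base < lim->length;
  }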

He gives a tentative priority list of program sequences from highest priority to lowest:

and then notes:

It may be desirable to divide these into various groups, or even allow the priority of a program to be decided by the Director.

With regard to the relative innovation of the ideas presented, this paper appears a bit late in the chronology of computers with interrupt systems; compare it, for example, with Forgie's 1957 paper on the TX-2, Brooks' 1957 paper on interrupt systems, and Gill's 1958 paper on parallel programming. On the other hand, Strachey explicitly states twice in his paper that he knows of no actual or projected computer that implemented or would implement his ideas, and Gill's own 1958 paper cites Strachey's idea of two programs running simultaneously on one computer. (In the discussion session included as part of the published Strachey paper, Ted Codd of IBM briefly mentions that the IBM Stretch is similar in approach and that he hopes to describe multiprogramming on the Stretch at the next EJCC. Strachey filed for a patent on his ideas in Great Britain in February 1959 and in the U.S. in February 1960; he was granted US Patent 3,222,647 in December 1965.)

Dijkstra's comments on interrupts

Denning's comments on interrupts

Peter J. Denning, "Principles for Reliable Operating Systems," January 28, 2003

Hardy's comments on interrupts

Norm Hardy's history of interrupts

Design Tree

(adapted from Blaauw and Brooks, section 7.4, on interrupts)

  1. terminology and connotations (there are no well-defined, widely-accepted distinctions among the various terms)
    1. asynchronous, external
      1. interrupt - external device (I/O, clock, etc.)
      2. machine check - hardware error or failure
    2. synchronous, internal
      1. general terms
        1. alarm
        2. exception - special or undefined case (e.g., divide by zero)
        3. internal interrupt
        4. program check
        5. trap (sprung like a "mousetrap")
      2. terms associated with no continued execution
        1. abort - exception in the middle of an instruction
      3. terms associated with continued execution (either restart or resume)
        1. fault - missing information (e.g., page fault)
      4. terms associated with user request to OS
        1. mme (master mode entry)
        2. software interrupt (e.g., INT opcode in x86)
        3. svc (supervisor call)
        4. syscall
        5. trap (trap always on SPARC)

  2. entry point (i.e., where CPU execution is diverted; contrasted in the sketch after this design tree)
    1. fixed memory address (e.g., PDP-8)
      1. poll to find cause (e.g., PDP-8)
      2. code indicating cause placed in register
    2. one of many memory locations (interrupt vector)
      1. number of vectors
        1. single, globally shared (e.g., Electrologica X-1, S/360, x86)
        2. multiple, one per process (e.g., IBM Stretch)
      2. vector contents
        1. interrupt vector entry holds new PC (and maybe new PSW) (e.g., S/360, x86)
        2. interrupt vector holds blocks of instructions (e.g., SPARC)
    3. different PC (e.g., DYSEAC, TX-2)

  3. return linkage
    1. fixed memory location(s) (e.g., PDP-8, S/360)
    2. memory stack (e.g., x86, M68K)
    3. register (e.g., SPARC)

  4. permission
    1. single interrupt enable bit (e.g., PDP-8, 8086 w/o PIC)
      • enable/disable interrupt instructions
    2. interrupt enable mask in PSW (e.g., S/360)
    3. interrupt priority code in PSW (e.g., SPARC)
    4. nonmaskable (cannot be ignored)
    5. breakpoint bit in each instruction (e.g., DYSEAC)
      • device priority along with break and dismiss bit in each instruction (e.g., TX-2)

  5. techniques to accelerate interrupt handling
    1. cause determination (see cause register and interrupt vector)
    2. additional register sets
    3. interrupt poll to coalesce multiple interrupts (e.g., ITI on B5500 in 1960s, test pending interrupt on S/370 XA)
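
The entry-point and cause-determination choices above (items 2 and 5.1) can be contrasted with a short C sketch. The device-query and service routines are assumed hooks, not any particular machine's interface:

  #define NUM_DEVICES 8
  typedef void (*handler_t)(void);

  extern int  device_needs_service(int dev);      /* assumed device-status query     */
  extern void service_device(int dev);            /* assumed per-device service code */

  /* Style 2.1: a single fixed entry point; software must poll each device
     (or read a cause register) to discover which one interrupted.         */
  void fixed_entry_point(void)
  {
      for (int dev = 0; dev < NUM_DEVICES; dev++)
          if (device_needs_service(dev))
              service_device(dev);
  }

  /* Style 2.2: the hardware supplies an interrupt number, and the entry point
     is looked up in a vector, so no polling is needed to determine the cause. */
  static handler_t interrupt_vector[NUM_DEVICES]; /* filled in at initialization */

  void vectored_entry_point(int interrupt_number)
  {
      interrupt_vector[interrupt_number]();
  }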

Performance techniques

On another note, Rob Warnock, writing in comp.arch on 16 Jul 2001:

As far as the areas that I've been personally involved in -- mostly high-speed networking -- the best hardware/software tradeoffs have been mostly just judgment calls by very experienced folks. What I've noticed is that a *LOT* of "lessons learned" decades ago are still being re-learned by later generations, over and over again. We as an industry don't seem to have been doing a very good job of passing on the "secret sauce" to subsequent generations [in many cases, even *within* the same companies!].
For example, way back in the early 1970s at Digital Communications Assoc. (in Atlanta, made communications front-ends for DEC PDP-10s and IBM 360s using DEC PDP-8e's, successfully sold in *competition* with DEC's own PDP-11-based front-ends!), we made simple interrupt-per-character PIO-based terminal controllers that were -- in terms of host CPU cycles -- *more* efficient (and a lot cheaper!) than DEC's complex DMA-based controllers, partly because we used three simple techniques, two programming and one hardware: (1) interrupt "coalescing", (2) "dallying" before dismissing an interrupt, and (3) interrupt "holdoff". That is, (1) during a single interrupt, process *all* of the data that's available or that *becomes* available (looping to repeat the availability test as needed) during a single interrupt; (2) spin-wait/test a certain amount (which may even be context/data-dependent!) before dismissing an interrupt, so as to catch new data and process it without having to dismiss/interrupt again; and (3) once you finally *do* dismiss an interrupt, arrange the hardware so that you won't get another one until a certain (sometimes dynamically) chosen time interval has expired (also known as "interrupt rate-limiting", which, together with the other two, prevents "thrashing" in the interrupt service routine).
[There's another variant of #2 that's worth mentioning: when doing PIO writes to a FIFO and hitting a FIFO-full condition, "dallying" (spin-waiting) a certain amount first rather than enabling a watermark interrupt and going away. If you pick the dally amount appropriately, by "wasting" a few cycles you actually can save *lots* of total CPU time, since you (hopefully) avoid a later heavy-weight interrupt/context-switch/dismiss.]
These techniques were *NOT* unique to DCA!! In talking to other engineers at conferences & trade shows, we learned that such things were generally widely known among workers in the field (especially embedded systems, datacomm, and real-time folks), but were not often discussed openly, being considered somewhat "trade secrets". [That is, why tell your competitor why his performance sucks and yours doesn't?!?]
In the 30+ years since then, I've been astounded how many times the techniques of coalescing/dallying/holdoff have been rediscovered, but I've been *more* astounded how many times they haven't been used *at all* in many places where they were definitely needed, or how they were reinvented badly. [E.g., if you're implementing hardware assistance for interrupt holdoff, it is *extremely* important that the holdoff period be effective only *after* the dismissing of an interrupt, and not delay the response to or processing of a "new" event that arrives when the system has been idle for longer than the holdoff.] "Computer Science" courses don't seem to ever discuss such practicalities, either.

(E.g., see US 3,789,365 issued January 1974 and US 4,342,082 issued July 1982)
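
The three techniques Warnock describes can be sketched as a single receive-side interrupt handler in C. Everything here is my own illustration: the device-query, per-item work, and holdoff-timer routines are assumed hooks, and the budget values are arbitrary.

  #include <stdbool.h>

  extern bool data_available(void);           /* assumed device-status query                         */
  extern void process_one_item(void);         /* assumed per-item work                               */
  extern void program_holdoff_timer(int us);  /* assumed hardware: no new interrupt until it expires */

  #define DALLY_LIMIT  200                    /* spin-wait budget before dismissing                  */
  #define HOLDOFF_USEC  50                    /* minimum gap before the next interrupt               */

  void rx_interrupt_handler(void)
  {
      int dally = DALLY_LIMIT;

      while (dally-- > 0) {
          /* (1) coalescing: process all data that is available, or that becomes
             available while we are still here, during this single interrupt.   */
          while (data_available()) {
              process_one_item();
              dally = DALLY_LIMIT;            /* new work arrived; reset the dally budget */
          }
          /* (2) dallying: nothing available right now, but spin a little longer
             before paying for a dismiss/re-interrupt cycle.                     */
      }

      /* (3) holdoff: arrange that the next interrupt cannot fire until the holdoff
         interval has expired, counted from this dismissal, not from an idle period. */
      program_holdoff_timer(HOLDOFF_USEC);
  }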

Deferred interrupt handler structure

In this section I want to trace the development of the idea of "bottom half" or "deferred" interrupt handlers (also known as fork routines, interrupt queueing, or interrupt stacking, and sometimes called the "Cutler kernel" structure).

The idea is that a "first half" interrupt handler dynamically creates a new thread or "fork routine" for later scheduling and execution, as opposed to an interrupt handler that runs to completion or a barebones interrupt handler that merely awakens a previously-created driver process. (For example, in RSX-11M, interrupt handlers fork a routine for later execution by adding a four-word fork block to a FIFO fork queue. The fork routines must be executed before any user processes are scheduled.)


The benefits of deferring part of the handling in this way (sketched in code below) include being able to:

  1. run the more-involved handling with interrupts re-enabled
  2. serialize handler access to kernel data structures in a uniprocessor (using a special queue of deferred handler tasks)
  3. allow the later invocation of a non-resident system routine as part of the interrupt handling
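
A minimal sketch of the fork-queue structure just described, in C: the first-half handler queues a small fork block on a FIFO list, and the kernel drains that list before scheduling any user process. The names are my own, not RSX-11M's, and the interrupt protection a real kernel would need around the queue is reduced to comments.

  #include <stddef.h>

  struct fork_block {                  /* RSX-11M's real fork blocks are four PDP-11 words */
      struct fork_block *next;
      void (*routine)(void *arg);      /* the deferred ("second half") work */
      void *arg;
  };

  static struct fork_block *fork_head, *fork_tail;   /* FIFO fork queue */

  /* Called from a first-half interrupt handler, with interrupts off: do the
     minimum and queue the rest.                                             */
  void fork_enqueue(struct fork_block *fb)
  {
      fb->next = NULL;
      if (fork_tail)
          fork_tail->next = fb;
      else
          fork_head = fb;
      fork_tail = fb;
  }

  /* Called by the kernel on the way back toward user mode: run every queued
     fork routine (interrupts re-enabled, queue pops briefly protected) before
     any user process is scheduled.                                            */
  void fork_drain(void)
  {
      while (fork_head) {
          struct fork_block *fb = fork_head;
          fork_head = fb->next;
          if (!fork_head)
              fork_tail = NULL;
          fb->routine(fb->arg);
      }
  }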

The idea appears to date from at least the early 1960s with the soft-real-time monitor developed for the Project Mercury space program. [See the discussion of "Cutler kernel" in Bruce R. Montague, "JN: An Operating System for an Embedded Java Network Computer," UCSC-CRL-96-29, 1996.]

Fred Brooks said that having to stack interrupts influenced Seymour Cray to design the CDC 6600 with polling rather than interrupts.

CDC 3600

Two systems with deferred handling are cited by Richard Watson in Timesharing System Design Concepts, 1970, pp. 126, 144:

Montague, op cit., also mentions the IBM TSS (mid-1960s) as using deferred handling.

genealogical tree involving Dave Cutler

Linux deferred handlers

See Jonathan Corbet, Alessandro Rubini, and Greg Kroah-Hartman, Linux Device Drivers, Third Edition


Hardware support in x86

Corrections are welcome!
