Instruction Predecoding and Decoded Instruction Caches
Mark Smotherman
Last updated: April 2004
UNDER CONSTRUCTION
Summary: ... tbd...
... intro to be written
... incomplete - lots of ideas and patents,
... will try to include the early ones ...
- decoded instruction queues
- IBM Stretch (1961) - predecoded up to two instructions at a time
after instruction fetch, held predecoded (and some pre-executed)
instructions in its four-level lookahead unit
- IBM S/370 M165 (1971) and 3033 (1978) - four-entry decoded
instruction queue between IPPF ("instruction pre-processing
function") and execution units; IPPF starts the fetching
of memory operands, similar to IBM Stretch
- decoded instruction caches
- LLNL S-1 (1978) - expanded the 36-bit word to a 56-bit i-cache format
to reduce instruction decoding time
- Jim Pomerene and Rudolph Rechtschaffen,
"Cache memory architecture with decoding"
US Patent 4,437,149 (filed Nov. 1980, granted to IBM March 1984)
- Dave Patterson, RISC-II (ISCA 1983 paper) -
"on the miss" expansion between memory and i-cache
- Yale Patt, HPS (1985)
- AT&T CRISP (1987) - variable-length source instructions were
predecoded into fixed-length entries and placed in a 32-entry decoded
instruction cache (DIC); each branch was also folded into the previous
instruction's DIC entry by including a next-address field; conditional
branches were handled by including a second, alternate next-address
field and information for determining a misprediction
- IBM RS/6000 (1989) - instructions are predecoded into eight general
classes to assist in routing to the function units
(see also US Patents 5,828,895, filed 1995, and 6,286,094, filed 1999)
- ... lots recently ... (perhaps discuss MIPS R10000)
- instruction caches with instruction length info added
- ... early patents ...
- AMD K5 (1994) - adds 5 bits per instruction byte (start, end, prefix,
opcode, number of Rops); see US Patent 5,758,114 (parent filed 1995,
granted to AMD in 1998); only 2 bits added per byte in K8
- ...
- instruction caches with scheduling information added
- NS Swordfish (1991) - instruction pair dependency bit is contained
in each decoded i-cache entry; it is set on i-cache refill by
predecode hardware and yields LIW issue of independent instruction
pairs; no bits are used in the normal instruction format;
see US Patent 5,669,011 (parent filed 1990, granted to NS 1997)
- Minagawa/Saito/Aikawa (1991) - "Pre-decoding mechanism for superscalar
architecture," IEEE Pacific Rim Conf. on Comm., Comp., and Sig. Proc.,
pp. 22-24; on i-cache miss, a predecoder adds instruction grouping
("priority") and function unit assignment fields;
see US Patent 5,377,339 (parent filed 1991, granted 1994)
- ...
- trace caches
- Alex Peleg and Uri Weiser, "Dynamic flow instruction cache memory
organized around trace segments independent of virtual address line,"
US Patent 5,381,533 (parent filed 1992, granted to Intel 1995)
- ...
- ...
(US Patents - search subclass 213 under class 712)
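
Illustration (not from any of the machines or patents above): a minimal
Python sketch of the general idea behind adding scheduling information at
i-cache refill, as in the pair dependency bit noted for Swordfish. The
toy instruction format, the names Inst, CacheEntry, depends, refill, and
issue_cycles, and the choice to check only read-after-write and
write-after-write hazards are all assumptions made for this example; real
predecode hardware is more involved.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Inst:
    """Hypothetical toy instruction: dest <- op(srcs)."""
    op: str
    dest: int
    srcs: Tuple[int, ...]


@dataclass(frozen=True)
class CacheEntry:
    """One decoded i-cache entry: an instruction pair plus one predecode bit."""
    first: Inst
    second: Inst
    dependent: bool  # computed once at refill, only read at issue


def depends(first: Inst, second: Inst) -> bool:
    """Assumed hazard check: second reads or rewrites the register first writes."""
    return first.dest in second.srcs or first.dest == second.dest


def refill(first: Inst, second: Inst) -> CacheEntry:
    """Model of predecode on refill: store the pair with its dependency bit."""
    return CacheEntry(first, second, depends(first, second))


def issue_cycles(entries: Iterable[CacheEntry]) -> int:
    """Independent pairs issue together (1 cycle); dependent pairs take 2."""
    return sum(2 if e.dependent else 1 for e in entries)


if __name__ == "__main__":
    a = Inst("add", 1, (2, 3))
    b = Inst("sub", 4, (1, 5))  # reads r1, which a writes
    c = Inst("or", 6, (7, 8))   # touches no register that a writes
    entries = [refill(a, b), refill(a, c)]
    print([e.dependent for e in entries], issue_cycles(entries))
```

The point of the sketch is only that the dependency check runs once per
refill rather than on every fetch, so the issue logic reads a single
stored bit; the instruction encoding itself is left unchanged.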
mark@cs.clemson.edu