Mark Smotherman. Last updated January 24, 2018

The present invention offers several advantages over prior art branching apparatus and methods. Primary among these are the ability to anticipate branches, the ability to reduce the number of branches and the ability to reduce branch delays. With these abilities, the branching apparatus and method according to the invention can enable a highly parallel computer to overlap most branches such that branching takes a minimal delay, and essentially full data rate is maintained across the branch points.-- U.S. Patent 3,577,189 (Cocke, Randell, Schorr, and Sussenguth)

- 48-bit words
- 24-bit halfwords
- 8-bit bytes

- half-word addressability
- 24-bit address within program, 12-bit "key" added to provide 36-bit virtual address
- a single user program could address 2**24 halfwords (= 48 MB) using its normal key and another 2*24 halfwords using its alternate key
- four "key" (segment base) registers
- PNK - program normal key
- PAK - program alternate key
- SNK - supervisor normal key
- SAK - supervisor alternate key

- 32 X (index) registers, 24 bits each, X0 always zero
- 32 A (arithmetic) registers, 48 bits each, A0 always zero
- within the implementation, each A register was shadowed by one dedicated back-up register and thus arithmetic loads were allowed to proceed and use the back-up register even if the corresponding A register was still busy
- single-precision floating point used a 48-bit format (as desired by Livermore), with a 12-bit exponent
- double-precision floating-point used a 96-bit format
- each double-precision operand required a register pair
- fifteen pairs: A2/A3, A4/A5, ... A30/A31
- A0/A1 was treated as zero

- 24 C (condition) bits
- some number of T (i/o transmission) registers
- 20 S (special) registers:
- condition, 24 bits
- problem exception, 21 bits
- problem mask, 21 bits
- supervisory exception, 11 bits
- supervisory mask, 11 bits
- problem normal key, 12 bits (high 12 bits of 36-bit virtual address)
- problem alternate key, 12 bits
- supervisory normal key, 12 bits
- supervisory alternate key, 12 bits
- interrupt return address, 24 bits
- effective branch address, 24 bits
- machine state, 16 bits
- supervisor/problem state, 1 bit
- disable/enable interrupts, 1 bit
- concurrent/sequential execution, 1 bit
- branch state, 3 bits
- skip state, 1 bit
- multiprecision carry, 3 bits
- previous interrupt state, 3 bits
- previous SVC state, 3 bits

- cycle count, 24 bits
- instruction count, 24 bits
- timer, 24 bits
- external signal, 24 bits
- four general purpose registers, 24 bits each

- load-store architecture,
- 3-register (RRR) instruction format for ALU operations
- full 24-bit literal for addressing, double indexing on many memory addresses
- 8-bit opcode
- 1-bit skip flag in each instruction
- 5-bit register specifier fields
- two instruction lengths:
24-bit (short) instruction format:

opcode S R1 R2 R3 48-bit (long) instruction format:

opcode S R1 X2 X3 24-bit literal value

(The opcode list below is taken from Lynn Conway's simulator notes of April 1967, which differs slightly from the ACS-1 MPM Instruction Manual of January 1968. The two columns are used to indicate to which decoder the instruction is sent. L or S after the opcode indicates long or short format.)

op code
index unit
arithmetic unit
Load/store 001 S LXH - load index (halfword format) 002 L LX - load index 003 L LXA - load index using alternate key 004 S STXH - store index (halfword format) 005 L STX - store index 006 L STXA - store index using alternate key 007 L LXC - load index and count 008 L LXCA - load index using alternate key 009 S LAH - load arithmetic (halfword format) (both) 010 L LA - load arithmetic (both; as replace for A-unit) 011 L LAA - load arithmetic using alternat key (both; replace) 012 S STAH - store arithmetic (halfword format) (both) 013 L STA - store arithmetic (both) 014 L STAA - store arithmetic using alternate key (both) 015 S LDH - load arithmentic double (halfword format) (both; replace) 016 L LD - load arithmetic double (both; replace) 017 S STDH - store arithmetic double (halfword format) (both) 018 L STD - store arithmetic double (both) 019 S LATH - load arithmetic (scaled indexing, halfword format) (both; replace) 020 L LAT - load arithmetic (scaled indexing) (both; replace) 021 S STATH - store arithmetic (scaled indexing, halfword format) (both) 022 L STAT - store arithmetic (scaled indexing) (both) 023 L LL - load arithmentic, left half (both; replace) 024 L LR - load arithmetic (right half) (both; replace) 025 L STL - store arithmetic (left half) (both) 026 L STR - store arithmetic (right half) (both) 027 L LMX - load multiple index 028 L STMX - store multiple index 029 L LMA - load multiple arithmetic (both; replace) 030 L STMA - store multiple arithmetic (both) 031 L LMS - load multiple special 032 L STMS - store multiple special 033 L STMZ - store multiple zeros 034 L STMZA - store multiple zeros using alternate key Move 038 S MXA - move index to arithmetic (both) 039 S MAX - move arithmetic to index (both) 040 L MKL - move constant to left half arithmentic (both) 041 L MKR - move constant to right half arithmentic (both) 042 S MLX - move location to index 043 S MXS - move index to special 044 S MSX - move special to index 045 S MSXZ - move special to index and zero 046 S MSXO - move special to index by or 047 S MXC - move index bit to condition bit 048 S MCX - move condition bit to index bit 049 S MLC - move location bit to condition bit (both) 050 S MAC - move arithmentic bit to condition bit (both) 051 MXP . move index to (P?) 052 MKP - move constant to (P?) Arithmetic 056 S AN - add normalized 057 S ADN - add double normalized 058 S AR - add rounded 059 S ADR - add double rounded 060 S AU - add unnormalized 061 S ADU - add double unnormalized 062 S SN - subtract normalized 063 S SDN - subtract double normalized 064 S SR - subtract rounded 065 S SDR - subtract double rounded 066 S SU - subtract unnormalized 067 S SDU - subtract double unnormalized 068 S MN - multiply normalized 069 S MDN - multiply double normalized 070 S MR - multiply rounded 071 S MDR - multiply double rounded 072 S MU - multiply unnormalized 073 S MDU - multiply double unnormalized 074 S MMN - multiply mixed normalized 075 S MMU - multiply mixed unnormalized 076 S DN - divide normalized 077 S DDN - divide double normalized 078 S DR - divide rounded 079 S DDR - divide double rounded 080 S DMN - divide mixed normalized 081 S DMR - divide mixed rounded 082 S RND - round 083 S SPF - set positive, floating 084 S SNF - set negative, floating 085 S CVS - convert to short floating 086 S CVF - convert to full floating 089 S AI - add integer 090 S SI - subtract integer 091 S MI - multiply integer 092 S MMI - multiply mixed-length integer 093 S DI - divide integer 094 S DMI - divide mixed-length integer 095 S ACL - add continued low-order 096 S SCL - subtract continued low-order 097 S ACH - add continued high-order 098 S SCH - subtract continued high-order 099 S SPI - set positive integer 100 S SNI - set negative integer 101 S CVN - convert to normalized 102 S CVI - convert to integer 106 S AX - add index 107 S SX - subtract index 108 S MX - multiply index 109 S DRX - divide with remainder index 110 S DX - divide index 111 S RX - remainder index 112 S AXC - add index to short constant 113 L AXK - add index to constant 114 L MXK - multiply index by constant 115 L DRXK - divide with remainder index by constant 116 L DXK - divide index by constant 117 L RXK - remainder index by constant 118 S SPX - set positive index 119 S SNX - set negative index Compare 123 S CGEN - compare GE normalized (both) 124 S CEQN - compare EQ normalized (both) 125 S CGED - compare GE double (both) 126 S CEQD - compare EQ double (both) 127 S CMGEN - compare magnitude GE normalized (both) 128 S CMEQN - compare magnitude EQ normalized (both) 129 S CMGED - compare magnitude GE double (both) 130 S CMEQD - compare magnitude EQ double (both) 131 S CGEI - compare GE integer (both) 132 S CEQI - compare EQ integer (both) 133 S CUGEI - compare unsigned GE integer (both) 134 S CGEX - compare GE index 135 S CEQX - compare EQ index 136 S CUGEX - compare unsigned GE index 137 L CGEXK - compare GE index with constant 138 L CEQXK - compare EQ index with constant 139 L CUGEXK - compare unsigned GE index with constant 140 S CBA - compare bytes arithmetic (both) 141 S CBMA - compare bytes multiple arithmetic (both) 142 S CBX - compare bytes index 143 S CBMX - compare bytes multiple index Shift 147 S SHA - shift arithmetic 148 S SHX - shift index 149 S SHAC - shift arithmetic by constant 150 S SHXC - shift index by constant 151 S SHD - shift double 152 S SHDX - shift double index 153 S SHDC - shift double by constant 154 S SHDXC - shift double index by constant 155 L SWA - swap storage with arithmentic (both) 156 L SWX - swap storage with index 157 S IFA - insert field arithmetic 158 S IFX - insert field index 159 S IFZA - insert field and zero arithmetic 160 S IFZX - insert field and zero index 161 S SIA - shift integer arithmetic 162 S SIX - shift integer index 163 S SIAC - shift integer arithmetic by constant 164 S SIXC - shift integer index by constant 165 S SID - shift integer double 166 S SIDC - shift integer double by constant Logical 170 S ANDA - and arithmetic 171 S TAFA - and-not arithmetic (true and false) 172 S FAFA - nor arithmetic (false and false) 173 S ORA - or arithmetic 174 S TOFA - or-not arithmetic (true or false) 175 S FOFA - nand arithmetic (false or false) 176 S EQA - equal arithmetic 177 S XORA - exclusive-or arithmetic 178 S ANDX - and index 179 S TAFX - and-not index (true and false) 180 S FAFX - nor index (false and false) 181 S ORX - or index 182 S TOFX - or-not index (true or false) 183 S FOFX - nand index (false or false) 184 S EQX - equal index 185 S XORX - exclusive-or index 186 S ANDC - and condition 187 S TAFC - and-not condition (true and false) 188 S FAFC - nor condition (false and false) 189 S ORC - or condition 190 S TOFC - or-not condition (true or false) 191 S FOFC - nand condition (false or false) 192 S EQC - equal condition 193 S XORC - exclusive-or condition 194 S CNTT - count total ones arithmetic 195 S CNTAA - count leading alike arithmetic 196 S CNTDA - count leading different arithmetic 197 S CNTAX - count leading different index 198 S CNTDX - count leading different index Branch 202 L BAND - branch at exit and 203 L BTAF - branch at exit and-not (true and false) 204 L BFAF - branch at exit nor (false and false) 205 L BOR - branch at exit or 206 L BTOF - branch at exit or-not (true or false) 207 L BFOF - branch at exit nand (false or false) 208 L BEQ - branch at exit equal 209 L BXOR - branch at exit exclusive-or 210 S BU - branch at exit unconditionally 211 S EXIT - exit (change inst. addr. reg.) (both) 212 S EXITL - exit, save location and stop skipping (both) 213 S EXITA - ? (both) 214 S EXITP - ? (both) 215 S SKAND - skip on and (both) 216 S SKTAF - skip on and-not (true and false) (both) 217 S SKFAF - skip on nor (false and false) (both) 218 S SKOR - skip on or (both) 219 S SKTOF - skip on or-not (true or false) (both) 220 S SKFOF - skip on nand (false or false) (both) 221 S SKEQ - skip on equal (both) 222 S SKXOR - skip on exclusive-or (both) 223 L IVIB - invalidate instruction buffers and branch (both) 224 S? NOP - no operation (both) Control 225 S PAUSE - pause (interrupt/exception fence) 226 S PI - pause with exception 227 L SCAN - scan in of interrupted state 228 S SVC - supervisor call 229 S SVR - supervisor return 230 S IC - interrupt call (internally generated) 231 S IR - interrupt return 235 L SIO - start i/o 236 S HIO - halt i/o 237 S TCH - test channel 238 S MTX - move T register to index 239 S MXT - move index to T register 240 S MZT - move zero to T register bit 241 S MOT - move one to T register bit 242 S ITUMA - invalidate tag and update memory using alternate key 243 S ITUMP - invalidate tag and update memory using PDA 244 ? IDA - ? 245 ? LDA - ? 246 ? LDHAA - ? 247 ? LDHBA - ? 248 ? LDHCA - ? 249 ? LDHDA - ? 250 ? STHAA - ? 251 ? STHBA - ? 252 ? STHCA - ? 253 ? STHDA - ?

Instructions to support commercial data processing, such as decimal and variable-field-length instructions, were ruled out by the designers. Other S/360 instruction set features such as the execute instruction (i.e., a single-instruction subroutine) and the register-memory (RX) instruction format were also ruled out.

Notes:

- No cumulative multiply (fused multiply-add)
- LXC "load index and count" is a form of auto-increment (but dropped by 1968)
- AXT "add index and test" (added by 1968) set a condition bit to 1 when the sign changed for an index register (e.g., nonnegative to negative); it may have helped in loop control when a loop count was repeatedly decremented by one; it also had short and long literal forms, AXCT and AXKT
- The character encoding was not specified, but several "compare bytes" operations were defined to treat a 48-bit word as six 8-bit bytes.

Recognizing that they could lose half or more of the design's performance on branching, the designers took several aggressive steps to reduce the number of branches:

- A set of 24 condition code bits allowed precalculation of branch
conditions and also allowed a single branch instruction
to specify any logical operation between two of the condition codes.
This similar in concept to the eight independent condition codes in
the IBM RS/6000 and PowerPCs.
Example:

CEQX 10,1,2 // compare index regs // X1 and X2 and set // condition C10 if // X1==X2 CGEX 11,3,4 // compare index regs // X3 and X4 and set // condition C11 if // X3>=X4 BOR 10,11,0,LABEL // establish a branch // predicate of // (C10 or C11) and // set corresponding // branch target // address to LABEL EXIT // perform transfer of // control if branch // predicate is true

- As illustrated above, a change in program flow requires two instructions.
Ed Sussenguth recognized that the actions of a conditional branch could
be divided into three separate categories:
- branch target address calculation,
- taken/untaken determination, and
- instruction counter update.

The ACS-1 combined the first two actions in its "branch" instruction and used the "exit" instruction to accomplish the third action. This is in effect a prepare-to-branch approach that allows a variable number of branch delay slots to be filled (termed within ACS as "anticipating a branch"). Perhaps even more importantly, it provides for multiway branch specification. That is, multiple branch instructions could be executed and only the first successful branch instruction would be used by the exit instruction to transfer control.

Example:

... BOR 1,2,0,ALPHA // establish the first // branch predicate // of (C1 or C2) and // set corresponding // branch target // address to ALPHA ... BAND 3,4,0,BRAVO // establish the second // branch predicate // of (C3 and C4) and // set corresponding // branch target // address to BRAVO BEQ 5,6,0,BRAVO // establish the third // branch predicate // of (C5 == C6) and // set corresponding // branch target // address to BRAVO // - thus BRAVO if // (C3andC4)or(C5==C6) ... BAND 7,7,0,CHARLIE // establish the fourth // branch predicate // of (C7) and set // corresponding // branch target // address to CHARLIE ... EXIT // perform transfer of // control to a branch // target address based // on the first true // branch predicate; // if none are true, // then fall through; // as part of this // instruction the exit // condition is reset to // no true predicates

if( condition_1 || condition_2 ) goto ALPHA else if( condition_3 && condition_4 ) goto BRAVO else if( condition_5 == condition_6 ) goto BRAVO else if( condition_7 && condition_7 ) goto CHARLIE

- To handle the case of a forward conditional branch with a small
displacement, a conditional execution bit ("skip flag") was added to each
instruction format. A special form of the prepare-to-branch instruction
was used as a conditional "skip". At the point of condition resolution,
if the condition in the skip instruction was true, any instructions marked
as conditional were removed from the instruction queues. If the
condition was false, then the marked instructions were unmarked and
allowed to execute. The skip instruction replaced a regular conditional
branch and there was no nonsequential change in the instruction counter
This is similar to predication in the HP PA-RISC and Intel IA-64
instruction sets and is the same concept behind the conditional move
instructions found in several other instruction sets.
Example:

CGEN 10,1,2 // compare arithmetic // regs A1 and A2 // and set condition // C10 if A1>=A2 BAND 10,10,0,LABEL // establish a branch // predicate of (C10) // and set corresponding // branch target address // to LABEL EXIT // perform transfer of // control if required AN 5,3,4 // add arithmetic regs // A3 and A4 and place // result in A5 LABEL: ...

CGEN 10,1,2 // compare arithmetic // regs A1 and A2 // and set condition // C10 if A1>=A2 SKAND 10,10 // establish a skip // predicate of (C10) AN* 5,3,4 // add arithmetic regs // A3 and A4 and place // result in A5 if the // skip predicate is // not true

See US Patent 3,577,190, John Cocke, Phil Dauber, Herb Schorr, and Ed Sussenguth, "Apparatus in a digital computer for allowing the skipping of predetermined instructions in a sequence of instructions in response to the occurrence of certain conditions," filed June 1968, and issued May 1971.

The example ACS-1 code segment shown below is adapted from the MPM Timing Simulation report of August 1967. It is described as part of a Crout reduction program for matrix decomposition.

LOOP: CGEX 2,4,3 // compare index regs // X4 and X3 and set // condition C2 if // X4>=X3 BAND 2,2,0,LOOP // establish a branch // predicate of (C2) // and set corresponding // branch target address // to LOOP CGEN 1,1,2 // compare arithmetic regs // A1 and A2 and set // condition C1 if // A1>=A2 AXK 3,3,0,1 // add constant of 1 to // index reg X3 AN 1,1,8 // add arithmetic regs // A1 and A8 and place // result in A1 LA 8,0,0,1000 // load arithmetic reg A8 AN 2,2,9 // add arithmetic regs // A2 and A9 and place // result in A2 LA 9,0,0,2000 // load arithmetic reg A9 SKOR 1,1 // set skip predicate to // (C1) MN* 1,1,2 // multiply arithmetic regs // A1 and A2 and place // result in A1 if skip // predicate is not true EXIT // perform transfer of // control if required STA 1,0,0,1000 // store arithmetic reg A1 STOP

**Navigation within IBM ACS pages:**

Next sections: Logical Organization / Instruction Issue / Memory Hierarchy / Maintenance