Alternate Mode takes the concepts developed in Universal Mode and further fine-tunes the tradeoffs made in the various alternate modes offered, so as to produce a single mode that may serve to replace many of them.
As in vector register mode, and unlike the scratchpad modes, this mode does not use the scratchpad pointer registers to indicate areas with 64 elements. Instead, additional register banks are used: one set of 64 supplementary accumulator/index registers, and one set of 64 supplementary floating-point registers.
In addition to this, to further approach the capabilities provided by Cray computers, two sets of eight long vector registers are provided. Each long vector register in the first set holds sixty-four 64-bit fixed-point elements, and each long vector register in the second set holds sixty-four 128-bit floating-point elements.
This resembles the complement of registers provided by the earliest Cray computers; the later Cray computers had considerably more, and even larger, vector registers. If sufficient register space is available in an implementation of this architecture, this too can be approached, by a set of sixty-four integer long vector registers and a set of sixty-four floating-point long vector registers, forming the long vector register scratchpad. (Upon re-examining the specifications for these computers, it appears I may have misread them, and, while they did increase the number of elements in a vector register from 64 to 128, or perhaps even the number of vector registers from 8 to 16, they did not have a second set of 64 vector registers as I mistakenly thought.)
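To give a sense of the register space involved, the following is a minimal sketch in Python of the storage arithmetic, using only the element counts and widths given above. The function name `bank_bytes` and the constant names are illustrative and not part of the architecture.

```python
# Sizes taken from the text: every long vector register has 64 elements,
# 64 bits wide for fixed-point and 128 bits wide for floating-point.
ELEMENTS_PER_LONG_VECTOR = 64
FIXED_ELEMENT_BITS = 64
FLOAT_ELEMENT_BITS = 128


def bank_bytes(registers: int, element_bits: int) -> int:
    """Bytes of storage needed for one bank of long vector registers."""
    return registers * ELEMENTS_PER_LONG_VECTOR * element_bits // 8


# The basic complement: eight long vector registers of each kind.
basic_fixed = bank_bytes(8, FIXED_ELEMENT_BITS)
basic_float = bank_bytes(8, FLOAT_ELEMENT_BITS)

# The optional long vector register scratchpad: sixty-four of each kind.
scratch_fixed = bank_bytes(64, FIXED_ELEMENT_BITS)
scratch_float = bank_bytes(64, FLOAT_ELEMENT_BITS)

print(basic_fixed, basic_float, scratch_fixed, scratch_float)
```

Under these figures, the basic complement comes to 12,288 bytes in all, and the long vector register scratchpad to a further 98,304 bytes, which is why the latter is offered only if sufficient register space is available.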
The additional register banks used in this mode should be regarded as an optional feature of the architecture. Since they represent a large additional increment in the complexity of an implementation, and they are only applicable to some types of problems, there will be applications for which including these register banks would be wasteful.
The complement of registers provided in a full implementation of this mode is illustrated in the diagram below:

Because of the power and complexity of this mode, its description will be split up among several succeeding pages.
While a short vector is composed of four double-precision floating-point numbers or eight single-precision floating-point numbers, and so on, a long vector always has 64 elements, independently of the type of its elements.
This will lead to some complications in arranging the path from memory to the arithmetic units, if these operations are implemented by means of 64 arithmetic units operating in parallel, as they may be in a high-performance implementation of this architecture (and a method of dealing with these complications is shown on the next page), but it also means that these vector operations can be implemented simply by the high-speed pipelining of a single arithmetic unit, as they were on many of the early vector architectures.
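The difference between the two kinds of vector can be put in a few lines of Python. This is only a sketch: the 256-bit width of a short vector is inferred from the two examples given above (four 64-bit elements or eight 32-bit elements), and the names used are illustrative.

```python
# Inferred from the text: 4 x 64 bits = 8 x 32 bits = 256 bits.
SHORT_VECTOR_BITS = 256
# Stated in the text: a long vector always has 64 elements.
LONG_VECTOR_ELEMENTS = 64


def short_vector_elements(element_bits: int) -> int:
    """A short vector has a fixed width, so its element count varies."""
    return SHORT_VECTOR_BITS // element_bits


def long_vector_bits(element_bits: int) -> int:
    """A long vector has a fixed element count, so its width varies."""
    return LONG_VECTOR_ELEMENTS * element_bits


for bits in (32, 64, 128):
    print(bits, short_vector_elements(bits), long_vector_bits(bits))
```

Because the width of a long vector changes with the element type, the amount of data that must move between memory and 64 parallel arithmetic units changes as well, which is the source of the complications mentioned above.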
The potential presence of 64 arithmetic-logic units in an ultimate-performance implementation of the architecture suggests that a further increment in computing power might be obtained if each arithmetic-logic unit had a simple control unit associated with it, creating 64 separate computers. It would be possible to parcel out some portion of the internal cache memory of the chip to each of these processors; if some eight megabytes of cache memory were provided, for example, using half of that could provide each one with 65,536 bytes of memory.
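The figure of 65,536 bytes follows directly from the numbers in the example, as this short Python check shows. The variable names are illustrative, and a megabyte is taken here as 1,048,576 bytes.

```python
MEGABYTE = 1024 * 1024

cache_bytes = 8 * MEGABYTE        # "some eight megabytes of cache memory"
shared_out = cache_bytes // 2     # "using half of that"
processors = 64                   # one per arithmetic-logic unit

per_processor = shared_out // processors
print(per_processor)
```

The result, 65,536 bytes, is exactly the amount of memory that a 16-bit address can reach.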
How this capability may be used is described in some of the later sections that follow.
It may be noted, however, that if there are 64 arithmetic-logic units available essentially identical to the main arithmetic-logic unit, attempting 65-way superscalar operation will suggest itself, and fully employing this will require the ability to decode instructions in the computer's full instruction set on a parallel basis. None the less, an advantage is derived from making each control unit simple for this form of operation, so that it can be provided on less ambitious implementations.
The instruction formats have been divided up among a number of diagrams, because a large number of instruction formats are required for memory-reference and register-to-register operations to handle all the types of operation that it is intended to support.
This diagram shows the most basic instruction formats for the standard memory-reference instructions and their related register to register instructions:

In most of the formats, the opcode corresponds to the last seven bits of an opcode for full opcode mode, and is thus a seven-bit opcode in the same format as used with universal mode.
In the scratchpad format instructions, however, a five-bit opcode is used. This restricts this type of instruction to three data types: 32-bit fixed-point quantities, and 32-bit and 64-bit floating-point quantities. These are the data types most commonly used in FORTRAN programs.
A modification of this mode produces semi-RISC mode:

In this mode, only arithmetic/index and floating-point registers 0 to 3 are used in the register scratchpad instructions, so that indexed instructions using the normal base registers are possible. These indexed instructions are confined to the load and store instructions, which is what earns this mode its name, despite its not really involving a particularly "reduced" instruction set.
Also, in this mode, the seven-bit opcodes are the same as those in normal mode, not as shown below, excluding the unnormalized floating-point operations. This will create additional opcode space for use in multi-way vector operations.
Another modification of this mode produces full opcode alternate mode, where the standard memory-reference instructions are extended as for full opcode mode to include simple floating, register packed, and register compressed operations; here, the addressing formats look like this:

In this mode, the vector register operations will be suitably modified to reflect the changed form of the opcode field and the bits before it. However, register packed and register compressed instructions, if supported, may not run at speeds comparable to other vector register instructions, as implementations are not expected to provide more than one packed decimal arithmetic unit. This may also apply to the decimal exponent modes, but the multiple integer ALUs which may be provided to accelerate vector register operations will support simple floating operation.
The eight-bit opcodes are the same as those for full opcode mode, and can be found in the section for that mode.
Two types of operate instructions are also shown in this chart, the population count instruction (SEBI: Separate Bits), and the short form of the shift instruction, since they are outside of the standard space for operate instructions which remains largely the same in different instruction modes.
As in universal mode, economy of opcode space is achieved by using two three-bit fields instead of three for the register entries in standard instructions. This fits with the requirements of a register-to-register instruction, and a memory reference instruction that is not indexed. These two types of instruction now have to be distinguished by prefix bits.
When the base register field in a memory-reference instruction is zero, this indicates that the instruction belongs to an additional addressing mode. When the first bit of the second halfword of the instruction is a zero, the mode used for indexed addressing is used. When this bit is a one, the additional addressing modes described on the subsequent pages will apply.
Of the remaining 31 bits of the next two halfwords, three are used for an index register field, which may be zero for compatible non-indexed instructions, and three are used for a base register field, which refers to one of the eight scratchpad registers instead of one of the eight base/address registers. The remaining 25 bits are the displacement.
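As an illustration of how these fields might be separated, here is a minimal sketch in Python. The text gives the sizes of the fields but not their positions, so the layout assumed here (mode bit first, then the index register field, then the base register field, then the displacement) is an assumption that simply follows the order in which the fields are mentioned. The displacement is returned as a raw 25-bit value, since whether it is signed is not stated. The names `ExtendedAddress` and `decode_extended_address` are hypothetical.

```python
from typing import NamedTuple


class ExtendedAddress(NamedTuple):
    index: int            # 3-bit index register field (0 = not indexed)
    scratchpad_base: int  # 3-bit field naming one of eight scratchpad registers
    displacement: int     # raw 25-bit displacement


def decode_extended_address(field: int) -> ExtendedAddress:
    """Split the 32-bit field that follows the first halfword.

    Assumed layout, most significant bit first (not given by the text):
    1 mode bit, 3 index bits, 3 base bits, 25 displacement bits.
    """
    if not 0 <= field < (1 << 32):
        raise ValueError("field must fit in 32 bits")
    if field >> 31:
        # A leading one selects the additional addressing modes,
        # which this sketch does not attempt to decode.
        raise ValueError("additional addressing mode; not handled here")
    index = (field >> 28) & 0b111
    scratchpad_base = (field >> 25) & 0b111
    displacement = field & ((1 << 25) - 1)
    return ExtendedAddress(index, scratchpad_base, displacement)


example = (2 << 28) | (5 << 25) | 12345
print(decode_extended_address(example))
```

The three field widths, together with the mode bit, account for all 32 bits: 1 + 3 + 3 + 25 = 32.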
In the scratchpad instructions, the source register is one of the 64 supplementary registers of the appropriate type, fixed or floating, while the destination register is either one of the eight arithmetic/index registers or one of the eight floating-point registers. The larger number of supplementary registers allows many short routines to avoid use of memory, and to avoid a need for instructions longer than 16 bits.
The available memory-reference instructions are shown in the table below. The opcodes are shown in three columns: the first shows, in binary, the four-bit opcodes for the indexed load and store instructions in semi-RISC mode; the second shows, in binary, the five-bit opcodes for the register scratchpad instructions; and the third shows, in octal, the first halfword of an instruction containing a seven-bit opcode. The second digit, which is either 0 or 1 in this table, actually has its first two bits determined by the type of the instruction.
4-bit  5-bit  Halfword  Mnemonic  Instruction
-      -      0000xx    SWB       Swap Byte
-      -      0001xx    CB        Compare Byte
0000   -      0002xx    LB        Load Byte
0001   -      0003xx    STB       Store Byte
-      -      0004xx    AB        Add Byte
-      -      0005xx    SB        Subtract Byte
-      -      0010xx    IB        Insert Byte
-      -      0011xx    UCB       Unsigned Compare Byte
-      -      0012xx    ULB       Unsigned Load Byte
-      -      0013xx    XB        XOR Byte
-      -      0014xx    NB        AND Byte
-      -      0015xx    OB        OR Byte
-      -      0016xx    STGB      Store if Greater Byte
-      -      0020xx    SWH       Swap Halfword
-      -      0021xx    CH        Compare Halfword
0010   -      0022xx    LH        Load Halfword
0011   -      0023xx    STH       Store Halfword
-      -      0024xx    AH        Add Halfword
-      -      0025xx    SH        Subtract Halfword
-      -      0026xx    MH        Multiply Halfword
-      -      0027xx    DH        Divide Halfword
-      -      0030xx    IH        Insert Halfword
-      -      0031xx    UCH       Unsigned Compare Halfword
-      -      0032xx    ULH       Unsigned Load Halfword
-      -      0033xx    XH        XOR Halfword
-      -      0034xx    NH        AND Halfword
-      -      0035xx    OH        OR Halfword
-      -      0036xx    MEH       Multiply Extensibly Halfword
-      -      0037xx    DEH       Divide Extensibly Halfword
-      00000  0040xx    SW        Swap
-      00001  0041xx    C         Compare
0100   00010  0042xx    L         Load
0101   00011  0043xx    ST        Store
-      00100  0044xx    A         Add
-      00101  0045xx    S         Subtract
-      00110  0046xx    M         Multiply
-      00111  0047xx    D         Divide
-      01001  0051xx    UC        Unsigned Compare
-      01011  0053xx    X         XOR
-      01100  0054xx    N         AND
-      01101  0055xx    O         OR
-      01110  0056xx    ME        Multiply Extensibly
-      01111  0057xx    DE        Divide Extensibly
-      -      0060xx    SWL       Swap Long
-      -      0061xx    CL        Compare Long
0110   -      0062xx    LL        Load Long
0111   -      0063xx    STL       Store Long
-      -      0064xx    AL        Add Long
-      -      0065xx    SL        Subtract Long
-      -      0066xx    ML        Multiply Long
-      -      0067xx    DL        Divide Long
-      -      0071xx    UCL       Unsigned Compare Long
-      -      0073xx    XL        XOR Long
-      -      0074xx    NL        AND Long
-      -      0075xx    OL        OR Long
-      -      0076xx    MEL       Multiply Extensibly Long
-      -      0077xx    DEL       Divide Extensibly Long
-      -      0100xx    SWM       Swap Medium
-      -      0101xx    CM        Compare Medium
1000   -      0102xx    LM        Load Medium
1001   -      0103xx    STM       Store Medium
-      -      0104xx    AM        Add Medium
-      -      0105xx    SM        Subtract Medium
-      -      0106xx    MM        Multiply Medium
-      -      0107xx    DM        Divide Medium
-      -      0110xx    MEUM      Multiply Extensibly Unnormalized Medium
-      -      0111xx    DEUM      Divide Extensibly Unnormalized Medium
-      -      0112xx    LUM       Load Unnormalized Medium
-      -      0113xx    STUM      Store Unnormalized Medium
-      -      0114xx    AUM       Add Unnormalized Medium
-      -      0115xx    SUM       Subtract Unnormalized Medium
-      -      0116xx    MUM       Multiply Unnormalized Medium
-      -      0117xx    DUM       Divide Unnormalized Medium
-      10000  0120xx    SWF       Swap Floating
-      10001  0121xx    CF        Compare Floating
1010   10010  0122xx    LF        Load Floating
1011   10011  0123xx    STF       Store Floating
-      10100  0124xx    AF        Add Floating
-      10101  0125xx    SF        Subtract Floating
-      10110  0126xx    MF        Multiply Floating
-      10111  0127xx    DF        Divide Floating
-      -      0130xx    MEU       Multiply Extensibly Unnormalized
-      -      0131xx    DEU       Divide Extensibly Unnormalized
-      -      0132xx    LU        Load Unnormalized
-      -      0133xx    STU       Store Unnormalized
-      -      0134xx    AU        Add Unnormalized
-      -      0135xx    SU        Subtract Unnormalized
-      -      0136xx    MU        Multiply Unnormalized
-      -      0137xx    DU        Divide Unnormalized
-      11000  0140xx    SWD       Swap Double
-      11001  0141xx    CD        Compare Double
1100   11010  0142xx    LD        Load Double
1101   11011  0143xx    STD       Store Double
-      11100  0144xx    AD        Add Double
-      11101  0145xx    SD        Subtract Double
-      11110  0146xx    MD        Multiply Double
-      11111  0147xx    DD        Divide Double
-      -      0150xx    MEUD      Multiply Extensibly Unnormalized Double
-      -      0151xx    DEUD      Divide Extensibly Unnormalized Double
-      -      0152xx    LUD       Load Unnormalized Double
-      -      0153xx    STUD      Store Unnormalized Double
-      -      0154xx    AUD       Add Unnormalized Double
-      -      0155xx    SUD       Subtract Unnormalized Double
-      -      0156xx    MUD       Multiply Unnormalized Double
-      -      0157xx    DUD       Divide Unnormalized Double
-      -      0160xx    SWQ       Swap Quad
-      -      0161xx    CQ        Compare Quad
1110   -      0162xx    LQ        Load Quad
1111   -      0163xx    STQ       Store Quad
-      -      0164xx    AQ        Add Quad
-      -      0165xx    SQ        Subtract Quad
-      -      0166xx    MQ        Multiply Quad
-      -      0167xx    DQ        Divide Quad
-      -      0170xx    MEUQ      Multiply Extensibly Unnormalized Quad
-      -      0171xx    DEUQ      Divide Extensibly Unnormalized Quad
-      -      0172xx    LUQ       Load Unnormalized Quad
-      -      0173xx    STUQ      Store Unnormalized Quad
-      -      0174xx    AUQ       Add Unnormalized Quad
-      -      0175xx    SUQ       Subtract Unnormalized Quad
-      -      0176xx    MUQ       Multiply Unnormalized Quad
-      -      0177xx    DUQ       Divide Unnormalized Quad
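The relationship between the three opcode columns can be expressed compactly. The following Python sketch is read off from the entries in the table rather than from any stated rule, so it should be taken as an observation about the table and not as a definition. The function names, and the assignment of the two "xx" digits to two three-bit fields called `field_a` and `field_b`, are assumptions made for illustration. The functions do not check whether the resulting seven-bit opcode is actually assigned.

```python
def split_first_halfword(halfword: int) -> tuple[int, int, int, int]:
    """Split a 16-bit first halfword into (prefix, opcode, field_a, field_b).

    Assumed layout: 3 prefix bits, a 7-bit opcode, two 3-bit register fields.
    """
    prefix = (halfword >> 13) & 0o7
    opcode = (halfword >> 6) & 0o177
    field_a = (halfword >> 3) & 0o7
    field_b = halfword & 0o7
    return prefix, opcode, field_a, field_b


def expand_four_bit(op4: int) -> int:
    """Semi-RISC load/store opcode tttb -> seven-bit opcode ttt 001 b.

    ttt selects the data type; b is 0 for a load and 1 for a store.
    """
    return ((op4 >> 1) << 4) | 0b0010 | (op4 & 1)


def expand_five_bit(op5: int) -> int:
    """Register scratchpad opcode -> seven-bit opcode.

    0oooo -> 040 + oooo (32-bit fixed-point),
    10ooo -> 120 + ooo  (32-bit floating-point),
    11ooo -> 140 + ooo  (64-bit floating-point).
    """
    if op5 < 0b10000:
        return 0o040 | op5
    if op5 < 0b11000:
        return 0o120 | (op5 & 0b111)
    return 0o140 | (op5 & 0b111)


# Load (L), with arbitrary register field values 3 and 5.
print(split_first_halfword(0o004235))
print(oct(expand_four_bit(0b1010)), oct(expand_five_bit(0b10010)))
```

In each case the short opcode selects a subset of the seven-bit opcode space: the four-bit form reaches only the load and store of each of the eight data types, and the five-bit form reaches the first sixteen opcodes of the 32-bit fixed-point group and the first eight of each of the two floating-point groups.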
The operate instructions have the same format as in normal mode, even to the extent that if an index register field is present, it can be used, just as it is available for the memory-reference instructions with five-bit opcodes. The one exception is that when the base register field is zero, the sixteen-bit address field is again replaced by a 32-bit field with an index and base register specification; the corresponding index register field in the instruction itself must be zero in that case.