
Universal Mode

Universal mode has been carefully designed to meet the requirements of compilers. Its main distinguishing feature is how it deals with instructions using indexed addressing.

A typical program module is unlikely to make use of 4,096 simple variables, or even 512 simple variables. Despite this, the short page modes tend to be restrictive in practice. This is because a program module may refer to more than seven arrays, and an array may extend over an area more than 4,096 bytes in size, which means that one base register value has to be defined to reach each such array in use.

Universal mode attempts to deal with this while also making more effective use of opcode space. Instead of placing a destination register, source index, and source base field in every memory-reference instruction, it modifies the form of indexed instructions so that the number of register fields is reduced to two.

A normal memory-reference instruction can come in three forms.

The address field in an indexed memory-reference instruction can have two possible forms.

Thus, a program needs only to allocate one of the array scratchpad registers in order for its instructions to have the ability to refer to 4,096 different arrays of arbitrary size.
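As a rough illustration of why a single array scratchpad register can reach so many arrays, the sketch below models an indirect post-indexed lookup in Python. It rests on assumptions that are not stated in the text: that the 12-bit address field selects one entry in a table of array start addresses, that the array scratchpad register points at that table, and that each entry occupies four bytes. The names `effective_address`, `ENTRY_BYTES`, and the dictionary standing in for memory are hypothetical.

```python
ENTRY_BYTES = 4   # assumed size of one table entry (not given in the text)
FIELD_BITS = 12   # width of the address field in this form


def effective_address(memory, table_base, field, index):
    """Fetch the array's start address from the table, then add the index."""
    if not 0 <= field < (1 << FIELD_BITS):
        raise ValueError("address field must fit in 12 bits")
    array_start = memory[table_base + field * ENTRY_BYTES]
    return array_start + index


# A two-entry table at 0x1000, reached through one array scratchpad register value.
memory = {
    0x1000 + 0 * ENTRY_BYTES: 0x20000,  # start of a first array
    0x1000 + 1 * ENTRY_BYTES: 0x90000,  # start of a second array
}
addr = effective_address(memory, 0x1000, 1, 0x30)
```

Under these assumptions a 12-bit field distinguishes 2^12 = 4,096 table entries, and because the array's start address comes from the table rather than from the instruction, the size of each array is not limited by the width of the field.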

The source index field in an indexed memory-reference instruction may be zero, indicating that indexing is not taking place. For the instructions with a long displacement, this is useful because it allows references to specific elements of arrays indicated with a constant subscript. To achieve the same goal in the short displacement form, a zero index field in such an instruction instead indicates absolute addressing, where a base register is not required, because the address field in the instruction is 32 bits in length, or 64 bits in length if 64-bit addressing is in effect.

Note also that, while it may seem odd to use 28 bits for a displacement, and three bits to specify a base register, for a total of 31 bits, when 32 bits are enough to specify an absolute address, this addressing mode becomes much more reasonable and relevant when 64-bit addressing is used.

Many of the formats of the memory reference instructions in this mode are shown below:

Since 2 is not used as an opcode for shift instructions, opcode space is available to squeeze in the population count instructions.

Since neither 07 nor 17 is used as an opcode for the bit field instructions, opcode space is available to squeeze in the long vector instructions.

In all the long vector instruction formats, both those shown above, and those to be described in the next illustration below, the opcodes for certain functions which are not usable in vector operations are instead used for tests which set the bits of the mask register specified in the instruction.

The unsigned compare, multiply extensibly, and divide extensibly instructions are used for this purpose. These are integer opcodes only. Since the operations are single-address in nature, the operand is indicated by the source field, and the destination field must indicate a register; if that register is register 0, an operation on an integer type is indicated; if that register is register 4, an operation on the corresponding floating type is indicated.

0001001 (0)  SMBLVZB   Set Mask Bit Long Vector if Zero Byte
0001110 (0)  SMBLVPB   Set Mask Bit Long Vector if Positive Byte
0001111 (0)  SMBLVNB   Set Mask Bit Long Vector if Negative Byte

0001001 (4)  SMBLVZM   Set Mask Bit Long Vector if Zero Medium
0001110 (4)  SMBLVPM   Set Mask Bit Long Vector if Positive Medium
0001111 (4)  SMBLVNM   Set Mask Bit Long Vector if Negative Medium

0011001 (0)  SMBLVZH   Set Mask Bit Long Vector if Zero Halfword
0011110 (0)  SMBLVPH   Set Mask Bit Long Vector if Positive Halfword
0011111 (0)  SMBLVNH   Set Mask Bit Long Vector if Negative Halfword

0011001 (4)  SMBLVZF   Set Mask Bit Long Vector if Zero Floating
0011110 (4)  SMBLVPF   Set Mask Bit Long Vector if Positive Floating
0011111 (4)  SMBLVNF   Set Mask Bit Long Vector if Negative Floating

0101001 (0)  SMBLVZ    Set Mask Bit Long Vector if Zero
0101110 (0)  SMBLVP    Set Mask Bit Long Vector if Positive
0101111 (0)  SMBLVN    Set Mask Bit Long Vector if Negative

0101001 (4)  SMBLVZD   Set Mask Bit Long Vector if Zero Double
0101110 (4)  SMBLVPD   Set Mask Bit Long Vector if Positive Double
0101111 (4)  SMBLVND   Set Mask Bit Long Vector if Negative Double

0111001 (0)  SMBLVZL   Set Mask Bit Long Vector if Zero Long
0111110 (0)  SMBLVPL   Set Mask Bit Long Vector if Positive Long
0111111 (0)  SMBLVNL   Set Mask Bit Long Vector if Negative Long

0111001 (4)  SMBLVZQ   Set Mask Bit Long Vector if Zero Quad
0111110 (4)  SMBLVPQ   Set Mask Bit Long Vector if Positive Quad
0111111 (4)  SMBLVNQ   Set Mask Bit Long Vector if Negative Quad
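The table follows a regular pattern: the first three opcode bits together with the destination register field (0 or 4) select the operand type, and the last four bits select the test. A minimal Python sketch of a decoder built on that reading is given below. It assumes the mnemonics follow the Z/P/N pattern throughout, and the function name `decode_smblv` is hypothetical.

```python
# Last four opcode bits select the test.
TESTS = {0b1001: ("Z", "Zero"), 0b1110: ("P", "Positive"), 0b1111: ("N", "Negative")}

# (first three opcode bits, destination register) -> (suffix, type name).
# Register 0 indicates the integer type, register 4 the floating type.
TYPES = {
    (0b000, 0): ("B", "Byte"),     (0b000, 4): ("M", "Medium"),
    (0b001, 0): ("H", "Halfword"), (0b001, 4): ("F", "Floating"),
    (0b010, 0): ("", ""),          (0b010, 4): ("D", "Double"),
    (0b011, 0): ("L", "Long"),     (0b011, 4): ("Q", "Quad"),
}


def decode_smblv(opcode, dest_reg):
    """Return (mnemonic, description), or None if the pair is not in the table."""
    test = TESTS.get(opcode & 0b1111)
    operand_type = TYPES.get((opcode >> 4, dest_reg))
    if test is None or operand_type is None:
        return None
    mnemonic = "SMBLV" + test[0] + operand_type[0]
    description = ("Set Mask Bit Long Vector if " + test[1] + " " + operand_type[1]).rstrip()
    return mnemonic, description
```

For example, `decode_smblv(0b0011110, 4)` yields the entry for a positive test on the floating type, while any opcode or register value outside the table returns `None`.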

Compare instructions exist for integer and floating-point types, and they are also not usable with vector operations. Here, they act not only for the long vector instructions, but for the other memory-to-memory vector instructions as well, to indicate multiply and accumulate instructions.

0010001  MAH  Multiply and Accumulate Halfword
0100001  MA   Multiply and Accumulate
0110001  MAL  Multiply and Accumulate Long
1000001  MAM  Multiply and Accumulate Medium
1010001  MAF  Multiply and Accumulate Floating
1100001  MAD  Multiply and Accumulate Double
1110001  MAQ  Multiply and Accumulate Quad
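Read as a bit pattern, each of these opcodes ends in 0001, with the first three bits selecting the operand type. The following Python sketch decodes them on that basis; the function name `decode_ma` is hypothetical, and only the seven opcodes listed above are treated as defined.

```python
# First three opcode bits -> (mnemonic suffix, type name), as listed in the table.
MA_TYPES = {
    0b001: ("H", "Halfword"),
    0b010: ("", ""),
    0b011: ("L", "Long"),
    0b100: ("M", "Medium"),
    0b101: ("F", "Floating"),
    0b110: ("D", "Double"),
    0b111: ("Q", "Quad"),
}


def decode_ma(opcode):
    """Return (mnemonic, description) for a listed opcode, otherwise None."""
    if (opcode & 0b1111) != 0b0001:
        return None
    operand_type = MA_TYPES.get(opcode >> 4)
    if operand_type is None:
        return None
    mnemonic = "MA" + operand_type[0]
    description = ("Multiply and Accumulate " + operand_type[1]).rstrip()
    return mnemonic, description
```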

In addition, it is illegal for the source and destination registers to be the same in the register-to-register instructions. (This is also true for the bit-matrix multiply instructions.) This fact can be used to squeeze in additional long vector addressing modes:

Where both the source and destination register fields in what would have been a register-to-register instruction are zero, an orthogonal set of scalar instructions is provided that allows for instructions that refer to memory, registers, or the sixty-four scratchpad registers.

Where both the source and destination register fields in what would have been a register-to-register instruction are one, we have a three-address memory-to-memory vector instruction. The P bit, if 1, enables overlap with subsequent instructions, and the C bit, if 1, bars the use of the external vector coprocessor for the instruction, as with the corresponding addressing mode within the vector register instruction mode. The K bit, applicable to the operand and source operands, allows that operand to be a scalar rather than a vector.

Where both the source and destination register fields in what would have been a register-to-register instruction are two, we have the three-address long-vector instruction. This instruction has three operands, which may orthogonally be register or memory operands. When an operand is in a vector register, or is a constant in a scalar register, the corresponding scratchpad index field is not used, and is zero. The K bit is used as in the format above. Note that, as with many of the other addressing modes shown, only one of the many possible combinations of operand addressing modes is illustrated for this format. Each of the three operands may be any of the following: the scratchpad registers (used as a vector accumulator); one of the eight vector registers; one of the sixty-four elements of the vector scratchpad; an operand in memory indicated by one of the eight base registers and a 16-bit address; an indexed operand in memory indicated by one of the eight scratchpad registers and a 28-bit address; or an indirect post-indexed operand in memory, with a 12-bit address and one of the eight array scratchpad registers used to indicate the address to be indexed to find the operand value. With six choices for each of the three operands, there are thus 6 × 6 × 6 = 216 possible combinations of addressing modes in this type of instruction.
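The count of 216 can be checked by enumerating the combinations directly. The short Python sketch below does so; the labels in `OPERAND_MODES` are informal names chosen here for the six operand forms listed above, not terms defined by the architecture.

```python
from itertools import product

# Informal labels for the six operand forms described in the text.
OPERAND_MODES = [
    "scratchpad registers as vector accumulator",
    "vector register",
    "vector scratchpad element",
    "memory: base register + 16-bit address",
    "memory: indexed, scratchpad register + 28-bit address",
    "memory: indirect post-indexed, 12-bit address",
]

# Each of the three operands independently takes one of the six forms.
combinations = list(product(OPERAND_MODES, repeat=3))
count = len(combinations)
```

Here `count` is 6 × 6 × 6 = 216, matching the figure given above.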

In the case where the destination operand is a register operand, the first three bits of the otherwise unused scratchpad index field are used as an op2 field, providing three additional opcode bits. A value of zero in this field allows instructions to be interpreted normally. At present, the only other value with a defined action is 1, which indicates that the following opcodes are to be interpreted as bit-reversed load and shuffle instructions for accelerating a fast Fourier transform in the Pease framework.

0000000  BRL16B    0010000  BRL16H    0100000  BRL16     0110000  BRL16L

0000010  BRL32B    0010010  BRL32H    0100010  BRL32     0110010  BRL32L
0000011  SH32B     0010011  SH32H     0100011  SH32      0110011  SH32L
0000100  BRL64B    0010100  BRL64H    0100100  BRL64     0110100  BRL64L
0000101  SH64B     0010101  SH64H     0100101  SH64      0110101  SH64L
0000110  US128B    0010110  US128H    0100110  US128     0110110  US128L
0000111  SH128B    0010111  SH128H    0100111  SH128     0110111  SH128L


1000000  BRL16M    1010000  BRL16F    1100000  BRL16D    1110000  BRL16Q

1000010  BRL32M    1010010  BRL32F    1100010  BRL32D    1110010  BRL32Q
1000011  SH32M     1010011  SH32F     1100011  SH32D     1110011  SH32Q
1000100  BRL64M    1010100  BRL64F    1100100  BRL64D    1110100  BRL64Q
1000101  SH64M     1010101  SH64F     1100101  SH64D     1110101  SH64Q
1000110  US128M    1010110  US128F    1100110  US128D    1110110  US128Q
1000111  SH128M    1010111  SH128F    1100111  SH128D    1110111  SH128Q
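These opcodes appear to follow a regular layout, with the first three bits selecting the operand type and the last four bits selecting the operation. The Python sketch below decodes them on that assumption, under which every Floating entry begins with 101 (so BRL16F falls at 1010000 and the Floating shuffle at 1010011 is SH32F). The function name `decode_fft_op` is hypothetical.

```python
# Mnemonic suffix indexed by the first three opcode bits.
TYPE_SUFFIX = ["B", "H", "", "L", "M", "F", "D", "Q"]

# Last four opcode bits -> operation, as listed in the table.
OPERATIONS = {
    0b0000: "BRL16",
    0b0010: "BRL32",
    0b0011: "SH32",
    0b0100: "BRL64",
    0b0101: "SH64",
    0b0110: "US128",
    0b0111: "SH128",
}


def decode_fft_op(opcode):
    """Return the mnemonic for a seven-bit opcode, or None if it is not listed."""
    operation = OPERATIONS.get(opcode & 0b1111)
    if operation is None:
        return None
    return operation + TYPE_SUFFIX[(opcode >> 4) & 0b111]
```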

Where both the source and destination register fields in what would have been a register-to-register instruction are three, then we have memory to vector-register instructions with stride.

Although these instructions use seven-bit opcodes, they are unlike the seven-bit opcodes used with other instruction modes, which may take on only 96 values. The seven-bit opcodes in universal mode may take on all 128 possible values, and are interpreted as the last seven bits of an eight-bit opcode, as used in full opcode mode, which begins with zero. Thus, the unnormalized floating-point arithmetic instructions are available in this mode.

Note that, in addition to the modifications to the format of the extended operate and long vector instructions, as compared to the format used in comprehensive mode, to fit into the available opcode space for this mode and to use seven-bit opcodes, where a source index and source base field are used, the source index field becomes a source mode field;

This same modification is made to the operate instructions, as shown below for the character and packed decimal operate instructions:

The remaining operate instructions have the form shown below:

For the jump instructions, when a conventional base register is used, the scratchpad pointer registers instead of the base registers are used, and displacements are in halfwords instead of bytes, since those registers only point into code.
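The effect of counting displacements in halfwords can be sketched in Python as follows. This assumes a halfword is two bytes, and it treats the displacement simply as a count to be scaled; the function name `jump_target` and the constant `HALFWORD_BYTES` are hypothetical.

```python
HALFWORD_BYTES = 2  # assumption: a halfword is 16 bits, i.e. two bytes


def jump_target(pointer, displacement_halfwords):
    """Byte address reached from a scratchpad pointer register value."""
    return pointer + displacement_halfwords * HALFWORD_BYTES


target = jump_target(0x4000, 3)
```

Under this assumption, a displacement field of a given width reaches twice as far into code as the same field would if it counted bytes.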

Note that not every possible addressing mode is illustrated in the diagrams above, but some of the addressing modes not illustrated are not possible, as they are not applicable, such as jumping to a register.

Note that while significant efforts were made to provide most of the functionality of vector register mode in this mode, it will still be necessary to use the INWM instruction or similar measures to invoke cache-internal parallel computing, or to perform multi-way vector operations.

Register Scratchpad Universal Mode

Register scratchpad universal mode is identical to universal mode, except that the bit-matrix multiply instructions are omitted, and so are accessible only by means of the INWM instruction or similar measures; they are replaced by instructions in the following format:

These instructions use register 0 of the appropriate type as their destination operand, and one of the sixty-four supplementary registers as their source operand. Note that although these are register to register instructions, the store instruction is applicable to them.

