This is now my fifth attempt to propose a successor to my original Concertina architecture, intended to be simple and straightforward enough to have a chance of gaining interest and acceptance.

The basic complement of registers is as follows:

There is a bank of 32 fixed-point registers; each one is 64 bits wide, since that is the maximum length used. As well, there is another bank of 32 floating-point registers, each one nominally termed 128 bits wide, as they are used to contain data types which do not exceed 128 bits in length in memory; however, they each have a few extra bits to indicate the type of data they contain, and are internally 151 bits wide so as to be able to accommodate extended-precision decimal floating-point numbers (one bit for sign, fourteen bits for exponent, and thirty-four decimal digits in packed decimal form, each taking four bits).
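
The 151-bit figure follows directly from the field widths given above. As a minimal check of that arithmetic (the constant and function names here are illustrative only, not part of the architecture):

```python
# Field widths as stated in the text; the names are illustrative only.
SIGN_BITS = 1
EXPONENT_BITS = 14
MANTISSA_DIGITS = 34
BITS_PER_DIGIT = 4  # packed decimal: one decimal digit per 4-bit nibble


def internal_width() -> int:
    """Bits needed for sign + exponent + packed-decimal mantissa."""
    return SIGN_BITS + EXPONENT_BITS + MANTISSA_DIGITS * BITS_PER_DIGIT


print(internal_width())  # 151
```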

There are also banks of 128 fixed-point registers and 128 floating-point registers; these are for the use of instructions that execute without interlocks, that is, without hardware checks that earlier instructions which modify their source registers have finished. Instead, those instructions contain bits that explicitly indicate which instructions they depend upon, thus functioning in the fashion associated with Digital Signal Processor (DSP) chips.

The fixed-point registers contain binary bits in the same format in which they appear in memory. These bits may represent a signed two's complement number or an unsigned binary number, and they may also represent a packed decimal number. Packed decimal numbers in the fixed-point registers are in a modified ten's complement form: if the first digit is from 0 to 9, the number is positive; negative numbers are indicated by a first digit from A through F, where F999 9999 9999 9999 represents -1, and A000 0000 0000 0000 represents -6,000,000,000,000,000, which is the negative number of largest magnitude that this representation provides for in a 64-bit register.
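
The modified ten's complement form can be read as giving the first digit a signed weight: 0 through 9 count as themselves, and A through F count as -6 through -1. The following is a minimal sketch of that reading, not a definitive specification; the function names `decode_packed` and `encode_packed` are hypothetical, and it assumes that only the first digit may take the values A through F.

```python
DIGITS = 16  # a 64-bit register holds sixteen 4-bit digits


def decode_packed(word: int) -> int:
    """Decode a 64-bit modified ten's complement packed decimal word."""
    if not 0 <= word < 1 << 64:
        raise ValueError("word must fit in 64 bits")
    nibbles = [(word >> (4 * i)) & 0xF for i in range(DIGITS - 1, -1, -1)]
    first, rest = nibbles[0], nibbles[1:]
    if any(d > 9 for d in rest):
        raise ValueError("only the first digit may exceed 9")
    low = int("".join(str(d) for d in rest))
    # First digit 0-9 has weight 0..9; A-F has weight -6..-1.
    weight = first if first <= 9 else first - 16
    return weight * 10 ** (DIGITS - 1) + low


def encode_packed(value: int) -> int:
    """Encode an integer into the same 64-bit representation."""
    if not -6 * 10**15 <= value < 10**16:
        raise ValueError("value out of range")
    # Floor division leaves 'low' in 0..10**15-1 even for negative values.
    weight, low = divmod(value, 10 ** (DIGITS - 1))
    first = weight if weight >= 0 else weight + 16
    word = first
    for ch in f"{low:015d}":
        word = (word << 4) | int(ch)
    return word


print(decode_packed(0xF999_9999_9999_9999))  # -1
print(hex(encode_packed(-1)))                # 0xf999999999999999
```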

The floating-point registers, on the other hand, do *not* contain numbers in the formats in which
they appear in memory.

For IEEE-754 floating-point numbers, the exponent field is expanded to the length used for extended-precision numbers, and the hidden first bit is restored; thus, the format is that of extended-precision numbers in memory, but with fewer mantissa bits. The total length of the number, however, will be greater than the size in memory of the type of number involved for types other than extended-precision types.
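
As an illustration of this expansion, the sketch below unpacks a 64-bit IEEE 754 number into a sign, a widened exponent, and a 53-bit mantissa with the hidden bit made explicit. The 15-bit exponent with a bias of 16383 is an assumption borrowed from the common 80-bit extended format, since the text does not give the width; the name `expand_double` is likewise hypothetical, and infinities and NaNs are left out.

```python
import struct

EXT_BIAS = 16383  # assumption: 15-bit exponent, as in 80-bit extended precision


def expand_double(x: float):
    """Return (sign, biased_exponent, mantissa) for a finite double.

    The mantissa is 53 bits wide with its leading 1 explicit
    (or 0 when the value is zero).
    """
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    sign = bits >> 63
    exp = (bits >> 52) & 0x7FF
    frac = bits & ((1 << 52) - 1)
    if exp == 0x7FF:
        raise ValueError("infinities and NaNs are not handled in this sketch")
    if exp == 0:
        if frac == 0:
            return sign, 0, 0
        # Denormal: shift the fraction up until bit 52 is set and
        # lower the exponent to match; the wider exponent has room for this.
        shift = 53 - frac.bit_length()
        return sign, 1 - 1023 + EXT_BIAS - shift, frac << shift
    return sign, exp - 1023 + EXT_BIAS, frac | (1 << 52)


sign, exp, mant = expand_double(5e-324)  # smallest denormal double
print(mant >> 52)  # 1
```

Under these assumptions a denormal in memory becomes an ordinary normalized number in the register, because the wider exponent field has room for the adjusted exponent.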

The floating-point registers, however, may also be used for other data types. The full list of basic types of data which they may contain is:

- Floating-point numbers which follow IEEE 754 conventions, including both types described in the standard, and other similar types with nonstandard lengths;
- Decimal floating-point numbers as described in later versions of the IEEE 754 standard;
- Compatible floating-point numbers, having the same format as floating-point numbers on the IBM System/360 computer, or the format referred to as Hexadecimal Floating-Point (HFP) in current IBM System z mainframe computers;
- Compatible decimal floating-point numbers, which have a format similar to compatible floating-point numbers, but which have a mantissa composed of packed decimal numbers, the exponent now being a power of ten, but still being represented in binary encoding and in excess-64 representation; floating-point numbers of this format were used by the Wang VS series of mainframe computers, only in the 64 bit size;
- Modified compatible decimal floating-point numbers, which use their first 12 bits for the sign bit and an excess-1024 exponent, followed by a mantissa of packed decimal numbers; this allows the full range of exponents, plus one extra exponent digit, usually seen in scientific pocket calculators to be represented, at the cost of the loss of one mantissa digit. This format is also only applied to floating-point numbers 64 bits in length.
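
The two 64-bit packed-decimal layouts in the list above can be compared by splitting a word into its fields. This is a sketch under stated assumptions: the compatible decimal format is taken to have a sign bit, a 7-bit excess-64 exponent, and 14 packed digits (by analogy with the System/360 layout), and the modified format a sign bit, an 11-bit excess-1024 exponent, and 13 packed digits. The function names are hypothetical, and the position of the decimal point within the mantissa is deliberately left uninterpreted.

```python
def split_decimal_fp(word: int, exp_bits: int, bias: int):
    """Split a 64-bit word into (sign, unbiased exponent, digit string)."""
    if not 0 <= word < 1 << 64:
        raise ValueError("word must fit in 64 bits")
    mant_bits = 63 - exp_bits
    sign = word >> 63
    exponent = ((word >> mant_bits) & ((1 << exp_bits) - 1)) - bias
    mant = word & ((1 << mant_bits) - 1)
    digits = f"{mant:0{mant_bits // 4}X}"
    if not digits.isdecimal():
        raise ValueError("mantissa contains a non-decimal digit")
    return sign, exponent, digits


def split_cdfp(word: int):
    """Compatible decimal: 1 sign bit, 7-bit excess-64 exponent, 14 digits."""
    return split_decimal_fp(word, exp_bits=7, bias=64)


def split_mcdfp(word: int):
    """Modified compatible decimal: 1 sign bit, 11-bit excess-1024 exponent, 13 digits."""
    return split_decimal_fp(word, exp_bits=11, bias=1024)


print(split_cdfp(0x4112345678901234))   # (0, 1, '12345678901234')
print(split_mcdfp(0xC631234567890123))  # (1, 99, '1234567890123')
```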

Associated with each floating-point register, but not directly accessible to the programmer, are
status bits which indicate the type of data currently contained in the register. Conversions between
types are *not* automatically performed, and an attempt to perform an operation involving a
floating-point type incompatible with that of the current contents of the register will result in that
register being set to a zero of the new type before the operation is performed.

Compatibility between types is as indicated by the following chart:

| | FP | CFP | DFP | CDFP | MCDFP |
|---|---|---|---|---|---|
| Floating-Point | * | 0 | 0 | 0 | 0 |
| Compatible Floating-Point | 0 | * | 0 | 0 | 0 |
| Decimal Floating-Point | 0 | 0 | * | X | X |
| Compatible Decimal Floating-Point | 0 | 0 | X | * | X |
| Modified Compatible Decimal Floating-Point | 0 | 0 | X | X | * |

- `*`: types match, operation performed without issue
- `X`: types are compatible, conversion takes place
- `0`: types are incompatible, register contents are zeroed before operation
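
The chart reduces to a simple rule: identical types match, the three decimal types convert among themselves, and every other pairing zeroes the register. A minimal sketch of that rule follows; the `FPType` enumeration and the `action` function are hypothetical names, and the returned symbols are those of the chart.

```python
from enum import Enum


class FPType(Enum):
    FP = "Floating-Point"
    CFP = "Compatible Floating-Point"
    DFP = "Decimal Floating-Point"
    CDFP = "Compatible Decimal Floating-Point"
    MCDFP = "Modified Compatible Decimal Floating-Point"


# The three decimal types share one internal representation.
DECIMAL_TYPES = {FPType.DFP, FPType.CDFP, FPType.MCDFP}


def action(register_type: FPType, operation_type: FPType) -> str:
    """Return '*', 'X' or '0' as in the compatibility chart."""
    if register_type is operation_type:
        return "*"  # types match
    if register_type in DECIMAL_TYPES and operation_type in DECIMAL_TYPES:
        return "X"  # compatible: conversion takes place
    return "0"      # incompatible: register is zeroed first


print(action(FPType.DFP, FPType.CDFP))  # X
print(action(FPType.FP, FPType.CFP))    # 0
```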

For the various decimal floating-point types, the internal register contents are the same, a mantissa composed of packed decimal digits, and a binary power-of-ten exponent.

For the two binary floating-point types, the internal register contents differ. For IEEE 754 types, the exponent is a power of two, and the mantissa is normalized so that its first binary digit is a one. For the compatible floating-point type, the exponent is a power of 16, and the mantissa is normalized so that its first hexadecimal digit is nonzero.

However, the floating-point registers also have **Load Multiple** and **Store
Multiple** instructions available which operate on them.

A **Store Multiple** instruction will function as a generic store instruction: the
contents of the register will be converted to their external representation in memory as appropriate for
their current type, and that will be stored.

A **Load Multiple** instruction will load the data from memory into the floating-point
registers without conversion, flagging the contents of the registers as untyped. When a floating-point
register has untyped content, then the conversion from memory representation to the appropriate internal
representation is deferred until a floating-point operation on that register takes place: while floating-point
operations do not convert *between* types, they **do** automatically convert from untyped
memory-image data.
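
The deferred conversion can be modelled as a small state machine attached to each register. The sketch below is only an illustration under several assumptions: it handles a single type (64-bit IEEE 754, big-endian in memory), it uses a Python float as a stand-in for the internal representation, and it assumes that storing a still-untyped register writes the memory image back unchanged. The class and function names are hypothetical.

```python
import struct

UNTYPED, FP = "untyped", "FP"


class FPRegister:
    """Toy model of one floating-point register with its hidden type tag."""

    def __init__(self):
        self.tag = UNTYPED
        self.payload = bytes(8)  # memory image while untyped, float once typed

    def load_multiple(self, image: bytes) -> None:
        # No conversion on load: keep the raw image and mark it untyped.
        self.tag, self.payload = UNTYPED, bytes(image)

    def as_fp(self) -> float:
        # A floating-point operation converts untyped data on first use.
        if self.tag == UNTYPED:
            self.payload = struct.unpack(">d", self.payload)[0]
            self.tag = FP
        return self.payload

    def set_fp(self, value: float) -> None:
        self.tag, self.payload = FP, value

    def store_multiple(self) -> bytes:
        if self.tag == UNTYPED:
            return self.payload  # assumption: untyped data is stored as-is
        return struct.pack(">d", self.payload)


def fadd(dst: FPRegister, src: FPRegister) -> None:
    """dst <- dst + src, converting untyped operands as a side effect."""
    dst.set_fp(dst.as_fp() + src.as_fp())


a, b = FPRegister(), FPRegister()
a.load_multiple(struct.pack(">d", 1.5))
b.load_multiple(struct.pack(">d", 2.25))
print(a.tag)  # untyped
fadd(a, b)
print(a.tag)  # FP
```

A save-and-restore sequence that only uses Load Multiple and Store Multiple never triggers a conversion in this model, which is what lets context switches move register contents without knowing their types.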

In this way, programs which only use one basic format of floating-point data may operate transparently without concern for the fact that the internal format of floating-point data in floating-point registers is not the same as that used in memory.

The intent of this conversion is to make floating-point computation more efficient; one specific benefit is that denormals are processed as quickly as ordinary floating-point numbers.

Four sets of eight base registers are provided separately from the arithmetic-index registers; as the contents of these registers are normally static, this both keeps the arithmetic-index registers available for calculations and maintains the compactness of instruction formats. The first set of eight base registers is used in jump and jump to subroutine instructions, and the same eight registers are used for those instructions regardless of the size of the segment implied by the instruction format. The other three sets of base registers are used for all other memory-reference instructions. The second set of eight base registers, the short base registers, is used to point to segments that are 4,096 bytes in length; the third set of eight base registers, the long base registers, is used to point to segments that are 65,536 bytes in length, and the fourth set of base registers, the extended base registers, is used to point to segments that are 1,048,576 bytes in length.
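
The three segment sizes correspond to displacement fields of 12, 16, and 20 bits. The sketch below makes that relationship explicit and shows one plausible way an address might be formed; the assumption that the effective address is simply base plus displacement plus an optional index, wrapped to 64 bits, is mine, as are the names used.

```python
# Segment sizes in bytes, as given in the text.
SEGMENT_BYTES = {"short": 4_096, "long": 65_536, "extended": 1_048_576}


def displacement_bits(kind: str) -> int:
    """Number of displacement bits needed to reach every byte of a segment."""
    return (SEGMENT_BYTES[kind] - 1).bit_length()


def effective_address(base: int, displacement: int, kind: str, index: int = 0) -> int:
    """Assumed address formation: base + displacement + index, modulo 2**64."""
    if not 0 <= displacement < SEGMENT_BYTES[kind]:
        raise ValueError(f"displacement does not fit a {kind} segment")
    return (base + displacement + index) & (2**64 - 1)


for kind in SEGMENT_BYTES:
    print(kind, displacement_bits(kind))
# short 12
# long 16
# extended 20
```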

Many instructions have a three-bit index register field.

Sixteen-bit register-to-register instructions are provided, which conserve opcode space by dividing the thirty-two registers in both the integer and floating-point banks into four groups of eight registers, with the intention of allowing four instruction sequences, each using eight of the registers, to be interleaved in order to assist with efficient instruction pipelining.

A similar principle is used with the index register field; the meaning of
the index register field *may* be determined by the destination register
field of the instruction, so that a register from the same group of eight registers
as the destination register is used as the index register.

However, it may be more appropriate for an index register to be shared among the subthreads of a calculation instead of being local to each one, so some values instead always refer to a register in the first group of eight arithmetic-index registers.

Thus, the index register field is to be interpreted as follows:

| Field value | Register used for indexing |
|---|---|
| 0 | No indexing |
| 1 | 1 |
| 2 | 2 |
| 3 | 3 |
| 4 | 4/12/20/28 |
| 5 | 5/13/21/29 |
| 6 | 6/14/22/30 |
| 7 | 7/15/23/31 |
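
Read together with the preceding paragraphs, the table says that field values 1 through 3 always select a shared register in the first group, while values 4 through 7 select a register in the same group of eight as the destination register. A minimal sketch of that interpretation, with a hypothetical function name:

```python
from typing import Optional


def index_register(field: int, dest: int) -> Optional[int]:
    """Resolve a 3-bit index field, given the destination register (0-31).

    Returns None when no indexing takes place.
    """
    if not 0 <= field <= 7:
        raise ValueError("index field is three bits wide")
    if not 0 <= dest <= 31:
        raise ValueError("destination register must be 0-31")
    if field == 0:
        return None                # no indexing
    if field <= 3:
        return field               # shared registers 1-3 in the first group
    return (dest & ~7) | field     # same group of eight as the destination


print(index_register(5, 13))  # 13
print(index_register(4, 31))  # 28
print(index_register(2, 31))  # 2
```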

The short vector registers are 256 bits wide. There are sixteen such registers.

There are 64 long vector registers; each long vector register will contain 64 entries. These entries will be 128 bits long.

The instruction formats are as shown below:

The floating-point formats intended for regular use that are supported by the architecture are as shown below:

In addition, as will be described in a later section, Decimal Floating Point is supported, as is the Compatible floating-point format, which corresponds to that used in certain popular mainframe computers. Also, two additional decimal floating point formats, with different exponent sizes, that use packed decimal instead of Densely Packed Decimal are provided.