
The Direct Cache Modes

This page describes an alternate form of each of the addressing modes we have seen thus far. These forms allow the use of two distinct address spaces: that of conventional main memory, and that of a 256 kilobyte area of high-performance memory.

In these modes, a base register field can contain one of three kinds of value. A value of 0 is used in determining the addressing mode. A value of 1, 2, or 3 selects a base register; these are the only base registers that can be used to point to conventional memory for addressing in the mode. A value of the form 1xx allows an instruction to refer to high-speed storage: the two bits xx are prepended to the corresponding address field, which is usually 16 bits in length, but 12 bits in length for the short page modes.
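The three cases of the base register field can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `resolve_operand`, the tuple it returns, and the treatment of the base register field as a 3-bit value are hypothetical choices made for the example, not part of the architecture definition.

```python
def resolve_operand(base_field, address_field, address_bits=16):
    """Classify an operand from a 3-bit base register field (assumed width).

    Returns a tuple (space, base_register, address):
      ("mode-select", None, None)  base field 0: used in determining the mode
      ("main", r, address)         base field 1..3: base register r, main memory
      ("cache", None, address)     base field 1xx: xx prepended to the address
    """
    if not 0 <= base_field <= 0b111:
        raise ValueError("base register field is assumed to be 3 bits")
    if not 0 <= address_field < (1 << address_bits):
        raise ValueError("address field does not fit in address_bits")

    if base_field == 0:
        return ("mode-select", None, None)
    if base_field < 0b100:
        return ("main", base_field, address_field)

    # Base field has the form 1xx: prepend xx to the address field.
    xx = base_field & 0b11
    return ("cache", None, (xx << address_bits) | address_field)


print(resolve_operand(0b110, 0x1234))       # 16-bit address field
print(resolve_operand(0b101, 0xABC, 12))    # 12-bit short page address field
```

With a 16-bit address field the cache address is 18 bits long, and with a 12-bit address field it is 14 bits long.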

Note that, in the case of the operate instructions, this will lead in some cases to addresses referring to cache memory whose first two bits are not necessarily immediately adjacent to the remainder of the address because of the way in which some of the operate instructions are organized.

The principle of having instructions refer to a high-speed storage instead of main memory as used here is inspired by the Control Data 1604 computer, which had a small main memory, and a larger random-access memory which was slower, and which could only be accessed through block transfers. Unlike that computer, these addressing modes retain the ability to operate directly on main memory; attempting to eliminate that ability would run into a problem for most addressing modes, because it would conflict with the use of zero in the base register field to indicate a register to register instruction. While there are some modes which do not use a base register value of zero to distinguish between instruction formats, doubling the size of the allocated cache only for those modes at the cost of denying them the normal access to main memory available in other modes has been avoided as leading to unwarranted additional complexity.

In some modes, the base register field or the address field is different in length, leading to a different maximum amount of usable high-speed storage. The number of bytes that can be used in the cache for each of the full cache modified addressing modes is shown below:

Code        Mode                                                          Bytes usable
1 000 0000: direct cache normal mode                                           262,144
1 000 0001: direct cache aligned operand mode                                  262,144
1 000 1000: direct cache extended operate mode                                 262,144
1 000 1011: direct cache aligned condensed mode                                131,072
1 001 1000: direct cache symmetric address mode                                262,144
1 001 1100: direct cache symmetric vector register mode                      1,048,576
1 010 0000: direct cache stateless scratchpad mode                             262,144
1 010 0001: direct cache register scratchpad mode                              262,144
1 010 0111: direct cache simple compact mode                                   262,144
1 010 1000: direct cache stateful scratchpad mode                              262,144
1 010 1010: direct cache mutable scratchpad mode                               262,144
1 010 1100: direct cache plain stateful scratchpad mode                        262,144
1 010 1110: direct cache plain mutable scratchpad mode                         262,144
1 011 0000: direct cache vector register mode                                1,048,576
1 111 1101: direct cache stack machine mode                                    262,144
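
Each size in the table is a power of two, so it corresponds to a width in bits for a byte address within the cache. The short Python sketch below only performs that arithmetic; the helper name `address_width` is hypothetical, and the sketch does not say how any given mode divides those bits between its base register field and its address field.

```python
def address_width(size_bytes):
    """Return the number of address bits needed to span size_bytes bytes."""
    if size_bytes <= 0 or size_bytes & (size_bytes - 1):
        raise ValueError("size must be a positive power of two")
    return size_bytes.bit_length() - 1


# The three distinct sizes that appear in the table.
for size in (131_072, 262_144, 1_048_576):
    print(size, "bytes ->", address_width(size), "address bits")
```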

Note that for the direct cache versions of the selective short page modes, only instructions where a base register field is selected, that is, where the first bit of a 16-bit address element is a one, are modified by the change to the direct cache version of the mode; when that bit is zero, the instructions refer to a 32,768-byte segment of normal memory, as pointed to by base/address register zero, in the normal fashion.

For direct cache universal mode, only memory-reference instructions without indexing can refer to the cache; this limits the applicability of this feature to that mode, but is unavoidable, since otherwise three areas of different sizes would have to be separately allocated in the cache.

In direct cache double base mode, the cache is referenced when the base select bit is zero; since only one bit in the instruction is used to indicate the base register in use, the area in cache used is the same size as, rather than larger than, the area in memory indicated by a base register.

It might seem that direct cache mode would not be applicable in the selective or extended short page modes. However, that is not the case.

When a selective short page mode is used, the cache is referenced when the bit normally used as the indirect bit is a 1, indicating that a base register is specified, and what would have been the base register field begins with a 0. In this way, when the indirect bit is 0, a large page in memory is referenced, pointed to by base/address register 0; and when the indirect bit is 1, followed by 0 as the first bit of the fifteen bits following, a page half that size, spanned by a fourteen-bit address, is used in cache.
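
The decoding just described can be sketched as follows. This is a rough illustration only: it assumes that the "first bit" of the 16-bit address element is its most significant bit, the name `classify_selective` and the returned labels are hypothetical, and the remaining case (a base register field that begins with 1) is merely labelled rather than decoded, since its layout is not covered here.

```python
def classify_selective(element):
    """Classify a 16-bit address element in a selective short page mode.

    Assumes the first bit is the most significant bit (bit 15).
    """
    if not 0 <= element <= 0xFFFF:
        raise ValueError("address element must fit in 16 bits")

    if element & 0x8000 == 0:
        # Leading bit 0: 15-bit address within the 32,768-byte page in main
        # memory pointed to by base/address register 0.
        return ("main-page", element & 0x7FFF)
    if element & 0x4000 == 0:
        # Leading bits 10: 14-bit address within a 16,384-byte page in cache.
        return ("cache", element & 0x3FFF)
    # Leading bits 11: an ordinary base register reference (not decoded here).
    return ("base-register", None)


print(classify_selective(0x1234))
print(classify_selective(0x8001))
print(classify_selective(0xC000))
```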

When an extended short page mode is used, then the bit used in the regular short page modes as the indirect bit needs to be zero, indicating a page in memory with a short address, for a cache reference to be possible; thus, base registers 4 through 7 contain the starting points for 4,096-byte short pages in main memory, and array scratchpad registers 1 through 7 contain the starting points for 268,435,456-byte long pages in main memory.

Note that in the various short memory reference modes, since the displacement is 15 bits long, the extra bit differentiating between normal memory-reference instructions and indexed instructions separates the first two bits of a direct cache address from the remaining 15 bits of that direct cache address.

As for the block transfer instructions, there is a limited amount of opcode space that is unused in the area from which the single operand, floating single-operand, and normalize operations are derived. Therefore, in the direct cache vector register mode, and the direct cache symmetric vector register mode, the block transfer instructions would take this form:

and in most of the other direct cache modes, the block transfer instructions look like this:

This provides, in these modes, the two instructions:

174000 BLC    Block Load Cache
175000 BSTC   Block Store Cache

Note that cache-internal parallel computing leads to an additional, separate, allocation of cache to a process.

(Note that yet another means of specifying accesses to cache memory is available; a postfix supplementary bit mode is available which provides an additional bit for each instruction halfword which can be used to indicate if cache memory is used for each address in the instruction. Either the block transfer instructions to be specified for the full cache modes can be used in this case, or the INWM instruction can be used to execute a block transfer instruction from one of the direct cache modes; in this case, both will refer to the same cache allocation. In the case of the vector register modes, the latter approach is particularly useful, as the former would allow a maximum block length of only half the potential allocation of cache memory. This assumes this mode is not being used with a full cache mode; in that case, the full cache block transfer instructions would refer to the allocation of cache indicated within the instruction proper, and an INWM instruction referring to a direct cache mode would be needed to refer to the allocation indicated by the supplementary bit. If a direct cache mode is involved, the instructions peculiar to the direct cache mode itself always refer to the allocation accessed by means of the mechanism in that direct cache mode; the full cache instructions are available for the allocation the use of which is indicated by the supplementary bit when they are otherwise unused.)

An Important Note

When the bit in the Program Status Block labelled 32/28 bit displacements is set, the 16-bit (or 12-bit, or 18-bit) address fields referring to main memory are lengthened by an additional 16 bits, but the 18-bit (or 14-bit, or 20-bit) address fields referring to high-performance memory are not changed or extended.
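
As a small worked illustration of this rule, the Python sketch below computes the effective width of an address field. The function name `field_width` and its parameters are hypothetical; the only facts it encodes are those stated above, namely that a cache reference is two bits wider than the corresponding main memory field and that the long-displacement bit adds 16 bits to main memory fields only.

```python
def field_width(base_width, refers_to_cache, long_displacements):
    """Effective address field width in bits.

    base_width          width of the ordinary main memory field (16, 12, or 18)
    refers_to_cache     True if the field addresses high-performance memory
    long_displacements  state of the "32/28 bit displacements" bit
    """
    if refers_to_cache:
        # Two prepended bits; never extended by the long-displacement bit.
        return base_width + 2
    return base_width + 16 if long_displacements else base_width


for width in (16, 12, 18):
    print(width,
          "main:", field_width(width, False, True),
          "cache:", field_width(width, True, True))
```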

Alternate Data Memory Widths

The reserved high-speed memory consists of 256 kilobytes, and the basic unit for a block transfer is 512 bytes, when normal 32-bit words are used. When data memory width control is used to modify the size of a basic unit of data, these sizes are affected as follows:

Word Size           High-Speed Memory Size      Block Transfer Unit
32-bit word         64K words                   128 words
24-bit word         64K words                   128 words
36-bit word         32K words                    64 words
40-bit word         32K words                    64 words
60-bit word         32K words                    64 words
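
One way to read the table is that the ratio between the memory size and the block transfer unit stays the same for every word size. The Python sketch below simply checks that arithmetic on the rows as given; the names `ROWS` and `blocks_in_memory` are hypothetical, and nothing beyond the table's own figures is assumed.

```python
# Rows copied from the table:
# (word size in bits, memory size in words, block transfer unit in words)
ROWS = [
    (32, 64 * 1024, 128),
    (24, 64 * 1024, 128),
    (36, 32 * 1024, 64),
    (40, 32 * 1024, 64),
    (60, 32 * 1024, 64),
]


def blocks_in_memory(memory_words, block_words):
    """Number of block transfer units that fit in the high-speed memory."""
    return memory_words // block_words


for bits, memory_words, block_words in ROWS:
    print(bits, "bit words:", blocks_in_memory(memory_words, block_words), "blocks")
```

For the 32-bit case, 64K words of four bytes each is the 256 kilobytes stated in the text, and 128 such words is the 512-byte block transfer unit.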

Important Note

Switching from one instruction mode to another is normally a nonprivileged operation. However, reserving cache memory for the exclusive use of any one process is a privileged operation, and this would need to be done either through privileged instructions, or through a request made of the operating system, before instructions referencing high-speed memory in this mode would function.

Note also that the cache would still be used in the normal fashion to improve the efficiency of operations referencing main memory, although data in the cache as the result of a regular main memory access may be given a lower priority (that is, a faster rate of aging) than data in the regular cache belonging to other processes, since this mode indicates by its very nature that main memory is intended to be referenced only occasionally in a direct fashion.

Also note that a process in this mode may be given exclusive use of an area of the cache, or a 256 kilobyte area of main memory may be mapped to the cache, and references to that memory would simply be given a higher priority in the cache than references to other memory. In that case, referencing the high-performance area simply acts as a form of cache hint. Since this would make block transfers into memory-to-memory moves, they may be performed by the external vector coprocessor units, if available. It is intended that high-performance implementations of this architecture would allow processes with an actual area of cache reserved for them to execute concurrently with other processes for which the high-performance area is mapped to an area of memory.

Also, in a simpler implementation, even in the complete absence of cache, the mode might improve performance simply by making it unnecessary to add the base register contents to addresses. (Mapping the high-performance area to a fully-aligned block would allow its addresses to be formed simply by appending bits to the beginning of the address.)
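
The parenthetical point can be illustrated with a short Python sketch. It assumes an 18-bit address within the high-performance area (the 256 kilobyte case) and a mapped block whose starting address is a multiple of the block size; the names `add_base` and `concatenate` are hypothetical. Under those assumptions, placing the base bits in front of the address gives the same result as a full addition, without needing an adder.

```python
CACHE_ADDRESS_BITS = 18  # assumed: the 256 kilobyte high-performance area


def add_base(base, offset):
    """Conventional address formation: a full addition."""
    return base + offset


def concatenate(base, offset):
    """Address formation for a fully aligned block: no addition required."""
    mask = (1 << CACHE_ADDRESS_BITS) - 1
    if base & mask:
        raise ValueError("base is not aligned to the size of the area")
    if offset & ~mask:
        raise ValueError("offset does not fit within the area")
    # The low bits of an aligned base are all zero, so OR equals addition.
    return base | offset


base = 5 << CACHE_ADDRESS_BITS
offset = 0x2ABCD
print(concatenate(base, offset) == add_base(base, offset))
```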

