Although the PDP-8 has been deferred until a later section, here are the basic instruction formats for three other computers that had a 12-bit word, the Control Data 160, the Nuclear Data 812, and the Honeywell H-112:

In general, for the CDC 160-A, the two-word format only applied to instruction modes 01 and 10; the modes were:
00  Direct
01  Indirect   Memory
10  Forward    Constant
11  Backward   Specific
An instruction in the Specific mode still only occupies one word, and refers to the last location of the first memory bank. Instructions in the Direct and Indirect modes refer to locations on the 64-word page zero. Memory is normal direct addressing, Constant has an immediate operand (alternate opcodes allowed many instructions to have a No Address mode, in which the immediate operand was in the second 6 bits of the instruction), and Forward and Backward provided relative addressing. Note that memory location 0 could be referenced in the Direct mode.
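The mode rules above can be sketched in a few lines of Python. This is only an illustration, not a description of the hardware logic: the function name is hypothetical, and the rule that a zero address field selects the alternate mode (Memory, Constant, or Specific) is my assumption, suggested by the remark that location 0 can be reached in the Direct mode.

```python
# Hypothetical sketch: classify a CDC 160-A addressing mode from the 2-bit
# mode field and the 6-bit address field of a 12-bit instruction.
# ASSUMPTION: a zero address field selects the alternate mode for 01, 10, 11.

PRIMARY = {0b00: "Direct", 0b01: "Indirect", 0b10: "Forward", 0b11: "Backward"}
ALTERNATE = {0b01: "Memory", 0b10: "Constant", 0b11: "Specific"}


def addressing_mode(mode, field):
    """Return (mode name, instruction length in 12-bit words)."""
    if not (0 <= mode <= 0b11 and 0 <= field <= 0o77):
        raise ValueError("mode is 2 bits, field is 6 bits")
    if field == 0 and mode in ALTERNATE:
        name = ALTERNATE[mode]
        # Memory and Constant take a second word; Specific stays one word long.
        return name, (1 if name == "Specific" else 2)
    # Mode 00 is always Direct, so a zero field there means location 0.
    return PRIMARY[mode], 1
```

For example, `addressing_mode(0b01, 0)` gives `("Memory", 2)`, while `addressing_mode(0b00, 0)` gives `("Direct", 1)`.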
In the Nuclear Data 812, the K bit indicated that the instruction would use a second accumulator as its destination; the F bit indicated that the last two bits of the instruction would be treated as the most significant two bits of an address; otherwise, they would be ignored, and the address would refer to the current 4,096-word field of memory.
The instruction format of the Honeywell H-112 is very similar to that of the PDP-8, except that the indirect bit comes before, instead of after, the opcode. However, it had seven instead of six standard memory-reference instructions, so it had both load and store instructions instead of relying upon deposit and clear to allow add to replace load as the PDP-8 did, and instead of one-bit shift instructions, it had a shift instruction with a four-bit shift count. Although it starts with an architecture less limited than that of the PDP-8, in practice, as it was implemented with a serial adder, it was aimed at less ambitious tasks.
The Engineering Research Associates 1101 and 1103 computers were also called the Atlas and Atlas II computers, respectively. (This may not be strictly true: if the Atlas computer was the one used internally by the NSA, it would not have been identical to the commercial model, since other references note that when ERA obtained permission to market the computer commercially, it removed one or more instructions. Presumably, these were bit-manipulation instructions such as those later provided openly in some computers by companies such as Control Data.) This has led at least one web site to claim that the ERA 1103 (later the Univac 1103) had a 48-bit word length. It did not; like successors such as the Univac 1108, it had a 36-bit word length. In the case of the 1103, that word length was supported by 37 (count 'em!) Williams tubes, one of which provided a parity bit.
However, the ERA 1101 and 1102 computers did have a 48-bit word length and a 24-bit instruction size. These computers only had drum memory, and the instruction format included a six-bit opcode field and a 14-bit address field. The remaining four bits in the instruction were used to indicate where the next instruction in sequence is to be found; it was a "skip" value, unlike the use of a complete address for that purpose in other drum computers like the IBM 650.
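As a rough illustration of how the three fields fill the 24-bit instruction, here is a minimal Python sketch. The field widths come from the text; the order of the fields within the word (opcode first, then skip, then address) is purely my assumption, and the function name is hypothetical.

```python
# Hypothetical sketch: split a 24-bit instruction into the three fields
# described above (6-bit opcode, 4-bit skip value, 14-bit address).
# ASSUMPTION: opcode in the top bits, skip next, address in the low bits.

def split_fields(instr):
    """Return (opcode, skip, address) for a 24-bit instruction word."""
    if not 0 <= instr < (1 << 24):
        raise ValueError("instruction must fit in 24 bits")
    opcode = (instr >> 18) & 0x3F   # 6 bits
    skip = (instr >> 14) & 0xF      # 4 bits
    address = instr & 0x3FFF        # 14 bits
    return opcode, skip, address
```

The widths add up to 6 + 4 + 14 = 24 bits, so two such instructions fill one 48-bit word.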
And, since I am now speaking about computers with a word length of 36 bits, here are some of their instruction word formats, along with that of a computer with a 40-bit instruction word:

The bit labelled as L in the diagram in the Univac 1107 and 1108 instruction word causes the contents of an increment field in the index register to be added to the address portion of the index register after use; a similar scheme of dividing the index register into two parts is used in the SDS 9300, but in that computer, the increment field was shorter than the address portion, there being room in a 24-bit register for only 9 bits after the last 15 bits were used for the address.
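The split index register of the SDS 9300 described above can be sketched as follows. The 9-bit and 15-bit widths are from the text; treating the increment as an unsigned value and letting the address wrap around modulo 2^15 are my assumptions, as is the function name.

```python
# Hypothetical sketch of a split index register: the low 15 bits hold the
# address portion and the high 9 bits hold the increment (24 bits in all).
# ASSUMPTIONS: the increment is unsigned, and the address wraps modulo 2**15.

ADDR_BITS = 15
ADDR_MASK = (1 << ADDR_BITS) - 1
INC_MASK = (1 << 9) - 1


def use_and_step(index_reg):
    """Return (address used for indexing, register value after the update)."""
    address = index_reg & ADDR_MASK
    increment = (index_reg >> ADDR_BITS) & INC_MASK
    stepped = (address + increment) & ADDR_MASK
    # The increment field itself is left unchanged by the update.
    return address, (increment << ADDR_BITS) | stepped
```

With an increment of 2 and an address of 100, the instruction sees 100 and the register is left pointing at 102.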
The Part field in the Univac 1107 and 1108 instruction word allowed instructions to refer to halfwords or characters; the Conf field in the TX-2 instruction word performed a similar function, but in addition allowed for operations involving dividing the accumulator into two or four parts, for vector operations on short numbers, somewhat like the MMX feature of Intel microprocessors.
The three-bit index field for the IBM 7094 computer indeed indicated, as one might expect, one of seven index registers, and, if it was zero, that indexing did not take place. However, the earlier computers with the same basic architecture had only three index registers, each selected by its own bit of the field, so that more than one index register could be selected at the same time. (The IBM 7094 could also go into a mode providing full compatibility with these earlier computers, called multiple tag mode.)
When multiple index registers were selected, their contents were ORed together before use. Another peculiarity of the index registers in this architecture is that their contents were subtracted instead of added in forming addresses. The index registers were designated by letters, so the index field was interpreted as follows:
Field   All Versions              Multiple Tag Mode /   7094
                                  Earlier Machines
000     No Indexing
001     Index Register A (XR1)
010     Index Register B (XR2)
011                               A or B                D/XR3
100     Index Register C (XR4)
101                               A or C                E/XR5
110                               B or C                F/XR6
111                               A or B or C           G/XR7
Although the letter designations of the index registers of the 7094 were out of order, due to compatibility with the earlier machines in the series, an XRn notation was also provided that avoided this.
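The behavior of the earlier machines (and of the 7094 in multiple tag mode) can be sketched in Python. The ORing and the subtraction are as described above; the 15-bit address width, the wrap-around on underflow, and the function and parameter names are assumptions made for the illustration.

```python
# Hypothetical sketch of address formation in multiple tag mode: the selected
# index registers are ORed together and the result is subtracted from the
# address field. ASSUMPTION: 15-bit addresses that wrap around on underflow.

ADDRESS_MASK = 0o77777


def effective_address(address, tag, xr_a, xr_b, xr_c):
    """Return the effective address for a 3-bit tag (one bit per register)."""
    combined = 0
    if tag & 0b001:
        combined |= xr_a
    if tag & 0b010:
        combined |= xr_b
    if tag & 0b100:
        combined |= xr_c
    # A tag of 000 leaves combined at zero, so no indexing takes place.
    return (address - combined) & ADDRESS_MASK
```

With a tag of 011, registers A and B holding 3 and 5 combine to 7, not 8, so an address of octal 1000 becomes octal 771.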
The GE-645 included a bit, labelled U in the diagram, to protect instructions against being interrupted, and a B bit which affected which memory map was used for an instruction. The Modifier field controlled indirection and indexing.
The H and D bits in the TX-2 instruction word stand for hold and defer.
Also shown is the 40-bit instruction word of an early Canadian transistor computer built for the Defence Research Telecommunications Establishment. It had hardware floating-point and square root. Although there is no index field in the instruction format, it did have three index registers, called E, F, and G. For indexing, it appears to have used a subroutine call instruction that came in three forms to indicate which register would be used within the subroutine. This appears to lose the flexibility of having more than one index register; however, I only have limited information on its instruction set.
Some computers with instruction words shorter than 24 bits also sought to avoid the type of complications in memory addressing discussed earlier in connection with computers with a 16-bit word length:

The PDP-1 had space for an address that could span a 4,096 word memory; shortening the opcode field allowed the PDP-4 and its successors to deal with a memory twice as large. But features were added to the PDP-7 to handle a 32,768-word memory; instead of using only the last 13 bits of a location containing an indirect address, the last 15 bits were used.
Then the PDP-15 added an index register, a feature the PDP-4 had tried to avoid by making locations 8 through 15 in low memory, when used as indirect addresses, also imply autoincrement addressing. For compatibility, it also offered a bank addressing mode, where the bit indicating indexing was used instead to indicate which of two 4K word banks was used. The same status bit which previously indicated whether indirect addresses were 13 or 15 bits long was used for this purpose, so while compatibility with the PDP-7 and PDP-9 was kept, compatibility with the PDP-4 was dropped.
A version of the PDP-15, called the XVM, added a full-fledged memory-management unit, and the ability (through a privileged instruction) to allow the last 17 bits, instead of the last 15 bits, of an indirect address to be used, quadrupling the address space available within a program to 131,072 words of memory.
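The growth of the address space across this family comes down to how many low-order bits of the 18-bit word holding an indirect address are honored. A minimal Python sketch, with a hypothetical function name, makes the arithmetic explicit:

```python
# Hypothetical sketch: keep only the last 13, 15, or 17 bits of an 18-bit
# word that holds an indirect address, as described above for the PDP-4,
# the PDP-7 and its successors, and the XVM respectively.

def indirect_target(word, bits):
    """Return the address taken from an indirect word, using `bits` low bits."""
    if bits not in (13, 15, 17):
        raise ValueError("only 13-, 15-, and 17-bit indirect addresses apply")
    if not 0 <= word < (1 << 18):
        raise ValueError("word must fit in 18 bits")
    return word & ((1 << bits) - 1)
```

The three widths reach 8,192, 32,768, and 131,072 words, respectively, the last being four times the 15-bit address space.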
Because the PDP-4 was DEC's second computer design, and the oldest one for which compatible successors were made, the PDP-7, PDP-9, and PDP-15 had other issues dealing with legacy code. The PDP-4 did not use ASCII as its character set, while the PDP-7 did; however, an early operating system for the PDP-7 stored file names in 5-level teletypewriter code, and required source language input to compilers to be in FIODEC, the code used for the console typewriter of the PDP-4. This particular complexity did not last long.
Because the PDP-1 used one's complement arithmetic, the PDP-4 offered instructions to perform both one's complement and two's complement arithmetic; one's complement add was ADD, two's complement add was TAD. This is why, on the PDP-8, which only had two's complement arithmetic, the mnemonic for the addition instruction was TAD. Although basic operations in both systems were provided, there was a tendency to favor one's complement arithmetic with respect to some features; for example, an instruction to increment the accumulator, producing the correct results when its contents are considered to be in two's complement form, was only added with the PDP-15.
The Lincoln Laboratory TX-0 computer could be considered to be the architectural inspiration for the PDP-1 computer, as well as the PDP-15 and PDP-8 families. Originally, the memory reference instructions had only a two-bit opcode, so there were three of them: add, store, and jump if negative; this was reasonable given that the machine's original purpose was simply to test memory to be used on another computer. When the first two bits of an instruction were ones, the remaining bits indicated single operations, as was done with the PDP-1, PDP-4, and PDP-5 and their successors. Later, when the machine was moved to the Massachusetts Institute of Technology, it was modified to have an index register, and a four-bit opcode field; programs written for the original design, and using only the first 8K of memory, initially remained compatible with the upgraded design.
The Univac 418 had a 6-bit opcode, and a 12-bit address field. This 12-bit address field was usually program-relative, but for some, not all, instructions, a status bit would cause the contents of a register called the Special Register to be prefixed to the value in the address field instead, at least in the later 418-III. Although it wasn't considered a separate field in the documentation, the last bit of the opcode field would distinguish between the regular and indexed form of many instructions.
Also, the shift and I/O instructions had a format with only a 6-bit field for the shift count or channel number, so that they could all share one main opcode.
It might seem ironic that the Univac 9300 was offered as a peripheral for the 418-III, since the 9300 was a computer compatible with one form of the IBM System/360 mainframe. However, what it was compatible with was the System/360 Model 20, whose binary arithmetic capabilities were limited to 16-bit numbers, but which still offered the standard decimal instructions. Unlike the Model 20, the 9300 could have up to 32K bytes, rather than 16K bytes, of main memory.
The GE 225 and its relatives, with a 20-bit word length, had room for a 2-bit index register field, and addresses that could span an 8,192-word memory. However, the index registers ended up also being used as base registers, since memories of up to 16,384 words were used.
Also shown is the instruction word format shared by the ORDVAC, the ILLIAC I, and several other machines based on the machine at Princeton's Institute for Advanced Study designed by John von Neumann, and the instruction word format of the Manchester Mark I, and the commercial Ferranti Mark I based on it. These were both early machines with a 40-bit word. On the Manchester and Ferranti Mark I computers, the index register field was either zero, or gave the number of an index register, the contents of which were added to the entire instruction before use. Thus, having the address in the most significant part of the instruction actually served the purpose of avoiding problems with carries. These were the machines that originated the index register concept, terming these registers "B-lines".
Another computer with a 40-bit word and two instructions per word illustrated here is the WEIZAC, which was built at the Weizmann Institute in Israel.
These pages include descriptions of a large number of computers, but they fall short of describing every computer architecture that I have ever heard of. Two more complicated architectures, those of the Motorola 68000 series of chips and those of the Intel 80386 and its successors, are described on the last page of this section, and the PDP-8 is described in more detail on a later page.
In addition, some types of computer were specifically excluded from this page. Computers using a recirculating memory as their main memory, or computers which stored decimal digits instead of binary bits, are excluded. This disqualified the IBM 650 on two points. It also excluded many of the earliest stored-program computers, such as the EDVAC, the Univac I, the English Electric DEUCE, and even the Ferranti Pegasus, which is famed for having introduced the modern concept of general registers. Computers which accompanied data bits in memory with a "flag" bit which was significant to the programmer (parity bits, of course, are irrelevant) were excluded as well, so this eliminated the IBM 1401 and the much more recent BIT 483. It also meant the decimal IBM 1620 was disqualified on two points rather than one. Also, to save space in the diagrams, computers with similar word lengths were grouped together. Thus, computers with 39-bit instruction words, or 22-bit instruction words, and so on, were, to a limited extent, avoided: space was found to fit in 20-bit instructions with the 18-bit instructions, and 30-bit instructions with the 32-bit instructions, to allow at least some unusual word lengths to be included. Thus, the computers illustrated here are primarily representative of "modern" architectures, at least in some sense.