
The Proposed Decimal Floating-Point Standard

The table given previously showing IBM's official formats for Densely Packed Decimal and Chen-Ho encoding is reproduced here again:

BCD digits         Chen-Ho encoding    Densely Packed
                                       Decimal encoding

0abc 0pqr 0uvw     0abpquvcrw          abcpqr0uvw
0abc 0pqr 100W     110pqabcrW          abcpqr100W
0abc 100R 0uvw     101abuvcRw          abcuvR101w
100C 0pqr 0uvw     100pquvCrw          uvCpqr110w
0abc 100R 100W     11110abcRW          abc10R111W
100C 0pqr 100W     11101pqCrW          pqC01r111W
100C 100R 0uvw     11100uvCRw          uvC00R111w
100C 100R 100W     1111100CRW          00C11R111W

     0pqr 0uvw     0pqruvw             pqr0uvw
     0pqr 100W     111rpqW             pqr100W
     100R 0uvw     100Ruvw             uvR101w
     100R 100W     110R00W             10R111W

as Densely Packed Decimal forms part of the specification for decimal floating point which is included in the revision of the IEEE 754 standard currently under consideration.
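Since Densely Packed Decimal will come up again below, the table can be turned directly into code. The following Python sketch (the function name is mine, a transcription of the table rows rather than any library's API) packs three decimal digits into a 10-bit declet:

```python
def dpd_encode(d1, d2, d3):
    """Pack three decimal digits (0-9 each) into one 10-bit declet,
    following the Densely Packed Decimal column of the table above."""
    s1, s2, s3 = d1 < 8, d2 < 8, d3 < 8            # "small" digits are 0-7
    a, b, c = d1 >> 2 & 1, d1 >> 1 & 1, d1 & 1     # c doubles as C for 8-9
    p, q, r = d2 >> 2 & 1, d2 >> 1 & 1, d2 & 1     # r doubles as R
    u, v, w = d3 >> 2 & 1, d3 >> 1 & 1, d3 & 1     # w doubles as W
    if s1 and s2 and s3:
        bits = (a, b, c, p, q, r, 0, u, v, w)      # abcpqr0uvw
    elif s1 and s2:
        bits = (a, b, c, p, q, r, 1, 0, 0, w)      # abcpqr100W
    elif s1 and s3:
        bits = (a, b, c, u, v, r, 1, 0, 1, w)      # abcuvR101w
    elif s2 and s3:
        bits = (u, v, c, p, q, r, 1, 1, 0, w)      # uvCpqr110w
    elif s1:
        bits = (a, b, c, 1, 0, r, 1, 1, 1, w)      # abc10R111W
    elif s2:
        bits = (p, q, c, 0, 1, r, 1, 1, 1, w)      # pqC01r111W
    elif s3:
        bits = (u, v, c, 0, 0, r, 1, 1, 1, w)      # uvC00R111w
    else:
        bits = (0, 0, c, 1, 1, r, 1, 1, 1, w)      # 00C11R111W
    declet = 0
    for bit in bits:
        declet = declet << 1 | bit
    return declet

# e.g. dpd_encode(1, 2, 3) == 0b0010100011
```

All 1,000 three-digit combinations yield distinct declets, so the encoding is reversible.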

Decimal Floating-Point

The new z9 computer from IBM has added a decimal floating-point format which makes use of IBM's Densely Packed Decimal encoding.

Floating-point numbers in this format are not necessarily normalized. The intent behind this does not appear to be to provide significance arithmetic as it is normally understood, since this number format is largely intended for use with quantities whose significance is specified.

The number of decimal digits of precision in this format is always of the form 3n+1. One reason for this could be to allow an efficient way to encode special values such as infinities and NaN (not a number) quantities, but it is also true that this arrangement allows the lengths of the component fields in the number, occupying either 32, 64, or 128 bits, to work out more nicely, with a gradual increase in exponent size along with precision.

A number in this decimal floating-point format consists of the following elements: a sign bit; a combination field (CF), encoding the first digit of the coefficient together with the first two bits of the biased exponent; a biased exponent continuation field (BXCF), holding the remaining bits of the biased exponent; and a coefficient continuation field (CCF), holding the remaining digits of the coefficient in Densely Packed Decimal form.

The lengths of these fields are:

Overall     Sign   CF     BXCF    CCF

    32         1    5        6     20
    64         1    5        8     50
   128         1    5       12    110

the length of the CCF field always being a multiple of ten bits for effective use of the Densely Packed Decimal format.
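These widths can be checked mechanically; in a quick Python sketch (the dictionary layout is mine), the fields fill each word exactly, and the digit counts of the three formats fall out:

```python
# Field widths (sign, CF, BXCF, CCF) for each overall size, from the table above.
fields = {32: (1, 5, 6, 20), 64: (1, 5, 8, 50), 128: (1, 5, 12, 110)}

for total, (sign, cf, bxcf, ccf) in fields.items():
    assert sign + cf + bxcf + ccf == total   # the fields fill the word exactly
    assert ccf % 10 == 0                     # the CCF holds whole 10-bit declets
    digits = 3 * (ccf // 10) + 1             # three digits per declet, plus one in the CF
    print(f"{total} bits: {ccf // 10} declets, {digits} digits of precision")
```

The digit counts printed (7, 16, and 34) match the precision figures given later in this section.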

The format of the CF is as follows:

First digit of mantissa/coefficient:

 0 to 7: aaa
0 000
1 001
2 010
3 011
4 100
5 101
6 110
7 111

 8 or 9: A
8 0
9 1

First two bits of biased exponent:
00, 01, or 10: bb

Formats of the CF:
bbaaa: finite number, first digit from 0 to 7
11bbA: finite number, first digit 8 or 9
11110: infinity
11111: NaN

Note that the CF is encoded using the same division of the decimal digits into a group of eight digits and a group of two digits that lies at the basis of Chen-Ho encoding and Densely Packed Decimal encoding.
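A small Python sketch of the CF packing (the function names are my own, purely illustrative) may make this clearer; digits 0 through 7 keep their three-bit form directly, while 8 and 9 need only one bit:

```python
def cf_encode(first_digit, exp_top):
    """Pack the leading coefficient digit (0-9) and the top two bits of
    the biased exponent (exp_top in 0..2) into the 5-bit combination field."""
    if first_digit < 8:                                  # bbaaa form
        return (exp_top << 3) | first_digit
    return 0b11000 | (exp_top << 1) | (first_digit & 1)  # 11bbA form

def cf_decode(cf):
    """Return (first_digit, exp_top) for a finite number, or a special value."""
    if cf == 0b11110:
        return "infinity"
    if cf == 0b11111:
        return "NaN"
    if cf >> 3 == 0b11:                                  # 11bbA form
        return 8 + (cf & 1), (cf >> 1) & 0b11
    return cf & 0b111, cf >> 3                           # bbaaa form
```

The two reserved patterns 11110 and 11111, together with the restriction of the top exponent bits to 00, 01, or 10, are what keep the encodings from colliding.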

The remaining piece of information about the format is the bias used for the exponent:

Length of        Number of Possible      Exponent               Precision of Number
Number in Bits   Exponent Values         Bias                   in Digits

    32           3 *    64 =    192           101 (   96+ 5)     2 * 3 + 1 =  7
    64           3 *   256 =    768           398 (  384+14)     5 * 3 + 1 = 16
   128           3 * 4,096 = 12,288         6,176 (6,144+32)    11 * 3 + 1 = 34

The exponent bias is, surprisingly, not the exact midpoint of the range of possible exponents. However, given that numbers are intended to be used routinely in unnormalized form in this format, increasing the exponent bias facilitates this; in fact, the discrepancy between the exponent bias and half the exponent range is always two less than the number of digits of precision provided by the given format.
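That relationship is easy to verify with the numbers from the table:

```python
# (possible exponent values, exponent bias, digits of precision) per format size
formats = {32: (192, 101, 7), 64: (768, 398, 16), 128: (12288, 6176, 34)}

for bits, (exp_values, bias, precision) in formats.items():
    # The bias exceeds half the exponent range by precision - 2.
    assert bias == exp_values // 2 + (precision - 2)
    print(f"{bits}-bit format: bias {bias} = {exp_values // 2} + {precision - 2}")
```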

The following diagram illustrates these formats:

It should also be noted that the exponent bias, as given, is based on the decimal point being placed at the right of the coefficient, not at its left; thus, when the stored exponent minus the bias equals zero, the number is an integer.

It has been noted that numbers in this format may be unnormalized.

One possible use for unnormalized numbers is significance arithmetic. This format, however, comes with a set of rules about the "ideal exponent" of the result of an arithmetic operation (this term acknowledges that the range of exponents is finite, and thus cases will arise where the choice of exponents to use in the representation of a number may be limited) that do not correspond to the rules of significance arithmetic. Instead, they follow the IEEE-754 philosophy of producing exact results as far as possible.

The basic intent of those rules is that 100 plus 5.25 should be 105.25 and not 105.250000000, and that 2.7 times 8.4 should be 22.68 and not 22.680000000. Thus, it is intended that the routines that input and output numbers should create unnormalized values based on the form of the numbers read in, and should print numbers with trailing zeroes omitted to the extent indicated by the degree of unnormalization of the number to be printed.
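As it happens, Python's decimal module follows the General Decimal Arithmetic specification from which this part of the proposed standard is drawn, so the ideal-exponent behaviour can be observed directly:

```python
from decimal import Decimal

# The exponent of a result is the "ideal exponent" at which the value is exact:
print(Decimal("100") + Decimal("5.25"))   # 105.25, not 105.250000000
print(Decimal("2.7") * Decimal("8.4"))    # 22.68,  not 22.680000000

# Trailing zeroes supplied on input are likewise preserved, so the degree
# of unnormalization carries through a calculation:
print(Decimal("2.50") + Decimal("4.00"))  # 6.50, not 6.5
```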

This is a further extension of the reason for using a decimal exponent base in the JOSS system, so that .3 plus .7 might be 1.0 instead of 0.9999999999; the goal is not merely decimal arithmetic, but humanized arithmetic. Doing this within the computations themselves, rather than merely removing trailing zeroes on output, is what is novel about this format.

Previous attempts at humanizing the arithmetic operations of computers such as that in JOSS have tended to be dismissed by the computing community as not worth the trouble, but given the popularity of spreadsheets, for example, it may be that this will prove to be a useful idea.

One thing that occurs to me is that perhaps a decimal floating-point number ought to have a flag bit indicating whether the digits past the end of the number are to be taken as certainly zero, or as unknown. If either of the numbers in an operation has that bit set, the rules of significance arithmetic would be followed instead of those of humanized arithmetic; this would make for a general floating-point arithmetic that is also able to handle the numbers one usually thinks of floating-point as being applicable to: values of physical quantities of limited precision. Actually, this is somewhat of an oversimplification; if a trailing asterisk is used to indicate the flag bit, for addition the rules would work like this:

2.345 + 7.1 = 9.445
2.345* + 7.1 = 9.445*
2.345 + 7.1* = 9.4*
2.345* + 7.1* = 9.4*

If the less precise quantity in an addition has the flag bit set, the rules of significance arithmetic are followed, and the flag is preserved; but if the more precise one has the flag bit set, then the less precise one is still taken as the exact quantity it claims to be.
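This addition rule is simple enough to sketch in Python; the function below is my own hypothetical construction (it is not part of any standard), representing a number as a Decimal value plus the proposed flag bit:

```python
from decimal import Decimal

def add_flagged(x, xf, y, yf):
    """Add two flagged numbers; a True flag means the digits beyond the
    last are unknown rather than certainly zero."""
    # Make x the *less* precise operand (the one with the larger exponent).
    if x.as_tuple().exponent < y.as_tuple().exponent:
        x, xf, y, yf = y, yf, x, xf
    total = x + y
    if xf:
        # The less precise operand is inexact: follow significance
        # arithmetic and round the sum to its last known place.
        return total.quantize(x), True
    # The less precise operand is exact: keep the exact sum, and
    # preserve the flag of the more precise operand, if any.
    return total, yf

# The four cases above:
print(add_flagged(Decimal("2.345"), False, Decimal("7.1"), False))  # 9.445
print(add_flagged(Decimal("2.345"), True,  Decimal("7.1"), False))  # 9.445, flagged
print(add_flagged(Decimal("2.345"), False, Decimal("7.1"), True))   # 9.4, flagged
print(add_flagged(Decimal("2.345"), True,  Decimal("7.1"), True))   # 9.4, flagged
```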

In the case of multiplication, we also have multiple cases:

26.34 * 1.7 = 44.778
26.34* * 1.7 = 44.77*
26.34 * 1.7* = 45*
26.34* * 1.7* = 45*

Here, the number of significant digits, instead of the precision as a magnitude, is what is compared.

When I think of numbers represented internally in decimal form, I tend to think of COBOL programs, not spreadsheets; and if one is using a program to calculate a payroll, one would normally be using fixed-point numbers as well, so if new rounding rules are needed, inventing a new floating-point format for that purpose at first seemed wasteful to me. But once it is understood that the idea is to have a general tool that can be easily used for arbitrary calculations, relieving users, as opposed to programmers, of having to specify the range of numbers becomes an obvious necessity.

It may also be noted that IBM intends to license its Densely Packed Decimal patent on a royalty-free basis to implementors of this format as it is about to be specified in the revised IEEE 754 standard.

Leading Bit Suppression in Decimal Floating Point

Below, I discuss an alternative way of handling decimal floating point, with the intent of obtaining a closer resemblance to the binary floating-point representation in the IEEE 754 standards. While I believe an improvement in numerical properties is obtained, changing to a format which must always be normalized of course loses the important property of the format described above of retaining an indication of how many trailing zeroes it is reasonable to print when a floating-point number is output.

The original floating-point format of the IBM System/360 architecture was subject to criticism on the basis that its large radix, 16, meant that the precision of floating-point numbers was highly variable.

Since this effect was judged tolerable in the Burroughs B5500 and the Atlas, which had a floating-point radix of 8, perhaps a radix of 10 is not excessive.

But if we view a floating-point radix of 2 as the ideal, can we modify decimal floating-point to have, in effect, a radix near 2, and, if so, can we further accomplish what has been achieved with binary floating-point: a gain in precision by hiding the first bit of the number, which (except for the number zero) must always be 1?

As a matter of fact, it *is* possible to do this. A scheme for doing so is outlined below, in a first, crude version:

1     1   00  ***
2    10   01  0**
3    11   01  1**
4   100   10  00*
5   101   10  01*
6   110   10  10*
7   111   10  11*
8  1000   11  000
9  1001   11  001

The first column shows the leading decimal digit of a floating-point quantity. The second column shows its binary representation. The third column shows the two bits to be appended after the least significant bits of the exponent which affect only the value of the first digit, not the rest of the mantissa/coefficient/significand. The fourth column shows how the first digit of the mantissa is represented, leaving space, represented by asterisks, for a fraction to appear at the end of the number. Three asterisks allow nothing, 1/8, 1/4, 3/8, 1/2, 5/8, 3/4, and 7/8 of the units value of the least significant digit to be added at the end; two asterisks, nothing, 1/4, 1/2, and 3/4, one asterisk, nothing or 1/2.
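For concreteness, here is the scheme transcribed into a small Python table (the dictionary is my own transcription of the columns above; the asterisks are reconstructed from the number of digit bits used):

```python
# Leading digit -> (two bits appended to the exponent, bits used for the
# digit itself); the unused slots of the three-bit digit field are free
# to hold a binary fraction in eighths, quarters, or halves.
CRUDE = {
    1: ("00", ""),                       # three free slots
    2: ("01", "0"),  3: ("01", "1"),     # two free slots
    4: ("10", "00"), 5: ("10", "01"),
    6: ("10", "10"), 7: ("10", "11"),    # one free slot
    8: ("11", "000"), 9: ("11", "001"),  # no free slots
}

# No two digits share a code, so the leading digit is always recoverable.
assert len({ebits + dbits for ebits, dbits in CRUDE.values()}) == len(CRUDE)

for digit, (ebits, dbits) in sorted(CRUDE.items()):
    print(digit, ebits, dbits + "*" * (3 - len(dbits)))
```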

This crude scheme has an obvious flaw, however. The extra precision added to the end of the number is *binary* in nature. Instead, something that integrates well with decimal notation is desired.

Another flaw that is easier to remedy was left in: as ten, rather than sixteen, digit values are encoded, not all eight possible values occupying three bits are used, so some precision is wasted.

If one is familiar with the gradations along the span of a slide rule, however, an alternative offers itself. Instead of trying to use 1/4 and 1/8 as additional units, in addition to 1/2, simply use 1/5 in addition to 1/2. This means that instead of four exponent values corresponding to the same shift of the decimal place for most of the mantissa, only three exponent values so correspond.

The least significant part of the decimal exponent and the first digit, left-justified, along with the additional data to be appended to the least significant part of the number as a result of the space saved by the left-justification, can be combined into a single four-bit field as shown below:

0000   1 ... -       1 ... 0
0001   1 ... 1/5     1 ... 2
0010   1 ... 2/5     1 ... 4
0011   1 ... 3/5     1 ... 6
0100   1 ... 4/5     1 ... 8
0101   2 ... -       2 ... 0
0110   2 ... 1/2     2 ... 5

0111   3 ... -       3 ... 0
1000   3 ... 1/2     3 ... 5

1001   4 ... -       4 ... 0
1010   4 ... 1/2     4 ... 5
1011   5             5 ... 0

1100   6             6 ... 0

1101   7             7 ... 0

1110   8             8 ... 0

1111   9             9 ... 0

If the first digit of a number is 1, then a fraction in fifths is included at the end of the number after the least significant digit of the mantissa. If it is 2, 3, or 4, a fraction in halves is included; if it is 5, 6, 7, 8, or 9, then no significance is added to the mantissa.

Instead of thinking in terms of fractions, it may perhaps be easier to understand if the code is thought of as indicating in combination the digit to be appended to the left of the main mantissa field, and the digit to be appended to the right of the main mantissa field.

In effect, instead of the radix jumping by steps of 10, the least significant unit of a decimal number now moves by three gentler steps of 2, 2.5, and 2.

If the middle of the mantissa holds three decimal digits only, as an example, the result of representing an additional partial last digit of the mantissa along with the first digit in this form is to keep the value of the unit in the last place of the number within tighter bounds, as shown in the table below:

Mantissa Range      Unit in     Size relative to
                    last place  number

.10000 to .19998    .00002       .02% to .01%
.20000 to .49995    .00005       .025% to .01%
.5000  to .9999     .0001        .02% to .01%

whereas, if one simply had a four-digit mantissa going from .1000 to .9999, one unit in the last place would vary from .1% to .01%, a factor of 10, instead of the maximum range of a factor of 2.5 achieved above.
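These bounds can be checked with a few lines of Python; the values are scaled by 10^5 to stay in integers, and the range endpoints are the extreme representable values implied by the fractions just described:

```python
from fractions import Fraction

# (lowest value, highest value, unit in the last place), scaled by 10**5,
# for a mantissa with three middle digits.
cases = [
    (10000, 19998, 2),    # first digit 1: fraction in fifths
    (20000, 49995, 5),    # first digits 2-4: fraction in halves
    (50000, 99990, 10),   # first digits 5-9: no fraction
]

for lo, hi, ulp in cases:
    worst, best = Fraction(ulp, lo), Fraction(ulp, hi)
    print(f"{float(worst):.4%} to {float(best):.4%}")
    # Within each range, the unit in the last place never varies,
    # relative to the number, by more than a factor of 2.5:
    assert worst / best <= Fraction(5, 2)
```

Across all three ranges, the extremes are .025% and .01%, the overall factor of 2.5 noted above.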

Peace at Last?

If we were to change the three ranges in the table above to:

Mantissa Range      Unit in     Size relative to
                    last place  number

.10000 to .19998    .00002       .02% to .01%
.20000 to .39995    .00005       .025% to .0125%
.4000  to .9999     .0001        .025% to .01%

we still keep the precision of numbers within the range of a factor of 2.5, even though we lose one bit of precision for numbers whose first digit is 4.

Thus, our combined first-and-last digit field now can have the following coding:

0000   0
0001   1 ... -       1 ... 0
0010   1 ... 1/5     1 ... 2
0011   1 ... 2/5     1 ... 4
0100   1 ... 3/5     1 ... 6
0101   1 ... 4/5     1 ... 8
0110   2 ... -       2 ... 0
0111   2 ... 1/2     2 ... 5

1000   3 ... -       3 ... 0
1001   3 ... 1/2     3 ... 5
1010   4             4 ... 0

1011   5             5 ... 0

1100   6             6 ... 0

1101   7             7 ... 0

1110   8             8 ... 0

1111   9             9 ... 0

which frees up the code 0000 to represent 0 as a leading digit. But, as a way of allowing unnormalized numbers, it has the serious flaw that the last digit, having five possible values when the first digit is 1, has now suddenly disappeared. So a jarring loss of precision takes place when a number is unnormalized; in effect, it is possible to indicate when the first two or more digits are zero, but not when the first digit is zero.

Fixing this seems to require accepting a result which no longer makes such a good fit: as a minimum, zero as a first digit has to be replaced by five different codes, making for a total of 20 codes, which is somewhat wasteful to code in five bits. One way to fix this is to go from 20 codes to 60 codes, a good fit to six bits, using the same trick with the leading part of the exponent that was used for the combination field in the standard decimal floating point format described above.

Perhaps we can also solve matters by adding a zero first digit to another coding, which is not quite as good a fit to begin with.

As a first try, we can think in terms of adding a third of a digit to our target numerical precision, to get a table like the following:

0...0   1...0   2...0   3...0   4...0   5...0   6...0   7...0   8...0   9...0
0...1   1...1
0...2   1...2   2...2   3...2
0...3   1...3
0...4   1...4   2...4   3...4
0...5   1...5                   4...5   5...5   6...5   7...5   8...5   9...5
0...6   1...6   2...6   3...6
0...7   1...7
0...8   1...8   2...8   3...8
0...9   1...9

This results in a need for 42 additional codes; this is just over twice 20 codes, and thus still has its limitations as a fit. We could, however, go to 52 codes by appending 00, 05, 10, 15, 20, 25, and so on to the end of the mantissa when the first digit is zero, and this would be a reasonable manifestation of concern for the precision of unnormalized numbers. A similar step would increase the number of codes in the case discussed above from 20 codes to 25.

The next possibility is to add two-thirds of a digit to our target precision.

So, the first digit 1 would have twenty codes allocated to it, for appended digits 00, 05, 10, to 95 after the main mantissa. The first digits 2 and 3 would each have ten codes allocated to them, for appended digits 0, 1, 2, through 9 appended after the main mantissa. The first digits 4 through 7 would each have five codes, for appended digits 0, 2, 4, 6, and 8 appended. As with the first digit 4 in the first case considered, keeping the precision within bounds of a factor of 2.5 only requires two codes (instead of five) for the first digits 8 and 9, for appended digits 0 and 5.

This works out to a total of 64 codes. While above, a tight fit took 15 codes, and a loose fit 16, here a tight fit takes 64 codes, and a loose fit 70 codes.

An additional 20 codes are needed for zero; if that number is increased, with the appended digits for a zero first digit advancing in steps of 2 rather than 5, the result would be 50 codes. 50 plus 70 is 120, which is indeed a good fit to 128. That is equivalent to adding a bit to the length of the mantissa in order to allow unnormalized values... which, of course, illustrates that the previous scheme really did achieve the near-equivalent of suppressing the first bit of a decimal mantissa!
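The code counts in the last few paragraphs can be tallied in a couple of lines of Python (taking the 50 codes for a zero first digit to be appended digits advancing in steps of 2, which is my reading of the passage):

```python
# Codes per leading digit for the "two-thirds of a digit" scheme:
tight = 20 + 2 * 10 + 4 * 5 + 2 * 2   # digit 1; digits 2-3; digits 4-7; digits 8-9
loose = 20 + 2 * 10 + 4 * 5 + 2 * 5   # digits 8 and 9 get five codes each instead
zero = 50                             # a zero first digit, appended digits in steps of 2

assert tight == 64
assert loose == 70
assert zero + loose == 120            # a good fit to 128
```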

But it does also mean that it is awkward to set up a system which combines this particular modification of floating-point with support for unnormalized numbers.
