Now that we've seen how Archimedes calculated pi, let's first look at how the concept of pi advanced in subsequent years.

The diagram above is an attempt to justify the formula for the area of a circle, A = pi * r^2. This kind of diagram can be found in many mathematics textbooks, and one was included in a book by Sato Moshun, in Japan, in 1698; also, an illustration of the same idea appears in the Notebooks of Leonardo da Vinci.
Basically, one can cut the circle into more and more slices, and as one does so, the shape that can be built by rearranging those slices gets closer and closer to a rectangle with the radius of the circle as its height, and half the circumference of the circle as its width.
This makes early diagrams of this kind particularly notable, since estimating the area of a shape by cutting it into smaller and smaller pieces, and taking the limit of those estimates, is a basic idea that led to the integral calculus.
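The way the slices close in on the area of the circle can be checked numerically. The following is only a sketch in Python: the function name wedge_area is my own, each slice is replaced by the triangle having the same two radii as sides (an assumption made to keep the arithmetic simple), and since the angle is set using the library's value of pi, this checks the limit rather than calculating pi.

```python
import math

def wedge_area(r: float, n: int) -> float:
    """Total area of n triangles standing in for n equal slices of a circle."""
    # Each triangle has two sides of length r meeting at an angle of 2*pi/n.
    return n * 0.5 * r * r * math.sin(2 * math.pi / n)

# The total creeps up toward pi * r^2 as the number of slices grows.
for n in (6, 24, 96, 384, 1536):
    print(n, wedge_area(1.0, n))
```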
Now, let's go back to Archimedes. Since we've now seen how pi relates to the area of a circle, not just its circumference, the next step is to gain an understanding of how pi relates to the volume of a sphere.

It was Archimedes of Syracuse who found that the surface area of a sphere is the same as that of the curved surface of a cylinder with the same diameter and the same height - just the curved surface, not including the flat top and bottom.
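Why this is the case is illustrated by the diagram: the cosine of the angle theta determines both the size of the circle at a given height on the sphere and the amount by which vertical distances on the sphere's surface are foreshortened when projected onto the surrounding cylinder. The two factors cancel, so each narrow band of the sphere has the same area as the band of the cylinder at the same height.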
This is the basis, incidentally, of an equal-area cylindrical map projection, which I describe on this page.
So, since the surface of the cylinder can be unrolled into a rectangle with a height equal to the sphere's diameter, and a width equal to the sphere's circumference, the surface area of a sphere is equal to pi * d^2, or 4 * pi * r^2.
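The equality between a band of the sphere and the matching band of the cylinder can be tested numerically. This is a rough sketch, with function names of my own choosing: it adds up thin strips of the sphere between two latitudes by the midpoint rule, and compares the total with the area of the cylinder between the same two heights.

```python
import math

def sphere_zone_area(r: float, lat1: float, lat2: float, steps: int = 100_000) -> float:
    """Midpoint-rule area of the part of a sphere between two latitudes (radians)."""
    dtheta = (lat2 - lat1) / steps
    total = 0.0
    for i in range(steps):
        theta = lat1 + (i + 0.5) * dtheta
        # Strip: a circle of radius r*cos(theta), of width r*dtheta along the surface.
        total += 2 * math.pi * r * math.cos(theta) * r * dtheta
    return total

def cylinder_band_area(r: float, lat1: float, lat2: float) -> float:
    """Area of the enclosing cylinder between the heights of the same two latitudes."""
    height = r * (math.sin(lat2) - math.sin(lat1))
    return 2 * math.pi * r * height

print(sphere_zone_area(1.0, 0.2, 0.9), cylinder_band_area(1.0, 0.2, 0.9))
```

Taking the latitudes from -pi/2 to pi/2 covers the whole sphere, and the cylinder's band is then the full rectangle of height d and width pi * d.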
Just as a circle can be cut into many tiny pie wedges, a sphere can be thought of as many tiny little pyramids, with a height equal to the radius of the sphere, and the bases of which total in area to the surface area of the sphere.
This would tell us the volume of a sphere, now that we know the formula for the surface area of a sphere, if we knew how to calculate the volume of a pyramid.

The volume of a pyramid is one-third of the area of its base times its height.
So the volume of a sphere is (4/3) * pi * r^3.
Why the volume of a pyramid is exactly one-third of the product of its base and height can be illustrated most simply by the diagram above. A cube has six faces, as anyone who has ever played a game with (conventional!) dice knows; its volume can be divided into six pyramids, each with a base having an area equal to the side of the cube squared, and a height equal to half the height of the cube.
And, of course, one-sixth is one-half of one-third.
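Written out, the two steps are as follows, taking s (a symbol of my own, not one from the diagram) for the side of the cube and r for the radius of the sphere:

$$V_{\text{pyramid}} = \frac{s^3}{6} = \frac{1}{3} \cdot s^2 \cdot \frac{s}{2}, \qquad V_{\text{sphere}} = \frac{1}{3} \cdot \left(4 \pi r^2\right) \cdot r = \frac{4}{3} \pi r^3$$

The first equality only covers the particular pyramid whose apex is at the center of a cube; that the same factor of one-third holds for every pyramid is assumed here rather than shown.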
Archimedes is the first person known to have derived the value of pi by means of mathematical reasoning. Using polygons of 96 sides, one enclosing the circle, and one enclosed by the circle, he established that pi was less than 3 1/7, but greater than 3 10/71 in his work Measurement of the Circle. He was credited by Heron of Alexandria with having subsequently improved those bounds, proving that pi was less than 195882/62351 but greater than 211872/67441 (these figures are, in fact, a modern guess at what was meant, as at least in the copies of Heron that we have, they are garbled); this was in a lost work with the title Plinthides and Cylinders.
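The polygon method can be imitated in a few lines of Python. This is only a sketch under modern assumptions: it uses the harmonic-mean and geometric-mean form of the side-doubling step and floating-point square roots, whereas Archimedes had to work with carefully chosen rational bounds on each square root. The function name is my own.

```python
import math

def archimedes_bounds(doublings: int = 4) -> tuple[float, float, int]:
    """Return (lower, upper, sides) from inscribed and circumscribed regular polygons."""
    sides = 6
    # Half-perimeters of the two hexagons for a circle of radius 1.
    inner = 3.0                    # inscribed hexagon
    outer = 2.0 * math.sqrt(3.0)   # circumscribed hexagon
    for _ in range(doublings):
        outer = 2.0 * outer * inner / (outer + inner)  # harmonic mean
        inner = math.sqrt(outer * inner)               # geometric mean
        sides *= 2
    return inner, outer, sides

lower, upper, sides = archimedes_bounds()
print(sides, lower, upper)
```

Four doublings take the hexagon to 96 sides, and the two results fall just inside the bounds of 3 10/71 and 3 1/7.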
Many years later, also using polygons and geometry, Ludolph van Ceulen calculated the value of pi to 35 decimal places. Numerous references state that because of this, pi became generally known as the Ludolphine Number (Ludolphische Zahl) in Germany; for example, the term appeared in the title of an 1885 paper by Weierstrass. However, it was most popular prior to 1910, and was much less common in Germany in the postwar era.
This method of calculating pi was difficult, but at first it was the only valid mathematical method known. Eventually, as a consequence of the development of calculus, it became understood how to easily develop Taylor series for the various elementary functions, but the arctangent series was developed before the invention of calculus.
The earliest mathematical formula for pi was that derived from how it might be calculated geometrically with polygons (starting from a square, rather than from a hexagon as Archimedes did) by François Viète:
$$\frac{\pi}{2} = \frac{2}{\sqrt{2}} \cdot \frac{2}{\sqrt{2 + \sqrt{2}}} \cdot \frac{2}{\sqrt{2 + \sqrt{2 + \sqrt{2}}}} \cdots$$
This is a somewhat modernized form of his formula, not the exact form he originally gave. The same is true of the infinite product given by John Wallis in 1655:
$$\frac{\pi}{4} = \frac{2 \cdot 4}{3 \cdot 3} \cdot \frac{4 \cdot 6}{5 \cdot 5} \cdot \frac{6 \cdot 8}{7 \cdot 7} \cdots$$
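Both products are easy to try out, and doing so shows how differently they behave. The sketch below, in Python with function names of my own, multiplies out a given number of factors of each in ordinary floating point; it is meant only to illustrate the two formulas in the modernized forms given here.

```python
import math

def viete(terms: int) -> float:
    """Approximate pi from the first `terms` factors of Viete's product for pi/2."""
    product = 1.0
    nested = 0.0
    for _ in range(terms):
        nested = math.sqrt(2.0 + nested)   # sqrt(2), sqrt(2 + sqrt(2)), ...
        product *= 2.0 / nested
    return 2.0 * product

def wallis(terms: int) -> float:
    """Approximate pi from the first `terms` factors of the Wallis product for pi/4."""
    product = 1.0
    for k in range(1, terms + 1):
        odd = 2 * k + 1
        product *= (odd - 1) * (odd + 1) / (odd * odd)   # (2*4)/(3*3), (4*6)/(5*5), ...
    return 4.0 * product

print(viete(25), wallis(1000))
```

Each factor of Viete's product cuts the error to about a quarter, while the error of the Wallis product shrinks only in proportion to the number of factors used.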
The power series for the arctangent function, which can be used to calculate pi, is as follows:
$$\operatorname{atn}(x) = \frac{x}{1} - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + \cdots$$
This series is known as Gregory's series, after James Gregory, who discovered it in 1671; not until much later did Western mathematicians learn that it was discovered by Madhava of Sangamagrama more than 250 years previously.
Since the arctangent given by this series is in radians, the arctangent of 1 is equal to one-quarter of pi. That, however, is a value for which this series converges at an extremely slow rate, so slow as to be useless in practice as a way to calculate pi. For x less than 1, however, it converges at an acceptable rate, faster as x becomes smaller.
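The difference in the rate of convergence is easy to see by experiment. Here is a minimal sketch in Python, with a function name of my own, which sums a chosen number of terms of the series in ordinary floating point and compares the result with the library arctangent.

```python
import math

def atn_series(x: float, terms: int) -> float:
    """Partial sum of x - x^3/3 + x^5/5 - ... using the given number of terms."""
    total = 0.0
    power = x
    for k in range(terms):
        term = power / (2 * k + 1)
        total += term if k % 2 == 0 else -term
        power *= x * x
    return total

# Error after ten terms, for several values of x.
for x in (1.0, 0.5, 0.2):
    print(x, abs(atn_series(x, 10) - math.atan(x)))
```

At x = 1, even a thousand terms leave an error of about one part in a thousand in the resulting value of pi.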
When the series is used to implement an arctangent function, for x greater than one, one would calculate the arctangent of 1/x and subtract it from pi/2, as the series does not converge for x greater than one. For values of x close to 1, either above or below it, say between 1/2 and 2, a further transformation would be used in practice, so that the series computes the difference between the angle and pi/4 instead. Today, techniques like CORDIC would be used instead, as they are faster. None of these techniques helps in calculating the value of pi, however.
One way to use a value less than 1 as the input to the arctangent series and yet produce a result that does lead to a value for pi would be to use the fact that 30 degrees, or, in radians, one-sixth of pi, is the arcsine of 1/2. The Pythagorean theorem can be used to determine that the arcsine of 1/2 is also the arctangent of one over the square root of three.
Since each term of the arctangent series contains a power of x that is x squared times the power in the preceding term, and x squared is here simply 1/3, one can do the whole calculation using only rational numbers, multiplying in the square root of three just once at the very end. This was how pi was calculated by Abraham Sharp in 1699 to 71 digits.
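As a sketch of this approach (not a reconstruction of Sharp's actual working, and with a function name and stopping rule of my own), the Python below sums the series with exact fractions. The factor sqrt(12) comes from pi = 6 * atn(1/sqrt(3)) = (6/sqrt(3)) * (1 - 1/(3*3) + 1/(5*9) - ...).

```python
from decimal import Decimal, localcontext
from fractions import Fraction

def sharp_pi(digits: int = 71) -> Decimal:
    """pi = sqrt(12) * sum((-1)^k / ((2k+1) * 3^k)), with the root applied last."""
    tolerance = Fraction(1, 10 ** (digits + 5))
    total = Fraction(0)
    power = Fraction(1)          # holds (1/3)^k
    k = 0
    while power / (2 * k + 1) > tolerance:
        term = power / (2 * k + 1)
        total += term if k % 2 == 0 else -term
        power /= 3
        k += 1
    with localcontext() as ctx:
        ctx.prec = digits + 10   # a few guard digits
        rational_part = Decimal(total.numerator) / Decimal(total.denominator)
        # The square root enters only here, after the rational sum is complete.
        return rational_part * Decimal(12).sqrt()

print(sharp_pi(71))
```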
Isaac Newton, one of the two independent inventors of the calculus, derived the arcsine formula in 1676:
$$\arcsin(x) = x + \frac{1}{2} \cdot \frac{x^3}{3} + \frac{1 \cdot 3}{2 \cdot 4} \cdot \frac{x^5}{5} + \frac{1 \cdot 3 \cdot 5}{2 \cdot 4 \cdot 6} \cdot \frac{x^7}{7} + \cdots$$
and he used it to calculate pi to at least 15 digits from the fact that pi/6 is the arcsine of 1/2.
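The arcsine series can be summed exactly for x = 1/2 using fractions. The sketch below is in Python; the function name and the choice of 25 terms are my own, and nothing here is meant to represent how Newton organized his own arithmetic.

```python
from fractions import Fraction

def arcsin_series(x: Fraction, terms: int) -> Fraction:
    """Partial sum of x + (1/2)(x^3/3) + (1*3)/(2*4)(x^5/5) + ..."""
    total = Fraction(0)
    coeff = Fraction(1)   # 1, 1/2, (1*3)/(2*4), ...
    power = x             # x, x^3, x^5, ...
    for k in range(terms):
        total += coeff * power / (2 * k + 1)
        coeff *= Fraction(2 * k + 1, 2 * k + 2)
        power *= x * x
    return total

pi_estimate = 6 * arcsin_series(Fraction(1, 2), 25)
print(float(pi_estimate))
```

Since every term is positive, each partial sum lies slightly below the true value, and each further term adds a little more than half a decimal digit.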
Incidentally, pi/10 is the arctangent of sqrt(1 - (2/5) * sqrt(5)) and the arcsine of (sqrt(5) - 1)/4, which latter figure is one half of the reciprocal of the golden ratio. This has more to do with the relationship between the golden ratio and the pentagon than any relationship between it and pi, of course.
Because it is possible to construct a 17-sided polygon by straightedge and compasses, there are also expressions involving the square root of 17 that could be used in this fashion.
It would be more convenient, however, if a simpler quantity not involving a square root, and significantly smaller than 1, could be used in the arctangent formula, because that would lead to a series that would converge more quickly even than the arcsine formula for x=1, let alone the arctangent formula for that value.
While no single rational value of x between 0 and 1 has an arctangent that is a rational multiple of pi, if one is willing to evaluate the arctangent function two or more times, this simplification can be obtained.
The diagram to the right illustrates how one can calculate, given the tangents of two angles, the tangent of the sum of those angles.
Let the length of the line segment from A to O be equal to 1.
Then, the length of the line segment from A to D is the tangent of the angle theta;
as the length of the line segment from B to O also equals 1, the length of the line segment from B to C is the tangent of the angle phi;
and the length of the line segment from A to G is the tangent of the angle theta plus phi.
Let us denote the tangent of theta by P, the tangent of phi by Q, and the tangent of theta plus phi by R.
From the Pythagorean theorem, we know the length of the line segment from D to O is equal to the square root of P squared plus 1. And therefore the length of the line segment from D to E is equal to Q*sqrt((P^2)+1).
Given that the angle FDE is also theta, the same ratio is applied a second time, and the length of the line segment from D to F is equal to Q*((P^2)+1).
The small triangle that remains to be understood in order to work out the length of the line segment from F to G is not a right triangle, although it could be broken into two pieces that are. However, it is apparent at this point that we're not taking quite the right approach, and we need to change one thing in the diagram.
In this diagram, ignoring the point F, and instead paying attention to a new point, H, this time, as the right triangle is turned around, the length of the line segment from D to H is simply Q.
The remaining triangle is now a right triangle. But the angle GEH is neither phi nor theta, it's phi plus theta, and the tangent of that is what we want to calculate from P and Q. Are we in trouble?
No, we aren't. The length of the line segment EH is clearly equal to P times Q, and the ratio of P plus Q to 1 minus P*Q is equal to R. One can drop a perpendicular from E down to the line segment AO to make it obvious how this conclusion is reached: the ratio of the lengths of the line segments HG and HE is the tangent of theta plus phi, just as the ratio of the lengths of the line segments AG and AO is, so a similar triangle, smaller in size by the factor (1-(P*Q))/1, can be formed in which the ratio of P+Q to 1-(P*Q) can be seen to be R.
One example, due to Euler, and based on which these diagrams were drawn, is that atn(1) = atn(1/2) + atn(1/3). So if 1/2 = tan(theta) and 1/3 = tan(phi), tan(theta+phi) is 5/6 divided by (1 - (1/2)*(1/3)), which is 5/6 divided by itself, or 1.
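The rule R = (P + Q)/(1 - P*Q) can be tried out with exact fractions. In this small Python sketch the function name tan_sum is my own; the floating-point line at the end is only a sanity check using the library arctangent.

```python
import math
from fractions import Fraction

def tan_sum(p: Fraction, q: Fraction) -> Fraction:
    """Tangent of (theta + phi), given p = tan(theta) and q = tan(phi)."""
    return (p + q) / (1 - p * q)

# Euler's example: tan(atn(1/2) + atn(1/3)) comes out as exactly 1.
print(tan_sum(Fraction(1, 2), Fraction(1, 3)))
print(math.atan(1 / 2) + math.atan(1 / 3), math.pi / 4)
```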
Applying this formula repeatedly, though, it becomes possible to obtain even better results.
To use something that converges even faster than powers of 1/2, suppose we solve atn(1/2) = atn(x) + atn(1/3) for x: would the result be something useful? If so, we wouldn't have to calculate the arctangent three times; we could just multiply atn(1/3) by two.
As it turns out, x equals 1/7, since 1/2 is 10/21 divided by (1 - (1/3)*(1/7)) or 20/21.
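Solving for x amounts to running the addition rule backwards: if R is the tangent of the sum and Q the tangent of one part, the tangent of the other part is (R - Q)/(1 + R*Q). A short Python sketch, with function names of my own:

```python
from fractions import Fraction

def tan_sum(p: Fraction, q: Fraction) -> Fraction:
    """Tangent of (theta + phi), given p = tan(theta) and q = tan(phi)."""
    return (p + q) / (1 - p * q)

def tan_diff(r: Fraction, q: Fraction) -> Fraction:
    """Tangent of (alpha - phi), given r = tan(alpha) and q = tan(phi)."""
    return (r - q) / (1 + r * q)

x = tan_diff(Fraction(1, 2), Fraction(1, 3))
print(x)                             # the x with atn(1/2) = atn(x) + atn(1/3)
print(tan_sum(x, Fraction(1, 3)))    # adding atn(1/3) back recovers 1/2
```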
If atn(1/2) = atn(1/7) + atn(1/3), then it's also true that atn(1/3) = atn(1/2) - atn(1/7).
Thus, we now have pi/4 = atn(1) = 2*atn(1/3) + atn(1/7), which is an improvement, since atn(1/3) converges more quickly than atn(1/2).
So we can subtract arctangents as well as adding them.
This led to the formula pi/4 = atn(1) = 4*atn(1/5) - atn(1/239), derived by John Machin in 1706, which was used for a number of attempts to calculate pi to a large number of digits.
For example, it was used for two of the earliest calculations of pi on a computer, one to 2037 places on the ENIAC by Reitwiesner in 1949, and one to 3089 places on the NORC by Nicholson and Jeenel in 1954.
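To give an idea of how Machin's relation is put to work, here is a minimal sketch in Python using scaled integers. It is not a reconstruction of the ENIAC or NORC programs; the function names and the choice of ten guard digits are assumptions of my own.

```python
def arctan_inverse(n: int, scale: int) -> int:
    """Return atn(1/n) * scale, truncated to an integer, from Gregory's series."""
    power = scale // n          # scale / n^(2k+1), starting with k = 0
    total = power
    n_squared = n * n
    k = 1
    while power:
        power //= n_squared
        term = power // (2 * k + 1)
        total += -term if k % 2 else term   # signs alternate: -, +, -, ...
        k += 1
    return total

def machin_pi(digits: int) -> str:
    """pi to the requested number of decimals via 4*atn(1/5) - atn(1/239)."""
    guard = 10                  # extra digits to absorb truncation errors
    scale = 10 ** (digits + guard)
    pi_scaled = 4 * (4 * arctan_inverse(5, scale) - arctan_inverse(239, scale))
    text = str(pi_scaled // 10 ** guard)
    return text[0] + "." + text[1:]

print(machin_pi(50))
```

Dividing by 25 and by 57121 at each step is all the work the two series require, which is part of why this relation suited both hand and machine calculation.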
From 1873 to 1945, it was believed that the value of pi was known to 707 digits, having been calculated to that precision, also using Machin's arctangent relation, by William Shanks. In 1945, calculations by D. F. Ferguson established that only the first 527 digits of that value were correct; his first calculations were made with pencil and paper, but he later used a mechanical calculator to help him to derive 808 digits in 1947. While there was a mistake in the value he initially had published in March 1947, in September 1947 he corrected the error.
Originally, when I first wrote this page, I thought that the error was not the fault of William Shanks alone. In 1853, he had published an earlier calculation of pi to 607 decimal places which, also being correct only to the first 527 places, contained the same error that marred his later calculation. This publication was in the form of a book which included each of the terms in the two arctangent series used for the calculation, making it much easier to check portions of the calculation for error than it was to make the calculation in the first place. However, those individual terms were only given to 530 places, not to 607 places. That still left open a possibility of other mathematicians finding the error, since this was three places beyond where it occurred, but it clearly makes the situation different than it would have been had that not been the case.
Another mathematician did check William Shanks' calculation as far as the first 405 (or 440?) digits. Ironically, the mathematician who did so was William Rutherford, who himself, in 1841, had published a value of pi to 208 digits which was correct only to the first 152 digits. That value, however, was soon corrected, in 1844, through a calculation carried out by Zacharias Dase.
A document by Erwin Engert, dated January 1, 2012, is available on the Web, which sorted out typographical errors in published versions of Shanks' 707 digits of pi, and which investigated where the error was made. Later, an article in American Scientist by Brian Hayes, in their September-October 2014 issue, continued the analysis, finding additional details of the errors, but not all the discrepancies have been accounted for, so it is not yet possible to re-calculate the erroneous value of pi that Shanks would have produced to more places.
Doing so might be of interest for this reason: his 707 digits of pi contained the digit 7 less often than might be expected from random chance, unlike the actual value of pi, which so far has given no indication that its digits are not statistically like a random sequence. If this anomaly were to continue in subsequent digits, it might provide an insight into the conditions under which a mathematical transcendental number like pi could have a digit sequence that is not normal.
Although faster methods for computing pi to a large number of places are now known, arctangent formulas have continued to be used in the calculation of pi even fairly recently.
The calculation of pi by D. F. Ferguson to 808 digits in 1947 in which a desk calculator was used was done using the following identity:
atn(1) = 3*atn(1/4) + atn(1/20) + atn(1/1985)
After the calculations on the ENIAC and the NORC used Machin's identity, a calculation of pi to 10,021 places on the Ferranti Pegasus computer by G. E. Felton used the identities
atn(1) = 8*atn(1/10) - atn(1/239) - 4*atn(1/515)
atn(1) = 12*atn(1/18) + 8*atn(1/57) - 5*atn(1/239)
In the first identity, the first term, being atn(1/10) instead of atn(1/5), is both faster-converging and more convenient for decimal calculations than the first term of Machin's series, and the remaining two terms converge very quickly. The second identity, due to Gauss, used for checking the result, was used again in a later calculation to be mentioned below.
One of the most recent such calculations was in 2002, undertaken by Yasumasa Kanada, in which he had a computer calculate pi to over one trillion digits, using the following two arctangent relations:
atn(1) = 44*atn(1/57) + 7*atn(1/239) - 12*atn(1/682) + 24*atn(1/12943)
atn(1) = 12*atn(1/49) + 32*atn(1/57) - 5*atn(1/239) + 12*atn(1/110443)
Because atn(1/57) and atn(1/239) occur in both relations, but multiplied by a different amount, errors for them would still make the two results disagree, and yet labor - or, rather, machine time - can be saved as these values need only be calculated once.
The first identity was found by F. C. W. Störmer in 1896; the second one by Kikuo Takano in 1982, who had used these same formulas himself for calculating pi to a lesser number of digits.
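The saving from the shared terms can be shown in a small sketch. The Python below uses scaled integers, with function names and a guard-digit count of my own; it evaluates each arctangent once and then forms both relations from the same stored values. It illustrates the checking idea only, and is not a description of Kanada's program.

```python
def arctan_inverse(n: int, scale: int) -> int:
    """Return atn(1/n) * scale, truncated to an integer, from Gregory's series."""
    power = scale // n
    total = power
    n_squared = n * n
    k = 1
    while power:
        power //= n_squared
        term = power // (2 * k + 1)
        total += -term if k % 2 else term
        k += 1
    return total

def cross_checked_pi(digits: int) -> tuple[str, str]:
    """Evaluate both four-term relations, computing every arctangent only once."""
    guard = 10
    scale = 10 ** (digits + guard)
    a = {n: arctan_inverse(n, scale) for n in (49, 57, 239, 682, 12943, 110443)}
    # atn(1/57) and atn(1/239) appear in both sums with different multipliers.
    first = 4 * (44 * a[57] + 7 * a[239] - 12 * a[682] + 24 * a[12943])
    second = 4 * (12 * a[49] + 32 * a[57] - 5 * a[239] + 12 * a[110443])

    def fmt(value: int) -> str:
        text = str(value // 10 ** guard)
        return text[0] + "." + text[1:]

    return fmt(first), fmt(second)

first_value, second_value = cross_checked_pi(50)
print(first_value == second_value, first_value)
```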
This basic technique was also used earlier, in 1961, by Daniel Shanks and John W. Wrench to calculate pi to 100,265 places on an IBM 7090 computer. Daniel Shanks, an American, is not known to have any relation to William Shanks.
They used the identities:
atn(1) = 6*atn(1/8) + 2*atn(1/57) + atn(1/239)
atn(1) = 12*atn(1/18) + 8*atn(1/57) - 5*atn(1/239)
so again the calculation could be checked even though atn(1/57) and atn(1/239) were only calculated once.
Here, Störmer found the first of the two identities used, and the second one was due to Gauss.
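All of the arctangent relations quoted on this page can be confirmed exactly, without summing any series, by the same reasoning as the tangent addition rule: atn(1/n) is the angle of the complex number n + i, and angles add when complex numbers are multiplied. The sketch below, in Python with names of my own, multiplies the factors out in whole numbers and checks that the real and imaginary parts of the product are equal; a floating-point sum is included only to confirm that the angle is pi/4 itself and not pi/4 plus a whole number of turns.

```python
import math

def gaussian_product(terms: list[tuple[int, int]]) -> tuple[int, int]:
    """Multiply (n + i)^c over all (c, n) pairs, using (n - i) when c is negative."""
    re, im = 1, 0
    for coeff, n in terms:
        sign = 1 if coeff > 0 else -1
        for _ in range(abs(coeff)):
            re, im = re * n - im * sign, re * sign + im * n
    return re, im

def is_quarter_pi(terms: list[tuple[int, int]]) -> bool:
    """True if the sum of c * atn(1/n) over the pairs is exactly pi/4."""
    re, im = gaussian_product(terms)
    numeric = sum(c * math.atan(1 / n) for c, n in terms)
    return re == im and re > 0 and abs(numeric - math.pi / 4) < 1e-9

RELATIONS = {
    "atn(1/2) + atn(1/3)": [(1, 2), (1, 3)],
    "2*atn(1/3) + atn(1/7)": [(2, 3), (1, 7)],
    "4*atn(1/5) - atn(1/239)": [(4, 5), (-1, 239)],
    "3*atn(1/4) + atn(1/20) + atn(1/1985)": [(3, 4), (1, 20), (1, 1985)],
    "8*atn(1/10) - atn(1/239) - 4*atn(1/515)": [(8, 10), (-1, 239), (-4, 515)],
    "12*atn(1/18) + 8*atn(1/57) - 5*atn(1/239)": [(12, 18), (8, 57), (-5, 239)],
    "6*atn(1/8) + 2*atn(1/57) + atn(1/239)": [(6, 8), (2, 57), (1, 239)],
    "44*atn(1/57) + 7*atn(1/239) - 12*atn(1/682) + 24*atn(1/12943)":
        [(44, 57), (7, 239), (-12, 682), (24, 12943)],
    "12*atn(1/49) + 32*atn(1/57) - 5*atn(1/239) + 12*atn(1/110443)":
        [(12, 49), (32, 57), (-5, 239), (12, 110443)],
}

for name, terms in RELATIONS.items():
    print(is_quarter_pi(terms), name)
```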