Teaching Computer Programming to High School students: An introductory course using Python as the high level language; real numbers, calculating square root, numerical integration

Joseph Mack

jmack (at) wm7d (dot) net
AustinTek home page

v20090511, released under GPL-v3.

Table of Contents

1. Back to basics: Real Numbers

1.1. Floating point representation of real numbers
1.2. Examples of binary floating point numbers
1.3. Normalisation of floating point numbers
1.4. The 8-bit set of normalised reals
1.5. The non-existance of 0.0, NaN and Infinities
1.6. Reals with a finite decimal representation, don't always have a finite binary represention
1.7. Do not test reals for equality
1.8. Floating point precision: Optimisation
1.9. Representing money

2. Problems resulting from a finite representation of reals

2.1. Machine's epsilon
2.2. Changing the exponent doesn't loose precision (for small changes in exponent)
2.3. Changing the mantissa can produce errors (for small changes in mantissa)

3. Two Algorithms: square root, numerical integration

4. Calculating square root

4.1. Python reals are 64 bit
4.2. Babylonian Algorithm
4.3. Code for Babylonian Algorithm for Square Root
4.4. Order of the Algorithm for calculating the square root
4.5. Benchmarking (speed) comparision: Babylonian sqrt() compared to built-in math library sqrt()
4.6. Running time comparision: Python/C
4.7. Presentation: the Babylonian Square Root

5. Numerical Integration

5.1. Calculating π by Numerical Integration
5.2. an estimate of running time (back of the envelope calculation)
5.3. range()/xrange() and out-of-memory problem
5.4. Optimising the calculation
5.5. Safe Programming with normalised (reduced) numbers
5.6. For a constant, multiply by reciprocal rather than divide
5.7. add optimisations
5.8. Finding π from the upper and lower bounds
5.9. calculation of π by numerical integration and timing the runs
5.10. Order of the algorithm for calculating π by numerical integration
5.11. errors in calculation of π by numerical integration
5.12. Accuracy of Upper and Lower Bounds with 128 bit reals
5.13. Other algorithms for calculating π
5.14. speed of calculation of π by the library routines
5.15. Area under arbitrary curve
5.16. Calculating the Length of the circumference by Numerical Integration
5.17. Calculating the Volume of a Sphere by Numerical Integration
5.18. Presentation: Numerical Integration as a method to calculate π

6. e - the base of natural logarithms: exponential and geometric processes

6.1. Calculating Bank Interest at small intervals
6.2. errors in calculation of e=(1+1/n)^n
6.3. Logarithmic time for calculating e
6.4. Faster converging series for e: factorial series
6.5. errors in calculation of e from the factorial series, Arithmetic Progressions and the number of casing stones on the Great Pyramid
6.6. Logarithms and Slide Rules
6.7. The Great Numbers of Mathematics
6.8. Compound Interest
6.9. Mortgage Calculation
6.10. Refinancing a Mortgage
6.11. Moore's Law: effect on calculation of e=(1+1/n)^n
6.12. Optimal Slacking and the Futility of Interstellar Space Travel
6.13. Geometric Progressions: Summing the series
6.14. Geometric Progressions: chess and the height of a pile of rice
6.15. Geometric Progressions: Time for the Moore's Law aided calculation of e=(1+1/n)^n
6.16. Geometric Progressions: creating money by multiplication
6.17. Parallel Programming and Distributed Denial of Service (DDoS): linear or exponential?
6.18. Lucky Lindy
6.19. Getting in early when you have an exponential process
6.20. Presentation: e, exponential and geometric series

Abstract

Class lessons for a group of 7th graders with no previous exposure to programming (as of Aug 2008, now 8th graders - in the US 7th, 8th grade are called Middle School). The student are doing this after school on their own time and not for credit. My son's school didn't want me to teach the class using any of the school facilities, as they thought it would compete with a class in Java given to advance placement math 12th graders. However I could use the school facilities, if I didn't teach anything which would fullfil a requirement (which I assume meant anything practical). So the class is at my home and is free. Since this is a hobby activity for the kids, I don't ask them to do homework. As they get further into the class and take on projects, I'll be quite happy for them to work on the projects in their own time, but will let them decide whether/when to do this.

	Note
The notes here are being written ahead of the classes. I've marked boundaries of delivered classes with "End Lesson X". Each class is about an 90mins, which is about as much as the students and I can take. After a class I revise the material I presented to reflect what I had to say to get the points across, which isn't always what I had in the notes that the students saw. Material below on classes that I've given will be an updated version of what I presented. Material below on classes I haven't given, are not student tested and may be less comprehensible. The big surprise to me is that when you're presenting new material, the students all say "yeah, yeah we got that" and want you to go on. However when you ask them to do a problem, they haven't a clue what to do. So most of my revisions to the early material were to add worked problems. I also spend a bit of time at the start of the class asking the students to do problems from the previous week's class. I've found that when asking the students to write a piece of code that some of them (even after almost a year) can't turn a specification into code. Instead I have to say "initialise these variables, use a loop to do X, in the loop calculate Y and print out these values, at the end print out the result". Usually these steps will be given in a list in the notes here. Once I'd shown the students how to initialise and write loops, I had expected them to be able to parse a problem into these steps without my help. However some of them can't and I changed my teaching to reflect this. The kids bring laptops do to the exercises and they display this page by dhcp'ing and surfing to my router where this page is also stored. Students are using Mac OS, Windows XP and Linux. For WinXP, the initial lessons used notepad and the windows python. For material starting at the Babylonian square root, I changed the WinXP student over to Cygwin (still using notepad). In later sections I started having the kids do a presentation on what they'd learned. The primary purpose of this was to accustom the kids to talking in front of groups of people. They also had to organise the material, revealing how much they'd remembered and understood it. The presentation on numerical integration covered a body of material large enough that it took 2 classes and homework to assemble the presentation. For the next section of work, I'll have them do presentations on smaller sections of the material, and then have them present on the whole section at the end. I've hidden most of the answers to questions in footnotes (so they won't see the answers easily during class). However later, when the students are scanning the text to construct their presentations, that it's hard to find what they're looking for. I don't know what to do about that.

Note

The notes here are being written ahead of the classes. I've marked boundaries of delivered classes with "End Lesson X". Each class is about an 90mins, which is about as much as the students and I can take. After a class I revise the material I presented to reflect what I had to say to get the points across, which isn't always what I had in the notes that the students saw. Material below on classes that I've given will be an updated version of what I presented. Material below on classes I haven't given, are not student tested and may be less comprehensible.

The big surprise to me is that when you're presenting new material, the students all say "yeah, yeah we got that" and want you to go on. However when you ask them to do a problem, they haven't a clue what to do. So most of my revisions to the early material were to add worked problems. I also spend a bit of time at the start of the class asking the students to do problems from the previous week's class.

I've found that when asking the students to write a piece of code that some of them (even after almost a year) can't turn a specification into code. Instead I have to say "initialise these variables, use a loop to do X, in the loop calculate Y and print out these values, at the end print out the result". Usually these steps will be given in a list in the notes here. Once I'd shown the students how to initialise and write loops, I had expected them to be able to parse a problem into these steps without my help. However some of them can't and I changed my teaching to reflect this.

The kids bring laptops do to the exercises and they display this page by dhcp'ing and surfing to my router where this page is also stored.

Students are using Mac OS, Windows XP and Linux. For WinXP, the initial lessons used notepad and the windows python. For material starting at the Babylonian square root, I changed the WinXP student over to Cygwin (still using notepad).

In later sections I started having the kids do a presentation on what they'd learned. The primary purpose of this was to accustom the kids to talking in front of groups of people. They also had to organise the material, revealing how much they'd remembered and understood it. The presentation on numerical integration covered a body of material large enough that it took 2 classes and homework to assemble the presentation. For the next section of work, I'll have them do presentations on smaller sections of the material, and then have them present on the whole section at the end. I've hidden most of the answers to questions in footnotes (so they won't see the answers easily during class). However later, when the students are scanning the text to construct their presentations, that it's hard to find what they're looking for. I don't know what to do about that.

Material/images from this webpage may be used, as long as credit is given to the author, and the url of this webpage is included as a reference.

1. Back to basics: Real Numbers

1.1. Floating point representation of real numbers

wiki, real numbers (http://en.wikipedia.org/wiki/Real_number), wiki, IEEE 854 floating point representation (http://en.wikipedia.org/wiki/IEEE_854), wiki, floating point (http://en.wikipedia.org/wiki/Floating_point). IEEE 754 Converter (http://www.h-schmidt.net/FloatApplet/IEEE754.html) a java applet to convert floating point numbers to 32 bit IEEE 754 representation.

The Python docs on floating point Floating Point Arithmetic (see http://docs.python.org/tut/node16.html).

A compiler writer's view of floating point Lahey Floating Point (http://www.lahey.com/float.htm).

The reasoning behind the IEEE-754 spec: What Every Computer Scientist Should Know About Floating-Point Arithmetic, by David Goldberg, published in the March, 1991 issue of Computing Surveys. Copyright 1991, Association for Computing Machinery, Inc. The paper is available at ACM portal (http://portal.acm.org/citation.cfm?id=103163).

Scientists and engineers are big users of computing and do calculations on non integer numbers (e.g. 0.12). These numbers are called "real" to differentiate them from "imaginary" (now called "complex") numbers. Real numbers measure values that change continuously (smoothly) e.g. temperature, pressure, altitude, length, speed.

	Note
	You will never have to reproduce any of the following information in normal coding, since all the routines have been written for you, but you will have to understand the limitations of floating point representation, or get nonsense or invalid results.

One possible representation of real numbers is fixed point e.g., where the decimal point is in its usual position. Fixed point isn't useful for small or large numbers in fixed width fields (e.g. 8 columns wide) as significant digits can be lost. A small number like 0.000000000008 will be represented at 0.000000 and 123456789.0 will be represented as (possibly) 12345678 or 456789.0. So floating point notation is used. Here's some floating point numbers and their 8-column representation

real                fixed point    floating point
123456789.0         12345678       1.2345E8  (ie 1.2345*10^8)
0.0000000012345678  0.000000       1.234E-9  (ie 1.2345*10-9)

A decimal example of a floating point number is +1.203*10². There is one digit before the decimal point which has a value from 1-9 (the base here being 10). Subsequent digits after the decimal point are divided by 10,100... etc. The term 1.023 is called the mantissa, while the term 2 is called the exponent. A mantissa and exponent are the two components of scientific notation for real numbers. Here's how we calculate the decimal value of +1.203E2

sign     + positive
mantissa 1.203 = 1/1 + 2/10 + 0/100 + 3/1000
exponent 10^2 = 100
+1.203*10^2 = +120.3

1.2. Examples of binary floating point numbers

Most computers use 64, 80 or 128-bit floating point numbers (32-bit is rarely used for optimisation calculations, because of its limited precision, see floating point precision). Due to the finite number of digits used, the computer representation of real numbers as floating point numbers is only an approximation. We'll explore the reason for this and the consequences here.

One of the early problems with computers was how to represent floating point numbers. Hardware manufacturers all used different schemes (which they touted as a "marketing advantage"), whose main effect was to make it difficult to move numbers between machines (unless you did it in ASCII, or did it between hardware from the same manufacturer). The purpose of these non-standard schemes was to lock the customer into buying hardware from the same manufacturer, when it came time to upgrade equipment.

You'd think that manufacturers would at least get it right, but these different schemes all gave different answers. A C.S. professor in the early days of programmable calculators (1970's) would stuff his jacket pockets with calculators and go around manufacturers and show them their calculators giving different answers. He'd say "what answer do you want?" and give it to them. Eventually a standard evolved (IEEE 854 and 754) which guaranteed the same result on any machine and allowed transfer of numbers between machines in standard format (without intermediate conversion to ASCII).

Hop over to IEEE_854 (http://en.wikipedia.org/wiki/IEEE_854) and look at the bar diagrams of 32- and 64-bit representation of real numbers.

Worked examples on 32- or 64-bit numbers take a while and because of the large number of digits involved, the results aren't always clear. Instead I'll make up a similar scheme for 8-bit reals and work through examples of 8-bit floating point numbers. If we were working on an 8-bit (rather than 32- or 64-bit) computer, here's what an 8-bit floating point number might look like.

Sseemmmm
S    = sign of the mantissa (ie the sign of the whole number)
see  = exponent (the first bit is the sign bit for the exponent)
mmmm = mantissa

Here's a binary example using this scheme.

00101010
Sseemmmm
S      = 0: +ve
see    = 010: s = 0, +ve sign. exponent = +10 binary = 2. multiply mantissa by 2^2 = 4 decimal
mmmm   = 1010: = 1.010 binary = 1/1 + 0/2 + 1/4 + 0/8 = 1.25 decimal
number = + 1.25 * 2^2 = +5.00

Here's another example

11011101
Sseemmmm
S      = 1: -ve number
see    = 101: s = 1, -ve sign. exponent = -01 binary = -1 decimal. multiply mantissa by 2^-1 = 0.5 decimal
mmmm   = 1101: = 1.101 binary = 1/1 + 1/2 + 0/4 + 1/8 = 1.625 decimal
number = - 1.625 * 2^-1 = -0.8125

Doing the same problem with bc -l

11011101
Sseemmmm
#sign is -ve

#checking that I can do the mantissa
# echo "obase=10; ibase=2; 1.101 " | bc -l
1.625

#checking that I can do the exponent. "see" is 101 so exponent is -01
# echo "obase=10; ibase=2; (10)^(-01)" | bc -l
0.5

#the whole number
# echo "obase=10; ibase=2; -1.101 * (10)^(-01)" | bc -l
-0.8125

How many floating point numbers is it possible to represent using a byte (you don't know the value of these numbers or their distribution in the continuum, but you do know how many there are) ^[1] ? (How many reals can you represent using 32-bits ^[2] ?) We could choose any 8-bit scheme we want (e.g. seemmmmm, seeeemmm, mmeeeees, mmmmseee...), but we will still be subject to the 8-bit limit (we will likely have different reals in each set).

In the the seeemmmm scheme used here, where do the numbers come from? The first bit (the sign) has 2 possibilities, we have 3 bits of exponent giving 8 exponents, and 4 bits of mantissa giving us 16 mantissas. Thus we have 2*8*16=256 8-bit floating point numbers.

	Note
	End Lesson 17

Student Example: What real number is represented by the 8-bit floating point number 01101001 ^[3] ? Use bc to convert this number to decimal ^[4]

1.3. Normalisation of floating point numbers

You aren't allowed to have 0 as the first number of the mantissa. If you get a starting 0 in a decimal number, you move the decimal point till you get a single digit 1-9 before the decimal point (this format is also called "scientific notation"). The process of moving the decimal point is called normalisation.

0.1234*10^3 non-normalised floating point number
1.234*10^2  normalised floating point number.

Is 12.34*10¹ normalised ^[5] ?

Let's see what happens if we don't normalise 8-bit reals. For the moment, we'll ignore -ve numbers (represented by the first bit of the floating point number). Here are the 8 possible exponents and 16 possible mantissas, giving us 8*16=128 +ve numbers.

   eee   E
   000   2^0  =1
   001   2^1  =2
   010   2^2  =4
   011   2^3  =8
   100   2^-0 =1
   101   2^-1 =1/2
   110   2^-2 =1/4
   111   2^-3 =1/8

   mmmm   M
   0000   0
   0001  1/8
   0010  1/4
   0011  3/8
   0100  1/2
   0101  5/8
   0110  3/4
   0111  7/8
   1000  1
   1001  1 + 1/8
   1010  1 + 1/4
   1011  1 + 3/8
   1100  1 + 1/2
   1101  1 + 5/8
   1110  1 + 3/4
   1111  1 + 7/8

Because we have mantissas <1.0, there are multiple representations of numbers, e.g. the number 1/4 can be represented by (1/8)*2 or (1/4)*1; 1 + 1/4 can be represented by (5/8)*2 or 1 + 1/4. We want a scheme that has only one representation for any number. Find other numbers that can be represent in multiple ways - twice ^[6] - trice ^[7] - four times ^[8] - five times ^[9] .

Normalisation says that the single digit before the decimal point can't be 0; we have to start the number with any of the other digits. For decimal the other digits are 1..9. In binary we only have one choice: "any of the other digits" is only 1. All binary floating point numbers then must have a mantissa starting with 1 and only binary numbers from 1.0 to 1.11111.... (decimal 1.0 to 1.99999...) are possible in the mantissa.

Since there's no choice for the value of the first position in the mantissa, real world implementations of floating point representations do not store the leading 1. The algorithm, which reads the floating point format, when it decodes the stored number, knows to add an extra "1" at the beginning of the mantissa. This allows the storage of another bit of information (doubling the number of reals that can be represented), this extra bit is part of the algorithm and not stored in the number.

Here are all the possible normalised mantissas in our 8-bit scheme (the leading 1 in the mantissa is not stored). How many different mantissas are possible in our scheme ^[10] ?

stored				binary		decimal
shortened	true		value		value
mantissa	mantissa	mantissa	mantissa
0000		10000		1.0000		1
0001		10001		1.0001		1 + 1/16 
.
.
1111		11111		1.1111		1 + 15/16

1.4. The 8-bit set of normalised reals

Here's a table of the +ve 8-bit normalised real numbers. (How many +ve 8-bit real numbers are there ^[11] ?) There are only 112 numbers in this list. The 16 entries with see=000 are missing: these are sacrificed to represent 0.0 - we'll explain this shortly - see the non-existance of 0.0. Since a byte can represent an int just as easily as it can represent a real, the table includes the value of the byte, as if it were representing an int.

In my home-made 8-bit real number, the exponent is a signed int. For clarity I've separated out the sign bit of the exponent. If you use the signed int convention, then the first bit of an integer is the sign, with 1 being -ve and 0 being +ve. Remember that if you're using the signed int representation, and you have a 3 bit int, then 000-1=fff.

S see mmmm	int 	M	E	real
0 001 0000	16	1.0	2.0	2.0
0 001 0001	17	1.0625	2.0	2.125
0 001 0010	18	1.125	2.0	2.25
0 001 0011	19	1.1875	2.0	2.375
0 001 0100	20	1.25	2.0	2.5
0 001 0101	21	1.3125	2.0	2.625
0 001 0110	22	1.3750	2.0	2.75
0 001 0111	23	1.4375	2.0	2.875
0 001 1000	24	1.5	2.0	3.0
0 001 1001	25	1.5625	2.0	3.125
0 001 1010	26	1.625	2.0	3.25
0 001 1011	27	1.6875	2.0	3.375
0 001 1100	28	1.75	2.0	3.5
0 001 1101	29	1.8125	2.0	3.625
0 001 1110	30	1.875	2.0	3.750
0 001 1111	31	1.9375	2.0	3.875
0 010 0000	32	1.0	4.0	4.0
0 010 0001	33	1.0625	4.0	4.25
0 010 0010	34	1.125	4.0	4.5
0 010 0011	35	1.1875	4.0	4.75
0 010 0100	36	1.25	4.0	5.0
0 010 0101	37	1.3125	4.0	5.25
0 010 0110	38	1.375	4.0	5.5
0 010 0111	39	1.4375	4.0	5.75
0 010 1000	40	1.5	4.0	6.0
0 010 1001	41	1.5625	4.0	6.25
0 010 1010	42	1.625	4.0	6.5
0 010 1011	43	1.6875	4.0	6.75
0 010 1100	44	1.75	4.0	7.0
0 010 1101	45	1.8125	4.0	7.25
0 010 1110	46	1.875	4.0	7.5
0 010 1111	47	1.9375	4.0	7.750
0 011 0000	48	1.0	8.0	8.0
0 011 0001	49	1.0625	8.0	8.5
0 011 0010	50	1.125	8.0	9.0
0 011 0011	51	1.1875	8.0	9.5
0 011 0100	52	1.25	8.0	10.0
0 011 0101	53	1.3125	8.0	10.5
0 011 0110	54	1.375	8.0	11.0
0 011 0111	55	1.4375	8.0	11.5
0 011 1000	56	1.5	8.0	12.0
0 011 1001	57	1.5625	8.0	12.5
0 011 1010	58	1.625	8.0	13.0
0 011 1011	59	1.6875	8.0	13.5
0 011 1100	60	1.75	8.0	14.0
0 011 1101	61	1.8125	8.0	14.5
0 011 1110	62	1.875	8.0	15.0
0 011 1111	63	1.9375	8.0	15.5
0 100 0000	64	1.0	1.0	1.0
0 100 0001	65	1.0625	1.0	1.0625
0 100 0010	66	1.125	1.0	1.125
0 100 0011	67	1.1875	1.0	1.1875
0 100 0100	68	1.25	1.0	1.25
0 100 0101	69	1.3125	1.0	1.3125
0 100 0110	70	1.375	1.0	1.375
0 100 0111	71	1.4375	1.0	1.4375
0 100 1000	72	1.5	1.0	1.5
0 100 1001	73	1.5625	1.0	1.5625
0 100 1010	74	1.625	1.0	1.625
0 100 1011	75	1.6875	1.0	1.6875
0 100 1100	76	1.75	1.0	1.750
0 100 1101	77	1.8125	1.0	1.8125
0 100 1110	78	1.875	1.0	1.875
0 100 1111	79	1.9375	1.0	1.9375
0 101 0000	80	1.0	0.5	0.5
0 101 0001	81	1.0625	0.5	0.53125
0 101 0010	82	1.125	0.5	0.5625
0 101 0011	83	1.1875	0.5	0.59375
0 101 0100	84	1.25	0.5	0.625
0 101 0101	85	1.3125	0.5	0.65625
0 101 0110	86	1.375	0.5	0.6875
0 101 0111	87	1.4375	0.5	0.71875
0 101 1000	88	1.5	0.5	0.750
0 101 1001	89	1.5625	0.5	0.78125
0 101 1010	90	1.625	0.5	0.8125
0 101 1011	91	1.6875	0.5	0.84375
0 101 1100	92	1.75	0.5	0.875
0 101 1101	93	1.8125	0.5	0.90625
0 101 1110	94	1.875	0.5	0.9375
0 101 1111	95	1.9375	0.5	0.96875
0 110 0000	96	1.0	0.25	0.25
0 110 0001	97	1.0625	0.25	0.265625
0 110 0010	98	1.125	0.25	0.28125
0 110 0011	99	1.1875	0.25	0.296875
0 110 0100	100	1.25	0.25	0.3125
0 110 0101	101	1.3125	0.25	0.328125
0 110 0110	102	1.375	0.25	0.34375
0 110 0111	103	1.4375	0.25	0.359375
0 110 1000	104	1.5	0.25	0.375
0 110 1001	105	1.5625	0.25	0.390625
0 110 1010	106	1.625	0.25	0.40625
0 110 1011	107	1.6875	0.25	0.421875
0 110 1100	108	1.75	0.25	0.4375
0 110 1101	109	1.8125	0.25	0.453125
0 110 1110	110	1.875	0.25	0.46875
0 110 1111	111	1.9375	0.25	0.484375
0 111 0000	112	1.0	0.125	0.125
0 111 0001	113	1.0625	0.125	0.132812
0 111 0010	114	1.125	0.125	0.140625
0 111 0011	115	1.1875	0.125	0.148438
0 111 0100	116	1.25	0.125	0.15625
0 111 0101	117	1.3125	0.125	0.164062
0 111 0110	118	1.375	0.125	0.171875
0 111 0111	119	1.4375	0.125	0.179688
0 111 1000	120	1.5	0.125	0.1875
0 111 1001	121	1.5625	0.125	0.195312
0 111 1010	122	1.625	0.125	0.203125
0 111 1011	123	1.6875	0.125	0.210938
0 111 1100	124	1.75	0.125	0.21875
0 111 1101	125	1.8125	0.125	0.226562
0 111 1110	126	1.875	0.125	0.234375
0 111 1111	127	1.9375	0.125	0.242188

a couple of things to notice

The 8-bit reals range from 0.125=(1/2)³ to 15=(2⁴)-1. Intervals between numbers range from 0.007813 for the smallest numbers to 0.5 for the largest numbers.
The reals are in groups of 16 numbers, covering a range of 2:1. Within the group, the numbers are in order, but the adjacent groups are not ordered.
Since you can only represent some of real number space (116 numbers between 0.125 and 15.5), floating point reals can't represent all real numbers. When you enter a real number, the computer will pick the nearest one it can represent. When you ask for your old real number back again, it will be the version the computer stored, not your original number (the number will be close, but not exact). Using 8 bits for a real, it's only possible to represent numbers between 8.0 and 16.0 at intervals of 0.5. If you hand the computer the number 15.25, there is no hardware to represent the 0.25 and the computer will record the number as 15.00.
If you wanted to represent 15.25, then you would need to represent numbers between 8.0 and 16.0 at intervals of 0.25, rather than at intervals of 0.5. How many more bits would you need to represent the real number 15.25 ^[12] ?

reals have this format:

fixed number of evenly spaced numbers from 1..2 multiplied by 2<superscript>integer_exponent</superscript>.

In the 8-bit scheme here, there are 16 evenly numbers in each 2:1 range. In the range 1..2 the interval is (2-1)/16=1/16=0.0625; the numbers then are 1, 1 + 1/16.. 1 + 15/16. For the range 2..4, you multiply the base set of numbers by 2. There are still 16 evenly spaced intervals and the resulting spacing is (4-2)/16 = 1/8 = 0.125.

In our scheme, how many 8-bit reals are there between 8..16 and what is the interval ^[13] ? How many numbers are there between 0.25..0.5 and what is interval ^[14] ?

Here's a graphical form of the chart of 128 real numbers above. To show the numbers as a 2-D plot

I plotted the value of the real number on the y-axis on a logarithmic scale. y=0.0 is at -∞
I needed an x-axis with a different value for each number. The integer representation of the particular bit pattern was used. There's no obvious connecton between the integer represented by a particular bit pattern, and the real number represented by that bit pattern, but we should expect that small changes in the bit pattern will produce similar small changes in the number represented in both cases. In this case we should expect the graph to be multiple sets of dots, looking like they could be joined as lines.

Only the +ve numbers are shown. The first 16 of the possible 128 +ve numbers are not plotted (these are sacrificed to represent the 0.0 and a few other numbers, see the non-existance of 0.0). The absence of these 16 numbers is seen as the gap on the left side, where there are no red dots between 0 and the smallest number on the x-axis.

	Note
This is about not being able to read your own code 6 months after you write it: I wrote this section (on floating point numbers) about 6 months before I delivered it in class. (I initially expected I would use it in primitive data types, but deferred it in the interest in getting on with programming.) On rereading, the explanation for the gap on the left hand side, made no sense at all. I couldn't even figure out the origin of the gap from my notes. After a couple of weekends of thinking about it, I put in a new explanation, and then a week later I figured out my original explanation. It was true, but not relevant to the content.

Note

This is about not being able to read your own code 6 months after you write it:

I wrote this section (on floating point numbers) about 6 months before I delivered it in class. (I initially expected I would use it in primitive data types, but deferred it in the interest in getting on with programming.) On rereading, the explanation for the gap on the left hand side, made no sense at all. I couldn't even figure out the origin of the gap from my notes. After a couple of weekends of thinking about it, I put in a new explanation, and then a week later I figured out my original explanation. It was true, but not relevant to the content.

Figure 1. graph of the 7 rightmost bits in a floating point number represented as an integer (X axis) and real (Y axis)

graph of the 7 right bits in a floating point number represented as an integer (X axis) and real (Y axis)

The plot is a set of dots (rather than a line or set of lines), showing that you can only represent some of the real space; a few points on the infinity of points that is a line.

	Note
	End Lesson 18

The representation of the +ve reals is

1..2 * 2^exponent.

The 8-bit real numbers (from 0.25 to 4.0) on a straight line look like this (because of crowding, reals<0.25 aren't represented)

Figure 2. graph of the 8-bit real numbers from 0.25..4 plotted on a linear real number line.

graph of the 8-bit real numbers from 0.25..4 plotted on a linear real number line.
Note there are 16 evenly spaced real numbers in any 2:1 section of the real number line
(e.g. from 0.25..0.50, 0.50..1.0, 1.0..2.0, 2.0..4.0

There are 16 evenly spaced 8-bit real numbers in any 2:1 section of the real number line (e.g. from 0.25..0.50, 0.50..1.0, 1.0..2.0, 2.0..4.0). Because the length of each segment decreases by half each step going leftward (towards 0.0), the real numbers never get to 0.0 (0.0 would be 2^-∞).

Logarithms:

If you don't know about logarithms
Logarithms are the exponents when reals are represented as base^exponent.
Logarithms do for multiplication, what linear plots do for arithmetic.
Let's say you want a graph or plot to show the whole range of 8-bit reals from 0.125=2^-3 to 16=2⁴. On a linear plot most of the points would be scrunched up at the left end of the line (as happens above). If instead you converted each number to 2^x and plotted x, then the line would extend from -3 to +4, with all points being clearly separated.
If n=2^x, then x is log₂(n). So log₂(1)=0, log₂(2)=1, log₂(4)=2.
If n=2^0.5, then n²=2^0.5*2^0.5=2. Then n=sqrt(2) and 2^0.5 is sqrt(2).
If you multiply two numbers, you add their logarithms.
Logarithms commonly use base 10 (science and engineering; the log of 100 is 2), base 2 (for computing), and log_e (for mathematics, e=2.718281828459)

The 8-bit real numbers (from 0.25 to 4.0) on a logarithmic line look like this

Figure 3. graph of the 8-bit real numbers from 0.25..4 plotted on a log real number line.

graph of the 8-bit real numbers from 0.25..4 plotted on a logarithmic real number line.
Note there are 16 logarithmically spaced real numbers in any 2:1 section of the real number line
(e.g. from log(2)number =-2..-1, -1..0, 0..1, 1..2.

Note there are 16 logarithmically spaced real numbers in any 2:1 section of the real number line (e.g. from log₂(number) =-2..-1, -1..0, 0..1, 1..2. Because 0.0 is 2^-∞, 0.0 on this graph is off the left end of the line (at -∞).

How many real numbers can be represented by a 32 bit number ^[15] ? How many floating point numbers exist in the range covered by 32-bit reals ^[16] ?

1.5. The non-existance of 0.0, NaN and Infinities

You can make real numbers as big as you have bits to represent them and as small as you have bits to represent them. However you can't represent 0.0 on the simple floating point scheme shown here.

We already said that we were going to only represent normalised numbers (which start with the digit 1).
In the linear plot above, you don't get to 0.0 because the absolute size of each 2:1 range of reals halves at each step as you approach 0.0. In the log plot, 0.0 is at -∞.

However 0.0 is a useful real number. Time starts at 0.0; the displacement of a bar from it's equilibrium position is 0.0.

In the above images/graphs, there is no way to represent 0.0 - it's not part of the scheme. Since a normalised decimal number starts with 1..9 and normalised binary numbers start with 1, how do you represent 0.0? You can't. You need some ad hoc'ery (Ad hoc. http://www.wikipedia.org/wiki/Ad_hoc).

To represent 0.0, you have to make an exception like "all numbers must start with 1-9₁₀ or 1₂, unless it's 0.0, in which case it starts with 0". Designers of rational systems are dismayed when they need ad hoc rules to make their scheme work. ad hoc rules indicate that maybe the scheme isn't quite as good as they'd hoped. Exceptions slow down the computer, as rather than starting calculating straight away, the computer first has to look up a set of exceptions (using a conditional statement) and figure out which execution path to follow.

How do we represent 0.0? Notice that the bits for the exponent in the previous section can be 100 or 000; both represent 1 (2⁰ or 2^-0). 8 of our exponents are duplicates. We could handle the duplication, by having the algorithm add 1 to the exponent if it starts with 1, giving us an extra 8 numbers (numbers that are smaller by a factor of 2). However since we don't yet have the number 0.0 anywhere, the convention (in this case of a 1 byte floating point number) is to have two of the 8 numbers with the exponent bits=000 as +/- 0.0. As well we need to represent +/-∞ and NaN, so we use some of the remaining 6 numbers for those.

S  see  mmmm 
0  000  0000 +0.0
1  000  0000 -0.0 (yes there is -0.0)
0  111  0000 +infinity
1  111  0000 -infinity
*  111 !0000 NaN

NaN (not a number): In the IEEE-754 scheme, infinities and NaN are needed to handle overflow and division by 0. Before the IEEE-754 standard, there was no representation in real number space for ∞ or the result of division by zero or over/underflow. Any operation which produced these results had to be trapped by the operating system and the program would be forced to exit. Often these results came about because the algorithm was exploring a domain of the problem for which there was no solution, with division by zero only indicating that the algorithm needed to explore elsewhere. A user would quite reasonably be greatly aggrieved if their program, which had been running happily for a week, suddenly exited with an irrecoverable error, when all that had happened was that the algorithm had wandered into an area of the problem where their were no solutions. The IEEE-754 scheme allows the results of all real number operations to be represented in real number space, so all math operations produce a valid number (even if it is NaN or ∞) and the algorithm can continue. Flags are set showing whether the result came about through over/underflow or division by 0, allowing code to be written to let the algorithm recover.

NaN is also used when you must record a number, but have no data or the data is invalid: a machine may be down for service or it may not be working. You can't record 0.0 as this would indicate that you received a valid data point of 0.0. Invalid or absent data is a regular occurrence in the real world (unfortunately) and must be handled. You process the NaN value along with all the other valid data; NaN behaves like a number with all arithmetic operations (+-*/; the result of any operation with NaN always being NaN). On a graph, you plot NaN as a blank.

The number -0.0: is needed for consistency. It's all explained in "What every computer scientist should know about floating-point arithmetic" (link above). It's simplest to pretend you never heard of -0.0. Just in case you want to find out how -0.0 works...

>>> 0.0 * -1
-0.0
>>> -0.0 * 1
-0.0
>>> -0.0 * -1
0.0
>>> -(+0.0)
-0.0		#required by IEEE754
>>> -(+0)		
0		#might have expected this to be -0
>>> -0.0 + 0.0
0.0
>>> -0.0 - 0.0
-0.0
>>> -0.0 + -0.0
-0.0
>>> -0.0 - +0.0
-0.0
>>> -0.0 - -0.0
0.0
>>> 0.0 + -0.0
0.0
>>> 0.0 - +0.0
0.0
>>> 0.0 - -0.0
0.0
>>> i=-0.0
>>> if (i<0.0):
...     print "true"
... 
>>> if (i>0.0):
...     print "true"
... 
>>> if (i==0.0):
...     print "true"
... 
true		#(0.0==-0.0) is true in python
>>> if (i==-0.0):
...     print "true"
... 
true
>>> if (repr(i)=="0.0"):
...     print "true"
... 
>>> if (repr(i)=="-0.0"):
...     print "true"
... 
true

The result of the test (0.0==-0.0) is not defined in most languages. (As you will find out shortly, you can't compare reals, due to the limited precision of real numbers. You can't do it even for reals that have the exact value 0.0).

How many numbers are sacrificed in the 8bit float point representation used here, and in a 32 bit floating point representation (23bit mantissa, 8 bit exponent, 1 bit sign) to allow representation of 0.0, ∞ and NaN ^[17] ? What fraction of the numbers is sacrificed in each case ^[18] ?

1.6. Reals with a finite decimal representation, don't always have a finite binary represention

	Note
	(with suggestions from the Python Tutorial: representation error).

We've found that only a limited number of real numbers (256 in the case of 8-bit reals, 4G in the case of 32-bit reals) can be represented in a computer. There's another problem: most (almost all) of the reals that have finite representation in decimal (e.g.0.2, 0.6) don't have a finite representation in binary.

Only numbers that evenly divide (i.e. no remainder) powers of the base number (10decimal) can be represented by a finite number of digits in the decimal system. The numbers which evenly divide 10 are 2 and 5. Any number represented by (1/2)ⁿ*(1/5)^m can be represented by a finite number of digits. Any multiple of these numbers can also be represented by a finite number of digits.

In decimal notation, fractions like 1/10, 3/5, 7/20 have finite represenations

1/10=0.1
3/5=0.6
7/20=0.35

In decimal notation, fractions like 1/3, 1/7, 1/11 can't be represented by a finite number of digits.

1/3=0.33333....
1/7=0.14285714285714285714...

Do these (decimal) fractions have finite representations as decimal numbers: 1/20, 1/43, 1/50 ^[19] ?

Here's some numbers with finite represenation in decimal.

#(1/2)^3
1/8=0.125

#3/(2^3)
3/8=0.375

#note 1/80 is the same as 1/8 right shifted by one place. Do you know why?
#1/((2^4)*5)
1/80=0.0125

#1/(2*(5^2))
1/50=0.02

#7/(2*(5^2))
7/50=0.14

Using the above information, give 1/800 in decimal ^[20]

Divisors which are multiples of other primes (i.e. not 2,5) cannot be represented by a finite number of decimals. Any multiples of these numbers also cannot be represented by a finite number of digits. Here's some numbers which don't have finite represenation in decimal.

#1/(3^2)
1/9=0.11111111111111111111...

#1/(3*5)
1/15=0.06666666666666666666...

#4/(3*5)
4/15=0.26666666666666666666...

(the Sexagesimal (base 60) http://en.wikipedia.org/wiki/Sexagesimal numbering system, from the Sumerians of 2000BC, has 3 primes as factors: 2,3,5 and has a finite representation of more numbers than does the decimal or binary system. We use base 60 for hours and minutes and can have a half, third, quarter, fifth, sixth, tenth, twelfth, fifteenth, twentieth... of an hour.)

In binary, only (1/2)ⁿ can be represented by a finite number of digits. Fractions like 1/10 can't be prepresented by a finite number of digits in binary: there is no common divider between 10 (or 5) and 2.

# echo "scale=10;obase=2;ibase=10; 1/10" | bc -l
.0001100110011001100110011001100110

In the binary system, the only number that evenly divides the base (2), is 2. All other divisors (and that's a lot of them) will have to be represented by an infinite series.

Here's some finite binary numbers:

echo "obase=2;ibase=10;1/2" | bc -l
.1
echo "obase=2;ibase=10;3/2" | bc -l
1.1

#3/8 is the same as 3/2, except that it's right shifted by two places.
echo "obase=2;ibase=10;3/8" | bc -l
.011

From the above information give the binary representation of 3/32 ^[21]

Knowing the binary representation of decimal 5, give the binary representation of 5/8 ^[22]

Here's some numbers that don't have a finite represenation in binary

"obase=2;ibase=10;1/5" | bc -l
.0011001100110011001100110011001100110011001100110011001100110011001

#note 1/10 is the same as 1/5 but right shifted by one place
#(and filling in the empty spot immediately after the decimal point with a zero).
echo "obase=2;ibase=10;1/10" | bc -l
.0001100110011001100110011001100110011001100110011001100110011001100

with the above information, give the binary representation of 1.6 ^[23]

Here is 1/25 in binary

echo "obase=2;ibase=10;1/25" | bc -l
.0000101000111101011100001010001111010111000010100011110101110000101

From this information, what is the binary representatin of 1/100 ^[24]

Just as 1/3 can't be represented by a finite number of decimal digits, 1/10 can't be represented by a finite number of binary digits. Any math done with 0.1 will be done with a 32-bit truncated representation of 0.1

>>> 0.1
0.10000000000000001
>>> 0.1+0.1+0.1+0.1+0.1+0.1+0.1+0.1+0.1+0.1
0.99999999999999989
>>> 0.1*10
1.0

The problem of not being able to represent finite decimal numbers in a finite number of binary digits is independant of the language. Here's an example with awk:

echo "0.1" | awk '{printf "%30.30lf\n", $1}'
0.100000000000000005551115123126

echo "0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1" | awk '{printf "%.30lf\n", $1+$2+$3+$4+$5+$6+$7+$8+$9+$10}'
0.999999999999999888977697537484

1.7. Do not test reals for equality

Do not test if two reals are equal. Even though the two numbers (to however many places you've displayed) appear the same, the computer may tell you that they're different. Instead you should test

abs(your_number - the_computers_number) < some_small_number

For the real way to do it see ???.

Here's an example of how things don't work quite the way you'd expect.

>>> i=0.1
>>> if (i==0.1):	#is (i==0.1)?
...     print "true"
... 
true
>>> j=0
>>> for x in range(0,10):
...     j += i
... 
>>> j			#j should be 1.0, but you loose a 1 in the last place every time you add 0.1
0.99999999999999989
>>> print j		#outputs field width of 12 (and will round the number)
1.0
>>> print "%20.12f" %j	#print with explicit 12 digits after decimal point
      1.000000000000
>>> print "%20.20f" %j	#wider field
0.99999999999999988898
>>> print 1.0-j
1.11022302463e-16
>>> if (j==1.0):	#is (j==1.0)?
...     print "true"
... else:
...     print j
... 
1.0

Is (j==1.0) true? The answer is not yes or no; you aren't going to get a sensible answer. Knowing that, you should never ask such a question. In case you aren't convinced, let's see what might happen if you do.

(About 747s Airliner Takeoff Speeds, Boeing 747 http://www.aerospaceweb.org/aircraft/jetliner/b747/ Boeing 747-400 http://en.wikipedia.org/wiki/Boeing_747-400 http://www.aerospaceweb.org/question/performance/q0088.shtml). About diesel locomotives Electric Diesel Hauling II (http://www.cfr.ro/jf/engleza/0310/tehnic.htm) Steam Ranger Enthusiast Pages (http://www.steamranger.org.au/enthusiast/diesel.htm).

You've written the computer code to automatically load the Boeing 747-400ER cargo planes each night at the FedEx Memphis terminal. You have a terminal with 1500 palettes, each palette weighing 0.1Mg (about 200lbs). The max cargo payload is 139.7Mg.

It's 2am. All the packages have arrived by truck and are now sitting on palettes. Let's start loading cargo.

#! /usr/bin/python

palettes_in_terminal=[]	#make an empty list for the palettes in the terminal awaiting loading
palettes_in_747=[]	#make another empty list for the palettes loaded onto the plane
number_palettes=1500
weight_palette=0.1
weight_on_plane=0.0
max_weight_on_plane=139.7

#account for palettes in terminal
#make a list containing 1500 entries of 0.1
for i in range(0,number_palettes):
	palettes_in_terminal.append(weight_palette)
	

#load plane
#while plane isn't full and there are more palettes to load
#	take a palette out of the terminal and put it on the plane
while ((weight_on_plane != max_weight_on_plane) and (len(palettes_in_terminal) != 0)):
	#the next line is a train wreck version of this
	#moved_palette=palettes_in_terminal.pop()
	#palettes_in_747.append(moved_palette)
	palettes_in_747.append(palettes_in_terminal.pop())
	weight_on_plane += weight_palette
	#print "weight of cargo aboard %f" % weight_on_plane

print "weight on plane %f" % weight_on_plane
print "max weight on plane %f" % max_weight_on_plane
#-------------

What's this code doing? Swipe the code and run it. Is the plane safe to fly? How overloaded is it? How did the code allow the plane to become overloaded ^[25] ?

In an hour (at 3am), the plane, with a crew of 2, weighing 412,775 kg (about 400 tons, or 2-8 diesel locomotives), will be sitting at the end of the runway ready to take off, with its cargo and enough jet fuel (241,140l.) to make it across the country in time for the opening of the destination airport at 6am. (Why do the pilots have to wait till 6am, why couldn't they fly in earlier ^[26] ?) The pilot will spool up the engines, hold the plane for several seconds to check that the engines are running at full power, then let off the brakes, when the plane will start to roll. After thundering 3km down the runway, about a minute later alarms will alert the pilot that the apparently fully functional plane hasn't reached the lift off speed of 290 km/hr. The plane is too far down the runway to abort takeoff ^[27]. As far as the pilot knows, there is nothing wrong with the plane, only the speed is low. Is the indicated low speed an error and the the plane is really moving at the expected speed? Should the pilot rotate the plane and attempt lift off? He has only seconds to decide. Assuming the runway is long enough for the plane to eventually reach lift off speed, will it lift off ^[28] ? You'll read all about it in the papers next morning. In fact the pilot has no options and no decision to make. He was committed when he let the brakes off.

Why aren't there crash barriers at the end of the runway ^[29] ? There are better solutions for handling an overweight plane (see below).

The airline will blame the software company. The owners of the software business will protest that it was a problem with one of the computers that no-one could possibly have anticipated (there are people that will believe this). The american legal system will step in, the business will declare bankruptcy, the owners will have taken all their money out of the business long ago and have enough money to live very comfortably, the workers will be out of a job, and will find their pension fund unfinanced. The owners of the software business will be sued by the airline and the workers, but since the business is bankrupt, there is no money even if the suit is successful. Only the lawyers and the business owners will come out of this in good financial shape.

No-one would ever write such a piece of software. Even if you could equate reals, you wouldn't have written the code this way. Why not ^[30] ? Fix the code (you need to change the conditional statement). Here's my fixed version ^[31] .

While the algorithm I used here is completely stupid, the actual method that the FAA uses to calculate the take-off weight of a passenger plane is more stupid (and I'm not making this up): the airline just makes up a value. They guess how much the passengers weigh (and from watching how my luggage is handled at check in, they guess the weight of the baggage). The final decision is left to the pilots (at least in the case of US AirWays 5481 below), who have to decide by looking at the baggage and the passengers, whether it's safe to take off. This is the method that's used in the US. If you were a pilot, could you estimate the weight of passengers to 1% by looking out the cockpit door, down the aisles at the passengers? Pilots are trained to fly planes not to weigh passengers: scales are used to weigh planes.

Considering the consequences of attempting to take off overweight, it would be easy enough to weigh a plane before take-off, as is done for a truck or train: see Weigh Bridge (http://en.wikipedia.org/wiki/Weigh_bridge). Accuracy of current equipment is enough for the job: see Dynamass (www.pandrol.co.za/pdf/weigh_bridges.pdf) which can weigh a train moving at 35kph to 0.3% accuracy.

Planes attempt to take off overweight all the time.

The plane piloted by 7 yr old Jessica Dubroff, attempting to be the youngest girl to fly across the USA, crashed shortly after take off from Cheyenne (see Jessica Dubroff. http://en.wikipedia.org/wiki/Jessica_Dubroff). The plane was overloaded and attempted to take off in bad weather.

Most plane crashes due to overloading occur because the weight is not distributed correctly.

Air Midwest #5841 (http://en.wikipedia.org/wiki/US_Airways_Express_Flight_5481) out of Charlotte NC in Jan 2003 crashed on take off, because (among other things) the plane was overloaded by 600lbs, with the center of gravity 5% behind the allowable limit. The take-off weight was an estimate only, using the incorrect (but FAA approved) average passenger weight estimate, that hasn't been revised since 1936. The supersized passengers of the new millenium are porkers, weighing 20lbs more. The airline washed their hands of responsibility and left it to the pilots to decide whether to take off. Once in the air, the plane was tail heavy and the pilots couldn't get the nose down. The plane kept climbing till it stalled and crashed.

An Emirates plane taking off from Melbourne (http://www.smh.com.au/travel/takeoff-error-disaster-averted-20090430-aozu.html) with 257 passengers, weighed 100tonnes more than the pilots had been told. The Airbus A340-500 scraped its tail along the tarmac and grassland beyond the runway at Tullamarine on March 20, then hit airport landing lights and disabled a radio antenna before taking off. The pilots then dumped fuel over Port Phillip Bay for about 30 minutes to reduce the weight before making an emergency landing at Tullamarine.

1.8. Floating point precision: Optimisation

One would think that 32-bits is enough precision for a real, and in many cases (just adding or multiplying a couple of numbers) it is. However most of the time spent on computers is optimising multi-parameter models. Much scientific and engineering computing is devoted to modelling (making a mathematical representation for testing - weather, manufactured parts, bridges, electrical networks). You are often optimising a result with respect to a whole lot of parameters; e.g.

In the scientific world you might be finding a set of parameters which best fit data.
In business/industry, you might be finding the minimum cost for making a product, by running simulations changing the size or quality of ingredients, the location of the plant(s) making it, the cost of making it, changing various suppliers, the public's likely reaction to a better built (but more expensive) device.
You might try to optimise making cookies (biscuits outside USA). Your constraints are that the average cookie must have 10 chocolate chips, or 10 raisins, and you need to minimise the cost of ingredients, labour and heat for cooking. As in many optimisations, you can't program in all parameters and you find that the computer optimised result is unusable. In this case your cookies may have an unacceptable taste or texture. Next time you ask the computer for all recipes that will have a total cost below a certain level. You find that the cost is too high to make any money, so you decide to become a computer programmer instead.

Optimisation is similar an ant's method for finding the top of a hill. All the ant knows is its position and it's altitude. It moves in some direction and takes note of whether it has gone uphill or downhill. When it can no longer move and gain altitude, the ant concludes that it is at the top of the hill and has finished its work. The area around the optimum is relatively flat and the ant may travel a long way in the x,y direction (or whatever the parameters are), before arriving at the optimum. Successful optimisation relies on the ant being able to discern a slope in the hill.

Because of the coarseness of floating point numbers, the hill instead of being smooth, is a series of slabs (with consecutive slabs of different thickness). An ant standing next to the edge of a slab, sees no change in altitude if it moves away from the edge, but if it had moved towards the edge would have seen a large change in altitude. If the precision of your computer's floating point calculations is too coarse, the computer will not find the optimum. In computerese - your computation will not converge to the correct answer and you won't be able to tell that you've got the wrong solution, or it will not converge at all - the ant will wander around a flat slab on the side of the hill, never realising that it's not at the top.

If the computation changes from 32 to 64-bit, how many slabs will now be in the interval originally covered by a 32-bit slab ^[32] ?

Because you can't rely on a 32-bit computation converging or getting the correct answer, scientists and engineers have been using 64-bit (or greater) computers for modelling and optimisation for the past 30yrs. You should not use 32 bit floating point for any work that involves repetitive calculation of small differences.

How do you know that your answer is correct and not an artefact of the low precision computations you're using? The only sure way of doing this is to go to the next highest precision and see if you get the same answer. This solution may require recoding the problem, or using hardware with more precision (which is likely unavailable, you'll already be using the most expensive computer you can afford). Programmers test their algorithm with test data for which they have known answers. If you make small changes in the data, you should get small changes in the answers from the computer. If the answers leap around, you know that the algorithm did not converge.

1.9. Representing money

Money (e.g. $10.50) in a bank account looks like a real, but there is no such thing as 0.1c. Money needs to be represented accurately in a computer in multiples of 1c. For this computers use binary coded decimal (http://en.wikipedia.org/wiki/Binary-coded_decimal). This is an integer representation, based on 1c. You can learn about on this your own if you need it.

	Note
	End Lesson 19

2. Problems resulting from a finite representation of reals

FIXME. This has not been presented to the students and has several problems. I'll come back to this later.

2.1. Machine's epsilon

The smallest number representable by a 64-bit real is 2.2250738585072020*10^-308. Its value is mainly limited by the number of digits you can store in the exponent (multiplied by a normalised mantissa whose value has the range 1.0..9.999).

Another small number of interest to computer programmers is the value of the smallest change you can make to a number (let's say number==1.0). This is determined by the number of bits in the mantissa. If the 23 bit mantissa for 1.0 on a 32 bit machine (being normalised) is 000000000000000000000000, then the smallest change will be to 000000000000000000000001 (only the last bit changes). The number represented by this difference is called the machine's epsilon (http://en.wikipedia.org/wiki/Machine_epsilon). The number is called the "machine's epsilon" because before IEEE 754, every machine had a different way of representing reals, and each machine would have its own epsilon. Now with most machines (and except for specialised embedded machines) being IEEE 754 compliant, the machine's epsilon will be the IEEE 754 epsilon. Let's ask our machine what its epsilon is.

>>> epsilon = 1		#has a simple representation in binary
>>> while (epsilon+1>1):
...     epsilon = 0.5*epsilon	#still has a simple representation in binary 
...     print epsilon
... 
0.5
0.25
0.125
.
.
7.1054273576e-15
3.5527136788e-15
1.7763568394e-15
8.881784197e-16
4.4408920985e-16
2.22044604925e-16
1.11022302463e-16

If to the computer, 1+epsilon==1, the loop exits. The machine's epsilon for 64-bit IEEE 754 reals is 1.11022302463e-16 (for comparison, the smallest representable real is 2.2250738585072020*10^-308). (note: epsilon is a lot bigger than the smallest number representable by the machine's reals. Do you understand the difference between these two numbers?)

What this section is saying is that the computer can't differentiate reals that are more closely spaced than the spacing represented by the last bit in the mantissa.

2.2. Changing the exponent doesn't loose precision (for small changes in exponent)

After shifting x=1/5 one place to the right (in the previous section) to give the result x=1/10, note that the right hand "1" falls off the end and is lost. In the following sequence of calculations (right shift, losing the last 1, then left shift, which will insert a 0, in the place which previously had a 1)

x=0.2
y=x/2	#y=0.1
z=y*2	#z=0.2

would the resulting z have the same value as the original x? If you were using fixed point operations, you would get a different value. However a computer uses a mantissa and an exponent to represent a real, so dividing by a power of 2 just changes the exponent and no precision is lost. The mantissa will be unchanged following the above operations, and z would have the same value as x.

>>> x=0.2
>>> y=x/2
>>> z=y*2
>>> print z-x
0.0

Taking this example to the extreme...

>>> n_128=2**128
>>> print n_128
340282366920938463463374607431768211456
>>> x=0.1
>>> y=x/n_128                                  
>>> z=y*n_128
>>> print "%30.30g, %30.30g, %30.30g" %(x,y,z)
0.100000000000000005551115123126, 2.93873587705571893305445304302e-40, 0.100000000000000005551115123126

2.3. Changing the mantissa can produce errors (for small changes in mantissa)

What if you did an operation that changed the mantissa (ignoring whether or not the exponent was changed)? We need a number that doesn't have a finite representation in binary.

>>> x=0.1		#x doesn't have a finite represenation in binary
>>> n_prime=149.0	#1/n_prime doesn't have finite representation in binary either
>>> y=x/n_prime
>>> z=y*n_prime
>>> print "%30.30g, %30.30g, %30.30g, %30.30g " %(x,y,z,z-x)
0.100000000000000005551115123126, 0.000671140939597315508424735241988, 0.100000000000000005551115123126, 0

I would have thought that z and x would be different, but you get back the same number. I puzzled about this, till a co-worker Ed Anderson suggested the following as a method to find out what was going on: try a whole bunch of numbers. In particular, try some irrational numbers, like square roots (irrational numbers cannot be represented by a finite number of digits, in any base system). All square roots, except integers, are irrational (for the range 1..10, only 1,4 and 9 would have rational square roots). So use all the square roots from 1..some_large_number for both x and the divisor/multiplier. Ed's initial test was a 100x100 grid. Here's Ed's code for a 64*64 grid. There is a blank when the numbers are identical, and a letter/number if they were different.

#! /usr/bin/python

# fp_ed.py
# Joseph Mack (C) 2008, GPL v3.
# With suggestions from Ed Anderson

# takes a number, divides it by a 2nd, stores the result, then multiplies the result the 2nd number.
# in a perfect world, the result should be the original answer.
# due to limited precision of computer reals, sometimes a different answer will be returned.

from math import sqrt;
number_that_are_different=0;

end = 100;
for j in range(1,end):
	output_string = "" # can't output strings without \n in python and can't supress a blank either
	output_string += str('%3d ' %j);        #put blank after string
	for i in range(1,end):
	        x = sqrt(i);
	        y = x/j;
	        z = y*j;
	        if (x-z == 0.0):
	                output_string += " "

	        else:
	                #print x-z ;
	                output_string += "x";
	                number_that_are_different += 1;


	print output_string

print number_that_are_different
#- fp_ed.py -----------------------

Here's the output

	    1         2         3         4         5         6    
  1                                                                
  2                                                                
  3             x2                      33       3     w   3  3    
  4                                                                
  5  1     2     2               333     w                 3  3    
  6             x2                      33       3     w   3  3    
  7              x                                    w w3 w  3    
  8                                                                
  9                     w         w                                
 10  1     2     2               333     w                 3  3    
 11          x  2                     w 3w3w ww  w     3w 3  3     
 12             x2                      33       3     w   3  3    
 13             xx                               ww  w w   w w   3w
 14              x                                    w w3 w  3    
 15               x                                          w3w   
 16                                                                
 17                               w                                
 18                     w         w                                
 19          x                        w w ww                       
 20  1     2     2               333     w                 3  3    
 21                              w    3 w   3    3        3   3    
 22          x  2                     w 3w3w ww  w     3w 3  3     
 23               x                      3  3   3 w          www3  
 24             x2                      33       3     w   3  3    
 25          x   x                         w        w   w  w       
 26             xx                               ww  w w   w w   3w
 27              xx                              w   ww w  w3  w   
 28              x                                    w w3 w  3    
 29              2x                                        33w w3w 
 30               x                                          w3w   
 31                                                               w
 32                                                                
 33                                                                
 34                               w                                
 35       2                    3  3  3w                       3    
 36                     w         w                                
 37   y  2     x          33      w     w          w               
 38          x                        w w ww                       
 39       2                   33         3       33             w  
 40  1     2     2               333     w                 3  3    
 41              x              3 3 3               w      w    3  
 42                              w    3 w   3    3        3   3    
 43  1y    2 x2x  2              w 3  w    w   3   ww   w 3 w 33   
 44          x  2                     w 3w3w ww  w     3w 3  3     
 45                                 3                w  w    3   3 
 46               x                      3  3   3 w          www3  
 47         2x2 2                     w333 w   3     w33 w    w    
 48             x2                      33       3     w   3  3    
 49 0  1           2                        w w       3   3  3w    
 50          x   x                         w        w   w  w       
 51           2  x                          3ww33   w   3  w       
 52             xx                               ww  w w   w w   3w
 53              22                              3   3  3  3   3 3 
 54              xx                              w   ww w  w3  w   
 55   y        xx                                  w3 3w    3 3 w33
 56              x                                    w w3 w  3    
 57              2                                      3ww3     w 
 58              2x                                        33w w3w 
 59              2x                                       33w3 w   
 60               x                                          w3w   
 61                                                             w w
 62                                                               w
 63

What if we do the multiplication first

	    1         2         3         4         5         6    
  1                                                                
  2                                                                
  3   y        x  x                  w    w       3w  3  3     w w 
  4                                                                
  5           2  x                           3 3w       w  ww     3
  6   y        x  x                  w    w       3w  3  3     w w 
  7           x                              w w3           3      
  8                                                                
  9             x x                                    w 3w    w3 3
 10           2  x                           3 3w       w  ww     3
 11                                  3                w   3   w w3 
 12   y        x  x                  w    w       3w  3  3     w w 
 13  y     x 2  2               3 3w  w    3     3     3  w     3  
 14           x                              w w3           3      
 15     x              w                                  3     w  
 16                                                                
 17                                                          w3  w3
 18             x x                                    w 3w    w3 3
 19              22                               w   3 3w 3w3 3 w3
 20           2  x                           3 3w       w  ww     3
 21   y      2 x2                        3w3 w  3  w   3    3w    3
 22                                  3                w   3   w w3 
 23           2  2                  33         3        3  3 33    
 24   y        x  x                  w    w       3w  3  3     w w 
 25                                 ww  ww                   w     
 26  y     x 2  2               3 3w  w    3     3     3  w     3  
 27   1   2    2          w  w 3  w 3     3      w 3         3     
 28           x                              w w3           3      
 29                     w           w                        w3    
 30     x              w                                  3     w  
 31                                 3                        3     
 32                                                                
 33                                                             w3 
 34                                                          w3  w3
 35                                                      3  3      
 36             x x                                    w 3w    w3 3
 37             2                                      3w3    w  3 
 38              22                               w   3 3w 3w3 3 w3
 39   y       xx 2                             ww3 w  w  ww3  3 3  
 40           2  x                           3 3w       w  ww     3
 41             x x                       3   w 333    w  3    www 
 42   y      2 x2                        3w3 w  3  w   3    3w    3
 43               2                           3  ww   3 w      3   
 44                                  3                w   3   w w3 
 45   y      2xx                      w  w 3  ww w w               
 46           2  2                  33         3        3  3 33    
 47  y     x    2x                 w         33        3 3www   3  
 48   y        x  x                  w    w       3w  3  3     w w 
 49          x                   3    3    w ww         3w  3      
 50                                 ww  ww                   w     
 51   1   2   22               3        3     33   3          w    
 52  y     x 2  2               3 3w  w    3     3     3  w     3  
 53              2x              33           w   3        3   w   
 54   1   2    2          w  w 3  w 3     3      w 3         3     
 55      x               w w             3    3 w                3 
 56           x                              w w3           3      
 57                     3                     w       3 w          
 58                     w           w                        w3    
 59          x                        3    w  3          w         
 60     x              w                                  3     w  
 61                                           w                    
 62                                 3                        3     
 63                                           3

The figures are different. It seems that multiplication and division isn't always commutative.

Surprisingly (to me) there is a pattern, but then I didn't understand what was going on so any result would have been a surprise. "x"s appear in blocks on boundaries which are multiples of 8,16,32 and 64 (increasing the value of end shows that the block pattern continues).

With a slight modification of the code, the differences, if not 0.0, were all seen to be (1.11022302463e-16)*2ⁿ.

Is there anything in particular about 1.11022302463e-16? It turns out that it's the machine's epsilon (http://en.wikipedia.org/wiki/Machine_epsilon) (also see machine's epsilon).

What is 1.11022302463e-16 in binary?

dennis:~# echo "obase=2;1.11022302463*10^-16" | bc -l
.0000000000000000000000000000000000000000000000000000011111111111111

If you round this up by one bit, it's 2^-53. A 64 bit double has a 53 bit mantissa, indicating that the epsilon calculation is using a 64 bit real.

From the wiki entry, a single precision (32bit) real has a 24 bit mantissa. 1.0 then has an exponent of 0 and mantissa of 1.00000000000000000000000₂. The next biggest number has an exponent of 0 and a mantissa of 1.00000000000000000000001₂. The difference is 0.00000000000000000000001₂ or 2^-23.

Using IEEE 754 Converter (http://www.h-schmidt.net/FloatApplet/IEEE754.html), to do the conversion to a 32-bit real, I found

1.11022302463e-16=0x2500000=00100101000000000000000000000000.

Breaking the binary representation into sign, exponent and mantissa (with the implied 1 for a normalised mantissa) gives

1.11022302463e-16 = 0 01001010 (1)00000000000000000000000
    decimal         s   exp             mantissa

Calculating this all gives: The sign bit: (bit 32, the leftmost bit) is 0 (positive).

The exponent: the IEEE-754 real math convention doesn't use the signed int representation. a 32 bit real, the exponent is a 9 bit integer (here 001001010). Floating point math is usually done on floating point hardware (or emulated on the the integer registers of the CPU, but the incremental cost of adding a floating point processor/unit (FPU) is small, so all x86 chips, starting with the 486, have a floating point processor built-in to the CPU chip). Since real math doesn't use the CPU's registers (each composed of 2,4,8 or 16 bytes), but instead uses the math co-processor's registers, then real math doesn't have to use the CPU's conventions (e.g. a signed int) to process a 9 bit int.

For 32 bit floating point, the convention is to subtract 127 from the value of the exponent. is is called "excess-127" (excess-1023 for 64 bit) and allows the exponent and the mantissa, when taken together, to vary monotonically (see IEEE-754 References http://babbage.cs.qc.edu/courses/cs341/IEEE-754references.html). (What this gets you, I don't know.) The 8-bit scheme 8-bit set of normalised reals I made up for this class is not monotonic in this respect (see the table and the figure, where there are jumps in value every 16 numbers).

In the current example, the exponent is -53 (from the next 9 bits). To get the actual value of the exponent, you subtract 127.

echo "obase=10; ibase=2; 001001010" | bc
74
then subtract 127
74-127=-53

The FPU int then is an unsigned int, from which is subtracted 127, so that -ve numbers are possible (it's called signed magnitude). Here are a few comparison numbers (assuming an 8 bit int):

bits		value, signed int	value, unsigned int	value, signed magnitude
00000000	0 			0 			-127
00000001	1			1			-126
01111111	127			127			0
10000000	-127			128			1
11111111	-1			255			128

Since all of the evaluations can done with the same speed, with hardware designed to do these calculations, there is nothing to pick between them. Why IEEE-754 uses their particular representation of the exponent, rather than the signed int representation, I don't know.

The mantissa is 1.0 (after adding the implied 1 at the beginning of 23 zeroes).

Multiplying this all out gives the original number.

echo "1*2^-53" | bc -l
.00000000000000011102
#just checking that I can count 0s
 echo "1*2^-53/10^-16" | bc -l
1.11020000000000000000

The following table shows the number of times an entry occured for a grid size. The 2nd entry says that in an 16*16 grid, that 206 numbers were not changed and that -(1.11022302463e-16)*2³ was returned 9 times, +(1.11022302463e-16)*2*2 was returned 2 times and that +(1.11022302463e-16)*2*3 was returned 8 times. In the 16*16 grid 15*15=225 numbers are tested (=9+206+2+8).

grid size\n	-8 -7 -6 -5 -4 -3 -2 -1 0 1 2 3 4 5 6 7 8

8*8		0 0 0 0 0 0 0 0 0 0 0 0 48 0 1 0 0 0 0 0 0 0 0 0 0
16*16		0 0 0 0 0 0 0 0 0 9 0 0 206 0 2 8 0 0 0 0 0 0 0 0 0
32*32		0 0 0 0 0 0 0 0 6 22 0 0 911 0 3 13 6 0 0 0 0 0 0 0 0
64*64		0 0 0 0 0 0 0 0 163 44 3 0 3566 1 6 33 153 0 0 0 0 0 0 0 0
128*128		0 0 0 0 0 0 0 99 346 82 10 0 15062 4 14 80 349 83 0 0 0 0 0 0 0
256*256		0 0 0 0 0 0 0 3085 723 163 24 0 56990 14 39 174 750 3063 0 0 0 0 0 0 0
512*512		0 0 0 0 0 0 1679 6215 1505 355 56 0 241228 35 87 381 1570 6360 1650 0 0 0 0 0 0
1024*1024	0 0 0 0 0 0 50447 12720 3101 752 121 0 911185 91 196 799 3209 12807 51101 0 0 0 0 0 0
2048*2048	0 0 0 0 0 26941 101676 25687 6251 1500 250 0 3863963 230 452 1669 6449 25610 102498 27033 0 0 0 0 0
4096*4096	0 0 0 0 0 822872 204267 51363 12576 3001 525 0 14575322 504 969 3366 12919 51254 205106 824981 0 0 0 0 0	
8192*8192	0 0 0 0 432519 1647991 409597 102641 25183 5958 1057 0 61830620 1105 2100 6885 25992 103109 411309 1652318 434097 0 0 0 0
16384*16384	0 0 0 0 13222697 3300406 821156 205169 50246 11827 2083 0 233145705 2345 4366 13903 52320 206826 824241 3306218 13233181 0 0 0 0 
32768*32768     0 0 0 6931293 26447787 6604323 1645180 409899 100085 23405 4099 0 989292832 4872 9022 28195 105323 414810 1651250 6614158 26462945 6926811 0 0 0
65536*65536	0 0 0 211603215 52895451 13214901 3295139 820314 200000 46646 8163 0 3730616481 9841 18209 56688 211726 830962 3306011 13229067 52912230 211561181 0 0 0

131072*131072
262144*262144
524288*524288
1048576*1048576	

8*8		            0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.980 0.0 0.020 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
16*16		        0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.040 0.0   0.0 0.916 0.0 0.009 0.036 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
32*32		      0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.006 0.023 0.0   0.0 0.948 0.0 0.003 0.014 0.006 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
64*64		      0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.041 0.011 0.001 0.0 0.898 0.0 0.002 0.008 0.039 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
128*128		    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.006 0.021 0.005 0.001 0.0 0.934 0.0 0.001 0.005 0.022 0.005 0.0 0.0 0.0 0.0 0.0 0.0 0.0
512*512		  0.0 0.0 0.0 0.0 0.0 0.0 0.006 0.024 0.006 0.001 0.0   0.0 0.924 0.0 0.0   0.001 0.006 0.024 0.006 0.0 0.0 0.0 0.0 0.0 0.0
1024*1024	  0.0 0.0 0.0 0.0 0.0 0.0 0.048 0.012 0.003 0.001 0.0   0.0 0.871 0.0 0.0   0.001 0.003 0.012 0.049 0.0 0.0 0.0 0.0 0.0 0.0
2048*2048	0.0 0.0 0.0 0.0 0.0 0.006 0.024 0.006 0.001 0.0   0.0   0.0 0.922 0.0 0.0   0.0   0.002 0.006 0.024 0.006 0.0 0.0 0.0 0.0 0.0
4096*4096	0.0 0.0 0.0 0.0 0.0 0.049 0.012 0.003 0.001 0.0   0.0   0.0 0.869 0.0 0.0   0.0   0.001 0.003 0.012 0.049 0.0 0.0 0.0 0.0 0.0
8192*8192	0.0 0.0 0.0 0.0 0.006 0.025 0.006 0.002 0.0 0.0 0.0 0.0 0.922 0.0 0.0 0.0 0.0 0.002 0.006 0.025 0.006 0.0 0.0 0.0 0.0 
16384*16384	0.0 0.0 0.0 0.0 0.049 0.012 0.003 0.001 0.0 0.0 0.0 0.0 0.869 0.0 0.0 0.0 0.0 0.001 0.003 0.012 0.049 0.0 0.0 0.0 0.0
32768*32768     0.0 0.0 0.0 0.006 0.025 0.006 0.002 0.0 0.0 0.0 0.0 0.0 0.921 0.0 0.0 0.0 0.0 0.0 0.002 0.006 0.025 0.006 0.0 0.0 0.0
65536*65536	0.0 0.0 0.0 0.049 0.012 0.003 0.001 0.0 0.0 0.0 0.0 0.0 0.869 0.0 0.0 0.0 0.0 0.0 0.001 0.003 0.012 0.049 0.0 0.0 0.0

131072*131072
262144*262144
524288*524288
1048576*1048576

After the multiplication/division, how many numbers are returned with a different value? For any particular j you have a 50% chance of losing a 0 off the right end, in which case you'll get back the original answer. So one possibility is 50% will be different. How about the divisor?; you only have a 50% chance of getting that right when you drop off extra digits. With two ways of getting it wrong, you only have 25% chance of getting it right, so 75% will be different. Here's the results.

grid_size	number_mul/div	number_changed	%changed	largest_difference
10*10		10^2		      2	 	  2		+    4 * 1.11e-16
100*100		10^4		    713		  7		+-  16 * 1.11e-16
1000*1000	10^6		 127980		 13		+-  32 * 1.11e-16	
10000*10000     10^8		8060303           8		+- 128 * 1.11e-16	
100000*100000	10^10	      927798399		  9		+- 512 * 1.11e-16

FIX ME.

Let's look at this example with the IEEE-754 converter referred to above

>>> sqrt(14)
3.7416573867739413
>>> x=sqrt(14)
>>> y=x/3
>>> z=y*3
>>> z-x
-4.4408920985006262e-16

#from IEEE 754 converter
x = 3.7416573867739413 = 2 * 1.8708287 = 0 10000000 (1) 11011110111011101010001 
3.0 = 2 * 1.5                          = 0 10000000 (1) 10000000000000000000000
#from python
y = x/3.0 = 1.247219128924647          = 0 01111111 (1) 00111111010010011100000

#from bc
echo "obase=10;ibase=2; 1.11011110111011101010001/1.10000000000000000000000" | bc -l
1.24721916516621907552
echo  "obase=2;ibase=2; 1.11011110111011101010001/1.10000000000000000000000" | bc -l
1.0011111101001001110000010101010101010101010101010101010101010101010

#from python
z = 3.7416573867739409

from bc

echo "obase=10;ibase=2; 1.0011111101001001110000010101010101010101010101010101010101010101010 *11" | bc -l
3.7416574954986572265489474728439311945749068399891257286071777343750 

 
echo "obase=2;ibase=2; 1.0011111101001001110000010101010101010101010101010101010101010101010 *11" | bc -l
11.101111011101110101000011111111111111111111111111111111111111111111

The result after right shifting by one (to give an exponent of 1), and dropping the implied leading 1 from the mantissa give
a mantissa of 

11011110111011101010000	#new mantissa (after dropping extra digits)
11011110111011101010001 #original mantissa
	              1 #difference

since the exponent is 1 (a factor of 2) then the difference is

3. Two Algorithms: square root, numerical integration

In the following sections we're going to code up two algorithms.

The first, the square root, converges quickly, but like most quickly converging algorithms, can't be generalised to any other calculation: it only works for square roots. The computing landscape is sparsely populated with these quickly converging algorithms and the discovery of such algorithms are isolated events requiring the efforts or inspiration of some genius. Understanding these algorithms doesn't give you any help discovering new algorithms, but some people turn out to be better producing new algorithms than others, and these people understand all the known algorithms. Computer programmers spend a much effort looking for fast algorithms and the discovery and the discoverers are celebrated in the same way as the discovery of new continent is celebrated by the general populace.

One of the better known algorithm people is Donald Knuth (http://en.wikipedia.org/wiki/Donald_Knuth), who is famous for offering cash rewards ($2.56, a hexadecimal dollar according to Knuth) for finding mistakes in his code. Finding a mistake by Knuth is so rare that Knuth's checks (cheques) are one of computerdom's most prized trophies. People prefer to prominently display Knuth's check (cheque) on their wall for the gasps and admiration of vistors, than to deposit the money in their bank account.
Knuth is one of the people who have pushed the concept of mathematically provably correct code. While it's obvious to everyone that a plane manufacturer must show that their plane will fly, the equivalent proof of usability in computing is difficult to demonstrate. Computer code has sufficient traps, bugs and unpredictable responses to out of bounds input, that in the absence of tools to show that code does exactly what it's supposed to do and nothing else, computer programmers rarely attempt to prove that their code is correct. (One of the diagnostic features of computer programs is that that don't work well.) It seems that proving code correct is not generally possible. Knuth once said "Beware of bugs in the above code; I have only proved it correct, not tried it." Computer programmers have given up on provably correct code and currently have adopted the less rigourous, but hopefully achievable goal of of Programming by Contract (http://en.wikipedia.org/wiki/Design_by_contract). Programming by contract is used in Eiffel and Ada (and less so in C++). In languages which don't have Programming by Contract features, programmers are encouraged to put in equivalent statements (even if only in comments) in their code. The built in Programming by Contract features of Ada make it the language of choice for applications where safety is paramount (e.g. Air Traffic Control).

The discovery of the Fast Fourier Transform (http://en.wikipedia.org/wiki/Fast_Fourier_transform) (FFT) and the Monte Carlo Method (http://en.wikipedia.org/wiki/Monte_Carlo_method), both of which came out of people who worked on the Manhattan project, have revolutionised signal processing and statistics respectively, creating whole industries which would not have been possible otherwise.

The second algorithm, which calculates the value of π, uses an easy to understand and generalisable method: numerical integration. Numerical integration is a brute force method that converges so slowly that it is only usable for a small number of significant figures. Often this is enough for computing purposes (you have to accept the long time, whether you like it or not). If you're doing a calculation for which no-one has discovered a quickly converging algorithm, numerical integration will often be your only choice. While in principle you could do a few square roots by hand, numerical integration, being a brute force method, requires computers (or supercomputers) to be usable at all.

4. Calculating square root

Square roots are often used in computer to calculate distances (from Pythagorus). In graphics (e.g. gaming) the inverse square root is used to normalise 3-D vectors (i.e. make sure that between the r,g,b channels, you aren't outputting more than 1 unit of light). Unlike the distance calculation, the inverse sqrt() doesn't have to be all that accurate. There is a fast inverse sqrt() for graphics, see Origin of Quake3's fast inverse square root (http://www.beyond3d.com/content/articles/8/).

There are manuscripts from 1650BC referring to Egyptian methods of extracting square roots (see wiki square root. http://en.wikipedia.org/wiki/Square_root#History).

There is a sqrt() in the python math module. It calls the C math libraries, which call the hardware sqrt() function on your computer's floating point unit (fpu). The sqrt() then is being done in special hardware. You aren't going to get any faster than that. Despite the ready availability of a fast accurate sqrt(), we're going to use one of the simpler sqrt() methods, the Babylonian method, an algorithm well suited to hand calculation.

4.1. Python reals are 64 bit

First we need to know the precision (number of significant figures) that python uses to represent real numbers. (What is the precision for integers ^[33] ?) Remember that a 32 bit computer can manipulate 2³² integers (you do remember the value of 2³²? - it's about 10 decimal digits). We'll learn more about Real Numbers later, but for the moment we won't go far wrong if we assume that a 32 bit computer is capable of representing 2³² real numbers. While there is only one integer between 0..2, there can be an infinite number of reals 0..2 and a computer can't hope to represent all of them. The computer finds the nearest one it can represent, whether it's correct or not, and say "it's near enough". If we go to 64-bits, we'll get a more accurate representation, but it still won't be exact.

Here's some code to show that floating point numbers longer than 12 digits are truncated i.e. any following numbers are garbage.

>>> for x in range (1,15):
...     print  x, 1+10**-x
... 
1 1.1
2 1.01
3 1.001
4 1.0001
5 1.00001
6 1.000001
7 1.0000001
8 1.00000001
9 1.000000001
10 1.0000000001
11 1.00000000001
12 1.0
13 1.0
14 1.0

10¹²12 is 2^? ^[34] This is not enough accuracy for a 64 bit number and is too much for a 32 bit number. We should suspect that we've goofed. The above truncation turns out to be due to limitations of the formatting. print uses str() for outputting, which only outputs 12 digits after the decimal point. If we increase the number of digits displayed, you can get about 15 significant figures before getting garbage. (Note: we are finding the machine's epsilon).

>>> for x in range (1,20):
...     print "%2d, %10.40f" %( x, 1+10**-x)
... 
 1, 1.1000000000000000888178419700125232338905
 2, 1.0100000000000000088817841970012523233891
 3, 1.0009999999999998898658759571844711899757
 4, 1.0000999999999999889865875957184471189976
 5, 1.0000100000000000655120402370812371373177
 6, 1.0000009999999999177333620536956004798412
 7, 1.0000001000000000583867176828789524734020
 8, 1.0000000099999999392252902907785028219223
 9, 1.0000000010000000827403709990903735160828
10, 1.0000000001000000082740370999090373516083
11, 1.0000000000100000008274037099909037351608
12, 1.0000000000010000889005823410116136074066
13, 1.0000000000000999200722162640886381268501
14, 1.0000000000000099920072216264088638126850
15, 1.0000000000000011102230246251565404236317
16, 1.0000000000000000000000000000000000000000
17, 1.0000000000000000000000000000000000000000
18, 1.0000000000000000000000000000000000000000
19, 1.0000000000000000000000000000000000000000

of course we hoping to get

 1, 1.1000000000000000000000000000000000000000
 2, 1.0100000000000000000000000000000000000000
 3, 1.0010000000000000000000000000000000000000
 4, 1.0001000000000000000000000000000000000000
 5, 1.0000100000000000000000000000000000000000
 6, 1.0000010000000000000000000000000000000000
 7, 1.0000001000000000000000000000000000000000
 8, 1.0000000100000000000000000000000000000000
 9, 1.0000000010000000000000000000000000000000
10, 1.0000000001000000000000000000000000000000
11, 1.0000000000100000000000000000000000000000
12, 1.0000000000010000000000000000000000000000
13, 1.0000000000001000000000000000000000000000
14, 1.0000000000000100000000000000000000000000
15, 1.0000000000000010000000000000000000000000
16, 1.0000000000000001000000000000000000000000
17, 1.0000000000000000100000000000000000000000
18, 1.0000000000000000010000000000000000000000
19, 1.0000000000000000001000000000000000000000

Any calculation of reals using python, will only be accurate to about the 15th place. 15 decimal places requires 50 bits (here "l" is log)

#  echo "15*l(10)/l(2)" | bc -l
49.82892142331043521840

The IEEE double precision (64 bit) representation of real numbers uses 52 bits (close enough to the 50 bits we see above) to represent the mantissa, 11 bits for the exponent and 1 bit for the sign - see IEEE 754-1985 (http://en.wikipedia.org/wiki/IEEE_floating_point_standard#Double-precision_64_bit). It looks like python is using double precision (64 bit) to represent real numbers.

4.2. Babylonian Algorithm

You start with your number>1, and an estimate of the square root (Many algorithms require you to give it starting value(s). Often these can be almost anything.) The square root is going to be somewhere between the number and 1, so make the estimate the arithmetic mean (the average).

estimate = (number + 1.0)/2.0

You use this estimate to get a better estimate

new_estimate = (estimate + number/estimate)/2.0

How does this work? Let's find the square root of 9.

number = 9
estimate = (9+1)/2.0=5.0
new_estimate = (5 + 9/5))/2.0 = 3.4

We started with estimate=5.0 and get estimate=3.4, which is closer to the answer. The estimate gets closer to the known answer with each iteration (the algorithm is said to converge). When we eventually get to our answer (at least within the precision of a real), there will be no change when we plug in the estimate.

estimate = 3
new_estimate = (3 + 9/3))/2.0 = 3

We keep iterating till the difference between the estimate and then new_estimate is acceptable. (for the proper way to do real comparisons see ???) We could use division to test if they are close enough; this will be slow, but will work no matter what size the number is. We could use substraction, which is fast, but the remainder will be large for large numbers and small for small numbers. Ideally we'd like to get the maximum precision possible (double precision), but for the exercise, we'll substract and use a relatively large difference to terminate the algorithm.

4.3. Code for Babylonian Algorithm for Square Root

Before we write the code that iterates, we need to initialise some numbers

the value of the difference which we'll use to detect convergeance. Call it "error" and give it a value of 10^-12
the number for which we need the square root (let's start with 9 and call it "number")
an estimate of the square root (call it "estimate")
a better estimate of the square root, based on the first estimate (call it "new_estimate").

Write some code that does these 4 steps and prints out the value of number and the estimate. Here's my code ^[35] .

We are about to enter a loop. We don't know how many iterations are going to be needed, but we do have a test to show that we're done. Should we use a for loop or a while loop ^[36] ?

At the top of the while loop we test if we should enter the loop again. What's the test that we've found the square root and how do we use it in the while line? Here's my code ^[37]

Assume that the loop is entered (i.e. we don't have the square root yet): Before you entered the loop, when you initialised a few variables, you generated an estimate of the square root and from it a better estimate. You should use exactly the same steps to inside the while loop, to generate a better value of estimate. Heres what happens inside the loop:

You have two estimates for the square root: estimate and a better one new_estimate. Now that you're in the loop, you don't need the current value of estimate any more. You should update its value (with what?).
You've also decided that new_estimate is not good enough (the test for entering the loop showed it to be too different from estimate). You should get a better value for new_estimate.

Write code to do this, and inside the loop, print out the value for estimate. Here's my code ^[38] .

Note

In a for loop, the code for the calculations is all inside the loop.

In a while loop, the code for the calculations must also be ahead of the loop (where it runs once), so that the first time the conditional test in the while statement is run, all values will be known.

Where to have the print() statement: For the final code the print() statement will be commented out, so this decision isn't all that important. You can print out values at any step in the while loop and everyone reading your code will know what you've done. The location given is slightly better, as you're printing out the value that caused the loop to be executed. After the loop exits, the last value of new_estimate will be printed by a statement in the following code. If you'd printed after the calculation of new_estimate, then the value for the first iteration of the loop would not be printed.

The only modification left is to print out the final value for the square root ^[39] .

4.4. Order of the Algorithm for calculating the square root

The order of an algorithm e.g. O(n), O(nlogn) describes the rate at which algorithm's execution time increases with the amount of data that it processes. The calculation of the square root has no data (that can be varied), so we can't use this measure. Instead the measure of worth is the increase in running time needed to increase accuracy by one decimal digit (i.e. a factor of 10).

With a print() statement in the loop, you can see the number of iterations before the answer converges (arrives within the acceptable limit).

On Paul Hsieh's Square Root page (http://www.azillionmonkeys.com/qed/sqroot.html) you'll find that you only need 6 iterations to get 53₂ places (a 64 bit double precision real number) i.e. every step gets you 9 bits, a factor of 2⁹=512, closer to your answer. This is pretty fast.

	Note
	End Lesson 20

4.5. Benchmarking (speed) comparision: Babylonian sqrt() compared to built-in math library sqrt()

At this stage we have working code for the Babylonian sqrt(). The algorithm converges quickly (only 6 iterations are required to calculate to an accuracy of 53 bits). However saying that it "converges quickly" doesn't quantify anything (it's a phrase from marketing). You need to be able to say how quickly, using a measurement that is meaningful to just about anybody (e.g. your grandma), not just computer programmers. An obvious first test is to compare the Babylonian sqrt() to the best sqrt() you have (the math module sqrt()). If it's better, we'll do more tests, otherwise we'll just drop it.

To do speed tests on code, we need to be able to measure the execution time of a program. Python has a timer that measures wall clock time in seconds since 1 Jan 1970 (the start of unix time). (see Time access and conversions http://docs.python.org/lib/module-time.html). time() is a (apparently 32-bit) float. On some systems the resolution is only 1sec (programs that run for less than 1 sec will appear to run for 0 time). If the computer is otherwise idle, the wall clock time and the execution time of your program are nearly the same. Here's some code to measure time.

from time import time

n = 1000000

start = time()
#do something n times
finish = time()

print "execution time per iteration = %10.10f", (finish-start)/n

A single calculation of a square root is a little fast for most timers. We usually do a large number of iterations in a loop and then divide by the number of iterations. This brings its own problems: we have to figure out how much time is taken up by the looping code, but we can handle that too (see below). To time your square root code, first make your Babylonian sqrt() into a function. Copy your Babylonian sqrt code to a new file square_root_compare.py

name the function babylonian_sqrt(), giving it one parameter (my_number).
comment out the line "number = 9" (since the value of number will now come from the parameter) and change all instances of "number" to "my_number".
comment out the print functions (so it will run faster).
return new_guess.

You now have a function. From here on, this function is just a block of code, that you'll do timing runs on. You won't be changing anything inside of it; it's just a black box.

You need a main() to call the function. Write a two-line main() to test your function - assign the variable number the value 1000, and then call babylonian_sqrt() with the parameter number. Test that your code runs.

To do timing runs on the function, main() needs to generate a whole lot different numbers to feed to your the function. Why don't you just feed the same number over and over? Compilers and interpreters are supposed to be smart. If they see you do the same operation over and over on the same data, the compiler/interpreter will recognise what's going on and will just return the same answer, without going through the bother of calculating it over and over. You don't know if your python interpreter is this smart (how would you find out ^[40] ), but you should expect that one day you'll bump into a smart compiler/interpreter, and your timing runs will be meaningless. You may as well start writing your benchmarking code correctly from the start.

One way of producing a whole lot of different numbers is to use all the numbers produced by range(1,largest_number), where largest_number is say 10,000.

create a variable largest_number to be the number of times you'll call your function. Give it a value of 10000.
write a loop to create a number (call it number) that with each iteration assumes the next value from the list 1..largest_number. Do you use a for or a while loop ^[41] ?
inside the loop, call babylonian_sqrt() with number as a parameter.

You now have a block of code in main() that calls the function largest_number of times. You now want to time that block

use an import statement to import the function time() from the module time (confusing huh).
write code that assigns times to the variables start and finish immediately before the block and immediately after it.
Use the values of largest_number, start, finish to print out a line saying
largest number 1000 time/iteration 0.0000341

Here's my code ^[42] . This outputs the time required to calculate the square roots of all the numbers in the range 1..1000. This piece of timing code is a self contained block. You aren't going to change it (except for the value of largest_number)

The average time for the sqrt() calculation depends on the number of times you looped (it shouldn't, but it does). You'll find that the time/iteration depends on whether you did it 10,100 or 1,000,000 times. You need to find a range of largest_number for which the time/iteration is the fastest availalbe and relatively constant. To handle this, you need to call the timing loop with a range of values of largest_number. To do this, put the timing loop above, inside another loop, which feeds different values of largest_number into the timing loop. Putting one loop inside another is called "nesting loops".

To do this, you'll need to use a list. Go to the section explaining ??? and then return here.

Create a list named largest_numbers[] holding the values 1,10,100....1000000. Read these values out, using a loop (for or while?) assigning the values one at a time to largest_number. The timing loop, being nested inside the new outer loop, will now have to be indented one step to allow it to be parsed correctly. Here's what the nested loops look like

largest_numbers=[1,10,100,1000,10000,100000,1000000]	#list of values for outer loop
for largest_number in largest_numbers:		#new outer loop
	for number in range(0,largest_number):	#original loop
	        babylonian_sqrt(number)

Here's my code ^[43] .

Here's my output

#  ./square_root_compare.py
largest number          1 time/iteration 0.0001909
largest number         10 time/iteration 0.0000348
largest number        100 time/iteration 0.0000295
largest number       1000 time/iteration 0.0000341
largest number      10000 time/iteration 0.0000398
largest number     100000 time/iteration 0.0000474
largest number    1000000 time/iteration 0.0000512

Look to see if there's a range of values for largest_number where time for doing a sqrt is independant of the loop size (it should be the fastest time). For largest_number having a value of 10 or more, the Babylonian square root function takes 30-50usec on my 1GHz machine. One one student's machine (MacOS), the time/interation kept dropping with in the range values for largest_number above, so I sent him off to do runs with increasing values of largest_number. Another student's machine (WinXP) reported 0 time for the first 3 runs, and presumably has a time() function with a resolution of only 1 sec.

	Note
	End Lesson 21

Next we want to compare the speed of the Babylonian sqrt() to the built-in sqrt() (in the math module).

Add another timing loop for the built-in sqrt() at the same level as the timing loop for the babylonian_sqrt(). Add an extra section to the outer loop in main() which calls sqrt() from the math module.
add timers to this new block to measure the execution time of the built-in sqrt(). you can reuse the names start, finish since the babylonian_sqrt() timing loop has exited and is not using these variables anymore.
add a print line to output the times.

Here's the new calling code in main() ^[44] .

Calling range() and setting up the looping consumes cpu cycles, so we need to subtract that time from our results. Code up a 3rd timing loop that doesn't do anything, so later you can subtract that time out. Here's a loop that does nothing (you can't have an empty line, the parser gets confused).

	for number in range(0,largest_number):
		pass

Here's my code ^[45] . Here's the output

# ./square_root_compare.py
babylonian_sqrt: largest number          1 time/iteration 0.0001969337
library_sqrt:    largest number          1 time/iteration 0.0000259876
empty_loop:      largest number          1 time/iteration 0.0000119209

babylonian_sqrt: largest number         10 time/iteration 0.0000346184
library_sqrt:    largest number         10 time/iteration 0.0000043154
empty_loop:      largest number         10 time/iteration 0.0000017881

babylonian_sqrt: largest number        100 time/iteration 0.0000295210
library_sqrt:    largest number        100 time/iteration 0.0000031996
empty_loop:      largest number        100 time/iteration 0.0000008512

babylonian_sqrt: largest number       1000 time/iteration 0.0000341501
library_sqrt:    largest number       1000 time/iteration 0.0000030880
empty_loop:      largest number       1000 time/iteration 0.0000007720

babylonian_sqrt: largest number      10000 time/iteration 0.0000398650
library_sqrt:    largest number      10000 time/iteration 0.0000031442
empty_loop:      largest number      10000 time/iteration 0.0000007799

babylonian_sqrt: largest number     100000 time/iteration 0.0000459829
library_sqrt:    largest number     100000 time/iteration 0.0000033149
empty_loop:      largest number     100000 time/iteration 0.0000009894

babylonian_sqrt: largest number    1000000 time/iteration 0.0000511363
library_sqrt:    largest number    1000000 time/iteration 0.0000033719
empty_loop:      largest number    1000000 time/iteration 0.0000010112

The time for the empty loop is 0.7-1.0usec. Here's the timing results, after subtracting the empty loop time.

Table 1. square root (time, usec): Babylonian and math library code

Babylonian	math library
30-50	2

The library sqrt() is 15-25 times faster than the Babylonian sqrt(). Even though the babylonian_sqrt() only needs 5 iterations to get the answer, the library routine gets the answer in the time of 1/3-1/5th of a loop.

4.6. Running time comparision: Python/C

Python being an interpretted language will be slower than a compiled language. As well, python is object oriented, making it slower yet. C is one of the languages of choice when speed is required. Here's the same program as above (comparing the Babylonian and math library sqrt()) written in C (it uses the low resolution timers). With your current programming knowledge, you should be able to read this code and know what it's doing. ^[46] . Here's the output

time/iteration: babylonian         10000000  0.00000122000000000000
time/iteration: library            10000000  0.00000010800000000000
time/iteration: empty              10000000  0.00000000700000000000

time/iteration: babylonian        100000000  0.00000135340000000000
time/iteration: library           100000000  0.00000010790000000000
time/iteration: empty             100000000  0.00000000680000000000

time/iteration: babylonian       1000000000  0.00000147792000000000
time/iteration: library          1000000000  0.00000010824000000000
time/iteration: empty            1000000000  0.00000000691000000000

Here's the comparison of the timing results, subtracting 7nsec for the empty loop (compared to 700nsec for the empty loop in python).

Table 2. square root (time, usec): Babylonian and math library code

Language	Babylonian	math library
python	30-50	2
C	1.3	0.11
ratio time Python/C	25-40	20

C is 20-40 times faster than python, at least for doing sqrt(). So why does anyone code in python? If speed is your main requirement you don't. However it takes longer to write C than it does to write python and people who don't program a lot, can write working python without too much trouble, but may not be able to get a C program to work. If you only require a few square roots and it takes 5 mins to write it in python and 10 mins to write it in C, then you do it in python. If you need to do 10¹² square roots, then you do it in C.

If your code only takes 1 min to run and you're only going to run it a couple of times, then you really don't care if it takes 10 times longer. However if you're going to be running the 1 minute program hundreds of times, or your program will take a week to run, then a factor of 10 in speed is important.

	Note
	End Lesson 22

4.7. Presentation: the Babylonian Square Root

You've just completed a piece of code and have a nice story to tell. For the rest of your life, whether you're a coder or something else, you're going to have to sell what you've done to someone else. Here you have a piece of code.

It's simple to tell other coders of your work. You just walk them through the code. They'll just go "yup, yup... yup. Nice story." and go back to their work. A little harder is a technically trained person, who isn't neccessarily a python coder.

Here's what you do.

You give an introduction.
This will give enough background information so that people will know why you did the work. In this case you wanted to code up the Babylonian square root, to see how it worked, and then you tested its speed compared to the fastest available square root, the math library square root.
You need to give people time to adjust from what they were doing before they came into the room, so you can add a few things that don't require much brainwork; e.g. you could tell them where Babylon is (50miles south of Baghdad), what else Babylon is famous for (besides the square root algorithm) - the hanging gardens of Babylon, one of the 7 wonders of the ancient world; and the Laws of Hammurabi.
You give your talk.
In this case you will explain what the code does from an overall point of view, then explain each piece. You can explain the code in the order you wrote it, which makes it simple to understand
- explain the Babylonian sqrt() function
- explain how you time code
- explain why you used the library square root as a control (it's the fastest square root you know of, otherwise it wouldn't be in the library).
- why you timed the empty loop
You tell them what you accomplished/concluded (that the library square root is "x" times faster than the Babylonian sqrt()).

This is usually phrased as

you tell them what you're going to tell them
you tell them
you tell them what you told them

It's a conceptually simple story with a conclusion that everyone will agree on. A presentation on this piece of code shouldn't take more than 5-10mins.

	Note
	End Lesson 23

5. Numerical Integration

We're going to calculate a value for π by numerical integration. To do this we're going to calculate the area of a quadrant of a circle. The ancient Greeks called finding the area "squaring", as their method used geometric transformations of the object into squares, for which they knew the area. The Greeks knew how to square a rectangle, triangle (and hence a parallelogram). They did not know how to square a circle (the problem being that π is irrational). In modern time (1500's or so), people found ways of squaring the area under parabolas (and other polynomial curves), but had difficulty with squaring the area under a hyperbola. All these problems were swept away by the invention of calculus by Newton and Leibnitz. Calculus still used the method of squaring, and cut areas into infinitesimal rectangular slices and then summed them. This was not as rigorous as the Greek method, but gave useful (and correct) answers. People now accept that the Greek method could not be extended further and that the methods of calculus are acceptable. Since the Greek methods don't work for most problems, the term "squaring" has been replaced with "finding the area".

numerical: we're going to cut a quarter circle into rectangular strips and measure the area of each strip (ignoring that one end of the strip is part of the arc of a circle, rather than a straight line). It's numerical because we're not going to derive a formula for π - we're going to calculate the value by brute force summing of areas (this is what computers do well).
integration: if we integrate a line (e.g. a quarter circle), this means that we calculate the area under the line. If we integrate the plot (graph) of a car's speed as a function of time, we'll get the distance the car travels. If we integrate under a surface (e.g. a hemisphere) we get the volume under the surface.

What's the difference between Numerical Integration and regular Integration

Integration: This is a method from calculus for finding the area under a curve. It gives exact answers for a limited number of cases.
Numerical Integration: This is a computer based method for finding the area under a curve. It works for any curve, but is limited in accuracy by the number of bits used to represent the numbers and by rounding errors.

Some interesting info on π

"The History of Pi", Petr Beckmann (available in at least two editions, 1971 Golem Press, 1993 Barnes and Noble). One of my favourite books.
values of π through history (http://mathforum.org/isaac/problems/pi2.html).
Ludolph van Ceulen and Pi (http://mathforum.org/library/drmath/view/52555.html)

5.1. Calculating π by Numerical Integration

Let's look at the part of the circle in the first quadrant. This diagram shows details for the point (x,y)=(0.6,0.8).

	Note
	The python code for the diagrams used in this section is here ^[47] . Since this code will only be run once, there's no attempt to make it fast.

Figure 4. Pythagorean formula for circumference of a circle

Pythagorean formula for circumference of a circle

From Pythagorus' theorem, we know that the distance from the center at (0,0), to any point (x,y) on the circumference of a circle of radius=1, is 1. The square of the distance to the center of the circle (the hypoteneus) is equal to the sum of the squares of the two sides, the lengths of which are given by (x,y). Thus we know that the locus of the circumference of a circle of radius 1 is

x*x + y*y = 1*1

Note

locus: the path traced out by a point moving according to a law. Here's some examples.

The circumference of a circle is the locus of a point which moves at constant distance from the center.

A straight line is the locus of a point that moves along the path, which is the shortest distance between two points.

An ellipse is the locus of a point which moves so that the sum of the distances to two points (the two focii) is constant. (The planets move around the sun in paths which are ellipses. The sun is one of the focii.)

	Note
	etymology: focus == fireplace

In the diagram above if x=0.6, then y=0.8. Let's confirm Pythagorus.

x^2=0.36, y^2=0.64; x^2+y^2=1.

Rearranging the formula for the (locus of the) circumference, gives y (the height or ordinate) for any x (the abscissa).

x^2 + y^2 = 1^2
y^2 = 1-x^2
y = sqrt(1-x*x)

i.e. for any x on the circumference, we can calculate y. e.g. x=0.6: what is y?

y = sqrt(1-x*x)
  = sqrt(1-0.36)
  = sqrt(0.64)
  = 0.8

If x=0.707, what is y ^[48] .

The point on the circle with y=0.5: what angle (by eyeball, or by marking off lengths around the circumference) does the line joining this point to the center, make with the x axis ^[49] .

Similarly we can calculate x knowing y.

x = sqrt(1-y*y)

The value of π is known to many more decimal places than we can ever use on a 32 bit computer, so there's no need to calculate it again. Instead let's assume that either we don't know its value, or that we want to do the numerical integration on a problem whose answer is known, to check that we understand numerical integration.

The area of a circle is A=pi*r², giving the area of the quadrant as pi/4. Knowing the locus of the circumference (we have a formula that gives y for any x), we will numerically calculate the area of the circle, thus giving us an estimate of the value of π.

If we divide the x axis into say 10 intervals, we can make these intervals the base of columns, whose tops intersect the highest part of the circle in that slice (which is on the left side of each slice).

Figure 5. Upper Bound of area under slices of a circle

Upper Bound of area under slices of a circle

When we add these slices together, we'll get an area that is greater than π; i.e. we will have calculated an upper bound for π.

Here's the set of columns that intersect the lowest part of the circle in each interval (here the lowest part of the circle is on the right side of the slice).

Figure 6. Lower Bounds of area under slices of a circle

Lower Bounds of area under slices of a circle

We can calculate the area of the two sets of columns. If we sum the sets of upper bound columns, we'll get an estimate which is guaranteed to be more than pi/4 and for the set of lower bound colums, a number guaranteed to be less than pi/4.

Note

We could have picked the point in the middle of the interval to calculate the area. The answer will be more accurate, but now we don't know how accurate (we don't even know if it's more or less than π). The advantage of the method used here is that we have an upper and lower bound for π, and so we know that the value of π is in this range.

We could have used tighter bounds - lower bound by constructing a straight line joining the left and right end of each interval (giving a trapezoid), - upper bound by making a line tangent to the circle. (This is more bother than it's worth, and are left as an exercise for the student.)

If we progressively decrease the size of the interval (by changing from 10 intervals, to 100 intervals, to 1000 intervals..) the approximation to a circle by the columns will get better and better giving us progressively better estimates of pi/4. Here's the diagram with 100 slices.

Figure 7. Lower Bounds of area under 100 slices of a circle

Lower Bounds of area under 100 slices of a circle

We have a problem similar to the fencepost error problem: how many heights do we need to calculate if we want to calculate both the upper and lower bounds for 10 intervals. ^[50] ?

Here's code to generate the heights of the required number of columns (you'll need sqrt(), from the math module). I ran this code in immediate mode, but you can write a file called slice_heights.py if you like.

>>> from math import sqrt
>>> for x in range(0,11):
...     print x, sqrt(1-1.0*x*x/100)
... 
0 1.0
1 0.994987437107
2 0.979795897113
3 0.953939201417
4 0.916515138991
5 0.866025403784
6 0.8
7 0.714142842854
8 0.6
9 0.435889894354
10 0.0

Why did I use 11 as the 2nd argument to range() ^[51] ? What is the "1.0" doing in "1.0*x*x" ^[52] ?

	Note
	End Lesson 24

Note

Python is a scripting language, rather than a general purpose language. Python can only use integers for loop variables. To feed real values to a loop, python requires a construct like

#num_intervals, interval are integers
num_intervals=1000
for i in range(1,num_intervals):
	real_number=i*1.0/num_intervals
	function(real_number)

where x,num_intervals make the real real_number. It's not immediately obvious that the calculations are ranging over values 0.0..1.0 (or more likely start..end).

In most languages, real numbers can be used as loop variables, and can use the construct

#interval, start, end, x are reals
interval = 0.001
for (x = start; x < end; x += interval)
	function(x)

Here it's clear that x is a real in the range start..end.

Write code pi_lower.py to calculate the lower bound for π using 10 intervals. Use variable names num_intervals, height, area for the number of intervals, height of each column, and for the cumulative area. Start with a loop just calculating the height of each slice, printing the loop index and the height each time. For the print statement use something like

I've been using the single letter variable 'x' as the loop parameter thus far, since it's the loop parameter conveying the position on the x-axis. However loop parameters (in python) are integers, while the x position is a real (from 0.0..1.0). Usually loop parameters (which are integers) are just counters and are given simple variable names 'i,j,k,l...', which are easy to spot in code as they are so short. In this case the loop parameter is the number of the interval (which goes from 0..num_intervals). I tried writing the code with the name "interval" mixed in with "num_intervals" and it was hard to read. Instead I will use 'i'.

print "%d, %3.5f" %(x, height)

When that's working, calculate the area of each slice and add it to the variable area. The area of each rectangle is height*base. You've just calculated height; what is the width of the base in terms of variables already in the code ^[53] ?

At the end, print out the lower bound for π with a line like

print "lower bound of pi %3.5f" %(area*4)

Here's my code for pi_lower.py ^[54] and here's my output.

0, 0.99499, 0.09950
1, 0.97980, 0.19748
2, 0.95394, 0.29287
3, 0.91652, 0.38452
4, 0.86603, 0.47113
5, 0.80000, 0.55113
6, 0.71414, 0.62254
7, 0.60000, 0.68254
8, 0.43589, 0.72613
9, 0.00000, 0.72613
lower bound of pi 2.90452

Do the same for the upper bound of π writing a file pi_upper.py and using variables num_intervals, Height, Area (note variable names for the upper bounds start with uppercase, to differentiate them from the lower bounds variables). Here's my code ^[55] and here's my output.

0, 1.00000, 0.10000
1, 0.99499, 0.19950
2, 0.97980, 0.29748
3, 0.95394, 0.39287
4, 0.91652, 0.48452
5, 0.86603, 0.57113
6, 0.80000, 0.65113
7, 0.71414, 0.72254
8, 0.60000, 0.78254
9, 0.43589, 0.82613
upper bound for pi 3.30452

The two bounds (upper and lower, from the output of the two programs) show 2.9<pi<3.3 which agrees with the known value of π.

The two pieces of code look quite similar. Also note some of the numbers in the outputs are the same (how many are the same ^[56] ?) We should check for code duplication (in case we only need one piece of code).

Figure 8. Upper and Lower Bounds of area under a circle

Upper and Lower Bounds of area under a circle

Looking at the diagram above which shows upper and lower bounds together, we see the following

The height of the lower bound in one slice is the same as the upper bound for the next slice to the right.
The difference between the upper and lower bounds (the series of white rectangles with the arc of the circumference going from the bottom right corner to the upper left corner), when added together, is the same as the height of the left most slice.

Redrawing the diagram, shifting the lower bounds slices one interval to the right shows

Figure 9. Upper and shifted Lower Bounds of area under a circle

Upper and shifted Lower Bounds of area under a circle

This shows that the upper and lower bounds only differ by the area of the left most slice. This means only one loop of code is needed to calculate both bounds. Look in the output of pi_lower.py and pi_upper.py for the 9 numbers in common.

The duplication arises because the lower bound for one interval is the upper bound for the next interval and we only need to calculate it once. The first interval for the upper bound and the last interval for the lower bound are unique to each bound and will have to be calculated separately.

	Note
	This is a general phenomenon: When calculating a value by determining an upper and lower bound, if the curve is monotonic, you should expect to find values that are used for both the upper and lower bound.

Write code (call the file pi_2.py - there is no pi.py anymore) to calculate the area common to both bounds (i.e. except for the two end pieces) in one loop. Use x for the loop control variable (the slice number), h for the height of each slice and a to accumulate the common area. In the loop output the variables for each iteration with a line like

	print "%10d %10.20f %10.20f" %(x, h, a)

After exiting the loop, add the two end areas, one for the lower bound and one for the upper bound to give area and Area and output your estimates of the lower and upper bounds for π with a line like

print "pi upper bound %10.20f, lower bound %10.20f" %(Area*4, area*4)

Here's the code ^[57] and here's the output

pip:# ./pi_2.py
	 1 0.99498743710661996520 0.09949874371066200207
	 2 0.97979589711327119694 0.19747833342198911621
	 3 0.95393920141694565906 0.29287225356368368212
	 4 0.91651513899116798800 0.38452376746280048092
	 5 0.86602540378443859659 0.47112630784124431838
	 6 0.80000000000000004441 0.55112630784124427841
	 7 0.71414284285428497601 0.62254059212667278711
	 8 0.59999999999999997780 0.68254059212667272938
	 9 0.43588989435406727546 0.72612958156207940696
pi upper bound 3.304518, lower bound 2.904518

giving the same result: 2.9 < pi < 3.3

To get a more accurate result, we now increase the number of intervals, thus making the slices smaller and reducing the jaggyness of the approximation to the circle.

	Note
	End Lesson 25

5.2. an estimate of running time (back of the envelope calculation)

Note

Engineers take pride in being able to do what is called a back of the envelope calculation. While it may take weeks of computer time to get an accurate answer, an engineer should be able to get an approximate answer using only the back of an envelope to do the calculation (extra points are added if you can do it in the elevator between floors). As part of the calculation, you also need to be able determine the accuracy of your answer (is it correct within a factor of 100,10,2, 10%?) or state whether it's an upper or lower bound. If the answer is within reasonable bounds, then it's worth spending a few months and a lot of money to find the real answer.

e.g. how long does it take to get from Earth to Mars using the minimum amount of fuel. It's going to be somewhere between the half the time of Earth's orbit and half the time of Mar's orbit (both planets orbit using no fuel at all), i.e. between 6 months and a year - let's make it 9 months. The answer is 8.6 months (Earth-Mars Hohmann trajectory http://www.newmars.com/wiki/index.php/Hohmann_trajectory).

	Note
	The exact answers, the Hohmann transfer orbit time between planets (see Gravitational Slingshot http://en.wikipedia.org/wiki/Gravitational_slingshot) were in Clarke's tables (along with log and trig tables, and of course the properties of elements) which all high school students in my time (1960's) carried to science class. I remember looking for the element with the highest density (which I remember as being Osmium).

Before we change the number of intervals to some large number (like 10⁹), we need some idea of the time this will take. We could change intervals to 10,100... and measure the time for the runs (the quite valid experimental approach) and extrapolate to 10⁹ intervals. Another approach is to do a back of the envelope calculation of the amount of time we'll need. We don't need it to be accurate - we just need to know whether the run will take a second, an hour or a year, to see if it's feasible to run the code with intervals=10⁹. Lets say we use 1G intervals. The calculation goes like this

our computer has a clock rate of 1GHz, meaning that it can do 1G operations/sec (approximately).
the for loop is executed once for each interval. Let's say there are 100 operations/loop (rough guess, we could be out by a factor of 10, but that's close enough for the moment)

A 1GHz computer then will take about 100 secs for num_intervals=10⁹. If you have a 128MHz computer, expect the run to be 8 times longer (800secs=13mins). In this case, you may not be able to do a run with num_intervals=10⁹ in class time, but you should be able to do a run with num_intervals=10⁸.

Do multiple runs of pi_2.py increasing num_intervals by a factor of 10-100 each time, noticing the changes in the upper and lower bounds for π. (Comment out the print() statement inside the loop or it will take forever.)

5.3. range()/xrange() and out-of-memory problem

	Note
	This is a problem with how I wrote the python code. It has nothing to do with numerical integration, but you're going to run into problems like this and you need to know how to get out of them. So we're going to take a little detour to figure this one out.

For a large enough value of num_intervals (try 10⁸ or 10⁹, depending on your computer), the program exits immediately with a MemoryError: the machine doesn't have enough memory to run the program. When you have a problem with code, that you've thought about for a while and can't solve, you don't just dump the whole program (with thousands of lines of code) in someone's lap and ask them what's wrong. Instead you pare down the code to the simplest piece of code that will demonstrate the problem and then ask them to look it over. Here's a simplified version of the problem showing the immediate exit:

>>> num_intervals=100000000
>>> height=0
>>> for interval in range(0,num_intervals):
...     height+=interval
... 
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
MemoryError

Note

If I'd gone one step further to

>>> for interval in range (0,100000000):
...     print interval
... 
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
MemoryError

I probably would have asked myself why range failed for a large enough argument, and figured it out myself. But I didn't and I had to go ask for advice on a python mailing list.

With a smaller value for num_intervals (try a factor of 10 smaller), this code will now run, but take up much of the machine's memory (to see memory usage, run the program top). I thought that the code was only creating a handful of variables (x, height, num_intervals). In fact range() creates a list (which I should have known) with num_intervals number of values of x, using up all your memory. In all other languages, the code equivalent to

for i in range(0,num_intervals):

calculates one new number each iteration, and not a list at the start, so you don't have this problem. You don't need a list, you only need to create one value of x for each iteration.

While I was waiting for an answer on the python mailing list, I changed the for loop to a while loop. Here's the equivalent while loop

>>> num_intervals=100000000
>>> height=0
>>> interval=0
>>> while (interval<num_intervals):
...     height+=interval
...     interval+=1
...

which works fine for large numbers of iterations (there doesn't need to be anything in particular inside the loop to demonstrate that a while loop can handle a large number of iterations).

Back to range(): Not only do you run into problems if you create a list longer than your memory can hold, but even if you have infinite memory, there is another upper limit to range(): you can only create a list of length=2³¹

>>> interval=range(0,10000000000)	#list 10^10
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
OverflowError: range() result has too many items

>>> interval=range(0,10000000000,4)	#list of every 4th number of 10^10 numbers = 2.5*10^9
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
OverflowError: range() result has too many items
>>> 

>>> interval=range(0,10000000000,5)	#list of every 5th number of 10^10 numbers = 2*10^9
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
MemoryError

Why did the error change from OverflowError with a list of length 2.5*10⁹ to MemoryError with a list of length 2.0*10⁹?

OverflowError indicates that the size of the primitive datatype has been exceeded. MemoryError indicates that the machine doesn't have enough memory to do that operation (i.e. your machine has less than 2G memory) - python will allow a list of this length, but your machine doesn't have enough memory to allocate a list of this size.

What primitive data type is being used to hold the length of lists in python ^[58] ?

If the transition from MemoryError to OverflowError had happened between 4.0*10⁹ and 4.5*10⁹, what primitive datatype would python have been using ^[59] ?

It turns out on a 64 bit system you can still only create a list of length 2³¹.

The python way of overcoming the memory problem of range() is to use xrange() which produces only one value of the range at a time (xrange() produces objects, one at a time, while range() produces a list). Change your pi_2.py to use xrange() and rerun it with a large value of num_intervals. Here's my fixed version of pi_2.py called pi_3.py. The only change to the code is changing range() to xrange() ^[60] .

	Note
	Use `xrange()` whenever you are creating a list of size comparable to the memory in your machine.

5.4. Optimising the calculation

When coding your goals (in order) are

get the code to run
get the code to run correctly
get the code to run fast

The current code runs correctly, but it runs slowly. Your first attempt at coding up a problem is always like this: you (and everyone else involved) wants to see if the problem can be handled at all. The process of changing code so that it still runs correctly, but now runs quickly, is called "optimising the code".

If you have a piece of software running on a $1M cluster of computers and you can speed it up by 10%, you just saved your business $100k. The business can afford to pay a programmer $100k for a year's work to speed up the program by only 10%. Optimising code is tedious and difficult and you can't always tell ahead of time how successful you'll be. People who optimise code can get paid a lot of money. On the other hand, the people with the money can't tell whether the code needs to be optimised (they think if it works at all, then it's finished) and can't tell a person who can optimise, from one who can't.

If you're selling the software to someone else, then the cost of the extra hardware, needed to run the unoptimised software, is borne by the purchaser and not you, so software companies (unless they have competition) have no incentive to optimise their code. Software companies, just like many businesses, will externalise their costs whenever they can (see Externality http://en.wikipedia.org/wiki/Externality, and Cost_externalizing http://en.wikipedia.org/wiki/Cost_externalizing). With little competition in the software world, there's a lot of unoptimised and badly written code in circulation.

The place to look for optimising, is the lines of code where most of the execution time is spent. If you double the speed of the code where only 1% of the execution time occurs, the code will now take 99.5% of the time, and no-one will notice the difference. It's better to tackle the part of the code where 99% of the time is spent. Doubling the speed there will halve the run time. Places in the code where a lot of time is spent are loops that are run many times (or millions of times). In pi_3.py the program is only a loop and it's running 10⁹ times.

Let's look at the loop in pi_3.py.

Pre-calculate constants:

How many times in the loop is num_intervals*num_intervals calculated? How many times do you need to do it ^[61] ?

We won't modify pi_3.py till we've figured out the optimisations we need. Instead we'll use this simple code (call it test_optimisation.py) to demonstrate (and time) optimisation. Remember that the timing code measures elapsed (wall clock) time rather than cpu time (i.e. if your machine is busy doing something else, the time won't be valid). If your computer is about 200MHz, run the program with num_intervals=10⁵; if you have a 1GHz machine, use num_intervals=10⁶.

#! /usr/bin/python
#test_optimisation.py
from time import time
num_intervals=100000
sum = 0
print "                                   iterations          sum   iterations/sec"

start = time()
for i in xrange(0,num_intervals):
     sum+=(1.0*i*i)/(num_intervals*num_intervals)

finish=time()

print "unoptimised code:                  %10d %10.10f %10.5f" %(num_intervals, sum, num_intervals/(finish-start))

Why do I print out the results num_intervals, sum at the end when I only need to print out the speed? It's to show that the code, whose speed I'm measuring, is doing what it's supposed to be doing. What if my loop was coded incorrectly and was doing integer, rather than real, division? As a check, I should at least get the expected value for sum in all runs.

Add another copy of the loop below the unoptimised loop (including the timing and print commands), recoded so that num_intervals*num_intervals is only calculated once.

	Note
	When you modify code, comment out the old code, rather than deleting it. Later when you're done and the code is working properly, you can delete the code you don't want. Right now you don't know which code you're going to keep.

Here's my new version of the loop (with the original code commented out in the revised version of the loop). This is a stand-alone piece of code. You will just be adding the loop part to test_optimisation.py. ^[62] .

Initially I handled the different versions of the loop by re-editing the loop code, but it quickly became cumbersome and error prone. Instead, I spliced the different versions of loop code into one file and ran all versions of the loop together in one file. Here's what my code looked like at this stage ^[63] .

Note
from Joe: you can skip this part (it shows that division by reals which are powers of 2, doesn't use right shifting).
Long division is slow - pt1 (can we right shift instead?).
We've been using powers of 10 for the value of num_intervals. We don't particularly need a power of ten, any large enough number would do. Dividing by powers of 10 requires long division, a slow process. Integer division by powers of 2 uses a right shift, a much faster process. If real division is like integer division, then if we made num_intervals a power of 2, the computer could just right shift the number. In your test_optimisations.py make a 3rd copy of your loop, using the new value of num_intervals
If your computer is about 200MHz, run the program with num_intervals=10⁵ and num_intervals=2¹⁷ (both have a value of about 100,000). If you have a 1GHz machine, use num_intervals=10⁶ and 2²⁰.
There's no change in speed, whether you're dividing by powers of 2 or powers of 10. While integer division by powers of 2 uses right shifting, division by reals uses long division, no matter what the divisor (the chances of a real being a power of 2 is small enough that python - and most languages - just has one set of code for division).
Long division is slow - pt2 (but multiplication is fast).
Why is multiplication fast, but division slow? Division on computers uses long division, the method used to do division by hand. If you divide a 32 bit number by a 32 bit number, you'll need 32 steps (approximately), all of which must be done in order (serially). Multiplication of the same numbers (after appropriate shifting) is 32 independant additions, all of which can be done at the same time (in parallel). If you're going to use the same divisor several times, you should instead multiply by the reciprocal.
Rather than dividing by num_intervals_squared once for each iteration, we should multiply by the reciprocal. Make another copy of your loop, this time multiplying by the reciprocal of num_intervals_squared. Here's my stand-alone version of the code (add the loop, timing commands and print statements to the end of test_optimisations.py) ^[64] . Do you see any change in speed?

Here's my test_optimisation.py ^[65] and here's the output run on a 200MHz machine

pip:# ./test_optimisation.py
	                           iterations          sum   iterations/sec
unoptimised code:                     1000000 0.0000010000    38134.77894
precalculate num_intervals^2:         1000000 0.0000010000    63311.38723
multiple by reciprocal:               1000000 0.0000010000    88156.25065

showing a 2.5-fold increase in speed.

	Note
	End Lesson 26

5.5. Safe Programming with normalised (reduced) numbers

Don't alter your pi_3.py; you aren't finished with test_optimisations.py yet.

Computers have a large but limited range of numbers they can manipulate. We're calculating π, whose value lies well in the range of numbers that computers can represent. However numerical methods, like we are using here, use values that are very small (and the squares of these numbers are even smaller). Other codes can divide small numbers by large numbers, large numbers by small numbers, square numbers that are smaller than the square root of the smallest representable numbers, or square numbers that the are bigger than the square root of the biggest representable numbers. You'll wind up with 0 instead of a small number or NaN (not a number, the computer result of dividing by 0). If you're close to the edge there, you drop significant figures and wind up with an inaccurate answer. Your rocket may not blow up for 10 launches, then one day when the breeze is coming from a different direction, the guidance system won't be able to correct and your rocket will blow up.

With the range of numbers we're using in the numerical integration, we're not going to have this problem. However you're writing code that for a different range of numbers will have this problem. Someone could swipe your code and use it for a new range of inputs, or you could re-use it in another piece of code. You want to write the code to minimise problems in the distant future. Here's where this problem shows up in our code

(1.0*x*x)/(num_intervals*num_intervals)

Depending on x, this is a small number squared, over a large number squared (x*x might evaluate to 0.0 and the division by num_intervals*num_intervals would be irrelevant; or num_intervals might be large, causing num_intervals*num_intervals to overflow i.e. NaN) - a disaster with your rocket's name on it. Instead you should write

(1.0*x/num_intervals)^2

Numbers and their scaling constants should not be separated. In computing, dividing a data value by some appropriate scaling constant is called normalisation. The word "normalisation" is used in different places to mean different things (unfortunately). Where else have we seen the term ^[66] ? You can take "normalised" to mean "anything that fixes up numbers so that something awful doesn't happen". What that may mean in any particular situation, you'll have to know.

The term used in science/engineering is "reduced" (rather than "normalised"), where the reduced value is dimensionless. Say a machine/system can do a maximum of 1000 operations/hr. If it's doing 500 operations/hr, its reduced speed is 0.5 (not 0.5 of its maximum speed, just 0.5 i.e. a number). Rather than refer to the speed of a plane (which has the dimensions of l/t), you can refer to its Mach number (the speed of the plane/speed of sound, this is dimensionless). The speed of sound depends on temperature. The temperature of air varies with altitude, so the speed of sound varies with altitude. Aerodynamics, especially near the speed of sound depends on the Mach number and not the speed (you'll need your speed to know when you'll arrive at your destination). If you're studying the aerodynamics of a plane at different altitudes, you want to know the Mach number.

In computations, often there will be dimensionless terms like x/y (which might be speed/speed of sound). In this case while each of the variables x,y might have a large range of values (e.g. 10**-160 to 10**160), the range of x/y might be from 10**-1 to 10**1.

Let's say we have a calculation which requires the value of (x/y)^2 when each of x,y have large ranges, but x/y has a small range. Here's how the ant climbing a hill, (floating point precision) looking for the highest point, gets stuck and starts wandering around aimlessly.

#small numbers (you have to guard against this possibility when doing optimisation problems)
>>> x=10**-161
>>> x*x
9.8813129168249309e-323
>>> x=10**-162
>>> x*x					#underflow: x*x is too small to be represented by a 64-bit real
0.0
>>> interval=10**-150
>>> (x*x)/(num_interval*num_interval)	#underflow: invalid answer
0.0
>>> (x/num_interval)*(x/num_interval)	#normalised number gives valid answer
9.9999999999999992e-25			#the correct answer is 10^-24, this answer is near enough

#big numbers (this doesn't happen a real lot)	
>>> num_interval=1.01*10**154
>>> num_interval*num_interval
1.0201e+308
>>> num_interval=1.01*10**155		#if I don't have the 1.01, the number will be an integer
>>> 					#python can have abitrarily large integers 
>>>					#and you'll get the correct answer
>>> num_interval*num_interval		#squaring a large number	
inf					#overflow
>>> x=1.01*10**150			#squaring a slightly smaller number
>>> x*x
1.0201000000000001e+300
>>> (x*x)/(num_interval*num_interval)	#unnormalised division
0.0					#wrong answer
>>> (x/num_interval)*(x/num_interval)	#normalised division
1.0000000000000002e-10			#correct answer

Whenever you have

y=(a*b*c)/(e*f*g)

figure out which denominator terms normalise which numerator term (you'll know from an understanding of the physics of the problem). You'll wind up changing the code to something like this

y=(a/e)*(b/f)*(c/g)

With the numbers we're using in the numerical integration, we aren't getting any benefit from using reduced numbers. None of the numbers under or overflow on squaring. However reduced numbers should always be used, just as seat belts should be used in cars. Here's the range of numbers involved

loop counter: i: 1..num_intervals (1-10^9)
1.0/num_intevals^2: 10^-18
reduced variable i/num_intevals: 1.0/num_intervals.. 1.0 (10^-9..1.0)
square of reduced variable (i/num_intevals)^2: (1.0/num_intervals)^2.. 1.0^2 (10^-18..1.0)

We're dealing with numbers 10^-18..1.0 whether we reduce or not.

Here's a case when reduced variables helps.

range of x: 10^-160..10^160
range of y: 10^-140..10^140
range of x/y: 10^-20..10^20

Here if we square x,y separately, we'll get an under or overflow. If we use reduced numbers, we'll get a correct answer.

Why did I get you to use reduced numbers when you don't need them? It's a good programming practice. You never know what someone else is going to do with your code. When another coder reads your code, he can tell that you know what you're doing and he won't have to have to look for stupid coding errors. This sort of code gives you good programming cred to people reading your code. You need to fix problems before they happen (not after). This country has blown up two Space Shuttles because people didn't fix problems when they came up to be fixed. You don't want to write code that blows up Space Shuttles.

5.6. For a constant, multiply by reciprocal rather than divide

In computers, division is done by long division, the same way you divide by hand. Long division is slow compared to multiplication, which on a computer (with special hardware) can be done in two steps. In a loop, if you divide by a constant, it's faster to multiply by the reciprocal.

#loop, slow way
	a=h/num_intervals

#faster way
interval=1.0/num_intervals	#outside loop
#loop
	a=h*interval

Add another section with a loop to test_optimisations.py using the reduced number x/num_intervals (since 1/num_intervals is used several times, you should multiply by the reciprocal, and so the code will actually use x*interval). Here's the code ^[67] and here's the ouput.

./test_optimisation.py
	                           iterations          sum   iterations/sec
unoptimised code:                     1000000 0.0000010000    37829.84246
precalculate num_intervals^2:         1000000 0.0000010000    63759.14714
multiply by reciprocal:               1000000 0.0000010000    99322.31288
normalised reciprocal:                1000000 0.0000010000    60113.31327

Here's my final version of test_optimisations.py ^[68] .

Because you're reducing numbers, you can no longer take advantage of the precalculated square, and your code is slower. In general, you don't care if normalisation slows down the code; you'd rather your rocket's navigation system worked correctly but slowly, than crashing the rocket quickly. If you're doing the runs yourself, and you know that the data is well conditioned and doesn't have to be reduced, you can do a run that takes a week instead of 10days. However, in 6 month's time, if your rocket blows up with someone's $1B satellite on board, no-one's going to be impressed when they find that your code didn't normalise. If your code will be released to a trusting and unsuspecting world, to people who will feed it any data they want, you must normalise.

The optimisations we tested were

constants in a loop need to be precalculated
multiply by constants, not divide
normalise (reduce) variables

If the loop had been longer (say 50 lines long), with lots of other calculations, then moving the generation of the constant out of the loop may not have made much difference. However the optimisations shown here should be done as a matter of course; they will assure anyone reading your code, that you know what you're doing.

The optimisations shown here are relatively routine and are regarded as normal programming practice, at least for a piece of code that takes a significant fraction of the runtime (e.g. a loop which is run many times). You wouldn't bother optimising if the lines of code were only run a few times. The unoptimised code that I've presented to you was deliberately written to be poor code so that you could make it better.

Note

End Lesson 27:

At this stage I'd left the students with the results of running test_optimisation.py, expecting them to figure out on their own, the optimisations that would be incorporated into the numerical integration. By the next class, they couldn't even remember the purpose of reducing numbers, so I went over this lesson again, and had them disect out the optimisation that would be transferred to the numerical integration.

Which optimisations are you going to use in the numerical integration? We have tried these loops:

sum = 0

#unoptimised
for i in xrange(0,num_intervals):
     sum+=1.0/(num_intervals*num_intervals)

#faster
#precalculate constants (here num_intervals^2)
num_intervals_squared=num_intervals*num_intervals 
for i in xrange(0,num_intervals):
     sum+=1.0/num_intervals_squared
 
#fastest
#multiply by constants, don't divide
interval_squared=1.0/num_intervals_squared
for i in xrange(0,num_intervals):
     sum+=1.0*interval_squared

#safe, about the speed of "faster", not "fastest"
#use reduced numbers 
interval=1.0/num_intervals
for i in xrange(0,num_intervals):
     sum+=(1.0*interval)**2

The optimisations we can use for the final code are

reduced numbers: We have to use these for safety.
Once you choose reduced numbers, you can no longer precalculate num_intervals^2 (or its reciprocal) because num_intervals is subsumed into the reduced number.
You can still precalculate the reciprocal of num_intervals to make your reduced number.

We must use reduced numbers for safety. Once we've made this choice, the only other optimisation left is to multiply by the inverse of num_intervals. We can't use any of the other optimisations. We don't get the best speed, but we will get the correct answer for a greater range of x, num_intervals

As a programmer, if you're selling your code, you are in somewhat of a bind here. If your code is compared to a competitor's, which uses all the speed optimisations, and which doesn't reduce their numbers, your code will run at half the speed of the competition. It would take quite a sophisticated customer (one who can read the code) to understand why your code is better. In my experience, people who want the program, rarely know the back from the front of a computer, much less understand the trade-offs involved in coding for safety or speed. If you tell the customer that you can make your code run fast too, it just won't be safe, and they buy your code and it does blow up a rocket or kill a patient in a hospital, they'll still blame you for shoddy programming. The "poor unknowing customer", who bought the code knowing full well that wasn't safe, won't be blamed.

Solaris (the unix variant used as the OS on Sun's computers) is slow compared to their competitor's OSs and it's disparaged by calling it "Slolaris". It throws up a lot of errors in the logs (say when there's a network problem), while the logs of a competitor's machine, on the same network, seeing the same problems, are empty. People in the know defend Solaris saying it is slow because it's safer and that the other OSs are seeing errors too, but aren't reporting them. The people on the other machines say "my machine isn't having any problems, Slolaris is having the problem!". Expect this if you write good code.

Here's the way out of the bind.

You have sophisticated users: This likely scenario for this would be if you wrote some code and put it out on the internet, under a GPL License, saying "here's code which does X, and here's the tests I put it through. Have fun." Your users will expect you know what you're doing and will go find things to do with it and using it out of any bounds that you ever thought about. You'll use safe programming.
You have a paying customer who doesn't know anything about programming or computers. They want code that runs fast, because they're using the cheapest possible hardware to save costs. (Everyone wants to save costs, you only have to give token acknowledgement of this. Some people want to save costs in a way that makes it expensive in the long run.) You say
I can make it run fast or I can make it run safe. Here is the range of parameters under which I've tested it. If you stay in this range you can run the fast code. If you go outside this range you'll need the safe code.
Now no matter which one they want, you have them write it into the specifications (including the caveats as to what might happen if they used the code outside the test conditions), so that legally you can say that you wrote what they wanted and they paid for what they asked for.

5.7. add optimisations

We now need to incorporate the optimisations, and time our π calculation runs. Code takes a while to load from disk. If it's just been run, a subsequent run will load faster (the disk cache knows where the file is or will still have the file cached in memory). Speed then will be affected by how long since the code was last run (after about 15 secs, the cache has been flushed). When you do several runs, you may notice that the first run is slow.

Copy pi_3.py to pi_4.py and include the optimisations that we found to be useful for this code (hint ^[69] ). Here's my version of pi_4.py ^[70] .

Table 3. Results of optimising calculation of π by numerical integration

code optimisation	speed, iterations/sec (computer 1)	speed, iterations/sec (computer 2)
none (base case)	16k	73k
all	24k	105k

5.8. Finding π from the upper and lower bounds

	Note
	This section hasn't been presented in class. It was written while the kids are writing their presentations.

Once we have the upper and lower bounds, the value of π comes by comparing the two numbers and accepting from the left, the numbers that match. As soon as we find a non-match, we discard the remaining digits. Since we're calculating π only once, we could match by hand and write down the answer. However your high school wants you to know array operations, so we'll use array operations to do the matching. In python, arrays are implemented with lists. If you need to, first refresh your memory of list operations ??? and ???. Then go to ???. Here you'll learn about arrays (you don't need to know all the material in this section to do the following section here, but your school wants you to understand arrays, and it's a good opportunity to get aquainted with them). Then go to the following section ??? and return here.

Lets do the match by finding the char elements of a string. (I wish to thank members of the North Carolina Zope/Python User Group for help here. There is a neater solution using the python specific zip() which I don't want to use - I'm only using code that will work in any language.) Write code compare_reals.py that does the following

initialises
- two reals (call them real_lower, real_upper) giving them a pair of values from lower and upper bounds
- a string pi_result to hold the matching digits of the two reals
put the string representations of the two reals into string_lower,string_upper
What happens if you compare strings of different lengths ^[71] ? To handle the case when the two strings are of different length, find the length of the shorter string (use the python function len(), which returns an integer, to find the length of a string). Assign the length of the shortest string to the variable shorter.
iterates (loops) over the two strings comparing the char elements at that position in the string.
when the two chars match, add that char to pi_result
when the loop finds the first non-matching pair, execute the python command break which exits the loop, thus preventing the loop from adding matching chars further down the string. The code with the break statement will look like this
for i in range(start, finish): if (char i from string_1==char i from string_2): add char to output string else: break # exit the for loop
leave pi_result as a string and output it to the screen.

Should you output a string or a real? With the upper and lower bounds being reals, you might expect to write code returning π as a real. However if the string was "3.141", after conversion to a real it might be 3.140999999999995. With correct formatting in the print statement, it might be possible to output this as 3.141, but you would have to know ahead of time the number of places to output. It's pointless to do this when you already have the correct number of places in the string form of π.

	Note
	Optional: If you wanted to output a real for `pi_result`. If you were to output (or return) a real, you'd use the function `float()` on the string. Test your code when no digits match (what is float("")?). Will your rocket crash and burn ^[72] ?

Here's my code ^[73] . What is the order of the algorithm? i.e. if the length of the strings are increased by a factor of n, will the runtime increase by n times or n^2 times (or some other function of n) ^[74] ?

The value of π is truncated to the number of digits presented. It would be more accurate if we rounded the number up or down. Can we do that? If the first non-matching digits are both 5 or above, we can definitely round up. If the first non-matching digits are both 4 or below, we can definitely round down. What if one digit us 4 and the other 7? We can't say anything. We can't have a scheme where the last digit may be rounded or may be truncated, but we can only tell by looking up the original computer print out. Our only choice is to say that the value of π is truncated to the quoted number of places.

Copy the code to compare_reals_2.py. Make the code into a function string longest_common_subreal(real,real) (i.e. takes parameters of two reals, and returns a string). Change the names of variables to be more generic i.e. string_lower becomes string_1, so that the code will apply to any situation, not just to upper and lower bounds. In main() call the function with two reals (use typical values for upper and lower bounds) and print out the result. Here's my code ^[75] .

Copy pi_4.py to pi_5.py. Add longest_common_subreal() to the code, so that it outputs the truncated string for the estimate for π. Here's my code ^[76] .

5.9. calculation of π by numerical integration and timing the runs

Using pi_5.py pick a high enough value for num_iterations to get meaningful output (it looks like I used num_iterations=10⁹, you won't have time to run this in class). Do runs starting at num_intervals=1 (for Cygwin under Windows, the timer is coarse and will give 0 for num_intervals<10000), noting the upper and lower bounds for π. We want to see how fast the algorithm converges and get progressively tighter bounds for π. Here's my results as a function of num_intervals.

	Note
	End Lesson 28

Table 4. π (and upper and lower bounds) as a function of the number of intervals

intervals	lower bound	pi (truncated)	upper bound	program time,secs	loop time,secs
10⁰	0.0		4.0	0.030	0.000022
10¹	2.9		3.3	0.030	0.000162
10²	3.12	3.1	3.16	0.030	0.000995
10³	3.139	3.1	3.143	0.040	0.0095
10⁴	3.1414	3.131	3.1417	0.13	0.094
10⁵	3.14157	3.141	3.14161	0.97	0.94
10⁶	3.141590	3.14159	3.141594	9.51	9.45
10⁷	3.1415924	3.141592	3.1415928	95.1	95.1
10⁸	3.14159263	3.1415926	3.14159267	948	947
10⁹	3.141592652	3.14159265	3.141592655	9465	9464
10¹⁰ - not an int, too big for xrange()	-	-	-	-	-

We're doing runs that take multiples of 1000sec. Another useful number: how long is 1000 secs ^[77] ?

For large enough number of intervals, the running time is proportional to the number of intervals. It seems that the setup time for the program is about 0.03 sec and the setup time for the loop stops proportionality at about 10 intervals.

5.10. Order of the algorithm for calculating π by numerical integration

The difference between the upper and lower bound of π is 1/num_intervals (the area of the left most slice). If we want to decrease the difference by a factor of 10, we have to increase num_intervals (and the number of iterations) by the same factor (10), increasing the running time by a factor of 10. If it takes an hour to calculate π to 9 places, it will take 10hrs to calculate 10 places. With about 10,000 hours in a year, we could push the value of π out to all of 13 places. We'd need 100yrs to get a double precision value (15 decimal places) for π.

Here's the limits we've found on our algorithm:

Table 5. Limits to calculating π by numerical integration

limited item	value of limit	reason for limit	fix
range() determines max number of intervals	100M-1G	available memory	use xrange()
time	1hr run gives π to 9 figures	have a life and other programs to write	faster algorithm
precision of reals	1:10¹⁵	IEEE 754 double precision for reals	fix not needed, ran out of time first

If someone wanted us to calculate π to 100 places, what would we do ^[78] .

Calculating π to 100 places by numerical integration would require 10¹⁰⁰ iterations. Doing 10⁹ calculations/sec, the result would take 10^(100-9)=10⁹¹ secs. To get an idea of how long this is, the Age of the Universe (http://en.wikipedia.org/wiki/Age_of_the_universe)=13Gyr (another useful number) =13*10⁹*365*24*3660=409*10¹⁵ secs. If we wanted π to 100 places by numerical integration, we would need to run through the creation of 10^(100-9-17)=10⁷⁴ universes before getting our answer.

As Beckmann points out, quoting the calculations of Schubert, we don't need π to even this accuracy. The radius of the (observable) universe is 13*10⁹ lightyears = 13*10⁹*300*10⁶ * 365*24*3600=10²⁶m=10³⁶Å. Assuming the radius of a hydrogen atom is 1Å, then knowing π to 10² places, would allow us to calculate the circumference of the (observable) universe to 10²⁶/10¹⁰⁰=10^-64 times the radius of a hydrogen atom.

Despite the lack of any practical value in knowing π to even 10² places, π is currently known to 10¹² places (by Kanada).

We calculated π by numerical integration, to illustrate a general method, which can be used when you don't have a special algorithm to calculate a value. Fortunately there are other algorithms for finding π. If it turned out there weren't any fast algorithms for π some supercomputer would have been assigned to calculate 32-,64- and 128-bit values for π long ago; and they'd now be stored in libraries in our software.

While there are fast algorithms to calculate π, there aren't general methods for calculating the area an arbitary curve, and numerical integration is the method of choice. It's slow, but often it's the only way.

It can get even worse than that: sometimes you have to optimize your surface. You numerically integrate under your surface, then you make a small change in the parameters hoping for a better surface, and then you do your numerical integration again to check for improvement. You may have to do your numerical integration thousands if not millions of times, before finding your optimal surface. This is what supercomputers do: lots of brute force calculations using high order (O(n² or worse) algorithms, which are programmed by armies of people skilled at optimisation. People think that supercomputers must be doing something really neat, for people to be spending all that money on them. While they are doing calculations for which people are prepared to spend a lot of money, the computing underneath is just brute force. Whether you think this is neat or not is another thing. Supercomputers are cheaper than blowing up billion dollar rockets, or making a million cars with a design fault. Sometimes there's no choice: to predict weather, you need a supercomputer, there's no other way anymore. Unfortunately much of supercomputing has been to design better nuclear weapons.

I said we need a faster algorithm (or computer). What sort of speed up are we looking for? Let's say we want a 64-bit (52-bit mantissa) value for π in 1 sec. If using numerical integration, we'd need to do 2⁵²iterations=4*10¹⁵=10^15.6 iterations. A 1GHz computer can go 10⁹ operations/sec (not iterations/sec, but close enough for the moment; we're out by a factor of 100 or so, but that's close enough for the moment). We'd need a speed up of 10^15.6-9.0=6.6. If we instead wanted π to 100 places (and not 15 places) in 1sec, we'd need a further speed up of 10⁸⁵. There's no likelihood of achieving that sort of speed-up in computer hardware anytime real soon. Below let's see what sort of speed up we can get from better algorithms.

5.11. errors in calculation of π by numerical integration

When we have a calculation that involves 10⁹ (≅ 2³⁰) multiplication/additions, we need to consider whether rounding errors might have swamped our result and our result is garbage. In the worst case, every floating point operation will have a rounding error and the errors will add. (In real life not all calculations will result in a rounding error, and many roundings will cancel, but without doing a detailed analysis to find what's really going on, we have to take the worst, upper bound, case.) Each calcultion could have a rounding error of 1 in the last place of the 52-bit mantissa (assuming a 64 bit real, see machine's epsilon) leading to 32 bits of error. We can expect then that only the first (52-30)=22 bits of our answer will be correct. In the worst case, only 22/3=7 significant decimal figures will be correct.

The error is relative to the number, rather than an absolute number. For slice heights which are less than half of the radius of the quadrant, the error in the heights will be reduced to one half. In this case approximately half the errors will not be seen (they will affect digits to the right of the last one recorded). We have two factors which each can reduce the number of rounding errors by a factor of 2 (half of the errors underflow on addition, and half of the additions don't lead to a rounding error). It's possible then that the worst case errors are only 29 bits rather than 30 bits.

Still it's not particularly encouraging to wait for 10⁹ iterations, only to have to do a detailed analysis to determine whether 29 or 30 of the 52 bits of your mantissa that are invalid. If we did decide to calculate π to 15 decimal places (64 bit double precision, taking 100yrs), doing 10¹⁵ operations, what sort of error would we have in our answer ^[79] ? One of the reasons to go to 64 bit math, is to have a large enough representation of the number that there is room to store all the errors that accumulate, while getting enough correct digits to get a useful answer.

When we present our result for the numerical integration, what accuracy do we give; 7 places or 9 places? We found that the upper and lower bounds agreed to 9 places, so you might think that we can give 9 places. However our error analysis shows that in the worst case only 7 of these places are correct. We have to give our answer to 7 places.

In most cases of numerical integration, we don't know the answer (otherwise we wouldn't be doing the calculation). In this case our answer is the known π. Our answer agrees with the known value of π to 9 places. What does this tell us? It tells us that we didn't have the worst case errors. In fact we had about 100 times less errors than the worst case. Why was this? Was it due to skillful programming on our part? No; it was just dumb luck. Most of the errors cancelled. We can expect as many rounding errors to round down as round up, so we should expect most of them not to be seen. However we would need a more complicated error analysis than we've done here to know how many errors we did get. Without a more careful error analysis, we have to give our answer for π to only 7 places.

Assume we do 10^9 calculations. Let's see the effect of the size of the reals on the effect of rounding errors.

Table 6. effect of size of reals on rounding errors for 10^9 calculations

rounding error, bits	real size	mantissa size	correct bits in mantissa	correct decimal digits
30	32	23	all garbage	-
30	64	52	22	7
30	128	112	82	24

To not swamp our 9 digit accuracy of π with rounding errors, we would need to use 128 bit reals.

5.12. Accuracy of Upper and Lower Bounds with 128 bit reals

What accuracy (how many digits) will we get with 10^9 iterations using 128 bit reals?

From the algorithm we know that increasing the number of intervals by a factor of 10, increases the accuracy of the area by a factor of 10 (1 rectangle becomes 10 rectangles; the step in height between adjacent rectangles becomes 10 steps, giving an increase in accuracy in height, and hence area of a factor of 10). We know that 10^9 intervals should give approximately 9 significant figures (whether approximately means 7 or 12 we don't know just yet).

The previous tabl,e showing accuracy as a function of the number of bits in a real, shows that we should get 24 significant digits for our answer, and not 9. What happened, why 24 in one case and 9 in another ^[80] ? From the table showing the accuracy of 128 bit calculations, what accuracy can we expect for the value of π for 10^9 iterations ^[81] ? We can tell from the table, that in the case of 64 bit reals, that the upper and lower bounds will only be correct to 7 decimal digits and to ignore the apparent agreement when the upper and lower bounds agree to 9 decimal places.

For 10^9 iterations, we will get 30 correct bits on the left (most significant) end of the mantissa, 30 bits of rounding errors at the right (least significant) end of the mantissa. There will be 112-30-30=52 bits of which will be correct for both the upper and lower bounds, but which will disagree, because the upper and lower bounds will have different values.

What's the maximum accuracy we can expect from numerical integration on a 128 bit machine? Each iteraction produces a correct bit at the left hand end of the mantissa and a rounding error at the right hand end of the mantissa. After 2^52=4*10^15 iterations, the rounding errors and the correct digits will meet. We will have about 15 correct decimal digits (and the calculation will take 100yrs). The loop in our program is short and 10^9 iterations is about the limit timewise. A machine with 128 bit reals will have no problems with the rounding errors swamping the result for any practical numerical integration.

5.13. Other algorithms for calculating π

This section was originally designed to illustrate numerical integration. It turns out that the numerical integration algorithm isn't practical for π. It's too slow, and has too many errors. We need another quicker algorithm, where most of the 52 bits of the mantissa are valid. Since we've got numerical integration covered, let's look at other algorithms for calculating π.

There are many mathematical identities involving π and many of them can be used to calculate values for π. Here's one discovered independantly by many people, but usually attributed to Leibnitz and Gregory. It helped start of the rush of the digit hunters (people who calculate some interesting constant to large numbers of places).

pi/4 = 1 - 1/3 + 1/5 - 1/7 + 1/9 - ...

Swipe this code with your mouse, calling it leibnitz_gregory.py.

#! /usr/bin/python
#
#leibnitz_gregory.py
#calculates pi using the Leibnitz-Gregory series
#pi/4 = 1 - 1/3 + 1/5 - 1/7 + 1/9 - ...
#
#Coded up from 
#An Imaginery Tale, The Story of sqrt(-1), Paul J. Nahin 1998, p173, Princeton University Press, ISBN 0-691-02795-1.

large_number=10000000
pi_over_4=0.0

for x in range(1, large_number):
	pi_over_4 += 1.0*((x%2)*2-1)/(2*x-1)
	if (x%100000 == 0):
		print "x %d pi %2.10f" %(x, pi_over_4*4)

# leibnitz_gregory.py-------------------------

In the line which updates pi_over_4, what are the numerator and denominator doing ^[82] ? Run this code to see how long it takes for each successive digit to stop changing. How many iterations do you need to get 6 places (3.14159) using Leibnitz-Gregory and by numerical integration ^[83] ?

Newton (http://en.wikipedia.org/wiki/Isaac_Newton), along with many rich people, escaped London during the plague (otherwise known as the Black Death http://en.wikipedia.org/wiki/Black_Death, which killed 30-50% of Europe and resulted in the oppression of minorities thought to be responsible for the disease) and worked for 2 years on calculus, at his family home at Woolsthorpe. There he derived a series for π which he used to calculate π to 16 decimal places. Newton later wrote "I am ashamed to tell you to how many figures I carried these computations, having no other business at the time". We'll skip Newton's algorithm for π as it isn't terribly fast.

Probably the most famous series to calculate π is by Machin - see Computing Pi (http://en.wikipedia.org/wiki/Computing_%CF%80), also see Machin's formula (http://milan.milanovic.org/math/english/pi/machin.html).

pi/4 = 4tan^-1(1/5)-tan^-1(1/239)

Machin's formula gives (approximately) 1 significant figure (a factor of 10) for each iteration and allowed Machin, in 1706, to calculate π by hand to 100 places (how many iterations did he need?). The derivation of Machin's formula requires an understanding of calculus, which we won't be going into here. For computation, Machin's formula can be expressed as the Spellbach/Machin series

pi/4 = 4[1/5 - 1/(3*5^3) + 1/(5*5^5) - 1/(7*5^7) + ...] - [1/239 - 1/(3*239^3) + 1/(5*239^5) - 1/(7*239^7) + ...]

250 years after Machin, one of the first electronic computers, ENIAC, used this series to calculate π to 2000 places.

	Note
	End Lesson 29

This series has some similarities to Gregory-Leibnitz. There are two grouping of terms (each within [...]). In the first grouping note the following (which you will need to know before you can code it up)

There is an alternating sign
There is are terms 1/1, 1/3, 1/5..., which you also had in Leibnitz-Gregory.
There is are terms (1/5)^1, (1/5)^3, (1/5)^5... How would you produce these in a loop?

In each term in the series consider

which number(s) are variable and will have to come from the loop variable
which number(s) are constants

The second set of terms is similar to the first with "239" replacing "5". The second set of terms converge faster than the first set of terms (involving "5"), so you don't need to calculate as many of the terms in 239 as you do for the 5's to reach a certain precision (do you see why?). However for simplicity of coding, it's easier to compute the same number of terms from each set (the terms in 239 will just quickly go to zero).

Unlike ENIAC, today's laptops have only 64-bit reals, allowing you to calculate π to only about 16 decimal places. (There is software to write reals and integers to arbitary precision; e.g. bc has it. Python already has built-in arbitary precision integer math.)

The Spellbach/Machin's series has similarities with the Leibnitz-Gregory series. Copy leibnitz-gregory.py to machin.py and modify the code to calculate π using the Spellbach/Machin series. Here's my code for machin.py ^[84] and here's my output for 20 iterations

dennis: class_code# ./machin.py
x  1 pi 3.18326359832636018865
x  2 pi 3.14059702932606032988
x  3 pi 3.14162102932503461972
x  4 pi 3.14159177218217733341
x  5 pi 3.14159268240439937259
x  6 pi 3.14159265261530862290
x  7 pi 3.14159265362355499818
x  8 pi 3.14159265358860251283
x  9 pi 3.14159265358983619265
x 10 pi 3.14159265358979222782
x 11 pi 3.14159265358979400418
x 12 pi 3.14159265358979400418
x 13 pi 3.14159265358979400418
x 14 pi 3.14159265358979400418
x 15 pi 3.14159265358979400418
x 16 pi 3.14159265358979400418
x 17 pi 3.14159265358979400418
x 18 pi 3.14159265358979400418
x 19 pi 3.14159265358979400418

How many iterations do you need, before running into the 64-bit precision barrier of your machine's reals ^[85] ? How many iterations does it take to get 6 significant figures? Compare this with the Leibnitz-Gregory series ^[86] . What is the worst case estimate for rounding errors for the 64-bit value of π as calculated by the Machin series? Look at the number of mathematical operations in each iteration of the loop, then multiply by the number of iterations. Convert this number to bits, and then to decimal places in the answer ^[87] .

Two different formulae by Ramanujan (derived about 200 yrs after Machin) gives 14 significant figures/iteration. Ramahujan's series are the basis for all current computer assaults on the calculation of π by the digit hunters. You can read about them Ramanujan's series Pi (http://en.wikipedia.org/wiki/Pi). Coding these up won't add to your coding skills any more than the examples you've already done, so we won't do them here. How many iterations would Machin have needed to calculate π to 100 places if he'd used Ramanujan's series, rather than his own ^[88] ? How many iterations of Ramanujan's formula would you have needed to calculate π to 6 places ^[89] ? Ramanujan also produced a formula which gives the value for any particular digit in the value of π (i.e. if you feed 13 to the formula, it will give you the 13th digit of π).

5.14. speed of calculation of π by the library routines

π is a number of great interest to mathematicians and its value is required for many calculations. π is a constant: its value, as a 32- and 64-bit number, is stored in all math libraries and is called by routines needing its value. Since the value of π is not calculated on the fly by the math libraries, we have to look elsewhere to see the speed of standard routines that calculate π.

bc can calculate π to an arbitrary number of places. Remember that it would take the age of 10⁷⁴ universes to calculate π to 100 places by numerical integration. Here's the bc command to calculate π to 100 places (a() is the bc abbrieviation for the trig function arctan()). Below the value from bc, I've entered the value from Machin's formula. and the value calculated by numerical integration (with lower and upper bounds), using 10⁹ iterations.

# echo "scale=100; 4*a(1) " | bc -l
3.1415926535897932384626433832795028841971693993751058209749445923078164062862089986280348253421170676
3.14159265358979 Machin, 16 places, 11 iterations
3.14159265(2..5) lower and upper bounds from numerical integration, 10 places, 10^9 iterations

You might suspect that the quick result from bc is a table look up of a built-in value for π. To check, rerun the example, looking up the value of a(1.01). If the times are very different, then the fast one will be by lookup, and the slow one by calculation. (If both are fast, you may not be able to tell if one is 10 times faster than the other.) Assuming the bc run took 1sec, it's 10^100-9=91 times faster than numerical integration.

Note that numerical integration gets the correct result (at least to the precision we had time to calculate). In another section we will attempt to calculate e and get an answer swamped by errors.

Most numbers of interest to computer programmers, whether available from fast algorithms or slow algorithms, are already known and are stored in libraries. Unfortunately fast algorithms aren't available for most calculations.

5.15. Area under arbitrary curve

The code to calculate the area in a quadrant of a circle, can be extended to calculate the area under any curve, as long as its surface can be described by a formula, or a list of (x,y) coordinates.

5.16. Calculating the Length of the circumference by Numerical Integration

How would we use numerical integration to calculate the length of the circumference of a circle ^[90] ?

5.17. Calculating the Volume of a Sphere by Numerical Integration

We're not going to code this up; we're just going to look at how to do it.

The formula for the volume of a sphere has been known for quite some time. Archimedes (http://en.wikipedia.org/wiki/Archimedes) showed that the volume of a sphere inscribed in a cylinder was 2/3 the volume of the cylinder. He regarded this as his greatest triumph. Zu Gengzhi (http://www.staff.hum.ku.dk/dbwagner/Sphere/Sphere.html) gave the formula for the volume of a sphere in 5th Century AD.

Knowing the formula for the volume of a sphere, there is no need to calculate the volume of a sphere by numerical integration, but the calculation is a good model for calculating the volume of an irregularly shaped object. Once we can numerically calculate the volume of a sphere, we can then calculate the volume of an abitrarily shaped object, whose surface is known, e.g. a prolate (cigar shaped) or oblate (disk shaped) spheroid (see Spheroid, http://en.wikipedia.org/wiki/Spheroid) or arbitrarily shaped objects like tires (tyres) or milk bottles.

We can extend the previous example, which calculated the area of a quadrant of a circle, to calculate the volume of an octant of a sphere. By Pythagorus' Theorem, we find that a point at (x,y,z) on the surface of a sphere of radius 1, has the formula

x²+y²+z²=1²

What would we have to do to extend the numerical integration code above to calculate the volume of an octant of a sphere? Start by imagining that the quadrant of a circle is the front face of an octant of a sphere and that your interval is 1/10 the radius of the sphere. For a frame of reference, the quadrant uses axes(x,y) in the plane of the page. For the octant of a sphere in 3-D add an axis (z) extending behind the page. (A diagram would help FIXME).

Slice the octant from front to back (cut parallel to the xy plane) into 10 slices of the same thickness. The front (and back) of each slice is itself a quadrant of a circle, and each slice has a smaller radius than the slice infront of it.
Take each slice (quadrant) and cut it again, this time left to right (cut is parallel to the yz plane), into columns of width equal to the thickness. After we finish cutting left to right, we see that the number of columns in a rearward slices decreases.
If we'd cut up a cube of side=1, we would have had 100 columns, all of the same height. How many columns are there in the sliced up octant (do a back of the envelope calculation) ^[91] ? Is the answer approximately correct (for some definition of "approximate" that you'll have to provide), or is your answer an upper or lower bound?
The problem is that the base squares covering the circumference are partially inside and outside the quadrant. Are you over- or under-counting (this is the fencepost problem on steroids)? Let's look at the extreme cases
- the number of intervals is large: (the base squares are infinitesmally small). The squares which are partially inside and partially outside the circle are also small. As it turns out (you'll have to believe me) the area of these squares outside the circle becomes zero as the squares get smaller. (The problem is that as the squares get smaller, the number of squares gets larger. Does the area outside the circle stay constant or go to zero?) In this case the squares exactly cover pi/4.
- there is only 1 interval: It will cover the whole square (x,y)=(0,0), (x,y)=(1,1) or 100% of the square.
At both extremes the area covered is at least as great as the area of of the quadrant. You then make the bold assertion that this will be true for intermediate values of the number of intervals. (x,y)=(0,0), (x,y)=(1,1). The answer 78% is a lower bound.
As a check, here's the diagram showing the number of bases when num_intervals=10. (The code to generate the diagram below is here ^[92] .) For this case, 86 squares need to be included in the calculation of column heights.
Figure 10. 86 column bases are needed to calculate column heights when num_intervals=10
Number of column bases needed for height calculation for num_intervals=10
Now calculate the volume of each column. Except for the top of the column, which is part of the surface of a sphere, the columns are square columns (with a square base). The volume under the curve has an upper bound, determined by the height of the highest corner of the spherical surface (this corner faces the center of the sphere) and a lower bound, determined by the height of the lowest corner of the spherical surface (this corner faces away from the center of the sphere).
Now sum the volumes of all the slices. Do this with two loops, one inside the other. The inside loop integrates across each slice and returns the volume for the slice. The outer loop steps the calculation from front to back, each step calculating a new radius for the next slice (and hence the number of intervals in each slice) and summing the volumes from the inner loop.
How many iterations do you need for each loop?
- You could do a variable number of iterations for the inner loop, using a while loop. Since each quadrant has a different number of columns, the while conditional would test if you'd reached the end of the quadrant and then stop.
- You could do the same number of iteractions for the inner loop, using a for loop. The loop variable would range from 0.0..1.0*r. Since only 78% of the columns are inside the octant, 22% of them are outside and have a zero height. Calculating a zero height column for 22% of the columns isn't a big deal, if it simplifies the coding. However you'll still have to test if you're outside the octant before being able to declare a zero height.
Both ways are about the same amount of trouble.

	Note
	End Lesson 30

5.18. Presentation: Numerical Integration as a method to calculate π

In this section the students prepared notes, diagrams, and tables for their presentation.

	Note
	End Lesson 31 End Lesson 32. After having spent 2 lessons * 1.5hrs on preparing notes, the kids said they try to finish them as homework. End Lesson 33. The kids had written most of the presentation during the week. There were a couple of points that the kids didn't understand (e.g. the upper limit for the errors in π by numerical integration). I had the kids put up their slides to make sure that they were in a logical order and looked about right. I then had the kids go through their material, upto explaining where the upper and lower bounds of the estimates of π came from. End Lesson 34. We spent the time finding ways to explain how the upper and lower bounds for the estimate of π can be done in one calculation. One of the students had forgotten (or no longer understood) how the individual white blocks of Figure 8, become the single column of white blocks in Fig 9. This is taking a while. On consulting my partner, she said to be patient; I've decided that the kids have to give a presentation and I shouldn't to make it unpleasant by being in a rush - it will only put the kids off.

The presentation should cover

How numerical integration works: (you cut up an arbitary shape into many smallers objects - in this case a rectangle - whose area can be calculated. you can use my diagrams if you like.)
The connection between the area of the quadrant as calculated by numerical integration and the value π.
How you calculate the upper bound, the lower bound (show code, point out common numbers in both outputs)
Show how one lot of code can calculate both upper and lower bounds
Show the increase in precision in the value calculated, as the number of intervals is increased
Show the (speed) optimisations that you tested, show the speed up from each. Discuss why you used some of these optimisations but not others.
Show the output as a function of the number of intervals (value calculated, time taken).
Estimate the upper bound for the error in the output.
Discuss whether numerical integration is a useful method to calculate π (include time needed to calculate to 15, 100 decimal places).
Compare speed of calculating π by numerical integration with other algorithms (Leibnitz-Gregory, Machin).

6. e - the base of natural logarithms: exponential and geometric processes

e≍2.71828 is the basis of equations governing exponential growth, probability and random processes, and wave functions. It one of the great numbers in mathematics. e winds up in all sorts of places.

6.1. Calculating Bank Interest at small intervals

Let's say you deposit $1 in an interest bearing bank account, where your interest is calculated annually.

etymology of bank: Bankers (banchieri) were originally foreign money exchangers, who sat at benches (stall would be a better word; the french word for stool is "banc"), in the street, in Florence. (from The Ascent of Money, Niall Ferguson, The Penguin Press, NY, 2008, p42. ISBN-978-1-59420-192-9) (Italy was the birth place of banking.) The word "bank" is related to "embankment".

Let's assume your annual interest rate is 100%, calculated annually. At the end of the first year, the balance in your account will be $2. What would you guess would be the balance in your account, if your interest was calculated semi-annually (i.e. 50% interest twice a year), or quarterly (25% interest, but now 4 times a year), or monthly at 8&1/3%? How about weekly, daily, hourly, by the nanosecond? You'd be getting a smaller amount of interest each time, but the number of times your interest is added would be increasing by the same ratio. At the end of the year would you be

worse off
the same
better off

than if interest were calculated annually?

Let's try interest calculated twice a year. At the end of the first 6 months, your interest will be $1*1/2=$0.50 and your balance will be $1*(1.0+0.5)=$1.50. At the end of the year, your balance will be $1.50*(1.0+0.5)=$2.25 (instead of $2.00). It appears to be to your advantage to calculate interest at shorter intervals.

Will your balance at the end of the year be finite or infinite? Some series when summed go to 0, some reach a finite sum and some go to ∞. Unless you know the answer already, you aren't likely to guess the sum to ∞ of anything.

Let's find out what happens if you calculate interest at smaller and smaller intervals. Write code (filename calculate_interest.py) that has

annual_interest=1.0
initial_deposit=1.0
evaluations_per_year=1

Using these variables, write code to calculate (and print out) the value of balance at the end of the year For the first attempt at writing code, calculate interest only once, at the end of the year. Here's my code ^[93] and here's my output

igloo:# ./calculate_interest.py
evaluations/year 1 balance is 2.000000

Modify the code so that interest is calculated semi-annually (i.e. do two calculations, one after another) and print out the balance at the end of the year. Here's my code ^[94] and here's the output

igloo:# ./calculate_interest.py
evaluations/year 2 balance is 2.250000

If you haven't already picked it, the interest calculation should be done in a loop. Rewrite the code to interate evaluations_per_year times. Have the loop print out the loop variable and the balance. (If you're like me, you'll run into the fencepost problem and the number of iterations will be off by one; you'll have adjust the loop parameters to get the loop to execute the right number of times.) Here's my code ^[95] and here's the output

igloo:# ./calculate_interest.py
1 balance is 1.500000
2 balance is 2.250000
evaluations/year 2 balance is 2.250000

Try the program for interest calculated monthly and daily. Turn off the print statement in the loop and put the loop inside another loop, to calculate the interest for 1,10,100...10⁸ evaluations_per_year, printing out the balance for each iteration of the outer loop.

Here's the code ^[96] and here's the output

igloo:# ./calculate_interest.py 
evaluations/year          1 balance is 2.0000000000
evaluations/year          2 balance is 2.2500000000
evaluations/year         10 balance is 2.5937424601
evaluations/year        100 balance is 2.7048138294
evaluations/year       1000 balance is 2.7169239322
evaluations/year      10000 balance is 2.7181459268
evaluations/year     100000 balance is 2.7182682372
evaluations/year    1000000 balance is 2.7182804691
evaluations/year   10000000 balance is 2.7182816941
evaluations/year  100000000 balance is 2.7182817983
evaluations/year 1000000000 balance is 2.7182820520
evaluations/year 2000000000 balance is 2.7182820527

Since we haven't calculated an upper or lower bound, we don't have much an idea of the final value. It could be 2.71828... or we could be calculating a slowly divergent series, which gives a final answer of ∞. It turns out we're calculating the value of e, the base of natural logarithms. One of the formulas for e^x is (1+x/n)ⁿ for n→∞. The final value then is finite; the best guess, as judged by unchanging digits is 2.718282052. Here's the value of e from bc (run the command yourself, to see the relative speed of bc and the calculation we ran).

igloo:~# echo "scale=100;e(1)" | bc -l
2.7182818284590452353602874713526624977572470936999595749669676277240766303535475945713821785251664274

2.718282052 for comparison, the value from (1+1/n)^n

	Note
	almost half our digits are wrong, we'll see why later.

As for the earlier calculation of π using bc, check that the value of e is not stored inside bc by timing the calculation of e^1.01.

For your upcoming presentation on e, make a table of the balance at the end of the year in your account if interest is calculated annually, monthly, weekly, daily, by the hour, minute, second.

6.2. errors in calculation of e=(1+1/n)^n

This method of calculating e gets about half a decimal place, for a factor of 10 increase in calculation time. This is painfully slow, but if there wasn't anything better (there is), you'd have to use it.

When you're doing 10⁹ (≅2³²) arithmetic operations (e.g. addition, multiplication), you need to know whether the rounding errors have overwhelmed your calculation. In the case of pi errors we got a usable result, because the errors were relative to the size of the numbers being calculated and by going to higher precision (e.g. to 64 bit).

What happens with this calculation of e? We're calculating with (1+1/n)ⁿ, where n is large. If we were using 32 bit reals, i.e. with a 24 bit mantissa), we would chose the largest n possible, n=2²⁴ giving a mantissa of 1.000000000000000000000001₂. Because of rounding, the last two digits could be 00,01 or 10. The resulting number then would be e⁰=1, e¹≍2.7 or e²≍7.4. We're calculating using a number that is as close to 1.0 as possible and the magnitude of the rounding errors is comparable to the magnitude of the last few digits in the number being multiplied. We will have large errors, because the calculation is dependant on the small difference from 1.0, rather than the relative value as it was in the case of the calculation of π.

At this stage let's look at the machine's epsilon.

Here's the value for e calculated from the series and the real value from bc.

e from interest series  e=2.718282052
e from bc               e=2.7182818284590452353602874713526624977572470936999595749669676277240766303535475945713821785251664274

Our calculation is wrong in the last 4 digits. We need a different method of calculating e.

6.3. Logarithmic time for calculating e

Let's say we wanted to calculate e=(1+1/n)ⁿ for n=1024. In the above section, we did this with 1024 multiplications. Instead let's do this

x=(1+1/1024)
x = x^2		# x=(1+1/1024)^2
x = x^2		# x=(1+1/1024)^4
x = x^2		# x=(1+1/1024)^8
x = x^2		# x=(1+1/1024)^16
x = x^2		# x=(1+1/1024)^32
x = x^2		# x=(1+1/1024)^64
x = x^2		# x=(1+1/1024)^128
x = x^2		# x=(1+1/1024)^256
x = x^2		# x=(1+1/1024)^512
x = x^2		# x=(1+1/1024)^1024

Doing it this way, we only need 10 iterations rather than 1024 iterations to find (1+1/1024)^1024.

	Note
	Note that 1024=2¹⁰ (i.e. log₂1024 = 10) and we needed 10 steps to do the calculation.

This algorithm scales with O(log(n)) (here 10) and is much faster than doing the calculation with an algorithm that scales with O(n) (here 1024). You can only use this algorithm for n as a power of 2. This is not a practical restriction; you usually want n as big as you can get; and a large power of 2 is just fine.

The errors are a lot better. The divisor n is a power of 2 and has an exact representation in binary. Thus there will be no rounding errors in (1+1/n). As with the previous algorithm, we still have the problem for 1/n less than the machine's epsilon (what's the problem ^[97] ?). There still will be a few rounding errors. Look what happens when you square a number close to (1+epsilon) (assume an 8-bit machine)

(1+x)^2=1+2x+x^2
for x=1/2^8
then 
2x=1/2^7
x^2=1/2^16

While addition (usually) will fit into the register size of the machine (unless you get overflow), multiplication always requires twice the register width (if you have a 8-bit machine, and you want to square 255, how many bits will you need for your answer?). To do multiplication, the machine's registers are double width. (Joe FIXME, is this true for integers?) For integer multiplication, you must make sure the answer will fit back into a normal register width, or your program will exit with an overflow error. For real numbers, the least significant digits are dropped. How does this affect the squaring in an an 8-bit machine? Here's the output from bc doing the calculations to 40 decimal (about 128 bit) precision. The 8-bit computer will throw away anything past 8-bits.

x=1+1/2^8=10000001 binary
x^2      =1000001000000001
x^4      =10000100

 1.0000001000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
 1.0000010000000100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
 1.0000100000011000001000000001000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
 1.0001000001110001110001000110011100000111000001000000000011111111111111111111111111111111111111111111111111111111111111111111111111111
 1.0010000111110001111100111110100111101000100010101001111111011101101010001111001000000110000111100000100000000000111111111111111111101
 1.0100100001100100001011010110001101000101110101101011011100100011110000011101101111100110100010101101001111001111011000011000110100010
 1.1010010101000000110110111000000111100000100100001101001000010111010010000011010010011101001011110001001110100110110010111110011110111
10.1011010100101110011000100110011110101001110001000001001110000101011111011000011001111001111000011111110100011000111001011111011011011

We should be able to get However in the example above, we don't have to multiply the error 1024 times, only 10 times. With this algorithm, we should expect an answer with smaller errors in a shorter time.

If errors weren't a problem, how many iterations would we need (approximately) if we were to calculate e=(1+1/n)ⁿ, using the logarithmic algorithm, for n=10⁹ ^[98] ? Write out each step by hand to see that you get this answer. If we scale to multiplying 2³⁰≅10⁹ times, we'll only multiply the rounding errors 30 times.

What is the maximum number of squarings we can do on a 64-bit machine (hint: x=(1+1/n) can't be less than (1+epsilon). What value of n will do this?) ^[99] ?

Could we have used the logarithmic algorithm for the numerical integration of π ^[100] ?

Let's code this up. Copy calculate_interest.py to calculate_interest_logarithmic.py. (You're going to gut most of the old code.) Here's the first attempt at the writing the code (you're doing only one iteration, so there's no loop yet).

change the documentation to reflect the new name, and function
initialise the following variables
- initial_deposit=1.0
- annual_interest=1.0
- iterations=1
- evaluations_per_year=2**iterations
- x=1+annual_interest/evaluations_per_year
- iteration=1
print the iteration and the balance (x * initial_deposit)

Here's my code ^[101] and here's my output.

pip:/src/da/python_class/class_code# ./!$
./calculate_interest_logarithmic_1.py
evaluations/year          1
iteration   1 balance=2.00000000000000000000

We now need to add a loop to do the squaring operations. Note in the initial example above (with 10 iterations), that the first iteration did something different to the rest of the iterations (the first iteration gave the formula for calculating x while the rest of the iterations did the squaring operations). We could implement this in either of these two logically equivalent ways.

#pseudo code
iterations=10

#iteration #1
define x=(1+1/n)
print statement

#iterate for loop_variable =2,iterations+1
	x=x**2
	print statment

#pseudo code
iterations=10

iterate for loop_variable=1,iterations+1
	if loop_variable==1
		define x=(1+1/n)
	else
		x=x^2

Both of these are a bit messy.

In the first method, the loop does only one type of calculation (the squaring) (which is easy to read), but the loop variable starts at 2, which is a little difficult to read. As well you need print statements in two places.
In the 2nd method, the loop has to do an if/else, when the if only succeeds on the first iteration (i.e. it's only needed once). Since you won't be running this loop more than 52 times, you don't need to optimise the loop code, so an if/else isn't particularly bad. If you were iterating 10⁶ times, then you wouldn't want the if/else inside the loop.

Sometimes you can't write aesthetically pleasing code; sometimes it's a mess. A problem which is trivial for a human to do in their head, can be quite hard to write down. You can write code to do it either of these two ways (or any other way you want). I'll show both methods above.

For the 1st method: below your current code,

add a loop that allows the code to evaluate (1+1/n) a total of iterations times, each time squaring x (you have the fencepost problem: allow yourself to get the number of iterations wrong initially and fix up the number of iterations later). Use the variable iteration as the loop variable. Because you've already done one calculation before you enter the loop, make sure you output the correct number of the iteration.
At each iteration print out the number of times you've evaluated x and the balance in the account for the current value of x (the only value of x that the customer cares about is the value when the loop exits).
When the loop exits, print out the balance (the value of e).

Here's my code for (1st method) ^[102] and here's my output for iterations=1

and for iterations=4

For the 2nd method:

create a loop with loop variable iteration
inside the loop, when iteration==1, define x=1+annual_interest/evaluations_per_year
else square x
for each iteration, print the iteration and the balance
after the loop exits, do the same

Here's my code (2nd method) ^[103] and here's my output for iterations=1

and for iterations=4

6.4. Faster converging series for e: factorial series

The definition of a factorial is

n!=n*(n-1)*(n-2)...1

where n! is pronounced "n factorial" or "factorial n"

examples
3!=3*2*1=6
6!=6*5*4*3*2*1=120

Factorial series increase the fastest of all series (even faster than exponential series). Factorials wind up in combinatorics.

How many different orderings, left to right, are possible if you want to line up 4 people in a photo? The answer is 4!. You can pick any of 4 people for the left most position. Now you have 3 people you can chose from for the next position, then 2 for the next position, and there is only 1 person for the last position. Since the choices of people at each position are independant, the number of possibilities is 4*3*2*1=4!.
Note
0!=1. (there is only 1 way of arranging 0 people in a photo with 0 people.)

Here's a faster converging series to calculate e:

e=1/0!+1/1!+1/2!+1/3!...

Since you'll need a conditional to handle calculation of factorial n for n=0, it's simpler to use the series

e=1+1/1!+1/2!+1/3!...

and have your code initialise e=1, allowing the loop for the factorial series to start at 1/1!.

Write code (e_by_factorial.py) to

initialise e=1
initialise factorial_n=1.0

Now you want to calculate the value of e from the first two terms (1+1/1!). Do this by

calculating the first factorial term (1/1!) from the initial value of factorial_n, putting the result into factorial_n.
add factorial_n to e

then prints out e. Here's my code ^[104] and here's my output.

pip:# ./!$
./e_by_factorial.py
2.0

What does the construction "/=" do?

Now put the calculation of e into a loop, which calculates the next factorial number in the series and adds it to e. Iterate 20 times, each time print out the calculated value for e using

	print "%2.30f" %e

Here's my code ^[105] and here's my output

dennis:/src/da/python_class/class_code# ./e_by_factorial.py
2.000000000000000000000000000000
2.500000000000000000000000000000
2.666666666666666518636930049979
2.708333333333333037273860099958
2.716666666666666341001246109954
2.718055555555555447000415369985
2.718253968253968366752815200016
2.718278769841270037233016410028
2.718281525573192247691167722223
2.718281801146384513145903838449
2.718281826198492900914516212652
2.718281828286168710917536373017
2.718281828446759362805096316151
2.718281828458230187095523433527
2.718281828458994908714885241352
2.718281828459042870349549048115
2.718281828459045534884808148490
2.718281828459045534884808148490
2.718281828459045534884808148490

Note that the output increases in precision by a factor of 10 for each iteration (at least up to iteration 15 or so). This is a quickly converging series (all series based on factorials converge quickly). Although you can't tell from the output, this series does converge. Note that the last 3 entries are identical and match the 4th last entry to 15 decimal places.. What piece of information brings together "real" and "accurate to 15 decimal places" ^[106] ? As a check, compare the output of your program with the value from bc.

e from bc               e=2.7182818284590452353602874713526624977572470936999595749669676277240766303535475945713821785251664274
e_by_factorial          e=2.718281828459045
e from interest series  e=2.71828

The value from the factorial series matches the real value to 16 places.

6.5. errors in calculation of e from the factorial series, Arithmetic Progressions and the number of casing stones on the Great Pyramid

It's possible that any real arithmetic operation (i.e. +-*/) will be rounded, so you must plan for an upper bound in errors of 1 bit/operation. With the 52 bit mantissa for 64 bit math, each operation can potentially introduce an error of 1:2⁵² (≅ 1:10¹⁶ - the machine's epsilon). We needed 10⁹ operations to calculate e by the previous method. How many operations were needed to calculate e by the factorial series? We needed only 16 iterations to calculate e to the precision of python's 64 bit math. How many real math operations were involved in the 16 iterations? It's obviously very much less than 10⁹, but we'll need the exact number in the future, when we start comparing two good (fast) algorithms, so let's calculate it exactly.

The addition of a term in the formula for e requires

the calculation of the factorial (n multiplications)
a division
an addition

This is n+2 operations. For a back of the envelope calculation, let's assume that n operations are required (we can come back and do the exact calculation later). For 16 iterations then, the number of operations is 1+2+.....16. This series of numbers; 1,2..16, is an arithmetic progression

A series of numbers, where the difference between consecutive numbers is constant, is called an Arithmetic Progression (http://en.wikipedia.org/wiki/Arithmetic_progression). As an example, the set of numbers 2,5,8,11....29, is an arithmetic progression, whose first and last members are 2,29 with a difference of 3.

First we can find an approximate upper bound for the number of operations. Assume the series was 16,16...16 for 16 numbers. The sum of these numbers is 256 giving an upper bound of 256 for the number of operations to calculate e to 16 significant figures. Let's get a better estimate. Here's a diagram showing the number of operations for each term, the first where the number of operations is modelled at 16 for each term, and the second where the number of operations is modelled as n for each term. The sum of the heights of all the members of the series can be found from the area of each of the two diagrams.

****************                 *
****************                **
****************               ***
****************              ****
****************             *****
****************            ******
****************           *******
****************          ********
****************         *********
****************        **********
****************       ***********
****************      ************
****************     *************
****************    **************
****************   ***************
****************  ****************
height=16         height=1..16

The first is a square 16*16 (it doesn't look square because the area occupied by a letter on a computer screen is a rectangle), and the second triangular looking object of 16*16. The second diagram represents the actual number of operations for the 16 iterations used to calculate e. Using simple geometry, what is the number of operations (the area of the 2nd diagram) ^[107] ?

Where's the fencepost error? The triangular looking object above is not a triangle at all; it's a set of stairs. The "*" characters on the hypoteneuse are full characters, and not a "*" sliced diagnonally along the hypotenuse. Here's what the object really looks like

	       _|*
	      _|**
	     _|***
	    _|****
	   _|*****
	  _|******
	 _|*******
	_|********
       _|*********
      _|**********
     _|***********
    _|************
   _|*************
  _|**************
 _|***************
 |****************
height=1..16

To calculate the area geometrically, we duplicate the object, rotate it 180°, and join it with the original to make a rectangle (here the duplicated version uses a different symbol).

................
...............*
..............**
.............***
............****
...........*****
..........******
.........*******
........********
.......*********
......**********
.....***********
....************
...*************
..**************
.***************
****************
height=1..16

What is the area of this rectangle and hence the number of real operations required to calculate e ^[108] ?

This means that upto 7 bits (approximately, it would be 7 bits if 128 operations were used) of the 52 bit mantissa could be wrong, or 7/3 i.e. the last 2 decimal digits could be wrong. From the output from bc above, we don't see any rounding errors.

From your understanding of the construction of the rectangle above, what's the formula for the sum of the elements of an arithmetic progression, which has n elements, the first term having the value first and the last term having the value last ^[109] ?

The number of operations was n+2 and not n. Now let's do the exact calculation. The number of operations is 3..18. Here's the diagram showing the number of operations at each iteration of the loop.

................
................
................
...............*
..............**
.............***
............****
...........*****
..........******
.........*******
........********
.......*********
......**********
.....***********
....************
...*************
..**************
.***************
****************
****************
****************
height=3..18

What's the number of real number operations (do it by inspecting the diagram here and from your formula). ^[110] ? Does the exact calculation make any real difference to the number of bits of the mantissa that could be wrong ^[111] ?

The factorial series is better for calculating e than the (1+1/n)ⁿ series for two linked reasons

it converges faster (factorial series converge the fastest)
the smaller number of calculations result in less rounding errors.

There's an interesting story about arithmetic progressions and one of the great mathematicians, Gauss. When Gauss was in grade school, the teacher wanted to keep the students busy for a while, so they wouldn't bother him, and he set the class to the task of adding all the numbers 1..100. Gauss handed in his answer immediately, while the rest of the class swatted away, presumably adding all the numbers by brute force. The teacher thought Gauss was being insolent and didn't look at Gauss' answer till all the other student's answers had been handed in and marked. The teacher was astonished to find that Gauss' answer was correct.

Before we look at how Gauss did it, let's look at addition of all numbers from 1..99 (we can add the 100 later). People in this class familiar with loops and finding common elements in a problem, should be able to see that addition of (1+2+3+4+5+6+7+8+9)+(10+11+12+13+1+4+15+16+17+18+19)..99 involves a core problem of adding the numbers 1..9 a couple of different ways. Can you see how to add 1..99 ^[112] ?

Gauss used the geometric constuct we used above, although he didn't need to draw the dots as we did. Gauss wrote out the numbers in a line and then below them wrote out the numbers in the reverse order. His notebook looked like this

  1   2   3   4 ...  97  98  99  100
100  99  98  97 ...   4   3   2    1

What are the sums of each pair of terms?

  1   2   3   4 ...  97  98  99  100
100  99  98  97 ...   4   3   2    1
--- --- --- ---     --- --- ---  ---
101 101 101 101     101 101 101  101

How many terms did Gauss have across his page (it's unlikely that Gauss wrote out all the terms) ^[113] ? What's the sum of all the terms on the bottom line ^[114] ? What's the sum of the original arithmetic progression ^[115] ?

Exercise in Arithmetic Progressions: Calculate the number of casing stones on the Great Pyramid of Khufu at Giza

First a bit of background:

	Note
	About pyramids in Egypt: Great Pyramid of Giza (http://en.wikipedia.org/wiki/Great_Pyramid_of_Giza), Another look at the Pyramids of Giza by Dennis Balthaser (http://www.truthseekeratroswell.com/ed010108.html) Seven Wonders of the Ancient World (http://en.wikipedia.org/wiki/Wonders_of_the_World#Seven_Wonders_of_the_Ancient_World), Pharaonic monuments in Cairo and Giza (http://www.sis.gov.eg/En/Arts&Culture/Monuments/PharaonicMonuments/070202000000000006.htm). For a likely explanation of in involvement of the number π in the dimensions of the Great Pyramid see Pi and the Great Pyramid (http://www.math.washington.edu/~greenber/PiPyr.html)

The Great Pyramid of Khufu (Cheops in Greek) was built about 2560 BC over a period of 20yrs, employing about 100,000 labourers, moving 800 tonnes of stone a day, excavating, cutting, polishing, transporting and mounting a 15 ton stone every 90 sec. It is the only one of the Seven Wonders of the Ancient World to survive till today. For 3800 yrs it was the tallest man-made structure on earth. The base is horizontal to 15mm, the sides are identical in length to 58mm. The ratio of the circumference of the base (4*440 cubits) to the height (280 cubits) is 2π to an accuracy of 0.04%. The sides of the base are not a square but a 4 pointed star as can be seen by aerial photography at the right time of day (see Aerial photo by Groves 1940 http://www.world-mysteries.com/mpl_2conc1.gif). The closeness of the angles of the base to a right angle (1' deviation) compared to the deviation of the orientation of the pyramid to true north (3' deviation west), allows us to conclude that the pyramid must have been originally oriented to true north and that the continent of Africa has since rotated, with respect to the axis of the earth's rotation, carrying the pyramid with it. The amount of rotation is consistent with estimates from plate tectonics (I read the original paper back in - I think - the '70's but I can't find a reference to it with google).

The pyramid was surfaced by white polished limestone casing stones. Most of the casing stones at Giza were removed to build Cairo. The few that remain today are at the top of the Pyramid of Khafre . The remaining casing stones are dazzling in today's desert sun. Back when the whole pyramid was covered in these stones, it must have been quite a sight as you approached it from over the horizon.

The casing stones on the Pyramid of Khufu are all of the same height and width (to maintain symmetry of the sides), accurate to 0.5mm and weighing 15tons. The casing stones were transported by barge, from the quarry at Aswan, during the Nile flooding, allowing the stones to be deposited next to the pyramid.

Aswan (http://en.wikipedia.org/wiki/Aswan) is famous not only for its quarry which produced syenite (a pink granite used for many of the obelisks in Egypt, and which is also found in our local North Carolina State Park, the Eno River Park), but for being at one end of the arc used in the first estimate of the circumference of the earth by Eratosthenes.

By a fortunate cancellation of two errors of about 20%, Eratosthene's value differed by only 1% from the current accepted value. Christopher Columbus (http://en.wikipedia.org/wiki/Christopher_Columbus) was looking for a route to Japan/India/China across the Atlantic. At the time no ship could carry enough food to last the distance across the Atlantic to China from Europe. As well, a crew would mutiny if required cross open ocean that distance: they might sail off the edge of the earth.

Using the method of Eratosthenes, a century later, Posidonius made his own estimate. Posidonius knew that the star Canopus grazed the horizon at Rhodes (his home) and measured it's maximum elevation as 7.5° (actually 5deg;) at Alexandria. Again by cancellation of errors in the elevation of the star and the distance between the two towns, Posidonius came up with the correct answer.

	Note
	due to light pollution and atmospheric pollution, on most places on earth, it's impossible to see any star on the horizon, much less at 5° above the horizon.

Posidonius later corrected the distance from Rhodes to Alexandria, coming up with a now incorrect estimate of the circumference of the earth of 30,000km (18,000 miles). The two estimates of the size of the earth became known in Europe after translation from Arabic (12th Century). Medieval scholars debated the veracity of the two values for centuries, without any attempt to verify the measurement themselves (for in those days truth was revealed, rather than tested, so the matter would be resolved by debate, rather than measurement).

	Note
	Due to precession of the equinoxes, Canopus no longer reaches an elevation high enough to graze the horizon at Rhodes. Presumably this would have caused some consternation amongst people attempting to reproduce Posidonius's measurement.

	Note
	Posidonius travelled widely. In Hispania, on the Atlantic coast at Gades (the modern Cadiz), Posidonius studied the tides. He observed that the daily tides were connected with the orbit and the monthly tides with the cycles of the Moon.

To show that his trip was practical, in order to gain financing from Queen Isabella, Columbus used the value 30,000km for the circumference of the earth, and a width of 3,300km (2,000miles) of Sipangu (Japan), making the trip across the Atlantic appear practical. Knowing the potential for mutiny, Columbus kept two logs, one to show the crew, which showed that they'd travelled a shorter distance, as well as the correct log. Columbus was lucky to find the West Indies. His crew, being further from land than anyone had ever been, and was within a day of mutiny.

The currently accepted value for the circumference of the earth is 40,000km (thanks to Napoleon).

	Note
	What does Napoleon have to do with the circumference of the earth ^[116] ?

	Note
	You may wonder what Napoleon is doing in a section on the Pyramids of Egypt, but I thought the connection was too strong to pass up the opportunity to talk about it. You should now go find the connection between Napoleon and the Sphinx which will bring us back to Egypt again.

The dimensions of the casing stones given in Quarries in Ancient Egypt (http://www.cheops-pyramide.ch/khufu-pyramid/stone-quarries.html) gives dimensions for the bottom casing stones of 1 m x 2.5m and 1-1.5m high (6.5 - 10 tons) while the upper casing stones are 1m x 1m and 0.5m high (1.3 tons). Details on the thickness of each layer are at Stone courses of the Pyramid of Khufu (http://www.cheops-pyramide.ch/khufu-pyramid/stonecourses-pyramid.html).

Assuming the same height and width for the casing stones, the number of casing stones in each layer is a (decreasing) arithmetic progression. The height of the pyramid is 146m (482') with a base of 230m (756'). Assuming the pyramid was covered in the upper stones only, how many casing stones would you have needed to order to cover the pyramid? First calculate the number of layers of casing stones (the number of members of the arithmetic series) and the number of casing stones in the first (bottom) and last (top) layers. Then calculate the number of casing stones in a face of the pyramid, then calculate the total number of casing stone for the whole pyramid. Here's my calculations ^[117] . The actual number of casing stones given in the references is 144,000. Our answer doesn't fit real well.

Some bigger stones were used too, this would reduce the estimate further, making our answer worse. It seems reasonable that the size of the casing stones got smaller towards the top and that the bigger stones were at the bottom.
The number of layers is 210, not 292.

So what happened to our calculation? The references are unclear as to the size of the facing stones. Further work needs to be done (a class trip to the pyramids?).

Since this wasn't a particularly successful calculation, let's try another. A well known 20th century pyramid is the Transamerica Pyramid (http://en.wikipedia.org/wiki/Transamerica_Pyramid) in San Francisco. The number of windows (from http://ecow.engr.wisc.edu/cgi-bin/get/cee/340/bank/11studentpro/transamericabuilding-2.doc) is 3678. You can get an estimate of the number of floors with windows from transamerica.jpg (http://www.pursuethepassion.com/interviews/wp-content/uploads/2007/08/transamerica.jpg) giving about 45 floors of windows (there are supposed to be 48 floors, presumably the extra 3 floors are in the lower section, which has slightly different architecture). Notice from this photo, that the floors are in pairs, with two consecutive floors having the same number of windows. The lower floor of each pair have longer windows at the ends to make up the extra length. The lowest visible floor has 29 windows on a side. The top floor with windows has 8 windows (transam-coit-01-big.jpg http://www.danheller.com/images/California/SanFrancisco/Buildings/transam-coit-big.jpg). (The different colored top, that starts at the top of the elevator shafts - the "wings" - I think is decorative and not part of the floor count.) Let's see if we've accounted for all the windows. We have windows (starting at the bottom) in this series: 29,29,28,28..8,8. How many members are there in this series ^[118] ? Let's add the windows in pairs of floors so we now have an arithmetics series: 58,56...16. What's the sum of this series ^[119] ? There's about 800 windows unaccounted for. (We're not doing real well here either.)

6.6. Logarithms and Slide Rules

FIXME (elephants have linear ears)

6.7. The Great Numbers of Mathematics

	Note
	The historical writeup on negative numbers came largely from "e: The Story of a Number, Eli Maor (1994), Princeton University Press, ISBN 0-691-05854-7.

The Great Numbers of Mathematics all appear in Euler's Identity (http://en.wikipedia.org/wiki/Euler's_identity)

e^iπ = -1

e^iπ + 1 = 0

The Roman number scheme had no zero ^[120] . The concept of Zero (http://en.wikipedia.org/wiki/Zero) and positional notation entered Europe in the 12th century from the Arab world. It had originated in 628AD in a book by Brahmagupta in India, but was ignored in Europe, mired deep in 1000yrs of ignorance (the Dark Ages). In Central America, the Olmecs were using 0 in the 4th century BC.

Greek mathematics was geometry, for which only positive numbers were needed (lengths, areas and volumes). For the longest time, no-one could handle the concept of less than nothing, much less multiply two negative numbers. Fibionacci (1225) bravely interpreted a negative number in a financial statement as a loss, but this effort was lost on mathematicians. For mathematicians, the idea of subtracting 5 apples from 3 apples was absurd (one of the names given to negative numbers). Bombelli (b 1530) assigned real numbers to the length of a line, with the 4 operators (+-*/) corresponding to movements along the line and that negative numbers were an extension of the line to the left. It was only when subtraction was recognised as being the inverse of addition, that negative numbers became part of geometry and hence math.

Negative numbers cause no problems for even the youngest people now. It's now obvious that they obey the same rules as positive numbers. However we must remember how much effort and time it took to understand what appears to us now as a trivial step.

The concept of π has been handled well since the beginning of time. However our current understanding of π as a transcendental number is only recent.

i=sqrt(-1) is a new concept. Without it, wave functions and much of modern physics and math cannot be understood. see: An Imaginery Tale, The Story of sqrt(-1), Paul J. Nahin 1998, Princeton University Press, ISBN 0-691-02795-1.

An example: what pair of numbers have a sum of 2, and a product of 1 ^[121] ?

What pair of number have a sum of 2, and a product of 2? This problem had no solution till recently. When mathematicians find problems which have a solution for some range of numbers, but not for other quite reasonable looking numbers, they assume that something is missing in mathematics, rather than the problem has no solution for those numbers. The solution to this problem came with the discovery that i=sqrt(-1) behaved quite normally under standard operations (+-*/). It took some time for i to be accepted as just another number, since there was initially no obvious way to represent it geometrically. In the initial confusion, numbers using i were called imaginary (or impossible) to contrast them with real numbers. The name imaginary is still in use and is most unfortunate: imaginary numbers are no less real than numbers which represent less than nothing.

The solution: (1+i, 1-i). Add these numbers and multiply them to see the result.

e is the basis for exponential functions. It is also the basis for natural logarithms. Logarithms to base 10 (called common logarithms) are better known. Logarithms are used to substitute addition for the more complicated process of multiplication. Logarithms were a great advance allowing engineering and math multiplications involved in orbital mechanics, surveying and construction. The first log tables were constructed by hand by Napier, who spent 20 years calculating the 10⁷ logarithms to 7 significant figures. Multiplication then involved looking up logs in a book, adding them and then converting back to the number using a table of anti-logarithms. Later the slide rule allowed the same calculations to be done in seconds, provided you could accept the reduced accuracy of 3-4 signicant figures. The slide rule was used to design all the spacecraft in the era of early space exploration (including ones with human crews), as well as bridges, jet engines, aeroplanes, power plants, electronics (in fact, anything needing a multiplication). It wasn't till the mid 1970's that hand held calculators could do a faster job of multiplying than a human using tables of logarithms. Very few engineering companies could afford a computer.

Why did anyone care about orbital mechanics ^[122] ? It's hardly a topic of conversation in our time.

The great numbers are required for the modern understanding of the physical world. If you're interested in math, a landmark in your progress will be understanding Euler's identity. which you should have by the time you leave high school. If not, you can hold your teachers accountable for failing in their job.

6.8. Compound Interest

Bank interest is usually put back into the account, where the interest now draws interest. This process of allowing interest to accrue interest is called compound interest.

Write code to calculate compound interest (call it compound_interest.py). You deposit a fixed amount at the beginning of the year, with interest calculated annually at the end of the year. Variables you'll need (with some initialising values) are

principal: $0
annual_deposit: $10,000
interest_rate: 5%
age_start: 18
age_retirement: 65

Let's do the calculation for only 1 year. Start with your initial principal, add your deposit, wait a year, calculate your interest and add it to your principal.

Here's my code ^[123] and here's my output

pip:# ./compound_interest.py
principal   10500.00

Put your interest calculation into a loop (do you need a for loop or a while loop), which prints out your age and principal at the end of each year, using the start and end ages which you've initialised. Here's my code ^[124] and here's my output

pip:/src/da/python_class/class_code# ./compound_interest.py
age 19 principal   10500.00
age 20 principal   21525.00
age 21 principal   33101.25
age 22 principal   45256.31
age 23 principal   58019.13
age 24 principal   71420.08
age 25 principal   85491.09
age 26 principal  100265.64
age 27 principal  115778.93
age 28 principal  132067.87
age 29 principal  149171.27
age 30 principal  167129.83
age 31 principal  185986.32
age 32 principal  205785.64
age 33 principal  226574.92
age 34 principal  248403.66
age 35 principal  271323.85
age 36 principal  295390.04
age 37 principal  320659.54
age 38 principal  347192.52
age 39 principal  375052.14
age 40 principal  404304.75
age 41 principal  435019.99
age 42 principal  467270.99
age 43 principal  501134.54
age 44 principal  536691.26
age 45 principal  574025.83
age 46 principal  613227.12
age 47 principal  654388.48
age 48 principal  697607.90
age 49 principal  742988.29
age 50 principal  790637.71
age 51 principal  840669.59
age 52 principal  893203.07
age 53 principal  948363.23
age 54 principal 1006281.39
age 55 principal 1067095.46
age 56 principal 1130950.23
age 57 principal 1197997.74
age 58 principal 1268397.63
age 59 principal 1342317.51
age 60 principal 1419933.39
age 61 principal 1501430.06
age 62 principal 1587001.56
age 63 principal 1676851.64
age 64 principal 1771194.22
age 65 principal 1870253.93

You're depositing $10k/yr. How much (approximately) does the bank deposit the first year, the last year ^[125] ? At retirement age, how much (approximately) did you deposit, and how much did the bank deposit ^[126] ? The bank deposited 3 times the amount you did, even though it was only adding 5% interest each year. The growth of money in a compound interest account is one of the wonders of exponential growth.

If you're going this route, the payoff doesn't come till the end (and you have to start early). Rerun the calculations for an interest rate of 5%, and starting saving at the age of 20,25,30,35,40.

Table 7. Final principal at retirement as a function of starting age

start age, yrs	principal at retirement, $
18	1,840,000.00
20	1,676,851.64
25	1,268,397.63
30	948,363.23
35	697,607.90
40	501,134.54

How long do you have to delay saving for your final principal to be reduced by half (approximately) ^[127] . There is a disadvantage in delaying earning by doing lengthy studies (e.g. a Ph.D) (you may get a lifestyle you like, but at a lower income on retirement). Rerun the calculations (at 5% interest) for a person who starts earning early (e.g. a plumber) and a Ph.D. who starts earning at 30, but because of a higher salary, can save $20k/yr.

Table 8. Principal at retirement: Plumber, Ph.D.

Plumber	Ph.D.
1,840,000.00	1,896,726.45

Another property of exponential growth is that small increases in the amount of interest, have large effects on the final amount. Rerun the calculations for an interest rate of 3,4,5,6,7%. Here's my results.

Table 9. Final principal at retirement at function of interest rate

interest rate, %	principal, $
3	1,034,083.96
4	1,382,632.06
5	1,840,000.00
6	2,555,645.29
7	3,522,700.93

If your interest rate doubles (say from 3% to 6%), does your final principal less than double, double, or more than double? People spend a lot of time looking for places to invest their money at higher rates. Investments which offer higher interest rates are riskier (a certain percentage of them will fail) and the higher interest rate is to allow for the fact that some of them fail (you are more likely to loose your shirt with an investment that offers a higher rate of return).

For your upcoming presentation, prepare a table showing your balance at retirement as a function of the age you started saving; show the amount of money you deposited and the bank deposited. Show the balance as a function of the interest rate. Be prepared to discuss the financial consequences of choosing a career as a plumber or a Ph.D.

6.9. Mortgage Calculation

While the above section shows that at Ph.D. and a plumber will be about even savings-wise at retirement, there is another exponential process going against the person who does long studies: the interest you but a house on a mortgage (rather than cash, which most people don't have). (The calculations below apply to other loans, e.g. student loans.)

First, how a mortgage works: In the US, people, who want to buy a house and who don't have the cash to pay for it (most of them), take out a mortgage (http://en.wikipedia.org/wiki/Mortgage). A mortgage is a relatively new invention (early 20th century) and was popularized by Roosevelt's New Deal (see Roosevelt's New Deal http://www.bestreversemortgage.com/reverse-mortgage/from-roosevelts-new-deal-to-the-deal-of-a-lifetime/)

Previously only rich people could buy a house; the rest rented. In places like US, renters were sentenced to a life of penury and beholden to odious landlords. In other places (e.g. Europe), owning a house was never seen as being particularly desirable, most people choose to rent and presumably the relationship of a renter with the owner (more often a Bank than a person) was more amicable than in the US. In Australia, people just expect that everyone should own their house (and they do), because that's just the way that it's supposed to be, so most people take out a mortgage as soon as they start earning. In (at least Sydney where I grew up) Australia, changing jobs doesn't require you to move. The city has a great train system, and if you change jobs, you just get off the train at a different stop. A person paying off a mortgage in Sydney won't have to sell their house when they get a new job. In the US, a change of job usually requires you to sell your house, and move. Most US towns are small, and there is only ever 1 job for any person. If you loose your job in the US, you have to move. This arrangement keeps people in jobs they hate. The US job system wrecks the main feature of a mortgage, which is that you don't get any benefit of buying a house on a mortgage, unless you own it for a while (this will be explained below).

The essential features of a mortgage are

The mortgagee (the person living in the house), has a loan from the mortgagor (the person who comes up with the money)
The mortgagee agrees to pay back the loan at a certain rate (faster payoff is also acceptable).
The mortgagor holds a lien on the house in case of default on the loan by the mortgagee
The agreement expires (is dead - mortgage=="dead pledge") on full payment of the loan. In the US the loan is usually paid off in 30yrs (or less often, in 15yrs).
In Switzerland and Australia, the mortgage period is 3-5yrs. In these systems, you only pay off a small amount of the mortgage, and you have to get a new mortgage at the end of the mortgage period. Why the system is different to the US system will be explained below.

Let's see how the US system works: The median house price in the US in 2008 is 250k$. A mortgage lending company expects a deposit of at least 5% of the house price (to show that you're capable of at least a token level of saving, the actual amount is 5-20% depending on the politics and the economics of the moment, but we'll use 5% for the calculations here), does a credit check on you (makes sure you've paid off your other loans, and that you have a steady job), checks that the house is in good condition and is worth the amount of the mortgage, and then after the mortgagee pays legal fees etc, the mortgage is issued and the mortagee gets a book of coupons to fill in, one each month and an address to send the monthly check (cheque) (usually the mortgagee arranges for their bank to do it automatically).

Let's say you've just scraped the 5% deposit and fees and have got a mortgage for 95% of 250k$ (23k75$) at 6% interest/year. If you don't pay off any principal (i.e. the loan), what will be your monthly interest ^[128] ? You don't want a mortgage like this (called a Balloon Mortgage) as you're not paying off any of the principal. You'll be paying interest forever. Let's say instead that you decide each month, to pay off $100 of principal along with the interest payment. What will be your first month's payment ^[129] ? If for the 2nd month you pay off another $100 of principal, and continue to pay interest on the debt at 6%/yr, will your second month's payment be less than, the same as, or more than the first month's payment ^[130] ?

A mortgage like this will have a slowly reducing payment each month, since you will be paying interest on a reduced amount of principal. Having a different payment each month is difficult for the mortgagee to track. The accepted solution is to keep the monthly payment constant and to let the accountants (or now computers) handle the varying amounts of principal and interest each month. Using this algorithm, the mortgagee pays the same amount ($1287) each month, but in the 2nd month has an interest payment lower by $6.44. The mortgagee instead puts the $6.44 towards payment of principal. Here's the calculations for the 2nd month with a constant monthly payment.

1st month:
principal at start          =237500.00
interest on principal at 6% =           1187.00
principal                   =            100.00
1st payment (end month)     =           1287.50

2nd month:
principal at start          =236212.50 
interest on principal at 6% =           1181.06
payment of $106.44 principal=            106.44
2nd payment (end month)     =           1287.50

This schedule (constant payments) has lots of advantages for both the mortgagor and mortgagee based on the following assumptions

the mortgagee starts off paying as much as they can afford
The mortgagee's (real) salary will increase with time, meaning that after an initial period of eating peanut butter for dinner, that the mortgagee's financial pressure of paying off the mortgage will decrease. Because of this, the mortgagee will be unlikely to default in the far future (the likelihood of default in the near future having been taken care of by the pre-mortgage vetting).

The value of the house will increase with time. It's unlikely that the mortgagee will still be in the house when the mortgage is paid off (they will move before then), and the mortgagor will be paid back out of the increase in value of the house.

Let's say that the house price in 2008 = $250k. In 2018, the house is worth $350k, while the mortgagee has paid off $110k in principal. Here's the mortgagee's accounts

	           mortgage liability                  house asset               cash (check/cheque account)
	              D        C                       D        C                  D       C
	           ------------------               ---------------             ---------------
2008 
   buy house               |  250k                    250k  |                          |

2008-2018
   payments          110k  |                                |                          | 110k
   incr. value             |                          100k  |  

2018 
   sell house              |                                | 350k                350k |      
   pay-off mortage   140k  |                                |                          | 140k   
	             -------------                    -------------               -----------
balance               0k   |                            0k  |                     100k |

In 10yrs the morgagee has paid off the mortgage and walked away with 100k$ in cash (the increase in the value of the house). However the mortgagor has got their money back in 10yrs (and can loan it out again) as well as earning 110k$ in interest. The financial people will slap the mortgagee on the back and congratulate him/her for having the business acumen to wind up with 100k$ in his/her hands, just by sitting there while the value of the house appreciated. While the mortgagor will have got a great deal out of this, the mortgagee, due to the increase in costs of houses, will likely have to pay $500k for their next house.

While this isn't a great bargain for the mortgagee, it's often better than the next best choice, which is to pay rent forever, which pays off the mortgage for the owner of the rental unit.

The things that (can) go wrong with a mortgage are

People's real salaries don't increase (at least they haven't in the US). Despite the technological advances, workers real salaries (the amount of time they need to work to buy say a loaf of bread, or a car) hasn't changed since the 50's or 60's. One would think with the mechanisation of just about every industry, that the farmer would need less time to produce the wheat for a loaf of bread (or can produce more wheat for the same time), and that cars, refrigerators etc can be produced with less time. As well people are better educated now and can be expected to be more productive.
Possible reasons for this stagnation in real salaries, and in disposable income (what you have left to spend after paying for essentials, like food, rent, medical care)
- The increased productivity is not being returned to the workers. The money could instead be sent to high paid executives (the ratio of salary of the top 10% of workers to the bottom 10% is steadily increasing);
- To pay for expensive factory equipment, whose short life nullifies the increase in productivity.
- People spend money on gadgets that didn't exist 50yrs ago (ipods, computers, wide screen TVs)

(Some) people don't use their money wisely. Rich people spend their money differently to poor people: a poor person will buy an expensive SUV or PickUp truck, whose value in 5-10yrs will be $0. A rich person, who has a lot more money, will buy an adequate car, knowing that it's value will quickly go to $0, and put the rest of their disposable income into some investment; a bigger house, education, lending money to other people for mortgages, stocks, art.

Back to the mortgage calculation: There is a formula for doing these calculations (see Mortgage Calculator http://en.wikipedia.org/wiki/Mortgage_Calculator). Rather than teach you the math neccessary to understand the derivation of the formula, we'll program the computer to do the calculations by brute force.

	Note
	Mortgage calculator webpages are everywhere on the internet.

Write code mortgage_calculation.py to include the following

The name of the program, the author, date, a brief description of the functions

initialise the following variables (you can swipe this with your mouse):

	Note
	there's no magic to where these variables come from. You start with the two obvious variables, `initial_principal` and `interest`, and start writing code. As you need more variables, you add them to the initialisation section of the code (called bottom up programming) (see ???). You could of course sit down and think about the code and figure out all the variables you need from scratch (top down programming). For a piece of code this size (i.e. small) you would do it by bottom up programming. Sometimes you'll have to rename variables you've already declared (there'll be a namespace collision; you'll want to use the same name in two places). You have to sort this out on the fly by using an editor with a simple find/replace capability.

initial_principal = 250000.0    #initial value, will decrease with time
principal_remaining = initial_principal
principal_payment = 0.0

interest  = 0.06                #annual rate
interest_payment = 0.0          #monthly interest
total_interest_paid = 0.0       #just to keep track (in the US, if you itemise deductions,
	                        #the interest on a mortgage for your primary residence is tax deductable),

initial_principal_payment = 100.00        #This is the initial value of the monthly principal payment
time_period = 0

Next calculate the monthly_payment. This is fixed for the life of the mortgage. How do you calculate this ^[131] ? Check that your code runs.

print a column header using this code

#print column headers
print " time    princ. remaining     monthly_payment   principal payment    interest payment      total int paid"

Use this piece of code whenever you need to print out the monthly balances.

#print balances at the beginning of the mortgage
print "%5d %19.2f %19.2f %19.2f %19.2f %19.2f" %(time_period, principal, monthly_payment, principal_payment, interest_payment, total_interest_paid)

write code to show the balances at the beginning of the mortgage. Run your code.
write code to show the balances at the end of the first month. How do you calculate principal_payment for each month (hint ^[132] )? Run your code.
Also print out the total price paid. This is (total_interest_paid + initial_principal - principal_remaining) but there's a simpler way (hint ^[133] ).

Here's my code ^[134] and here's my output

dennis: class_code# ./mortgage_calculation.py 
 time    princ. remaining     monthly_payment   principal payment    interest payment      total int paid
    0           250000.00             1350.00                0.00                0.00                0.00
    1           249900.00             1350.00              100.00             1250.00             1250.00

Copy mortgage_calcultion.py to mortgage_calculation_2.py. Add/do the following checking that the program produced sensible output at each stage.

Put the code that calculated the balances at the end of the first month into a loop that continues till there is no principal left to pay (is this a for or while loop?).
Initially print out the results of each iteration, then change your code to only output at the end of each year.
Change initial_principal_payment so that the mortgage is paid out in 30 yrs (doesn't have to be exact, you can stop iterating when you have a balance of less than half a payment).
Add an extra column which prints out the total amount of money the mortgagee pays out.
At the top of the output (before the monthy stats), output the initial principal payment and the annual interest (as a number e.g. 0.06, or as a percent e.g.6%).

At the end, print out a line like this

interest 0.0700, monthly_payment 2248.33, initial_principal_payment 790.00, ratio (total cost)/(house price) 1.6188

Here's my code ^[135] and here's my output

dennis: class_code# ./mortgage_calculation_2.py
 initial_principal_payment              248.88 annual_interest 0.0600
 time    princ. remaining     monthly_payment   principal payment    interest payment      total int paid       total payment
    0           250000.00             1498.88                0.00                0.00                0.00                0.00
   12           246929.97             1498.88              262.91             1235.96            14916.49            17986.51
   24           243670.60             1498.88              279.13             1219.75            29643.62            35973.02
   36           240210.19             1498.88              296.34             1202.53            44169.72            53959.54
   48           236536.35             1498.88              314.62             1184.25            58482.40            71946.05
   60           232635.91             1498.88              334.03             1164.85            72568.47            89932.56
   72           228494.91             1498.88              354.63             1144.25            86413.98           107919.07
   84           224098.50             1498.88              376.50             1122.37           100004.08           125905.58
   96           219430.92             1498.88              399.72             1099.15           113323.02           143892.10
  108           214475.46             1498.88              424.38             1074.50           126354.07           161878.61
  120           209214.36             1498.88              450.55             1048.32           139079.48           179865.12
  132           203628.77             1498.88              478.34             1020.54           151480.40           197851.63
  144           197698.67             1498.88              507.84              991.03           163536.81           215838.14
  156           191402.81             1498.88              539.17              959.71           175227.47           233824.66
  168           184718.64             1498.88              572.42              926.46           186529.81           251811.17
  180           177622.20             1498.88              607.73              891.15           197419.88           269797.68
  192           170088.07             1498.88              645.21              853.67           207872.26           287784.19
  204           162089.25             1498.88              685.00              813.87           217859.96           305770.70
  216           153597.09             1498.88              727.25              771.62           227354.30           323757.22
  228           144581.14             1498.88              772.11              726.77           236324.87           341743.73
  240           135009.11             1498.88              819.73              679.14           244739.35           359730.24
  252           124846.70             1498.88              870.29              628.58           252563.45           377716.75
  264           114057.49             1498.88              923.97              574.91           259760.76           395703.26
  276           102602.83             1498.88              980.96              517.92           266292.61           413689.78
  288            90441.67             1498.88             1041.46              457.42           272117.96           431676.29
  300            77530.43             1498.88             1105.70              393.18           277193.23           449662.80
  312            63822.86             1498.88             1173.89              324.98           281472.18           467649.31
  324            49269.84             1498.88             1246.30              252.58           284905.66           485635.82
  336            33819.22             1498.88             1323.16              175.71           287441.55           503622.34
  348            17415.63             1498.88             1404.77               94.10           289024.48           521608.85
  360                0.31             1498.88             1491.42                7.46           289595.67           539595.36
 ratio (total cost)/(house price) 2.1644

A few things to notice

How much does your 250k$ house cost you at 6% interest ^[136] ?
How long does it take for you to half own your house at 6% interest ^[137] ?
The average US house owner is in their house 5-7yrs, moving 11 times in their life (see The Evolution of Home Ownership http://www.homeinsight.com/details.asp?url_id=7&WT.cg_n=Publications&WT.cg_s=0&GCID=bhph1). How much of your house do you own after 5yrs, how much interest have you paid ^[138] ? America (well business interests) take great pride in the mobility of its work force. People are prepared to move towns for better jobs (or be faced with having no job at all). If you move every 5-7yrs, all you're doing is paying interest to a mortgage company and not building up equity. You should move to a place where, if you have to change jobs every 5yrs, you won't have to move. This means a big city with a good public transport system and an economy that has stable jobs.

Run the code, changing the interest rates (5,6,7,8%) and the monthly payment, to get a mortgage of 30,15,10 and 5yrs. Find the cost/month and total price paid/price of house at purchase. Here's my results.

Table 10.

Mortgage Calculations, 250k$ house: interest rate, loan period

monthly payment, initial principal payment, (total price)/(value of house)

interest rate,%	30yr	15yr	10yr	5yr
3	$1050, $425, 1.53	$1725, $1100, 1.25	$2405, $1780, 1.16	$4475, $3850, 1.09
4	$1193, $360, 1.72	$1843, $1010, 1.33	$2533, $1700, 1.22	$4633, $3800, 1.11
5	$1341, $300, 1.93	$1976, $935, 1.43	$2651, $1610, 1.27	$4716, $3675, 1.15
6	$1500, $250, 2.16	$2110, $860, 1.52	$2776, $1526, 1.33	$4834, $3584, 1.16
7	$1663, $205, 2.39	$2248, $790, 1.62	$2903, $1445, 1.39	$4950, $3492, 1.19
8	$1834, $168, 2.64	$2390, $723, 1.72	$3033, $1367, 1.45	$5066, $3400, 1.24

	Note
	For some perspective: In the US, a good rate for 30yr mortgage, in the 1990-2000's has been 6%. In the US, the mortgage rates during the stagflation of the 1980s was upto 14%. Short term mortgages (3-5yrs) are common in Switzerland and Australia, but are not the norm for standard house buyers in the US. At the end of a short term mortgage, you expect to take out another mortgage, at a different rate, till you pay off your house. In the US, car loans are for about 5yrs. In this case, divide all the above numbers by a factor of 10 (or so). The problem with taking out a car loan, it that the value of the car drops by 25% just driving it out of the car lot. You'll owe money on selling the car, right till the end of the car loan.

A few things to note (in the US a typical house mortgage is 30yrs, a typical car loan is 5yrs)

For long loans (30yrs), most of your payments are interest (at 8%, you pay in interest an extra 165% of the price of the loan)
For short loans (5yrs), most of your payments are principal (at 8%, you pay in interest an extra 25% of the price of the loan)
Because most of the payments for a long loan are interest, the total payment is strongly affected by the interest rate. For a loan of 30yrs, increasing the interest from 5 to 8% (an increase in interest rate of 60%) increases the total payment by 70%.
For a short loan (5yrs), because most of your payments are principal, increasing the interest rate by 60%, from 5 to 8%, only increases the total payment by 9%.
You don't have a lot of choice about the interest rate. The interest rate varies as a function of the health of the economy. Your only choice (if you think the interest rate is too high) is to keep renting.

I've been the mortgagee on several loans. It was obvious that as the interest rate goes up, the monthly payment goes up, but I didn't realise that the initial principal payment goes down. Here's the data for two 30yr loans, the first at 5%, the second at 8%

	                 5%                                    8%
 time    monthly_payment   principal payment   monthly_payment   principal payment 
    0            1341.67                0.00           1833.67                0.00       
   12            1341.67              314.04           1833.67              179.66 
   24            1341.67              330.11           1833.67              194.58
   36            1341.67              347.00           1833.67              210.72
   48            1341.67              364.75           1833.67              228.21
   60            1341.67              383.41           1833.67              247.16
   72            1341.67              403.03           1833.67              267.67
   84            1341.67              423.65           1833.67              289.89
   96            1341.67              445.32           1833.67              313.95
  108            1341.67              468.10           1833.67              340.00
  120            1341.67              492.05           1833.67              368.23
  132            1341.67              517.23           1833.67              398.79
  144            1341.67              543.69           1833.67              431.89
  156            1341.67              571.51           1833.67              467.73
  168            1341.67              600.74           1833.67              506.55
  180            1341.67              631.48           1833.67              548.60
  192            1341.67              663.79           1833.67              594.13
  204            1341.67              697.75           1833.67              643.44
  216            1341.67              733.45           1833.67              696.85
  228            1341.67              770.97           1833.67              754.69
  240            1341.67              810.42           1833.67              817.33
  252            1341.67              851.88           1833.67              885.16
  264            1341.67              895.46           1833.67              958.63
  276            1341.67              941.27           1833.67             1038.20
  288            1341.67              989.43           1833.67             1124.37
  300            1341.67             1040.05           1833.67             1217.69
  312            1341.67             1093.26           1833.67             1318.76
  324            1341.67             1149.20           1833.67             1428.22
  336            1341.67             1207.99           1833.67             1546.76
  348            1341.67             1269.80           1833.67             1675.14
  360            1341.67             1334.76           1833.67             1814.17

While at the higher interest rate, the initial payment of principal is about 50% lower, at the end of the loan, you're paying about 50% more in principal each month. The 5% loan is half paid out at 20yrs, while the 8% loan is half paid out in 22yrs.

If you're in your house for 5-7yrs (let's say 6yrs, and let's say you're paying at a historically low interest rate of 6%), how much equity do you have in the house (i.e. how much of the house have you paid off, how much do you own) and how much interest have you paid ^[139] ? When you sell the house, you only keep 20% (22/(22+86)) of the money you paid out. The rest goes to the mortgage company in interest. If you stay in your house for only 6yrs, you're not much better off than if you were renting. (You can assume that the rental price is about 80% of the monthly mortgage rate. The quantity measured is called the "price to rent ratio" and is the number of months of rent that you would have to pay to own the house.)
Moving is an expensive proposition: you have to pay real estate agent fees, down time while you move, the expense of moving your possessions and then there's the personal disruption of finding a new doctor, dentist, car mechanic, school(s), friends, people who have similar interests/hobbies. It takes about 2yrs to recover from a move. People living in a society where they (have to) move every 5-7 yrs are working to enrich the banks, moving companies and real estate agents. And what for? Was there a real good reason to move for a job? Why not have jobs in the place you're living? The rest of the world (outside the US) doesn't find this difficult to arrange. In those countries the citizens pay (via taxes) for train systems, so that they don't have to move if the new job is across town. This is a lot cheaper and less disruptive than moving. Why should you have to go to a new school because one of your parents got a new job?
You should live in a society, where workers benefit from their earnings, rather than having to pass it on to other people. Requiring people to move is good for the economy (lots of money is spent), but not good for the person: there is no wealth creation.

How does the interest paid on a mortgage make life easier for the plumber than for the Ph.D? The Ph.D. starts earning later (30yrs old) and is doing well to have paid off their house by the time they retire. Usually the Ph.D. has to move several times, and spends a lot of time paying off interest early in each mortgage. The plumber (hopefully) has a stable business, doesn't move and starts earning when they're 20. The plumber will own his house well before he retires.

Why is it possible to get a 30yr fixed rate mortgage in USA, whereas in other countries, 3-5yrs is the norm? How can banks in USA forecast the country's economy for 30 years into the future and know that they'll make a profit from the loan, whereas no bank in the rest of the world is prepared to make a loan for more that 3-5yrs? The answer is that the banks in USA are no more capable of predicting the economic future than anyone else. In USA, the banks know that the average person stays in their house for 5-7yrs and then has to move. Almost no-one in USA stays in their house for 30yrs. The 30yr fixed rate mortgage is really a 5-7yr mortgage. In the rest of the world, people don't have to move when their job changes, and the banks can't write mortgages for longer than they can forecast the state of the economy.

Similarly, in the US, credit card companies don't have annual fees on their standard credit cards. Why not? Enough people don't pay off their credit cards at the end of the month, incurring interest at rates of upto 37%. The interest on unpaid balances is enough to keep the credit card companies in good financial shape. In the rest of the world, people pay off their balance every month as so credit card companies charge annual fees for the credit card.

With people not trusting other forms of investment (e.g. the stock market), the pressure to buy a house forces up the prices of houses. If for any reason (lack of credit, downturn in the economy) people stop buying houses, then the price of houses crashes.

6.10. Refinancing a Mortgage

With a 30 year fixed rate mortgage being so long, you might find during the time of the mortgage, that the interest rate for the new loans has dropped and is less than the rate you're paying for yours. In this case, you can do what's called a refinance. You get a new mortgage for the remaining amount of principal. This mortgage pays off your old mortgage, so the previous lender gets their money, and you get to pay out on the new mortgage at a lower rate.

Sounds great huh? The wrinkle is fees. You have to put up some money to get a mortgage and then to get a new mortgage. Someone (your mortgage broker) has to check that your house is in good condition, that it's worth at least the principal left on the current mortgage, that you are still in good financial standing, go register deeds and do a whole lot of paper work. (You have to sign a whole lot of paper work too. It takes a couple of hours in a lawyer's office, but they do all the work, you just sign and pay the money.) So while you may be getting a better deal with your refinancing of the mortgage, it will cost you money. Let's find out if it's worth refinancing.

Let's say you've been in your house for 5 yrs (i.e. you have 25 yrs left on your current mortgage at 5.25%) and the interest rate drops to 4.75%. The balance left on your mortgage is 160k$ and fees on a refinance are $3,200. Should you refinance?

Some quick numbers: with a decrease in interest of 0.5%, how much will you save in your first year ^[140] ? You save about $70/month. How long will it take for your new loan to pay off the fees ^[141] ? You've given your mortgage broker $3200 and you're going to get it back through a lower interest rate at $800/yr. If you aren't going to be in your house for at least 4yrs, you've lost money.

Let's say you're going to be in your house for at least 4yrs. Lets do the numbers more closely. The mortgage broker will be giving you a new 30yr mortgage at 4.75%. Let's compare the total amount of money you'd pay in the remaining time (25yrs) on your current 160k$ mortage at 5.25% and the total amount you'd pay on the new lower interest (4.75%) mortgage for 30yrs. The usual thing to do with the fees is to add them to the loan. This assumes you'd rather keep your $3,200 and invest it in something that gets a better return than 4.75% and instead you borrow $3,200 from the mortgagee and pay it back at 4.75%. So your 4.75% loan is for 163k$. Run your mortgage calculator (get the principal right to the nearest $). Here's my results

Table 11. Comparison of 4.75% 30yr and 5.25% 25 yr mortgage on 160k$

Mortgage amount	interest	time	payments initial princ.	monthly	total
160k$	5.25%	25yr	$259	$959	288k$
163k$	4.75%	30yr	$205	$850	306k$
savings				$110	-18k$

You're saving you $110/mo. This is not too bad, but look at the last column, the one that you're interested in if you're going to stay a long time (which you should if you buy a house). The house is going to cost you 18k$ more (some of that is the extra 3k$ in fees, plus interest). You've got a short term benefit in exchange for a long term loss. Why? You've forgotten that the longer the mortgage the more you pay in interest. You're comparing a 25yr loan with a 30yr loan. The 30yr loan will cost you more, almost no matter what the interest rate. You would have noticed a drop in monthly payments if you'd changed from a 25yr mortgage to a 30yr mortgage at the same interest rate. You have to compare mortgages of the same period. The only way you're going to get ahead with refinancing is if you keep the same period or pay at the same rate as before. The mortgage broker can only offer you a 30yr loan (not a 25yr loan), but you're allowed to pay off extra money any time you want. You can set your monthly payments to have the mortgage finish in 25 yrs if you like.

Instead look at the results if you pay the new mortgage for 25yrs or you start paying at $259/mo. Note the results are different if you're looking for short term benefit (monthly payment) or long term benefit (total cost of the house).

Table 12. Comparison of 5.25% 25 yr mortgage for 160k$ with refinanced mortgage at 4.75%

Mortgage amount	interest	time	payments initial princ.	monthly	total
160k$	5.25%	25yr	$259	$959	288k$
163k$	4.75%	26yr	$259	$904	282k$
163k$	4.75%	25yr	$284	$929	278k$
savings (best long term)				$30	10k$

You're paying 10k$ less for the house, which is good, but notice the short term gain isn't much. You're saving $30/mo: it will take how long to pay off the refinancing fees ^[142] ? If you're looking for a long term advantage when refinancing, don't expect it to help in the short term.

A millstone around your neck here is the fees you paid the mortgage broker. You can pay extra into the mortgage any time you like. How about instead of refinancing, you just put the $3200 towards your principal? Look at your 25yr 5.25% mortgage reduced by 3k$

Table 13. Comparison of 5.25% 25 yr mortgage for 160k$ with same mortgage reduced by 3k$

Mortgage amount	interest	time	payments initial princ.	monthly	total
160k$	5.25%	25yr	$259	$959	288k$
157k$	5.25%	24+yr	$259	$945	272+8=280k$
savings				$14	8k$

After depositing 3k$, your mortgage ends in middle of the last year. At the end of the 24th year, you've paid out 272k$ and have another 8k$ to pay. Paying off 3k$ now, saves you 8k$ over the life of the mortgage. How did 3k$ become 8k$? Calculate the result of 25yrs of 5.25% interest on 3k$ ^[143] . I don't know why one answer is 11k$ and the other answer is 8k$. Rerun the calculation to find the interest rate that gives 8k$ principal after 25yrs ^[144] . A balance of 8k$ corresponds to an interest rate of 4%. At 0.5% drop in interest rates, there's only a 2k$ difference between refinancing and putting the refinancing fees into your current mortgage.

Here's the summary

refinance 0.5% lower for 30yrs
- short term benefit: $110/mo, recoup fees in 4yrs
- long term loss: 18k$ more in total payment
refinance 0.5% lower, pay at 25yr rate
- short term benefit: $30/mo, recoup fees in 9yrs
- long term benefit: 10k$ lower total payment
do not refinance, put 3k$ fees into principal in current mortgage
- short term benefit: none
- long term benefit: 8k$ lower total payment

The rule of thumb is that you don't benefit from a refinance unless the interest rate drops 1%. Whether this rule applies to you, depends on whether you expect to be out of the house in the short term or still in it when the mortgage is paid out.

6.11. Moore's Law: effect on calculation of e=(1+1/n)^n

	Note
	I wish to thank my co-worker, Ed Anderson, for the ideas which lead to this (and the next few) sections.

In 1965 Gordon Moore observed that starting from the introduction of the integrated circuit in 1958, that the number of transistors that could be placed inexpensively on an integrated circuit had doubled about every 2yrs. This doubling has continued for 50yrs and is now called Moore's Law (http://en.wikipedia.org/wiki/Moore%E2%80%99s_law). Almost every measure of the capabilities of digital electronics is linked to Moore's Law: processing speed, memory capacity, number of pixels in a digital camera, all of which are improving at exponential rates. This has dramatically improved the usefullness of digital electronics. A (admittedly self serving) press release says that the new Cray at the CSC Finnish Center for Science (http://www.marketwire.com/press-release/Cray-Inc-NASDAQ-CRAY-914614.html) will take 1msec to do a calculation that the center's first computer, ESKO, installed half a century earlier, would have taken 50yrs to complete.

The conventional wisdom is that Moore's Law says that the speed of computers doubles every 2yrs, but Moore didn't say that; Moore's Law is about the number of transistors, not speed. Still the speed of computers does increase at the Moore's Law rate. What Moore is describing is the result of being able to make smaller transistors. A wafer of silicon has certain irreducible number of crystal defects (e.g. dislocations). Multiple copies (dozens, hundreds, thousands, depending on their complexity) of integrated circuits (ICs) are made next to each other in a grid on a silicon wafer. Any IC that sits over a crystal defect will not work and this will reduce the yield (of ICs) from the wafer driving up costs for the manufacturer. Since on average the number of defects in a wafer is constant, if you can make the ICs smaller, thus fitting more ICs onto the wafer, then the yield increases.

Transistors and ICs are built onto on a flat piece of silicon. What is the scaling of the number of transistors (or ICs) if you reduce the linear dimensions of the transistors by n ^[145] ? The thickness (from top to bottom looking down onto the piece of silicon) decreases by the factor of n too, but since you can't built transistors on top of each other (well easily anyway) you don't get any more transistors in this dimension.

One consequence of making ICs smaller, is that the distance that electrons travel to the next transistor is smaller. With the speed of light (or the speed of electrons in silicon) fixed, the speed of the transistors increased. As well the capacitance of gates was smaller, so that less charge (less electrons) was needed to turn them off or on, increasing the speed of the transition. Silicon wasn't getting any faster, just the transistors were getting smaller. Manufacturers were driven to make smaller transistors by the defects in silicon. The increase in speed was a welcome side effect.

Let's return to the numerical integration method for calculating π to 100 places, which for a 1GHz computer would take about 10^100-9=91secs or 10⁷⁴*ages of the Universe, and see if Moore's Law can help. Let's assume that Moore's Law continues to hold for the indefinite future (we can expect some difficulties with this assumption; it's hard to imagine making a transistor smaller than an atom) and that for convenience the doubling time is 1yr. Now at the end of each year (31*10⁶=10^7.5secs), instead of allowing the calculation to continue, you checkpoint your calculation (stop the calculation, store the partial results) and take advantage of Moore's Law, by transferring your calculation to a new computer of double the speed. How long does your calculation take now?

Write code (moores_law_pi.py) that does the following

Has the name of the file, the author, and a brief description of the purpose
Initiallises the following variables
- year = 10**7.5 #secs
- doubling_time = year
- time = 0
- iterations_required = 10**100 #iterations
- iterations_done = 0 #iterations
- initial_computer_speed = 10**9 #iterations/sec (it's actually clock speed. iterations/sec will be down by 10-100 fold, assuming 10-100 clock cycles/iteration. This is close enough for the answer here.)
- computer_speed = 0 #iterations/sec
have the code calculate the number of iterations done in the first year and print out the time (in years) and the number of iterations done. Use %e to print out the number of iterations in scientific format.

Here's my code ^[146] and here's the output.

pip:# ./moores_law_pi.py
elapsed time (yr)     1, iterations done 3.162278e+16

You now want your code to loop, with the speed of the machine doubling each year.

will this be a while or a for loop?
Will you have a loop variable? How will you update time?
how will you test that the loop should exit?
what code will be in the loop? (You have to update something, calculate something, store and print out the result.)
how will you change the speed of the computer each year?

Here's my code ^[147] and here's the last line of my output.

elapsed time (yr)   278, iterations done 1.535815e+100

This is a great improvement; instead of 10⁷⁴ ages of the Universe, this only takes 278yrs. You can easily finish the calculation in the age of a single Universe and have plenty of time to spare.

50% more calculations were done than asked for, because we only check at the end of each year and we reached our target part way through the year. Assuming that the calculation (i.e. 10¹⁰⁰ iterations), finished exactly at the end of the year, when was the job half done ^[148] ? If you'd started the job at the end of the 278th year, how long would the job have taken ^[149] ?

6.12. Optimal Slacking and the Futility of Interstellar Space Travel

If you have a process whose output increases exponentially with some external variable, then you don't start till the latest possible moment. In The Effects of Moore's Law and Slacking on Large Computations (http://arxiv.org/pdf/astro-ph/9912202) Gottbrath, Bailin, Meakin, Thompson and Charfman (1999) shows that if you have a long calculation (say 3 yrs) and in 1.5yrs time you will have computers of twice the speed, then it would be better from a productivity point of view to spend the first 1.5yrs of your grant surfing in Hawaii, and run the calculation at twice the speed in the 2nd 1.5yrs.

Boss: "Gottbrath! Where the hell are you and what are you doing?"

Gottbrath: "I'm working sir!"

If you purchase a computer with speed as a requirement, you don't purchase it ahead of time (like 6 months or a year ahead) as it will have lost an appreciable fraction of its Moore's Law speed advantage by the time you put it on-line. With government computer procurements taking a year or so, there is no point in comparing computers with a 25% speed difference; they will lose half their speed while waiting for the procurement process to go through.

Interstellar space travel will take centuries, while back on earth there will be exponential improvements in technology. The result of this will be that the first party to leave on an interstellar trip will be overtaken in 100yrs time by the next party. Figure 1 in the Gottbrath paper shows the effect of delaying starting a job, when the computers double in speed every 18months (Moore's Law time constant). The jobs starting later all overtake the early jobs. This graph could be relabled "the distance travelled through space by parties leaving earth every 5 months", using the exponentially improving (doubling every 18mo) space travel technology. The curves all cross eventually. The first party to arrive at the interstellar destination will be the last to leave. Thus the futility of interstellar space travel.

6.13. Geometric Progressions: Summing the series

Since this is a computer class, I had you write a computer program to calculate the time taken for the Moore's Law improvement on the π calculation. However you could have done it your head (a bit of practice might be needed at first). The speed of the Moore's Law computer as function of time is a Geometric Progression (http://en.wikipedia.org/wiki/Geometric_sequence). Example geometric progressions:

1,2,4,8,16,64

1,0.5,0.25,0.125...

A geometric series is characterised by the scale factor (the initial value, here 1) and the common ratio (the multiplier) which for the first series is 2, and for the 2nd series is 0.5. Geometric progressions can be finite, i.e. have a finite number of terms (the first series) or infinite, i.e. have an infinite number of terms (the second series). When the common ratio>1, the sum of an infinite series diverges (exponential growth), i.e. the sum becomes infinite; when the common ratio<1, the sum of an infinite series converges (exponential decay), i.e. the sum approaches a finite number.

example: exponential growth: What is the sum of the finite series 1+2+4+...+128 (hint: reordering the series to 128+64+...+1=11111111₂=0xff=ffh) ^[150] ? What's the sum of the finite series: 2⁰+ 2¹+ 2²+...+ 2³¹ ^[151] ? What is the sum of the finite series 10⁰+10¹+10² ^[152] ?

example: exponential decay: (from rec.humor.funny http://www.netfunny.com/rhf/jokes/08/Nov/math_beers.html) An infinite number of computer programmers go for pizza. The first programmer asks for a pizza, the next programmer asked for half a pizza, the next for a quarter of a pizza... The person behind the counter, before waiting for the next order yells out to the cook "x pizzas!" (where x is a number). What was the value of "x" (what is the sum of the infinite series 1+0.5+0.25+...) ^[153] ? The question rephrased is "what is the value of 1.111111..₂?"

Let's derive the formula for the sum of a geometric series

sum = ar^0 + ar^1 + ar^2 + ... + ar^n

where r=common ratio (and r⁰=1); a=scaling factor. Multiplying all terms by r and shifting the terms one slot we notice

  sum = ar^0 + ar^1 + ar^2 + ... + ar^n
r*sum =        ar^1 + ar^2 + ... + ar^n + ar^(n+1)

Can you do something with these two lines to get rid of a whole lot of terms (this process is called telescoping)? Hint ^[154] . This leads to the following expression for the sum of a geometric series (for r!=1).

sum = a(r^(n+1)-1)/(r-1)

(For r<1, you can reverse the order of the two terms in each of the numerator and denominator.) Using this formula what is the sum of the finite series 1+2+4+8+...+128 ^[155] ?

From the formula, what is the sum of the infinite series 1+1/10+1/100+ ... ^[156] ? Note: the sum can be written by inspection in decimal as 1.111.... although it's not so obvious that this 1/0.9.

From the formula, what is the sum of the infinite series 1+2/10+4/100+ ... ^[157] ?

From the formula, show that the sum of an infinite series, with (r<1) is always finite ^[158] .

6.14. Geometric Progressions: chess and the height of a pile of rice

	Note
	The version of the story quoted here comes from George Gamow's book "One Two Three...Infinity". The book came out in 1953 and is still in print. It's about interesting facts in science and I read this book in early high school (early '60's). In 2008 I still find it interesting. When I first looked up this story, I saw it in the book on π (and not from Gamow's book), where I found that the grain involved was rice and I did my calculations below on the properties of rice. It seems that the original story was about wheat. Rather than redo my calculations, I've left the calculations based on the properties of rice (the answer is not going to be significantly different).

According to legend, the inventor of chess, the Grand Vizier Sissa Ben Dahir was offered a reward for his invention, by his ruler King Shirham of India. The inventor asked for 1 grain of rice for the first square on the board, 2 grains of rice for the 2nd square on the board, 4 grains for the 3rd square... The king, impressed by the modesty of the request, orders that the rice be delivered to the inventor. (For a more likely history of chess see history of chess http://en.wikipedia.org/wiki/Chess#History.) How many grains of rice will the inventor receive ^[159] ?

If the Royal Keeper of the Rice counts out the grains at 1/sec, how long will he take (in some manageable everyday unit, i.e. not seconds) ^[160] ? Clearly the inventor would prefer not to have to wait that long, but understanding exponential processes, it's likely that he'll suggest to the Royal Keeper of the Rice that for each iteration, that an amount of rice double that of the last addition be added to the pile (this is not part of the legend). How long will it take the Royal Keeper of the Rice to measure out the rice now ^[161] ? The lesson is that if you handle exponential processes with linear solutions, you will take exponential time (your solution will scale exponentially with time); if you have exponential solutions, you will take linear time (your solution will scale linearly with time). (Sometimes exponential solutions are available e.g. the factorial series for π, sometimes they aren't.)

How high is the pile of rice? First we'll calculate the volume. To find the height, we need to know the shape of the pile. The rice will be delivered by the Royal Conveyor Belt and will form a conical pile. The formula for the volume of a cone is V=πr²h/3. The only variable for a cone of grains, poured from one point above, is the angle at the vertex, which will allow us to find the height of the pile of known volume.

The height of the pile can be done as a back of the envelope calculation. It doesn't have to be exact: we just need to know whether the answer is a sack full, a room full, or a sphere of rice the size of the earth. We have the number of grains; we want the volume. There's some more data we need. We can lookup the number of grains in a bag of rice. Rice is sold by weight. Knowing the weight, we can calculate the number of grains/weight. However we want the number of grains/volume, so we need the weight/volume (the density) of rice. Going on the internet we find the missing numbers

number/weight. There are 29,000 grains of long grain rice in a pound (http://wiki.answers.com/Q/How_many_grains_of_rice_are_in_a_one_pound_bag_of_long_grain_rice). A pound is about 0.5kg, so that's 60kgrains/kg.
weight/volume: From Mass, Weight, Density or Specific Gravity of Bulk Materials (http://www.simetric.co.uk/si_materials.htm) the relative density (specific gravity) of rice is 0.753. Density relative to what? Water. Water (by definition) has a density of 1kg/litre=1000kg/cubic metre. The density of rice then is 753kg/m³ (a bit less than a ton/m³). Objects, with a density less than water, float (in water), while objects with a density more than water will sink. As a check on this value for the density, does uncooked rice sink or float when you throw it into a saucepan of water? What's going on? Here's my explanation ^[162] .

We want volume/number; we have the weight/volume (let's say relative density is 0.75) and the number/weight (60,000/kg). How do we get what we want? Using dimensional analysis, the dimensions on both sides of the equation must be the same.

volume/number= 1.0/((number/weight)*(weight/volume))	#the dimension of weight cancells out
units?       = 1.0/( number/kg     * kg/m^3 )
	     = 1.0/( number/m^3 )
	     = volume/number

volume/grain = 1.0/(60000*750)

The units of our answer are going to be in the units of the quantities on the right, i.e. m^3/grain
pip:# echo "1.0/(60000*750)" | bc -l
.00000002222222222222	#m^3/grain

Let's turn this into more manageable units, 
by multiplying by successive powers of 10^3 till we get something convenient.

pip:# echo "1.0*10^9/(60000*750)" | bc -l
22.22222222222222222 	#nm^3/grain (nano cubic metres/grain)

Can we check this number to make sure we haven't gone astray by a factor of 10³ or so? What is the side of a cube of volume 1 n(m³) (this is a cube of volume 10^-9 cubic metres, not a cube of side 1nm=10^-9m.)? The length of the side is cuberoot(10^-9m³)=10^-3m=1mm. A rice grain, by our calculations, has a volume of 22 cubes each 1mm on a side. Does this seem about right? We could make the volume of a grain of rice, by lining up 22 of these volumes in a line (11mm*1mm*1mm) or 20 of them by lining up 5mm*2mm*2mm. This is about the size of a grain of rice, so we're doing OK so far.

We've got 2⁶⁴ of these grains of rice. What's the volume of the rice ^[163] ? What's the weight of this amount of rice using density=0.75tons/m³ ^[164] ?

The next thing to be determined is the shape of the cone. Mathematicians can characterise a cone by the angle at the vertex. For a pile of rice (or salt, or gravel), in the everyday world, the angle that's measured is the angle the sloping side makes with the ground (called the angle of repose http://en.wikipedia.org/wiki/Angle_of_repose). If the rice is at a steeper angle, the rice will initiate a rice-slide, restoring the slope to the angle of repose. A talus (scree) slope is a pile of rocks at its angle of repose. A talus slope is hard to walk up, as your foot pressure pushes on rocks which are at their angle of repose. Your foot only moves the rocks down the slope and you don't get anywhere (see the photo of scree http://en.wikipedia.org/wiki/Scree). An avalanche is caused by the collapsing of a slope of snow at its angle of repose. A sand dune (http://en.wikipedia.org/wiki/Dune, look at the photo of Erg Chebbi) in the process of building, will have the downwind face at the angle of repose for sand (this will be hard to walk up too).

Rice still with its hulls forms a conical pile with an angle of repose of about 45° (for image see rice pile crop http://www.sas.upenn.edu/earth/assets/images/photos/rice_pile_crop.jpg). For hulled rice (which the inventor will be getting) the angle of repose The Mechanics and Physics of Modern Grain Aeration Management is 31.5° or Some physical properties of rough rice (Oryza Sativa L.) grain. M. Ghasemi Varnamkhastia, H. Moblia, A. Jafaria, A.R. Keyhania, M. Heidari Soltanabadib, S. Rafieea and K. Kheiralipoura 37.66 and 35.83°.

For convenience, I'm going to take an angle of repose for rice of 30°. What is the relationship between the radius and height of a cone with angle of repose=30°?

We need to take a little side trip into ???. Come back here when you're done.

The formula for the volume of a cone (V=πr²h/3) involves the height and the radius. When you make a cone by pouring material onto the same spot, the height and radius of the cone are dependant (i.e. they are always determined by the angle of repose). In this case, we can use a formula for the volume of a cone that uses either the radius or the height, together with the angle of repose.

If we know the angle of repose, what is the trig function that tells us the ratio of the height to the radius of the cone ^[165] ? Let's say tan(angle_repose)=0.5 then

h = tan(angle_repose)*r
h = 0.5*r

substituting this relationship into the formula for the volume of a cone gives
V = pi*r^2*0.5*r/3
  = 0.5*pi*r^3/3

in general terms
V  = pi*tan(angle_repose)*r^3/3

What is the ratio height/radius for a cone with an angle of repose of 30° (you can do this with a math calculator or with python)? Python uses radians for the argument to trig functions. You turn radians into degrees with degrees() and degrees into radians with radians().

dennis:/src/da/python_class# python
Python 2.4.3 (#1, Apr 22 2006, 01:50:16) 
[GCC 2.95.3 20010315 (release)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from math import *
>>> radians(30)
0.52359877559829882
>>> degrees(0.52359877559829882)
29.9999999999999996	#note rounding error in last place in real representation
>>> tan(radians(30))
0.57735026918962573
>>>

For a cone of rice, h/r=0.577. For a back of the envelope calculation we can make h/r=0.5 (or even 1.0; it's not going to make much difference to the height of the pile of rice). If you want a more accurate answer, you can come back later and use the real numbers. What angle has tan(angle)=0.5 ^[166] ? It's somewhere between 26 and 27° The inverse trig functions allow you to find the angle for which a number is the tan(). The inverse of tan() is called either tan^-1(), arctan() or atan() (the name in python).

>>> atan(0.5)
0.46364760900080609	#angle in radians
>>> degrees(atan(0.5))
26.56505117707799	#angle in degrees

giving an angle of repose of 26.5°

The cone of rice, from its angle of repose (in the section on trigonometry), has r=2h. For such a cone V=πr²h/3=(4π/3)h³. If the rice is presented to the inventor as a conical pile of rice, what is its height ^[167] ? For comparison, how high do commercial airliners fly ^[168] ? How high is Mt Everest ^[169] ?

The radius of the cone of rice with an angle of repose of 30° is twice the height. What is the diameter of the base of the pile of rice ^[170] ?

What is the world's annual production of rice (look it up with google)? How many years of the world's rice production would be needed for the reward ^[171] ?

6.15. Geometric Progressions: Time for the Moore's Law aided calculation of e=(1+1/n)^n

Let's go back to the program which calculated the time to determine the value of π, when the computer speed increased with Moore's Law. The number of iterations/yr is a geometric series with scaling factor=10^7.5*10^9=10^16.5, common ratio=2. You didn't need to write a program to calculate the sum. Using the formula for the sum of a geometric series, what is the sum of n terms of this geometric series ^[172] ? What is n if sum=10*100? You will need a calculator - python will do. Make a few guesses about x to calculate values for 2**x/10**83.5 which in turn will give the expected number of iterations. Here's my calculation ^[173] . The answer is that you will need 277 years of calculating, with the value for π arriving at the start of the 278th year.

Note

If you know logs:

pip:~# echo "83.5*l(10)/l(2)" | bc -l
277.38

ie
10**83.5=2**227.4

(I haven't been clear about the fencepost problem. You now need to decide if the answer is 277 or 278 years.)

6.16. Geometric Progressions: creating money by multiplication

Let's differentiate wealth and money

wealth: a society has wealth when it has educated healthy citizens, public transport, hospitals, schools, libraries, roads, food, housing, jobs, a functioning and trusted legal system and government, a stable economy, peace and a low crime rate.
money: money is a form of exchange for goods and services.

Where does money come from? As Galbraith says ("The Age of Uncertainty, John Kenneth Gailbraith (1977), Houghton Miflin, ISBN 0-395-24900-7 - anything by Galbraith is worth reading, not only for the substance, but for the style in which he communicates) on p 167, it's created from nothing.

Galbraith credits the mechanism for the creation of money to the Amsterdam bankers (bottom of p 166) in 1606, although the principle was known earlier. The theory of the creation of money (multiplication) is a standard part of Macroeconomics. The best description I found on the web is at The Banking System and the Money Multiplier (http://www.colorado.edu/Economics/courses/econ2020/section10/section10-main.html), which I'm using here.

Banks make loans, from which they earn interest. This is obvious to everyone. In the process, banks also create money. How this works is not quite so obvious. Let's assume that everyone banks in the same or linked banking system (for convenience, let everyone bank at the same branch of a bank). Warren Buffett deposits 1M$ in the bank. The bank has an asset of 1M$ and a liability to Warren Buffett of 1M$. Let's look at the bank's balance sheet.

Programmer's Bank of N.C.

  Assets	                 Liabilities
  D    C                          D    C
-----------                      -----------
 1M  |       cash                    | 1M    Warren Buffett

	Note
	Double entry accounting rules: Each block here is called an `account`. In this case, the bank has an asset account and a liability account. When money moves, the same amount of money must appear in two accounts, in opposite columns. In accounting, the left hand column is called "debit" and the right hand column "credit". These two names are just opposite labels, like positive and negative, up and down, left and right. You always have credits and debits in equal amounts. In accounting money is not created or destroyed. It just moves. For more about accounting see Gnucash and Double Entry Accounting (http://www.austintek.com/gnucash/).

The bank has 1M$ on hand in cash and has a liability to (owes) Warren Buffett for 1M$.

Note

The bank's net worth is $0 However they have 1M$ cash on hand, which they can use to make money. As long as the bank makes enough money to pay Warren Buffett his monthly interest (which will confirm his wisdom in choosing to leave his money with this bank), the bank stays in business. Warren Buffett is in no hurry to close out his account (retrieve his money).

Surprisingly much of the economy (people, businesses) has little net worth or is in debt and everyone has agreed that this is just fine. I started off with a 5% down mortgage, where I was in debt to the tune of 95% of the value of my house. Many businesses have receive and pay out money at different rates throughout the year; sometimes having lots of cash and sometimes having less than none. These businesses have a line of credit with a bank, to even out the money flow, allowing them to make payroll and expenses at all times of the year. Of course you have to pay interest on the loans, but it makes the business possible, and allows people to buy a house rather than rent.

Let's say the bank makes some really bad decisions and looses all of Warren Buffett's money. Who's in trouble? Not the bank - they have no money. What about the bank's officers? Their houses and personal money aren't involved in the bank. Only Warren is in trouble. He's lost 1M$. Admittedly, the bank's officers won't be able to show their faces in town (they'll leave), but that's about as bad as it gets. In the US there's no moral outrage about this sort of thing. The bank's officers, using the same skills they used to convince you to deposit your money with them, will get jobs somewhere else. To handle this situation, in the US, the FDIC insures the depositor's money, so that in the event of a bank forfeiture, the FDIC will reimburse you for the lost amount. For this to work, the FDIC has to keep tabs on the bank's accounts and track what they're doing. The bank has to pay the FDIC for their work, and so that the FDIC has enough money to handle any forfeitures (i.e. insurance). Once this has been setup, the bank can hang the FDIC sign in their window. In the last few years, with no forfeitures since the S&L scandal (only 20yrs ago), Congress decided to stop levying this charge against the banks. (Congress and the banks were good mates you understand, and the banks long ago had stopped being bad boys. Congress wanted to help the economy with reduced regulation, which as everyone knows, increases honesty and transparency in business.) As a result, with the credit and bank implosion in late 2008, the FDIC found it had no money socked away to handle the bankruptcies. Where did the money come from? Congress took it from the taxpayers (definitely not the banks).

What's the bank going to do with Warren's money? If they just let it sit in the vault, there won't be any income for the bank. The bank has to pay salaries, expenses, and to Warren Buffett, interest. To cover expenses and earn money, the bank loans out the money for which it charges interest. They don't loan all of it out. In the US, the Federal Reserve (the "Fed") decrees that the bank must hold 10% of their deposits in reserve, against possible withdrawals by customers. The bank deposits the required amount with the Federal Reserve (earning a small amount of interest). Let's look at the bank's balance sheet now.

Python Programmer's Bank of N.C.

  Assets	                 Liabilities                   Federal Reserve 
	                         (deposits)                        Escrow 
  D    C                          D    C                           D   C
-----------                      -----------                   -------------
 1M  |       cash                    | 1M    Warren Buffett          |
     |  0.1M                         |                          0.1M |

The bank has 0.9M$ available to make loans. For simplicity, let us assume that the entire $900,000, made available by Warren's deposit, was borrowed by one business: Joe the Plumber. Joe the Plumber doesn't borrow the money for fun, he wants to invest in new capital equipment, to increase the productivity of his workers. Joe the Plumber wants to invest in a new laser rooter, for which he pays $900,000 to Sapphire Slick's Laser Emporium. Let us also assume that Sapphire has an account with the Python Programmer's Bank, where she deposits the $900,000 received from Joe the Plumber's purchase. The bank deposits 10% of Sapphire's deposit with the Fed. Here's the accounts at the Python Programmer's Bank after Sapphire's deposit.

Python Programmer's Bank of N.C.

  Assets	                 Liabilities                   Federal Reserve         Loans
	                         (deposits)                        Escrow 
  D    C                          D    C                           D   C               D    C
-----------                      -----------                   -------------         -----------
1.0M |                               |1.0M Warren Buffett            |                   |
     |  0.1M                         |                          0.1M |                   |
     |  0.9M                         |                               |              0.9M |   Joe the Plumber
0.9M |                               |0.9M Sapphir Slick             |                   |   
     |  0.09M                        |                          0.09M|                   |   

Total:
0.81M                                 1.9M                      0.19M               0.9M

The bank now has assets of 0.81M$ (with 0.19M$ in the Fed escrow account). It has issued a loan for 0.9M$ to Joe the Plumber (there is now 0.9M$ injected into the business world). However the bank now has deposits of 1.9M$ (it also has liabilities of 1.9M$). The bank has an extra 0.9M$ on deposit, which it can use to make more loans. Let's say the bank now issues a loan of 0.81M$ to another business, which buys equipment or has work done for it and the contractor deposits the money back in the Python Programmer's bank. And this goes on till the bank's assets are reduced to 0$. How much money is on deposit at the bank, how much money is in the Fed escrow account and how much money has been loaned out to businesses?

Loans (M$): 
0.9 + 0.81....
= 1(0.9^1+0.9^2+...)
using the formula for the sum of a geometric series
S=a(1-r^n)/(1-r)
= 0.9 * (1-0.9^inf)/(1-0.9)
= 9.0M$

The amount of money that's loaned out is the original amount of money put into circulation by the first loan, divided by the Federal Reserve ratio (the 1-0.9 term).

the money multiplier = 1/reserve_ratio

The amount of money on deposit is 10.0M$ (9.0M$ + original deposit), and the amount of money in the Federal Reserve is 1.0M$.

	Note
	The bank has no money as reserves (in case anyone wants to withdraw money) (the bank will usually not lend out all its money).

Here's the bank's accounts when all of the money has been loaned out

Python Programmer's Bank of N.C.

  Assets	                 Liabilities                   Federal Reserve         Loans
	                         (deposits)                        Escrow 
  D    C                          D    C                           D   C               D    C
-----------                      -----------                   -------------         -----------
1.0M |                               |1.0M Warren Buffett            |                   |
     |  0.1M                         |                          0.1M |                   |
     |  0.9M                         |                               |              0.9M |   Joe the Plumber
0.9M |                               |0.9M Sapphir Slick             |                   |   
     |  0.09M                        |                          0.09M|                   |   
     .                               .

Total:
0.0M                                  10.0M                      1.0M               9.0M

Notice we started with a deposit of 1M$ and when the process is finished, the bank has 10M$ in deposits, with 9M$ in loans and 1M$ in the Federal Reserve. Was any wealth created ^[174] ? Was any money created ^[175] ?

In reality there are a number of leakages from the above scenario, that will reduce the value of the multiplier:

People may not deposit all of their cash into the banking system. Besides the money we keep in our wallets, we may save some of our money outside the depository banking system.
People may decide not to spend all our money.
Banks may not loan out all potential reserves, choosing to keep excess reserves.

The multiplier effect works in all sectors of the economy. You buy a self levitating mouse for your computer; the clerk at the store, the store's owner, the truck driver who delivered the mouse, the factory that made the mouse, the guy that designed the mouse, all get money. These people in turn go to stores and buy Homer Simpson towels... and on it goes. The circulation of money maintains the economy (at least in the US) and leads to Fisher's concept of the velocity of money Quantity theory of money (http://en.wikipedia.org/wiki/Quantity_theory_of_money), which is a bit outside the scope of this course.

What's needed to maintain this system of creating money

Banks can't let their reserves drop to $0 (they can't lend out all their money), or there'll be nothing when the customer who have money on deposit come in the door.
Customers can only retrieve their money one at a time. A run on the bank, usually caused by people thinking that the bank is in trouble (it may be in trouble), will quickly exhaust the bank's reserves.
People have to spend their money and not save it.

As Galbraith says on p167 in the above reference

The creation of money by a bank is as simple as this, so simple, I've often said, the mind is repelled.
The important thing, obviously, is that the original depositor and the borrower must never come to the bank at the same time for their deposits - their money. They must trust the bank. They must trust it to the extent of believing it isn't doing what is does as a matter of course.

While we may think of banks as just safe places to store our extra money, their main role in the economy is to make loans. If they stop making loans, the amount of money in circulation drops by a factor of 10. Credit worthy people can't get loans to buy a house, people who move and want to sell their houses can't, but still have to keep paying the mortgage on their empty house, businesses can't meet payroll or pay for the goods they need to buy to stay in business. The economy goes into a death spiral, with people being laid off and defaulting on their mortgages, businesses fail, people default on their loans and the banks have even less money to make loans.

The multiplication feature of the creation of money means that any small changes in the economy are greatly magnified. If people stop spending money, or banks stop making loans, the amount of money in circulation drops drastically. Businesses can't pay their bills, people loose their jobs and can't pay their mortgage, which means that the banks loose money on their bad loans and can't make more loans. Then in good times, the economy swings the other way. For people trying to go about their lives, as my friend Tony Berg says, the worst thing is change. You can plan your life if you know the rules, even if they aren't great rules, but if one day after making all your plans, someone comes and says "we've changed the rules, the multipliers, the interest rates...", your plan goes out the window. The multiplication factor and the amount of money in circulation needs strict monitoring, or the economy will go through booms and busts, causing havoc and disruption in the lives of innocent people, who are just trying to work hard and get their jobs done. When you hear newspapers/commentators talking about the US economy being a vibrant innovative place, where people can take risks and reap the rewards, what they're talking about is an economy with large swings where a small number of people (those with the money) can make a packet and get out, while the expendable workers suffer lives of great disruption. However the newly unemployed will have a chance to be the first to buy the next iPod.

One of the critical features of this system is the reserve ratio (some countries, e.g. Australia, don't require banks to deposit any money with their equivalent of the Reserve Bank). The banks are always pressing to reduce the amount of money they have to deposit with the Fed. If this is reduced to 5%, then the banks can loan out twice the amount of money, and earn twice as much interest. The problem with having lots of money to make loans, is that there may not be enough demand for the extra money from credit worthy people, in which case the banks will have to start chasing people who are less credit worthy. This got a lot of banks into trouble in the US credit crunch of 2008.

The instability in the economy, which is part and parcel of the banks being able to create money out of thin air, has led some economists e.g. Soddy (a Nobel Prize winning chemist turned economist) to propose that banks not be allowed to create money at all (see Op-Ed in NYT Mr. Soddy's Ecological Economy). Of course such radical statements are laughed at by the economists who profit handsomely from the large bonuses they receive from making all the loans, and who are bailed out by the tax payers when the loans all collapse.

We're all told that saving is a good thing (a penny saved is a penny earned etc - who said this ^[176] ?). Note that personal saving derails the system here, by taking money out of the multiplication spiral. Economics text books are full of examples of the type "If you do X, what effect will this have on you, have on the economy? If everyone does X, what effect will this have on you, have on the economy?" In some of these an activity done by one or by everyone is good (e.g. working hard), while another activity (e.g. preparing a good resume, so you will get a particular job) is good for you, but you wouldn't want to hand around your resume to all the other applicants or they would figure out what's wrong with theirs. You need to know what's good for you and what's good for the economy. Sometimes they're the same; sometimes they're the opposite.

It turns out that saving is good for you, but just not good for the economy, at least in the sense discussed above. After 9/11, George Bush told the US to go shopping (to keep the economy rolling), when he should have told the country to tighten its belt.

The US credit crunch of 2008 was caused by banks making home loans to people who didn't have a good enough credit rating for standard (prime) rate mortgage. Instead these people were given a sub-prime mortgage, with a higher interest rate to handle the risk that the mortgagee would default. The only reason that lending money to high risk mortgatees worked was because at the same time there was a house price bubble (house prices were rising faster than their real worth). When people defaulted, the increased value of the house covered the bank's costs. As well the defaulter would get back some money from the sale of the house. Previously no-one would touch these high risk people, but with the house prices rising, some banks decided it was worth the risk and these banks made a packet writing sub-prime mortgages. Eventually the house price bubble burst. When house prices decreased with the purchase value of the house, and mortgagees defaulted, banks with outstanding loans now had a lot of bad loans and the mortgagees went bankrupt.

The initial problem, in the credit crunch, was that banks no longer had money to make loans. Businesses who weren't a part of the sub-prime mortgage problem, weren't able to make their payroll, make planned purchases, even though their businesses were being conducted to the highest standards. The problem became obvious to people who read newspapers in the middle of 2007.

Here are the players involved

Congress (and their appointees, the Secretary of the Treasury etc). The people have the job of regulating the greed of human nature so that the financial industry does not rip people off, or cripple the economy, in its rush to exercise new innovative financial instruments.
Banks who made a packet of money issuing risky loans during a house price bubble, and who now had a load of bad loans. The banks had "insured" these loans with innovative, opaque and unregulated new instruments called credit default swaps and CDOs (go look them up if you want to know more). The point of reserving money for insurance, is that it must be safe somewhere, in conservative investments. The banks instead used their "insurance" money on more sub-prime mortgages. The banks were screaming for help on the tried and true principle of capitalism, that if you're in trouble, you get the taxpayers to bail you out, but if things are going well, you're on your own (this is called socialism saving capitalism).
A bunch of people who should not have been allowed to take out ARMs ( Adjustable Rate Mortgage. ^[177] ) and who, if had been given "truth in advertising" advice, would never have taken them. There is the principle of "moral hazard" which says that if people know you'll bail them out when they engage in risky behaviour, then people will be encouraged to take more risks. However you can only morally not help someone, if they've engaged in risky behaviour in full knowledge of what they're doing. It's for this reason that the government requires drug testing, safe and regulated designs of cars, roads and airplanes. It is clear that the people in the lower rungs of the economy weren't fully appraised of the consequences of their actions. The other problem is that people live in neighbourhoods of similar economic levels. Once people started defaulting on their mortgages in a neighbourhood, the price of the other houses fell, putting other responsible people in financial stress, eventually leading to whole neighbourhoods in terrible financial circumstances.
Normal financially frugal and responsible people going about their daily business, who couldn't get loans anymore and consequently whose lives and businesses were going down the tubes.

The people who were running the economy about Sept 2008 realised there was a problem. (these were Henry Paulson and Ben Bernanke. Alan Greenspan had retired and expressed profound disappointment that his perfectly working scheme had been derailed by human greed. Quite reasonably, Greenspan was not held responsible for the consequences of his incompetence.)

With what you know now, you're qualified to solve the problem. To make it easier, I'll make it multiple-guess. What would you do (you can choose more than one answer)? While you're at it, what do you think Congress did?

Give 700G$ to the financial industry to pay off their bad loans, without any requirements to fix any of the problems, or to start making loans again. Allow the businesses to use the money to put themselves into a better position to emerge from the financial carnage (e.g. buy up other businesses). Allow the senior staff to keep 100M$ in bonuses (for their continuing excellent work), despite 3 or 4 quarters of continuing losses. Tack large amounts of pork onto the bill, so that everyone in Congress will vote for it.
Remind the financial industry that the rules had been setup at their request to remove the odious burden of regulation. The financial world was just having a minor correction and that everything was still working perfectly. They should use the reserves they'd accumulated in the good times to carry them through.
Write new rules to make sure that the problems above would never happen again. Use the taxpayer's money to start issuing loans. If the banks didn't want to do it, then the government can do it.
Many solutions were offered to handle the people who should never have had loans. I presume most of them would have worked. One was for the goverment to pay the increase in interest rate on the ARMs for the next 5 yrs, by which time the economy should be in better shape, their house price would be better, and the people would have 5yrs notice that they would have to get out if they didn't get their act together.

Even though they'd watched mesmerised as the whole economy sank, the US Congress in Oct 2008, decided that they were the best people to fix the problem and gave Paulson and Bernanke, 700G$ of the tax payers money (how much more does each person in the US owe the government as a result of this decision ^[178] ?) to do whatever they liked to fix the problem. The plan was to give this money to the banks who were in trouble to use in whatever way they wanted. It was clear to even the uneducated, that there was no hope that this would work. The banks used the money for weekend parties and retreats, to pay off debts to other banks and to give bonuses to the executives. Guess how much of the 700G$ that the goverment required to be used for loans ^[179] ? After receiving 85G$ from the taxpayers, AIG held a week long party at an expensive resort (AIG) costing 500k$ (see AIG bailout means facials, pedicures http://blogs.moneycentral.msn.com/topstocks/archive/2008/10/07/aig-bailout-means-facials-pedicures.aspx). to make sure that the taxpayers know the contempt in which they are held by the financial industry. Not wishing to miss the party, the Detroit auto makers arrived in private jets to plead their special poverty. Six weeks later (mid Nov 2008), after spending 350G$ of the money without the slighest effect, Paulson admitted that the plan wasn't working. He was adamant that the money should be used to bailout the banks (an investment) rather than prevent mortgage forclosures (an expense). Paulson changed tactics - he would send the remainder to credit card companies. How much money was going to people who needed loans ^[180] ? This is being written in mid Nov 2008, when the results of this new move aren't in. My prediction, as a newspaper reading non-economist, is that the new plan will not work any better than the earlier plan.

If you're like me when I was growing up, you'd say "these are a bunch of idiots. I can see what's wrong already, I'm going to have no problem getting a job, kicking these people out and fixing the whole situation up." You're wrong.

The futility of trying to establish a meritocracy can be seen in the failure of Plato's "The Republic". If you're good in school, your elders and betters will say "we want bright energetic people like you". They'll give you prizes at the end of the year. You'll wind up is working for them, in an assembly line (you may not be assembling cars, but you'll be in an assembly line none the less). The pinacle of achievement is a Ph.D. You work for 5yrs, at no pay, to help your supervisor, who will get tenure out of your (and other student's) work. At the end of it, you get a piece of paper which tells the world "this person will work for 5 yrs at no pay, for a piece of paper saying that they're really smart". Employers love to see people like this come in the door for job interviews. Ask your professors for advice in University: "of course you should take my course; there's a bright future in this field, otherwise I wouldn't be in it". Don't ask anyone, who has a vested interest in the outcome, for advice (OK ask them, but be careful of taking it). While a Ph.D. is a requirement for entry into many fields, make sure that the time spent on it is on your terms and not just to help the career of others.

So why won't you get Paulson's job? You know what's wrong, know what needs to be done and you know that they're a bunch of idiots. These people see you coming. They have comfortable jobs, they aren't held accountable for their blunders and aren't required to deliver results. Their pronouncements are greeted with fauning and adulation by Congress. Are they going to be glad to see you walk in the door ^[181] ? You'll be greeted with the same applause that's followed you since grade school. They'll find something special for a person like you and you won't realise till long afterwards, that you wasted your time. They'll do everything they can to crush you from the very start. If you want to fix up the situation, you'll have a life like Martin Luther King Jr, Jonas Salk or Ralph Nader.

Financial people who made it, not playing along with the system, but using its weaknesses for their own profit, include Warren Buffet and Jeremy Grantham, both of whom seem to have had comfortable lives.

Early on you'll need to choose between the life of Henry Paulson and Alan Greenspan, with public adulation on one hand, or the life of Jonas Salk on the other hand, who cured the world of polio, but who was regarded as a technician by the scientific community, and who had to raise all the money for his cure himself (through the March of Dimes). Salk wasn't even admitted as a member of the National Academy of Science (neither was the educator and populariser of science Carl Sagan). Richard Feynman resigned from the Academy, seeing it as a club of self congratulatory incompetents (my words not his).

If you want to set things right, competence in your field is necessary, but not sufficient. The frontier will be the idiots, not your chosen field of endeavour. As Frederick Douglass said

"Power concedes nothing without a demand. It never did and it never will."

6.17. Parallel Programming and Distributed Denial of Service (DDoS): linear or exponential?

As I said earlier, if you have problems that scale exponentially, to solve them in linear time, you need tools that scale exponentially. If your tools scale linearly, you need exponential time to solve the problem.

One attempt to increase the amount of computing power is parallel programming. You divide a problem into smaller pieces, send each piece off to its own cpu and then to merge/join/add all the results at the end. It turns out that dividing problems into subproblems and joining the partial results doesn't scale well. Much of the time of the computation is spent passing partial answers back and forth between cpus. As well the cost of cpus scales linearly (or worse) with the number of cpus. So parallel programming is a linear solution, not an exponential solution.

When a client on the internet connects to a server on the internet, the client first sends a request to connect (called a SYN packet). This is the tcpip equivalent of ringing the door bell (or the phone). The server replies (SYN-ACK) and the client confirms the connection (ACK). The client then sends its request (give me the page "foo.html"). A DDoS is an attack on a machine (usually a server for a website), where so many (network) packets are sent to the server, that it spends all its time responding to DDoS packets (answering the door to find no-one there), and is not able to respond to valid knocks on the door. The machine doing the DDoS only sends the SYN packet. The server replies with the SYN-ACK and has to wait for a timeout (about 2mins) before deciding that the client machine is not going to connect. There is no way for the server to differentiate a DDoS SYN packet from a packet coming from a valid client on the internet.

DDoS machines are poorly secured machines (Windows) than have been taken over (infected with a worm) by someone who wants to mount an attack on a site. The machines are called zombies and once infected, are programmed to find other uninfected machines to take over. The process of infecting other machines is exponential or linear ^[182] ? The zombies sit there appearing to be normal machines to their owners, until the attacker wants to attack a site. The attacker then commands his zombies (which can number in the millions) to send connect requests to a particular site, taking it off the network. All the zombies in the world are Windows boxes linked to the internet by cable modems. It takes about 20 mins after an unprotected Windows box is put on the internet for it to be infected by one of these worms.

6.18. Lucky Lindy

Note

For the information on the navigation required for this feat, I am indebted to "Portnoy's Imponderables", Joe Portnoy, 2000, Litton Systems Inc, ISBN 0-9703309-0-1. available from Celestaire (http://www.celestaire.com/). Table 4 on p 33 showed how lucky Lindy was. This book has tales of great navigation for those who have to sit at home, while everyone else is out in their boat having the real fun. Celestaire has all sorts of nice navigational equipment.

Other information comes from Lindbergh flies the Atlantic http://www.charleslindbergh.com/history/paris.asp), Charles Lindbergh, (http://en.wikipedia.org/wiki/Charles_Lindbergh), Orteig Prize (http://en.wikipedia.org/wiki/Orteig_Prize), Charles Lindbergh (http://www.acepilots.com/lindbergh.html) table of full moons (http://home.hiwaay.net/~krcool/Astro/moon/fullmoon.htm)

Following a run of bad weather, a forecast on 19 May 1927 predicted a break in the rain. At 7:20AM on 20 May 1927, 4 days after the full moon, an unknown air mail pilot, Charles Lindbergh, in pursuit of the $25k Orteig Prize for the first transatlantic flight between New York and Paris, slowly accelarated his overloaded (2385lb gasoline) single engine Ryan monoplane, northward down the boggy runway at Roosevelt field, clearing the power lines at the end of the runway by only 20'. To save weight Lindbergh carried no radio, no sextant and only carried a compass (accuracy 2°) and maps for navigation (but presumably having an altimeter and airspeed indicator). The plane had no brakes, and the plane was so filled with fuel tanks that Lindbergh had to view the world outside through a periscope.

Six people has already died in attempts on the Orteig prize. Two weeks earlier, Nungesser and his navigator, who planned to fly through the moonless night, left Paris in an attempt to reach New York and were not seen or heard of after they crossed Ireland. Six months earlier, leaving from the same Roosevelt field, Fonck and two crew members, never got off the ground, when the landing gear of the grossly overloaded (by 10,000 lbs) transport biplane collapsed during takeoff. The two crew members (but not Fonck) died in the subsequent inferno.

Flying air mail may seem humdrum compared to the barnstorming and wingwalking popularised in the history of the early days of flying. The job of air mail pilot may remind you of the postman who delivers your daily mail.
In fact air mail pilot was a risky job. Flying was all weather, minimal navigation equipment and with bad airfields. Following a strike in 1919 over safety a new procedure was instituted "if the local field manager on the Post Office Department ordered the pilots to take off in spite of their better judgement, he was first to ride in the mail pit (a kind of second cockpit) for a circuit of the field".
Whatever it was that was so vital to transport quickly, it was delivered at a high cost. Life expectancy for a Mail Service pilot was four years. Thirty one of the first forty pilots were killed in action, meeting the schedules for business and goverment mail.
Condensed from "Normal Accidents", p125, Charles Perrow (C) 1999, Princeton University Press, ISBN-10 0-691-00412-9

Lindbergh made landfall at Nova Scotia, only 6 miles off-course and headed out into the Atlantic as night fell. Lindberg had marked up his Mercator Projection map into 100mile (approximately 1hr flying time) segments of constant magnetic heading. He hand pumped fuel from the tanks inside the plane to the wing tanks. He alternately dodged clouds, which covered his wings with sleet (what's the problem there ^[183] ?) by flying over them at 10,000 ft or going around them. At one stage Lindbergh thought of turning back, but once half way, it was easier to keep going. His hope to use the full moon for illumination, did not pan out (what's the problem here ^[184] ?). To gauge windspeed, Lindbergh flew 20' over the wavetops, looking at the direction of spray and intensity of the whitecaps (how does this help ^[185] ?).

Sunrise on the second day came over the Atlantic with Lindberg arriving over the southern tip of Ireland in the late afternoon, after 1700miles and almost a day of ocean, only 3 miles off-course, a incredible directional error of only 0.1°. The adverse weather had put him an hour behind schedule, and considering the errors in estimating drift from the waves, magnetic variation and the inherent error in the compass, he should have been 50 miles off course. He crossed into Europe as the second night fell, arriving in Paris at 10pm. Amazingly his imminent arrival had been signalled from Ireland, with 150,000 people waiting at Le Bourget field to greet him.

Considering there was no internet, few phones or cars back then, Paris notified and moved 150,000 people by public transport in a few hours, a feat that be impossible in most American cities even today. Would you believe your neighbour pounding on the door "Vite! Lindbergh arrivera au Bourget a 2200hrs!" while you're relaxing having dinner and reading Le Figaro? No way. I would have stayed at home. My neighbour would have fumed (in a John Cleese voice) "You stupide Frenchman!"

The whole world went gaga. Lindbergh was a hero. In my lifetime, the big event was Armstrong landing (and returning) from the moon. But all of us watching the great event on TV knew we wouldn't be landing on the moon. With Lindbergh's flight, people realised that anyone would be able to fly across oceans.

Lindbergh was aggrieved by the considerable number of people who declared his accomplishment just luck. Let's see how lucky he was. As a person who's read just about every story of adventure and exploration that's been printed and has spent a large about of time hiking in the outdoors, you need to know the difference between a successful trip/expedition and an unsuccessful one:

in a successful trip, nothing goes wrong, there are no great tales to tell and the whole thing is no big deal. Most people don't hear about these trips; "we went out, had a good time and came back" is not going to be in the newspapers.
How little do we know of Amundsen's routine trip to the south pole? Amundsen trained for years, overwintering near the north pole, learning from the Innuit and bringing with him the best Greenland dogs for hauling. His party travelled on skis. No-one's interested in his logs ("day 42: marched another 50km. Sven's turn to prepare dinner. Dogs in good shape. Slept well.").
in a unsuccessful trip, people are lost or die or suffer great privation. There is great heroism (or cowardice) on the part of some people. These trips make great stories (as long as you weren't on the trip) and everyone hears about them.
Everyone can recite details from Scott's blunder to the pole, in which all members of the trip died of poor planning. Scott thought that being British would be enough to get him to the South Pole and back safely. His party were unfamiliar with dogs, initially tried horses and eventually man-hauled the sleds. They weren't prepared for the resulting increased perspiration which iced up the insides of their clothes. Scott's party travelled by foot (no skis) and were injured by falls into crevasses, which Amundsen's party on skis more easily passed over. Everyone has read the descent into disaster seen Scott's logs, with his demoralised entry on arriving at the pole to see Amundsen's flag "Great God! This is an awful place".
In Shackleton's 2nd trip to the south pole, due to lack of funds, he took a ship, the Endurance, whose hull would not survive being frozen in. Shackleton expected to be able to land his crew and get his ship out before the sea iced over. Instead the Endurance is caught and crushed in ice (http://www.shackleton-endurance.com/images.html) before they reached land. The trip is better known than others because of Frank Hurley's photographs. (Shackleton knew that he'd never be able to pay for the trip without them.) There is no doubt of Shackleton's leadership and heroism, however the underfunded trip was doomed from the start.
In a war, there are more medals issued for a calamitously executed battle than for a clean victory.

What determines whether the trip is going to be successful or not? The difference between a successful trip and an unsuccessful trip is (ta-dah... drum roll): good planning and preparation. That's it - that's all you need to know. Most of the dramatic adventures that are burned into society's conciousness about how we got to be where we are, the nature of bravery and heroism, and what we pass on to the next generation as important lessons, are nothing more than examples of really bad planning. The un-newsworthy successful trips that should be the focus of our learning are ignored.

To see if Lindy was lucky, we have to look at the preparations he made.

The main risk factor in a 33.5hr, 3,150nmi flight over ocean, is engine failure. Lindbergh selected the highly reliable Wright Whirlwind J-5C engine. Back then (1927) "highly reliable" meant a complete failure (the plane won't fly) every 200hrs (MTBF, mean time between failure = 200hr). A new car engine by comparison can be expected to run for 2000hrs or more (with oil changes etc) without failing. (A car engine doesn't have to be light, like a plane engine. A plane engine will only have just enough metal to hold it together.) Interestingly, 15yrs later in WWII, plane engines still only had the same MTBF. By then a plane (and the pilot) was expected to have been shot down by the time it had spent 200hrs in the air. It wasn't till the reliable jet engine arrived, that commercial passenger aviation became safe and cheap enough that the general populace took to it.

The failure rate of many devices follows the bathtub curve (http://en.wikipedia.org/wiki/Bathtub_curve). There is an initial high rate of failures (called infant mortality) where bad devices fail immediately. Lindbergh was flying over land, during daylight, for this part of the trip. The second phase is a low constant rate of failure, where random defects stop the device from working. At the end of life, as parts wear out, the failure rate rises steeply.

If you have a large number (n) of devices with MTBF=200hrs then on the average, you'll need to replace one every 200/n hrs.

What is the distribution of failures? While it's fine to say that if you have 200 light bulbs (globes) each with a MTBF of 200hrs, that you'll be replacing (on the average) 1/hr, what if you're in a plane and you've only got a small number (1,2 or 3) of engines? Knowing the average failure time of a large number of engines is not of much comfort, when you're over an ocean and your life depends on knowing what each engine is going to do. MTBF=200hrs could mean that the engine will have a linearly descreasing probability of functioning with time, having a 50% chance of working at 100hrs and being certain to die at close to 200hrs. Experimental measurements show that this isn't the way things fail. Presumably the manufacturer knew the distribution of failures of Lindbergh's engine, but I don't, so I'll make the assumption which is simplest to model; the flat part of the bathtub curve, with a uniform distribution of failures i.e. that in any particular hour, the engine has the same (1/200) chance of failure, or a 100-0.5=99.5% chance of still running at the end of an hour.

The Wright Whirlwind J-5C engine was fitted to single engined, two engined and three engined planes.

Let's say that Lindbergh wants to know the chances of making it to Paris in a single engine plane. Write code lindbergh_one_engine.py with the following specs

documentation up top, giving name of the program, author, license under which the code is distributed, and a entry describing the purpose of the code (to calculate the probability of the engine still running after n hours).

initialise

		#these should be reals or you'll get integer arithmetic
in_air=1.0	#the probability that Lindbergh is still flying
MTBF=200.0
time=0.0

You will update the calculation every hour. to hold the inteval between updates, initialise
interval=1.0 #hours
let Lindbergh fly for one interval, then update in_air. If the probability of Lindbergh still flying is in_air, what will be the probability of him still flying 1hr later ^[186] ? What's the probability of Lindberg still flying after interval hours ^[187] ?
At the end if the interval, print out the current time (in hours) and the probability that Lindbergh is still flying.

Here's my code ^[188] and here's my output

dennis: class_code# ./lindbergh_one_engine.py
./lindbergh_one_engine.py
time  1 in_air 0.99500

	Note
	You could have done without the variable `interval` and replaced it with the constant "1.0". You would have got the same result, but it's sloppy programming. A variable, which should be declard up top and which you might want to change later, will now be buried as a constant deep in the code. Someone wanting to modify the code later, will have to sort through all the constants e.g. "1.0" or "1" to figure out which ones are the interval. It's better for you to code it up correctly now and save the next person a whole lot of bother figuring out how the code works.

Now we want to find the probability of the engine still running at any time during the flight. Copy lindbergh_one_engine.py to lindbergh_one_engine_2.py.

Change the print statement(s), so that now you print out the header "time in-air" before doing any calculations.
for each time, print the value of time and in-air.
Lindbergh had enough fuel for 4200 miles (3650nmi) or 38 hrs of flying. Put the code which updates in_air into a loop (while/for?) and run the flight for 38hrs (store the flight time as a variable e.g.fuel_time=38).

Here's my code ^[189] and here's my output

dennis:class_code# ./lindbergh_one_engine_2.py 
time in_air 
   0  1.000
   1  0.995
   2  0.990
   3  0.985
   4  0.980
   5  0.975
   6  0.970
   7  0.966
   8  0.961
   9  0.956
  10  0.951
  11  0.946
  12  0.942
  13  0.937
  14  0.932
  15  0.928
  16  0.923
  17  0.918
  18  0.914
  19  0.909
  20  0.905
  21  0.900
  22  0.896
  23  0.891
  24  0.887
  25  0.882
  26  0.878
  27  0.873
  28  0.869
  29  0.865
  30  0.860
  31  0.856
  32  0.852
  33  0.848
  34  0.843
  35  0.839
  36  0.835
  37  0.831
  38  0.827

The flight to Paris took 33.5hrs. What probability did Lindbergh have of making it in a single engined plane ^[190] ?

Let's define luck as the probability that matters beyond your control, that will wreck your plan, will be in your favour. (In conversational terms, luck is not defined in any measurable units. It's high time that luck be put on a sound mathematical basis.) In this case Lindbergh has a 16% chance that the engine will fail. He requires 16% luck for the flight to be successful.

Figure 11. probability of engine failure stopping flight. Vertical line at 33.5hr is actual time of Lindberg's flight.

The loop parameters in a python for loop, e.g. the step parameter are integers. In the code above, step takes the default=1. We want step=interval a real. For the code above to work, we must chose an interval that is a multiple of 1.0 and we must hand code the value of step to match interval (i.e. if we want to change interval, we also have to change step in the loop code). To let maintainers know what we're doing we should at least do this

start    =0
fuel_time=2500.0
end=int(fuel_time)
interval =1.0
step=int(interval)

for time in (start, end, step):
	...

This will give sensible results as long as fuel_time, interval are multiples of 1.0 Still this code is a rocket waiting to blow up. Python while loops can use reals as parameters. Copy lindbergh_one_engine_2.py to lindbergh_one_engine_3.py and use a while loop to allow the loop parameters to be reals. Here's my code ^[191] (the output is unchanged). Now you can change the parameter affecting the loop, in the variable list above the loop, without touching the loop code.

about areas: The units of area are the product of the units of each dimension. The area of a rectangle, with sides measured in cm, is measured in square centimetres (cm²).
To predict the amount of energy that a power plant will need to produce for a month, the power company plots the expected temperature on the y axis, with days (date) on the x axis. The number of degrees below a certain temperature (somewhere about 17°C, 65°F) at any time multiplied by the number of days, gives the amount of heat (energy) needed in degree-days (http://en.wikipedia.org/wiki/Heating_degree_day) that customers will need. i.e. the number of degree-days is the area between the line temp=65° and the line showing the actual temperature (in winter). Tables of degree-days for US cities are at National Weather Service - Climate Prediction Center (http://www.cpc.noaa.gov/products/analysis_monitoring/cdus/degree_days/).

What's the units of the area under the one-engine graph ^[192] ? If you ran the graph to infinite time, what do you think the area under the graph might be ^[193] ? Let's find out. Copy lindbergh_one_engine_3.py to lindbergh_one_engine_4.py

calculate the area by cumulatively adding the probability at each measurement interval.
run the flight for enough hours that you get some idea of the area under the graph for an infinitely long flight.
print out the area under the graph at intervals of 100hrs

Here's my code ^[194] and here's my output

dennis: class_code# ./lindbergh_one_engine_3.py
time in_air    area
   0  1.000 1.000
 100  0.606 79.452
 200  0.367 126.975
 300  0.222 155.764
 400  0.135 173.203
 500  0.082 183.767
 600  0.049 190.167
 700  0.030 194.043
 800  0.018 196.392
 900  0.011 197.814
1000  0.007 198.676
1100  0.004 199.198
1200  0.002 199.514
1300  0.001 199.706
1400  0.001 199.822
1500  0.001 199.892
1600  0.000 199.935
1700  0.000 199.960
1800  0.000 199.976
1900  0.000 199.985
2000  0.000 199.991
2100  0.000 199.995
2200  0.000 199.997
2300  0.000 199.998
2400  0.000 199.999
2500  0.000 199.999

It looks like the area is going to be exactly the MTBF.

At what time is the area 100hr (you'll have to run the code again printing out the area at every hour)? Can you see a simple relationship between t₁₀₀ and MTBF?

In the current code, the loop requires integer parameters, while the code requires real parameters. To get a reasonable estimate of the area, we have to not only intcrease time to a large number, but we have to decrease the interval to a small (i.e. non integer) value. We need to rewrite the loop using real numbers as the parameters. To do this we need to change the loop from a for to a while loop. Copy lindbergh_one_engine_3.py to lindbergh_one_engine_4.py. Make the following changes to the code

FIXME

Before we start wondering what this means, is our estimate of the area an upper bound, a lower bound, or exactly right (within the 64-bit precision of the machine) (hint: is the probability of the engine working as a function of time a continuous function or is it a set of steps? How have we modelled the probability, as a continuous function or a set of steps?) ^[195] ? If this an upper or lower bound, what is the other bound ^[196] ? So what can we say about the area under the graph ^[197] ?

Why are we talking about Lindbergh in a section on programming geometric series and exponential processes? The probability of Lindbergh still flying at the end of each hour is a geometric series

p       = r^0, r^1, r^2 ... r^n
where r = 0.995

Using the formula for the sum of a geometric series, what's the sum of this series

Sum     = r^0 + r^1 + r^2 ... r^n
	= 1/(1-0.995) = 200

Why is the area under the graph = MTBF = 200hrs?

The next problem is fuel - if you run into headwinds or have to go around storms, you'll run out of fuel. A two engined plane can carry a bigger load (more fuel). However one engine alone is not powerful enough to fly the heavier plane. If you have two engines each with a MTBF of 200hrs, what's the MTBF for (any) one engine? It's 100hrs. (If you have 200 light bulbs, each with a MTBF of 200hrs, then on the average there will be a bulb failure every hour.) Copy lindbergh_one_engine_2.py to lindbergh_two_engines.py. Change the documenation and the MTBF to 100hrs and rerun the flight for the same 38hrs (Lindbergh will now have more fuel than this). Here's my code ^[198] and here's my result

./lindbergh_two_engines.py
time in_air 
   0  1.000
   1  0.990
   2  0.980
   3  0.970
   4  0.961
   5  0.951
   6  0.941
   7  0.932
   8  0.923
   9  0.914
  10  0.904
  11  0.895
  12  0.886
  13  0.878
  14  0.869
  15  0.860
  16  0.851
  17  0.843
  18  0.835
  19  0.826
  20  0.818
  21  0.810
  22  0.802
  23  0.794
  24  0.786
  25  0.778
  26  0.770
  27  0.762
  28  0.755
  29  0.747
  30  0.740
  31  0.732
  32  0.725
  33  0.718
  34  0.711
  35  0.703
  36  0.696
  37  0.689
  38  0.683

Lindbergh won't have to worry about running out of fuel anymore. Now how lucky does Lindy have to be ^[199] ? Would you rely on this amount of luck for your life?

Before we go on to the 3 engine case, we didn't derive the formula for the one or two engine case rigourously, and the method we used doesn't work for 3 engines. So let's derive the formula in the proper way. For this we're going to make a side trip to learn some ???.

One engine case chance of probability of cumulative probabilty total engine failure engine running of single engine failure probability in that hour at end of hour by end of that hour 1hr 0.005 1.0 *(1-0.005)=0.995 1-0.995=0.005 0.995+0.005=1.0 2hr 0.005 0.995*(1-0.005)=0.990 1-0.990=0.01 0.990+0.010=1.0 . . MTBF=200hr Note: at any time (probability of the engine running) + (probability of engine not running) = 1.0 Two engine case: calculate probabilities where plane can only fly on 2, 1 engines chance of chance of either chance of only chance of both chance of no engine failing engine failing one engine running engines failing engines running in that hour in that hour at end of hour in that hour at end of hour engine 1 engine 2 1hr 0.005 0.005 0.005+0.005 = 0.01 1.0 *(1-0.01)=0.99 0.005*0.005=0.000025 1.0 *(1-0.000025)=0.999975 2hr 0.005 0.005 0.005+0.005 = 0.01 0.99 *(1-0.01)=0.98 0.005*0.005=0.000025 0.999975*(1-0.000025)=0.999950 . . MTBF engine 1 engine 2 either engine both engines 200 200 100 40000 Three engine case: calculate probabilities when plane can only fly on 3, 2 and 1 engines.

The next possibility is a 3 engine plane. This could carry the extra fuel and it could still fly with on two engines.

Lindbergh thought not. As Shackleton said on his first attempt at the south pole, when he turned back 88 miles before the pole "better a live donkey than a dead lion".

6.19. Getting in early when you have an exponential process

Building a house (or a rocket) is a linear process. You start with the basement, then the floors, then the walls, the roof... You can't build the roof any faster if the basement is finished in half the time.

Any process where later stages are helped if earlier stages are better will be an exponential process. Examples (other than investing early) are

aquiring skills (mental or physical): Mastery of skills will advance you to learning the next set of skills. You will move to a new cohort of people with these skills, who will show you more skills and will challenge you to improve your skills.
acquiring knowledge. This is hard at first. Look how long it takes to learn to walk and talk. When you know little about a subject, you have no framework to place the new knowledge in: you can't test it, you don't know its relevance, you'll forget it or have no reason to remember it. As you accumulate knowledge, you will find that new pieces of information can be placed into your current framework; you can test whether it's true (and if not discard it and remember why it was wrong). Soon you'll know enough to see voids in your knowledge and look for ways to fill them in.
computer programming. Writing a computer program can take many man-years; much longer than the time it would take to do the calculation by hand (once). The payoff only comes if big program can be run hundreds, thousands or millions of times. You only write programs if they will be used many times. Computer programs lead to better computer programs: the weather forecasting code in use today is the descendant of weather modelling code first written in the 1950's. The same can be said of compilers and many other classes of computer applications and tools.
first to market. If you're the first to market with a device, you can corner the market and have it to yourself, even if your product is junk.
- the IBMPC, which lead to the current desktop and laptop computers. The IBMPC used the 8088 chip as the CPU. The 8088 was a dreadful design and greatly limited what you could do with your PC. It was only chosen because IBM didn't expect to sell many PCs, and they already had an 8088 based design, which would save development costs. However with the PC backed by IBM, and the design made available to any PC manufacturer, the PC swept the marketplace. This took Intel, the manufacturer of the 8088, from obscurity to the place of the dominant CPU maker in the world, with the manufacturers of the well designed CPUs having gone out of the market. Computers based on the descendants of the 8088 became the commodity CPU and earned enough money for Intel, that Intel took over the role played by high end Unix servers which ran on well designed and expensive CPUs. The process of junk forcing out good quality products from the market place is well understood by computer professionals, much to their sorrow, and not thought worthy of further comment. However if you're an economist and you write about it in academic journals, you'll get a Nobel Prize (see Akerlof http://nobelprize.org/nobel_prizes/economics/laureates/2001/).
- Windows (particularly W95. Aided by good marketing by Microsoft and poor marketing by IBM, Win95 (and its descendants) became the dominant OS for PCs, beating out the better OS/2 by IBM.
- DOS. DOS was the OS of choice for the early PCs. (It wasn't an OS, it was a program loader, each application had handle the OS tasks itself. As a result only one program could run at a time.) It sold for $25, while the competitor CPM-86 sold for (I think) about $150. A functional version of Unix (called XENIX) was available for the PC, but was not adopted by the masses. Many people copied DOS and used their free copy. Microsoft sold truck loads of DOS, and the people who didn't pay for DOS, now had to use the programs which depended on DOS (most of which were written by Microsoft) so Microsoft laughed all the way to the bank. Software vendors had thin margins and only wrote applications for DOS, cutting CPM-86 and XENIX out of the market. DOS was a loss leader for Microsoft, allowing Microsoft to dominate the PC marketplace from the beginning.

After you've built your first rocket, or house, you will be in a better position to build a second house, or dozens of houses (rockets) just like the first one.

The point is: if it's an exponential process, get in early and stick with it.

It seems that to excel in a field (whether playing music or computer programming), you need to first spend about 10,000hrs at it A gift or hard graft? (http://www.guardian.co.uk/books/2008/nov/15/malcolm-gladwell-outliers-extract). No one seems to be able to differentiate genius from 10,000hrs of hard work.

6.20. Presentation: e, exponential and geometric series

Give a presentation on exponential processes (you can use any content from this webpage).

Show how bank interest, earned over a fixed time interval, changes when the period for estimation changes (period=year, month, day, hour, second, very small interval). State that this method allows you to calculate e. Show that this method for calculating e is not practical by doing the following:
- Give estimates for the time that it would take to calculate e to the precision of a 64-bit real
- give the worst case estimate of the errors expected
- Compare your best estimate for e using this method, with the actual value of e
Show the factorial series for calculating e. Discuss its speed and worst case errors. Show the role of arithmetic progressions in determining the errors.
Show the effect of compound interest on savings over a long period of time. Discuss the financial consequences of choosing a career as a plumber or a Ph.D.
Show the effect of interest in buying a house with a mortgage.
Discuss the effect of Moore's Law on calculating e=(1+1/n)ⁿ. Discuss Optimal Slacking and the futility of interstellar space travel.
Calculate the volume of the pile of rice that was given as a reward to the inventor of chess. Show how long it would take to deliver the rice if a linear process was used, or if an an exponential process was used.
Was Lindy lucky? Give an estimate for how lucky he was.

Footnotes:

^[1]

256. There are other answers (about 250, allowing us to have NaN, +/- ∞ etc), but 256 will do for the moment.

^[2]

4G - it's the answer for everything 32-bit.

^[3]

01101001
Sseemmmm
S	= 0: +ve number
see	= 110: exponent = -10 binary = -2 decimal. multiply mantissa by 2^-2 = 1/4
mmmm	= 1001: = 1.001 binary = 1/1 + 0/2 + 0/4 + 1/8 = 1 + 1/8 = 1.125
number	= + 1.125 * 2^-2 = 0.28125

^[4]

01101001
Sseemmmm
# echo "obase=10; ibase=2; +1.001 * (10)^(-10)" | bc -l
.28125000000000000000

^[5]

no. shift the decimal point to give 1.234*10².

^[6]

1 + 3/4: (1 + 3/4) * 2^0 = 7/8 * 2^1

^[7]

1/2: 1/4 * 2^1 = 1/2 * 2^0 = 1 * 2^-1
3/2: 1 + 1/2 * 2^0 = 3/4 * 2^1 = 3/8 * 2^2

^[8]
1/8: 1/8 * 2^0 = 1/4 * 2^-1 = 1/2 * 2^-2 = 1 * 2^-3
1  : 1/8 * 2^3 = 1/4 * 2^2  = 1/2 * 2^1  = 1 * 2^0

^[9]

I couldn't find any.

^[10]

16 (the mantissa is 4 bits)

^[11]

128 (there are 256 possible 8-bit numbers, half are +ve and half are -ve)

^[12]

You've doubled the number of reals you can represent. You would need one more bit. For convenience, computers are wired to handle data in bytes, so in practice you don't have the choice of using one more bit - you would have to go to a 16-bit real.

^[13]

There are 16 reals in the interval (as there are for all 2:1 intervals) and the spacing is (16-8)/16=0.5

^[14]

There are 16 numbers in the interval (as there are for all 2:1 intervals) and the spacing is (0.5-0.25)/16=.015625

^[15]

4G (you knew that). Each real is a multiple of the smallest representable real number.

^[16]

∞ - real numbers are a continuum, so there are an infinite number of them. Intergers are discrete (rather than a continuum), i.e. they only exist at points, so there are a finite number of integers in a range.

^[17]
the answer depends on the count of numbers in the mantissa. For the 8bit floating point example used here, there are 16 numbers in the mantissa for each lost exponent. 8bit - 16 numbers lost with exponent=000, 2 for +/-0.0, rest unused 16 numbers lost with exponent=111, 2 for +/-infinity, 14 for NaN (only 1 NaN needed) For the 23bit mantissa 32-bit - 8388608 lost with exponent 00000000, 2 for +/-0.0, rest unused 8388608 lost with exponent 11111111, 2 for +/-infinity, rest used for NaN (only 1 needed)

^[18] Two exponents are lost in each case. There are 2³=8 exponents for the 8bit case and 2⁸=256 exponents in the 32-bit case.

^[19] yes 0.05, no 0.02325581395348837209302325581395348837209302325581..., yes 0.02

^[20]

0.00125

^[21] 0.00011₂

^[22] 0.101₂

^[23]

1.6=16*0.1=8*0.2	#shift 0.1 4 places to the left, or shift 0.2 3 places to the left
1.1001100110011001100110011001100110011001100110011001100110011001

^[24] .000000101000111101011100001010001111010111000010100011110101110000101

^[25]

The maximum weight is 139.7Mg. When the total weight of palettes reached this amount, the test (weight_on_plane==max_weight_on_plane) failed due to lack of precision in the real number implementation (actually the inability to represent the number 0.1 in a finite number of digits in binary) allowing the program to keep loading more palettes. The computer lost a 1 in the last place for each of the palettes loaded (you needed 1397 to overload the plane).

^[26]

Airports are close to big cities (to make them economically viable) and the noise of planes landing and taking off at night keeps people awake. Planes usually aren't allowed to take off or land between midnight and 6am.

^[27]

The plane is already committed to take off long before it reaches air speed. The plane when taking off is at maximum weight (fuel) and there is nothing that can stop a fully loaded 747 at 290km/hr. If the plane incurs a problem after reaching commit speed, such as a bird strike dissabling an engine, or an engine fire, the pilots have to handle the problem in the air.

^[28]

no. it's too heavy.

^[29]

Crash barriers (e.g. steel cable nets which can be quickly lifted up into the path of a plane) are at the limit of technology in stopping a heavy fast moving object like a plane at takeoff speed. A crash barrier is only useful when the airport is forwarned that the plane is in trouble (the pilot radios ahead), the plane is lightly loaded (it has used up or dumped all its fuel and is landing empty). A plane landing is empty and is flying at about half the speed of a plane taking off. After the plane lands the brakes are applied and the plane looses more speed. It's only in these circumstances that a crash barrier is useful. Crash barriers are only used for small planes (military fighters) landing at low speed, not for a fully loaded 747 at take off speed.

^[30]

You shouldn't add cargo if it will result in an overweight plane.
The code assumed that there would be a combination of cargo when the plane was exactly at maximum weight. In practice, the plane will be underweight or overweight, but will never be exactly maximum weight.
You have to allow for the palettes all having different weights, which will never be exactly 0.1Mg.

^[31]

#! /usr/bin/python

palettes_in_terminal=[]
palettes_in_747=[]
number_palettes=1500
weight_palette=0.1
weight_on_plane=0.0
max_weight_on_plane=139.7

#account for palettes in terminal
for i in range(0,number_palettes):
	palettes_in_terminal.append(weight_palette)
	

#load plane
#while plane isn't overloaded and there are more palettes to load
#	take a palette out of the terminal and put it on the plane
while (((weight_on_plane + weight_palette) < max_weight_on_plane) and (len(palettes_in_terminal) != 0)):
	palettes_in_747.append(palettes_in_terminal.pop())
	weight_on_plane += weight_palette
	#print "weight of cargo aboard %f" % weight_on_plane

print "weight on plane %f" % weight_on_plane
print "max weight on plane %f" % max_weight_on_plane

#-------------

^[32]

4G=2³²

^[33]

it's unlimited - python can chain integer operations to any precision needed - see ???. Most languages have a 64 bit limit.

^[34]

2⁴⁰. (useful numbers to remember: 2¹⁰=10³)

^[35]

#! /usr/bin/python

error = 0.000000000001
number = 9
estimate = (number +1.0)/2.0
new_estimate = (estimate + number/estimate)/2.0
print number, estimate, new_estimate, error

^[36]

while.

A while loop is used when you don't know how many iterations will be needed. The loop is entered after testing on a condition (e.g. is (error < somevalue)). A for loop is used when you already know the number of iterations.

	Note
	You can use a `while` for a predetermined number of iterations: you set a counter before entering the loop, increment the counter inside the loop, and test if the counter is larger than `somenumber` before re-entering the loop.

^[37]

#! /usr/bin/python

error = 0.000000000001
number = 9
estimate = (number +1.0)/2.0
new_estimate = (estimate + number/estimate)/2.0
print number, estimate, new_estimate, error

while ((estimate - new_estimate) > error):

^[38]

#! /usr/bin/python

error = 0.000000000001
number = 9
estimate = (number +1.0)/2.0
new_estimate = (number/estimate + estimate)/2.0
print number, estimate, new_estimate, error

while ((estimate - new_estimate) > error):
	estimate = new_estimate
	print "%10.40f" %estimate
	new_estimate = (number/estimate + estimate)/2.0

^[39]

#! /usr/bin/python

error = 0.000000000001
number = 9
estimate = (number +1.0)/2.0
new_estimate = (number/estimate + estimate)/2.0
print number, estimate, new_estimate, error

while ((estimate - new_estimate) > error):
	estimate = new_estimate
	print "%10.40f" %estimate
	new_estimate = (number/estimate + estimate)/2.0

print "the square root of %10.40f is %10.40f" %(number, new_estimate)

^[40]

run the tests below feeding the same number, instead of different numbers, and see if the times are dramatically shorter.

^[41]

for loop. You know the number of iterations before you start.

^[42]

#! /usr/bin/python
from time import time

def babylonian_sqrt(my_number):

	error  = 0.000000000001         #10^-12
	guess  = (my_number+1)/2.0

	new_guess = (my_number/guess +  guess)/2.0
	while ((guess-new_guess) > error):
	        guess = new_guess
	        #print "%10.40f" %new_guess
	        new_guess = (my_number/guess +  guess)/2.0

	return new_guess


largest_number=1000 #calculate sqrt(1).. sqrt(1000)
start=time()
for number in range(1,largest_number):
	babylonian_sqrt(number)

finish=time()
print "largest number %10d time/iteration %10.40f" %(largest_number, (finish-start)/largest_number)

^[43]

#! /usr/bin/python
from time import time

def babylonian_sqrt(my_number):

	error  = 0.000000000001         #10^-12
	estimate  = (my_number+1)/2.0

	new_estimate = (my_number/estimate + estimate)/2.0
	while ((estimate-new_estimate) > error):
	        estimate = new_estimate
	        new_estimate = (my_number/estimate +  estimate)/2.0
	        #print "%10.40f" %new_estimate

#main()

largest_numbers=[1,10,100,1000,10000,100000,1000000]
for largest_number in largest_numbers:
	#largest_number=1000
	start=time()
	for number in range(0,largest_number):
	        babylonian_sqrt(number)

	finish=time()
	print "largest number %10d time/iteration %10.40f" %(largest_number, (finish-start)/largest_number)

^[44]

#main()
largest_numbers=[1,10,100,1000,10000,100000,1000000]
for largest_number in largest_numbers:
	#largest_number=1000
	start=time()
	for number in range(0,largest_number):
	        babylonian_sqrt(number)

	finish=time()
	print "babylonian_sqrt:      largest number %10d time/iteration %10.40f" %(largest_number, (finish-start)/largest_number)

	start=time()
	for number in range(0,largest_number):
	        sqrt(number)

	finish=time()
	print "library_sqrt: largest number %10d time/iteration %10.40f" %(largest_number, (finish-start)/largest_number)

^[45]

#! /usr/bin/python
from time import time
from math import sqrt

def babylonian_sqrt(my_number):

	error  = 0.000000000001 	#10^-12
	guess  = (my_number+1)/2.0

	new_guess = (my_number/guess +  guess)/2.0
	while ((guess-new_guess) > error):
		guess = new_guess
		new_guess = (my_number/guess +  guess)/2.0
		#print "%10.10f" %new_guess


#main()
largest_numbers=[1,10,100,1000,10000,100000,1000000]
for largest_number in largest_numbers:
	#largest_number=1000
	start=time()
	for number in range(0,largest_number):
		babylonian_sqrt(number)

	finish=time()
	print "babylonian_sqrt:      largest number %10d time/iteration %10.10f" %(largest_number, (finish-start)/largest_number)

	start=time()
	for number in range(0,largest_number):
		sqrt(number)

	finish=time()
	print "library_sqrt:     largest number %10d time/iteration %10.10f" %(largest_number, (finish-start)/largest_number)

	start=time()
	for number in range(0,largest_number):
		pass

	finish=time()
	print "empty_loop:   largest number %10d time/iteration %10.10f" %(largest_number, (finish-start)/largest_number)
	
	print

^[46]

//gcc -g square_root.c -lm 

#include <sys/types.h>
#include <time.h>
#include <unistd.h>
#include <stdio.h>
#include <math.h>

double babylonian_sqrt(my_number){

	double error;
	double guess, new_guess;

	error  = 0.000000000001;	//10^-12
	guess  = (my_number+1.0)/2.0;

	new_guess = (my_number/guess +  guess)/2.0;
	while ((guess-new_guess) > error){
		guess = new_guess;
		new_guess = (my_number/guess +  guess)/2.0;
		//printf ("babylonian_sqrt: %10d %5.20f \n", my_number, new_guess);
	}
	return new_guess;
}

//-----------------------------------

//timing code from http://beige.ucs.indiana.edu/B673/node104.html

int main(int argc, char *argv[]){

long int largest_number = 1000000;
int i,j;
double square_root;

time_t  t0, t1; /* time_t is defined on <time.h> and <sys/types.h> as long */
clock_t c0, c1; /* clock_t is defined on <time.h> and <sys/types.h> as int */

long count;
double a, b, c;

//printf ("using UNIX function time to measure wallclock time ... \n");
//printf ("using UNIX function clock to measure CPU time ... \n");


for (j=0; j<10; j++){ 

	largest_number=pow(10,j);	//largest_number is 10^0, 10^1... 10^9

	
	t0 = time(NULL);
	c0 = clock();
	//printf ("\tbegin (wall):            %ld\n", (long) t0);
	//printf ("\tbegin (CPU):             %d\n", (int) c0);
	
	for (i = 0; i < largest_number; i++){
		square_root = babylonian_sqrt(i);
		//printf ("%d, %10.20f \n", i, square_root);
		}
	
	t1 = time(NULL);
	c1 = clock();
	//printf ("\tend (wall):              %ld\n", (long) t1);
	//printf ("\tend (CPU);               %d\n", (int) c1);
	//printf ("\telapsed wall clock time: %ld\n", (long) (t1 - t0));
	//printf ("\telapsed CPU time:        %f\n", (float) (c1 - c0)*1.0/CLOCKS_PER_SEC);
	printf ("\ttime/iteration: babylonian       %10d  %10.20f\n", largest_number, (double) ((c1 - c0)*1.0/CLOCKS_PER_SEC)/largest_number );
	
	//lets do it again for the library sqrt
	
	t0 = time(NULL);
	c0 = clock();
	//printf ("\tbegin (wall):            %ld\n", (long) t0);
	//printf ("\tbegin (CPU):             %d\n", (int) c0);
	
	for (i = 0; i < largest_number; i++){
		square_root = sqrt(i);
		//printf ("%d, %10.20f \n", i, square_root);
		}
	
	t1 = time(NULL);
	c1 = clock();
	//printf ("\tend (wall):              %ld\n", (long) t1);
	//printf ("\tend (CPU);               %d\n", (int) c1);
	//printf ("\telapsed wall clock time: %ld\n", (long) (t1 - t0));
	//printf ("\telapsed CPU time:        %f\n", (float) (c1 - c0)*1.0/CLOCKS_PER_SEC);
	printf ("\ttime/iteration: library          %10d  %10.20f\n", largest_number, (double) ((c1 - c0)*1.0/CLOCKS_PER_SEC)/largest_number );
	
	//lets do it again for the empty loop
	
	t0 = time(NULL);
	c0 = clock();
	//printf ("\tbegin (wall):            %ld\n", (long) t0);
	//printf ("\tbegin (CPU):             %d\n", (int) c0);
	
	for (i = 0; i < largest_number; i++){
		//square_root = sqrt(i);
		//printf ("%d, %10.20f \n", i, square_root);
		}
	
	t1 = time(NULL);
	c1 = clock();
	//printf ("\tend (wall):              %ld\n", (long) t1);
	//printf ("\tend (CPU);               %d\n", (int) c1);
	//printf ("\telapsed wall clock time: %ld\n", (long) (t1 - t0));
	//printf ("\telapsed CPU time:        %f\n", (float) (c1 - c0)*1.0/CLOCKS_PER_SEC);
	printf ("\ttime/iteration: empty            %10d  %10.20f\n", largest_number, (double) ((c1 - c0)*1.0/CLOCKS_PER_SEC)/largest_number );
	
	printf ("\n");
	}
return 0;
}

//-------------------------------------

^[47]

#! /usr/bin/python
#
# pi_numerical_integration_diagram.py
#
# Joseph Mack (C) 2008, 2009 jmack (at) wm7d (dot) net. Released under GPL v3
# generates diagrams for the numerical integration under a circle, to give the value of Pi.
# uses PIL library.

from math import sqrt
import os, sys
from PIL import Image, ImageDraw, ImageFont

#fix coordinate system
#
#origin of diagram is at top left
#there appears to be no translate function ;-\
#strings are right (not center) justified
#axes origin is at 50,450 (bottom left)
x_origin=50
y_origin=450
unit=400	#400 pixels = 1 in cartesian coords
extra_labels=0

#change coordinate system.
#cartesian to pil coords
def x2pil(x):
	result=x_origin +x*unit 
	return result

def y2pil(y):
	result=y_origin -y*unit 
	return result

def draw_axes():
	#axes
	axes_origin=(x2pil(0), y2pil(0))
	x_axes=(axes_origin,x2pil(1),y2pil(0))
	y_axes=(axes_origin,x2pil(0),y2pil(1))
	draw.line(x_axes,fill="black")
	draw.line(y_axes,fill="black")
	#label axes
	color="#000000"
	x_axes_label_position_0=(x2pil(-0.02), y2pil(-0.02))
	draw.text(x_axes_label_position_0, "0", font=label_font, fill=color)
	x_axes_label_position_1=(x2pil(1-0.02), y2pil(-0.02))
	draw.text(x_axes_label_position_1, "1", font=label_font, fill=color)

	y_ayes_label_position_0=(x2pil(-0.05), y2pil(0.02))
	draw.text(y_ayes_label_position_0, "0", font=label_font, fill=color)
	y_ayes_label_position_1=(x2pil(-0.05), y2pil(1.02))
	draw.text(y_ayes_label_position_1, "1", font=label_font, fill=color)

	#extra x-axes
	if (extra_labels==1):
		x_axes_label_position_0=(x2pil(-0.1-0.02), y2pil(-0.02))
		draw.text(x_axes_label_position_0, "x=", font=label_font, fill=color)
		x_axes_label_position_0=(x2pil(-0.1-0.02), y2pil(-0.06))
		draw.text(x_axes_label_position_0, "i=", font=label_font, fill=color)
		x_axes_label_position_0=(x2pil(-0.02), y2pil(-0.06))
		draw.text(x_axes_label_position_0, "0", font=label_font, fill=color)
		x_axes_label_position_1=(x2pil(0.1-0.02), y2pil(-0.06))
		draw.text(x_axes_label_position_1, "1", font=label_font, fill=color)
		x_axes_label_position_n=(x2pil(1-0.02), y2pil(-0.06))
		draw.text(x_axes_label_position_n, "n", font=label_font, fill=color)

#---------------------
size=(500,500)
mode="RGBA"
#bbox for circle
bbox=(x2pil(-1),y2pil(1),x2pil(1),y2pil(-1))

#fonts
#print sys.path
label_font = ImageFont.load("/usr/lib/python2.4/site-packages/PIL/courR18.pil")
#label_font = ImageFont.load("PIL/courR18.pil")

#slices for integration
intervals=10	#400/10=40 pixels/inverval
rectangle_width=1.0/intervals

#arc only, showing Pythagorus
im=Image.new (mode,size,"white")
#draw quarter cicle
draw=ImageDraw.Draw(im)
#arc, angle goes clockwise :-(
draw.arc(bbox,-90,0,"#00FF00")
draw_axes()

color="#000000"
label_position=(x2pil(0.0), y2pil(1.1))
label="Pythagorean formula:"
draw.text(label_position, label, font=label_font, fill=color)
label_position=(x2pil(0.2), y2pil(1.05))
label="circumference of circle"
draw.text(label_position, label, font=label_font, fill=color)
point_label_position=(x2pil(0.4), y2pil(1.0))
draw.text(point_label_position, "x^2+y^2=r^2", font=label_font, fill=color)
polygon=(x2pil(0),y2pil(0),x2pil(0.6),y2pil(0),x2pil(0.6),y2pil(0.8))
draw.polygon(polygon,fill="#ffff00")

#draw fine grid, spacing = 0.01
intervals=100
rectangle_width=1.0/intervals
color="#00ff00"
for interval in xrange(0,intervals):
	x=interval*rectangle_width
	print x
	h = sqrt(1-x**2)
	#vertical line
	line=(x2pil(x),y2pil(0),x2pil(x),y2pil(h))
	draw.line(line,fill=color)
	#horizontal line
	line=(x2pil(0),y2pil(x),x2pil(h),y2pil(x))
	draw.line(line,fill=color)

#draw coarse grid, spacing = 0.1
intervals=10
rectangle_width=1.0/intervals
color="#000000"
for interval in xrange(0,intervals):
	x=interval*rectangle_width
	h = sqrt(1-x**2)
	#vertical line
	line=(x2pil(x),y2pil(0),x2pil(x),y2pil(h))
	draw.line(line,fill=color)
	#horizontal line
	line=(x2pil(0),y2pil(x),x2pil(h),y2pil(x))
	draw.line(line,fill=color)

#label points
color="#000000"
point_label_position=(x2pil(0.45-0.02), y2pil(0.05-0.02))
draw.text(point_label_position, "(0.6,0.0)", font=label_font, fill=color)
point_label_position=(x2pil(-0.15-0.02), y2pil(0.85-0.02))
draw.text(point_label_position, "(0.0,0.8)", font=label_font, fill=color)
point_label_position=(x2pil(0.45-0.02), y2pil(0.85-0.02))
draw.text(point_label_position, "(0.6,0.8)", font=label_font, fill=color)

#label sides of triangle
color="#000000"
point_label_position=(x2pil(0.25-0.02), y2pil(0.0-0.02))
draw.text(point_label_position, "x^2=0.36", font=label_font, fill=color)
point_label_position=(x2pil(0.55-0.02), y2pil(0.37-0.02))
draw.text(point_label_position, "y^2=0.64", font=label_font, fill=color)
point_label_position=(x2pil(0.15-0.02), y2pil(0.47-0.02))
draw.text(point_label_position, "r^2=1.0", font=label_font, fill=color)


im.show()
im.save("pi_pythagorus.png")

#upper bounds graph
im=Image.new (mode,size,"white")
#draw quarter cicle
draw=ImageDraw.Draw(im)
#arc, angle goes clockwise :-(
draw.arc(bbox,-90,0,"#00FF00")
draw_axes()

label_position=(x2pil(0.6), y2pil(1.0))
color="#0000ff"
intervals=10
rectangle_width=1.0/intervals
label="Upper Bounds"
draw.text(label_position, label, font=label_font, fill=color)
for interval in xrange(0,intervals):
	x=interval*rectangle_width
	h = sqrt(1-x**2)
	#vertical line
	line=(x2pil(x+rectangle_width),y2pil(0),x2pil(x+rectangle_width),y2pil(h))
	draw.line(line,fill=color)
	#horizontal line
	line=(x2pil(x),y2pil(h),x2pil(x+rectangle_width),y2pil(h))
	draw.line(line,fill=color)

im.show()
im.save("pi_upper_bounds.png")

#lower bounds graph
im=Image.new (mode,size,"white")
#draw quarter cicle
draw=ImageDraw.Draw(im)
#arc, angle goes clockwise :-(
draw.arc(bbox,-90,0,"#00FF00")
draw_axes()

color="#ff0000"
label="Lower Bounds"
draw.text(label_position, label, font=label_font, fill=color)
for interval in xrange(1,intervals+1):
	x=interval*rectangle_width
	h = sqrt(1-x**2)
	#vertical line
	#line=(x2pil(x),y2pil(0),x2pil(x),y2pil(h))
	#draw.line(line,fill=color)
	#horizontal line
	#line=(x2pil(x-interval),y2pil(h),x2pil(x),y2pil(h))
	#draw.line(line,fill=color)
	#rectangle
	rbox=(x2pil(x),y2pil(0),x2pil(x-rectangle_width),y2pil(h))
	draw.rectangle(rbox, outline="#ffffff", fill=color)
	#horizontal line - shows box on right of zero height
	line=(x2pil(x-rectangle_width),y2pil(h),x2pil(x),y2pil(h))
	draw.line(line,fill=color)

im.show()
im.save("pi_lower_bounds.png")

#upper and lower bounds together
im=Image.new (mode,size,"white")
#draw quarter cicle
draw=ImageDraw.Draw(im)
#arc, angle goes clockwise :-(
draw.arc(bbox,-90,0,"#00FF00")
draw_axes()

color="#0000ff"
label="Upper Bounds"
draw.text(label_position, label, font=label_font, fill=color)
for interval in xrange(0,intervals):
	x=interval*rectangle_width
	h = sqrt(1-x**2)
	#vertical line
	#line=(x2pil(x+1.0/intervals),y2pil(0),x2pil(x+1.0/intervals),y2pil(h))
	line=(x2pil(x+rectangle_width),y2pil(0),x2pil(x+rectangle_width),y2pil(h))
	draw.line(line,fill=color)
	#horizontal line
	#line=(x2pil(x),y2pil(h),x2pil(x+1.0/intervals),y2pil(h))
	line=(x2pil(x),y2pil(h),x2pil(x+rectangle_width),y2pil(h))
	draw.line(line,fill=color)

color="#ff0000"
label="Lower Bounds"
label_position=(x2pil(0.6), y2pil(0.9))
draw.text(label_position, label, font=label_font, fill=color)
for interval in xrange(1,intervals+1):
	x=interval*rectangle_width
	h = sqrt(1-x**2)
	#vertical line
	#line=(x2pil(x),y2pil(0),x2pil(x),y2pil(h))
	#draw.line(line,fill=color)
	#horizontal line
	#line=(x2pil(x-interval),y2pil(h),x2pil(x),y2pil(h))
	#draw.line(line,fill=color)
	#rectangle
	rbox=(x2pil(x),y2pil(0),x2pil(x-rectangle_width),y2pil(h))
	draw.rectangle(rbox, outline="#ffffff", fill=color)
	#horizontal line - shows box on right of zero height
	line=(x2pil(x-rectangle_width),y2pil(h),x2pil(x),y2pil(h))
	draw.line(line,fill=color)

im.show()
im.save("pi_upper_and_lower_bounds.png")

#upper and shifted lower bounds together
im=Image.new (mode,size,"white")
#draw quarter cicle
draw=ImageDraw.Draw(im)
#arc, angle goes clockwise :-(
draw.arc(bbox,-90,0,"#00FF00")
extra_labels=1
draw_axes()
extra_labels=0

color="#ff0000"
label="Shifted Lower Bounds"
label_position=(x2pil(0.3), y2pil(1.05))
draw.text(label_position, label, font=label_font, fill=color)

for interval in xrange(1,intervals+1):
	x=interval*rectangle_width
	h = sqrt(1-x**2)
	#vertical line
	#line=(x2pil(x),y2pil(0),x2pil(x),y2pil(h))
	#draw.line(line,fill=color)
	#horizontal line
	#line=(x2pil(x-1.0/intervals),y2pil(h),x2pil(x),y2pil(h))
	#draw.line(line,fill=color)
	#rectangle
	rbox=(x2pil(x+rectangle_width),y2pil(0),x2pil(x),y2pil(h))
	draw.rectangle(rbox, outline="#ffffff", fill=color)
	#horizontal line - shows box on right of zero height
	#line=(x2pil(x-rectangle_width),y2pil(h),x2pil(x),y2pil(h))
	#draw.line(line,fill=color)

color="#0000ff"
label="Upper Bounds"
label_position=(x2pil(0.0), y2pil(1.1))
draw.text(label_position, label, font=label_font, fill=color)
for interval in xrange(0,intervals):
	x=interval*rectangle_width
	h = sqrt(1-x**2)
	#vertical line
	#line=(x2pil(x+rectangle_width),y2pil(0),x2pil(x),y2pil(h))
	line=(x2pil(x),y2pil(0),x2pil(x),y2pil(h))
	draw.line(line,fill=color)
	#horizontal line
	line=(x2pil(x),y2pil(h),x2pil(x+rectangle_width),y2pil(h))
	draw.line(line,fill=color)

#arc is obliterated, do it again
draw.arc(bbox,-90,0,"#00FF00")
im.show()
im.save("pi_upper_and_shifted_lower_bounds.png")

#lower bounds graph with intervals=100
intervals=100	#400/10=40 pixels/inverval
rectangle_width=1.0/intervals
im=Image.new (mode,size,"white")
#draw quarter cicle
draw=ImageDraw.Draw(im)
#arc, angle goes clockwise :-(
draw.arc(bbox,-90,0,"#00FF00")
draw_axes()


color="#ff0000"
label="Lower Bounds, 100 intervals"
label_position=(x2pil(0.0), y2pil(1.1))
draw.text(label_position, label, font=label_font, fill=color)
for interval in xrange(0,intervals):
	x=interval*rectangle_width
	h = sqrt(1-x**2)
	rbox=(x2pil(x),y2pil(0),x2pil(x-rectangle_width),y2pil(h))
	draw.rectangle(rbox, outline="#ffffff", fill=color)
	#horizontal line - shows box on right of zero height
	line=(x2pil(x-rectangle_width),y2pil(h),x2pil(x),y2pil(h))
	draw.line(line,fill=color)

#arc is obliterated, do it again
#draw.arc(bbox,-90,0,"#00FF00")
im.show()
im.save("pi_lower_bounds_100_intervals.png")
# pi_numerical_integration_diagram.py ---------------------------

^[48]

y=0.707. The line joining this point to the center makes an angle of 45° with the x axis and has a slope=1

^[49]

30°

^[50]

You'll need 10 for the lower bound, 10 for the upper bound. 9 of these are common to both sets. There'll be 11 heights total.

^[51]

I needed 11 numbers, 0..10. Thus I used range(0,11) (sensible huh? range() is like this).

^[52]

Otherwise you'll get integer division.

^[53]

base=1.0/num_intervals; area=height/num_intervals

^[54]

#! /usr/bin/python
#py_lower.py
from math import sqrt

num_intervals = 10
base_width=1.0/num_intervals
area = 0

#alternately the loop could start
#for i in range (1,num_intervals+1):
#	height = sqrt(1-i*i*1.0/(num_intervals*num_intervals))

for i in range (0,num_intervals):
	height = sqrt(1-(i+1)*(i+1)*1.0/(num_intervals*num_intervals))
	area += height*base_width
	print "%d, %3.5f, %3.5f" %(i, height, area)

print "lower bound of pi %3.5f" %(area*4)

# pi_lower.py -------------------------------------------------------------

^[55]

#! /usr/bin/python
#py_upper.py
from math import sqrt

num_intervals = 10
base_width=1.0/num_intervals

Area=0
for i in range (0,num_intervals):
	Height = sqrt(1-i*i*1.0/(num_intervals*num_intervals))
	Area += Height*base_width
	print "%d, %3.5f, %3.5f" %(i, Height, Area)

print "upper bound for pi %3.5f" %(Area*4)

# pi_upper.py ------------------------------------

^[56]

There are 9 heights common to both outputs.

^[57]

#! /usr/bin/python
#pi_2.py
from math import sqrt

# initialise variables
#num_intervals = 1000000
num_intervals = 10
base_width=1.0/num_intervals	#width of rectangles
Area=0				#upper bound
area=0				#lower bound
a = 0				#accumulate area of rectangles

# calculate area for lower bounds
for i in range(1,num_intervals):
	h = sqrt(1-1.0*i*i/(num_intervals*num_intervals))
	a += h*base_width
	print "%10d %10.20f %10.20f" %(i, h, a)

# calculate area for upper bounds
#add left hand interval to Area and right hand interval to area.
Area = a + 1.0*base_width
area = a + 0.0*base_width	#adding 0.0 is trivial, 
				#this line is here 
				#to let the reader know what's happening

print "pi upper bound %10.20f, lower bound %10.20f" %(Area*4, area*4)
# pi_2.py -------------------------------------------------------

^[58]

signed int

^[59]

unsigned int

^[60]

#! /usr/bin/python
#pi_3.py
from math import sqrt

# initialise variables
#num_intervals = 1000000
num_intervals = 10
base_width=1.0/num_intervals	#width of rectangles
Area=0				#upper bound
area=0				#lower bound
a = 0				#accumulate area of rectangles

# calculate area for lower bounds
#for i in range(1,num_intervals):
for i in xrange(1,num_intervals):
	h = sqrt(1-1.0*i*i/(num_intervals*num_intervals))
	a += h*base_width
	#print "%10d %10.20f %10.20f" %(i, h, a)

# calculate area for upper bounds
# add left hand interval to Area and right hand interval to area.
Area = a + 1.0*base_width
area = a + 0.0*base_width	#adding 0.0 is trivial, 
				#this line is here 
				#to let the reader know what's happening

print "pi upper bound %10.20f, lower bound %10.20f" %(Area*4, area*4)

# pi_3.py -------------------------------------------------------

^[61]

The square is calculated num_intervals times. You only need to calculate it once. If you need to use a value multiple times, you calculate it, store it and use the stored value. You never calculate the same value again.

^[62]

#! /usr/bin/python

from time import time

num_intervals=100000
num_intervals_squared=num_intervals*num_intervals
sum = 0

start = time()
for i in xrange(0,num_intervals):
     #sum+=(1.0*i*i)/(num_intervals*num_intervals)
     sum+=(1.0*i*i)/num_intervals_squared

finish=time()
print "precalculate num_intervals^2:      %10d %10.10f    %10.5f" %(num_intervals, sum, num_intervals/(finish-start))

^[63]

#! /usr/bin/python
#test_optimisation.py

from time import time

#1 base

#num_intervals=100000
#num_intervals=2**17 
num_intervals=1000000
#num_intervals=2**20 
num_intervals_squared=num_intervals*num_intervals
interval_squared=1.0/num_intervals_squared
print "                                   iterations          sum   iterations/sec"
sum = 0

start = time()
for i in xrange(0,num_intervals):
     sum+=1.0/(num_intervals*num_intervals)
     #sum+=1.0/num_intervals_squared
     #sum+=1.0*interval_squared
 
finish=time()
print "unoptimised code:                  %10d %10.10f    %10.5f" %(num_intervals, sum, num_intervals/(finish-start))

sum = 0
start = time()
for i in xrange(0,num_intervals):
     #sum+=1.0/(num_intervals*num_intervals)
     sum+=1.0/num_intervals_squared
     #sum+=1.0*interval_squared
 
finish=time()
print "precalculate num_intervals^2:      %10d %10.10f    %10.5f" %(num_intervals, sum, num_intervals/(finish-start))
#------------------------------------

^[64]

#! /usr/bin/python

from time import time

num_intervals=100000 #60k iterations/sec
#num_intervals=2^17 #60k iterations/sec
num_intervals_squared=num_intervals*num_intervals
interval_squared=1.0/num_intervals_squared
sum = 0

start = time()
for interval in xrange(0,num_intervals):
     #sum+=(1.0*interval*interval)/(num_intervals*num_intervals)
     #sum+=(1.0*interval*interval)/num_intervals_squared
     sum+=interval*interval * interval_squared

finish=time()
print "multiple by reciprocal: %10d %10.10f    %10.5f" %(num_intervals, sum, num_intervals/(finish-start))

^[65]

#! /usr/bin/python
#test_optimisation.py

from time import time

#1 base

#num_intervals=100000
#num_intervals=2**17 
#num_intervals=1000000
num_intervals=10
#num_intervals=2**20 
num_intervals_squared=num_intervals*num_intervals
interval_squared=1.0/num_intervals_squared
base_width=1.0/num_intervals
print "                              iterations          sum   iterations/sec"
sum = 0

start = time()
for i in xrange(0,num_intervals):
     sum+=1.0/(num_intervals*num_intervals)
     #sum+=1.0/num_intervals_squared
     #sum+=1.0*interval_squared
 
finish=time()
print "unoptimised code:             %10d %10.10f    %10.5f" %(num_intervals, sum, num_intervals/(finish-start))

sum = 0
start = time()
for i in xrange(0,num_intervals):
     #sum+=1.0/(num_intervals*num_intervals)
     sum+=1.0/num_intervals_squared
     #sum+=1.0*interval_squared
 
finish=time()
print "precalculate num_intervals^2: %10d %10.10f    %10.5f" %(num_intervals, sum, num_intervals/(finish-start))

sum = 0
start = time()
for i in xrange(0,num_intervals):
     #sum+=1.0/(num_intervals*num_intervals)
     #sum+=1.0/num_intervals_squared
     sum+=1.0*interval_squared
 
finish=time()
print "multiply by reciprocal:       %10d %10.10f    %10.5f" %(num_intervals, sum, num_intervals/(finish-start))

sum = 0
start = time()
for i in xrange(0,num_intervals):
     #sum+=1.0/(num_intervals*num_intervals)
     #sum+=1.0/num_intervals_squared
     #sum+=1.0*interval_squared
     sum+=base_width**2
     print sum
 
finish=time()
print "normalised reciprocal:        %10d %10.10f    %10.5f" %(num_intervals, sum, num_intervals/(finish-start))
#------------------------------------

^[66]

In real math, where decimal numbers have to be represented by 1.0..9.999₁₀*10^n and binary numbers are represented by 1.0..1.1111₂*2^n.

^[67]

interval=1.0/num_intervals

sum = 0
start = time()
for i in xrange(0,num_intervals):
     #sum+=1.0/(num_intervals*num_intervals)
     #sum+=1.0/num_intervals_squared
     #sum+=1.0*interval_squared
     sum+=(1.0*interval)**2

finish=time()
print "normalised reciprocal           : %10d %10.10f    %10.5f" %(num_intervals, sum, num_intervals/(finish-start))

^[68]

#! /usr/bin/python
#test_optimisation.py

from time import time

#1 base

#num_intervals=100000
#num_intervals=2**17 
num_intervals=1000000
#num_intervals=2**20 
num_intervals_squared=num_intervals*num_intervals
interval_squared=1.0/num_intervals_squared
interval=1.0/num_intervals
print "                                   iterations          sum   iterations/sec"
sum = 0

start = time()
for i in xrange(0,num_intervals):
     sum+=1.0/(num_intervals*num_intervals)
     #sum+=1.0/num_intervals_squared
     #sum+=1.0*interval_squared
 
finish=time()
print "unoptimised code:                  %10d %10.10f    %10.5f" %(num_intervals, sum, num_intervals/(finish-start))

sum = 0
start = time()
for i in xrange(0,num_intervals):
     #sum+=1.0/(num_intervals*num_intervals)
     sum+=1.0/num_intervals_squared
     #sum+=1.0*interval_squared
 
finish=time()
print "precalculate num_intervals^2:      %10d %10.10f    %10.5f" %(num_intervals, sum, num_intervals/(finish-start))

sum = 0
start = time()
for i in xrange(0,num_intervals):
     #sum+=1.0/(num_intervals*num_intervals)
     #sum+=1.0/num_intervals_squared
     sum+=1.0*interval_squared
 
finish=time()
print "multiple by reciprocal:            %10d %10.10f    %10.5f" %(num_intervals, sum, num_intervals/(finish-start))

sum = 0
start = time()
for i in xrange(0,num_intervals):
     #sum+=1.0/(num_intervals*num_intervals)
     #sum+=1.0/num_intervals_squared
     #sum+=1.0*interval_squared
     sum+=(1.0*interval)**2
 
finish=time()
print "normalised reciprocal:             %10d %10.10f    %10.5f" %(num_intervals, sum, num_intervals/(finish-start))
#------------------------------------

^[69]

We reduce the numbers by multiplying by the inverse of num_intervals. Here's the code before and after optimisation.

num_intervals=1000
#original code
for i in xrange(1,num_intervals):
	h = sqrt(1-(1.0*i*i)/(num_intervals*num_intervals))
	a += h/num_intervals


#optimised code using multiplication, rather than division to reduce numbers
base_width=1.0/num_intervals

for i in xrange(1,num_intervals):
	h = sqrt(1-(i*interval)**2)
	a += h*interval

^[70]

#! /usr/bin/python
#pi_4.py
# (C) Joseph Mack 2009, released under GPL v3

from math import sqrt
from time import time

#num_intervals = 100000000
#num_intervals = 100000000
#num_intervals = 1073741824 #2^30
#num_intervals = 134217728 #2^27
#num_intervals = 8388608 #2^27 #8m44.830s = 520 secs, 16131 iterations/sec
num_intervals = 10

base_width=1.0/num_intervals

Area=0  #upper bound
area=0  #lower bound
a = 0	#accumulated area of rectangles

start = time()
for i in xrange(1,num_intervals):
	h = sqrt(1-(i*base_width)**2)
	a += h*base_width
	#print "%10d %10.20f %10.20f" %(i, h, a)

finish=time()
#add left hand interval to Area and right hand interval to area.

Area = a + 1.0*base_width
area = a + 0.0*base_width	#adding 0.0 is trivial,
	                        #this line is here
	                        #to let the reader know what's happening

print "num_intervals %d, time %f, pi lower bound %10.20f, upper bound %10.20f speed %10.20f" %(num_intervals, finish-start, area*4, Area*4, num_intervals/(finish-start))

# pi_4.py ----------------------------------------------

^[71]

Here's what happens if you look past the end of a string.

>>> number = 3.14157
>>> str_number = str(number)
>>> print str_number[10]
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
IndexError: string index out of range

Your program will crash. You'll crash your rocket or kill the patient whose heart monitor is running your code.

^[72]

There is no number that corresponds to the empty string "". You can't return 0 as you didn't get matching 0's in the first place of the two numbers. The solution is to return NaN (not a number), one of the IEEE 754 special reals (along with ∞). You can't assign NaN to a variable in Python; instead you execute code that generates a NaN.

In the case when pi_result=="", run this code

if (pi_result==""):
	#there isn't a constant for NaN. We have to generate it.
	val = 1e30000   	#infinity
	pi_result = val/val        #NaN

This assigns pi_result the value of a valid real (NaN) when the match is the null string (""). With this change the code no longer crashes if the match is "", but it is now the responsibility of the calling code to handle the returned NaN (this is not a big deal, all languages handle the IEEE 754 numbers).

^[73]

#! /usr/bin/python
# compare_reals.py
# (C) Homer Simpson 2009, released under GPL v3.
# compares reals, compares digits in two reals pairwise, starting at the leftmost digit.
# prints digits that match upto the first missmatch, then exits.
#
#eg
# real_lower = 3.14157
# real_upper = 3.14161
# output is 3.141

real_lower = 3.14157
real_upper = 0.0
#real_upper = 3.14157
#real_upper = 3.14161
pi_result=""

#turn numbers into strings
string_lower = str(real_lower)
string_upper = str(real_upper)

#find length of shortest number
len_string_lower = len(string_lower)
len_string_upper = len(string_upper)
shorter = len_string_upper
if (len_string_lower < len_string_upper):
	shorter = len_string_lower
	
#print shorter

for i in range(0, shorter):
	#print i
	if (string_lower[i] == string_upper[i]):
		pi_result += string_lower[i]
	else:
		break;

print pi_result

#if outputting a real
#if (pi_result==""):
#	#there isn't a constant for NaN. We have to generate NaN.
#	val = 1e30000		#first generate infinity
#	pi_result = val/val	#inf/inf -> NaN	
#	
#print float(pi_result)

# compare_reals.py---------------------------

^[74]

The order of the algorithm is O(n).

^[75]

#! /usr/bin/python
# compare_reals_2.py
# (C) Homer Simpson 2009, released under GPL v3.
# compares reals, returns digits that match, starting at leftmost digit
#
#eg
# real_lower = 3.14157
# real_upper = 3.14161
# output is 3.141

def longest_common_subreal(real_1, real_2):

# string longest_common_subreal(real,real)
# (C) Homer Simpson 2009, released under GPL v3.
# compares reals, returns digits that match (as a string), starting at leftmost digit

	#initialise variables
	result = ""	#null string
	#turn numbers into strings
	string_1 = str(real_1)
	string_2 = str(real_2)

	#find length of shortest number
	len_string_1 = len(string_1)
	len_string_2 = len(string_2)
	shorter = len_string_2	#initialise shorter

	#change shorter if it's the wrong one
	if (len_string_1 < len_string_2):
		shorter = len_string_1
	
	#print shorter

	for i in range(0, shorter):
		if (string_1[i] == string_2[i]):
			result += string_1[i]
		else:
			break;	#exit for loop on first mismatch

	return result

	#if outputting a real
	#if (result==""):
	#	#there isn't a constant for NaN. We have to generate NaN.
	#	val = 1e30000		#first generate infinity
	#	result = val/val	#inf/inf -> NaN	
	#
	#return float(result)

# longest_common_subreal(real,real)--------------------------

#main()

real_lower = 3.14157
#real_upper = 0.0
#real_upper = 3.14157
real_upper = 3.14161
pi_result=longest_common_subreal(real_lower, real_upper)

print pi_result

# compare_reals_2.py---------------------------

^[76]

#! /usr/bin/python
#pi_5.py
# (C) Joseph Mack 2009, released under GPL v3

from math import sqrt
from time import time

def longest_common_subreal(real_1, real_2):

# string longest_common_subreal(real,real)
# (C) Homer Simpson 2009, released under GPL v3.
# compares reals, compares digits in two reals pairwise, starting at the leftmost digit.
# prints digits that match upto the first missmatch, then exits.
#
# eg
# real_lower = 3.14157
# real_upper = 3.14161
# output is 3.141

	#initialise variables
	result = ""	#null string

	#turn numbers into strings
	string_1 = str(real_1)
	string_2 = str(real_2)

	#find length of shortest number
	len_string_1 = len(string_1)
	len_string_2 = len(string_2)
	shorter = len_string_2	#initialise shorter

	#change shorter if it's the wrong one
	if (len_string_1 < len_string_2):
		shorter = len_string_1
	
	#print shorter

	for i in range(0, shorter):
		if (string_1[i] == string_2[i]):
			result += string_1[i]
		else:
			break;	#exit for loop on first mismatch

	return result

	#if outputting a real
	#if (result==""):
	#	#there isn't a constant for NaN. We have to generate NaN.
	#	val = 1e30000		#first generate infinity
	#	result = val/val	#inf/inf -> NaN	
	#
	#return float(result)

# longest_common_subreal(real,real)--------------------------

#main()

#num_intervals = 100000000
#num_intervals = 100000000
#num_intervals = 1073741824 #2^30
#num_intervals = 134217728 #2^27
#num_intervals = 8388608 #2^27 #8m44.830s = 520 secs, 16131 iterations/sec
#num_intervals = 10
num_intervals = 100

base_width=1.0/num_intervals

Area=0  #upper bound
area=0  #lower bound
a = 0	#accumulated area of rectangles

start = time()
for i in xrange(1,num_intervals):
	h = sqrt(1-(i*base_width)**2)
	a += h*base_width
	#print "%10d %10.20f %10.20f" %(i, h, a)

finish=time()
#add left hand interval to Area and right hand interval to area.

Area = a + 1.0*base_width
area = a + 0.0*base_width	#adding 0.0 is trivial,
	                        #this line is here
	                        #to let the reader know what's happening

print "num_intervals %d, time %f, pi_result %s, lower bound %10.20f, upper bound %10.20f speed %10.20f" %(num_intervals, finish-start, longest_common_subreal(area*4,Area*4), area*4, Area*4, num_intervals/(finish-start))

# pi_5.py ----------------------------------------------

^[77]

16 mins

^[78]

get a better algorithm. If we don't have one, we should go find something else to do, as we've reached our limit here.

^[79]

we'd need 10^15≅2^50 operations. In the worst case, 50 of the 52 bits of the mantissa would be rounding errors.

^[80]

The numerical integration is calculating upper and lower bounds. It is not calculating π. We get π after the numerical integration is finished, by pointing at a pair of numbers and waving our hands. The value of the upper and lower bounds is calculated to an accuracy of 24 digits when you do 10^9 iterations (it will be approx 23 digits if you do 10^10 iterations).

^[81]

We haven't a clue. All we know is that we have the upper and lower bounds to 24 decimal digits. The accuracy of π is determined by the number of digits which are the same in both numbers. If the algorithm for calculating π converges slowly, after 10^9 iterations only maybe 3 digits agree; if the algorithm converges quickly, all 24 digits might agree. We have to look elsewhere for an estimate of how quickly the algorithm converges.

^[82]

Numerator: for x=1,2,3,4..., x%2 has the value 1,0,1,0... Multiplying by 2 and subtracting 1 gives 1,-1,1,-1

Denominator: for x=1,2,3,4..., (2*x-1) has the value 1,3,5,7...

^[83]

Leibnitz Gregory: 4*10^5. Numerical integration 10^6.

^[84]

#! /usr/bin/python
#
#machin.py
#calculates pi using the Machin formula
#pi/4 = 4[1/5 - 1/(3*5^3) + 1/(5*5^5) - 1/(7*5^7) + ...] - [1/239 - 1/(3*239^3) + 1/(5*239^5) - 1/(7*239^7) + ...]

#Coded up from 
#An Imaginery Tale, The Story of sqrt(-1), Paul J. Nahin 1998, p173, Princeton University Press, ISBN 0-691-02795-1.

large_number=20
pi_over_4=0

for x in range(1, large_number):
	pi_over_4 += 4.0*((x%2)*2-1)/((2*x-1)*5**(2*x-1))
	pi_over_4 -= 1.0*((x%2)*2-1)/((2*x-1)*239**(2*x-1))
	print "x %2d pi %2.20f" %(x, pi_over_4*4)

# machin.py-------------------------

^[85]

11 iterations, for 15 significant figures.

^[86]

Machin: 4. Leibnitz Gregory: 4*10^5. Numerical integration 10^6.

^[87]

Look for the +-*/% signs in the lines of code. I get about 13 for each of the two lines, say 25 for the loop. There's a minor wrinkle: for each iteration you are also raising a number to some power, which I'm counting here as only one operation, so my answer here will be a lower bound (you want the upper bound). The number of operations goes up arithmetically for each iteration. We can get an upper bound by summing the arithmetic progression if we want. However at the moment I only want to compare the errors in the Machin series with the errors for numerical integration, so let's use the lower bound for the moment. 11 iterations are needed, giving the number of rounding errors of 265 (say 256 for a back of the envelope calculation), or 8 bits. 8 bits is about 3 decimal places (256 is about 3 decimal places), meaning that the rounding errors will make the last 3 decimal places invalid. As we will see in the next section, all the places in the output of the value of π from the Machin series just happen to be correct.

For comparison, the numerical integration method for the same number of decimal places of precision, would have all places invalid.

^[88]

Machin's formula needs about 100 iterations to get π to 100 places. Ramanujan's formula needs 100/14≅7 iterations.

^[89]

Ramanujan:                1 
Machin:                   4 
Leibnitz Gregory:    4*10^5 
Numerical integration: 10^6

^[90]

Each column is topped by an arc of the circumference. You would replace the arc with a straight line (making a triangle at the top of the column) and calculate the length of the straight line. Sum the lengths for all columns.

^[91]

(pi/4)*100=78 or 78% of the area of the square with corners (x,y)=(0,0), (x,y)=(1,1). Turn the octant over, look along the y axis, at the base of the octant (in the xz plane), where you can see the square bases of the columns. How many bases are there? How many of these bases are inside a circle of radius=1?

^[92]

#! /usr/bin/python
#
# pi_numerical_integration_diagram_78.py
#
# Joseph Mack (C) 2008, 2009 jmack (at) wm7d (dot) net. Released under GPL v3
# generates diagrams for the numerical integration under a circle, to give the value of Pi.
# uses PIL library.

from math import sqrt
import os, sys
from PIL import Image, ImageDraw, ImageFont

#fix coordinate system
#
#origin of diagram is at top left
#there appears to be no translate function ;-\
#strings are right (not center) justified
#axes origin is at 50,450 (bottom left)
x_origin=50
y_origin=450
unit=400	#400 pixels = 1 in cartesian coords

#change coordinate system.
#cartesian to pil coords
def x2pil(x):
	result=x_origin +x*unit 
	return result

def y2pil(y):
	result=y_origin -y*unit 
	return result

def draw_axes():
	#axes
	axes_origin=(x2pil(0), y2pil(0))
	x_axes=(axes_origin,x2pil(1),y2pil(0))
	y_axes=(axes_origin,x2pil(0),y2pil(1))
	draw.line(x_axes,fill="black")
	draw.line(y_axes,fill="black")
	#label axes
	color="#000000"
	x_axes_label_position_0=(x2pil(-0.02), y2pil(-0.02))
	draw.text(x_axes_label_position_0, "0", font=label_font, fill=color)
	x_axes_label_position_1=(x2pil(1-0.02), y2pil(-0.02))
	draw.text(x_axes_label_position_1, "1", font=label_font, fill=color)

	y_ayes_label_position_0=(x2pil(-0.05), y2pil(0.02))
	draw.text(y_ayes_label_position_0, "0", font=label_font, fill=color)
	y_ayes_label_position_1=(x2pil(-0.05), y2pil(1.02))
	draw.text(y_ayes_label_position_1, "1", font=label_font, fill=color)

#---------------------
size=(500,500)
mode="RGBA"
#bbox for circle
bbox=(x2pil(-1),y2pil(1),x2pil(1),y2pil(-1))

#fonts
#print sys.path
label_font = ImageFont.load("/usr/lib/python2.4/site-packages/PIL/courR18.pil")
#label_font = ImageFont.load("PIL/courR18.pil")

#slices for integration
intervals=10	#400/10=40 pixels/inverval
interval=1.0/intervals

#arc only, showing Pythagorus
im=Image.new (mode,size,"white")
#draw quarter cicle
draw=ImageDraw.Draw(im)
#arc, angle goes clockwise :-(
draw.arc(bbox,-90,0,"#00FF00")
draw_axes()

color="#000000"
label_position=(x2pil(-0.1), y2pil(1.1))
label="Bases for calculation of heights"
draw.text(label_position, label, font=label_font, fill=color)

#draw fine grid, spacing = 0.01
intervals=100
color="#00ff00"
rectangle_width=1.0/intervals
for interval in xrange(0,intervals):
	x=interval*rectangle_width
	h = sqrt(1-x**2)
	#vertical line
	line=(x2pil(x),y2pil(0),x2pil(x),y2pil(h))
	draw.line(line,fill=color)
	#horizontal line
	line=(x2pil(0),y2pil(x),x2pil(h),y2pil(x))
	draw.line(line,fill=color)

color="#ff0000"
#do each line by hand
x=1.0
h=0.5
##vertical line
line=(x2pil(x),y2pil(0),x2pil(x),y2pil(h))
draw.line(line,fill=color)

x=0.9
h=0.6
##vertical line
line=(x2pil(x),y2pil(0),x2pil(x),y2pil(h))
draw.line(line,fill=color)

x=0.8
h=0.8
##vertical line
line=(x2pil(x),y2pil(0),x2pil(x),y2pil(h))
draw.line(line,fill=color)

x=0.7
h=0.8
##vertical line
line=(x2pil(x),y2pil(0),x2pil(x),y2pil(h))
draw.line(line,fill=color)

x=0.6
h=0.9
##vertical line
line=(x2pil(x),y2pil(0),x2pil(x),y2pil(h))
draw.line(line,fill=color)

x=0.5
h=1.0
##vertical line
line=(x2pil(x),y2pil(0),x2pil(x),y2pil(h))
draw.line(line,fill=color)

x=0.4
h=1.0
##vertical line
line=(x2pil(x),y2pil(0),x2pil(x),y2pil(h))
draw.line(line,fill=color)

x=0.3
h=1.0
##vertical line
line=(x2pil(x),y2pil(0),x2pil(x),y2pil(h))
draw.line(line,fill=color)

x=0.2
h=1.0
##vertical line
line=(x2pil(x),y2pil(0),x2pil(x),y2pil(h))
draw.line(line,fill=color)

x=0.1
h=1.0
##vertical line
line=(x2pil(x),y2pil(0),x2pil(x),y2pil(h))
draw.line(line,fill=color)

x=0.0
h=1.0
##vertical line
line=(x2pil(x),y2pil(0),x2pil(x),y2pil(h))
draw.line(line,fill=color)

x=0.0
h=1.0
##horizontal line
line=(x2pil(0),y2pil(x),x2pil(h),y2pil(x))
draw.line(line,fill=color)

x=0.1
h=1.0
##horizontal line
line=(x2pil(0),y2pil(x),x2pil(h),y2pil(x))
draw.line(line,fill=color)

x=0.2
h=1.0
##horizontal line
line=(x2pil(0),y2pil(x),x2pil(h),y2pil(x))
draw.line(line,fill=color)

x=0.3
h=1.0
##horizontal line
line=(x2pil(0),y2pil(x),x2pil(h),y2pil(x))
draw.line(line,fill=color)

x=0.4
h=1.0
##horizontal line
line=(x2pil(0),y2pil(x),x2pil(h),y2pil(x))
draw.line(line,fill=color)

x=0.5
h=1.0
##horizontal line
line=(x2pil(0),y2pil(x),x2pil(h),y2pil(x))
draw.line(line,fill=color)

x=0.6
h=0.9
##horizontal line
line=(x2pil(0),y2pil(x),x2pil(h),y2pil(x))
draw.line(line,fill=color)

x=0.7
h=0.8
##horizontal line
line=(x2pil(0),y2pil(x),x2pil(h),y2pil(x))
draw.line(line,fill=color)

x=0.8
h=0.8
##horizontal line
line=(x2pil(0),y2pil(x),x2pil(h),y2pil(x))
draw.line(line,fill=color)

x=0.9
h=0.6
##horizontal line
line=(x2pil(0),y2pil(x),x2pil(h),y2pil(x))
draw.line(line,fill=color)

x=1.0
h=0.5
##horizontal line
line=(x2pil(0),y2pil(x),x2pil(h),y2pil(x))
draw.line(line,fill=color)

im.show()
im.save("pi_78.png")
#---------------------------

^[93]

#! /usr/bin/python
# calculate_interest.py

annual_interest=1.0
initial_deposit=1.0
evaluations_per_year=1

balance=initial_deposit*(1+annual_interest/evaluations_per_year)

print "evaluations/year %10d balance is %2.10f" %(evaluations_per_year, balance)

# calculate_interest.py-------------------

^[94]

#! /usr/bin/python
# calculate_interest.py

annual_interest=1.0
initial_deposit=1.0
evaluations_per_year=2

balance=initial_deposit*(1+annual_interest/evaluations_per_year)
balance*=(1+annual_interest/evaluations_per_year)

print "evaluations/year %10d balance is %2.10f" %(evaluations_per_year, balance)

# calculate_interest.py-------------------

^[95]

#! /usr/bin/python
# calculate_interest.py

# calculate interest on $1 for 1yr, with interest = 100%/yr
# interest is calculated at fractions of a year, updating balance each time.

annual_interest=1.0
initial_deposit=1.0
evaluations_per_year=2

balance=initial_deposit
for x in range (1, evaluations_per_year+1):
	balance=balance*(1+annual_interest/evaluations_per_year)
	print "%d balance is %f" %(x, balance)

print "evaluations/year %10d balance is %2.10f" %(evaluations_per_year, balance)
# calculate_interest.py-------------------

^[96]

#! /usr/bin/python
# calculate_interest.py

# calculate interest on $1 for 1yr, with interest = 100%/yr
# interest is calculated at fractions of a year, updating balance each time.

annual_interest=1.0
#evaluations_per_year=2

for evaluations_per_year in (1,2,10,100,1000,10000,100000,1000000,10000000,100000000,2000000000):
	initial_deposit=1.0
	balance=initial_deposit
	for x in xrange (1, evaluations_per_year+1):
		balance=balance*(1+annual_interest/evaluations_per_year)
		#print "%d balance is %f" %(x, balance)

	print "evaluations/year %10d balance is %2.10f" %(evaluations_per_year, balance)

# calculate_interest.py-------------------

^[97] for n=53 (bigger than the 64-bit mantissa), then (1+1/2^n) == 1.0. You'll get the result that e=1.

^[98]

#remembering handy numbers
2^32=4G   (approximately)

so 
10^9=2^30 (approximately)

You'd need 30 iterations.

^[99]

Assuming a 52-bit mantissa, n=2⁵². You will then be calculating e=(1+1/n)ⁿ, where n=2⁵². You can only have 52 iterations (squarings).

^[100]

No. The heights for each interval are different; each height has to be calculated independantly.

^[101]

#! /usr/bin/python
# calculate_interest_logarithmic.py

# (C) Warren Buffett 2009, licensed under GPLv3
# calculate interest on $1 for 1yr, with interest = 100%/yr
# calculate interest at fractions of a year, updating balance each time.

initial_deposit=1.0
annual_interest=1.0

iterations=1
evaluations_per_year=2**iterations
print "evaluations/year %10d" %(evaluations_per_year)
x=1+annual_interest/evaluations_per_year
iteration = 1
print "iteration %3d balance=%2.20f" %(iteration, x * initial_deposit)

# calculate_interest.py-------------------

^[102]

^[103]

^[104]

#! /usr/bin/python
# e_by_factorial.py

#initialise numbers
e=1.0
factorial_n=1.0

#main()

factorial_n /= 1	#what is "/=" doing?
e+=factorial_n

print e

# e_by_factorial.py --------------------

^[105]

#! /usr/bin/python
# e_by_factorial.py

#initialise numbers
e=1.0
factorial_n=1.0
num_iterations=20

#main()

for x in range (1,num_iterations):
	factorial_n /= x
	e+=factorial_n
	print "%2.30f" %e

# e_by_factorial.py --------------------

^[106]

A 64 bit real has a 52 bit mantissa, which only supports 15 or 16 significant decimal digits. Beyond that, the numbers being added by the series, are too small for the precision of a 64 bit real, and appear to be 0.0. See machine's epsilon for the smallest number than can be differentiated from 1.0 with a 64-bit real (the number is 1.11022302463e-16).

^[107]

256/2=128. A triangle has half the area of the corresponding rectangle/square. You can see this by making a copy of the triangle, rotating it 180° in the plane of the paper and joining it along the hypoteneuse to the first triangle, when you'll make a square (or rectangle). If you agree with this answer, you've forgotten the ???. This answer is close, but not exact.

^[108]

The area of the rectangle is 16*17=272. The number of operations is half the area of the rectangle, i.e. 136.

^[109]

The sum of the elements is half the area of the rectangle. The base (width) of the rectangle is the number of elements. The height is the sum of the value for the first element + the value of the last element. The sum then is

sum = n(first + last)/2

^[110]

The area of the object shown with "*" is 16*(16+3)/2=152.

^[111]

The number of bits in the mantissa that could be wrong is between 7 (would need 128 operations that were rounded) and 8 bits (256 operations that were rounded). The exact calculation of the number of operations doesn't give results that any different to any of the back of the envelope calculations - either the number 256=16*16 or the fencepost error value of 128. All give 7-8 bits of error (2-3 decimal digits).

^[112]

Take the last digit of each number.

1+2..+9=45

You need to do this addition 10 times (for the numbers 0x, 1x..9x) giving a sum of 450. Then you take the first digit of each number

1+2..+9=45

The first digit of each number represents a 10, so multiply this result by 10. You have to add the first digit 10 times, giving the sum of the first digits as 4500. Add your two numbers

450+4500=4950

then add 100

4950+100=5050

^[113]

100

^[114]

101*100

^[115]

101*100/2=5050 (You halve the previous result because the arithmetic progression was added in twice)

^[116]

Napoleon has nothing to do with the size of the earth, just the size of the units that the earth is measured in.

Napoleon mandated that France adopt the metric system. But first they had to devise a metric system. The m was defined as the 10,000th part of the length of the meridian, from the pole to the equator through Paris. This defines the circumference of the earth as 40,000km. Unfortunately the initial estimates weren't accurate, and as well the earth isn't spherical (it's a geoid, an oblate sphere, i.e. it's flattened at the poles due to the earth's rotation). The actual circumference is 40,075km at the equator and 40,008km through the poles. However 40,000km is close enough for most purposes.

^[117]

Number of layers                        = 146/0.5 = 292
Number of casing stones in bottom layer = 230/1   = 230
Number of casing stones in top layer    = 1

casing stones on one face  = n(first + last)/2
	                   = 292(230+1)/2
	                   = 33726

casing stones on all faces = 134904

^[118]

44 (remember the fencepost problem)

^[119]

Number of layers                  = 22
Number of windows in bottom layer = 58
Number of windows in top layer    = 8

windows on one face               = n(first + last)/2
	                          = 22*(58+8)/2
	                          =
pip:~# echo "22*(58+8)/2" | bc -l
726

Number in 4 faces
pip:~# echo "22*(58+8)/2*4" | bc -l
2904

^[120] Computer joke: but first, some background info

Not only can functions return a number, but main() in C (and now most languages) also returns a number to the operating system, when the program exits, allowing the operating system to handle execution failures. The number returned on success is 0. The numbers -128..127 are returned for different types of failures.
The Roman Empire lasted for about 400 years (computer programmers have to know this sort of stuff) (see Roman Empire http://en.wikipedia.org/wiki/Roman_Empire), and there is much debate about why it collapsed.

OK the joke: the Roman Empire collapsed because it didn't have a "0" to signal the successful completion of their C programs.

^[121]

(1,1)

^[122]

In quick succession the following occured

Kepler in 1605 discovered the laws that governed the motions of planets Kepler's Laws of Planetary Motion (http://en.wikipedia.org/wiki/Kepler's_laws_of_planetary_motion) (the planets move in ellipses about the sun at one of the focii). (Previously planets were called heavenly bodies, since they were associated with God. The words for sky, heaven and god were all related. e.g. The name "Jupiter" comes from Ju (dieu) and Pater (father). Jupiter was the "sky father".)
Galileo (1609) built a telescope (based on previous work by a Dutch spectacle maker Lippershey) allowing the exploration of the night sky.
Isaac Newton (http://en.wikipedia.org/wiki/Isaac_Newton) (and Liebnitz) discovered calculus (late 1600s). Newton determined that gravity varied as the inverse square of the distance and that an inverse square field would produce the elliptical orbits. (It didn't take long to figure out that the inverse square relationship was dependant on a 3-D universe. Although initially the math appeared difficult, the fact that the inverse square appeared in lots of places, gave people plenty of practice at using it and eventually everyone got used to it.)

The sudden arrival of a host of new moving objects: Uranus, Neptune and the asteroids, meant that people could predict where they would be at future times. Previously you had to wait for word from God on this, but now with a table of logarithms, you could bypass this requirement.

The discovery of Uranus by Herschel, was just luck (and Herschel being the first kid on the block with a telescope). The discovery of Neptune came from tracking perturbations in Uranus's orbit (see Mathematical discovery of planets http://www-groups.dcs.st-and.ac.uk/~history/HistTopics/Neptune_and_Pluto.html). Uranus had been tracked for almost 1 complete orbit, since its discovery by Herschel in 1781, when by 1845, separately Adams (in England) and Le Verrier (in France) (http://en.wikipedia.org/wiki/Urbain_Le_Verrier) determined the same location for a perturbing body. Neither Adams nor Le Verrier knew of each other's work. The results of Le Verrier's calculations were announced before the Frence Academy, in the hopes of someone finding the new planet. Adams calculations were known by some, but not publically circulated till after the discovery of Neptune. Remarkably, no-one acted on the predictions, even Airy, the Astronomer Royal in England, who was in possession of both sets of calculations showing identical results. I have not been able to find the amount of time that Le Verrier spent with tables of logarithms, but it would appear Adams (http://en.wikipedia.org/wiki/John_Couch_Adams) spend from 1843-5 on it.

You may wonder at such universal inaction on the part of observers, in the face of overwhelming evidence of a great discovery waiting to be made, but you must understand that no-one had predicted the discovery of a new planet before, and you quite justifiably didn't want to go to all the trouble of wasting an evening looking through your telescope, when you had more productive things to do. Clearly the prudent thing to do was to first let other people make a chain of successful predictions and discoveries, before expending any of your valuable time on such nonsense yourself. Eventually Galle (France) looked and after 30 mins of searching, found the planet at magnitude 8 (easily within reach of telescopes of the era), exactly where it had been predicted. On looking back through records, it was found that many people had observed Neptune, including Galileo in 1613, 200 yrs earlier, and all had observed it to move, but had not recognised it as a planet.

People weren't so reticent the next time and based on discrepencies in the orbit of both Uranus and Neptune, in January 1929 Clyde Tombaugh found Pluto. This was hailed as another success for rational thought, and was touted as such at least till my childhood (1950s). There was a minor problem: Pluto was only 1/300th the mass of the expected object. It wasn't till recently (1990's or so) that modern computers, recomputed the original data, to find that the descrepencies in the orbits of Uranus and Neptune were due to rounding errors in the original calculations. There never had been a perturbing planet. There just happens to be a lot of dim small stuff out there, which no-one knew at the time, and if you look long enough you'll find some of it. Tombaugh had not discovered Pluto by rational means; he was lucky and his extensive search was rewarded by success. This aspect of the discovery was kept quiet (hard work is not as dramatic as discovery by inspiration).

Another aspect of the discovery was kept quiet: Pluto wasn't a planet. Pluto doesn't have any of the properties of the other planets: its orbit is inclined to the plane of the ecliptic (the plane in which the planets orbit the sun) and is highly elliptical, with Pluto's orbit going inside that of Neptune for some part of its orbit. There 40 other Pluto like objects discovered so far, in inclined and highly elliptical orbits, some of which are larger or brighter (e.g. Xena) than Pluto. Pluto clearly was formed by a different mechanism to the 8 planets, all of which have essentially circular orbits and all of which orbit in the same plane. It became clear that Pluto has more affinity to these objects than it does to the planets. In 2006 Pluto was demoted from planethood and the new classification of dwarf planets was created for these objects.

While it had now been clearly established that you could discover planets by mathematics, the greater lesson that mathematics would predict other great discoveries, was still difficult to grasp. In the middle of the 20th century, the great problem in cosmology was whether the universe was in a steady state and had existed forever at its current temperature, or whether it had begun as a fireball at billions of degrees (the "big bang") 13.7Gya had been expanding (and cooling) ever since. You'd think it would be easy to differentiate these two situations, but the best minds spent half of the century on it without success. George Gamow (http://en.wikipedia.org/wiki/George_Gamow) in 1948 provided the clue. He calculated that if the universe had started off as a fireball, the temperature, following expansion (and subsequent cooling) to its current size, would be 5°K and would be observable as an isotropic background of microwave radiation. Gamow's prediction allowed anyone (even high school students) to calculate the size of antenna you would need to detect this background radiation, knowing the noise temperature of your microwave receiver. The microwave equipment of the era was too noisy to contemplate looking for the signal, but microwave electronics had only just been invented (WWII, for radar) and you could expect rapid progress in the development of microwave receivers. You could plan for the day when low noise amplifiers would be available. A back of the envelope calculation would let you know if you were going to have to wait 1,10 or 100yrs. You also knew how much longer you'd have to wait if Gamow's calculation was out by a factor of (say) 10. (It turns out that the improvements needed to find the radiation were small. The numbers are all well known by people in the field, but I couldn't find them with google.) Remarkably no-one looked for the radiation or even planned to. With the mechanism to solve the cosmic problem of the century in hand, you'd think everyone would at least keep the idea in the back of their minds (or go spend the intervening time improving microwave equipment). At least you'd keep track of other people's developments in microwave amplifiers.

As it turned out the microwave version of the low noise parametric amplifier (http://en.wikipedia.org/wiki/Parametric_oscillator) was in production shortly thereafter for satellite communications and was being built at home by ham radio operators by the 1960s. The maser (noise temperature 40°K in 1958, see http://ieeexplore.ieee.org/Xplore/login.jsp?url=/iel6/22/24842/01124540.pdf?arnumber=1124540) was being used at the front end of satellite ground stations. Meanwhile huge antennas were being built for radioastronomy and for satellite communication.

In 1962 Telstar (http://en.wikipedia.org/wiki/Telstar) was launched. It relayed phone signals between North America and Europe. Telstar was a big deal to me growing up in Australia. It was the first satellite that wasn't part of the cold war, or designed for bragging rights in the war between the two thugs on the block, USA and Russia, competing to win the hearts and minds of the world's population. Telstar's job as far as I knew was to relay phone calls between members of a family separated by a large ocean (travel was slower and more expensive back then). (It was also used to send TV images of sports programs and data like stock prices, but I didn't know that.) Unfortunately Telstar didn't last long; it was fried by radiation from atomic testing in space, by both the American and Russian bombs. Telstar had an energy budget of 14W and required huge antennas on the ground to receive the signal; in Andover Maine, Goonhilly Downs Satellite Earth Station (http://en.wikipedia.org/wiki/Goonhilly_Satellite_Earth_Station) (go look at the photos of Arthur) and the quietest microwave receivers that money could buy. (Nowadays geostationary satellites are the size and weight of a bus, have 10kW of power and can deliver TV signals into homes which have an antenna the size of a dinner plate.) The launch of Telestar and photos of the ground station antennas were blazed across the TV for months. There was even a tune by Joe Meek, played by the Tornados about Telstar, (http://en.wikipedia.org/wiki/Testar_(song)) which made it to #1 in both Britain and the US, and later a song ("Magic Star"). It's hard to imagine how any of the people looking for Gamow's background radiation, on seeing these satellite earth station antennas every night on TV in their living room, didn't think "I really ought to get time on one of these antennas".

In 1965, several generations of microwave equipment after Gamow's prediction, a pair of engineers Penzias and Wilson (http://en.wikipedia.org/wiki/Discovery_of_cosmic_microwave_background_radiation) at Bell Labs in Holmdel, NJ, did thorough and exhaustive calibrations on the horn antenna (http://en.wikipedia.org/wiki/Horn_Antenna) left over from the 1960 Echo Satellite project (http://en.wikipedia.org/wiki/Echo_satellite). The antenna was no longer needed and Bell was hoping trying to find some new use for it (Bell's intention was to save money rather than make a great discovery). Whether Penzias and Wilson were being sent off on what management thought a dead end project, I have not been able to determine. Penzias and Wilson intended to use the antenna for radioastronomy and satellite communication experiments. Penzias and Wilson found that the noise temperature of their setup was 15°K, but they could only account for 12.3°K by dissambling the setup and measuring each part separately. Most people at that stage would have said "that's near enough" and moved on, but Penzias and Wilson were top engineers and apparently not bothered by management, spent months trying to account for the excess noise.

The weak link in their calculations was that the noise temperature of the antenna could not be measured; it had to be calculated. While it's easy to model a perfectly smooth, accurate, infinite sized antenna, a real, finite, irregular (joins, bumps from screws) antenna with an inaccurate surface is difficult to model. The antenna surface was covered in white dielectic material (pigeon droppings) (see Penzias http://en.wikipedia.org/wiki/Arno_Allan_Penzias), which was removed (and the pigeons shot), without effect on the noise. The calculated antenna noise was close to the amount of excess noise. A small miscalculation about the antenna and there would be no excess noise (just as there was no planet perturbing Uranus and Neptune). Not having any other ideas, they reluctantly decided that the noise must be coming from the sky. It wasn't coming from anything localised, like the Milky Way, or nearby New York City; instead it came uniformly from all directions. Penzias and Wilson had accidentally discovered the cosmic microwave background (CMB) radiation, at 2.7°K. (The amount of noise is now agreed to be 4°K. Presumably Penzias and Wilson had overestimated the amount of noise due to the antenna by 1.3°K.)

Since Penzias and Wilson weren't looking for the CMB, they had to ask around to confirm what they were listening to. The academic world was astonished to find that the microwave background was strong enough that it was found by people who weren't even looking for it. The discovery was quickly confirmed by the many sites in the world already capable of detecting the noise, similtaneously demolishing the Steady State theory, confirming the Big Bang theory and winning a Nobel Prize for Penzias and Wilson. Some of the esteemed researchers in the field, who'd spent the last 50yrs trying to figure out the problem, pooh-poohed the discovery as trivial and certainly unworthy of a Nobel Prize. (Talk about a bunch of sore losers.) In particular, Penzias and Wilson were regarded as just a pair of technicians who'd only read out the dials of the equipment they'd been given. They were actually engineers, which made it worse. Academics are used to writing the grants, then telling engineers to build and run the equipment, do the experiments and hand over the results, so that the academic can write the papers and get the glory. (For a description of petty, venal and arrogant academics terrified of their technicians beating them to a discovery, it's hard to beat "The Hubble Wars: Astrophysics meets Astropolitics in the Two-Billion-Dollar Struggle over the Hubble Space Telescope" Eric J. Chaisson, 1994 Harper Collins, ISBN 0-06-017114-6.)

The detractors hadn't learned the lesson, that if you're going to use a piece of equipment, you have to know what it does, before you use it. Plenty of people had had the opportunity to search for the radiation themselves, but hadn't; plenty of people had heard the radiation, but had ignored it; plenty of people had had not properly calibrated their equipment, and didn't realise the radiation was there. The detractors were, in fact, loudly proclaiming to the world their lack of understanding of what's required for a great discovery. What greater discovery can there be, than seeing something when you've been told there is nothing, while all those around you, who have been told what to look for, see nothing (the reverse of the Emporer's new clothes).

There were however lots of people who recognised the great discovery. I remember being a first year undergraduate at the time. I heard about it when the Physics lecturer burst through the door and instead of giving the lecture for the day, told us about the discovery, drawing on the blackboard, a graph of the energy of the microwave background radiation as a function of frequency, and the points corresponding to the energy found by the multiple sites. It was the only time in my student life that something, from the outside world, was important enough to bump a scheduled lecture. Unfortunately the significance of the matter was largely lost on me at the time. All I knew was that something important had happened. It wasn't till sometime later that I understood where the radiation came from and why it should have that temperature. While it is often said of people of my generation, that we remember where we were when we first heard about the assassination of Jack Kennedy, I remember where I was when I heard about Penzias and Wilson's discovery of the cosmic background radiation.

^[123]

#! /usr/bin/python
#compound_interest.py

principal=0
interest_rate=0.05
annual_deposit=10000
age_start=18
age_retirement=65

principal+=annual_deposit
interest=principal*interest_rate
principal+=interest

print "principal %10.2f" %principal

^[124]

#! /usr/bin/python
#compound_interest.py

principal=0
interest_rate=0.05
annual_deposit=10000
age_start=18
age_retirement=65

for age in range(age_start+1,age_retirement+1):
	principal+=annual_deposit
	interest=principal*interest_rate
	principal+=interest

	print "age %2d principal %10.2f" %(age, principal)

# compound_interest.py------------------------

^[125]

$100k

^[126]

$370,000, $1,840,000

^[127]

Till you're 30.

^[128]

interest/month=principal*annual_interest/12

# echo "23750*6/(100*12.0)" | bc -l
1187.50

^[129]

payment = interest + pay-off on principal

1187.50+100.00=1287.50

^[130]

less: The 2nd month's payment is $6.44 less than the 1st month's payment, because your principal was reduced by the $100 payment in the first month.

1st month:
principal at start          =237500.00
interest on principal at 6% =           1187.00
principal                   =            100.00
1st payment (end month)     =           1287.50

2nd month:
principal at start 2nd month=236212.50 
interest on principal at 6% =           1181.06
payment of $100.00 principal=            100.00
2nd payment (end month)     =           1281.06

^[131]

The monthly payment is the sum of the interest that's paid in the first month and the initial principal payment.

^[132]

The principal paid out each month is the difference between the monthly payment and the interest calculated for that period.

^[133]

The number of payment periods * the monthly payment.

^[134]

#!/usr/bin/python

#mortgage_calculator.py
#Abraham Levitt (C) 2008, licensed under GPL v3

initial_principal = 250000.0    #initial value, will decrease with time
principal_remaining = initial_principal 
principal_payment = 0.0

interest  = 0.06                #annual rate
interest_payment = 0.0		#monthly interest
total_interest_paid = 0.0	#just to keep track
	                        #(in the US, if you itemise deductions, 
	                        #the interest on a mortgage for your primary residence is tax deductable)

initial_principal_payment = 100.00        #This is the initial value of the monthly principal payment
time_period = 0

#monthly payment is fixed throughout mortgage
monthly_payment = initial_principal*interest/12.0 + initial_principal_payment

#print column headers
print " time    princ. remaining     monthly_payment   principal payment    interest payment      total int paid"

#print balances at the beginning of the mortgage
print "%5d %19.2f %19.2f %19.2f %19.2f %19.2f" %(time_period, principal_remaining, monthly_payment, principal_payment, interest_payment, total_interest_paid)


#calculate for first month
interest_payment = principal_remaining*interest /12.0
total_interest_paid += interest_payment
principal_payment = monthly_payment - interest_payment
principal_remaining -= principal_payment
time_period += 1

print "%5d %19.2f %19.2f %19.2f %19.2f %19.2f" %(time_period, principal_remaining, monthly_payment, initial_principal_payment, interest_payment, total_interest_paid)

# mortgage_calculation.py ------------------

^[135]

#!/usr/bin/python

#mortgage_calculator.py
#Abraham Levitt (C) 2008, licensed under GPL v3

initial_principal = 250000.0    #initial value, will decrease with time
principal_remaining = initial_principal 
principal_payment = 0.0

interest  = 0.06                #annual rate
interest_payment = 0.0		#monthly interest
total_interest_paid = 0.0	#just to keep track
	                        #(in the US, if you itemise deductions, 
	                        #the interest on a mortgage for your primary residence is tax deductable)

#initial_principal_payment = 250.00        #This is the initial value of the monthly principal payment
initial_principal_payment = 248.876       #This is the initial value of the monthly principal payment
time_period = 0

#monthly payment is fixed throughout mortgage
monthly_payment = initial_principal*interest/12.0 + initial_principal_payment

#results:
#6%, 250$ initial payment, 1500$ monthly payment, 288k$ interest

#print variables
print " initial_principal_payment %19.2f annual_interest %2.4f" %(initial_principal_payment, interest)
#print column headers
print " time    princ. remaining     monthly_payment   principal payment    interest payment      total int paid       total payment"

#print balances at the beginning of the mortgage
print "%5d %19.2f %19.2f %19.2f %19.2f %19.2f %19.2f" %(time_period, principal_remaining, monthly_payment, principal_payment, interest_payment, total_interest_paid, total_interest_paid + initial_principal - principal_remaining)


#calculate for all months
while (principal_remaining >= 0.0):
	interest_payment = principal_remaining*interest /12.0
	total_interest_paid += interest_payment
	principal_payment = monthly_payment - interest_payment
	principal_remaining -= principal_payment
	time_period += 1
	if (time_period %12 == 0):
		print "%5d %19.2f %19.2f %19.2f %19.2f %19.2f %19.2f" %(time_period, principal_remaining, monthly_payment, principal_payment, interest_payment, total_interest_paid,  total_interest_paid + initial_principal - principal_remaining)



#print stats 
print "interest %2.4f, monthly_payment %5.2f, initial_principal_payment %5.2f, ratio (total cost)/(house price) %2.4f" %(interest, monthly_payment, initial_principal_payment, (total_interest_paid + initial_principal - principal_remaining)/initial_principal)

# mortgage_calculation_2.py ------------------

^[136]

540k$=250k$ principal + 290k$ interest. The mortgage doubles the price of the house (approximately).

^[137]

About 20yrs. Most of your initial payments are interest. In the last 10yrs most of your payments are principal.

^[138]

You own about 20k$ of your 250k$ house. You've paid 86k$ of interest in the same time.

^[139]

at 6% interest, after 6yrs: equity = 250-228k$=22k$

at 6% interest, after 6yrs: interest = 86k$

^[140]

0.5%*160k$=$800/yr. $800 is a nice amount of money to save a year.

	Note
	As you pay off the loan and your principal decreases, your savings will decrease. However it will be a while before you make a significant dent in the principal.

^[141]

$3200/$800=4yrs.

^[142]

$3200/$30≅9yrs

^[143]

dennis:# echo "3*1.0525^25" | bc -l
10.78136793454438844136

^[144]

dennis:# echo "3*1.04^25" | bc -l
7.99750899446225997912

^[145]

If you halve the size of the transistor, then you can fit 4 times as many transistors into the same area. The number of transistors scales with O(n²).

^[146]

#! /usr/bin/python

#moores_law_pi.py
#Gordon Moore (C) 1965
#calculates the number of years needed to calculate PI, 
#if the speed of the computers increases with Moore's Law

year = 10**7.5  		#secs
doubling_time = year
time = 0
iterations_required = 10**100   #iterations
iterations_done = 0     	#iterations
initial_computer_speed = 10**9  #iterations/sec 
				#This number is actually clock speed. 
				#iterations/sec will be down by 10- 100 fold, 
				#assuming 10-100 clock cycles/iteration. 
				#This is close enough for the answer here.
computer_speed = 0  		#iterations/sec 
	 
#setup variables for the first year

computer_speed = initial_computer_speed
time = year

#calculate the iterations for the year
iterations_done = computer_speed * year

#print the iterations done
print "elapsed time (yr) %5.0f, iterations done %e" %(time/year, iterations_done)

# moores_law_pi.py

^[147]

#! /usr/bin/python

#moores_law_pi.py
#Gordon Moore (C) 1965
#calculates the number of years needed to calculate PI, 
#if the speed of the computers increases with Moore's Law

year = 10**7.5  		#secs
doubling_time = year
time = 0
iterations_required = 10**100   #iterations
iterations_done = 0     	#iterations
initial_computer_speed = 10**9  #iterations/sec 
				#This number is actually clock speed. 
				#iterations/sec will be down by 10- 100 fold, 
				#assuming 10-100 clock cycles/iteration. 
				#This is close enough for the answer here.
computer_speed = 0  		#iterations/sec 
	 
#setup variables for the first year

computer_speed = initial_computer_speed

while (iterations_done < iterations_required):

	time += year
	#calculate the iterations for the year
	iterations_done += computer_speed * year

	#print the iterations done
	print "elapsed time (yr) %5.0f, iterations done %e" %(time/year, iterations_done)

	computer_speed *= 2

# moores_law_pi.py ------------------------------------

^[148]

at the end of the 277th year (the speed doubles every year, the first 277 years did half the job, the 278th year did the last half of the job).

^[149]

1 yr. (It would have been cheaper too, you wouldn't have had to buy 278 years worth of computers.)

^[150]

255 = 256-1

^[151]

The series is 1+2+4+..2³¹=2³¹+2³⁰+...+1. In binary, this is 32 1's. In hexadecimal it's ffffffffh. The sum (the number represented by 32 bits, all 1) is 2³²-1≅4G

^[152]

111₁₀

^[153]

^[154]

if 
a = b
c = d

then
c-a = d-b

^[155]

Sum=1*(2^8-1)/(2-1)=255

^[156]

Sum=1*(1-0.1^infinity)/(1-0.1) = 1/0.9 = 1.111

^[157]

Sum=1*(1-0.2^infinity)/(1-0.2) = 1/0.8 = 1.25

^[158]

Sum=a*(1-r^n)/(1-r)

for n=infinity, r<1
r^n=0

Thus 
Sum=a/(1-r)

since r!=1, this is a finite number.

^[159]

1+2+4+...2^63=2⁶⁴-1 (this just happens to be the largest unsigned integer that can be represented on a 64-bit computer)

^[160]

2⁶⁴-1 secs

(the age of the universe is 13.8Gyr)
pip:# echo "2^64/(60*60*24*365*13.8*10^9)" | bc -l
42.38713169239652409208 times the age of the universe.

^[161]

64 secs (there may be technical difficulties pouring the last few piles in 1sec).

^[162]

Bulk rice has air between the grains lowering its density. When you throw the rice into water, the water pushes away the air. I couldn't find the density of rice grains on the internet, but did find the relative density of wheat at 1.6 The Density of Wheat Starch Granules: A Tracer Dilution Procedure for Determining the Density of an Immiscible Dispersed Phase H. N. Dengate, D. W. Baruch, B.Sc., P. Meredith, B.S., Ph.D. Wheat Research Institute, D.S.I.R., P.O. Box 1489, Christchurch (New Zealand). Grains are mostly starch and should have similar densities, so we can assume individual rice grains have a R.D. (relative density) of 1.6.

^[163]

dennis:# echo "2^64*1.0/(60000*750)" | bc -l
409927646082.43448035555555555555


let's turn this into more usable units

dennis:# echo "2^64*1.0/(60000*750*10^12)" | bc -l
.40992764608243448035	#Tm^3 (Tera cubic metres)
=.41	#Tm^3 (Tera cubic metres)

^[164]

density of rice≅0.75ton/m³. Weight of rice = 0.3 Ttons

^[165]

tan()

^[166]

>>> tan(radians(26))
0.48773258856586144
>>> tan(radians(27))
0.50952544949442879
>>>

^[167]


volume=(4pi/3)*h^3
      =0.41 Tm^3

thus
h^3=0.41*(3.0/4*pi)Tm^3

>>> 0.41*(3.0/4*pi)
0.96603974097886136	

Let's take the cuberoot of 0.966 as 1.0

The cuberoot of 10^12 is 10^4

The height then is
h=1.0*10^4m 
 =10km

^[168]

10km

^[169]

10km. We have a pile of rice the size of Mt Everest.

^[170]

20km

^[171]

from Rice (http://unctad.org/infocomm/anglais/rice/market.htm), the world's production of rice in 2000 was 0.6G tons/year. The reward was 0.3 Ttons. It would take 0.3*10^12/0.6*10^9≅500yrs for the world today to produce the amount of rice needed for the reward. Back in the time of this story (I would guess about 1000yrs ago), the world's production of rice would have been less. How much less? There aren't good records of rice production, but a lot of attention has been put into estimating the world's population. As a first approximation, we'll assume that rice production and world population are closely linked. The History World's Population (http://www.xs4all.nl/~pduinker/Problemen/Wereldbevolking/index_eng.html), itself an exponential growth, was about 1/1000 th the current level. The world's rice production then would have been 1/1000th of today's level. The Grand Vizier's reward represented 0.5Myr of the world's rice production at the time. Compare this time to the amount of time since humans diverged from the apes (about 0.1-1Myr).

^[172]

sum = a(r^(n+1)-1)/(r-1)

a = 10^16.5, r=2
sum = 10^16.5*(2^(n+1)-1)

^[173]

10^16.5*(2^(n+1)-1) = 10^100
2^(n+1)-1=10^83.5

ignore the -1 (it's small compared to 10^83.5).

you now want to find "n" when

2^(n+1)=10^83.5

try 2^(n+1)/10^83.5 for a few values of n

pip:class_code# python
Python 2.4.4 (#2, Mar 30 2007, 16:26:42) 
[GCC 3.4.6] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> 2**278/10**83.5
1.5358146097473691

this gives n+1=278
n = 277

^[174]

No, but lots of really neat capital equipment, coffee makers, iPods (all of which will rust, wear out, break, or become obsolete and have no residual value and houses, which will probably increase in value) was bought. Where the 9M$ is used to buy equipment which has no residual value, the money will be repaid by hard work by the employees at the various businesses.

^[175]

Yes, 9M$ - out of thin air.

^[176]

Ben Franklin

^[177]

Fixed rate mortgages are currently about 6%. An ARM starts at a lower rate (about 4%), a great advantage over a fixed rate mortgage, to suck people in. However after about 2 yrs, the rate rises, eventually reaching about 10%, a level which few people can handle.The sub-prime people were promised that they would then qualify for a fixed rate mortgage. Whether this was a real promise at the time I don't know, but by the time the interest rate ramp hit, the bubble had broken and the mortgagees were left with negative equity in their houses (their houses were worth less than they'd committed to pay).

^[178]

700G$/300M=2300$

^[179]

^[180]

^[181]

No way in hell.

^[182]

exponential. Each machine infects n others, those n machines go on to infect n² others, eventually infecting nⁿ machines.

^[183]

The ice changes the aerofoil of the wings reducing lift so the plane can't fly, at the same time increasing the weight of the plane. An iced-up plane can't fly. This isn't a problem for modern aviation as today's planes can fly above the clouds.

^[184]

An occupant in a plane which is turning, unless they have an external reference e.g. the horizon or the modern gyroscopic Artificial Horizon (http://en.wikipedia.org/wiki/Attitude_indicator) can't tell that they are tilted over. The tipping of the wings to one direction, drives the occupant directly into the seat. The occupant has no sideways forces and in a bumpy ride can't tell that there's an extra 5% weight pressing into the seat. As a result a blinded pilot has no perception of the horizontal.

I'm not a pilot, but I can't imagine how Nungesser expected to make it across the Atlantic on a moonless night. This inability to vertically orient in a plance is a possible explanation for the death of John Kennedy, (http://www.airlinesafety.com/editorials/JFKJrCrash.htm), (the son of President Kennedy) the pilot of a plane which crashed into the sea on a night approach to Martha's Vineyard (see the section "Why did Kennedy Crash?").

^[185]

The pilot only knows his speed relative to the air. To tell groundspeed, he also needs to know the speed and direction of the air relative to the ground (or ocean).

Sailors, whose lives depend on understanding the wind and sea, have a the Beaufort Scale to determine windspeed (for landlubbers, there's also a scale for land). When I was in Cubs (US==cub scouts) (in the '50s), the Cub's den was decorated with rigging diagrams of ships, showing the names of the ropes and the sails, a board covered in the knots sailors used (more than we were every asked to know), and most importantly a Beaufort scale. While we all knew that we'd be unlikely to ever sail on one of these ships, they were an important part of the founding of Australia, and we knew that understanding these diagrams would help us understand the lives of people who made our country. You knew that lives were in peril at high Beaufort numbers. Have a look at the wiki photo for Beaufort 12. Few ships can handle more than a hour of that without breaking their backs. The wreck of the Edmund Fitzgerald (http://en.wikipedia.org/wiki/SS_Edmund_Fitzgerald), occured at Beaufort 10. It was originally thought that the boat had snapped in half, but it now seems likely that the hatches lost their watertightness, or the boat had earlier hit a shallow shoal, which wasn't noticed in the rough seas, causing damage to the hull.

^[186]

in_air*(1.0-1.0/MTBF)

^[187]

in_air*(1.0-interval/MTBF)

^[188]

#! /usr/bin/python

#lindbergh_one_engine.py
#
#Charles Lindbergh (C) 2008, released under GLP v3.
#calculates the probability that Lindbergh's one engine plane 
#will still be flying after a certain number of hours.

#these should be reals otherwise will get integer arithmetic.
MTBF=200.0
in_air=1.0
time=0.0
interval=1.0	#sample at 1hr intervals

in_air *= (1.0-interval/MTBF)
time += interval
print "time %2d in_air %2.5f" %(time, in_air)

# lindbergh_one_engine.py ------------------------------

^[189]

#! /usr/bin/python

#lindbergh_one_engine_2.py
#
#Charles Lindbergh (C) 2008, released under GLP v3.
#calculates the probability that Lindbergh's one engine plane 
#will still be flying after a certain number of hours.

#these should be reals otherwise will get integer arithmetic.
MTBF=200.0
in_air=1.0
time=0.0
fuel_time=38
interval=1.0	#sample at 1hr intervals

#print header
print "time in_air " 

for time in range(0,fuel_time+1): 
	print "%4d  %5.3f" %(time, in_air)
	in_air *= (1.0-interval/MTBF)

# lindbergh_one_engine_2.py ------------------------------

^[190]

flying for 33.5hrs, he had an 84% chance of making it.

^[191]

#! /usr/bin/python

#lindbergh_one_engine_3.py
#
#Charles Lindbergh (C) 2008, released under GLP v3.
#calculates the probability that Lindbergh's one engine plane 
#will still be flying after a certain number of hours.

#These are reals.
#Don't use integers or will get integer arithmetic.

MTBF=200.0
fuel_time=38.0

print "time in_air" 

interval=1.0	#update calculations at 1hr intervals
in_air  =1.0	#initial probability of engine running
time    =0.0	#time at start of flight

while (time <= fuel_time):
	#at start of an interval, print parameters
	print "%5.3f  %5.3f" %(time, in_air)

	#at end interval, update
	time += interval
	in_air *= (1.0-interval/MTBF)

# lindbergh_one_engine_3.py ------------------------------

^[192]

The units of the y axis is probability, a number (with value between 0.0 and 1.0). The units of the x-axis is hours. The units of the area under the graph is number*hours=hours.

^[193]

The only parameter measured in hours that's been input to the graph is the MTBF=200hrs. The area under the graph must bear some simple relationship to the MTBF, e.g. it could be exactly the MTBF or e*MTBF.

^[194]

#! /usr/bin/python

#lindbergh_one_engine_4.py
#
#Charles Lindbergh (C) 2008, released under GLP v3.
#calculates the probability that Lindbergh's one engine plane 
#will still be flying after a certain number of hours.

#These are reals.
#Don't use integers or will get integer arithmetic.

MTBF=200.0
fuel_time=2500.0

print "time in_air" 

interval=1.0	#update calculations at 1hr intervals
in_air  =1.0	#initial probability of engine running
time    =0.0	#time at start of flight
area	=0.0

while (time <= fuel_time):
	#at start of an interval, print parameters
	area += in_air*interval
	if (time%100 <= interval/2.0):	#why don't we do (time%100==0)	as we've always done?
		print "%5.3f  %5.3f  %5.3f" %(time, in_air, area)

	#at end interval, update
	time += interval
	in_air *= (1.0-interval/MTBF)

# lindbergh_one_engine_4.py ------------------------------

^[195]

The probability of the engine still working after a certain time is a continuous function. We've approximated it by a discontinuous function - a set of steps. Are the steps above or below the continuous function? The continuous function for the first hour goes from 1.0 to 0.995. The step function is 1.0 for the first hour. The step function then is an upper bound.

^[196]

From our work on determining the value of π, the lower bound for a monotonic function is (upper bound - the first slice). The first slice has an area of 1hr).

^[197]

It's between 199 and 200.

^[198]

#! /usr/bin/python

#lindbergh_two_engines.py
#
#Charles Lindbergh (C) 2008, released under GLP v3.
#calculates the probability that Lindbergh's two engine plane 
#will still be flying after a certain number of hours.

MTBF=100
fuel_time=38
in_air=1.0
#time=0
print "time in_air " 

for time in range(0,fuel_time+1): 
	print "%4d  %5.3f" %(time, in_air)
	in_air *= (1.0-1.0/MTBF)

# lindbergh_two_engine.py ------------------------------

^[199]

29%

	Note
	from Joe: you can skip this part (it shows that division by reals which are powers of 2, doesn't use right shifting).

	Note
	0!=1. (there is only 1 way of arranging 0 people in a photo with 0 people.)