Pitman MTH 135/ STA 104  Probability                              Week 3

                 Normal and Binomial Distributions

A new photolithography process is still in the experimental stage; about
70% of the chips made from silicon dies treated with this process work
properly.  Imagine that we are testing a series of these chips, and
let's assume independence.  In four independent trials, what is the
probability that exactly ONE of the four chips works?  Is it:

   (7/10) * (3/10) * (3/10) * (3/10) = 7*27/10^4 = 189/10000 = 0.0189,

about one in 53???  No--- it is four times bigger than that, because the
working chip could be any of chips # 1,2,3,4:

   4 * (0.70)^1 * (0.30)^3 = 0.0756.

What is the chance that exactly k chips will work in n independent
tries, for 0 <= k <= n, if a fraction p of all chips work?  The number X
of working chips has the Binomial distribution Bi(n,p):

   P[ X = k ] = (n:k) p^k q^(n-k),        where (as usual) q = 1-p.

Which k is most likely?  Successive probabilities have ratio

   P[ X = k+1 ] / P[ X = k ] = (n-k)p / (k+1)q,

which equals one exactly when (n+1) p = (k+1)  ==>  p = (k+1)/(n+1), so
e.g. if p = 7/10 and n=9 then k=6 and k=7 have the same probability,
about 0.2668.  Otherwise the unique maximum happens when

   np - q < k < np + p

--- roughly,  k \approx n*p.
===========
Even this maximum probability isn't very big....  P[ X = k ] maxes out
at a little below 1/sqrt(n) (more precise info is in the homework and
text!), so NO point has very high probability for large values of n.
More interesting is the SUM of the probabilities of extreme values---
say, for p = 0.70 and n=10,

   P[ X <= 5 ] = \sum _0 ^5 (10:k) (0.7)^k (0.3)^(10-k) = 0.1502683

Is "5 successes in 10 tries" an extreme result if 70% of subjects
improve??

 Normal Random Variables
*) Appendix 5, page 531:  Phi[z],  z = 0.00(.01)3.59,  P=0.5000...0.9998

1. The Normal Approximation to the Binomial Distribution

   0.30 |
   0.25 |                                     .
   0.20 |                                .    :    :
   0.15 |                                :    :    :
   0.10 |                           .    :    :    :    :
   0.05 |                      .    :    :    :    :    :    .
   0.00 |__,____,____;____;____;____;____;____;____;____;____;___
            0    1    2    3    4    5    6    7    8    9   10

These ALWAYS follow a "bell-shaped curve", approximately

   c * exp( -b*(z-a)^2 )

for some numbers a,b,c.  De Moivre and Laplace figured this out, and
figured out a,b,c, and figured out how to use it.
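The Binomial numbers above are easy to check by machine.  The notes quote R calls like pbinom; here is a standard-library Python sketch of the same calculations (binom_pmf is our own helper, not a library function):

```python
from math import comb

def binom_pmf(k, n, p):
    """P[X = k] for X ~ Bi(n, p): (n:k) p^k q^(n-k)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Exactly one of four chips works (p = 0.70): 4 arrangements
print(4 * 0.70 * 0.30**3)                            # about 0.0756

# Tied modes when (n+1)p is an integer: p = 7/10, n = 9 gives k = 6, 7
print(binom_pmf(6, 9, 0.7), binom_pmf(7, 9, 0.7))    # both about 0.2668

# Lower-tail sum for p = 0.70, n = 10
print(sum(binom_pmf(k, 10, 0.7) for k in range(6)))  # about 0.1503
```

The same three numbers appear in the text above; in R the last line would be pbinom(5, 10, 0.7).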
It's called the Normal Distribution, and in their honor the Normal
approximation to the Binomial is called the "DeMoivre-Laplace Limit
Theorem".  It is the first example of a more general result called the
"Central Limit Theorem" (abbrv. CLT) that we'll encounter in a week or
two.

One can use the successive-terms ratio P[X=k+1]/P[X=k] = (n-k)p/(k+1)q
as a starting-point to prove the DeMoivre-Laplace result.
APPROXIMATELY, the Binomial pmf P(x) has a maximum at np, and satisfies

  log P(np + z) = log P(np) + \sum_{k=np}^{np+z-1} log (n-k)p / (k+1)q

                = log P(np) + \sum_{j=0}^{z-1} log (npq - jp)/(npq + (j+1)q)

                ~ log P(np) + \sum_{j=0}^{z-1} -j/npq

                ~ log P(np) - z^2/2npq

since log(1+s) = s + o(s); here s = (npq - jp)/(npq + (j+1)q) - 1.  Hence

   P(x) ~ c * exp( -(x-np)^2 / 2npq ).

This is the normal density function with mu=np and sigma^2 = npq.
Pitman's section 2.3 gives a more detailed derivation.
================================
Show how to get limits for Normal approx'n, with change-of-variables
================================
Roll a fair die 500 times---- what's the probability of at least 100 aces?

A1: \sum _{x=100}^{500} (500:x) (1/6)^x (5/6)^(500-x) = 0.0282871
      = 1-pbinom(99,500,1/6) = pbinom(400,500,5/6)

A2: mu = n*p = 500/6 = 83.33;   sig^2 = n*p*(1-p) = 8.333^2

      ( 99.5 - 83.33) / 8.333 =  1.94
      (500.5 - 83.33) / 8.333 = 50.06   (effectively infinity)

    P ~ Phi(50.06) - Phi(1.94) = 1 - Phi(1.94) = pnorm(-1.94) = 0.02618984

    nb: without the continuity correction, z = (100 - 83.33)/8.333 = 2.00
    gives pnorm(-2) = 0.02275013 --- that error is 2.6 times bigger!
================================
Normal approx has mean mu = 10*0.7 = 7 and sdev sqrt(10*0.7*0.3) = 1.44914

P[ <= 5 Successes ] = pbinom(5,10,0.7)
   = 0.0000059049 + 0.0001377810 + 0.0014467005
   + 0.0090016920 + 0.0367569090 + 0.1029193452  = 0.1502683326,

or approximately

   Phi( (5.5 - 7.0) / 1.44914 ) = pnorm(5.5, 7.0, 1.44914)
   = Phi( -1.0351 ) = 1 - Phi(1.0351)           [ <-- for tables ]
   = 1 - (.8485+.8508)/2 = 1 - 0.8496 = 0.1504  (not bad!)

EXAMPLE:  What is the chance of 50 Heads in 100 tosses of a fair coin?
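Both Normal-approximation answers can be reproduced without tables.  A standard-library Python sketch (the notes use R's pnorm; here Phi is built from math.erf):

```python
from math import erf, sqrt

def Phi(z):
    """Standard Normal cdf: Phi(z) = P[Z <= z]."""
    return 0.5 * (1 + erf(z / sqrt(2)))

# Die example: P[at least 100 aces in 500 rolls], continuity-corrected
mu, sig = 500 / 6, sqrt(500 * (1/6) * (5/6))
print(1 - Phi((99.5 - mu) / sig))     # about 0.0262 (exact: 0.0282871)

# Bi(10, 0.7): P[at most 5 successes], continuity-corrected
print(Phi((5.5 - 7.0) / sqrt(10 * 0.7 * 0.3)))   # about 0.1504
```

In R these are pnorm(-1.94) and pnorm(5.5, 7.0, 1.44914) respectively.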
   (100:50) (0.50)^50 (0.50)^50  =  12611418068195524166851562157
                                    ------------------------------  = 0.0795892...
                                    158456325028528675187087900672

OR, APPROXIMATELY,

   = P[ (49.5 - 50)/sqrt(100*.5*.5) < (X-mu)/sqrt(Var) < (50.5 - 50)/sqrt(100*.5*.5) ]

   = P[ -0.1 < Z < +0.1 ]  approx  =  2*(0.5398 - 0.5) = 2*(0.0398) = 0.0796

=============================================================================
Another way to see DeMoivre-Laplace:  Stirling's approximation to the
factorial function is

   n! ~ sqrt(2*pi*n) n^n exp(-n)

(the ratio of n! to Stirling's approximation is exp(theta/(12 n)) for
some n-dependent number 0 < theta < 1, so the approx'n is very good for
big n).  SO,

             sqrt(2*pi*n) n^n exp(-n)  p^k q^(n-k)
   P(k) ~ --------------------------------------------------------------------
          sqrt(2*pi*k) k^k exp(-k)  sqrt(2*pi*(n-k)) (n-k)^(n-k) exp(-n+k)

        = sqrt[ n / (2*pi*k*(n-k)) ] * [n*p/k]^k * [n*q/(n-k)]^(n-k)

Set x = k-np, so k = np+x and n-k = nq-x; then

   P(np+x) ~ 1/sqrt[2*pi*n*p*q] * [np/(np+x)]^(np+x) * [nq/(nq-x)]^(nq-x)

           = const * (1+x/np)^(-np) * (1-x/nq)^(-nq) * [(1+x/np)/(1-x/nq)]^(-x)

Taking logs of all three factors, and using the approximation
log(1+s) = s - s^2/2 + o(s^2), the last factor is NOT negligible---
it contributes -x^2/npq:

   log P(np+x) ~ c - np log(1+x/np) - nq log(1-x/nq)
                   - x [ log(1+x/np) - log(1-x/nq) ]

               ~ c - np [ x/np - x^2/2(np)^2 ] - nq [ -x/nq - x^2/2(nq)^2 ]
                   - x [ x/np + x/nq ]

               = c - x + x^2/2np + x + x^2/2nq - x^2/np - x^2/nq

               = c - x^2/2np - x^2/2nq  =  c - x^2/2npq

Thus, P(k) ~ 1/sqrt(2*pi*n*p*q) exp(- (k-np)^2 /2npq ), exactly
DeMoivre & Laplace's result.
================================================================
Notice that a Bi(n,p) random variable can be viewed as the sum

   S_n = I_1 + I_2 + ... + I_n

of n independent "Bernoulli" random variables, indicator variables equal
to one or zero with probabilities p or q=1-p, respectively.
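The fair-coin numbers and the quality of Stirling's formula are both easy to verify; a standard-library Python sketch (the n = 20 Stirling check is our own illustrative choice, not from the notes):

```python
from math import comb, exp, factorial, pi, sqrt

# Exact: P[50 heads in 100 tosses] = (100:50) / 2^100
exact = comb(100, 50) / 2**100
print(exact)                                             # about 0.0795892

# DeMoivre-Laplace at the center k = np:
# P(k) ~ exp(-(k-np)^2 / 2npq) / sqrt(2*pi*npq)
n, p = 100, 0.5
npq = n * p * (1 - p)
print(exp(-(50 - n*p)**2 / (2*npq)) / sqrt(2*pi*npq))    # about 0.0797885

# Stirling: ratio n! / [sqrt(2*pi*n) n^n exp(-n)] is exp(theta/12n)
n = 20
print(factorial(n) / (sqrt(2*pi*n) * n**n * exp(-n)))    # about 1.0042
```

The Normal value 0.0798 agrees with the exact 0.0796 to about 0.3%, and already at n = 20 the Stirling ratio is within half a percent of one.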
Each I_k has mean p and variance pq, so the sum S_n has mean np and
variance npq; if we standardize it,

                 S_n - np
        Z_n = -------------
              sqrt( n p q )

has (by the DeMoivre-Laplace limit theorem) approximately a standard
Normal No(0,1) distribution.  An amazing result we'll see more about
later is the "central limit theorem", which asserts that if
S_n = X_1 + ... + X_n for independent random variables X_k with ANY
probability distribution that has a finite mean mu and variance sigma^2,
the standardized quantity

                  S_n - n mu
        Z_n = -----------------
              sqrt( n sigma^2 )

has approximately a No(0,1) distribution for large n.  What does this
say for Poisson random variables X_n?  Geometric?  Cauchy?  Gamma?
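A quick Monte Carlo sketch of this standardization, in standard-library Python (the sample sizes, seed, and cutoff z = 1 are arbitrary illustrative choices, not from the notes):

```python
import random
from math import erf, sqrt

random.seed(1)

def Phi(z):
    """Standard Normal cdf."""
    return 0.5 * (1 + erf(z / sqrt(2)))

# Simulate S_n = I_1 + ... + I_n for Bernoulli(p) indicators,
# standardize, and compare P[Z_n <= 1] with Phi(1) ~ 0.8413.
n, p, trials = 100, 0.7, 10000
q = 1 - p
hits = 0
for _ in range(trials):
    s = sum(random.random() < p for _ in range(n))   # one Bi(n,p) draw
    z = (s - n * p) / sqrt(n * p * q)                # standardized
    hits += (z <= 1)
print(hits / trials, Phi(1))     # simulated tail vs Normal value
```

The simulated fraction lands near 0.84 (the discreteness of S_n keeps it a bit below Phi(1) at this n), illustrating DeMoivre-Laplace as a special case of the CLT.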