Continuous Random Variables
===========================

1. Introduction

   Call X *continuous* if P[X in A] = int{ f(x) dx : x in A } for some
   function f(x) >= 0, called the DENSITY (pdf, for probability density
   function).  Notice that:

     a) 0 <= f(x) for all x
     b) int{ f(x) dx } = 1
     *) F(b) = P[X <= b] = int{ f(x) dx : -infinity < x < b }
        -->  f(x) = F'(x)
     *) f(x) CAN be bigger than 1... can even be infinite!
        Example: X = U^2, with U uniform on (0,1)
                 --> F(x) = P[U^2 <= x] = P[U <= sqrt(x)] = sqrt(x), 0 < x < 1
                 --> f(x) = 1/(2*sqrt(x)), 0 < x < 1   (blows up near x = 0)

   Example: Lifetime of a radio tube:  f(x) = 100/x^2, x > 100;  0, x <= 100.

   PROBLEM:
     a) P[X < 150] = int{ 100/x^2 dx : 100 < x < 150 } = 1 - 100/150 = 1/3
     b) P[ 2 of 5 such tubes fail in 150 hrs ]
          = C(5,2) * (1/3)^2 * (2/3)^3 = 80/243

2. Expectation & Variance of a Continuous Random Variable

   E[g(X)] = sum{ g(x) P[X=x] }     for discrete variables
           = int{ g(x) f(x) dx }    for continuous variables

   In particular, the mean and variance are:

     mu  = E[X] = int{ x f(x) dx }   (if it exists!)
   and
     var = V[X] = int{ (x-mu)^2 f(x) dx } = int{ x^2 f(x) dx } - mu^2

3. The Uniform Random Variable

   Uniform on the range a < X < b:

              / 1/(b-a)   if a < x < b
       f(x) = |
              \ 0         if x <= a or x >= b

   CDF:   F(x) = 0 for x < a;   (x-a)/(b-a) for a <= x <= b;   1 for x > b.
   Mean:  mu = (a+b)/2           Var: (b-a)^2/12

4. Normal Random Variables

   *) Table, page 214: Phi(x), x = 0.00(.01)3.49, P = 0.5000 ... 0.9998

   1. The Normal Approximation to the Binomial Distribution

      Cool pix on p.220: exact probs are BINOMIAL, n=10, p=.7:

      0.30 |
      0.25 |                                     .
      0.20 |                                .    :    :
      0.15 |                                :    :    :
      0.10 |                                :    :    :    :
      0.05 |                           :    :    :    :    :    .
      0.00 |__,____,____:____:____:____:____:____:____:____:____:___
              0    1    2    3    4    5    6    7    8    9   10

      Normal approx has mean mu = 10*.7 = 7 and sdev sqrt(10*.7*.3) = 1.44914

      P[ <= 5 Successes ] = 0.0000059049 + 0.0001377810 + 0.0014467005
                          + 0.0090016920 + 0.0367569090 + 0.1029193452
                          = 0.1502683326,

      or approximately (with the continuity correction, using 5.5 rather
      than 5)

        Phi( (5.5 - 7.0)/1.44914 ) = Phi(-1.0351) = 1 - Phi(1.0351)
          = 1 - (.8485 + .8508)/2 = 1 - 0.8496 = 0.1504       (not bad!)

      EXAMPLE: What is the chance of exactly 50 Heads in 100 tosses of a
      fair coin?

                                                12611418068195524166851562157
        C(100,50) * (0.50)^50 * (0.50)^50  =  ------------------------------
                                               158456325028528675187087900672

                                           =  0.0795892...

      OR, APPROXIMATELY (again with the continuity correction),

        = P[ (49.5-50)/sqrt(100*.5*.5) < (X-mu)/sqrt(Var) < (50.5-50)/sqrt(100*.5*.5) ]
        = P[ -0.1 < Z < +0.1 ]  approx=  2*(0.5398 - 0.5) = 2*(0.0398) = 0.0796

5. Exponential Random Variables   (consider doing hazard FIRST)

     f(x) = lam * exp(-lam*x),   x > 0
     P[ T > x ] = exp(-lam*x),   x > 0
     P[ T < x + eps | T >= x ] = 1 - exp(-lam*(x+eps)) / exp(-lam*x)
                               = 1 - exp(-lam*eps)
                               = lam*eps - (lam*eps)^2/2 + ...  approx=  lam*eps

   ==> lam is the *hazard* = death rate; for the exponential distribution
       (ONLY) the hazard is constant.

   1. Hazard Rates
                                _
      Survivor function:        F(t) = 1 - F(t) = P[X > t]   (optimistic view...)
                                                                  _
      Hazard:                   lambda(t) = f(t)/(1 - F(t)) = f(t)/F(t)
                                _
      Survivor from hazard:     F(x) = exp( -int(0:x) lambda(t) dt )

      Example: lambda(t) = b*t --> F(x) = 1 - exp(-b*x^2/2), x > 0  ("Rayleigh")
                                   f(x) = b*x*exp(-b*x^2/2), x > 0
               lambda(t) = a   --> F(x) = 1 - exp(-a*x), x > 0      ("Exponential")
                                   f(x) = a*exp(-a*x), x > 0

6. Other Continuous Distributions

   1. The Gamma Distribution:   Length of time to catch t fish at an
                                average rate of lam per hour.
   2. The Weibull Distribution: Lifetimes:
                                1 - F(x) = exp( -((x-v)/alpha)^beta ), x > v
   3. The Cauchy Distribution:  Like a normal but much flatter...
                                no mean, no variance.
   4. The Beta Distribution:    Uncertain probability 0 < p < 1.

7. The Distribution of a Function of a Random Variable

   For Y = g(X):

     Continuous:  f_Y(y) = SUM { f_X(x)/|g'(x)| : g(x) = y }
     Discrete:    p_Y(y) = SUM { p_X(x) : g(x) = y }

   Note: something new happened in the continuous case because of the
   Jacobian factor 1/|g'(x)|, which has no discrete analogue.

================
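The two normal-approximation calculations in section 4 can be checked numerically. The sketch below (function names `binom_cdf`, `phi`, and `normal_approx` are mine, not from the notes) computes the exact binomial probabilities and the normal approximation with the continuity correction:

```python
from math import comb, erf, sqrt

def binom_cdf(k, n, p):
    """Exact P[X <= k] for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1))

def phi(z):
    """Standard normal CDF Phi(z), via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def normal_approx(lo, hi, n, p):
    """Approximate P[lo <= X <= hi] for Binomial(n, p) with the
    continuity correction: integrate the normal from lo-0.5 to hi+0.5."""
    mu, sd = n * p, sqrt(n * p * (1 - p))
    return phi((hi + 0.5 - mu) / sd) - phi((lo - 0.5 - mu) / sd)

# P[<= 5 successes], n=10, p=0.7: exact 0.1502683326, approx ~0.1504
print(binom_cdf(5, 10, 0.7))
print(normal_approx(0, 5, 10, 0.7))

# Exactly 50 heads in 100 fair tosses: exact 0.0795892..., approx ~0.0796
print(comb(100, 50) * 0.5**100)
print(normal_approx(50, 50, 100, 0.5))
```

The approximation differs slightly from the table-lookup value in the notes because `erf` carries more digits than a four-place table.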
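The survivor-from-hazard identity in section 5 can also be checked numerically: integrating the Rayleigh hazard lambda(t) = b*t should reproduce F(x) = 1 - exp(-b*x^2/2). A minimal sketch (the value b = 2.0 and the name `cdf_from_hazard` are arbitrary choices of mine):

```python
from math import exp

b = 2.0                      # illustrative Rayleigh parameter (my choice)
hazard = lambda t: b * t     # lambda(t) = b*t, the "Rayleigh" hazard

def cdf_from_hazard(x, n=10000):
    """F(x) = 1 - exp(-int_0^x lambda(t) dt), with the integral done by
    the trapezoidal rule (exact here, since the integrand is linear)."""
    h = x / n
    integral = sum(0.5 * (hazard(i * h) + hazard((i + 1) * h)) * h
                   for i in range(n))
    return 1 - exp(-integral)

for x in (0.5, 1.0, 2.0):
    closed_form = 1 - exp(-b * x**2 / 2)
    print(x, cdf_from_hazard(x), closed_form)   # the two columns agree
```

Swapping in `hazard = lambda t: a` for a constant `a` reproduces the exponential CDF the same way, which is the point of the "hazard FIRST" ordering suggested in the notes.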
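The change-of-variables formula at the end ties back to the first example, X = U^2: with g(x) = x^2 and U uniform on (0,1) there is a single root x = sqrt(y), |g'(x)| = 2x, so f_Y(y) = 1/(2*sqrt(y)). A sketch comparing that Jacobian-formula density against a numerical derivative of the CDF F_Y(y) = sqrt(y) (function names are mine):

```python
from math import sqrt

def f_Y(y):
    """Density of Y = U^2 from the Jacobian formula: f_X(sqrt(y))/|g'(sqrt(y))|
    = 1 / (2*sqrt(y)), for 0 < y < 1."""
    return 1.0 / (2.0 * sqrt(y))

def F_Y(y):
    """CDF of Y = U^2: P[U^2 <= y] = sqrt(y), for 0 < y < 1."""
    return sqrt(y)

def numeric_density(y, h=1e-6):
    """Central-difference estimate of F_Y'(y)."""
    return (F_Y(y + h) - F_Y(y - h)) / (2 * h)

for y in (0.04, 0.25, 0.81):
    print(y, f_Y(y), numeric_density(y))   # the two columns agree
```

Note that f_Y(y) exceeds 1 for y < 0.25 and is unbounded as y -> 0, illustrating the earlier remark that a density can be bigger than 1, even infinite.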