Pitman            STA 230 / MTH 230 Probability            Week 7

Read: Pitman sections 4.1--4.3

                  Density Functions, CDF's

   * Probability Density Functions
   * Exponential and Gamma Distributions
   * CDF's

Continuous Random Variables
==========

1. Introduction

Recall that the "Cumulative Distribution Function" or "CDF" is:

        F(x) = P[X <= x]

It's defined for all -oo < x < oo, and satisfies 3 rules:

   a) x < y  =>  F(x) <= F(y)            [ note < and <= signs ]
   b) LIM (F(x): x-> -oo) = 0,   LIM (F(x): x-> +oo) = 1
   c) F might not be continuous, but it's RIGHT-continuous:
        F(x) = LIM {F(x+eps), eps>0, eps->0}

E.G.: Binomial, Exponential, Geometric, Normal, Uniform

Call X *continuous* if CDF F(x) = P[X <= x] is a *continuous* function of x.
In this course every such F has a derivative f=F' at almost every x, and

        F(x) = int{ f(t) dt: -oo < t <= x }

so for any interval A = (a,b] (or any other set A),

        Pr[ X in (a,b] ] = F(b) - F(a) = INT_A { f(x) dx }

Notice that the "density function" (or "pdf") f satisfies:

   a) 0 <= f(x) for all x
   b) INT { f(x) dx } = 1
   *) F(b) = P[X <= b] = integral [f(x) dx: -oo < x <= b]  ->  f(x) = F'(x)
   *) f(x) need NOT be continuous;
   *) f(x) CAN be bigger than 1.... can even be infinite!
      Example: X = U^2 for U ~ Un(0,1)
         ->  F(x) = P[X <= x] = P[U <= sqrt(x)] = sqrt(x),   0 < x < 1
         ->  f(x) = 1/(2 sqrt(x)),   0 < x < 1   (unbounded as x -> 0)

Examples

a) Uniform on interval (alpha,beta]: calculate f(x) *and* F(x):

      CDF: F(x) = 0,                   x < alp;
                  (x-alp)/(bet-alp),   alp <= x <= bet;
                  1,                   x > bet.
      pdf: f(x) = 1/(bet-alp),         alp < x < bet;
                  0,                   x < alp or x > bet.

   Note: pdf f(x) is indeterminate at x=alp, x=bet, but
         CDF F(x) is well-defined everywhere ( F(alp)=0, F(bet)=1 )

   Illustrate: - cdf might not have derivative at a few points
               - pdf might be > 1, but never < 0

b) X = U^2 for U ~ Un(0,1): calculate first F(x) then f(x) to illustrate
               - unbounded pdf
               - change-of-variables

c) X.t = # of fish (or other events) by time t, with mean mu = t*lambda;
   T.1 = time of 1st event
   T.2 = time of 2nd event
   T.n = time of nth event
   Develop connection between Poisson and exponential/gamma.
   Note normal limit as n->oo by CLT

d) Derive Gamma function from above; show mean E[T.alpha] = alpha/lambda,
   V[T.alpha] = alpha/lambda^2
   [ both easy from E[T.alpha^p] = Gamma(alpha+p) / (lambda^p Gamma(alpha)) ]

------------------------------- Tue ends Thu starts ----------------------------------

Example: Lifetime in days of a cell-phone battery:

      f(x) = 100/x^2, x>100;    0, x<=100         (Pareto)
      F(x) = 0, x<=100;         1-100/x, x>100

PROBLEM: a) P[X < 150] = ???                                 (1/3)
         b) P[ 2 of 5 batteries fail in 150 days ] = ???     (5:2)(1/3)^2(2/3)^3 = 80/243 ~~ 1/3

2. Expectation & Variance of a Continuous Random Variable

      E[g(X)] = sum{ g(x) P[X=x] }      for discrete variables
              = int{ g(x) f(x) dx }     for continuous variables

In particular, the mean and variance for continuous dist'ns are:

      mu  = E[X] = int{ x f(x) dx }     (if it exists!)  and
      var = V[X] = int{ (x-mu)^2 f(x) dx } = int{ x^2 f(x) dx } - mu^2

For Gamma:   E[ X^p ] = lambda^(-p) * Gamma(alpha+p)/Gamma(alpha)  ->  mu, sig^2

For Pareto:  mu = int(100:oo) x (100/x^2) dx = oo !!!
             More general Pa(alp,eps): f(x) = alp eps^alp x^{-alp-1}, x>eps; then
             E[ X^p ] = eps^p alp/(alp-p) for p < alp, so
             mu  = eps alp/(alp-1)                    if alp > 1,
             var = eps^2 alp/[(alp-1)^2 (alp-2)]      if alp > 2.

For Uniform (alp,bet):  mu = (alp+bet)/2    Var: (bet-alp)^2 / 12

-------------------------------------------------------------------------------
                              H A Z A R D

For an exponential T with rate lam:

   P[ T <= x + eps | T > x ] = P[ x < T <= x + eps ] / P[ x < T ]
                             = 1 - exp(-lam * (x+eps)) / exp(-lam * x)
                             = 1 - exp(-lam * eps)
                             = lam*eps - (lam*eps)^2/2 + ...
                             = lam*eps + o(eps)

In general:

   P[ T <= x + eps | T > x ] = [F(x+eps) - F(x)] / [1 - F(x)]
                             ~ eps * f(x) / [1 - F(x)]

Exponential Dist'n:  f(x)     = lam*exp(-lam*x)
                     [1-F(x)] = exp(-lam*x)    ->  hazard = lam, a constant

==> lam is *hazard* = death rate; for exponential distribution (ONLY) this is constant.

1. Hazard Rates
                      _
   Survivor function: F(t) = 1 - F(t) = P[X > t]        (optimistic view...)
                                            _
   Hazard: lambda(t) = f(t)/(1-F(t)) = f(t)/F(t)
   _                                     _
   F(x) = exp{-int(0:x) lambda(t) dt},   F(x) = 1-F(x)

   Example: lambda(t) = b*t  -->  F(x) = 1 - exp(-b*x^2/2), x>0     ("Rayleigh")
                                  f(x) = b * x * exp(-b*x^2/2), x>0
            lambda(t) = a    -->  F(x) = 1 - exp(-a*x), x>0         ("Exponential")
                                  f(x) = a * exp(-a*x), x>0
            lambda(t) = alp * bet * t^(alp-1)  ->  Weibull
               NB: alp < 1 -> hazard DEcreases; alp > 1 -> hazard INcreases. Examples?

            Pareto  => lambda(t) = 0 for t <= eps,  alp / t for t > eps
            Uniform => lambda(t) = 1/(bet-t),  alp < t < bet
            Normal  => lambda(t) ~ t  as t -> oo

6. Other Continuous Distributions

   1. The Gamma Distribution:    Length of time to catch alpha fish @ lam/hr avg rate
   2. The Weibull Distribution:  Lifetimes: 1-F(x) = exp{-beta (x-eps)^alpha}, x>eps
   3. The Cauchy Distribution:   Like a normal but with much heavier tails... no mean, var.
   4. The Beta Distribution:     Uncertain probability 0 < p < 1

Change of variables, Y = g(X):

   Continuous: pdf f_Y(y) = SUM {f_X(x)/|g'(x)| over x: g(x)=y}
   Discrete:   pmf p_Y(y) = SUM {p_X(x)         over x: g(x)=y}

   Jacobian ONLY needed for continuous variables [chain rule].

================
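
Numerical check (a minimal sketch, not part of Pitman's text): the short
Python script below re-computes three results from these notes, namely the
battery PROBLEM, the Rayleigh CDF recovered from its hazard lambda(t) = b*t,
and the pdf of X = U^2 from the change-of-variables formula. All function
names (pareto_battery_cdf, binomial_pmf, cdf_from_hazard, density_of_square)
and the test values b = 2, x = 1.5 are made up for illustration; only the
standard library is assumed.

```python
import math
from fractions import Fraction


def pareto_battery_cdf(x):
    """CDF of the battery lifetime with pdf f(x) = 100/x^2 for x > 100."""
    x = Fraction(x)
    return Fraction(0) if x <= 100 else 1 - Fraction(100) / x


def binomial_pmf(k, n, p):
    """P[exactly k successes in n independent trials, success prob p]."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def cdf_from_hazard(hazard, x, steps=10_000):
    """F(x) = 1 - exp(-int(0:x) hazard(t) dt), integral by trapezoid rule."""
    h = x / steps
    total = 0.5 * (hazard(0.0) + hazard(x))
    total += sum(hazard(i * h) for i in range(1, steps))
    return 1.0 - math.exp(-h * total)


def density_of_square(x):
    """pdf of X = U^2, U ~ Un(0,1), via f_Y(y) = SUM f_X(x)/|g'(x)|.

    Here g(u) = u^2 has the single root u = sqrt(x) in (0,1),
    g'(u) = 2u, and the uniform pdf equals 1 on (0,1).
    """
    if not 0.0 < x < 1.0:
        return 0.0
    u = math.sqrt(x)
    return 1.0 / abs(2.0 * u)


if __name__ == "__main__":
    # Battery PROBLEM: a) P[X < 150], b) P[2 of 5 fail within 150 days]
    p = pareto_battery_cdf(150)
    print("P[X < 150]      =", p)
    print("P[2 of 5 fail]  =", binomial_pmf(2, 5, p))

    # Rayleigh: hazard b*t should give F(x) = 1 - exp(-b*x^2/2)
    b, x = 2.0, 1.5  # illustrative values, not from the notes
    print("F from hazard   =", cdf_from_hazard(lambda t: b * t, x))
    print("F closed form   =", 1.0 - math.exp(-b * x**2 / 2.0))

    # X = U^2: formula 1/(2 sqrt(x)) vs. numerical derivative of sqrt(x)
    d = 1e-6
    print("f(0.25) formula =", density_of_square(0.25))
    print("f(0.25) numeric =", (math.sqrt(0.25 + d) - math.sqrt(0.25 - d)) / (2 * d))
```

Exact fractions are used for the battery problem so that the answers can be
compared directly with 1/3 and 80/243; the trapezoid rule is only one
possible choice for the hazard integral.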