c) Example: In n=1000 Bernoulli trials, we observe X=473 Successes.
   Which is true?

        H.0: p = 1/2        H.1: p = 1/3

   The LRT says to reject for small values of T(X)=x; the P-value is

        P = Pr[ X <= 473 | p = 0.5 ]
          = sum(k = 0:473) C(1000,k) 0.5^k 0.5^(1000-k) = 0.0468,

   so with alpha=0.05 we would REJECT H.0.  If we were testing "p=1/3"
   we would reject that too, with P = 5e-20.  The LR against H.0 is

        Lambda = (p.1/p.0)^x (q.1/q.0)^(n-x)
               = (p.1 q.0 / p.0 q.1)^x (q.1/q.0)^n
               = (1/2)^473 (4/3)^1000 = 3.560809e-18

   Things wouldn't look so silly if we compared H.0 to a closer
   alternative like H.1: p.1 = 0.45.

d) Advantages:
   Sampling: No need for a prior.
   Bayes:    Computes what you probably wanted, P[ H.0 | X ].

-----------------------------------------------------------------------------

Likelihood Ratio Tests (LRTs) are Uniformly Most Powerful (UMP):
Of all tests of the form

        Reject H.0 if T(x) >= c

for some statistic T and "critical value" c, the LRT (where T is the
Likelihood Ratio Lambda(x)) is best in the sense that if it has

        Size  alpha    = P[ Lambda >= c | H.0 true ]   and
        Power [1-beta] = P[ Lambda >= c | H.1 true ],

then every other test has either bigger Size (i.e., higher probability
of a Type I error) or smaller Power (i.e., higher probability of a
Type II error).  This classic result is called the "Neyman-Pearson
Lemma" in honor of Jerzy Neyman and Egon Pearson, who discovered it in
the 1930's.

--------------------------------------------------------------------------

Here's a proof of the Neyman-Pearson Lemma:  Set Lambda(x) = f.1(x)/f.0(x),
the likelihood ratio against H.0: X ~ f.0(x), and fix any c>0.
Set:
        R    = {x: Lambda(x) >= c}
        alp  = P.0[ X in R ]     = Integral of f.0(x) over R
        bet  = P.1[ X notin R ]  = Integral of f.1(x) over R^c

Let R* be any other test's rejection region, and set

        alp* = P.0[ X in R* ]    = Integral of f.0(x) over R*
        bet* = P.1[ X notin R* ] = Integral of f.1(x) over R*^c

Set A = R\R* and B = R*\R, and

        A.0 = Integral of f.0 over A        B.0 = Integral of f.0 over B
        A.1 = Integral of f.1 over A        B.1 = Integral of f.1 over B

and note that, since Lambda >= c on A and Lambda < c on B,

        A.1 = Integral of Lambda(x) f.0(x) over A >= c * A.0
        B.1 = Integral of Lambda(x) f.0(x) over B <  c * B.0

Then

        c * (alp* - alp) = c * (B.0 - A.0) >  B.1 - c*A.0    and
            (bet* - bet) = A.1 - B.1       >= c*A.0 - B.1

SO... if (B.1 - c*A.0) >= 0, then alp* > alp;
   or if (B.1 - c*A.0) <  0, then bet* > bet.

--------------------------------------------------------------------------

EXAMPLES of Likelihood Ratio tests:

One-sided Normal Mean with Known Variance:  Let X.i ~ No(mu, sig^2)
with sig^2 known, and test

        H.0: mu = mu.0   vs.   H.1: mu = mu.1

Compute the ratio of pdf's for samples of size n and take logs to get

        (n/2sig^2) [ (X-bar - mu.0)^2 - (X-bar - mu.1)^2 ]
              = const + (n/sig^2) * (mu.1 - mu.0) * X-bar

and hence we Reject for  LARGE values of X-bar, if mu.1 > mu.0,
                         SMALL values of X-bar, if mu.1 < mu.0.

Fixed alpha = 0.05, mu.1 > mu.0:  REJECT when

        (X-bar - mu.0)/[sig/sqrt(n)] > 1.645,  i.e.,
         X-bar > mu.0 + 1.645 sig/sqrt(n)

P-value:

        P(x) = Phi( sqrt(n) * (mu.0 - X-bar) / sig )

-------------------------------------------------------------------------------

COMPOSITE SITUATIONS:

Usually one or both of the hypotheses we consider will be "composite",
i.e., will contain more than one point of Theta and so won't
completely specify the data pdf.  For example, we might like to test

        H.0: X.i ~ No(mu.0, sig^2) for some sig^2 > 0   vs.
        H.1: X.i ~ No(mu.1, sig^2) for some sig^2 > 0,

where both hypotheses are half-lines in the (mu x sig) plane; OR

        H.0: X ~ Bi( n=1000, p = 0.5 )   vs.
        H.1: X ~ Bi( n=1000, p < 0.5 ),

where H.1 is the segment [0, 1/2) while H.0 is the single point {1/2}.
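One-sided Binomial tail probabilities like those above can be computed
exactly; here is a minimal sketch (Python, standard library only) that
checks the two numbers quoted in the earlier p=1/2 vs. p=1/3 example,
the P-value 0.0468 and the likelihood ratio 3.56e-18:

```python
from math import comb, exp, log

n, x = 1000, 473

# Exact one-sided P-value  P[ X <= 473 | p = 1/2 ]:
# sum_{k=0}^{473} C(1000, k) / 2^1000, in exact integer arithmetic.
tail = sum(comb(n, k) for k in range(x + 1))
p_value = tail / 2**n          # ~ 0.0468: reject H.0 at alpha = 0.05

# Likelihood ratio  Lambda = (1/2)^473 * (4/3)^1000  against H.0,
# computed on the log scale to avoid floating-point underflow.
log_lr = x * log(1 / 2) + n * log(4 / 3)
lr = exp(log_lr)               # ~ 3.56e-18
print(p_value, lr)
```

Working on the log scale matters here: (1/2)^473 alone underflows to
zero in double precision, while the log-LR is a harmless -40.18.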
Usually NO TEST will be UNIFORMLY most powerful against ALL
alternatives; we have to make some choices.  One special case where
life is good:

Monotone Likelihood Ratio:  If H.0 and H.1 assert that theta lies in
Theta.0 and Theta.1, respectively, and if

        f(x | th.1) / f(x | th.0)

is a monotone increasing function of T(x) for every th.1 in Theta.1
and every th.0 in Theta.0, then "Reject if T(x) > c" is the Most
Powerful test *uniformly* over Theta.0 and Theta.1, or "UMP".

UMP tests DO NOT EXIST in most cases, including for example:
 - two-sided tests for a Normal mean
     [ each 1-sided test is more powerful for SOME mu ]
 - one-sided tests of the center parameter for the Cauchy dist'n
     [ the LR is not monotonic, even for n=1 observation ]

-------------------------------------------------------------------------------

GLRT

One approach in problems with composite hypotheses is to base tests on
the "generalized likelihood ratio" statistic:

        sup{ f(x | theta): theta in Theta.1 }
        -------------------------------------
        sup{ f(x | theta): theta in Theta.0 }

It's worth computing the MLE theta-hat first: it determines the
numerator, if theta-hat is in Theta.1, or the denominator, if
theta-hat is in Theta.0.

For example:

Normal Mean, Known Variance:

        H.0: X ~ No( mu = mu.0,     sig^2 )
        H.1: X ~ No( mu arbitrary,  sig^2 )

The NUMERATOR (H.1) max comes at mu = X-bar; its value there is

        (2 pi sig^2)^(-n/2) * e^(-S/2sig^2),  where S = Sum (X.i - X-bar)^2.

The DENOMINATOR max comes at mu = mu.0 (no choice!); its value there is

        (2 pi sig^2)^(-n/2) * e^(-S/2sig^2) * e^(-n (X-bar - mu.0)^2 / 2sig^2),

so the log ratio is

        (n/2sig^2) * (X-bar - mu.0)^2

and the GLR test rejects for large values of | X-bar - mu.0 |, i.e.,
it's the obvious two-sided test with fixed-alpha rejection region (for
alpha = 0.05)

        R = {x: | X-bar - mu.0 | > 1.96 sig/sqrt(n) }

and P-value

        P(x) = 2 Phi( - sqrt(n) * | X-bar - mu.0 | / sig ).

Normal Mean, Unknown Variance:

        H.0: X ~ No( mu = mu.0,    sig^2 arbitrary )
        H.1: X ~ No( mu arbitrary, sig^2 arbitrary )

The numerator max comes at mu = X-bar, sig^2 = hat-sig^2 = S/n,
where S = Sum (X.i - X-bar)^2; its value there is

        (2 pi S/n)^(-n/2) * e^(-n/2).

The denominator max comes at mu = mu.0,
sig^2 = (1/n) Sum (X.i - mu.0)^2 = S/n + (X-bar - mu.0)^2;
its value there is

        (2 pi [S/n + (X-bar - mu.0)^2])^(-n/2) * e^(-n/2).

After cancelling a few things, the GLR statistic is the (n/2) power of

        1 + (X-bar - mu.0)^2 / (S/n),

a monotone function of the Student t statistic.  SO, the two-sided GLR
test of H.0: mu = mu.0 will reject for large values of |t|, where

                  X-bar - mu.0
        t = ----------------------                    (*)
            sqrt{ (S/(n-1)) / n }

with fixed-alpha rejection region R = {x: |t| > t.c} for "critical
value" t.c = qt(1 - alp/2, n-1) with n-1 degrees of freedom.  The
P-value for t as in (*) is

        P = 2 * pt( -abs(t), n-1 ).

Another Example:  For Poisson data, test H.0: theta = th.0 against the
two-sided alternative; now let S denote the sum X.1 + ... + X.n, and
X-bar = S/n:

               (n X-bar)^S exp(-n X-bar) / S!
        T(x) = ------------------------------
               (n th.0)^S  exp(-n th.0)  / S!

             = (S / n th.0)^S exp(n th.0 - S)

             = nth power of:  (X-bar/th.0)^X-bar exp(th.0 - X-bar),

with logarithm

        X-bar * log(X-bar / th.0) - [ X-bar - th.0 ],

which gives the GLR two-sided test for a Poisson mean.  For X-bar
close to th.0 (always true under H.0 for large n), this is very
close to

        | X-bar - th.0 |^2 / 2 th.0.

Here you may need to use simulation (for small n), a Normal
approximation (for large n), or a numerical search to get the precise
rejection region or P-value.
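The quadratic approximation to the per-observation Poisson log-GLR
above is easy to check numerically; in this sketch th0 and xbar are
arbitrary illustrative values (xbar chosen close to th0, where the
approximation is claimed to hold):

```python
from math import log

th0 = 2.0     # null value of the Poisson mean (illustrative)
xbar = 2.1    # observed sample mean, close to th0 (illustrative)

# Per-observation log GLR:  X-bar * log(X-bar/th.0) - (X-bar - th.0)
log_glr = xbar * log(xbar / th0) - (xbar - th0)

# Quadratic (Taylor) approximation:  (X-bar - th.0)^2 / (2 th.0)
approx = (xbar - th0) ** 2 / (2 * th0)

print(log_glr, approx)   # agree to about 4e-5 at these values
```

Expanding log(X-bar/th.0) about X-bar = th.0 shows the error is third
order in (X-bar - th.0), which is why the two numbers agree so closely.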
-------------------------------------------------------------------------------

ASYMPTOTICS

In a one-parameter natural exponential family, with pdf of the form

        f(x | th) = exp[ th * T(x) - A(th) ] * h(x),

the LRT of H.0: th = th.0 vs. H.1: th = th.1 will reject for large
values of S.n = T(X.1) + ... + T(X.n) if th.1 > th.0 (and for small
values otherwise).  Under the null hypothesis this statistic has mean
n A'(th.0) and variance n A"(th.0) for a sample of size n, so by the
CLT the P-value for the one-sided test will be about

        P(x) ~~ Phi{ [ n A'(th.0) - S.n ] / sqrt(n A"(th.0)) },

while a Bayesian test of the same hypotheses would yield a posterior
probability of H.0 of exactly

                                        1
 P[ H.0 | x ] = ----------------------------------------------------------------
                1 + (pi.1/pi.0) * exp( (th.1-th.0) * S.n + n [A(th.0)-A(th.1)] )

-------------------------------------------------------------------------------
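As a concrete instance of the two formulas above, take the Poisson
family: the natural parameter is th = log(lambda), T(x) = x, and
A(th) = e^th, so A'(th.0) = A"(th.0) = lambda.0.  The sketch below
uses made-up values (lambda.0 = 1, lambda.1 = 1.5, n = 100, S.n = 120,
equal prior weights) to compute both the approximate sampling-theory
P-value and the exact posterior probability of H.0:

```python
from math import erf, exp, log, sqrt

def Phi(z):
    """Standard Normal CDF via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

lam0, lam1 = 1.0, 1.5              # illustrative null and alternative means
th0, th1 = log(lam0), log(lam1)    # natural parameters th = log(lambda)
n, Sn = 100, 120                   # sample size and observed S.n = sum X.i
pi0 = pi1 = 0.5                    # equal prior probabilities

# A(th) = e^th, so A'(th.0) = A"(th.0) = lam0.  CLT P-value:
# Phi( [n A'(th.0) - S.n] / sqrt(n A"(th.0)) )
p_value = Phi((n * lam0 - Sn) / sqrt(n * lam0))

# Exact posterior probability of H.0:
# 1 / (1 + (pi.1/pi.0) exp((th.1-th.0) S.n + n [A(th.0)-A(th.1)]))
post_H0 = 1.0 / (1.0 + (pi1 / pi0) *
                 exp((th1 - th0) * Sn + n * (exp(th0) - exp(th1))))
print(p_value, post_H0)
```

At these values the one-sided P-value is about 0.023 (so a fixed
alpha = 0.05 test rejects H.0), while the posterior probability of
H.0 is about 0.79, illustrating how sharply the two answers can differ.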