    Ngrid <- seq(from=max(y),by=2,length=200)   # grid on N
    thgrid <- seq(0.1,0.9,length=200)           # grid on theta
    lp <- matrix(0,nrow=200,ncol=200)           # initialize log posterior matrix
    for (i in 1:200){                           # evaluate LOG posterior on grid
      lp[i,] <- lpost(Ngrid[i],thgrid)
    }
    lp <- lp-max(lp)                            # standardize by subtracting the max
    p <- exp(lp)                                # now go back to natural units
    pN <- apply(p,1,sum)                        # marginal posterior in N (for P(N>100|y))
    image(Ngrid,thgrid,p,xlab="N",ylab="THETA")     # plot joint
    plot(Ngrid,pN,xlab="N",ylab="P(N|y)",type="l")  # plot marginal
    abline(v=100,col=2)
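To actually report P(N>100|y) (the quantity the marginal pN is computed for), you can sum the normalized marginal over the grid points with N>100. A minimal sketch, reusing Ngrid and pN from above:

    pN <- pN/sum(pN)        # normalize the gridded marginal
    sum(pN[Ngrid > 100])    # grid approximation to P(N>100|y)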
(a) The problem defines p(th,lambda) and p(N|th,lambda) = Poi(N | mu=th/lambda). Consider the joint distribution

    p(th,lambda,N) = p(th,lambda) * p(N|th,lambda) = 1/lambda * Poi(...)

How do you get from the joint distribution on (th,N,lambda) to the desired marginal prior on (th,N)? It's the standard trick for getting rid of "nuisance parameters": integrate them out.
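Spelled out, the marginalization step is (the integral itself is for you to evaluate):

    p(th,N) = int p(th,lambda,N) dlambda = int 1/lambda * Poi(N | mu=th/lambda) dlambda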
(b) Now you should have a prior p(th,N) from part (a). Combine this with the binomial likelihood p(y|th,N) = prod p(y[i]|th,N) to get the posterior
p(th,N|y) = C * .... (where C is a proportionality constant).
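For the grid evaluation above you need the log posterior. A minimal R sketch, assuming you have derived the log prior in part (a) and wrapped it in a (hypothetical) function lprior(N,th), with y the data vector:

    lpost <- function(N,th){
      # th may be a vector (one value per grid point)
      llike <- sapply(th, function(t) sum(dbinom(y, size=N, prob=t, log=TRUE)))
      lprior(N,th) + llike
    }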
Now the problem is in the same form as several earlier problems, i.e. we have a bivariate posterior p(par1,par2) and want to simulate from it. Just use your program from the last hw (3.8), change the likelihood and the grid on the two parameters appropriately, and you are done (a sketch of the grid simulation follows below).
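In case you don't have that program handy, a minimal sketch of grid-based simulation, assuming the 200x200 matrix p, Ngrid and thgrid from above (sample a grid cell with probability proportional to p, then read off the two parameters):

    nsim <- 1000
    idx <- sample(length(p), nsim, replace=TRUE, prob=p)  # linear cell indices
    Nsim  <- Ngrid[(idx-1) %% 200 + 1]     # row index (rows index N)
    thsim <- thgrid[(idx-1) %/% 200 + 1]   # column index (columns index theta)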
- First get an approximate linear fit as a first guess of the parameter values:

    x <- 1976:1985
    y <- c(24,25,31,31,22,21,26,20,16,22)
    lf <- lm(y ~ x)
    summary(lf)

  You'll see a=1800, b=-0.9 approximately.

- Now recode x as -10 for 1976 through -1 for 1985. Since b is negative, this will enforce a+b*x to always be positive and you'll not run into trouble with the Poisson parameter (which has to be positive). Redo the linear fit with the recoded x to get a better idea of the parameter values.

- Now use a first guess of agrid & bgrid, fiddle around a little and you'll soon find a reasonable grid, e.g.

    agrid <- seq(10,30,length=200)
    bgrid <- seq(-2.0,-0.1,length=200)

- A comment about (g) and (h). Denote by mu = alpha + beta*0 (0 for 1986) the mean for y[1986]. Since mu is the Poisson parameter for y[1986], the expected value of y[1986] is just mu = a + 0*b = a, i.e. just report a histogram of the simulated a values for (g).

  For (h): denote ynew = y[1986]. We know the joint distribution of (a,b,ynew) given y:

    p(a,b,ynew|y) = p(a,b|y) * p(ynew|a,b)

  For each simulated pair (a,b), simulate a ynew:

    ynew <- rpois(1000,lambda=mu)

  The posterior predictive for ynew is simply the histogram of these ynew values (see the sketch below). Also, see the book p. 23 for more comments on simulating from a predictive.
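Putting the (h) step together, a minimal sketch assuming asim and bsim hold 1000 simulated (a,b) pairs from your grid program (the names are mine):

    mu <- asim + bsim*0             # Poisson mean for 1986 (recoded x=0), i.e. just asim
    ynew <- rpois(1000, lambda=mu)  # one draw per simulated (a,b) pair
    hist(ynew)                      # posterior predictive for y[1986]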
> Also, I suppose that on 4.1, we should try doing the differentiation
> numerically, is that correct?

Well, you could if you want to. Probably it's easier if you find the posterior mode th-hat numerically (see the hints), and then plug th-hat into the analytically derived expression for I(th-hat).
But if you prefer to try a numerical evaluation of I(th-hat), that's fine with me. Denote L(a,b) = log p(y|a,b).
Use small step sizes da and db. Compute dL/da by

    dL/da =approx [L(ahat+da) - L(ahat)] / da

and the second derivative by

    d2L/da^2 =approx { [L(ahat+da)-L(ahat)]/da - [L(ahat)-L(ahat-da)]/da } / da
              = [L(ahat+da) - 2*L(ahat) + L(ahat-da)] / da^2

Same for d2L/db^2 & d2L/dadb. Now try the same again for da := da/2. If the approximation doesn't change much, stop. Life is even easier in Mathematica: see my other mail message.
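A minimal R sketch of this recipe, assuming a function Lab(a,b) returning the log likelihood and the mode (ahat,bhat) (names and step sizes are my own):

    da <- 0.01; db <- 0.01
    d2a  <- (Lab(ahat+da,bhat) - 2*Lab(ahat,bhat) + Lab(ahat-da,bhat)) / da^2
    d2b  <- (Lab(ahat,bhat+db) - 2*Lab(ahat,bhat) + Lab(ahat,bhat-db)) / db^2
    d2ab <- (Lab(ahat+da,bhat+db) - Lab(ahat+da,bhat-db) -
             Lab(ahat-da,bhat+db) + Lab(ahat-da,bhat-db)) / (4*da*db)
    I <- -matrix(c(d2a,d2ab,d2ab,d2b),2,2)   # observed information = minus the Hessian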
In problem 4.1, change the prior to "uniform on [-3,3]". Otherwise you would have the mode on the boundary, in which case the normal approximation asked for in part (c) would be irrelevant (remember that a posterior mode on the boundary was one of the "counterexamples"). You could of course use the normal approximation with unif[-3,3] and then just constrain it (that's what you would normally do).

Please don't include S-plus programs in your submitted homework. Just the plots, and the values for the mode th-hat and I(th-hat) etc.

P

PS: the plot for 4.1 should look something like the figure below (don't print out this email as your homework solution :-)

[ASCII sketch of the plot: P(TH|Y) versus THETA over [-3,3], vertical axis 0.0 to 0.4; the exact posterior p(th|y) is a single peaked curve, and the normal approximation N(thhat, 1/I(thhat)) is overlaid as a dotted curve tracking it closely.]
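A minimal sketch of how such a comparison plot could be drawn, assuming thgrid and a vector pth holding the exact posterior normalized as a density, plus thhat and Ihat from part (c) (names are mine):

    # pth assumed normalized as a density: pth <- pth/(sum(pth)*(thgrid[2]-thgrid[1]))
    plot(thgrid, pth, type="l", xlab="THETA", ylab="P(TH|Y)")   # exact posterior
    lines(thgrid, dnorm(thgrid, thhat, sqrt(1/Ihat)), lty=2)    # N(thhat, 1/I(thhat))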
4.5c works just the same way as 4.5b. The solution is that the optimal estimator is a=m, where m is the k0/(k0+k1) quantile. To show that this is true:

1. Consider a>m. Then

    E[L(a,th)] - E[L(m,th)]
      = int_{-inf}^{m} k1*(a-m) p(th|x) dth
      + int_{m}^{a} [k1*(a-th) - k0*(th-m)] p(th|x) dth
      + int_{a}^{inf} k0*(m-a) p(th|x) dth                 (*)

   Note that the integrand in the second integral is

    k1*a - (k0+k1)*th + k0*m >= k1*a - (k0+k1)*a + k0*m = k0*(m-a),

   i.e.

    (*) >= int_{-inf}^{m} k1*(a-m) p(th|x) dth + int_{m}^{inf} k0*(m-a) p(th|x) dth
         = (a-m) * [k1*P(th<=m|x) - k0*P(th>m|x)] = 0,

   since P(th<=m|x) = k0/(k0+k1) by the definition of m. I.e. a>m is NOT better than m.

2. The same argument works for a<m.
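If you want a quick numerical sanity check of this result (entirely my own illustration, not part of the problem): with a standard normal "posterior", the asymmetric linear loss (k0 per unit underestimate, k1 per unit overestimate), and k0=1, k1=3, the expected loss should be minimized near the k0/(k0+k1) = 0.25 quantile, qnorm(0.25) = -0.674:

    k0 <- 1; k1 <- 3
    th <- rnorm(100000)                  # draws from a toy posterior
    loss <- function(a) mean(ifelse(th >= a, k0*(th-a), k1*(a-th)))
    agrid <- seq(-2,2,length=401)
    eloss <- sapply(agrid, loss)
    agrid[which.min(eloss)]              # should be close to qnorm(k0/(k0+k1))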