Answers to Problems on Bayesian Stats
Let A = alleged father is the real father.
Let B = child has type B blood.
We want Pr (A | B)
From the problem, we know that
Pr (A) = 0.75.
Pr(B | A) = 0.50
Pr(B | not A) = 0.09.
Hence, using Bayes rule, we have:
Pr(A | B) = Pr(A and B) / Pr(B) = Pr(B | A) Pr(A) / Pr(B) = (.50)(.75) / Pr(B).
Now, Pr(B) = Pr(B and A) + Pr(B and not A) = (.50)(.75) + Pr(B | not A) Pr(not A) = (.50)(.75) + (.09)(.25) = 0.3975.
Hence, Pr(A | B) = (.50)(.75) / .3975 = .9434.
There is a 94.34% chance that the alleged father is the real father,
given the child is blood type B.
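For readers who want to check the arithmetic, here is a small Python sketch of the same calculation (the variable names are ours, not part of the problem):

prior_A = 0.75           # Pr(A): alleged father is the real father
p_B_given_A = 0.50       # Pr(B | A)
p_B_given_notA = 0.09    # Pr(B | not A)

# Law of total probability for Pr(B), then Bayes rule for Pr(A | B)
p_B = p_B_given_A * prior_A + p_B_given_notA * (1 - prior_A)
posterior = p_B_given_A * prior_A / p_B

print(round(p_B, 4))         # 0.3975
print(round(posterior, 4))   # 0.9434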
4. Differences between Bayesian and classical inference
a) In classical inference, the probability, Pr(mu > 1400),
is a number strictly bigger than zero and strictly less than one.
False. In classical inference, mu is not treated as random.
Rather, it is some fixed number. Hence, mu is either greater
than 1400 or less than 1400. This implies that Pr(mu > 1400)
must equal zero or it must equal one. It cannot be a number in
between zero and one.
b) In Bayesian inference, the probability, Pr(mu > 1400),
is a number strictly bigger than zero and strictly less than one.
True. In Bayesian inference, mu is treated as random. We make probability statements about mu by using its posterior distribution. Hence, Pr(mu > 1400) is some number between zero and one.
c) In classical inference, our best guess at mu is its maximum
likelihood estimate.
True. For the normal curve, the maximum likelihood estimate of
mu equals the sample mean of the data.
d) If you have very strong prior beliefs about mu, the
Bayesian's best guess at mu will be affected by those beliefs.
True. The Bayesian's best guess at mu combines the prior
information about mu and the data. For example, for the normal
curve, the Bayesian's best guess at mu is a weighted average of the
sample mean and the prior mean (see the sketch after part e below).
e) If you draw a likelihood function for mu, the best guess
at
mu is the number corresponding to the top of the hill in the likelihood
function.
True. The maximum likelihood estimate is the value of mu that
maximizes the likelihood function, i.e., the value at the top of
the hill.
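To make the weighted-average claim in part d concrete, here is a small Python sketch of the posterior mean under the usual normal model with known variance. The numbers are made up for illustration; they are not from the problem.

n, xbar, sigma2 = 25, 1450.0, 100.0**2   # made-up data: sample size, sample mean, known variance
mu0, tau2 = 1300.0, 50.0**2              # made-up prior mean and prior variance for mu

w_data = n / sigma2        # precision contributed by the data
w_prior = 1 / tau2         # precision contributed by the prior

# The posterior mean of mu is a precision-weighted average of the
# sample mean and the prior mean.
post_mean = (w_data * xbar + w_prior * mu0) / (w_data + w_prior)
print(post_mean)           # falls between the prior mean 1300 and the sample mean 1450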
5. Baseball statistics
We can use Bayes rule to find the posterior distribution for
p.
For each value of p, the number of times on base (call this random
variable X) follows a binomial distribution with n = 68 and the given
p. This follows because the outcome is dichotomous, the times at
bat are
independent, and (absent other information about the game situation)
each time Drew bats he has the same chance of reaching base
safely.
Setting up the Bayes rule computations, we get
   p      Pr(p)    Pr(X=22 | p)    Pr(X=22, p)    Pr(p | X=22)
 ----------------------------------------------------------------
 0.25     .05        .0408           .00204          .0352
 0.30     .10        .0943           .00943          .1625
 0.35     .30        .0926           .02778          .4791
 0.40     .40        .0440           .01760          .3035
 0.45     .10        .0107           .00107          .0185
 0.50     .05        .0014           .000068         .00117
Pr(X=22) = .057988
Each entry in the third column is obtained by using the binomial
formula for the corresponding value of p. For example,
Pr(X=22 | p=0.30) = [68! / (22! 46!)] (0.3)^22 (0.7)^46 = .00943
Each entry in the fourth column is obtained from the multiplication rule:
Pr(X=22, p) = Pr(p) Pr(X=22 | p)
Pr(X=22) is obtained by summing Pr(X=22, p) over all values of p.
Pr(p | X=22) is obtained from the definition of conditional probability:
Pr(p | X=22) = Pr(X=22, p) / Pr(X=22)
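The whole table can also be generated in a few lines of Python; this is just a sketch of the computations described above, not part of the original solution:

from math import comb

n, x = 68, 22                                    # at-bats and times on base
prior = {0.25: .05, 0.30: .10, 0.35: .30,
         0.40: .40, 0.45: .10, 0.50: .05}        # Pr(p)

# Binomial likelihood Pr(X=22 | p), joint Pr(X=22, p), marginal Pr(X=22)
like = {p: comb(n, x) * p**x * (1 - p)**(n - x) for p in prior}
joint = {p: prior[p] * like[p] for p in prior}
marg = sum(joint.values())

# Posterior Pr(p | X=22)
post = {p: joint[p] / marg for p in prior}
for p in prior:
    print(p, round(post[p], 4))
print("Pr(X=22) =", round(marg, 6))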
6. Angioplasty
We can use Bayes rule to find the posterior distribution for
p. For each value of p, the number of severe reactions
(call this random variable X) follows a binomial distribution with n =
127 and the given p. This follows because the outcome is
dichotomous, the people are independent, and (absent other information
about the people) each person has the same chance p of having a severe
reaction.
a) Setting up the Bayes rule computations, we get
   p      Pr(p)    Pr(X=28 | p)      Pr(X=28, p)       Pr(p | X=28)
 ---------------------------------------------------------------------
 0        1/6      0                 0                 0
 0.10     1/6      .0000312          5.21 x 10^(-6)    .00037
 0.20     1/6      .0724             .012              .8656
 0.30     1/6      .0112             .00186            .1339
 0.40     1/6      .000008           1.38 x 10^(-6)    .0001
 0.50     1/6      6.2 x 10^(-11)    1.04 x 10^(-11)   7.4 x 10^(-10)
Pr(X=28) = .013933
Each entry in the third column is obtained by using the binomial
formula for the corresponding value of p. For example,
Pr(X=28 | p=0.20) = [127! / (28! 99!)] (0.2)^28 (0.8)^99 = .0724
Each entry in the fourth column is obtained from the multiplication rule:
Pr(X=28, p) = Pr(p) Pr(X=28 | p)
Pr(X=28) is obtained by summing Pr(X=28, p) over all values of p.
Pr(p | X=28) is obtained from the definition of conditional probability:
Pr(p | X=28) = Pr(X=28, p) / Pr(X=28)
b) Pr(p < .30 | X=28) = 0 + .00037 + .8656 = .866, summing the posterior probabilities for p = 0, 0.10, and 0.20.
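As a check on parts a and b, here is a similar Python sketch for the angioplasty numbers (again just an illustration, not the assigned method):

from math import comb

n, x = 127, 28
values = [0.0, 0.10, 0.20, 0.30, 0.40, 0.50]

# Uniform prior Pr(p) = 1/6, binomial likelihood, joint, marginal, posterior
joint = {p: (1/6) * comb(n, x) * p**x * (1 - p)**(n - x) for p in values}
marg = sum(joint.values())                        # Pr(X=28), about .0139
post = {p: joint[p] / marg for p in values}       # Pr(p | X=28)

# Part b: posterior probability that p is less than .30
print(round(sum(post[p] for p in values if p < 0.30), 3))   # about .866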
Note that this prior distribution is very strong, in that it forces
p to equal one of only 6 values. A more realistic prior
distribution would allow p to range continuously from 0 to 1, but that
is more complicated computationally than we need to show the general
idea of Bayesian statistics.
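Purely as an illustration of that last remark (this goes beyond the problem as posed), one standard continuous choice is a Beta prior on p, which combines with the binomial likelihood to give a Beta posterior. A sketch using scipy, assuming a Beta(1, 1), i.e., uniform, prior:

from scipy import stats

n, x = 127, 28
a0, b0 = 1, 1                             # Beta(1, 1) prior: uniform on [0, 1] (an assumption)

# Conjugacy: a Beta(a0, b0) prior with binomial data gives a
# Beta(a0 + x, b0 + n - x) posterior.
posterior = stats.beta(a0 + x, b0 + n - x)

print(posterior.mean())                   # posterior best guess at p, about 0.22
print(posterior.cdf(0.30))                # posterior Pr(p < .30)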