Homework 10 Solutions


Ch. 28: 


3.  null hypothesis: marital status and employment status are not
associated

    alternative hypothesis: there is some association between marital
status and employment status

Here are the expected values that you should have calculated:
                        Married         W, D, or S      Never married   
Employed                  654.1           109.3            132.6
Unemployed                 67.9            11.3             13.8
Not in labor force         62.0            10.4             12.6

There are (3-1)(3-1)=4 degrees of freedom.

chi-squared test statistic = (24.9)^2/654.1 + (-6.3)^2/109.3 +
(-18.6)^2/132.6 + (-4.9)^2/67.9 + (-1.3)^2/11.3 + (6.2)^2/13.8 +
(-20.0)^2/62.0 + (7.6)^2/10.4 + (12.4)^2/12.6

So, the chi-squared test statistic is about 31.4 with 4 degrees of
freedom.  This gives us a p-value less than 0.01.  This is unlikely to
be chance variation.  It is likely that men of different marital
status have different distributions of labor force status.  (Note that
we cannot give a cause-and-effect relationship here.)
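
As a quick check on the arithmetic, here is a minimal Python sketch that
sums the nine terms using the expected values and the Obs - Exp
differences exactly as printed above.  The variable names are just for
illustration, and the p-value line uses a closed form that holds only
for 4 degrees of freedom.

```python
import math

# Expected counts and (observed - expected) differences, copied from the
# solution above, row by row.
expected = [
    [654.1, 109.3, 132.6],   # Employed
    [67.9, 11.3, 13.8],      # Unemployed
    [62.0, 10.4, 12.6],      # Not in labor force
]
diffs = [
    [24.9, -6.3, -18.6],
    [-4.9, -1.3, 6.2],
    [-20.0, 7.6, 12.4],
]

# Chi-squared statistic: sum of (obs - exp)^2 / exp over all nine cells.
chi_sq = sum(
    d ** 2 / e
    for d_row, e_row in zip(diffs, expected)
    for d, e in zip(d_row, e_row)
)

# Degrees of freedom for an r x c table: (r - 1)(c - 1).
df = (len(expected) - 1) * (len(expected[0]) - 1)

# Right-tail area for chi-squared with 4 df (closed form for df = 4 only).
p_value = math.exp(-chi_sq / 2) * (1 + chi_sq / 2)

print(f"chi-squared = {chi_sq:.1f} on {df} degrees of freedom")
print(f"p-value = {p_value:.2g}")
```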



Ch. 29:


1.  (a) True (p. 549)
    (b) False (p. 555)
    (c) False (p. 547); of course, p = 4.7% gives you "statistical
        significance"



Special Review Exercises after Ch. 29 (pp. 567-577)


11.  A first-year GPA of 3.5 is 1 SD above average.  Students with
this GPA averaged about r x 1 = 0.4 SDs above average in second year,
by the regression method.  Sally must have been above average on
second-year GPA by about 0.4 SDs, putting her in the 66th percentile.

14.  The students who scored 500 on the V-SAT averaged about 0.6 x 500
+ 245 = 545 on the M-SAT.  That is the new average.  The new SD is the
r.m.s. error of the regression line, which is 80 points.  Now
(500-545)/80 is about -0.56.  So the answer is about 70% of 25000,
which is 0.70 x 25000 = 17500.
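
The same steps can be sketched in Python.  This assumes (from the
arithmetic above) that the question asks how many of these 25000
students scored above 500 on the M-SAT.  The helper normal_cdf is not
from the text; it just replaces the normal table.  The exact normal
area comes out near 71%, which the solution rounds to 70%, so the count
printed here is a little above 17500.

```python
import math

def normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

slope, intercept = 0.6, 245.0   # regression line used in the solution
rms_error = 80.0                # r.m.s. error of the regression line
group_size = 25000              # students who scored 500 on the V-SAT

new_avg = slope * 500 + intercept        # average M-SAT for this group
z = (500 - new_avg) / rms_error          # 500 in standard units
fraction_above = 1.0 - normal_cdf(z)     # area to the right of z
estimate = fraction_above * group_size

print(f"new average = {new_avg:.0f}, z = {z:.2f}")
print(f"fraction above 500 = {fraction_above:.1%}, count = {estimate:.0f}")
```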

17.  The net gain is like the sum of 100 draws from a box with 18
tickets marked "+$1", 18 tickets marked "-$1", and 2 tickets marked
"-$0.50".  (These last 2 tickets correspond to 0 and 00, where you
only lose half your stake.)  The average of this box is -$1/38, which
is about -$0.0263.  The expected net gain is -$2.63.  The SD of the
box (don't use the short-cut) is $0.980.  The SE for the net gain is
$9.80.  Now $2.63/$9.80 is 0.27.  The answer is about 40%.
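
Here is a rough Python sketch of the box model, computing the SD the
long way (no short-cut).  It assumes the question asks for the chance
of coming out ahead, that is, a net gain above $0, and it uses the
normal approximation with no continuity correction.

```python
import math

# The box: 18 tickets of +$1, 18 of -$1, and 2 of -$0.50 (for 0 and 00).
box = [1.0] * 18 + [-1.0] * 18 + [-0.5] * 2
n_draws = 100

avg = sum(box) / len(box)
# SD of the box: r.m.s. deviation of the tickets from the box average.
sd = math.sqrt(sum((t - avg) ** 2 for t in box) / len(box))

expected_gain = n_draws * avg          # expected value of the sum
se_gain = math.sqrt(n_draws) * sd      # SE of the sum

# Normal approximation: area to the right of $0 in standard units.
z = (0 - expected_gain) / se_gain
chance_ahead = 0.5 * math.erfc(z / math.sqrt(2.0))

print(f"box average = {avg:.4f}, box SD = {sd:.3f}")
print(f"expected gain = {expected_gain:.2f}, SE = {se_gain:.2f}")
print(f"z = {z:.2f}, chance of coming out ahead = {chance_ahead:.0%}")
```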

34.  (a) True
     (b) False (p. 482)
     (c) False (p. 482)

37.  (a) Can't be done with the information given: the Current
Population Survey isn't a simple random sample (chapter 22).

     (b) Use the method of section 28.4.

   Obs        Exp       Obs - Exp
  7  20   14.2  12.8   -7.2   7.2
 21  19   21.0  19.0    0.0   0.0
 13   9   11.5  10.5    1.5  -1.5
 23  10   17.3  15.7    5.7  -5.7

This gives a chi-squared test statistic of 11.9, on 3 degrees of
freedom.  So, p is less than 1%.

Interpretation: Women who are less well-educated tend not to be in the
labor force; women who are better educated are more likely to be in
the labor force, and in professional or managerial jobs.

Note: Even if Obs - Exp is 0.0 in some cells, that does not affect the
formula for degrees of freedom.  Degrees of freedom depend on the
model, not on the data (pp. 433, 439).
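
For part (b), the expected counts and the test statistic can be
reproduced from the observed table alone.  The sketch below is one way
to do it in plain Python; the 1% cutoff of 11.34 for 3 degrees of
freedom is taken from a standard chi-squared table, not computed.

```python
# Observed counts from the table above (4 rows, 2 columns).
observed = [
    [7, 20],
    [21, 19],
    [13, 9],
    [23, 10],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count in each cell: (row total x column total) / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Chi-squared statistic: sum of (obs - exp)^2 / exp over all cells.
chi_sq = sum(
    (o - e) ** 2 / e
    for o_row, e_row in zip(observed, expected)
    for o, e in zip(o_row, e_row)
)

# Degrees of freedom depend only on the shape of the table.
df = (len(observed) - 1) * (len(observed[0]) - 1)

critical_1pct = 11.34   # 1% point for 3 df, from a chi-squared table

for e_row in expected:
    print("  ".join(f"{e:5.1f}" for e in e_row))
print(f"chi-squared = {chi_sq:.1f} on {df} degrees of freedom")
print("p is less than 1%" if chi_sq > critical_1pct else "p is above 1%")
```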



Additional Problems

1.  (a) H_0: avg = 15    H_A:  avg is more than 15

SE for avg = 3/((36)^0.5) = 3/6 = 0.5

Z test statistic = (17-15)/SE for avg = 2/0.5 = 4

So, the p-value (observed significance level) for this test is
(100%-99.9937%)/2, which is about 0.003% = 0.00003.  Since this is
much less than alpha = 0.05 = 5%, we have enough evidence to reject
the null hypothesis.  So, we conclude that the average number of
contacts per week is greater than 15.
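
A short Python sketch of the same one-sided z-test, using the numbers
above (SD of 3, sample of 36, sample average of 17).  The variable
names are illustrative only; the right-tail area comes from math.erfc
instead of the normal table.

```python
import math

null_avg = 15.0      # average under the null hypothesis
sample_avg = 17.0    # observed sample average
sd = 3.0             # SD used in the solution
n = 36               # sample size
alpha = 0.05

se_avg = sd / math.sqrt(n)               # SE for the average
z = (sample_avg - null_avg) / se_avg     # test statistic

# One-sided p-value: area under the normal curve to the right of z.
p_value = 0.5 * math.erfc(z / math.sqrt(2.0))

print(f"SE = {se_avg}, z = {z}, p-value = {p_value:.5f}")
print("reject the null" if p_value < alpha else "do not reject the null")
```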

    (b) Say that really the average is 16.

First find what the "rejection region divider" is in standard z-units,
then in the units of the problem.

15 + (1.65)(0.5) = 15.825

How many SEs (for average) is 15.825 from the alternative average of
16?
(15.825 - 16)/0.5 = -0.35

How much area is there between negative infinity (the left tail of the
normal curve continues on "forever") and -0.35 under the curve centered
at an average of 16?

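(100% - 27.37%)/2 = 36.315%

So, if the average is really 16, there is about a 36% chance that the
test fails to reject the null hypothesis (a Type II error), and about
a 64% chance that it rejects (the power of the test).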
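
The calculation in part (b) can be sketched in Python as follows.  It
assumes the same one-sided 5% test as in part (a), with the cutoff
1.65 SEs above 15; normal_cdf is a helper standing in for the normal
table.

```python
import math

def normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

null_avg = 15.0
true_avg = 16.0      # the alternative average considered in part (b)
se_avg = 0.5         # SE for the average, from part (a)
z_cutoff = 1.65      # one-sided 5% cutoff in standard units

# Rejection region divider, in the units of the problem.
cutoff = null_avg + z_cutoff * se_avg

# The same divider in standard units under the curve centered at 16.
z_alt = (cutoff - true_avg) / se_avg

# Area to the left of the divider: chance of failing to reject.
miss_chance = normal_cdf(z_alt)
power = 1.0 - miss_chance

print(f"cutoff = {cutoff:.3f}, z under alternative = {z_alt:.2f}")
print(f"chance of not rejecting = {miss_chance:.1%}, power = {power:.1%}")
```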


2.  M: event of person having mono
    S: event of person having a sore throat

P(M) = 1% = 0.01
P(S|M) = 90% = 0.9
P(S|not M) = 30% = 0.3

We want P(M|S) = P(S|M)P(M)/(P(S|M)P(M) + P(S|not M)P(not M))
               = (0.9)(0.01)/[(0.9)(0.01) + (0.3)(0.99)]
               = 0.009/(0.009 + 0.297)

So, the probability of having mono, given that you have a sore throat,
is about 0.0294, which is about 2.94%.
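
The same Bayes' rule calculation, written out as a small Python
sketch (the variable names are made up for this example):

```python
p_m = 0.01               # P(M): person has mono
p_s_given_m = 0.9        # P(S|M): sore throat, given mono
p_s_given_not_m = 0.3    # P(S|not M): sore throat, given no mono

# Total chance of a sore throat, split over mono / no mono.
p_s = p_s_given_m * p_m + p_s_given_not_m * (1 - p_m)

# Bayes' rule: P(M|S) = P(S|M) P(M) / P(S).
p_m_given_s = p_s_given_m * p_m / p_s

print(f"P(S) = {p_s:.3f}, P(M|S) = {p_m_given_s:.4f}")
```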


3.  (a) No. If the missions are independent, he cannot add the
probabilities.  He has to multiply the chances of surviving each
mission.

    (b) P(not shot down on any of 50 missions) = (0.98)^50
This is about 0.3642, or about 36.42%.
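
A small Python sketch contrasting the two approaches, assuming (as in
part (b)) a 0.98 chance of surviving each mission and independence
between missions:

```python
p_survive_one = 0.98     # chance of not being shot down on one mission
n_missions = 50

# Wrong: adding the per-mission chances of being shot down.
added = n_missions * (1 - p_survive_one)

# Right: multiply the survival chances across independent missions.
p_survive_all = p_survive_one ** n_missions
p_shot_down = 1 - p_survive_all

print(f"adding gives {added:.2f}, which would mean certainty")
print(f"P(survive all {n_missions}) = {p_survive_all:.4f}")
print(f"P(shot down at least once) = {p_shot_down:.4f}")
```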