20.1

| _# tosses_ | _E(#H)_ | _SE(#H)_ | _E(%H)_ | _SE(%H)_ |
|---|---|---|---|---|
| 100 | 50 | 5 | 50 | 5 |
| 2500 | 1250 | 25 | 50 | 1 |
| 10000 | 5000 | 50 | 50 | 0.5 |
| 1000000 | 500000 | 500 | 50 | 0.05 |

20.3
a) 50,000
b) 0 or 1
c) False, SD = .4 (using the quick method)
d) True
e) The SE for the number of incomes greater than $50K is (900)^.5 * .4 = 12, which is 1.3% of 900. So plus or minus 1% is 1/1.3 = .75 standard units. The chance is around 55%.

20.5 The appropriate box model has a mean of 150 lbs and an SD of 35 lbs. The expected value for the sum of 50 draws is 7500 lbs, and the SE is (50)^.5 * 35 = 250 lbs, so 8000 lbs is +2 standard units. There is a 2% chance of failure. (By the way, elevators are not designed like this!)

20.6 (ii) is best. Because California is more populous, its sample will be larger, and more accurate for estimating a percentage value (see section 4).

20.7
a) True. Expected values have no associated error.
b) False; see a).
c) True.
d) False.
e) True. (You can count the tickets!)
f) False.

20.9 200, with an SE of about (600)^.5 * (1/3*2/3)^.5 = 12.

20.11
a) observed = 357, expected = 340
b) observed = 71.4%, expected = 68%

21.1
a) The sample is like 500 draws from a box with 25,000 tickets; each ticket is marked 1 (has a computer) or 0 (does not have a computer). The number of sample households with computers is like the sum of the draws. The fraction of 1's in the box is unknown, but can be estimated by the fraction in the sample, 79/500 = 0.158. On this basis, the SD of the box is estimated as sqrt(0.158*0.842) ~= 0.36. The SE for the number of sample households with computers is estimated as sqrt(500)*0.36 ~= 8, and 8 out of 500 is 1.6%. The percentage of households in the town with computers is estimated as 15.8%, and the estimate is likely to be off by 1.6% or so.
b) The 95% confidence interval is 15.8% +- 3.2%. (NB: You need to estimate the percentage for the town from data for the town: the national figures may not apply. In this case, the town seems pretty close to the national average.)
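The answers to 20.1, 20.5, and 21.1 all rest on the square root law: the SE for a sum of draws is sqrt(number of draws) times the SD of the box. The following is a minimal Python sketch of that arithmetic, not part of the original key. The function names (`se_sum`, `zero_one_sd`, `coin_row`, `percent_estimate`) are made up for illustration, and draws are assumed to be made with replacement (or from a box much larger than the sample).

```python
import math

def se_sum(n_draws, box_sd):
    """SE for the sum of n_draws draws: sqrt(n) times the SD of the box."""
    return math.sqrt(n_draws) * box_sd

def zero_one_sd(fraction_ones):
    """SD of a box that holds only 0s and 1s."""
    return math.sqrt(fraction_ones * (1 - fraction_ones))

def coin_row(n_tosses):
    """One row of the 20.1 table: E(#H), SE(#H), E(%H), SE(%H) for a fair coin."""
    ev_count = n_tosses * 0.5
    se_count = se_sum(n_tosses, zero_one_sd(0.5))
    return (ev_count, se_count,
            100 * ev_count / n_tosses, 100 * se_count / n_tosses)

def percent_estimate(n_sample, n_ones):
    """Estimated percentage and its SE, using the sample fraction
    as a stand-in for the unknown fraction of 1s in the box (as in 21.1)."""
    p_hat = n_ones / n_sample
    se_count = se_sum(n_sample, zero_one_sd(p_hat))
    return 100 * p_hat, 100 * se_count / n_sample

if __name__ == "__main__":
    for n in (100, 2500, 10000, 1000000):
        print(n, coin_row(n))            # rows of the 20.1 table
    print(se_sum(50, 35))                # 20.5: a little under 250 lbs
    print(percent_estimate(500, 79))     # 21.1: roughly 15.8% and 1.6%
```

The key rounds intermediate values (for example, the SD to 0.36), so the unrounded figures printed here can differ slightly in the last digit.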
21.2
a) 99.6%, 0.3 of 1%
b) Can't be done: the box is so lopsided that the normal approximation won't work. See exercises 5-6 on p. 324; exercises 3-4 on p. 383.

21.4
a) The box has millions of tickets, one for each 17-year-old in school that year. Tickets are marked 1 for those who knew that Chaucer wrote The Canterbury Tales, and 0 for the others. The data are like 6000 draws from the box, and the number of students in the sample who know the answer is like the sum of the draws. The fraction of 1's in the box can be estimated from the sample as 0.361. On this basis, the SD of the box is estimated as sqrt(0.361*0.639) ~= 0.48. The SE for the number of students in the sample who know the answer is estimated as sqrt(6000) * 0.48 ~= 37. The SE for the percentage is 37/6000, which is about 0.6 of 1%. The percentage of students in the population who know the answer is estimated at 36.1%, give or take 0.6 of 1% or so. The 95% confidence interval is 36.1% +- 1.2%.
b) 95.2% +- 0.6 of 1%

21.8 This is not the right SE. The bank has mixed up 73 cents with 73%. (The right way to figure the SE for an average is explained in Chapter 23.)

21.10 710/100 = 7.1, so the two options describe the same event in different words. They are the same.

21.11 Option (ii) is it. For example, about 95% of the estimates will be right to within 2 SEs, about 99.7% of them will be right to within 3 SEs, and so forth.

21.12 (i) is irrelevant, (ii) is a histogram for the numbers drawn, and (iii) is a probability histogram for the sum. Reason: (iii) looks like the normal; (ii) looks like a histogram for the contents of the box; (i) would allow 3 and 4 among the draws.

21.13 In all three cases, the expected value is 500 and the SE is about 16. Because you know what is in the box, the expected value and SE can be computed exactly and do not depend on the data. The three chance errors are 29, -16, and 14.
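The chances quoted in 20.3 e), 20.5, and 21.11, and the interval in 21.4 a), all come from areas under the normal curve. The sketch below is one way to check them in Python and is not part of the original key. The helper names (`area_within`, `upper_tail`, `confidence_interval`) are hypothetical, and it assumes the normal approximation applies, which 21.2 b) shows is not always the case.

```python
import math

def area_within(z):
    """Area under the standard normal curve between -z and +z."""
    return math.erf(z / math.sqrt(2))

def upper_tail(z):
    """Area under the standard normal curve to the right of +z."""
    return (1 - area_within(z)) / 2

def confidence_interval(percent, se_percent, n_se=2):
    """Interval 'estimate +- n_se SEs'; 2 SEs gives about 95% confidence."""
    return percent - n_se * se_percent, percent + n_se * se_percent

if __name__ == "__main__":
    print(area_within(0.75))   # 20.3 e): a bit under 55%
    print(upper_tail(2))       # 20.5: a bit over 2%
    print(area_within(2))      # 21.11: about 95%
    print(area_within(3))      # 21.11: about 99.7%
    print(confidence_interval(36.1, 0.6))   # 21.4 a): 36.1% +- 1.2%
```

Using `math.erf` keeps the sketch within the standard library; a normal table gives the same areas to the precision used in the answers.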