23.1 The sum of 400 draws will be around 400 * 100 = 40,000, give or take sqrt(400) * 20 = 400 or so. The average of 400 draws will be around 100, give or take 1 or so. a) The chance is almost 100%. b) The range is "expected value +- 1 SE," so the chance is about 68%.

23.3 Model: There is a box with 50,000 tickets, one for each household in the town. The ticket shows the commute distance for the head of household. The data are like 1000 draws from the box. The SD of the box is unknown, but can be estimated by the SD of the data, as 9.0 miles. The SE for the sum of the draws is estimated as sqrt(1000) * 9 ~= 285 miles, and the SE for the average is estimated as 285/1000 ~= 0.3 miles. a) The average commute distance of all 50,000 heads of households in the town is estimated as 8.7 miles; this estimate is likely to be off by 0.3 miles or so. b) 8.7 miles +- 0.6 miles. Comment: The normal curve is used to approximate the probability histogram for the sample average, not the histogram for the data; the data are skewed, with a long right-hand tail.

23.4 Can't be done with the information given. This is a simple random sample of households, but a cluster sample of people. The cluster is the household, and people in a household are likely to be similar with respect to commuting. For example, if a household is far from the center of town, all the occupants are likely to have a long commute. The SE is going to be bigger than the SE for a simple random sample of 2500 persons (section 22.5).

23.5 Model: There is a box with 50,000 tickets, one for each household in the town. The ticket is marked 1 if the head of household commutes by car; otherwise, 0. The data are like 1000 draws from the box. The fraction of 1's in the box is unknown, but can be estimated by the fraction in the sample, as 0.721. On this basis, the SD of the box is estimated as sqrt(0.721 * 0.279) ~= 0.45. The SE for the number of 1's in 1000 draws is estimated as sqrt(1000) * 0.45 ~= 14.
The SE for the percentage of 1's is 14/1000, or 1.4%. The percentage of 1's in the box is estimated as 72.1%, give or take 1.4% or so. The 95% confidence interval is 72.1% +- 2.8%. Comment: We have a simple random sample of households, and are making an inference about households.

23.7 Option (iii): this is a sample of convenience (p. 424).

23.8 a) True: the interval is "average +- SE." b) True: section 21.3. c) The data don't follow the normal curve, but 68% might be right; you need the data to tell. (To see that the data don't follow the curve: enrollments can't be negative, but the SD was a lot bigger than the average, so there must have been a long right-hand tail.) d) False: 325 is not the SD. (The data aren't normal, which is another problem.) e) False. The normal curve is being used on the probability histogram for the sample average, not the data (p. 411 and pp. 418-19).

23.11 There is too much spread in the histogram: the SE for the average is only about 0.3, but there is a lot of area outside the range EV +- 3 SE.

23.12 This is not a 95% confidence interval: the class is a sample of convenience, not a probability sample.

24.1 a) The elevation is estimated by the mean of the readings, 84,411 inches; this estimate is likely to be off by 6 inches or so. Model the readings as 25 draws from a box with unknown mean and SD, and estimate both from the sample: the mean as 84,411 inches, and the SD of the box by the SD of the sample, 30 inches. SE for the sum of 25 draws = sqrt(25) * 30 inches = 150 inches. SE for the average = (SE for the sum)/(number of draws) = 150/25 = 6 inches. b) True. The SE for the average measures the uncertainty in the estimate. c) False. You know the sample average, so there is no confidence interval. d) False. There is a 95% chance that the next reading will be within the interval mean +- 2 SD. The SD is for an individual reading; the SE is for the sample average. e) False. (The reasoning is similar to d.) f) False. See exercise #8, p. 421.

24.3 SE for the average = sqrt(2500) * 14 km/s / 2500 = 0.28 km/s. The interval is average +- 2 SE: 299,774 +- 0.56 km/s.

24.4 The SD of the measurements.
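The arithmetic in 24.1 and 24.3 follows the same two steps (SE for the sum, then divide by the number of draws), so it can be checked with a short script. This is only a sketch: the function names se_of_average and interval_plus_minus_2se are made up for illustration, and the SD of the box is assumed to be well estimated by the SD of the sample, as in the answers above.

```python
import math

def se_of_average(sd, n):
    """SE for the average of n draws from a box with the given SD."""
    se_sum = math.sqrt(n) * sd   # square root law: SE for the sum of the draws
    return se_sum / n            # SE for the average = SE for the sum / number of draws

def interval_plus_minus_2se(average, sd, n):
    """Interval 'average +- 2 SE', used above as the 95% confidence interval."""
    se = se_of_average(sd, n)
    return (average - 2 * se, average + 2 * se)

# 24.1: 25 readings with an SD of 30 inches
se_24_1 = se_of_average(30, 25)

# 24.3: 2500 measurements with an SD of 14 km/s, average 299,774 km/s
se_24_3 = se_of_average(14, 2500)
low_24_3, high_24_3 = interval_plus_minus_2se(299_774, 14, 2500)

print("24.1 SE for the average (inches):", se_24_1)
print("24.3 SE for the average (km/s):", round(se_24_3, 2))
print("24.3 interval (km/s):", round(low_24_3, 2), "to", round(high_24_3, 2))
```

The same function applies to 23.3 (SD 9.0 miles, 1000 draws), where it gives roughly 0.28 miles, which the answer rounds to 0.3.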
24.5 If there is a mistake in the length measurements, then that would bias the results.

24.9 a) This is like 4 draws from a box. Expected weight = 16 oz. SE = sqrt(4) * 0.05 = 0.1 oz. b) This is like 400 draws from a box. Expected weight = 100 pounds. SE = sqrt(400) * 0.05 = 1 oz. So there is a 95% chance that the total weight is within 2 oz of 100 pounds.

24.11 This procedure, remeasuring data until you get some agreement, could bias the results. Take as an example a class where two TAs grade every test. If the TAs disagree on the letter grade, then a third TA grades it. If all three TAs disagree, then the professor grades it. Imagine that the first TA tends to give Bs, and the second and third TAs tend to give As. The procedure suggested would tend to give more As, since the third TA would often agree with the second one. This introduces _bias_, a systematic change in the measurement procedure. A better way of grading would be to take the average of all three TAs' grades.
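The grading example in 24.11 can be made concrete with a small calculation. Everything numerical here is an assumption made for illustration, not part of the exercise: grades are only A or B, the three TAs grade independently, the first TA gives an A 20% of the time, and the second and third give an A 80% of the time. With only two possible grades the three TAs can never all disagree, so the professor never steps in.

```python
# Hypothetical chances that each TA gives an A rather than a B (assumed values).
P_A_TA1 = 0.2   # first TA tends to give Bs
P_A_TA2 = 0.8   # second TA tends to give As
P_A_TA3 = 0.8   # third TA tends to give As

def chance_of_a_with_third_grader(p1, p2, p3):
    """Chance the final grade is an A when a third TA settles disagreements."""
    both_a = p1 * p2                           # first two TAs agree on A
    disagree = p1 * (1 - p2) + (1 - p1) * p2   # first two TAs give different grades
    return both_a + disagree * p3              # on disagreement, the third TA decides

def average_share_of_a(p1, p2, p3):
    """Average fraction of A's across all three TAs' grades."""
    return (p1 + p2 + p3) / 3

with_procedure = chance_of_a_with_third_grader(P_A_TA1, P_A_TA2, P_A_TA3)
with_averaging = average_share_of_a(P_A_TA1, P_A_TA2, P_A_TA3)

print("Chance of an A under the regrading procedure:", round(with_procedure, 3))
print("Average share of A's across the three TAs:", round(with_averaging, 3))
```

Under these assumed numbers the regrading procedure gives an A about 70% of the time, against an average share of 60% A's across the three graders, which is the direction of the bias described above. Other assumed probabilities would change the size of the gap.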