This lab analyzes a data set from Weisberg's book Applied Linear Regression. Weisberg writes:
Aerial survey methods are regularly used to estimate the number of snow geese in their summer range areas west of Hudson Bay in Canada. To obtain estimates, small aircraft fly over the range and, when a flock of geese is spotted, an experienced person estimates the number of geese in the flock. To investigate the reliability of this method of counting, an experiment was conducted in which an airplane carrying two observers flew over n=45 flocks, and each observer made an independent estimate of the number of birds in each flock. Also, a photograph of the flock was taken so that an exact count of the number of birds in the flock could be made. The columns in the data set are Photo, Observer 1 and Observer 2, in order.
Begin by reading in the data:
photo <- c(56, 38, 25, 48, 38, 22, 22, 42, 34, 14,
           30, 9, 18, 25, 62, 26, 88, 56, 11, 66,
           42, 30, 90, 119, 165, 152, 205, 409, 342, 200,
           73, 123, 150, 70, 90, 110, 95, 57, 43, 55,
           325, 114, 83, 91, 56)
observer1 <- c(50, 25, 30, 35, 25, 20, 12, 34, 20, 10,
               25, 10, 15, 20, 40, 30, 75, 35, 9, 55,
               30, 25, 40, 75, 100, 150, 120, 250, 500, 200,
               50, 75, 150, 50, 60, 75, 150, 40, 25, 100,
               200, 60, 40, 35, 20)
observer2 <- c(40, 30, 40, 45, 30, 20, 20, 35, 30, 12,
               30, 10, 18, 30, 50, 20, 120, 60, 10, 80,
               35, 30, 120, 200, 200, 150, 200, 300, 500, 300,
               40, 80, 120, 60, 100, 120, 150, 40, 35, 110,
               400, 120, 40, 60, 40)
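Before plotting, it can help to confirm that the three vectors were typed in with the same length. The sketch below bundles them into one data frame. The helper name make_geese is hypothetical and not part of the lab.

```r
# Hypothetical helper (not from the lab): bundle the three count vectors
# into one data frame, stopping early if their lengths differ.
make_geese <- function(photo, observer1, observer2) {
  stopifnot(length(photo) == length(observer1),
            length(photo) == length(observer2))
  data.frame(photo = photo, observer1 = observer1, observer2 = observer2)
}
```

With the vectors above in the workspace, geese <- make_geese(photo, observer1, observer2) should give nrow(geese) equal to 45, the number of flocks stated in the description, and summary(geese) gives a quick look at the range of each column.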
Make scatter plots of Y=photo count versus X=observer1 and versus X=observer2. Do these plots suggest that linear regression is appropriate? Why or why not?
Make scatter plots of Y=sqrt(photo count) versus X=sqrt(observer1) and versus X=sqrt(observer2). Do these plots suggest that linear regression is appropriate? Why or why not?
Make scatter plots of Y=log(photo count) versus X=log(observer1) and versus X=log(observer2). Do these plots suggest that linear regression is appropriate? Why or why not?
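The three sets of scatter plots above differ only in the transformation applied to both axes, so one possible approach is a small function that takes the transformation as an argument. This is a minimal sketch using base R graphics. The function name plot_vs_observers and its arguments f and fname are hypothetical choices for illustration.

```r
# Hypothetical helper: plot the photo count against each observer's estimate,
# side by side, after applying the same transformation `f` to both axes.
# `fname` is only used to build the axis labels, e.g. "sqrt" or "log".
plot_vs_observers <- function(photo, observer1, observer2,
                              f = identity, fname = "") {
  lab <- function(v) if (nzchar(fname)) paste0(fname, "(", v, ")") else v
  old <- par(mfrow = c(1, 2))  # two panels in one row
  on.exit(par(old))            # restore the previous layout when done
  plot(f(observer1), f(photo), xlab = lab("observer1"), ylab = lab("photo"))
  plot(f(observer2), f(photo), xlab = lab("observer2"), ylab = lab("photo"))
  invisible(NULL)
}
```

With the data read in, plot_vs_observers(photo, observer1, observer2) draws the untransformed plots, while passing f = sqrt, fname = "sqrt" or f = log, fname = "log" draws the transformed versions. When comparing them, look at whether the points follow a straight line and whether the vertical spread stays roughly constant across the range of X.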
Make a scatter plot of observer1 versus observer2. Use the log scale if you like. Add the 45-degree line to the plot by calling abline(0,1) immediately after the plot command. Does one observer tend to guess higher than the other? Which one?
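One way to set up this comparison is sketched below, on the log scale. The function name compare_observers is hypothetical, and the tally it returns is an addition not requested by the lab; it simply counts on which side of the 45-degree line each flock falls.

```r
# Hypothetical helper: plot log(observer1) against log(observer2), add the
# 45-degree line, and count how often each observer gave the higher estimate.
compare_observers <- function(observer1, observer2) {
  plot(log(observer2), log(observer1),
       xlab = "log(observer2)", ylab = "log(observer1)")
  abline(0, 1)  # intercept 0, slope 1: points above it have observer1 > observer2
  c(obs1_higher = sum(observer1 > observer2),
    equal       = sum(observer1 == observer2),
    obs2_higher = sum(observer2 > observer1))
}
```

Calling compare_observers(observer1, observer2) with the lab data draws the plot and returns the three counts. Because both axes use the same transformation, the line with intercept 0 and slope 1 still marks the points where the two estimates agree.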
Use the regression of Y=log(photo count) versus X=log(observer1) to answer the following questions.
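A minimal sketch of fitting this regression with lm is given below. The function name fit_log_regression and the column names log_photo and log_obs are hypothetical; only lm, abline, and the log transformation come from standard R.

```r
# Hypothetical helper: regress log(photo) on log(observer), draw the fitted
# line over the scatter plot, and return the fitted model object.
fit_log_regression <- function(photo, observer) {
  d <- data.frame(log_photo = log(photo), log_obs = log(observer))
  fit <- lm(log_photo ~ log_obs, data = d)
  plot(d$log_obs, d$log_photo, xlab = "log(observer)", ylab = "log(photo)")
  abline(fit)  # fitted least-squares line
  fit
}
```

With the lab data, fit <- fit_log_regression(photo, observer1) returns the model. From there, summary(fit) reports the coefficient estimates and standard errors, confint(fit) gives confidence intervals for the intercept and slope, and plot(fitted(fit), resid(fit)) is a quick residual check.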
Weisberg continues:
As a result of this experiment, the practice of using visual counts of flock size to determine population estimates was discontinued in favor of using photographs.