Lab 7 (Week of 3/1/99)

Step 0: Download this week's SAS/Insight program

Click here and this week's program will appear in your browser window. Choose "File>Save As..." in Netscape, select "Format for Saved Document: Text", then click "OK". The program is now saved in your account (in your home directory, by default) under the name "lab7.sas". Return to this page by choosing "GO>Back" from the Netscape menu bar. To get started, type "sas lab7 &" in one of the terminals open on your screen.

Step 1: Questions

This week's data set has 8 variables. Four correspond to sample means and the other four to sample medians, all drawn from a population with mean 61.6 and variance 211.2. The data represent age at diagnosis of a particular disease; the population consists of women living in a large U.S. government study region. Mu4 is a column of 100 sample means, each calculated from a random sample of size 4 drawn from this population; Mu16, Mu100, and Mu1600 each contain 100 sample means for samples of size 16, 100, and 1600, respectively. Columns Med4, Med16, Med100, and Med1600 contain the sample medians for samples of the indicated size.

1) Distribution of the Sample Mean. Before you begin this problem, open a new spreadsheet to record your findings. Calculate summaries of the distribution of each of the four columns Mu4, Mu16, Mu100, and Mu1600. Note the shape of each histogram. Enter the associated n (4, 16, 100, or 1600) in column 1 of the empty spreadsheet and the standard deviation of each column in column 2. The central limit theorem says that these histograms (at least the latter two) should have approximately what distribution? Are the histograms consistent with this? The variance of the sample mean, as a function of n, is the population variance divided by n. For each of the four values of n, calculate the theoretical variance of the sample mean (you can do this in the second spreadsheet). Compare the theoretical values to the observed values. Because you recorded standard deviations, put both on the same scale before comparing: either square the observed standard deviation or take the square root of the theoretical variance.
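The theoretical values in problem 1 can also be checked outside the spreadsheet. The following is a minimal Python sketch, not part of the lab's SAS program; the function name mean_theory is hypothetical, and the only inputs are the population variance (211.2) and the four sample sizes given above.

```python
import math

POP_VARIANCE = 211.2               # population variance stated in the lab text
SAMPLE_SIZES = [4, 16, 100, 1600]  # sample sizes behind Mu4, Mu16, Mu100, Mu1600


def mean_theory(n, pop_variance=POP_VARIANCE):
    """Return (variance, standard deviation) of the sample mean for sample size n.

    Uses Var(sample mean) = population variance / n.
    """
    variance = pop_variance / n
    return variance, math.sqrt(variance)


# Print one row per sample size, mirroring the two spreadsheet columns
# (n and standard deviation) plus the theoretical variance.
for n in SAMPLE_SIZES:
    var, sd = mean_theory(n)
    print(f"n={n:5d}  theoretical variance={var:9.4f}  theoretical SD={sd:7.4f}")
```

The standard deviation column is the one to set beside the values recorded in column 2 of your spreadsheet.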

2) Distribution of the Sample Median. Repeat problem 1 for the columns of sample medians (Med4, Med16, Med100, and Med1600). Note that, since the parent population is approximately normal, the theoretical variance of the sample median is approximately 1.57 times the population variance divided by n (a result from class). Also note that the central limit theorem applies to means, not medians.
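The same calculation for problem 2 might look like the sketch below. It is again a hedged illustration rather than part of the lab's SAS program: the function name median_theory is hypothetical, and the factor 1.57 is taken directly from the statement above, so the results are approximations that assume an approximately normal parent population.

```python
import math

POP_VARIANCE = 211.2               # population variance stated in the lab text
SAMPLE_SIZES = [4, 16, 100, 1600]  # sample sizes behind Med4, Med16, Med100, Med1600
MEDIAN_FACTOR = 1.57               # factor quoted in the lab text (close to pi / 2)


def median_theory(n, pop_variance=POP_VARIANCE, factor=MEDIAN_FACTOR):
    """Return approximate (variance, standard deviation) of the sample median.

    Uses Var(sample median) ~ factor * population variance / n.
    """
    variance = factor * pop_variance / n
    return variance, math.sqrt(variance)


# Print one row per sample size for comparison with the observed values.
for n in SAMPLE_SIZES:
    var, sd = median_theory(n)
    print(f"n={n:5d}  approx. variance={var:9.4f}  approx. SD={sd:7.4f}")
```

For every n, the approximate variance of the median is 1.57 times the theoretical variance of the mean, so the median columns should look more spread out than the matching mean columns.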


iversen@stat.duke.edu
last updated 2 March 1999