Lab 1 Program and Questions (Week of 1/19/98)

Step 1: Download SAS/Insight program with this week's data

Click here and the program will appear in your browser window. Click on "File>Save As..." in Netscape and choose "Format for Saved Document: Text" then click "OK". The program is now saved in your account (in your home directory, by default). The file's name is "lab1.sas". Return to this page by choosing "GO>Back" from the Netscape menu bar.

This week's data set is the same data we've been looking at in class: data on homes sold in two zip codes (33134 and 33146) located in Dade County, Florida. There are 339 observations and 9 variables. The variables are: 'Zip Code', 'Year Built', 'Sqr Feet', 'Bedrooms', 'Bathroom', 'Floors', 'Lot Size', 'Sale Amt', and 'Yr Sold'. Let's have a look at it...

Step 2: Start the SAS/Insight program with this week's data

To get started, type "sas lab1 &" in one of the terminals open on your screen. You should be in the same directory in which you saved the file "lab1.sas". A spreadsheet will appear. It will have 9 columns, one for each variable in the dataset, and 339 rows, one for each observation. Observations are numbered (i=) 1 to (i=) 339 in the leftmost column, which holds the row labels.

Step 3: Questions

1) What fraction of the observations are homes from zip code 33134? Click "Analyze > Distribution" (Analyze is the menu option, Distribution is a sub-menu option) and a menu will appear. Click on "ZIPCODE" and then click on "Y"; "ZIPCODE" will appear in the box below the "Y". Click "OK". A window with plots (a rectangular pie chart and a bar chart) will appear. To display values on the plots, click the little arrow in the bottom left of the plot and choose "values". This will answer your question. Another way of answering the question is to produce a table of frequencies by choosing "Tables > Frequency Table" from the top of the plot window. When you are done, click "File > End" on the plot window to close it.

2) Produce the same plots for the variable "BATHS" (using the "Analyze>Distribution" menu). Looking at the bar chart, what is the modal category? What fraction of observations are homes with 2 bathrooms?

3) Are there more small homes in zip code 33134 than in 33146? Use the number of bathrooms to judge size. Produce the plots from Question 2), but this time for the two zip codes separately, using the "Analyze>Distribution" menu. You need to do only one thing differently from Question 2): after you click on "BATHS" and put it under "Y", click on "ZIPCODE" and then click on "GROUP", then click "OK" to produce separate analyses by zip code. Resize the plot window to see all 4 plots (ask your TA how). Which zip code in the sample has more homes with 2 or fewer bathrooms?

4) Is it likely that the difference we observe between the two zip codes is due to sampling error? On paper, use equation 1-2 to construct 95% confidence intervals for the proportion of homes in each zip code with 2 or fewer bathrooms. What do you conclude?

5) The data are a sample of homes that sold in 1994 and 1995. Is this necessarily a representative sample of homes? Is it possible that estimates (sample proportions or confidence intervals) derived from this sample are biased? Can you think of any confounding factors?
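
If you want to check your hand calculations for Questions 3) and 4), the tally and the interval can be sketched in a few lines of Python. This is only an illustration and not part of the lab: it assumes that equation 1-2 is the usual large-sample interval p-hat +/- 1.96 * sqrt(p-hat * (1 - p-hat) / n) (check this against your notes), and the records and the function names below are hypothetical, made up for the example rather than taken from lab1.sas.

```python
import math
from collections import Counter

# Hypothetical toy records (zip code, number of bathrooms).
# These are NOT the lab data; they only illustrate the arithmetic.
homes = (
    [("33134", 2)] * 30 + [("33134", 3)] * 20
    + [("33146", 2)] * 20 + [("33146", 3)] * 30
)


def tally_small_homes(records, max_baths=2):
    """Return {zip_code: (homes with <= max_baths bathrooms, total homes)}."""
    totals = Counter(zip_code for zip_code, _ in records)
    small = Counter(
        zip_code for zip_code, baths in records if baths <= max_baths
    )
    return {code: (small[code], totals[code]) for code in sorted(totals)}


def proportion_ci(successes, n, z=1.96):
    """Large-sample interval for a proportion.

    Assumed form of equation 1-2: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n).
    Returns (p_hat, lower limit, upper limit). The approximation is only
    reasonable when n is fairly large.
    """
    p_hat = successes / n
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, p_hat - half_width, p_hat + half_width


# One line of output per zip code: sample proportion and 95% interval.
for code, (small, total) in tally_small_homes(homes).items():
    p_hat, lower, upper = proportion_ci(small, total)
    print(f"{code}: {small}/{total} = {p_hat:.3f}, "
          f"95% CI ({lower:.3f}, {upper:.3f})")
```

To use it with the real data, replace the toy counts with the counts you read off the frequency tables in SAS/Insight. If the two intervals overlap a great deal, sampling error is a plausible explanation for the observed difference; if they are well separated, it is less plausible.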

Step 4: Stop the SAS/Insight program

Click on "File>End" on the SAS/Insight menu bar to quit the program.
Return to the Stat 110B lab page.
iversen@stat.duke.edu
last updated 3 September 1997