All the public Windows machines on campus have S-Plus, not just the machines in the North building (where our computer labs take place). Today we want to do some brief experimentation with S-Plus.
To start S-Plus, click Start, then Programs, then Statistics & Mathematics, then S-PLUS 2000.
A description of the problem and the data (taken from Anderson, Sweeney, and Williams, pp. 115-6) follows:
The National Health Care Association is concerned about the shortage of nurses the health care profession is projecting for the future. To learn the current degree of job satisfaction among nurses, the association has sponsored a study of hospital nurses throughout the country. As part of this study, a sample of 50 nurses were asked to indicate their degree of satisfaction in their work, their pay, and their opportunities for promotion. Each of the three aspects of satisfaction was measured on a scale from 0 to 100, with larger values indicating higher degrees of satisfaction.
The data were broken down by the types of hospitals employing the nurses. The types of hospitals considered were Private (P), Veterans Administration (V), and University (U).
To get this data set, click here. Use the Netscape menus (File, then Save As) to save the data to a file on your desktop (maybe "lab1.txt").
To read the data into S-Plus, choose from the S-Plus menus File, then Import Data, then From file. Choose the file that you have just saved. You should see a spreadsheet with the data in four columns, labeled "Work", "Pay", "Promote", and "Type".
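For readers who want to check the import outside S-Plus, here is a minimal Python sketch of reading a table with the same four columns. It assumes the file is whitespace-delimited with a header row, which this handout does not confirm; the function name read_table and the two sample rows are made up for illustration.

```python
import io

NUMERIC_COLUMNS = ("Work", "Pay", "Promote")

def read_table(lines):
    """Parse whitespace-delimited lines with a header row into a list of dicts.

    ASSUMPTION: the file layout (header row, whitespace separators) is a guess.
    """
    header = None
    rows = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if header is None:
            header = fields  # first non-blank line holds the column names
            continue
        row = dict(zip(header, fields))
        for name in NUMERIC_COLUMNS:
            row[name] = float(row[name])  # the satisfaction scores are numeric
        rows.append(row)
    return rows

# Made-up sample standing in for the real file (NOT the actual lab data).
sample = io.StringIO("Work Pay Promote Type\n70 50 60 P\n75 40 50 V\n")
rows = read_table(sample)
print(rows)
```

With the real file, the same function could be applied to an open file handle, for example open("lab1.txt"), provided the layout assumption holds.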
We are interested in the following questions:
One type of graph that is especially helpful for visualizing the data and answering these questions is the boxplot. The main body of the boxplot (a box with the median line in its interior) marks the first quartile, the median, and the third quartile. The lines extending from the ends of the box are sometimes called "whiskers". By default, S-Plus draws each whisker to the most extreme value in the data set that is not beyond a span of 1.5*IQR from the quartile, where IQR (the interquartile range) is the third quartile minus the first quartile. Any data points beyond this range (often called "outliers") are drawn individually beyond the "whiskers".
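To make the pieces of a boxplot concrete, the following Python sketch computes the quartiles, the IQR, the whisker ends, and the outliers for a list of scores. It is only an illustration under stated assumptions: the scores are made up, the function name boxplot_stats is hypothetical, and the quartiles use Python's "inclusive" interpolation, which may differ slightly from the quartile rule S-Plus applies.

```python
import statistics

def boxplot_stats(values, span=1.5):
    """Return the quantities a boxplot displays, plus any outliers."""
    data = sorted(values)
    # ASSUMPTION: "inclusive" quartiles; S-Plus may use a different rule.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - span * iqr
    high_fence = q3 + span * iqr
    # Whiskers stop at the most extreme data values inside the fences.
    inside = [v for v in data if low_fence <= v <= high_fence]
    outliers = [v for v in data if v < low_fence or v > high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,
    }

# Made-up satisfaction scores (NOT the lab data); 12 is deliberately extreme.
scores = [52, 60, 64, 68, 70, 72, 74, 78, 80, 12]
stats = boxplot_stats(scores)
print(stats)
```

In this made-up example the score of 12 falls below the lower fence, so it would be drawn as an individual point, and the lower whisker would stop at 52.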
First, we want to compare how the nurses feel about their work, pay, and promotion opportunities. We want a boxplot for each of these response variables. Choose Graph, then 2D Plot. From the list of graphs, choose "Box Plot" and OK. A menu will appear, with most of the boxes that you need already filled in. However, you will need to list the variable(s) that you want to plot in "y columns". To get side-by-side boxplots (easier for comparison) for each of the variables, enter the names of all three variables separated by commas; entering "Work, Pay, Promote" will yield three boxplots, in that order. This graphical display should help you answer the first question. Extra: The software is not smart about the labels. How can you change them?
To answer the second question, it will be useful to make a set of boxplots for each response variable, with the data separated and grouped by type of hospital. To tell S-Plus that the variable "Type" designates different groups, we enter the variable "Type" in the "x columns" box. The "y columns" box contains the quantitative variable of interest, either "Work", "Pay", or "Promote". (It's easiest to make a different graph for each.) Now, we can compare and contrast how the nurses in various hospitals feel about work, pay, and/or promotions in their respective types of hospitals.
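The grouping that the "x columns" box performs can be sketched in Python as follows: the rows are split by the value of "Type", and the chosen response variable is collected separately for each group. The rows shown are made up for illustration, and the helper name group_by is hypothetical.

```python
from collections import defaultdict
import statistics

# Made-up rows in the same four-column layout (NOT the lab data).
rows = [
    {"Work": 70, "Pay": 50, "Promote": 60, "Type": "P"},
    {"Work": 80, "Pay": 60, "Promote": 55, "Type": "P"},
    {"Work": 75, "Pay": 40, "Promote": 50, "Type": "V"},
    {"Work": 85, "Pay": 44, "Promote": 65, "Type": "V"},
    {"Work": 90, "Pay": 55, "Promote": 70, "Type": "U"},
    {"Work": 65, "Pay": 65, "Promote": 45, "Type": "U"},
]

def group_by(rows, group_col, value_col):
    """Collect the values of value_col separately for each level of group_col."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_col]].append(row[value_col])
    return dict(groups)

pay_by_type = group_by(rows, "Type", "Pay")

# One line per hospital type: group size and median, the center of each box.
for hospital_type in sorted(pay_by_type):
    values = pay_by_type[hospital_type]
    print(hospital_type, len(values), statistics.median(values))
```

Each list in pay_by_type corresponds to one box in the grouped plot; changing "Pay" to "Work" or "Promote" gives the groups for the other two graphs.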
At some point, you may want to obtain numerical summary statistics for the data (mean, quartiles, etc.). You can use Statistics, then Data summaries, then Summary statistics. The tab entitled "Data" is similar to the one we saw for the boxplot. The "Statistics" tab is where you specify which particular summary statistics you'd like to calculate.
Don't forget to log out of your PC when you are done!
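As a rough cross-check on the kind of output the Summary statistics dialog produces, here is a small Python sketch that computes a few common summaries for one variable. The values are made up, the function name summarize is hypothetical, and the quartile rule ("inclusive" interpolation) is an assumption that may not match S-Plus exactly.

```python
import statistics

def summarize(values):
    """Return a few common summary statistics for one numeric variable."""
    # ASSUMPTION: "inclusive" quartiles; S-Plus may compute them differently.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "n": len(values),
        "min": min(values),
        "q1": q1,
        "median": median,
        "mean": statistics.mean(values),
        "q3": q3,
        "max": max(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
    }

# Made-up "Work" scores (NOT the lab data).
work = [70, 80, 75, 85, 90, 65]
summary = summarize(work)
for name, value in summary.items():
    print(name, value)
```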