S-Plus is an interactive environment for graphics and scientific computation with a range of statistical modelling and analysis tools. We'll use S-Plus on the Duke Statistics unix machines (for Duke Statistics graduate students) and on the acpub unix systems (for other students). Go to OIT's acpub unix pages if you're not already familiar with the basics of unix workstations and the public clusters. Lab 2 provides a walk-through introduction to S-Plus. The best way to learn is to learn by doing. The S-Plus environment has a comprehensive on-line, window-based help system once you get beyond the basics.
S-Plus can be run from within the emacs editor. This allows you to easily re-use and edit S-Plus commands you used earlier. Click on the "S-Plus within emacs" link on the list above to get going. If you do this, please remember to quit S-Plus before logging off the computer, or you might spawn a runaway S-Plus process.
SOME S-PLUS BASICS AND COMMONLY USED COMMANDS
A. STARTUP AND ENVIRONMENT
Splus to start up the Splus system
motif() to open a graphics window for plots on screen
help.start() to open the help window system
or you may need to use help.start(gui="motif")
B. INPUT AND OUTPUT
x _ scan("filename") to read numbers from a file
x _ read.table(file="afilename") to read a table (matrix) of data
source("sfile.s") to read Splus code from a file
sink("file") to write everything that follows to a file
sink() .. and back to the screen
write(..) write to file -- check help
write.table(..) write to file -- check help
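As a rough illustration of the input and output commands above, here is a small write-then-read round trip. It is only a sketch: the data are made up, tempfile() is assumed to be available to supply a scratch file name (any file name will do), and it uses <- for assignment, which is equivalent to _.

```r
# Hypothetical data: the numbers 1 to 20
x <- 1:20
# Scratch file name (assumption: tempfile() is available; any name works)
f <- tempfile()
write(x, file=f, ncolumns=5)            # writes 4 lines of 5 numbers each
y <- scan(f)                            # reads the numbers back as one long vector
z <- matrix(scan(f), ncol=5, byrow=T)   # or reads them straight into a 4x5 matrix
```

Reading with byrow=T recovers the layout of the file, since write() puts the numbers out line by line.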
C. ORGANISING DATA
x _ matrix(x,ncol=5,byrow=T) reorganises data in x to a 5 column matrix
x _ matrix(x,12,5) reorganises data in x to a 12x5 matrix, by column
x _ matrix(scan("filename"),12,5) etc
x _ c(x,c(1,10,y),1:10) c() means "catenate" to create vectors
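The difference between filling a matrix by row and by column is easiest to see on a tiny vector. A minimal sketch with made-up data, using <- for assignment:

```r
x <- 1:6
a <- matrix(x, ncol=3, byrow=T)   # filled row by row: rows are 1 2 3 and 4 5 6
b <- matrix(x, 2, 3)              # filled column by column: rows are 1 3 5 and 2 4 6
v <- c(x, c(1, 10), 1:3)          # catenate three pieces into one vector of length 11
```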
D. RANDOM VARIATE GENERATION AND DISTRIBUTIONS
sample to sample with or without replacement from a set
dnorm() normal pdf
pnorm() normal cdf
qnorm() normal quantile function, inverse of pnorm()
rnorm() generate normal random variates
Others: ?dist, where the prefix ? is one of d, p, q, r, with parameters as follows:

  dist     Distribution   Parameters       Defaults
  ------   ------------   --------------   --------
  beta     beta           shape1, shape2   -, -
  cauchy   Cauchy         loc, scale       0, 1
  chisq    chi-square     df               -
  exp      exponential    rate             1
  f        F              df1, df2         -, -
  gamma    Gamma          shape            -
  norm     normal         mean, sd         0, 1
  t        Student's t    df               -
  unif     uniform        min, max         0, 1
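The d, p, q, r naming pattern can be tried out as follows. This is a sketch with arbitrary argument values, using <- for assignment; parameters shown with a default of "-" in the table have to be supplied.

```r
p <- pnorm(1.96)               # normal cdf at 1.96, roughly 0.975
q <- qnorm(p)                  # the quantile function undoes pnorm, giving back 1.96
d <- dexp(0, rate=2)           # exponential pdf with rate 2, evaluated at 0
u <- runif(5, min=0, max=10)   # five uniform random variates between 0 and 10
g <- rgamma(5, shape=2)        # shape has no default, so it must be given
```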
E. PLOTTING
par(mfrow=c(3,2)) 6 plots laid out in 3 rows of 2
par(....) many arguments to control display, such as
mfrow=c(1,1) plot layout on screen/page
bty="n" no frame drawn around graph
and see help for others
hist(x,nclass=25,prob=T) histogram
plot(x,y) scatter plot
plot(x,y,type="l") line plot
plot(x,y, ...) many arguments to control display, such as
type="l" for lines
xlim=c(0,10) range of plot on horizontal axis
ylim=c(0,1) ..
xlab="here is a label on the x-axis"
ylab=..
col=2 use second colour
lty=3 use third (broken) line type
lwd=2 twice the usual line width
lines(x, y) add lines to a plot
points(x, y) add points to a plot
tsplot(x) plots a vector vs 1,2,... (time series plot)
qqnorm(x) normal quantile plot
boxplot(x, ..) boxplot
mtext(side=3, line=0, cex=2, outer=T,
"This is an Overall Title For the Page")
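Several of the plotting commands above can be combined on one page. The following is a sketch only: the data are simulated, it assumes a graphics device (such as a motif() window) is already open, and the oma setting is an addition not listed above that reserves outer margin space so the overall title has somewhere to go.

```r
# Assumes a graphics device is already open, e.g. via motif()
x <- rnorm(200)                          # hypothetical data
y <- x + rnorm(200, mean=0, sd=0.5)
par(mfrow=c(2,2), oma=c(0,0,3,0))        # 2x2 layout; oma leaves room for a page title
hist(x, nclass=25, prob=T)               # histogram on the density scale
plot(x, y, xlab="x", ylab="y")           # scatter plot
plot(sort(x), type="l", lty=3, lwd=2)    # broken line, twice the usual width
qqnorm(x)                                # normal quantile plot
mtext(side=3, line=0, cex=1.5, outer=T, "Four views of the same data")
```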
F. ASSIGNMENT AND BASIC ARITHMETIC OPERATORS
<- or _ Assignment
* Multiply
+ Add
- Subtract
/ Divide
^ Exponentiation
G. SEQUENCE AND REPETITION
x _ 1:50
x _ seq(1,50,by=10)
x _ seq(1,50,length=50)
x _ seq(0,1,length=100)
x _ rep(y,10)
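To see what the sequence and repetition commands produce, try them on small cases. A minimal sketch with made-up values, using <- for assignment:

```r
a <- seq(1, 50, by=10)      # 1 11 21 31 41
b <- seq(0, 1, length=5)    # 0.00 0.25 0.50 0.75 1.00
y <- c(1, 2)
r <- rep(y, 3)              # 1 2 1 2 1 2
```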
H. SUBSCRIPTS
[ ] Vector subscript x[3]_101; y _ x[20:1]+1
[,] Matrix subscript x[1,5]_0; y[1:10,]
I. RELATIONAL OPERATORS
== Equal-to
!= Not-equal-to
< Less-than
<= Less-than-or-equal-to
> Greater-than
>= Greater-than-or-equal-to
J. CONDITIONALS
if (i==1) x_10
if (i>0) { x_10; y_20 } else x_y_0
(when typing at the prompt, keep else on the same line as the closing brace)
K. ITERATION
for (i in 1:10) { x[i]_i; y_c(i,y); ... }
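Subscripts, relational operators, conditionals and iteration fit together as in this small sketch. The values are arbitrary, and the last line uses a logical subscript, which is not listed above but is a standard way to pick out elements.

```r
x <- numeric(10)                  # a vector of ten zeros
for (i in 1:10) {
  if (i <= 5) { x[i] <- i } else { x[i] <- 0 }
}
# x is now 1 2 3 4 5 0 0 0 0 0
big <- x[x > 2]                   # logical subscript: the elements greater than 2
```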
L. ARITHMETIC OPERATORS AND FUNCTIONS
abs(x)
cos(x), sin(x), acos(x), etc
exp(x)
gamma(x)
log(x)
log(x, base=exp(1))
max(...), min(...)
mean(x)
median(x)
mode(x)
summary(x)
quantile(x, probs=c(0,.25,.5,.75,1))
sum(...)
var(x,y)
cor(x,y)
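A few of these functions applied to a small made-up data set, as a sketch. Note that mode(x) reports how x is stored, not the most frequent value.

```r
x <- 1:9
y <- 2 * x
m   <- mean(x)        # 5
md  <- median(x)      # 5
v   <- var(x)         # sample variance of x, 7.5
cxy <- var(x, y)      # covariance of x and y, 15
r   <- cor(x, y)      # 1, since y is an exact linear function of x
mode(x)               # "numeric": the storage mode, not a statistical mode
```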
M. ORDERING AND REORDERING DATA
sort(x) delivers the elements of x sorted in increasing order
order(x) delivers the index vector such that x[order(x)] gives sort(x)
rank(x) delivers the ranks of corresponding elements of x
Example: x _ c(5,1,7,3)
order(x) delivers 2 4 1 3 since x[2] is smallest, x[4] the next, etc
sort(x) delivers 1 3 5 7 -- the same as x[order(x)]
rank(x) delivers 3 1 4 2 since x[1] is ranked 3, x[2] is ranked 1, etc
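A common use of order() is to rearrange the rows of a matrix according to one of its columns. A minimal sketch with a made-up matrix whose first column is the x of the example above:

```r
m <- cbind(c(5, 1, 7, 3), c(10, 20, 30, 40))   # hypothetical 4x2 data matrix
o <- order(m[, 1])                              # 2 4 1 3, as in the example above
sorted <- m[o, ]                                # rows rearranged so column 1 increases
```

The second column is carried along with the first, so it ends up as 20 40 10 30.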
Just get into the following routine:
Acpub unix computer users will need to make a one-time customisation of their unix account before they can run S-Plus in emacs. Log into one of the acpub computers; in your home directory on acpub you should already have a file called .emacs. If not, create one with an editor and simply add the line
If the file is already there, add the above line at the end. This will set you up to run S-Plus inside emacs. Try it, as above.
First you should specify a default printer in the graphics window in S-Plus. Duke Statistics users will have a default printer already.
In order to directly print the displayed graph in a motif() graphics window in S-Plus, go to the Options menu on the motif window and select "Printing...". In the window that comes up there is a command line: type "lpr" to use your default printer, or something such as "lpr -P214" to select 214 as the default printer for all future graphs in this S-Plus session. Then click on the "Apply" button, and then the "Save" button, and then close the window.
Acpub unix computer users will use local printers. In the Soc 133 cluster, for example, the two printers are imaginatively named soclp1 and soclp2. So to directly print the displayed graph in a motif() graphics window in S-Plus, go to the Options menu on the motif window and select "Printing...". In the window that comes up there is a command line: type "lpr -Psoclp2" to select soclp2 as the default printer for all future graphs. Then click on the "Apply" button, and then the "Save" button, and then close the window.
From here on, clicking the "Print" selection on the motif() window will print the displayed graph to that printer.
Saving postscript files of graphs
You can save graphs in postscript files for later printing. For example,
postscript(file="somefilename.ps")
plot(prior,post)
# more plot commands here ...
dev.off()
creates a file called somefilename.ps in your directory, and all the graphs done before dev.off() are in there instead of on the motif() display.
Then from an x-window, you can print via
lpr somefilename.ps
which prints to your default printer. In the Soc 133 acpub cluster, you will use either of
lpr -Psoclp1 somefilename.ps
lpr -Psoclp2 somefilename.ps
Working on your own PC you can ftp postscript files for viewing and printing at home.
You can also save graphs in colour postscript files. For example,
ps.options(colors=ps.colors.rgb[c("black","blue","red","green",
"brown","cyan","magenta","SkyBlue"),])
postscript(file="somefilename.ps", horizontal=T)
plot(...)
# more plot commands here ...
dev.off()
The ps.options() command before the postscript() call gives a selection of colours. Using col=2 in plot calls thereafter selects the 2nd colour (blue), and so forth.
Note also that horizontal=F gives the graph in 'portrait' mode. There are many more arguments you can explore for the postscript call -- see the help files.
You may open two or more motif() windows and postscript() files for printing simultaneously. These are known by S-Plus as graphics devices, and referred to by dev commands. The first device opened is the window in which you are typing; this is device 1. If you open a motif() window next, S-Plus knows it as device 2. Open another motif() window, and that's device 3. Open a postscript() file next, and that's device 4, and so on.
You can then switch between devices to draw graphs on any one motif() screen, save to file, etc. Do this using the dev.set() command, whose argument is just the device number. At any time, graphs go to the "current" device, always the last one used or opened. Here's an example; as usual, text following a # sign is a comment ignored by S-Plus.
motif() # opens motif device, number 2 by default
plot(...) # and plots on motif device 2
postscript(file="a.ps") # open postscript file a.ps, now device 3
plot(...) # and draw something there
dev.set(2) # switch back to the motif screen
plot(...) # and a new plot there
dev.set(3) # switch back to the a.ps file
plot(...) # add a plot there
motif() # open another motif screen
plot(...) # and draw there
dev.set(2) # back to the first motif screen
dev.off(3) # closes device 3 -- here the postscript file
# ...
# etc
As usual, explore the on-line help file (search by keyword dev) for more information.
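If you prefer not to count device numbers by hand, dev.cur() reports the number of the current device and dev.list() shows all open devices. The sketch below is an illustration only: it uses two postscript files in place of motif() windows so that it does not need a display, and the file names are made up.

```r
# Two postscript files stand in for motif() windows, so no display is needed
postscript(file="a.ps")
d1 <- dev.cur()             # number of the device just opened
postscript(file="b.ps")
d2 <- dev.cur()             # the newest device is now the current one
dev.set(d1)                 # switch back to the first file
cur <- dev.cur()            # confirms that the first device is current again
plot(1:10)                  # this plot goes into a.ps
dev.list()                  # shows all open devices
dev.off(d2)                 # close the second file
dev.off(d1)                 # and the first
```

Storing the device numbers in variables such as d1 and d2 saves having to remember which number belongs to which window or file.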
You can set the size of the motif window via, for example,
motif("-geometry 600x460")
and variants. Check the S-Plus manuals and on-line help for more info.
In S-Plus with a graph displayed, enter the command
locator(1)
then click with the left mouse key somewhere on the graph.
You'll see the x and y coordinates returned.
You could also save these in an object, say myvec (a list with components x and y), via
myvec_locator(1)
Then click, and look at myvec by typing its name.
Every variable, vector, etc created in S-Plus is saved in a directory named .Data. Use the unix command ls -a to list all files and directories, including the "hidden" ones whose names start with "." and which the basic ls does not show. The .Data directory will reside as a subdirectory of your home directory or, if you created a specific named directory for your work, .Data will probably be in there.
Problem: Every time you run an S-Plus session, more stuff is dumped there, for possible use in future S-Plus sessions. This grows and clogs up your disk space, and acpub allocates a limited amount to each user. Clean up periodically by simply erasing everything in there. In unix,
rm .Data/*
removes everything in there, but leaves the .Data directory for further use.
Start-up actions using the .First() function
S-Plus provides a facility for customising your S-Plus sessions and running various functions automatically when you start-up. To begin to use this useful facility, create the .First() function in your S-Plus workspace, as follows. When you start S-Plus next time, and before you do anything else, type in the following:
.First_function() {
motif("-geometry 600x460")
par( mfrow=c(1,1), bty = "n")
options(digits=3)
}
This defines the "hidden" function .First() (hidden because its name
starts with "."). Quit S-Plus using q() as usual, then the next and
future times you fire up S-Plus in the same directory the .First
function will be run. This specific version fires up the motif
window, sets it to draw one graph per frame with no box, and
limits printed numerical output to 3 significant digits. You can add other commands.
This is also an introduction to writing functions in S-Plus.
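Your own functions are defined in the same way as .First(). Here is a minimal sketch: the function name se.mean and the data are made up for illustration, and <- is used for assignment.

```r
# Hypothetical helper: standard error of the mean of a numeric vector
se.mean <- function(x) {
  n <- length(x)
  sqrt(var(x) / n)        # the last expression evaluated is the value returned
}
s <- se.mean(c(2, 4, 4, 4, 6))
```

Once defined, the function is saved like any other object and can be called in later sessions run in the same directory.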
X-Win32 is a computer program, available from OIT for $15.00, that allows your home computer to act as a unix workstation if you are on the Duke network. You run it on your PC, remotely log into acpub (or a Duke Statistics computer) and then open x-windows, run emacs, S-Plus graphics, etc as if you were sitting in an acpub cluster. Telecommute to school.
Viewing & downloading notes and slides
You can download my slides and notes, all in postscript and pdf formats. This is easy directly on the acpub unix machines via Netscape. Clicking on a link to a postscript document will launch the Ghostview viewer under Netscape. Clicking on a link to a pdf document will launch the Adobe Acrobat viewer under Netscape. On your own PC or Mac, use an existing postscript or pdf viewer set up as a Netscape plug-in application. For postscript (which is preferred) you can easily install the Ghostview previewer for PCs and Macs; here's the info:
Ghostview and GSview under Netscape on PCs:
The more adventurous (computingwise) among you might get interested in other software, including a package called BUGS that is developing as a general-purpose statistical modelling package, and is likely to be of interest as an application and research tool in future (as is S-Plus) for students continuing with statistics at a more advanced level.
Thanks to Mike West for accumulating and organizing this page of links.