STA250: Statistics

Data Sets

Text: DeGroot & Schervish, Probability and Statistics (4th edn)

Data sets: Some lecture examples will feature data from published research, books, or other sources; I'll try to make the data available here so you can try out the methods used in class or experiment with other methods. Some homework problems will also have data sets; I'll put those here too. Click on any item below and you will find either the raw data or a further page of explanation leading to the raw data. Use the Save As feature of your web browser to save the data to a file, which you can then read into R using the read.table command.
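As a rough sketch of that last step, the R code below reads a saved file with read.table. The file here is a temporary stand-in written by the script itself, and its column names and values are made up for illustration; they are not one of the course data sets. Whether header = TRUE is appropriate is also an assumption, since it depends on whether the file you saved begins with a line of column names.

```r
# Hypothetical stand-in for a file saved from this page. The column
# names and values are invented purely to make the example runnable.
path <- tempfile(fileext = ".txt")
writeLines(c("x y",
             "1 2.0",
             "2 4.1",
             "3 5.9"), path)

# header = TRUE assumes the first line holds column names; use
# header = FALSE if the saved file starts directly with data.
dat <- read.table(path, header = TRUE)

str(dat)      # check column names and types
summary(dat)  # quick numerical summary of each column
```

For a real data set, replace path with the name of the file you saved, and check the output of str(dat) to confirm the columns were read as you expected.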
Name Description
Anscombe A classic pedagogic data set illustrating the need to use graphical methods and residual analysis to assess regression fits.
Fish DDT data set from Mendenhall & Sincich Appendix III.
DDT-num DDT data set (numerical encoding) from M & S Appendix III.
DDT-chr DDT data set (character encoding) from M & S Appendix III.
CPU CPU times of 1000 computer jobs from M & S Appendix IV.
Iron Percentage iron contents for 390 ore samples from M & S Appendix V.
Cig-num Cigarette data set (numerical encoding) from M & S Appendix VI.
Cig-chr Cigarette data set (character encoding) from M & S Appendix VI.
Hand et al. 510 data sets from Handbook of Small Data Sets by Hand, Daly, Lunn, McConway, and Ostrowski.
StatLib CMU's "StatLib" statistics archive, which in turn includes their
DASL "Data and Story" (DASL, pronounced like "dazzle") archive.