STA240/ENV210

STA242/ENV255 Assignment 2

Due Friday, September 5, 2003 at 5pm. Submit to A125 LSRC. Late homework will not be accepted.

Policies for this assignment:

The Nicholas School of the Environment and Earth Sciences (NSEES) advocates the highest standards of professional ethics and academic integrity. Students and faculty have developed an honor code for the school which is distributed to all students prior to matriculation and discussed during orientation. Students in this course are expected to follow the honor code.
Writeups must be done independently. This includes computations, Splus output, graphs, answers to questions and discussion of results. Copying will not be tolerated, and will be treated as a violation of the NSEES honor code. You may discuss issues and concepts with your colleagues, but your writeup must be your own.
Students requesting regrades of assignments or exams must make these requests within one week of receiving the graded material. Attach a note explaining the regrade issue to your assignment or exam and submit it to the instructor. The instructor or TA has the option to regrade the entire assignment or exam.

Background reading: Chapter 7 of Moore & McCabe; read Section 7.2, focusing on the pooled two-sample t-test. Suggested exercises: Exc. 7.33, p. 523 of M&M; Exc. 15, p. 52 of Sleuth.

To turn in:

Moore/McCabe: p. 523, exc. 7.26, 7.44

Tetrachlorodibenzo-p-dioxin (TCDD) is a highly toxic substance found in industrial wastes. A study was conducted to determine the amount of TCDD present in the tissues of bullfrogs inhabiting the Rocky Branch Creek in central Arkansas, an area known to be contaminated by TCDD (Korfmacher, W.A., Chemosphere, Feb. 1986). The level of TCDD (in parts per trillion) was measured in several specific tissues of four female bullfrogs. The ratio of TCDD in the liver tissue to TCDD in the leg muscle of the frog was recorded for each. The relative ratios of contaminant for two tissues, the liver and the ovaries, are given for each of the four frogs in the table below.

Frog	A	B	C	D
Liver	11.0	14.6	14.3	12.2
Ovaries	24.2	18.2	20.5	16.0

According to the researchers, "the data set suggests that the [mean] relative level of TCDD in the ovaries of the frogs is higher than the [mean] level in the liver of the frogs."

(a) Test the claim using α = 0.05. Clearly write out the null and alternative hypotheses, the test statistic, and the p-value for the test. Give the result of your test in a sentence.
(b) What assumptions are necessary for your analysis in (a)?

You may wish to use Splus to double check the p-value you obtained from the tables in Sleuth. Directions follow.

Check your answer by performing a one-sample t-test in Splus. Go to "File" "New" "Dataset". In column 1, type the Ovaries data. In column 2, type the Liver data. Label each column. Next, make a new column of the differences between the Ovaries and Liver measurements. Go to "Data" and "Transform". Select "Target Column" and type in "Diffs" (or some other variable name for the differences). In "Expression", type in "Ovaries - Liver" (this assumes you labelled your columns "Ovaries" and "Liver"). Go to "Statistics" "Compare Samples" "One Sample". Select "Variable" to be your new variable "Diffs". "Mean under Null" should be zero. Choose the appropriate alternative hypothesis. (Note that in this one-sided case, Splus prints a one-sided confidence interval with one endpoint equal to "NA".)
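If you want to check the arithmetic outside Splus, the same steps (form the differences, then compute a one-sample t statistic on them) can be sketched in Python. This is only an illustration using the standard library, not part of the Splus directions; the function name paired_t is made up for this example, and the p-value still has to be looked up in a t table or obtained from Splus.

```python
import math
import statistics

# Relative TCDD ratios for frogs A-D, copied from the table in the problem.
liver = [11.0, 14.6, 14.3, 12.2]
ovaries = [24.2, 18.2, 20.5, 16.0]


def paired_t(first, second, mu0=0.0):
    """Return (mean difference, standard error, t statistic, df) for paired data.

    paired_t is a hypothetical helper written for this sketch.
    """
    # One difference per frog, mirroring the "Ovaries - Liver" column in Splus.
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    # statistics.stdev uses the n - 1 divisor (sample standard deviation).
    se = statistics.stdev(diffs) / math.sqrt(n)
    return mean_diff, se, (mean_diff - mu0) / se, n - 1


mean_diff, se, t_stat, df = paired_t(ovaries, liver)
print(f"mean difference = {mean_diff:.2f}, SE = {se:.3f}, t = {t_stat:.3f}, df = {df}")
```

Compare the printed t statistic and degrees of freedom with your hand calculation and with the Splus output.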

Exercise 12, p. 51 of Sleuth. Clearly report your test statistic, p-value, and give your conclusion in a sentence.
Data Problem 21, p. 52 of Sleuth. You will turn in a writeup of this problem, analogous to this sample writeup. Directions for this writeup can be found below.

Relevant Splus commands for this problem:

See Lab 0 if you have not worked through it previously.
boxplot
summary statistics
histogram
Normal probability plot, or qqnormal plot
calculating quantiles/percentiles in Splus
two-sample t-test

Details on one-page writeup

LENGTH: 1/2 page. (You may use up to 1 page, but you shouldn't need that much.)

FORMAT: 1" margins all around, 11-point Times New Roman font.

Language to use in the writeup: Use the Case Studies in the beginning of Chapter 2 as a guide, as well as the sample writeup provided.

Clear statement of research question
Description of data. If observational, how was the data collected? If experimental, what was the design? Mention any sampling issues (independence or possible correlation of data, sample size issues) that could have an impact on the analysis.
Exploratory analysis of data
- summary statistics giving center and spread of data. These can be reported in a simple table or in a sentence. If you will be looking at the difference in means, reporting this difference with units and its standard error is useful.
- relevant plots tailored to the question at hand. In this homework problem, a boxplot will suffice to give a sense of how the two groups differ. One way to make this plot more informative would be to provide the sample sizes for each group (add this as text to the boxplot).
- unique features of the data that could have an impact on the analysis: outliers, shape of distribution, sample size issues.
Statistical analysis. This section includes information on the statistical tests performed, and should be written in a form similar to the "Case Studies - Summary of Statistical Findings" section at the beginning of each chapter in Statistical Sleuth.
- Assumptions. Give each assumption, along with the findings from the data and what you know about the problem that support it. For the two-sample t procedures, we require normality of the underlying population, a random sample, and independent observations. Analyze the data and use what you know about the problem to verify these assumptions.
  To examine normality, create qq-normal plots for each sample, and report your findings in a sentence: "Normal quantile-quantile plots for group A revealed ..." (outliers? departures from normality? that the sample was roughly normally distributed?). You don't need to include the qq-normal plots in your writeup.
- For this problem, perform a 2-sample t-test and clearly give the hypotheses being considered in the words of the problem. Give your result using a p-value and interpret its magnitude (follow the case studies in the book for format). Give a confidence interval to express the difference in means between the two groups.
Summary of findings and scope of inference. To what extent does our data answer the question asked? What are the limitations of the model you have selected? What advice can you give to decision makers about the problem at hand? Review Section 1.2 of Statistical Sleuth. The "Scope of Inference" sections of the "Case Studies" should also be of use here. Remember that recommending a larger sample size isn't always realistic: you will not always see perfect "textbook" datasets, and this particular dataset may be all that is available to answer the question.
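To see how the quantities requested in the writeup fit together (difference in means, its standard error, the pooled two-sample t statistic, and a confidence interval), here is a rough Python sketch. It is an illustration only: the two groups are HYPOTHETICAL numbers, not the data from the Sleuth problem, the function name pooled_t is made up for this example, and the critical value is assumed to be read from a t table rather than computed.

```python
import math
import statistics

# HYPOTHETICAL measurements for two groups; replace with the data from the problem.
group_a = [4.1, 5.0, 5.6, 6.2, 4.8]
group_b = [3.2, 4.0, 3.9, 4.5]


def pooled_t(x, y):
    """Return (difference in means, standard error, t statistic, df).

    pooled_t is a hypothetical helper; it pools the two sample variances,
    which assumes the two populations have equal spread.
    """
    n1, n2 = len(x), len(y)
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the sample variances (n - 1 divisors).
    pooled_var = ((n1 - 1) * statistics.variance(x)
                  + (n2 - 1) * statistics.variance(y)) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    diff = statistics.mean(x) - statistics.mean(y)
    return diff, se, diff / se, df


diff, se, t_stat, df = pooled_t(group_a, group_b)

# Assumption: 2.365 is the 97.5th percentile of the t distribution with 7 df,
# read from a t table; look up the value that matches your own df.
t_crit = 2.365
ci_low, ci_high = diff - t_crit * se, diff + t_crit * se

print(f"difference = {diff:.2f}, SE = {se:.3f}, t = {t_stat:.3f}, df = {df}")
print(f"95% CI for the difference in means: ({ci_low:.2f}, {ci_high:.2f})")
```

In the writeup itself, report these numbers in sentences with units, in the style of the case studies, rather than as raw output.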
Last modified: Fri Aug 29 12:35:49 EDT 2003