Statistics 101
Data Analysis and Statistical Inference
Instructions for lab 9
Lab Objective
The purpose of the lab is to practice confidence intervals and
significance (hypothesis) testing.
Lab Procedures
Below are links to several data sets and attached questions that can be
assessed with hypothesis tests.
Format for reporting hypothesis tests
Whenever asked to conduct a hypothesis test, you should report the following:
- null and alternative hypotheses
- test statistic (show the numerator and denominator that go into the test statistic)
- p-value
- conclusion: whether you reject or fail to reject the null
hypothesis, and a brief one-sentence conclusion in the context of the
problem.
Composition of Ancient Earth's
Atmosphere
Has the composition of Earth's atmosphere changed over time?
To study this question, geologists Robert Berner and Gary Landis (1988)
examine the composition of gas bubbles in ancient pieces of amber
(hardened tree resin preserved in sedimentary rocks).
To determine the composition of the gas bubbles, they crush the
amber in a vacuum and analyze the released gases with "time-resolved
quadrupole mass spectrometry" (Berner and Landis, 1988, p.
1406). After arguing that the air in the bubbles is not
contaminated by modern air, Berner and Landis (1988) present the
percentages of nitrogen and carbon dioxide plus oxygen in nine gas
bubbles in amber from the Upper Cretaceous age (about 75 to 95 million
years ago). These data are shown below. In the sample
labels, the Roman numerals correspond to the piece of amber that is
crushed (there are three pieces), and the letter corresponds to the gas
bubble within the amber that is analyzed.
Sample Label
Gas      | IA   | IB   | IIA  | IIB  | IIC  | IID  | IIIA | IIIB | IIIC
N2       | 63.4 | 65.0 | 64.4 | 63.3 | 54.8 | 64.5 | 60.8 | 49.1 | 51.0
CO2 + O2 | 33.5 | 30.5 | 28.3 | 28.4 | 32.3 | 25.5 | 36.6 | 27.8 | 25.5
Berner and Landis (1988) argue that the carbon dioxide is respired
oxygen from trapped microorganisms, so that the original levels of
oxygen in the amber equal the CO2 + O2
percentages. Thus, they claim these are percentages of the two
major gases from nine samples of ancient air.
This is a study involving inference for individual means.
The data are in the file ancientair
(click here to download).
Questions
- (not handed in) Examine the normal
quantile plots of nitrogen and oxygen to see if their distributions can
be described roughly with normal curves. Recall, to get a
normal probability plot, first run Analyze - Distribution with
the variable of interest, click on the red arrow next to the
variable name, and select Normal Quantile Plot. You
don't
have to turn anything in for this part of the questions, just make the
plots to get in the good habit of checking assumptions before doing
significance tests. You'll also use Analyze-Distribution
to do the significance test. If the normal curves reasonably
describe the data, the hypothesis tests are OK to run. Otherwise,
you have to use methods not covered in this course. We'll go
ahead and run the tests regardless, just for practice.
- Modern air is known to contain 78.1%
nitrogen. Is there evidence that the percentage
of nitrogen in ancient air differed from this percentage? Use a
two-sided t-test, since we are looking for any
differences from the modern percentages.
To do a t-test
in JMP for a single mean, first
run Analyze - Distribution on the variable of interest.
Then,
click on the arrow next to the variable name, and select Test Mean.
Enter the hypothesized value of the mean in the first box.
Leave the second box empty. Click OK. JMP
reports three p-values at the bottom of the output: 1) Prob
> |t|, which is the p-value when the alternative is two-sided; 2)
Prob > t, which is the p-value when the
alternative has a > sign in it; and,
3) Prob < t, which is the p-value when the alternative has a <
sign in it. Choose the one that matches your alternative
hypothesis.
- Modern air is known to contain 20.9%
oxygen. Is there
evidence that the percentage of oxygen in ancient air differed from
this percentage? Conduct the appropriate hypothesis test.
- Form a 95% confidence interval for
the percentage of oxygen in ancient air. You can get the
relevant mean and standard
error from the JMP output. Write your 95% CI on the report, and
explain in one sentence what this interval tells you about the
percentage of oxygen in ancient air.
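For readers who want to check the JMP output by hand, here is a minimal sketch of the one-sample calculations in Python, using only the standard library. It uses the nine N2 and CO2 + O2 percentages listed above. The helper names (t_pdf, two_sided_p, one_sample_t) are made up for this illustration and are not part of JMP or the course materials. The two-sided p-value is approximated by numerically integrating the t density, and the multiplier 2.306 is the standard t-table value for 95% confidence with 8 degrees of freedom.

```python
import math
import statistics

# Percentages for the nine gas bubbles, copied from the table in this handout.
n2 = [63.4, 65.0, 64.4, 63.3, 54.8, 64.5, 60.8, 49.1, 51.0]
o2 = [33.5, 30.5, 28.3, 28.4, 32.3, 25.5, 36.6, 27.8, 25.5]


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=20000):
    """Approximate P(|T| >= |t|) with Simpson's rule (steps must be even)."""
    a = abs(t)
    if a == 0:
        return 1.0
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_a = total * h / 3
    return max(0.0, 1 - 2 * area_0_to_a)


def one_sample_t(data, mu0):
    """Return numerator, denominator, t statistic, and two-sided p-value."""
    n = len(data)
    numerator = statistics.mean(data) - mu0              # sample mean - hypothesized mean
    denominator = statistics.stdev(data) / math.sqrt(n)  # standard error of the mean
    t = numerator / denominator
    return numerator, denominator, t, two_sided_p(t, n - 1)


# Two-sided test of H0: mean N2 percentage = 78.1
num, den, t_n2, p_n2 = one_sample_t(n2, 78.1)
print(f"N2: numerator={num:.3f}, denominator={den:.3f}, t={t_n2:.3f}, p={p_n2:.2g}")

# 95% confidence interval for the mean oxygen percentage.
T_STAR_8DF = 2.306  # t-table value: 95% confidence, 8 degrees of freedom
o2_mean = statistics.mean(o2)
o2_se = statistics.stdev(o2) / math.sqrt(len(o2))
ci_o2 = (o2_mean - T_STAR_8DF * o2_se, o2_mean + T_STAR_8DF * o2_se)
print(f"O2: mean={o2_mean:.2f}, SE={o2_se:.3f}, 95% CI=({ci_o2[0]:.2f}, {ci_o2[1]:.2f})")
```

The numerator and denominator printed here are the two pieces the report format asks you to show. If SciPy is available, scipy.stats.ttest_1samp should give the same t statistic and p-value.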
COMMENTS ON THIS PROBLEM.
It is not universally accepted by geologists
that the gas bubbles represent samples of air from ancient times.
This question can be answered only by experts in the field.
However, the appropriateness of t-tests for these data can be
criticized from a statistical point of view. Two criticisms include:
1) (serious criticism) the observations are not independent, since
the samples come from the same rock; and, 2) (less serious criticism)
the N2 values are not symmetric around the mean, which could
make the tests inaccurate with a small sample size. The second
criticism doesn't bother me too much, since the sample average is so
many SEs away from 78.1% that there's no doubt the chance of seeing a
sample average as or more extreme than the value in the data is very
small.
Reference: Berner, R. A. and Landis,
G. P. (1988) "Gas Bubbles in Fossil Amber as Possible Indicators of the
Major Gas Composition of Ancient
Air." Science 239, pp. 1406--1409.
Is caffeine dependence real?
The subjects are eleven people diagnosed as being dependent on
caffeine. During one time period, these people were barred from
coffee, colas, and other substances containing caffeine and instead
took capsules containing their normal caffeine intake. During a
different time period, they took placebo capsules with no
caffeine. The order of the time periods in which the subjects
took caffeine and placebos was randomized. The subjects,
pill administrators, and testers did not know when they got each pill.
Subjects were assessed on the Beck Depression Inventory, which is a
psychological test that measures depression. Higher scores on the
test mean the subject shows more symptoms of depression.
Additionally, subjects were asked to press a button 200 times as
quickly as possible, and their number of presses per minute was
measured. The researchers are interested in whether being
deprived of caffeine affects either of these outcomes.
This is a matched pairs study, because comparisons of the
treatments are made on the same person. The data are in the file caffeine
(click here).
Questions
- Make new columns for the differences in
depression scores and in beats (button presses per minute). For both
differences, subtract the placebo score from the caffeine score. To input the
differences, you'll have to edit the columns and use a Formula. Ask for help if you
have trouble.
- (not handed in) Examine the distribution of
depression score differences (caffeine - placebo). Does a normal
curve seem like a reasonable description of the differences?
You don't have to turn anything in for this part, just make the
plots to check assumptions. If the normal curve seems like a
reasonable fit, you can use the t-test approach. Otherwise, you
have to use other methods that we have not covered in this course.
- Is there a difference in the average depression score of people on the
caffeine pill and people on the placebo? To run the t-test, follow the
same directions as in the ancient air problem. Since this is a matched-pairs
study, we compare the caffeine and placebo groups by conducting a
one-sample test on the difference column we created, and use 0 as the
hypothesized mean (for no difference).
- (not handed in) Examine the
distribution of
differences in beats (caffeine - placebo). Does a normal curve
seem like a reasonable description of the differences? You don't
have to turn anything in for this part, just make the plots to check
assumptions. If the normal curve
seems like a reasonable fit, you can use the t-test approach.
Otherwise, you have to use other methods that we have not covered
in this course.
- Is there a
difference in the average
beats of people on the caffeine pill and
people on the placebo? Our hypothesized mean is 0 as before.
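The matched-pairs procedure above (form the differences, then run a one-sample t-test against 0) can be sketched in Python as follows. The scores in this example are HYPOTHETICAL numbers invented for illustration. They are not the values in the caffeine file, so the result says nothing about the real study. The critical value 2.228 is the standard t-table value for a two-sided 5% test with 10 degrees of freedom (eleven subjects).

```python
import math
import statistics

# HYPOTHETICAL scores, invented for illustration; they are NOT the values
# in the caffeine data file. One entry per subject, in the same subject order.
caffeine = [4, 6, 3, 5, 7, 2, 1, 4, 3, 8, 2]
placebo = [9, 12, 4, 8, 10, 9, 5, 3, 11, 9, 4]

# Difference column: caffeine score minus placebo score, subject by subject.
diffs = [c - p for c, p in zip(caffeine, placebo)]
n = len(diffs)

# One-sample t statistic on the differences, with hypothesized mean 0.
numerator = statistics.mean(diffs) - 0
denominator = statistics.stdev(diffs) / math.sqrt(n)
t_stat = numerator / denominator

T_STAR_10DF = 2.228  # t-table value: two-sided 5% test, 10 degrees of freedom
reject_at_5pct = abs(t_stat) > T_STAR_10DF

print(f"mean difference={numerator:.3f}, SE={denominator:.3f}, t={t_stat:.3f}")
print("reject H0 at the 5% level" if reject_at_5pct else "fail to reject H0 at the 5% level")
```

Note that the test uses the eleven differences only. The two original columns never enter the t statistic separately, which is what makes this a matched-pairs analysis rather than a two-group comparison.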
Reference:
Moore, D. The Basic Practice of Statistics. New
York: W.H. Freeman, 2000, p. 382.
Subliminal Messages and Their Effects on Math Test Scores
A subliminal message is below our threshold of
awareness but may influence our behavior. Can subliminal messages
affect the way students learn math? A group of students who had
failed the mathematics part of the City of New York Skills
Assessment Test agreed to participate in a study of this
question. The data were originally collected in a study by John
Hudesman, and the study is described in Moore (2000, p. 400).
All students received a daily subliminal
message flashed on a screen too rapidly to be read consciously.
The students were randomly assigned to receive one of two messages.
The treatment group received the message, "Each day I am getting
better in math." The control group received the neutral message,
"People are walking on the street." All students in
both groups took a pre-test, went to a summer math skills program,
and then took a post-test.
This is a study involving inferences for
the difference in means of separate groups. It's not matched
pairs because there are two separate groups: the students who got the
subliminal message, and the students who got the neutral message.
The data for the students' test scores are in the file subliminal
(click here). People in the subliminal group have the
code "T", and people in the neutral message group have the code "C".
Questions:
- (not handed in) In this problem, the
outcome variable is the improvement in test scores. For each
group, examine the distribution of improvement scores. Do normal
curves appear reasonable descriptions of the distributions of
improvement scores in each group? You can get both normal curves
on one plot by using Analyze - Fit Y by X. Put
the continuous variable in the Y-box and the group variable in the
X-box. After running it, click on the red arrow next to the
"Oneway analysis...", and select Normal Quantile Plot - Plot Actual
by Quantile. If the data in both groups roughly follow normal
curves, we can proceed with the significance test. Otherwise, you
have to use methods that we have not learned in this course.
- Is there evidence of a difference in
the average improvement in test scores (post-test score - pre-test
score) for the subliminal and neutral message groups?
To run a hypothesis test for the difference of
two means in JMP, use Analyze - Fit Y by X, inputting
the continuous variable in the Y-box and the group variable in the
X-box. After running it, go to the red arrow next to the "Oneway
analysis..." Then select Unequal Variances. The output
at the very bottom is the test. The entry under "t-Test" is the
value of the test statistic. The entry under "Prob > F" is
the p-value for the two-sided alternative hypothesis.
- Give a 95% confidence interval
for the difference in average improvements between the subliminal and
neutral groups. You can find the
appropriate values for the means and standard errors in JMP by clicking on the red arrow next to "Oneway
analysis..." and selecting Means and Std Dev.
Explain in one sentence what this
confidence
interval tells you about the effectiveness of the subliminal message
versus the neutral message.
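As a rough check on the two-sample output, the unequal-variances (Welch) calculation can be sketched in Python. The rows below are HYPOTHETICAL pre-test and post-test scores invented for illustration. They are not the values in the subliminal file. The function name welch is also made up for this sketch. The confidence interval uses the conservative choice of degrees of freedom (the smaller group size minus 1, here 5, with t-table multiplier 2.571). This is an assumption of the sketch: software that works from the Welch degrees of freedom will report a somewhat narrower interval.

```python
import math
import statistics

# HYPOTHETICAL (group, pre-test, post-test) rows, invented for illustration;
# they are NOT the values in the subliminal data file.
rows = [
    ("T", 18, 30), ("T", 20, 29), ("T", 15, 30),
    ("T", 22, 33), ("T", 14, 28), ("T", 21, 31),
    ("C", 19, 27), ("C", 21, 28), ("C", 16, 27),
    ("C", 23, 28), ("C", 17, 26), ("C", 20, 28),
]

# Outcome variable: improvement = post-test score - pre-test score, by group.
improvement = {"T": [], "C": []}
for group, pre, post in rows:
    improvement[group].append(post - pre)


def welch(a, b):
    """Difference in means, its standard error, t statistic, and Welch df."""
    va = statistics.variance(a) / len(a)  # squared standard error, group a
    vb = statistics.variance(b) / len(b)  # squared standard error, group b
    diff = statistics.mean(a) - statistics.mean(b)
    se = math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return diff, se, diff / se, df


diff, se, t_stat, df = welch(improvement["T"], improvement["C"])
print(f"difference={diff:.3f}, SE={se:.3f}, t={t_stat:.3f}, Welch df={df:.2f}")

# 95% CI with conservative df = min(6, 6) - 1 = 5.
T_STAR_5DF = 2.571  # t-table value: 95% confidence, 5 degrees of freedom
ci = (diff - T_STAR_5DF * se, diff + T_STAR_5DF * se)
print(f"95% CI for the difference: ({ci[0]:.2f}, {ci[1]:.2f})")
```

The numerator of the t statistic is the difference in group means, and the denominator is the standard error that combines both groups' variances without assuming they are equal.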
COMMENTS ON THIS PROBLEM:
These conclusions are
valid for the subject material, message, and student populations in
this
study. However, they may not generalize to other subject material,
messages, or other populations. Additional studies involving
other subject material, other messages, and other populations are
needed before we can feel secure with broad generalizations.
Reference:
Moore, D. The Basic Practice of Statistics. New York: W.H.
Freeman and Company, 2000.