Statistics 101
Data Analysis and Statistical Inference
 

Instructions for lab 9


Lab Objective

The purpose of the lab is to practice confidence intervals and significance (hypothesis) testing.

Lab Procedures


Below are links to several data sets, each with questions that can be answered with hypothesis tests.

Format for reporting hypothesis tests
Whenever asked to conduct a hypothesis test, you should report the following:
  1. null and alternative hypotheses
  2. test statistic (show the numerator and denominator that go into the test statistic)
  3. p-value
  4. conclusion - whether you reject or fail to reject the null hypothesis, and a brief one-sentence conclusion in the context of the problem.
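
The four items above can also be worked out by hand.  The following is a minimal Python sketch, not part of the JMP workflow used in this lab.  The sample values and the hypothesized mean are made up for illustration, and the function names are illustrative as well.  It uses only the standard library, so the two-sided p-value comes from numerically integrating the t density rather than from a statistics package.

```python
import math
import statistics

def t_density(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p_value(t_stat, df, steps=2000):
    """P(|T| >= |t_stat|), using Simpson's rule on the density over [0, |t_stat|]."""
    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_density(0.0, df) + t_density(upper, df)
    for i in range(1, steps):
        weight = 4 if i % 2 == 1 else 2
        total += weight * t_density(i * h, df)
    central_area = total * h / 3  # P(0 <= T <= |t_stat|)
    return max(0.0, 1.0 - 2.0 * central_area)

def one_sample_t(sample, mu0):
    """Return numerator, denominator (SE), t statistic, df, and two-sided p-value."""
    n = len(sample)
    numerator = statistics.mean(sample) - mu0               # sample mean minus hypothesized mean
    denominator = statistics.stdev(sample) / math.sqrt(n)   # standard error of the mean
    t_stat = numerator / denominator
    df = n - 1
    return numerator, denominator, t_stat, df, two_sided_p_value(t_stat, df)

# Hypothetical data (not from any lab file).  H0: mu = 5.0 versus Ha: mu != 5.0
sample = [5.1, 4.9, 5.6, 5.8, 6.0]
numerator, denominator, t_stat, df, p_value = one_sample_t(sample, 5.0)
print(f"numerator = {numerator:.3f}, denominator (SE) = {denominator:.3f}")
print(f"t = {t_stat:.3f} on {df} df, two-sided p-value = {p_value:.3f}")
```

The numerator and denominator are returned separately because item 2 asks you to show both, not just their ratio.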


Composition of Ancient Earth's Atmosphere

Has the composition of Earth's atmosphere changed over time?  To study this question, geologists Robert Berner and Gary Landis (1988) examine the composition of gas bubbles in ancient pieces of amber (hardened tree resin preserved in sedimentary rocks).

To determine the composition of the gas bubbles, they crush the amber in a vacuum and analyze the released gases with "time-resolved quadrupole mass spectrometry" (Berner and Landis, 1988, p. 1406).  After arguing that the air in the bubbles is not contaminated by modern air, Berner and Landis (1988) present the percentages of nitrogen and carbon dioxide plus oxygen in nine gas bubbles in amber from the Upper Cretaceous age (about 75 to 95 million years ago).  These data are shown below.  In the sample labels, the Roman numeral corresponds to the piece of amber that is crushed (there are three pieces), and the letter corresponds to the gas bubble within the amber that is analyzed.

                                Sample Label
Gas (%)    IA    IB    IIA   IIB   IIC   IID   IIIA  IIIB  IIIC
N2         63.4  65.0  64.4  63.3  54.8  64.5  60.8  49.1  51.0
CO2 + O2   33.5  30.5  28.3  28.4  32.3  25.5  36.6  27.8  25.5

Berner and Landis (1988) argue that the carbon dioxide is respired oxygen from trapped microorganisms, so that the original levels of oxygen in the amber equal the CO2 + O2 percentages.  Thus, they claim these are percentages of the two major gases from nine samples of ancient air.

This is a study involving inference for individual means.  The data are in the file ancientair (click here to download).

Questions

  1. Modern air is known to contain 78.1% nitrogen.  Is there evidence that the percentage of nitrogen in ancient air differed from this percentage?  Use a two-sided t-test, since we are looking for any difference from the modern percentage.
To do a t-test in JMP for a single mean, first run Analyze - Distribution on the variable of interest.  Then click on the arrow next to the variable name and select Test Mean.  Enter the hypothesized value of the mean in the first box.  Leave the second box empty.  Click OK.  JMP reports three p-values at the bottom of the output:  1) Prob > |t|, which is the p-value when the alternative is two-sided; 2) Prob > t, which is the p-value when the alternative has a > sign in it; and 3) Prob < t, which is the p-value when the alternative has a < sign in it.  Choose the one that matches your alternative hypothesis.
  2. Modern air is known to contain 20.9% oxygen.  Is there evidence that the percentage of oxygen in ancient air differed from this percentage?  Conduct the appropriate hypothesis test.
  3. Form a 95% confidence interval for the percentage of oxygen in ancient air.  You can get the relevant mean and standard error from the JMP output.  Write your 95% CI on the report, and explain in one sentence what this interval tells you about the percentage of oxygen in ancient air.
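
As a check on the interval you build from the JMP output, the arithmetic behind a 95% confidence interval for one mean can be sketched in Python.  This is only an illustration, not a replacement for the JMP output.  The values are the CO2 + O2 percentages from the table above, the variable names are illustrative, and the multiplier 2.306 is the t critical value for 8 degrees of freedom, hardcoded from a standard t table.

```python
import math
import statistics

# CO2 + O2 percentages for the nine gas bubbles (from the table above)
oxygen = [33.5, 30.5, 28.3, 28.4, 32.3, 25.5, 36.6, 27.8, 25.5]

n = len(oxygen)
mean = statistics.mean(oxygen)
se = statistics.stdev(oxygen) / math.sqrt(n)  # standard error of the mean

# Assumption: 0.975 quantile of the t distribution with n - 1 = 8 df, from a t table
T_CRIT_8DF = 2.306

margin = T_CRIT_8DF * se
lower, upper = mean - margin, mean + margin
print(f"mean = {mean:.2f}, SE = {se:.2f}")
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```

The interval has the form mean plus or minus multiplier times SE, so a different sample size would need a different multiplier.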


COMMENTS ON THIS PROBLEM.

It is not universally accepted by geologists that the gas bubbles represent samples of air from ancient times.  This question can be answered only by experts in the field.  However, the appropriateness of t-tests for these data can be criticized from a statistical point of view.  Two criticisms are: 1) (serious criticism) the observations are not independent, since several gas bubbles come from the same piece of amber; and 2) (less serious criticism) the N2 values are not symmetric around the mean, which could make the tests inaccurate with a small sample size.  The second criticism doesn't bother me too much, since the sample average is so many SEs away from 78.1% that there's no doubt the chance of seeing a sample average as extreme as or more extreme than the value in the data is very small.
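
Both points in the second criticism can be checked numerically.  The short Python sketch below (an illustration with illustrative variable names, using the N2 values from the table above) compares the mean with the median as a rough look at asymmetry, and counts how many standard errors separate the sample average from 78.1%.

```python
import math
import statistics

# N2 percentages for the nine gas bubbles (from the table above)
nitrogen = [63.4, 65.0, 64.4, 63.3, 54.8, 64.5, 60.8, 49.1, 51.0]
MODERN_N2 = 78.1  # percentage of nitrogen in modern air, as given in the first question

mean = statistics.mean(nitrogen)
median = statistics.median(nitrogen)
se = statistics.stdev(nitrogen) / math.sqrt(len(nitrogen))

# Distance between the sample average and 78.1, measured in standard errors
ses_away = (mean - MODERN_N2) / se

# A mean below the median is a rough sign of a longer left tail
print(f"mean = {mean:.2f}, median = {median:.2f}")
print(f"SE = {se:.2f}, sample average is {abs(ses_away):.1f} SEs from {MODERN_N2}")
```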
   
Reference:  Berner, R. A. and Landis, G. P. (1988) "Gas Bubbles in Fossil Amber as Possible Indicators of the Major Gas Composition of Ancient Air." Science 239, pp. 1406-1409.


Is caffeine dependence real?

The subjects are eleven people diagnosed as being dependent on caffeine.  During one time period, these people were barred from coffee, colas, and other substances containing caffeine and instead took capsules containing their normal caffeine intake.  During a different time period, they took placebo capsules with no caffeine.  The order of the time periods in which the subjects took caffeine and placebos was randomized.   The subjects, pill administrators, and testers did not know when they got each pill.

Subjects were assessed on the Beck Depression Inventory, which is a psychological test that measures depression.  Higher scores on the test mean the subject shows more symptoms of depression.   Additionally, subjects were asked to press a button 200 times as quickly as possible, and their number of presses per minute was measured.   The researchers are interested in whether being deprived of caffeine affects either of these outcomes.

This is a matched pairs study, because comparisons of the treatments are made on the same person.  The data are in the file caffeine (click here).

Questions

  1. Is there a difference in the average depression score of people on the caffeine pill and people on the placebo?  To run the t-test, follow the same directions as in the ancient air problem.  Since this is a matched-pairs study, we compare the caffeine and placebo conditions by conducting a one-sample test on the difference column we created, and use 0 as the hypothesized mean (for no difference).
  2. Is there a difference in the average beats (button presses per minute) of people on the caffeine pill and people on the placebo?  Our hypothesized mean is 0 as before.
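
To see why a matched-pairs comparison reduces to a one-sample test on the differences, here is a minimal Python sketch.  The scores are made up for illustration and are not the values in the caffeine file, the variable names are illustrative, and the direction of the subtraction (placebo minus caffeine) is an arbitrary choice.

```python
import math
import statistics

# Hypothetical depression scores for five subjects (NOT the values in the caffeine file)
caffeine = [4, 7, 2, 6, 3]
placebo = [9, 12, 3, 10, 5]

# One difference per subject; the direction (placebo minus caffeine) is an assumption
differences = [p - c for p, c in zip(placebo, caffeine)]

n = len(differences)
mean_diff = statistics.mean(differences)
se_diff = statistics.stdev(differences) / math.sqrt(n)

# One-sample t statistic on the differences, with hypothesized mean 0
t_stat = (mean_diff - 0) / se_diff
df = n - 1
print(f"differences = {differences}")
print(f"mean difference = {mean_diff:.2f}, SE = {se_diff:.2f}, t = {t_stat:.2f} on {df} df")
```

Once the differences are formed, the two original columns are no longer used: the sample size is the number of subjects, not the number of measurements.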


Reference:
Moore, D. (2000) The Basic Practice of Statistics.  New York: W.H. Freeman and Company, p. 382.

Subliminal Messages and Their Effects on Math Test Scores (you will get this problem)

A subliminal message is below our threshold of awareness but may influence our behavior.  Can subliminal messages affect the way students learn math? A group of students who had failed the mathematics part of the City of New York Skills Assessment Test agreed to participate in a study of this question.  The data were originally collected in a study by John Hudesman, and the study is described in Moore (2000, p. 400).

All students received a daily subliminal message flashed on a screen too rapidly to be read consciously.  The students were randomly assigned to receive one of two messages. The treatment group received the message, "Each day I am getting better in math."  The control group received the neutral message, "People are walking on the street."  All students in both groups took a pre-test, went to a summer math skills program, and then took a post-test.

This is a study involving inferences for the difference in means of separate groups.  It's not matched pairs because there are two separate groups: the students who got the subliminal message, and the students who got the neutral message.  The data for the students' test scores are in the file subliminal (click here).  People in the subliminal group have the code "T", and people in the neutral message group have the code "C".

Questions:

  1. Is there evidence of a difference in the average improvement in test scores (post-test score - pre-test score) for the subliminal and neutral message groups?
To run a hypothesis test for the difference of two means in JMP, use Analyze - Fit Y by X, inputting the continuous variable in the Y-box and the group variable in the X-box.  After running it, click the red arrow next to "Oneway analysis..." and select Unequal Variances.  The output at the very bottom is the test.  The entry under "t-Test" is the value of the test statistic.  The entry under "Prob > F" is the p-value for the two-sided alternative hypothesis.
  2. Give a 95% confidence interval for the difference in average improvements between the subliminal and neutral groups.  You can find the appropriate values for the means and standard errors in JMP by clicking on the red arrow next to "Oneway analysis..." and selecting Means and Std Dev.  Explain in one sentence what this confidence interval tells you about the effectiveness of the subliminal message versus the neutral message.
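
For reference, the unequal-variances (Welch) t statistic for two separate groups can be sketched in Python as follows.  The improvement scores are made up for illustration and are not the values in the subliminal file, and the function and variable names are illustrative.  The degrees of freedom use the Welch-Satterthwaite approximation, which is generally not a whole number.

```python
import math
import statistics

# Hypothetical improvements (post-test minus pre-test); NOT the values in the subliminal file
treatment = [11, 9, 14, 8, 12, 10]  # code "T"
control = [7, 10, 6, 9, 8]          # code "C"

def welch_t(group1, group2):
    """Welch's unequal-variances t: numerator, denominator (SE), t statistic, approximate df."""
    n1, n2 = len(group1), len(group2)
    v1 = statistics.variance(group1) / n1  # squared SE of the first group's mean
    v2 = statistics.variance(group2) / n2  # squared SE of the second group's mean
    numerator = statistics.mean(group1) - statistics.mean(group2)
    denominator = math.sqrt(v1 + v2)       # SE of the difference in means
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return numerator, denominator, numerator / denominator, df

numerator, denominator, t_stat, df = welch_t(treatment, control)
print(f"difference in means = {numerator:.2f}, SE = {denominator:.2f}")
print(f"t = {t_stat:.2f} on about {df:.1f} df")
```

A 95% confidence interval for the difference uses the same two pieces: the difference in means plus or minus a t multiplier times the SE of the difference.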


COMMENTS ON THIS PROBLEM:

These conclusions are valid for the subject material, message, and student populations in this study. However, they may not generalize to other subject material, messages, or other populations.  Additional studies involving other subject material, other messages, and other populations are needed before we can feel  secure with broad generalizations.

Reference:
Moore, D. (2000) The Basic Practice of Statistics.  New York: W.H. Freeman and Company.