Splus commands to do the pollution analysis


The data are in pollution.asc. These are daily measurements of particulate matter, in ppm, at a city site and a nearby rural site; missing readings are coded NA. The first observation is 2 May; the last is 26 Nov.



Below are the Splus commands I used to make the plot and to run a one-sample t-test on the city-minus-rural differences (done here as a paired t-test).

# pollution data (matched pairs example)
# Read in the data using scan(): type or paste the values after the
# call and finish with a blank line.  You could read the file into a
# data frame instead.

city <- scan()
     39     68     42     34     48     82     45     NA     NA     60     57
     NA     39     NA    123     59     71     41     42     38     NA     57
     50     58     45     69     23     72     49     86     51     42     46
     NA     44     42

rural <- scan()
     NA     67     42     33     46     NA     43     54     NA     NA     NA
     NA     38     88    108     57     70     42     43     39     NA     52
     48     56     44     51     21     74     48     84     51     43     45
     41     47     35
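
Both series contain NAs, so only the days on which both sites have a reading can contribute to the paired comparison. The sketch below is one way to check how many such days there are. It enters the values listed above with c() so that it is self-contained; the names ok and d are illustrative and are not part of the original session. It uses only basic functions and `<-` for assignment, so it is intended to run in either Splus or R.

```r
# Same values as listed above, entered with c() instead of scan()
city <- c(39, 68, 42, 34, 48, 82, 45, NA, NA, 60, 57,
          NA, 39, NA, 123, 59, 71, 41, 42, 38, NA, 57,
          50, 58, 45, 69, 23, 72, 49, 86, 51, 42, 46,
          NA, 44, 42)
rural <- c(NA, 67, 42, 33, 46, NA, 43, 54, NA, NA, NA,
           NA, 38, 88, 108, 57, 70, 42, 43, 39, NA, 52,
           48, 56, 44, 51, 21, 74, 48, 84, 51, 43, 45,
           41, 47, 35)

# TRUE on days where both sites have a reading
ok <- !is.na(city) & !is.na(rural)

# city minus rural, complete pairs only
d <- city[ok] - rural[ok]

length(d)   # 26 complete pairs out of 36 days
mean(d)     # 2.192308, the mean difference
```

The count of 26 complete pairs is where the 25 degrees of freedom in the t-test output come from.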


 # first the plot

par(oma=c(1,1,4,1))
plot(city,pch=1,ylab="",xlab="days",ylim=c(-5,124),axes=F)
axis(2)
axis(1,at=c(1,10,20,36),labels=c("May 2","July","September","Nov 26"))
box()
points(rural,pch=3)
# vertical bars show the city - rural difference on each day
points(city-rural,type="h",lwd=3)
# dashed reference line at zero
lines(c(1,36),c(0,0),lty=2)
mtext("airborne particle matter",side=2,line=3,at=80,srt=90)
mtext("difference",side=2,line=3,at=0,srt=90)
mtext("Pollution readings at city and nearby rural site",side=3,line=0,
      cex=1.3,outer=T)
# click on the plot where the top-left corner of the legend should go
legend(locator(1),c("city","rural"),marks=c(1,3))

 # now the t-test
t.test(city,rural,paired=T)

The output is below. It suggests that the rural location reads about 2.2 ppm lower than the city location on average (95 percent confidence interval 0.30 to 4.09 ppm; p = 0.025).

         Paired t-Test 

data:  city and rural 
t = 2.3832, df = 25, p-value = 0.0251 
alternative hypothesis: true mean of differences is not equal to 0 
95 percent confidence interval:
 0.2977415 4.0868739 
sample estimates:
 mean of x - y 
      2.192308
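
A paired t-test is the same as a one-sample t-test on the differences, so the numbers in this output can be reproduced by hand. The sketch below does that, starting from the 26 city-minus-rural differences for the days on which both sites have a reading (worked out from the city and rural values listed earlier). The names d, dbar, se, tstat, pval and ci are illustrative and are not part of the original session.

```r
# city - rural differences for the 26 days with both readings, in date order
d <- c(1, 0, 1, 2, 2, 1, 15, 2, 1, -1, -1, -1, 5,
       2, 2, 1, 18, 2, -2, 1, 2, 0, -1, 1, -3, 7)

n     <- length(d)             # number of complete pairs; df = n - 1
dbar  <- mean(d)               # mean difference
se    <- sqrt(var(d) / n)      # standard error of the mean difference
tstat <- dbar / se             # t statistic for H0: mean difference = 0

# two-sided p-value from the t distribution with n - 1 df
pval <- 2 * (1 - pt(abs(tstat), n - 1))

# 95 percent confidence interval for the mean difference
ci <- dbar + c(-1, 1) * qt(0.975, n - 1) * se

print(c(t = tstat, df = n - 1, p = pval))
print(ci)
```

The values printed should agree with the t, df, p-value and confidence interval in the output above, up to rounding.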