The data are in pollution.asc. These are daily measurements of particulate matter in ppm: the first observation is 2 May; the last is 26 Nov.
Below are the S-PLUS commands I used to make the plot and to compute a one-sample t-test on the city minus rural differences (that is, a paired t-test).
# Read in the data interactively using scan(). You could instead
# read the file into a data frame.
# pollution data (matched pairs example)
city <- scan()
39 68 42 34 48 82 45 NA NA 60 57
NA 39 NA 123 59 71 41 42 38 NA 57
50 58 45 69 23 72 49 86 51 42 46
NA 44 42
rural <- scan()
NA 67 42 33 46 NA 43 54 NA NA NA
NA 38 88 108 57 70 42 43 39 NA 52
48 56 44 51 21 74 48 84 51 43 45
41 47 35
# first the plot
par(oma=c(1,1,4,1))
plot(city,pch=1,ylab="",xlab="days",ylim=c(-5,124),axes=F)
axis(2)
axis(1,at=c(1,36,10,20),labels=c("May 2","Nov 26","July","September"))
box()
points(rural,pch=3)
points(city-rural,type="h",lwd=3)
lines(c(1,36),c(0,0),lty=2)
mtext("airborne particle matter",side=2,line=3,at=80,srt=90)
mtext("difference",side=2,line=3,at=0,srt=90)
mtext("Pollution readings at city and nearby rural site",side=3,line=0,
cex=1.3,outer=T)
legend(locator(1),c("city","rural"),marks=c(1,3))
# now the t-test
t.test(city,rural,paired=T)
The output is below. It suggests that the rural site reads about 2.2 ppm lower than the city site on average (95% confidence interval 0.30 to 4.09 ppm, p = 0.025), based on the 26 days with readings at both sites.
Paired t-Test
data: city and rural
t = 2.3832, df = 25, p-value = 0.0251
alternative hypothesis: true mean of differences is not equal to 0
95 percent confidence interval:
0.2977415 4.0868739
sample estimates:
mean of x - y
2.192308
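To see where these numbers come from, here is a minimal sketch that redoes the paired t-test by hand from the differences. It is not part of the original session: the data are typed in with c() rather than scan() so the block runs on its own, and the names d, n, dbar, se, tstat, pval and ci are just illustrative choices. It uses only basic functions (mean, var, pt, qt), so it should run in either S-PLUS or R.

```r
# City and rural readings, entered as vectors (same values as the scan() input above).
city <- c(39, 68, 42, 34, 48, 82, 45, NA, NA, 60, 57,
          NA, 39, NA, 123, 59, 71, 41, 42, 38, NA, 57,
          50, 58, 45, 69, 23, 72, 49, 86, 51, 42, 46,
          NA, 44, 42)
rural <- c(NA, 67, 42, 33, 46, NA, 43, 54, NA, NA, NA,
           NA, 38, 88, 108, 57, 70, 42, 43, 39, NA, 52,
           48, 56, 44, 51, 21, 74, 48, 84, 51, 43, 45,
           41, 47, 35)

# Differences, keeping only the days on which both sites have a reading.
d <- city - rural
d <- d[!is.na(d)]

n <- length(d)            # number of complete pairs
dbar <- mean(d)           # mean of the differences
se <- sqrt(var(d) / n)    # standard error of the mean difference
tstat <- dbar / se        # t statistic on n - 1 degrees of freedom

# Two-sided p-value and 95 percent confidence interval.
pval <- 2 * (1 - pt(abs(tstat), n - 1))
ci <- dbar + c(-1, 1) * qt(0.975, n - 1) * se
```

Printing n - 1, dbar, tstat, pval and ci should agree with the df, mean of x - y, t, p-value and confidence interval in the t.test output above. Days with a missing reading at either site drop out of the differences, which is why there are 25 degrees of freedom rather than 35.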