The data are in pollution.asc. These are daily measurements of particulate matter in ppm: the first observation is 2 May; the last is 26 Nov.
Below are the S-Plus commands I used to make the plot and to compute a paired t-test, which is the same as a one-sample t-test on the city minus rural differences.
# reading in the data using scan(). You could go and read
# the file into a dataframe instead.

# pollution data (matched pairs example)
# scan() reads the numbers typed after it; a blank line ends the input
city <- scan()
39 68 42 34 48 82 45 NA NA 60 57 NA 39 NA 123 59 71 41
42 38 NA 57 50 58 45 69 23 72 49 86 51 42 46 NA 44 42

rural <- scan()
NA 67 42 33 46 NA 43 54 NA NA NA NA 38 88 108 57 70 42
43 39 NA 52 48 56 44 51 21 74 48 84 51 43 45 41 47 35

# first the plot
par(oma=c(1,1,4,1))
plot(city,pch=1,ylab="",xlab="days",ylim=c(-5,124),axes=F)
axis(2); axis(1,at=c(1,36,10,20),labels=c("May 2","Nov 26","July","September"))
box()
points(rural,pch=3)
# vertical bars show the city minus rural difference for each day
points(city-rural,type="h",lwd=3)
lines(c(1,36),c(0,0),lty=2)
mtext("airborne particle matter",side=2,line=3,at=80,srt=90)
mtext("difference",side=2,line=3,at=0,srt=90)
mtext("Pollution readings at city and nearby rural site",side=3,line=0,
      cex=1.3,outer=T)
# click on the plot to place the legend
legend(locator(1),c("city","rural"),marks=c(1,3))

# now the t-test
t.test(city,rural,paired=T)
It spits out the output below, suggesting that the rural location reads about 2.2 ppm less than the city location on average.
        Paired t-Test

data:  city and rural
t = 2.3832, df = 25, p-value = 0.0251
alternative hypothesis: true mean of differences is not equal to 0
95 percent confidence interval:
 0.2977415 4.0868739
sample estimates:
 mean of x - y
      2.192308
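To see where df = 25 and the estimate 2.192308 come from, here is a rough cross-check of the arithmetic. It is a sketch in Python (standard library only), not the S-Plus session above. The variable names are mine, and None stands in for NA. It assumes the paired test simply drops any day on which either site is missing, which is consistent with the 25 degrees of freedom reported.

```python
import math

NA = None  # stand-in for the NA entries in the data listing

city = [39, 68, 42, 34, 48, 82, 45, NA, NA, 60, 57, NA, 39, NA, 123, 59, 71, 41,
        42, 38, NA, 57, 50, 58, 45, 69, 23, 72, 49, 86, 51, 42, 46, NA, 44, 42]
rural = [NA, 67, 42, 33, 46, NA, 43, 54, NA, NA, NA, NA, 38, 88, 108, 57, 70, 42,
         43, 39, NA, 52, 48, 56, 44, 51, 21, 74, 48, 84, 51, 43, 45, 41, 47, 35]

# Keep only the days on which both sites have a reading (assumed NA handling).
diffs = [c - r for c, r in zip(city, rural) if c is not None and r is not None]

n = len(diffs)                     # number of complete pairs
df = n - 1                         # degrees of freedom
mean_diff = sum(diffs) / n         # mean of city minus rural
var_diff = sum((d - mean_diff) ** 2 for d in diffs) / df  # sample variance
se = math.sqrt(var_diff / n)       # standard error of the mean difference
t_stat = mean_diff / se            # one-sample t statistic on the differences

print(f"n = {n}, df = {df}")
print(f"mean difference = {mean_diff:.6f}")
print(f"t = {t_stat:.4f}")
```

Only 26 of the 36 days have readings at both sites, so the test rests on 26 differences. The p-value and the confidence interval also need the t distribution with 25 degrees of freedom, which the Python standard library does not provide, so they are not reproduced here.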