Goto Lab:

Homework 4

Solutions:

You can solve almost all of these problems using the graphical interface, and you are welcome to do it that way. Below are the command-line commands needed to solve the problems, but you don't have to do all the work that is shown here. All you have to do is read the data into a data frame and then, for a two-sample test:
1) statistics>compare samples>two samples>t test
2) put the variables in the x variable and y variable fields
3) click on "data has a grouping variable"
4) ok
For one-sample tests:
1) statistics>compare samples>one sample>t test
2) put the variable in the x variable
3) ok
For basic statistical information:
1) statistics>data summaries>summary statistics
2) click on the icons of the statistics you want to know
3) choose the set/subset of variables you want to know
4) choose if you want to group variables
5) ok

Check that you have the correct answers, but don't worry if you did it a different way than what was shown here.
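
The menu steps above also have short command-line counterparts. The sketch below assumes the built-in t.test function, which exists in both S-PLUS and R. The vectors x, y, values and group hold made-up numbers purely for illustration, and the code is written with <- and TRUE so that it also runs in R.

```r
# Hypothetical measurements for two groups (not from the homework).
x <- c(5.1, 6.3, 4.8, 7.0, 5.9)
y <- c(3.9, 4.4, 5.0, 3.7, 4.6)

# Two-sample t test with a pooled standard deviation.
two <- t.test(x, y, var.equal = TRUE)

# Same test when the data sit in one vector plus a grouping variable.
values <- c(x, y)
group <- rep(c(1, 2), c(length(x), length(y)))
two.grouped <- t.test(values[group == 1], values[group == 2], var.equal = TRUE)

# One-sample t test of the hypothesis that the mean is zero.
one <- t.test(x, mu = 0)

# Basic summary statistics for one variable.
summary(x)
sqrt(var(x))
```

Printing two, two.grouped or one shows the t statistic, degrees of freedom, p-value and confidence interval in one place.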

Exercise 13

These are the solutions:

Mean 1	Mean 2	S.d. 1	S.d. 2	Pooled s.d.	S.e.	D.f.	t(0.975)	95% interval	t stat.	One-sided p-value
6.57	-1.14	5.85	3.18	4.713	2.519	12	2.18	[2.22,13.20]	3.06	0.0049

# assume the data are in two vectors bp and code.  The following
# commands put the bp data into two vectors, one for each group.
ok1_code==1
ok2_code==2
bp1_bp[ok1]
bp2_bp[ok2]
# this puts the data into two vectors: group 1 bps in bp1 and
# the group 2 bps in bp2

#this is problem 13a
m1_mean(bp1)
m2_mean(bp2)
s1_sqrt(var(bp1))
s2_sqrt(var(bp2))
#this is problem 13b. With the length command I assign the number
#of observations to n1 and n2. Then I get the pooled estimate of the
#standard deviation as the square root of the formula, written in
#the order in which it appears in the book.
n1_length(bp1)
n2_length(bp2)
sp_sqrt((((n1-1)*var(bp1))+((n2-1)*var(bp2)))/(n1+n2-2))
#this is problem 13c. Same procedure for the SE(Y2-Y1)
sey2y1_sp*sqrt((1/n1)+(1/n2))
#you can see that the degrees of freedom are n1+n2-2=12
#note that I have used the qt function, which takes arguments (x,y).
#It gives the x quantile of a t distribution with y degrees of
#freedom => qt(x,y), 0<=x<=1, y>=1.
qt(0.975,n1+n2-2)
#this is problem 13e. The confidence intervals. Just the formula.
ica_m2-m1-qt(0.975,n1+n2-2)*sey2y1
icb_m2-m1+qt(0.975,n1+n2-2)*sey2y1
#this is problem 13f
#testing equality => D=0
t_((m2-m1)-0)/sey2y1
#13g is direct. Note that I used abs so that the sign of the t-value
#cannot lead to the wrong tail: I take its absolute value and always
#compute the area in the right tail of the bell (a one-sided p-value).
pvalue_1-pt(abs(t),n1+n2-2)
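
Written out as formulas, the commands above compute the following quantities. The notation (sample means \bar{Y}_i, standard deviations s_i, sample sizes n_i) is an assumption chosen to match the variable names in the code, not necessarily the notation of the book.

```latex
\begin{align*}
s_p &= \sqrt{\frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}} \\
\mathrm{SE}(\bar{Y}_2-\bar{Y}_1) &= s_p\sqrt{\frac{1}{n_1}+\frac{1}{n_2}} \\
\text{95\% interval} &= (\bar{Y}_2-\bar{Y}_1) \pm t_{n_1+n_2-2}(0.975)\,\mathrm{SE}(\bar{Y}_2-\bar{Y}_1) \\
t &= \frac{(\bar{Y}_2-\bar{Y}_1) - 0}{\mathrm{SE}(\bar{Y}_2-\bar{Y}_1)}
\end{align*}
```

With the values in the table, the difference in means is 6.57 - (-1.14) = 7.71 in absolute value, so |t| = 7.71/2.519, which is about 3.06 and agrees with the tabled t statistic.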

Exercise 15

These are the solutions:

Confidence interval	p-value
[8.67,13.33]	0

#exercise 15
	#you have just created the formulas for this kind of problem, so you
	#only need to enter the new data and mechanically obtain the results
	m1_29.2
	m2_18.2
	s1_7.5
	s2_5.8
	n1_126
	n2_50
	#from now on you only need to tell the program what to do with these
	#data using the commands (standard for any data) we created before.
	#Only summary statistics are given here, so the pooled s.d. uses
	#s1^2 and s2^2 in place of var(bp1) and var(bp2), which would
	#still hold the exercise 13 data.
	sp_sqrt((((n1-1)*s1^2)+((n2-1)*s2^2))/(n1+n2-2))
	sey2y1_sp*sqrt((1/n1)+(1/n2))
	qt(0.975,n1+n2-2)
	ica_m2-m1-qt(0.975,n1+n2-2)*sey2y1
	icb_m2-m1+qt(0.975,n1+n2-2)*sey2y1
	t_((m2-m1)-0)/sey2y1
	pvalue_1-pt(abs(t),n1+n2-2)
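
When only summary statistics are available, the same steps can be wrapped in a small function. The sketch below is illustrative only: pooled.summary is a hypothetical helper name, it takes the difference as m1 - m2 (an assumption about the sign convention), and it is written with <- so that it runs in R as well as S-PLUS.

```r
# Hypothetical helper: pooled two-sample t summary from summary statistics.
pooled.summary <- function(m1, m2, s1, s2, n1, n2, level = 0.95) {
  df <- n1 + n2 - 2
  # Pooled standard deviation built from the two sample standard deviations.
  sp <- sqrt(((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / df)
  # Standard error of the difference between the two means.
  se <- sp * sqrt(1 / n1 + 1 / n2)
  diff <- m1 - m2
  tq <- qt(1 - (1 - level) / 2, df)
  tstat <- diff / se
  list(sp = sp, se = se, df = df,
       interval = c(diff - tq * se, diff + tq * se),
       t = tstat,
       onesidedp = 1 - pt(abs(tstat), df))
}

# Exercise 15 inputs as given in the solution code above.
res <- pooled.summary(29.2, 18.2, 7.5, 5.8, 126, 50)
res$interval
res$onesidedp
```

Passing the standard deviations directly avoids any dependence on data vectors left over from an earlier exercise.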

Exercise 16

These are the solutions:

Confidence interval	one-sided p-value	degrees of freedom
[1.29,6.99]	0.00269	45

#Exercise 16
	#ic is the group of intrinsic
	#ec is the group of extrinsic
	ic_c(12,12,12.9,13.6,16.6,17.2,17.5,18.2,19.1,19.3,19.8,20.3,
	20.5,20.6,21.3,21.6,22.1,22.2,22.6,23.1,24.0,24.3,26.7,29.7)
	ec_c(5,5.4,6.1,10.9,11.8,12.0,12.3,14.8,15.0,16.8,17.2,17.2,
	17.4,17.5,18.5,18.7,18.7,19.2,19.5,20.7,21.2,22.1,24.0)
	#again the best way to do this is to use standard formulae.
	bp1_ic
	bp2_ec
	m1_mean(bp1)
	m2_mean(bp2)
	s1_sqrt(var(bp1))
	s2_sqrt(var(bp2))
	n1_length(bp1)
	n2_length(bp2)
	sp_sqrt((((n1-1)*var(bp1))+((n2-1)*var(bp2)))/(n1+n2-2))
	sey2y1_sp*sqrt((1/n1)+(1/n2))
	ica_m2-m1-qt(0.975,n1+n2-2)*sey2y1
	icb_m2-m1+qt(0.975,n1+n2-2)*sey2y1
	t_((m2-m1)-0)/sey2y1
	#store the cumulative probability in P before using it
	P_pt(abs(t),n1+n2-2)
	onesidedpvalue_1-P
	twosidedpvalue_2*(onesidedpvalue)
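
As a cross-check on the hand computation, the built-in t.test function can be run on the same two samples. This is a sketch that assumes t.test with the var.equal and alternative arguments (as in R); the data are the ic and ec vectors from this exercise, retyped with <-.

```r
# Intrinsic (ic) and extrinsic (ec) groups, copied from the exercise.
ic <- c(12, 12, 12.9, 13.6, 16.6, 17.2, 17.5, 18.2, 19.1, 19.3, 19.8, 20.3,
        20.5, 20.6, 21.3, 21.6, 22.1, 22.2, 22.6, 23.1, 24.0, 24.3, 26.7, 29.7)
ec <- c(5, 5.4, 6.1, 10.9, 11.8, 12.0, 12.3, 14.8, 15.0, 16.8, 17.2, 17.2,
        17.4, 17.5, 18.5, 18.7, 18.7, 19.2, 19.5, 20.7, 21.2, 22.1, 24.0)

# Pooled two-sample t test; the interval is for mean(ic) - mean(ec).
fit <- t.test(ic, ec, var.equal = TRUE)
fit$conf.int
fit$p.value        # two-sided p-value

# One-sided test of whether the intrinsic mean is larger.
onesided <- t.test(ic, ec, var.equal = TRUE, alternative = "greater")
onesided$p.value
```

Because the observed difference is positive, the one-sided p-value is half of the two-sided one.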

Exercise 20

These are the solutions:

Mean	s.d.	degrees of freedom	s.e.	Confidence interval	one-sided p	two-sided p
6.57	5.85	6	2.213	[1.156,11.986]	0.0125	0.0250

#For exercise 20 let's see what we already have
	#the average for this group is m1 and the s.d. is s1 (group 1 of
	#exercise 13; recompute m1, s1 and n1 from that bp1 if they have
	#been overwritten by a later exercise)
	#the standard error for the average is
	serr_s1/sqrt(n1)
	#the confidence interval is then easy to construct
	ic1_m1-(qt(0.975,n1-1)*serr)
	ic2_m1+(qt(0.975,n1-1)*serr)
	#We can see that the interval is very broad because of the
	#few degrees of freedom that we have.
	#the t statistic divides the mean by its standard error
	t_m1/serr
	pvalue_1-pt(abs(t),n1-1)
	twosidedp_2*pvalue
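
For reference, the one-sample quantities used in this exercise can be written as formulas. The notation (\bar{Y} for the sample mean, s for the sample standard deviation, n for the sample size) is an assumption chosen to mirror m1, s1 and n1 in the code.

```latex
\begin{align*}
\mathrm{SE}(\bar{Y}) &= \frac{s}{\sqrt{n}} \\
\text{95\% interval} &= \bar{Y} \pm t_{n-1}(0.975)\,\mathrm{SE}(\bar{Y}) \\
t &= \frac{\bar{Y} - 0}{\mathrm{SE}(\bar{Y})}
\end{align*}
```

The t statistic divides the mean by its standard error, and it is compared with a t distribution on n-1 degrees of freedom.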

Exercise 21

These are the solutions:

Confidence Interval	2-sided p-value	t
[0.0961,1.5281]	0.0269	2.2714

#Exercise 21
	ws_c(24.5,26.9,26.9,24.3,24.1,26.5,24.6,24.2,23.6,26.2,26.2,
	24.8,25.4,23.7,25.7,25.7,26.3,26.7,23.9,24.7,28.0,27.9,25.9,
	25.7,26.6,23.2,25.7,26.3,24.3,26.7,24.9,23.8,25.6,27.0,24.7)
	wp_c(26.5,26.1,25.6,25.9,25.5,27.6,25.8,24.9,26.0,26.5,26.0,
	27.1,25.1,26.0,25.6,25.0,24.6,25,26,28.3,24.6,27.5,31.1,28.3)
	#again it's easier to take what we have and use it instead
	#of doing the same again and again for different data
	bp1_ws
	bp2_wp
	m1_mean(bp1)
	m2_mean(bp2)
	s1_sqrt(var(bp1))
	s2_sqrt(var(bp2))
	n1_length(bp1)
	n2_length(bp2)
	sp_sqrt((((n1-1)*var(bp1))+((n2-1)*var(bp2)))/(n1+n2-2))
	sey2y1_sp*sqrt((1/n1)+(1/n2))
	ica_m2-m1-qt(0.975,n1+n2-2)*sey2y1
	icb_m2-m1+qt(0.975,n1+n2-2)*sey2y1
	t_((m2-m1)-0)/sey2y1
	twosidedp_2*(1-pt(abs(t),n1+n2-2))
        
The summary should resemble the one on pages 28 and 29.

Extra lab

These are the solutions:

#extra assignment: power test graphics
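	#powr gives the approximate power of the one-sided 5% level t-test.
	#basis=true difference mean(trt)-mean(cont), sd=pooled s.d.,
	#n=number of observations in each group (assumed here)
	powr_function(basis,n,sd){
	#with n observations per group the s.e. of the difference is
	#sqrt(2*sd^2/n) and there are 2*n-2 degrees of freedom
	t_(basis)/sqrt(2*(sd*sd)/n)
	tbound_qt(0.95,2*n-2)
	1-pt(tbound-t,2*n-2)}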
	#Now the function is stored in memory and you can use it whenever
	#you want for different data.
	trt_c(1.121,1.29,1.183,1.145,1.168,1.316,.998,1.174)
	cont_c(1.012,1.111,1.014,1.091,1.098,1.179)
	m_mean(trt)-mean(cont)
	n1_length(trt)
	n2_length(cont)
	#the division by n1+n2-2 must apply to the whole sum of squares
	sp_sqrt(((var(trt)*(n1-1))+(var(cont)*(n2-1)))/(n1+n2-2))
	siz_c(3,4,5,6,10)
	basis_seq(0,max(mean(trt),mean(cont)),length=40)
	plot(basis,powr(basis,10,sp),type="n",
	ylab="power - chance of rejecting NH",
	xlab="mu(trt)-mu(cont)",
	main="Power curve using t-test with pooled s.d.")
	for (aa in 1:5){
	lines(basis,powr(basis,siz[aa],sp),lwd=3)}
	text(locator(5),c("n=3","n=4","n=5","n=6","n=10"))
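
The power function above shifts a central t distribution, which is an approximation. A sketch of the exact calculation uses the noncentral t distribution through the ncp argument of pt, which is available in R. The name power.exact, the per-group sample size n and the illustrative s.d. of 0.1 are assumptions made for this example, not part of the assignment.

```r
# Exact power of the one-sided pooled t test with n observations per group.
power.exact <- function(delta, n, sd, alpha = 0.05) {
  df <- 2 * n - 2
  # Noncentrality parameter: true difference divided by its standard error.
  ncp <- delta / (sd * sqrt(2 / n))
  # Critical value of the level-alpha one-sided test.
  tcrit <- qt(1 - alpha, df)
  # Probability that the noncentral t statistic exceeds the critical value.
  1 - pt(tcrit, df, ncp = ncp)
}

# With no true difference the power equals the significance level.
power.exact(0, 5, 0.1)

# Power for several hypothetical true differences, five observations per group.
power.exact(c(0.05, 0.1, 0.2), 5, 0.1)
```

The same function can be passed to lines() in place of powr to overlay exact curves on the plot.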

#See that to solve almost all of these exercises we have used just
#a few lines and computed new data with a standard set of commands.
#However, the easiest way would be to create a function that takes
#the data and computes everything by just pointing it to the data.
#With the next 17 lines you could have answered all the problems.

#The way to do this is name_function(argument1,argument2,...){
	#commands
	#commands...
	#return(variable,variable2,variable3,...)
	#}

	doeverything_function(bp1,bp2){
	m1_mean(bp1)
	m2_mean(bp2)
	s1_sqrt(var(bp1))
	s2_sqrt(var(bp2))
	n1_length(bp1)
	n2_length(bp2)
	sp_sqrt((((n1-1)*var(bp1))+((n2-1)*var(bp2)))/(n1+n2-2))
	sey2y1_sp*sqrt((1/n1)+(1/n2))
	ica_m2-m1-qt(0.975,n1+n2-2)*sey2y1
	icb_m2-m1+qt(0.975,n1+n2-2)*sey2y1
	t_((m2-m1)-0)/sey2y1
	P_pt(abs(t),n1+n2-2)
	twosidedp_2*(1-pt(abs(t),n1+n2-2))
	return (m1,m2,s1,s2,sp,sey2y1,c("95% interval",ica,icb),
        c("onesidedp",1-P),twosidedp)
	}
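
In R, return accepts a single value, so a port of this function would collect its results in a named list. The version below is a sketch under that assumption: do.everything is an illustrative name, and the vectors a and b are made-up data used only to show the call.

```r
# R port of the doeverything idea; results come back as a named list.
do.everything <- function(bp1, bp2) {
  m1 <- mean(bp1); m2 <- mean(bp2)
  s1 <- sqrt(var(bp1)); s2 <- sqrt(var(bp2))
  n1 <- length(bp1); n2 <- length(bp2)
  df <- n1 + n2 - 2
  # Pooled s.d. and standard error of the difference m2 - m1.
  sp <- sqrt(((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / df)
  se <- sp * sqrt(1 / n1 + 1 / n2)
  tq <- qt(0.975, df)
  tstat <- (m2 - m1) / se
  onesidedp <- 1 - pt(abs(tstat), df)
  list(m1 = m1, m2 = m2, s1 = s1, s2 = s2, sp = sp, se = se,
       interval = c(m2 - m1 - tq * se, m2 - m1 + tq * se),
       t = tstat, onesidedp = onesidedp, twosidedp = 2 * onesidedp)
}

# Made-up data, only to show the call.
a <- c(10.2, 11.5, 9.8, 12.1, 10.9, 11.3)
b <- c(12.4, 13.0, 11.9, 13.6, 12.8)
out <- do.everything(a, b)
out$interval
out$twosidedp
```

Individual results are then available by name, for example out$sp or out$t.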

#This is the same as going to the menus and asking for the statistics
#that you want for any stored data. You already know the path.