October 20, 2015

## Today's agenda

• Review app ex from last time

• One vs. two sided hypothesis tests

• Functions in R + activity

• Due Thursday: App ex - Revisit simulation based inference with functions

## Types of alternative hypotheses

• One sided (one tailed) alternatives: The parameter is hypothesized to be less than or greater than the null value (< or >)

• Two sided (two tailed) alternatives: The parameter is hypothesized to be not equal to the null value ($$\ne$$)
  • The p-value is calculated as two times the tail area beyond the observed sample statistic
  • More objective, and hence more widely preferred
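
As a rough illustration of the doubling rule, the sketch below computes a one sided and a two sided p-value from a simulated null distribution. Everything in it is made up for illustration: the null value, the spread of the null distribution, the observed statistic (`obs_stat`), and the seed.

```r
set.seed(1020)  # arbitrary seed so the sketch is reproducible

# Hypothetical null distribution of a sample mean, centered at the null value
null_value <- 100
null_dist  <- rnorm(10000, mean = null_value, sd = 2)

obs_stat <- 96  # hypothetical observed sample mean, below the null value

# One sided (<): proportion of simulated statistics at or below the observed one
p_one_sided <- mean(null_dist <= obs_stat)

# Two sided (not equal): two times the tail area beyond the observed statistic
p_two_sided <- 2 * p_one_sided

c(one_sided = p_one_sided, two_sided = p_two_sided)
```

Because the two sided p-value is double the one sided one, a result that looks significant with a one sided alternative may not be with a two sided alternative.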

Average systolic blood pressure of people with Stage 1 Hypertension is 150 mm Hg. Suppose we want to use a hypothesis test to evaluate whether a new blood pressure medication has an effect on the average blood pressure of heart patients. What are the hypotheses?

## Function Basics

In R, functions are first-class objects. This means we can work with them like any other object in R.

f <- function(x){
x*x
}
list(f)
## [[1]]
## function (x)
## {
##     x * x
## }
typeof(f)
## [1] "closure"
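
A minimal sketch of what "like any other object" can mean in practice: a function can be passed as an argument, stored in a list, or assigned to a new name. The names `square`, `fs`, and `also_square` are made up for this example.

```r
square <- function(x) {
  x * x
}

# Pass a function as an argument to another function
sapply(1:3, square)

# Store functions in a list and call one by name
fs <- list(sq = square, root = sqrt)
fs$root(16)

# Assign a function to a new name
also_square <- square
also_square(4)
```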

## Function Parts

The two parts of a function are the arguments (formals) and the code (body).

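gcd <- function(loc1, loc2){
  # loc1 and loc2 are c(latitude, longitude) in degrees; convert to radians
  deg2rad <- function(deg) return(deg * pi/180)
  lat1  <- deg2rad(loc1[1])
  long1 <- deg2rad(loc1[2])
  lat2  <- deg2rad(loc2[1])
  long2 <- deg2rad(loc2[2])

  R <- 6371 # Earth mean radius in km
  d <- acos(sin(lat1) * sin(lat2) + cos(lat1) * cos(lat2) * cos(long2-long1)) * R

  return(d) # distance in km
}

## Function parts (cont.)

formals(gcd)
## $loc1
##
##
## $loc2
##
##
body(gcd)
## {
##     deg2rad <- function(deg) return(deg * pi/180)
##     lat1 <- deg2rad(loc1[1])
##     long1 <- deg2rad(loc1[2])
##     lat2 <- deg2rad(loc2[1])
##     long2 <- deg2rad(loc2[2])
##     R <- 6371
##     d <- acos(sin(lat1) * sin(lat2) + cos(lat1) * cos(lat2) *
##         cos(long2 - long1)) * R
##     return(d)
## }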

## Distance between LA and Durham

los_angeles <- c(34.052235, -118.243683)
durham <- c(36.002453, -78.905869)

gcd(los_angeles, durham)
## [1] 3564.199

## Return values

In the preceding slides we have seen two approaches for returning values: explicit and implicit return values. The former should be preferred over the latter, except in the case of very simple functions.

Explicit: includes one or more returns

f <- function(x){
return(x * x)
}

Implicit: value from last statement is returned

f <- function(x){
x * x
}
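
One place where explicit returns tend to help is a function with more than one exit point. The sketch below uses a made-up function, `safe_sqrt`, purely for illustration.

```r
# Explicit returns make an early exit easy to spot
safe_sqrt <- function(x) {
  if (x < 0) {
    return(NA)  # stop here for negative input
  }
  return(sqrt(x))
}

safe_sqrt(9)
safe_sqrt(-9)
```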

## Returning multiple and named values

If we want a function to return more than one value we can group things using either vectors or lists.

f <- function(x){
list(value = x, squared = x^2, cubed = x^3)
}
f(2)
## $value
## [1] 2
##
## $squared
## [1] 4
##
## $cubed
## [1] 8
f(2:3)
## $value
## [1] 2 3
##
## $squared
## [1] 4 9
##
## $cubed
## [1]  8 27
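
A short sketch of how the returned pieces might be used afterwards, and of the vector alternative mentioned above. The names `res` and `g` are made up for this example, and the vector version is shown for a single input value only.

```r
f <- function(x) {
  list(value = x, squared = x^2, cubed = x^3)
}

res <- f(2)
res$squared       # pull out one element by name
res[["cubed"]]    # equivalent double-bracket form

# A named vector also works when every returned value has the same type
g <- function(x) {
  c(value = x, squared = x^2, cubed = x^3)
}
g(2)["cubed"]
```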

## Argument defaults

In R it is possible to give function arguments default values:

f <- function(x = 1, y = 1, z = 1){
paste0("x=", x, " y=", y, " z=", z)
}
f()
## [1] "x=1 y=1 z=1"
f(x = 2, y = 4, z = 9)
## [1] "x=2 y=4 z=9"
f(z = 3)
## [1] "x=1 y=1 z=3"
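
Arguments can also be supplied by position, or by a mix of position and name; anything left out keeps its default. A small sketch reusing the same `f`:

```r
f <- function(x = 1, y = 1, z = 1) {
  paste0("x=", x, " y=", y, " z=", z)
}

f(2)             # positional: 2 is matched to x
f(2, z = 9)      # mixed: x by position, z by name, y keeps its default
f(z = 9, x = 2)  # named arguments can appear in any order
```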

## Scoping

R has generous scoping rules: if it can't find a variable in the function body's scope, it will look for it in the next higher scope, and so on.

y <- 1
f <- function(x){
x + y
}

f(3)
## [1] 4
g <- function(x){
y <- 2
x + y
}

g(3)
## [1] 5

Additionally, variables defined within a scope only persist for the duration of that scope, and do not overwrite variables at the higher scopes.

x <- 1
y <- 1
z <- 1
f <- function(){
  y <- 2
  g <- function(){
    z <- 3
    return(x + y + z)
  }
  return(g())
}
f()
## [1] 6
c(x, y, z)
## [1] 1 1 1
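
One consequence of these rules: the lookup happens when the function is called, not when it is defined. A minimal sketch, where `first` and `second` are names made up to hold the two results:

```r
y <- 1
f <- function(x) {
  x + y  # y is not defined inside f, so R looks in the enclosing scope
}
first <- f(3)   # uses y = 1

y <- 10         # change y after f was defined
second <- f(3)  # uses y = 10

c(first, second)
```

This is one reason to pass values in as arguments rather than rely on variables that happen to exist outside the function.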

## Getting Help

Prefixing any function name with a ? will open the related help file for that function. Operators and other special names need to be quoted.

?sum
?"+"
?"["

For functions not in the base package, you can also see their implementation by entering the function name without parentheses (or by using the body function).

lm
## function (formula, data, subset, weights, na.action, method = "qr",
##     model = TRUE, x = FALSE, y = FALSE, qr = TRUE, singular.ok = TRUE,
##     contrasts = NULL, offset, ...)
## {
##     ret.x <- x
##     ret.y <- y
##     cl <- match.call()
##     mf <- match.call(expand.dots = FALSE)
##     m <- match(c("formula", "data", "subset", "weights", "na.action",
##         "offset"), names(mf), 0L)
##     mf <- mf[c(1L, m)]
##     mf$drop.unused.levels <- TRUE
##     mf[[1L]] <- quote(stats::model.frame)
##     mf <- eval(mf, parent.frame())
##     if (method == "model.frame")
##         return(mf)
##     else if (method != "qr")
##         warning(gettextf("method = '%s' is not supported. Using 'qr'",
##             method), domain = NA)
##     mt <- attr(mf, "terms")
##     y <- model.response(mf, "numeric")
##     w <- as.vector(model.weights(mf))
##     if (!is.null(w) && !is.numeric(w))
##         stop("'weights' must be a numeric vector")
##     offset <- as.vector(model.offset(mf))
##     if (!is.null(offset)) {
##         if (length(offset) != NROW(y))
##             stop(gettextf("number of offsets is %d, should equal %d (number of observations)",
##                 length(offset), NROW(y)), domain = NA)
##     }
##     if (is.empty.model(mt)) {
##         x <- NULL
##         z <- list(coefficients = if (is.matrix(y)) matrix(, 0,
##             3) else numeric(), residuals = y, fitted.values = 0 *
##             y, weights = w, rank = 0L, df.residual = if (!is.null(w)) sum(w !=
##             0) else if (is.matrix(y)) nrow(y) else length(y))
##         if (!is.null(offset)) {
##             z$fitted.values <- offset
##             z$residuals <- y - offset
##         }
##     }
##     else {
##         x <- model.matrix(mt, mf, contrasts)
##         z <- if (is.null(w))
##             lm.fit(x, y, offset = offset, singular.ok = singular.ok,
##                 ...)
##         else lm.wfit(x, y, w, offset = offset, singular.ok = singular.ok,
##             ...)
##     }
##     class(z) <- c(if (is.matrix(y)) "mlm", "lm")
##     z$na.action <- attr(mf, "na.action")
##     z$offset <- offset
##     z$contrasts <- attr(x, "contrasts")
##     z$xlevels <- .getXlevels(mt, mf)
##     z$call <- cl
##     z$terms <- mt
##     if (model)
##         z$model <- mf
##     if (ret.x)
##         z$x <- x
##     if (ret.y)
##         z$y <- y
##     if (!qr)
##         z$qr <- NULL
##     z
## }
## <bytecode: 0x7fca35a2a908>
## <environment: namespace:stats>

## When to use functions

The goal of a function should be to encapsulate a small reusable piece of code.

• Name should make it clear what the function does (think in terms of simple verbs).

• Functionality should be simple enough to be quickly understood.

• The smaller and more modular the code the easier it will be to reuse elsewhere.

• Better to change code in one location than code everywhere.
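
To make the last two points concrete, here is a small sketch of pulling repeated code into a function. The data (`a`, `b`) and the function name `rescale01` are made up for illustration.

```r
# Hypothetical data, made up for illustration
a <- c(2, 4, 6)
b <- c(10, 15, 30)

# Without a function: the same logic is copied for each vector
a_scaled <- (a - min(a)) / (max(a) - min(a))
b_scaled <- (b - min(b)) / (max(b) - min(b))

# With a function: the logic lives in one place
rescale01 <- function(x) {
  (x - min(x)) / (max(x) - min(x))
}

rescale01(a)
rescale01(b)
```

If the rescaling rule ever changes, only the body of `rescale01` needs to be edited.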

## Activity

Write a function that takes in the birth month of a person, and outputs the phrase "You are a season baby!", where season is determined by birth month.

Hint: The paste function might be useful.

## Application exercise

Functionalized inference: see course website for details