Today’s agenda

  • Review app ex from last time

  • One vs. two sided hypothesis tests

  • Functions in R + activity

  • Due Thursday: App ex - Revisit simulation based inference with functions

One vs. two sided hypothesis tests

Types of alternative hypotheses

  • One-sided (one-tailed) alternatives: The parameter is hypothesized to be less than or greater than the null value (< or >)

  • Two-sided (two-tailed) alternatives: The parameter is hypothesized to be not equal to the null value (\(\ne\))
    • The p-value is calculated as two times the tail area beyond the observed sample statistic
    • More objective, and hence more widely preferred
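
As a rough sketch of the "two times the tail area" rule, the code below computes a one-sided and a two-sided p-value from a simulated null distribution. The values in null_stats and observed are made up for illustration; in practice they would come from your own simulation.

```r
# Hypothetical null distribution of a sample statistic (illustrative values only)
null_stats <- c(-4, -3, -2, -2, -1, -1, 0, 0, 0, 1,
                1, 2, 2, 3, 4, -1, 0, 1, 0, 0)
observed <- 3  # hypothetical observed sample statistic

# One-sided (greater): proportion of simulated statistics at least as large as observed
p_one_sided <- mean(null_stats >= observed)

# Two-sided: twice the tail area beyond the observed statistic, capped at 1
p_two_sided <- min(1, 2 * p_one_sided)

p_one_sided
p_two_sided
```

Here 2 of the 20 simulated values are at least 3, so the one-sided p-value is 0.1 and the two-sided p-value is 0.2.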

Average systolic blood pressure of people with Stage 1 Hypertension is 150 mm Hg. Suppose we want to use a hypothesis test to evaluate whether a new blood pressure medication has an effect on the average blood pressure of heart patients. What are the hypotheses?
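
One possible reading, offered as a sketch rather than the official answer: the question asks whether the medication has an effect in either direction, which suggests a two-sided alternative. Here \(\mu\) denotes the average systolic blood pressure of patients taking the medication (a symbol assumed for this example, not used in the slides):

\[
H_0: \mu = 150 \qquad H_A: \mu \ne 150
\]

A one-sided alternative such as \(\mu < 150\) would fit only if the question asked specifically whether the medication lowers blood pressure.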

Functions

Function Basics

In R, functions are first-class objects. This means we can work with them like any other object in R: assign them to names, store them in lists, and pass them to other functions.

f <- function(x){
  x*x
}
list(f)
## [[1]]
## function (x) 
## {
##     x * x
## }
typeof(f)
## [1] "closure"
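
To make "like any other object" a little more concrete, here is a minimal sketch that stores functions in a list and passes one as an argument. The names square, cube, fns, and apply_twice are hypothetical and chosen only for this example.

```r
square <- function(x) x * x
cube <- function(x) x^3

# Functions can be stored in a list like any other value
fns <- list(square = square, cube = cube)

# ... and applied one by one to the same input
results <- sapply(fns, function(fn) fn(3))

# A function can also take another function as an argument
apply_twice <- function(fn, x) {
  return(fn(fn(x)))
}

results
apply_twice(square, 2)
```

results is a named vector holding 9 and 27, and apply_twice(square, 2) squares 2 twice to give 16.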

Function Parts

The two parts of a function we focus on here are the arguments (formals) and the code (body).

gcd <- function(loc1, loc2){
  deg2rad <- function(deg) return(deg*pi/180)

  lat1 <- deg2rad( loc1[1] )
  lat2 <- deg2rad( loc2[1] )
  long1 <- deg2rad( loc1[2] )
  long2 <- deg2rad( loc2[2] )

  R <- 6371 # Earth mean radius in km
  d <- acos(sin(lat1) * sin(lat2) + cos(lat1) * cos(lat2) * cos(long2-long1)) * R
  
  return(d) # distance in km
}

Function parts (cont.)

formals(gcd)
## $loc1
## 
## 
## $loc2
body(gcd)
## {
##     deg2rad <- function(deg) return(deg * pi/180)
##     lat1 <- deg2rad(loc1[1])
##     lat2 <- deg2rad(loc2[1])
##     long1 <- deg2rad(loc1[2])
##     long2 <- deg2rad(loc2[2])
##     R <- 6371
##     d <- acos(sin(lat1) * sin(lat2) + cos(lat1) * cos(lat2) * 
##         cos(long2 - long1)) * R
##     return(d)
## }

Distance between LA and Durham

los_angeles <- c(34.052235, -118.243683)
durham <- c(36.002453, -78.905869)

gcd(los_angeles, durham)
## [1] 3564.199

[Figure: great-circle distance (gcd) from Durham to LA]

Return values

In the preceding slides we have seen two approaches for returning values: explicit and implicit return values. The former should be preferred over the latter, except in the case of very simple functions.

Explicit: includes one or more returns

f <- function(x){
  return(x * x)
}


Implicit: value from last statement is returned

f <- function(x){
  x * x 
}
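
Explicit returns are most helpful when a function has more than one exit point. The sketch below uses a hypothetical function, sign_label, and assumes x is a single number.

```r
# Hypothetical example: label a single number by its sign
sign_label <- function(x){
  if (x < 0) {
    return("negative")  # early exit
  }
  if (x == 0) {
    return("zero")      # early exit
  }
  return("positive")    # reached only if neither condition held
}

sign_label(-2)
sign_label(0)
sign_label(5)
```

Each return() marks a point where the function stops, which is harder to see when relying on the value of the last statement.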

Returning multiple and named values

If we want a function to return more than one value we can group things using either vectors or lists.

f <- function(x){
  list(value = x, squared = x^2, cubed = x^3)
}
f(2)
## $value
## [1] 2
## 
## $squared
## [1] 4
## 
## $cubed
## [1] 8
f(2:3)
## $value
## [1] 2 3
## 
## $squared
## [1] 4 9
## 
## $cubed
## [1]  8 27
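
Once values are grouped, individual pieces can be pulled out by name. The sketch below repeats the list version under a hypothetical name, powers_list, and adds a named-vector variant, powers_vec, which is a reasonable choice only when all returned values share one type and x is a single value.

```r
# List version: elements may differ in type and length
powers_list <- function(x){
  return(list(value = x, squared = x^2, cubed = x^3))
}

# Named vector version: assumes x is a single number
powers_vec <- function(x){
  return(c(value = x, squared = x^2, cubed = x^3))
}

res <- powers_list(2)
res$squared          # access a list element by name
res[["cubed"]]       # equivalent access with [[ ]]

powers_vec(2)["squared"]  # access a vector element by name
```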

Argument defaults

In R it is possible to give function arguments default values, which are used whenever the caller does not supply that argument.

f <- function(x = 1, y = 1, z = 1){
  paste0("x=", x, " y=", y, " z=", z)
}
f()
## [1] "x=1 y=1 z=1"
f(x = 2, y = 4, z = 9)
## [1] "x=2 y=4 z=9"
f(z = 3)
## [1] "x=1 y=1 z=3"
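
Arguments can also be supplied by position, or by a mix of position and name; anything left out falls back to its default. A small sketch, using a copy of the function above under the hypothetical name show_args:

```r
show_args <- function(x = 1, y = 1, z = 1){
  return(paste0("x=", x, " y=", y, " z=", z))
}

by_position <- show_args(2)         # 2 is matched to x, the first argument
mixed       <- show_args(2, z = 5)  # x by position, z by name, y keeps its default

by_position
mixed
```

The first call gives "x=2 y=1 z=1" and the second gives "x=2 y=1 z=5".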

Scoping

R has generous scoping rules: if it can’t find a variable in the function body’s scope, it will look for it in the next higher scope, and so on.

y <- 1
f <- function(x){
  x + y
}

f(3)
## [1] 4
g <- function(x){
  y <- 2
  x + y
}

g(3)
## [1] 5

Additionally, variables defined within a scope only persist for the duration of that scope, and do not overwrite variables at the higher scopes.

x <- 1
y <- 1
z <- 1
f <- function(){
  y <- 2
  g <- function(){
    z <- 3
    return(x + y + z)
  }
  return(g())
}
f()
## [1] 6
c(x, y, z)
## [1] 1 1 1
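
One consequence of these rules is that a variable from a higher scope is looked up when the function is called, not when it is defined. The sketch below uses hypothetical names, offset and add_offset, to show this.

```r
offset <- 1

# add_offset does not define offset itself, so R looks it up in the enclosing scope
add_offset <- function(x){
  return(x + offset)
}

first <- add_offset(3)   # uses offset = 1

offset <- 10
second <- add_offset(3)  # same function, but now finds offset = 10

c(first, second)
```

The two calls return 4 and 13. This is one reason to pass values in as arguments rather than rely on variables defined outside the function.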

Getting Help

Prefixing any function name with a ? will open the related help file for that function.

?sum
?`+`
?`[`

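For functions written in R, you can also see their implementation by entering the function name without parentheses (or by calling body() on the function). Primitive functions such as sum are implemented in C, so only a .Primitive stub is printed for them.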

lm
## function (formula, data, subset, weights, na.action, method = "qr", 
##     model = TRUE, x = FALSE, y = FALSE, qr = TRUE, singular.ok = TRUE, 
##     contrasts = NULL, offset, ...) 
## {
##     ret.x <- x
##     ret.y <- y
##     cl <- match.call()
##     mf <- match.call(expand.dots = FALSE)
##     m <- match(c("formula", "data", "subset", "weights", "na.action", 
##         "offset"), names(mf), 0L)
##     mf <- mf[c(1L, m)]
##     mf$drop.unused.levels <- TRUE
##     mf[[1L]] <- quote(stats::model.frame)
##     mf <- eval(mf, parent.frame())
##     if (method == "model.frame") 
##         return(mf)
##     else if (method != "qr") 
##         warning(gettextf("method = '%s' is not supported. Using 'qr'", 
##             method), domain = NA)
##     mt <- attr(mf, "terms")
##     y <- model.response(mf, "numeric")
##     w <- as.vector(model.weights(mf))
##     if (!is.null(w) && !is.numeric(w)) 
##         stop("'weights' must be a numeric vector")
##     offset <- as.vector(model.offset(mf))
##     if (!is.null(offset)) {
##         if (length(offset) != NROW(y)) 
##             stop(gettextf("number of offsets is %d, should equal %d (number of observations)", 
##                 length(offset), NROW(y)), domain = NA)
##     }
##     if (is.empty.model(mt)) {
##         x <- NULL
##         z <- list(coefficients = if (is.matrix(y)) matrix(, 0, 
##             3) else numeric(), residuals = y, fitted.values = 0 * 
##             y, weights = w, rank = 0L, df.residual = if (!is.null(w)) sum(w != 
##             0) else if (is.matrix(y)) nrow(y) else length(y))
##         if (!is.null(offset)) {
##             z$fitted.values <- offset
##             z$residuals <- y - offset
##         }
##     }
##     else {
##         x <- model.matrix(mt, mf, contrasts)
##         z <- if (is.null(w)) 
##             lm.fit(x, y, offset = offset, singular.ok = singular.ok, 
##                 ...)
##         else lm.wfit(x, y, w, offset = offset, singular.ok = singular.ok, 
##             ...)
##     }
##     class(z) <- c(if (is.matrix(y)) "mlm", "lm")
##     z$na.action <- attr(mf, "na.action")
##     z$offset <- offset
##     z$contrasts <- attr(x, "contrasts")
##     z$xlevels <- .getXlevels(mt, mf)
##     z$call <- cl
##     z$terms <- mt
##     if (model) 
##         z$model <- mf
##     if (ret.x) 
##         z$x <- x
##     if (ret.y) 
##         z$y <- y
##     if (!qr) 
##         z$qr <- NULL
##     z
## }
## <bytecode: 0x7fed8d095028>
## <environment: namespace:stats>
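
When the full source is more than you need, the same inspection tools from earlier (formals, body) work on existing functions too. A short sketch using lm from the stats package; the variable names are hypothetical.

```r
# Names of the first two arguments of lm
first_args <- names(formals(stats::lm))[1:2]

# Default value of the method argument
default_method <- formals(stats::lm)$method

# Primitive functions such as sum have no R-level body to display
sum_is_primitive <- is.primitive(sum)

first_args
default_method
sum_is_primitive
```

This shows that lm starts with the arguments formula and data, that method defaults to "qr", and that sum is a primitive.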

When to use functions

The goal of a function should be to encapsulate a small reusable piece of code.

  • Name should make it clear what the function does (think in terms of simple verbs).

  • Functionality should be simple enough to be quickly understood.

  • The smaller and more modular the code the easier it will be to reuse elsewhere.

  • Better to change code in one location than code everywhere.
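
As an illustration of these guidelines, here is a sketch of a small, single-purpose function. The name rescale01 and the task (rescaling a numeric vector to the range 0 to 1) are hypothetical and not taken from the slides; the function assumes the input has at least two distinct values.

```r
# Rescale a numeric vector so its smallest value maps to 0 and its largest to 1
rescale01 <- function(x){
  rng <- range(x, na.rm = TRUE)  # c(min, max), ignoring missing values
  return((x - rng[1]) / (rng[2] - rng[1]))
}

rescaled <- rescale01(c(0, 5, 10))
rescaled
```

The call returns 0, 0.5, and 1. If the rescaling rule ever needs to change, it changes in one place instead of in every copy of the formula.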