# 1 Introduction

Today you will review topics we have covered recently. The Console window displays the output of executed code and can also be used for quick code execution. However, any work done in the Console window will be lost once you exit RStudio.

We will use the `ggplot2` package from `tidyverse`, so let's load `tidyverse` now with

``library(tidyverse)``

It is good practice to load all packages you will use at the start of your R Markdown document.

# 2 Console computation

The console can be used as a calculator.

• addition: `+`
• subtraction: `-`
• division: `/`
• multiplication: `*`
• modulus: `%%`
• integer division: `%/%`
• raise to power: `^`
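The two less familiar operators, `%/%` and `%%`, are easiest to understand side by side with ordinary division. The short sketch below uses the arbitrary values 17 and 5, chosen only for illustration.

```r
# Illustrative values only; 17 and 5 are arbitrary choices.
17 / 5     # ordinary division
17 %/% 5   # integer division: how many whole times 5 fits into 17
17 %% 5    # modulus: the remainder left over
2 ^ 3      # raise to power

# Integer division and modulus fit together like this:
(17 %/% 5) * 5 + (17 %% 5) == 17
```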

Evaluate the following expressions in the Console window.

1. `3 + 4 * 0 - (100 / 3)`
2. `(4 + 6) * (2 ^ 6)`
3. `1 / 0`
4. `10 ^ 10 ^ 10 ^ 10`
5. `0 / 0`
6. `0.0000003 * 2`

When you launch RStudio, numerous functions are immediately available to you. A few of the mathematical and statistical functions are:

| R function | Purpose |
|---|---|
| `abs()` | absolute value |
| `sin()` | sine |
| `cos()` | cosine |
| `tan()` | tangent |
| `log()` | logarithm |
| `exp()` | exponential |
| `mean()` | arithmetic mean |
| `median()` | median |
| `sd()` | standard deviation |
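As a rough sketch of how these functions behave, the chunk below applies a few of them to a small made-up vector `x` (hypothetical data, not part of the lab). Note that some functions, such as `abs()`, work element by element, while others, such as `mean()`, reduce the whole vector to a single number.

```r
# Hypothetical data, made up for illustration.
x <- c(-2, 0, 1, 3, 8)

abs(x)      # element-wise: returns one value per element of x
exp(0)      # the exponential function evaluated at 0

mean(x)     # arithmetic mean of all five values
median(x)   # middle value once x is sorted
sd(x)       # sample standard deviation
```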

Evaluate the following expressions in the Console window.

1. `abs(7)`
2. `sin(3.1415)`
3. `exp(1)`
4. the logarithm of each of expressions 1 - 6 from the first list

What logarithm did you just take? Was it the natural log, base 10, or base 2? Type `?log` in the Console. A question mark placed before the name of a function or built-in data object opens its documentation in the Help tab.
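The question mark is shorthand for the `help()` function, so the following calls should open the same pages. This is a small sketch for orientation; which form you use is a matter of taste.

```r
# These two calls are equivalent; both open the help page for log().
?log
help("log")

# The same approach works for built-in data objects.
help("mtcars")
```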

Type the following in the Console window.

1. `?sd`
2. `?mtcars`
3. `?longley`

What are `mtcars` and `longley`?

The most important parts of an R help page are the description and the examples. Examples are always at the end of the help page. How many examples are given in the help page for `sd`?

1. Run the example provided in the help for `sd`.

Investigate what the following functions do by creating an R chunk with examples of each function in use.

1. `sqrt`
2. `round`
3. `floor`
4. `ceiling`

Now is a good place to save your work. Stage, commit, and push your work to GitHub. Did you remember to configure git?

``````library(usethis)
use_git_config(user.name = "", user.email = "")``````

# 3 Longley data

## 3.1 Examine the data

Consider Longley's Economic Regression Data. This data set is built into R, which means it is available as soon as RStudio is launched. Type `longley` in your Console to see the entire data set. The same data is given below.

Answer the following questions about the `longley` data set.

1. How many rows and columns does `longley` have?
2. What do you think is the difference between the first column of years and the column with the label Year?

Type `head(longley)` in the Console window. What does this do? How about `tail(longley)`?

## 3.2 Summary statistics

The `longley` data set is stored in R as a data frame. Each column is a vector whose elements all share the same type. We will learn about these details later this week. For now, to access a specific vector use `longley$variable_name`, where `variable_name` is one of the variables in `longley`. For example,

``longley$GNP``
`````` [1] 234.289 259.426 258.054 284.599 328.975 346.999 365.385 363.112
[9] 397.469 419.180 442.769 444.546 482.704 502.601 518.173 554.894``````

and

``longley$Year``
`````` [1] 1947 1948 1949 1950 1951 1952 1953 1954 1955 1956 1957 1958 1959 1960
[15] 1961 1962``````

give the GNP and Year vectors of data, respectively.
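The `$` operator is one of a few ways to pull a column out of a data frame. The sketch below compares it with double-bracket indexing; the names `gnp_dollar` and `gnp_bracket` are made up for this illustration.

```r
# `$` and `[[ ]]` both pull out a single column as a vector.
gnp_dollar  <- longley$GNP
gnp_bracket <- longley[["GNP"]]
identical(gnp_dollar, gnp_bracket)

# names() lists the variable names that can follow `$`.
names(longley)

# A column is an ordinary vector, so it can be indexed further.
longley$GNP[1:3]
```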

In a new code chunk below get the following vectors:

1. Unemployed
2. Population
3. Employed

In another code chunk, compute the mean, median, and standard deviation of each of the three vectors above.

The `summary` function in R will give you many of these statistics. For example,

``summary(longley$GNP)``
``````   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
234.3   317.9   381.4   387.7   454.1   554.9 ``````

gives us the minimum, maximum, mean, and quartiles of the GNP vector of data.
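To see where those six numbers come from, the sketch below computes them one at a time and then pulls a single value out of the `summary` result by name. The variable names `gnp` and `gnp_summary` are made up for this illustration.

```r
gnp <- longley$GNP

# The individual pieces that summary() reports.
min(gnp)
max(gnp)
mean(gnp)
median(gnp)
quantile(gnp, probs = c(0.25, 0.75))   # first and third quartiles

# The result of summary() is a named vector, so one entry can be
# extracted by its label.
gnp_summary <- summary(gnp)
gnp_summary[["Mean"]]
```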

Insert a code chunk and use the `summary` function on two variables of your choice in `longley`.

Now is a good place to save your work. Stage, commit, and push your work to GitHub.

## 3.3 Employment investigation

Suppose it is 1962. Two economists are discussing employment. Each makes the following claim.

Economist A: Employment has never been higher in the past 15 years; we have seen a gradual increase from 1947 to 1962.

Economist B: Employment has been range-bound since 1947 and is at its lowest level since 1947.

Which economist is correct?

Let's look at the variable Employed across time using the `ggplot()` function.

``````ggplot(data = longley, mapping = aes(x = Year, y = Employed)) +
geom_point()``````

Is this the best way to look at employment over time? What other variables in the data are changing as we progress from 1947 to 1962? Discuss this with those near you. How would you create a more meaningful representation of employment over time?

Based on your discussions, which economist do you think is correct?
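If your discussion turned to population growth, one possible way to explore the idea is to plot employment relative to population. The chunk below is only a sketch of that approach, not the intended answer. It assumes `Employed` and `Population` are recorded in the same units, and the names `longley_share` and `employed_share` are made up for this illustration.

```r
library(ggplot2)

# Assumption: Employed and Population are recorded in the same units,
# so their ratio is the share of the population that is employed.
longley_share <- longley
longley_share$employed_share <- longley_share$Employed / longley_share$Population

# Plot the ratio over time, joining the points with a line.
p <- ggplot(data = longley_share,
            mapping = aes(x = Year, y = employed_share)) +
  geom_point() +
  geom_line() +
  labs(y = "Employed / Population")
p
```

Compare this plot with the scatterplot of `Employed` against `Year` and consider whether it changes your view of either economist's claim.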

Use package `ggplot2` to recreate the plots below.