---
title: "Lecture 8"
subtitle: "ARMA Models"
date: "2/13/2018"
fontsize: 11pt
output:
  beamer_presentation:
    theme: metropolis
    highlight: pygments
    fig_caption: false
    latex_engine: xelatex
    includes:
      in_header: ../settings.tex
---

```{r setup, include=FALSE}
library(dplyr)
library(ggplot2)
library(forecast)
```

```{r config, include=FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  fig.width=7, fig.height=4.5,
  out.width="\\textwidth",
  fig.align="center",
  echo=TRUE, warning=FALSE
)
ggplot2::theme_set(ggplot2::theme_bw())
source("../util.R")
```

# AR models

## AR(p) models {.t}

We can generalize from an AR(1) to an AR(p) model by simply adding additional autoregressive terms to the model.

$$
\begin{aligned}
AR(p): \quad y_t &= \delta + \phi_1 \, y_{t-1} + \phi_2 \, y_{t-2} + \cdots + \phi_p \, y_{t-p} + w_t \\
&= \delta + w_t + \sum_{i=1}^p \phi_i \, y_{t-i}
\end{aligned}
$$

. . .

What are the properties of an $AR(p)$ process?

1. Expected value?
2. Autocovariance / autocorrelation?
3. Stationarity conditions?

## Lag operator

The lag operator is convenience notation for writing out AR (and other) time series models.

We define the lag operator $L$ as follows,
$$L \, y_t = y_{t-1}$$

. . .

This can be generalized,
$$
\begin{aligned}
L^2 y_t &= L\,(L\, y_{t})\\
&= L \, y_{t-1} \\
&= y_{t-2}
\end{aligned}
$$
therefore,
$$L^k \, y_t = y_{t-k}$$

## Lag polynomial {.t}

Let's rewrite the $AR(p)$ model using the lag operator,

$$
\begin{aligned}
y_t &= \delta + \phi_1 \, y_{t-1} + \phi_2 \, y_{t-2} + \cdots + \phi_p \, y_{t-p} + w_t \\
y_t &= \delta + \phi_1 \, L \, y_t + \phi_2 \, L^2 \, y_t + \cdots + \phi_p \, L^p \, y_t + w_t
\end{aligned}
$$

. . .

$$
\begin{aligned}
y_t - \phi_1 \, L \, y_t - \phi_2 \, L^2 \, y_t - \cdots - \phi_p \, L^p \, y_t &= \delta + w_t \\
(1 - \phi_1 \, L - \phi_2 \, L^2 - \cdots - \phi_p \, L^p) \, y_t &= \delta + w_t
\end{aligned}
$$

. . .
$$~$$

This polynomial of the lags

$$\phi_p(L) = (1 - \phi_1 \, L - \phi_2 \, L^2 - \cdots - \phi_p \, L^p)$$

is called the characteristic polynomial of the AR process.

## Stationarity of $AR(p)$ processes {.t}

**Claim**: An $AR(p)$ process is stationary if the roots of the characteristic polynomial lie *outside* the complex unit circle.

. . .

If we define $\lambda = 1/L$ then we can rewrite the characteristic polynomial as

$$ (\lambda^p - \phi_1 \lambda^{p-1} - \phi_2 \lambda^{p-2} - \cdots - \phi_{p-1} \lambda - \phi_p) $$

then as a corollary of our claim the $AR(p)$ process is stationary if the roots of this new polynomial are *inside* the complex unit circle (i.e. $|\lambda| < 1$).

## Example AR(1)

## Example AR(2)

## AR(2) Stationarity Conditions

\begin{center}
\includegraphics[width=0.85\textwidth]{figs/ar2_conditions.png}
\end{center}

\tiny{From Shumway \& Stoffer, 4th ed.}

## Proof Sketch

We can rewrite the $AR(p)$ model into an $AR(1)$ form using matrix notation

$$
\begin{aligned}
y_t &= \delta + \phi_1 \, y_{t-1} + \phi_2 \, y_{t-2} + \cdots + \phi_p \, y_{t-p} + w_t \\
\symbf\xi_t &= \symbf{\delta} + \symbf{F} \, \symbf\xi_{t-1} + \symbf w_t
\end{aligned}
$$

where

$$
\begin{aligned}
\begin{bmatrix} y_t \\ y_{t-1} \\ y_{t-2} \\ \vdots \\ y_{t-p+1} \end{bmatrix}
&=
\begin{bmatrix} \delta \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}
+
\begin{bmatrix}
\phi_1 & \phi_2 & \phi_3 & \cdots & \phi_{p-1} & \phi_p \\
1      & 0      & 0      & \cdots & 0          & 0      \\
0      & 1      & 0      & \cdots & 0          & 0      \\
\vdots & \vdots & \vdots & \ddots & \vdots     & \vdots \\
0      & 0      & 0      & \cdots & 1          & 0      \\
\end{bmatrix}
\begin{bmatrix} y_{t-1} \\ y_{t-2} \\ y_{t-3} \\ \vdots \\ y_{t-p} \end{bmatrix}
+
\begin{bmatrix} w_t \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix} \\
&=
\begin{bmatrix} \delta + w_t + \sum_{i=1}^p \phi_i \, y_{t-i} \\ y_{t-1} \\ y_{t-2} \\ \vdots \\ y_{t-p+1} \end{bmatrix}
\end{aligned}
$$

## Proof sketch (cont.) {.t}

Just as with the original $AR(1)$, we can expand out the autoregressive equation

$$
\begin{aligned}
\symbf\xi_t &= \symbf{\delta} + \symbf w_t + \symbf{F} \, \symbf\xi_{t-1} \\
&= \symbf{\delta} + \symbf w_t + \symbf{F} \, (\symbf\delta+\symbf w_{t-1}) + \symbf{F}^2 \, (\symbf \delta+\symbf w_{t-2}) + \cdots \\
&\qquad\qquad~\,\,\, + \symbf{F}^{t-1} \, (\symbf \delta+\symbf w_{1}) + \symbf{F}^t \, (\symbf \delta+\symbf w_0) \\
&= \left(\sum_{i=0}^t \symbf F^i\right)\symbf{\delta} + \sum_{i=0}^t \symbf F^i \, \symbf w_{t-i}
\end{aligned}
$$

and therefore for stationarity we need $\symbf F^t \to 0$ as $t \to \infty$.

## Proof sketch (cont.) {.t}

We can find the eigendecomposition such that $\symbf F = \symbf Q \symbf \Lambda \symbf Q^{-1}$ where the columns of $\symbf Q$ are the eigenvectors of $\symbf F$ and $\symbf \Lambda$ is a diagonal matrix of the corresponding eigenvalues.

A useful property of the eigendecomposition is that

$$ \symbf{F}^i = \symbf Q \symbf \Lambda^i \symbf Q^{-1} $$

. . .

Using this property we can rewrite our equation from the previous slide as

$$
\begin{aligned}
\symbf\xi_t &= \left(\sum_{i=0}^t \symbf F^i\right)\symbf{\delta} + \sum_{i=0}^t \symbf F^i \, \symbf w_{t-i} \\
&= \left(\sum_{i=0}^t \symbf Q \symbf \Lambda^i \symbf Q^{-1}\right)\symbf{\delta} + \sum_{i=0}^t \symbf Q \symbf \Lambda^i \symbf Q^{-1} \, \symbf w_{t-i}
\end{aligned}
$$

## Proof sketch (cont.) {.t}

$$
\symbf \Lambda^i =
\begin{bmatrix}
\lambda_1^i & 0           & \cdots & 0           \\
0           & \lambda_2^i & \cdots & 0           \\
\vdots      & \vdots      & \ddots & \vdots      \\
0           & 0           & \cdots & \lambda_p^i \\
\end{bmatrix}
$$

Therefore,
$$\symbf F^t \to 0 \text{ as } t \to \infty$$
when
$$\symbf \Lambda^t \to 0 \text{ as } t \to \infty$$
which requires that
$$|\lambda_i| < 1 \;\; \text{ for all} \; i$$

## Proof sketch (cont.)
Eigenvalues are defined such that for $\symbf \lambda$,

$$ \det (\symbf{F}-\symbf\lambda\,\symbf{I}) = 0$$

based on our definition of $\symbf F$ our eigenvalues will therefore be the roots of

$$\lambda^p -\phi_1\,\lambda^{p-1}-\phi_2\,\lambda^{p-2} - \cdots - \phi_{p-1} \, \lambda - \phi_p = 0$$

. . .

which if we multiply by $1/\lambda^p$, where $L = 1/\lambda$, gives

$$1 -\phi_1\,L-\phi_2\,L^2 - \cdots - \phi_{p-1} \, L^{p-1} - \phi_p \, L^p = 0$$

## Properties of $AR(2)$ {.t}

For a *stationary* $AR(2)$ process where $w_t$ has $E(w_t) = 0$ and $Var(w_t) = \sigma_w^2$

## Properties of $AR(p)$ {.t}

For a *stationary* $AR(p)$ process where $w_t$ has $E(w_t) = 0$ and $Var(w_t) = \sigma_w^2$

$$ E(y_t) = \frac{\delta}{1-\phi_1 -\phi_2-\cdots-\phi_p} $$

$$
\begin{aligned}
\gamma(0) &= \phi_1\gamma(1) + \phi_2\gamma(2) + \cdots + \phi_p\gamma(p) + \sigma_w^2 \\
\gamma(h) &= \phi_1\gamma(h-1) + \phi_2\gamma(h-2) + \cdots + \phi_p\gamma(h-p) \qquad \text{for } h \ge 1
\end{aligned}
$$

$$ \rho(h) = \phi_1\rho(h-1) + \phi_2\rho(h-2) + \cdots + \phi_p\rho(h-p) \qquad \text{for } h \ge 1 $$

# Moving Average (MA) Processes

## MA(1) {.t}

A moving average process is similar to an AR process, except that the regression is on past error terms.
$$ MA(1): \qquad y_t = \delta + w_t + \theta \, w_{t-1} $$

Properties:

## Time series

```{r echo=FALSE, dev="png", dpi=150}
ma = data_frame(
  t = 1:200,
  "θ=0.1"  = arima.sim(n=200, list(ma=c( 0.1))) %>% strip_attrs(),
  "θ=0.8"  = arima.sim(n=200, list(ma=c( 0.8))) %>% strip_attrs(),
  "θ=2.0"  = arima.sim(n=200, list(ma=c( 2.0))) %>% strip_attrs(),
  "θ=-0.1" = arima.sim(n=200, list(ma=c(-0.1))) %>% strip_attrs(),
  "θ=-0.8" = arima.sim(n=200, list(ma=c(-0.8))) %>% strip_attrs(),
  "θ=-2.0" = arima.sim(n=200, list(ma=c(-2.0))) %>% strip_attrs()
)

ma %>%
  tidyr::gather(model, y, -t) %>%
  mutate(model = forcats::as_factor(model)) %>%
  ggplot(aes(x=t, y=y)) +
    geom_line() +
    facet_wrap(~model)
```

## ACF

```{r echo=FALSE}
par(mfrow=c(2,3))
acf(ma$`θ=0.1`, lag.max = 10, main=expression(paste(theta,"=0.1")))
acf(ma$`θ=0.8`, lag.max = 10, main=expression(paste(theta,"=0.8")))
acf(ma$`θ=2.0`, lag.max = 10, main=expression(paste(theta,"=2.0")))
pacf(ma$`θ=0.1`, lag.max = 10, main=expression(paste(theta,"=0.1")))
pacf(ma$`θ=0.8`, lag.max = 10, main=expression(paste(theta,"=0.8")))
pacf(ma$`θ=2.0`, lag.max = 10, main=expression(paste(theta,"=2.0")))
```

## MA(q) {.t}

$$ MA(q): \qquad y_t = \delta + w_t + \theta_1 \, w_{t-1} + \theta_2 \, w_{t-2} + \cdots + \theta_q \, w_{t-q} $$

Properties:

$$E(y_t) = \delta$$

$$
\begin{aligned}
\gamma(0) &= (1 + \theta_1^2+\theta_2^2 + \cdots + \theta_q^2) \, \sigma_w^2 \\
\gamma(h) &= \begin{cases}
(\theta_h + \theta_1\theta_{h+1} + \theta_2\theta_{h+2} + \cdots + \theta_{q-h}\theta_{q}) \, \sigma_w^2 & \text{if } |h| \in \{1,\ldots,q\} \\
0 & \text{otherwise}
\end{cases}
\end{aligned}
$$

## Example series

```{r echo=FALSE}
ma_q = data_frame(
  t = 1:100,
  "θ={-1.5}"           = arima.sim(n=100, list(ma=c(-1.5)))           %>% strip_attrs(),
  "θ={-1.5, -1}"       = arima.sim(n=100, list(ma=c(-1.5, -1)))       %>% strip_attrs(),
  "θ={-1.5, -1, 2}"    = arima.sim(n=100, list(ma=c(-1.5, -1, 2)))    %>% strip_attrs(),
  "θ={-1.5, -1, 2, 3}" = arima.sim(n=100, list(ma=c(-1.5, -1, 2, 3))) %>% strip_attrs()
)

ma_q %>%
  tidyr::gather(model, y, -t) %>%
  mutate(model = forcats::as_factor(model)) %>%
  ggplot(aes(x=t, y=y)) +
    geom_line() +
    facet_wrap(~model)
```

## ACF

```{r echo=FALSE}
par(mfrow=c(2,2))
acf(ma_q[[2]], lag.max = 10, main=expression(paste(theta,"={-1.5}")))
acf(ma_q[[3]], lag.max = 10, main=expression(paste(theta,"={-1.5, -1}")))
acf(ma_q[[4]], lag.max = 10, main=expression(paste(theta,"={-1.5, -1, 2}")))
acf(ma_q[[5]], lag.max = 10, main=expression(paste(theta,"={-1.5, -1, 2, 3}")))
```

# ARMA Model

## ARMA Model {.t}

An ARMA model is a composite of AR and MA processes, $ARMA(p,q)$:

$$ y_t = \delta + \phi_1 \, y_{t-1} + \cdots + \phi_p \, y_{t-p} + w_{t} + \theta_1 \, w_{t-1} + \cdots + \theta_q \, w_{t-q} $$

$$ \phi_p(L) \, y_t = \delta + \theta_q(L) \, w_t $$

Since all $MA$ processes are stationary, we only need to examine the $AR$ component to determine stationarity (i.e. the roots of $\phi_p(L)$ must lie outside the complex unit circle).

## Time series

```{r echo=FALSE, fig.height=5}
arma = data_frame(
  t = 1:100,
  "φ={0.9}, θ={-}"     = arima.sim(n=100, list(ar=c( 0.9), ma=c()    )) %>% strip_attrs(),
  "φ={-0.9}, θ={-}"    = arima.sim(n=100, list(ar=c(-0.9), ma=c()    )) %>% strip_attrs(),
  "φ={-}, θ={0.9}"     = arima.sim(n=100, list(ar=c(),     ma=c( 0.9))) %>% strip_attrs(),
  "φ={-}, θ={-0.9}"    = arima.sim(n=100, list(ar=c(),     ma=c(-0.9))) %>% strip_attrs(),
  "φ={0.9}, θ={0.9}"   = arima.sim(n=100, list(ar=c( 0.9), ma=c( 0.9))) %>% strip_attrs(),
  "φ={-0.9}, θ={0.9}"  = arima.sim(n=100, list(ar=c(-0.9), ma=c( 0.9))) %>% strip_attrs(),
  "φ={0.9}, θ={-0.9}"  = arima.sim(n=100, list(ar=c( 0.9), ma=c(-0.9))) %>% strip_attrs(),
  "φ={-0.9}, θ={-0.9}" = arima.sim(n=100, list(ar=c(-0.9), ma=c(-0.9))) %>% strip_attrs()
)

arma %>%
  tidyr::gather(model, y, -t) %>%
  mutate(model = forcats::as_factor(model)) %>%
  ggplot(aes(x=t, y=y)) +
    geom_line() +
    facet_wrap(~model, ncol=4)
```

## $\phi=0.9$, $\theta=0$

```{r echo=FALSE}
forecast::tsdisplay(arma[[2]], main=expression(paste(phi,"={0.9}, ",theta,"={0}")))
```

## $\phi=0$, $\theta=0.9$

```{r echo=FALSE}
forecast::tsdisplay(arma[[4]], main=expression(paste(phi,"={0}, ",theta,"={0.9}")))
```

## $\phi=0.9$, $\theta=0.9$

```{r echo=FALSE}
forecast::tsdisplay(arma[[6]], main=expression(paste(phi,"={0.9}, ",theta,"={0.9}")))
```
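## Example: checking stationarity {.t}

A worked example of the stationarity check above (the coefficients here are chosen purely for illustration). Consider the $ARMA(2,1)$ process

$$ y_t = 0.5 \, y_{t-1} + 0.3 \, y_{t-2} + w_t + 0.9 \, w_{t-1} $$

The MA component does not affect stationarity, so we solve

$$ \lambda^2 - 0.5 \, \lambda - 0.3 = 0
\quad\Rightarrow\quad
\lambda = \frac{0.5 \pm \sqrt{0.25 + 1.2}}{2} \approx 0.852,\; -0.352 $$

Both eigenvalues satisfy $|\lambda| < 1$, so the process is stationary.

. . .

Equivalently, in base R we can find the roots of the characteristic polynomial $\phi_2(L) = 1 - 0.5\,L - 0.3\,L^2$ and check that they lie outside the unit circle,

```{r}
# polyroot takes coefficients in increasing order of power
roots = polyroot(c(1, -0.5, -0.3))
Mod(roots) > 1  # all TRUE => stationary
```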