class: left, top, title-slide

.title[
# Predictive Analytics Unit 14: Advanced Forecasting
]
.author[
### Ken Arnold
Calvin University
]

---

## Data: US retail employment

```r
library(fpp3)  # attaches tsibble, fable, feasts; provides us_employment

us_retail_employment <- us_employment %>%
  filter(year(Month) >= 1990, Title == "Retail Trade") %>%
  select(-Series_ID)

autoplot(us_retail_employment, Employed) +
  labs(y = "Persons (thousands)", title = "Total employment in US retail")
```

<img src="slides14forecast2_files/figure-html/us-retail-timeplot-1.png" width="90%" style="display: block; margin: auto;" />

---

## Time Series Decomposition

```r
# STL splits the series into trend + seasonal + remainder;
# robust = TRUE limits the influence of outliers on the fit.
employment_components <- us_retail_employment %>%
  model(
    STL(Employed ~ trend() + season(), robust = TRUE)
  ) %>%
  components()

employment_components %>% autoplot()
```

<img src="slides14forecast2_files/figure-html/decomp-1.png" width="90%" style="display: block; margin: auto;" />

---

## "Seasonally Adjusted" data

```r
# Seasonally adjusted = original series minus the seasonal component.
employment_components %>%
  as_tsibble() %>%
  autoplot(Employed, color = "grey") +
*  geom_line(aes(y = Employed - season_year), color = "blue")
```

<img src="slides14forecast2_files/figure-html/seasonally-adjusted-1.png" width="90%" style="display: block; margin: auto;" />

---

## Forecasting using the decomposition

```r
us_retail_employment %>%
  model(stl = decomposition_model(
    STL(Employed ~ trend() + season(), robust = TRUE),
    NAIVE(season_adjust),  # forecasts the seasonally adjusted part
    SNAIVE(season_year)    # repeats the last year's seasonal pattern
  )) %>%
  forecast(h = 24) %>%
  autoplot(us_retail_employment)
```

<img src="slides14forecast2_files/figure-html/decomp-model-1.png" width="90%" style="display: block; margin: auto;" />

---

## More complex temporal behavior

```r
# With no orders specified, ARIMA() selects them automatically.
models <- us_retail_employment %>%
  model(
    arima = ARIMA(Employed))

models %>%
  forecast(h = 24) %>%
  autoplot(us_retail_employment)
```

<img src="slides14forecast2_files/figure-html/unnamed-chunk-1-1.png" width="90%" style="display: block; margin: auto;" />

---

## Model Report

```r
models %>% select(arima) %>% report()
```

```
Series: Employed
Model: ARIMA(3,0,1)(1,1,1)[12] w/ drift

Coefficients:
         ar1      ar2      ar3      ma1    sar1     sma1
      1.8887  -0.8486  -0.0445  -0.7833  0.1442  -0.6319
s.e.  0.0775   0.1377   0.0631   0.0569  0.1010   0.0826
      constant
        0.3277
s.e.    0.1630

sigma^2 estimated as 1380:  log likelihood=-1736.47
AIC=3488.93   AICc=3489.36   BIC=3519.68
```

---

## Dynamic Models

- **Time Series** model: past observations predict today's observation
- **Explanatory** model: attributes of today predict today's observation
  - temperature, holidays, etc.
  - can include temporal features: day of week, hour, etc.
- **Dynamic** model: includes both

Examples in book, on homework

---

## Tricks for getting better accuracy

- Combine multiple model types (`combination_model`)
- Combine multiple training sets (bagging)
  - Autocorrelation makes bootstrapping tricky but possible
- Use good supervised learners (e.g., xgboost, neural nets)

---

## Other methods

- Exponential Smoothing (ETS)
- Prophet
- Neural net methods (LSTM)
- Multivariate Forecasting
- Time series databases <!-- tspDB -->
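
---

## Sketch: a dynamic model

A minimal sketch of the "dynamic model" idea from the Dynamic Models slide, not taken from these slides. It assumes the `us_change` data set that ships with `fpp3` and treats `Income` as the explanatory attribute. Holding future income at its historical mean is a placeholder scenario chosen only for illustration.

```r
library(fpp3)  # assumption: fpp3 (with its bundled us_change data) is installed

# Regression with ARIMA errors: this quarter's Income explains this quarter's
# Consumption, and the ARIMA error term carries information from the past.
fit <- us_change %>%
  model(dynamic = ARIMA(Consumption ~ Income))

report(fit)

# Forecasting needs future values of the predictor.
# Illustrative assumption: Income stays at its historical mean.
future <- new_data(us_change, 8) %>%
  mutate(Income = mean(us_change$Income))

fc <- fit %>% forecast(new_data = future)

fc %>% autoplot(us_change)
```

The forecast is only as good as the assumed future predictor values, so it is worth trying more than one scenario.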
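
---

## Sketch: combining model types

A minimal sketch of the `combination_model` trick from the accuracy slide, using ETS and ARIMA as the two model types. The 24-month holdout and the names `train`, `fits`, and `acc` are assumptions made for illustration. Whether the combination beats its parts depends on the data.

```r
library(fpp3)  # assumption: fpp3 is installed

us_retail_employment <- us_employment %>%
  filter(year(Month) >= 1990, Title == "Retail Trade") %>%
  select(-Series_ID)

# Hold out the last 24 months to check accuracy (illustrative choice).
train <- us_retail_employment %>% slice(1:(n() - 24))

fits <- train %>%
  model(
    ets   = ETS(Employed),
    arima = ARIMA(Employed),
    # By default, combination_model() averages its members with equal weights.
    combo = combination_model(ETS(Employed), ARIMA(Employed))
  )

fc <- fits %>% forecast(h = 24)

# Compare forecast errors on the held-out months.
acc <- fc %>%
  accuracy(us_retail_employment) %>%
  select(.model, RMSE, MAE)
acc
```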