Scaled Forecasting with Python and R: With Forecasting for Several Different Types of Models and Time Series

  • Michael Keith

In this video, you will explore forecasting techniques in Python, including how to use machine learning models from scikit-learn, as well as integrating R as a subprocess to gain access to the robust forecast library, incorporating the auto.arima and tbats models. We create our own Python class called Forecaster that stores all of the relevant information about the predictions, including error metrics, model forms, hyperparameter selections, and the forecasts themselves. The module is written in such a way that new models can easily be added to the framework to improve accuracy.
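As a rough illustration of the kind of object described above, here is a minimal sketch and not the class built in the video. The attribute names, the method names, and the moving-average stand-in model are all assumptions made for this example; the point is only that one object can hold the forecasts, error metrics, and hyperparameters for every model tried on a series.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class Forecaster:
    """Hypothetical container for one series and every model tried on it."""
    name: str
    y: list                                           # observed values, oldest first
    forecasts: dict = field(default_factory=dict)     # model name -> future predictions
    metrics: dict = field(default_factory=dict)       # model name -> MAPE on the test set
    hyperparams: dict = field(default_factory=dict)   # model name -> settings used

    def forecast_moving_average(self, horizon, test_length, window=3):
        """Stand-in model (assumed): predict the mean of the last `window` values."""
        train, test = self.y[:-test_length], self.y[-test_length:]
        test_pred = [mean(train[-window:])] * test_length
        # Mean absolute percentage error on the held-out observations.
        self.metrics["moving_average"] = mean(
            abs(p - a) / abs(a) for p, a in zip(test_pred, test)
        )
        self.hyperparams["moving_average"] = {"window": window}
        # Refit on the full series to produce the actual forecast.
        self.forecasts["moving_average"] = [mean(self.y[-window:])] * horizon

    def best_model(self):
        """Return the name of the stored model with the lowest error."""
        return min(self.metrics, key=self.metrics.get)


f = Forecaster("example_series", [3.0, 3.1, 2.9, 3.0, 3.2, 3.1])
f.forecast_moving_average(horizon=2, test_length=2)
```

Adding another model under this design would mean adding one more method that writes to the same three dictionaries under its own key.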

In this specific scenario, we forecast 150 separate time series (3 for each state) using economic indicators from the St. Louis Federal Reserve (FRED) website. We build an API to extract the data and add our own indicators that we think will influence the future. At the time this video is published, the country is experiencing a recession caused by the COVID-19 pandemic. Therefore, we extract the recession indicator from the FRED website and make assumptions about how long the recession will last. Using this information, we create an economic outlook for each state. These forecasts can then be used to see which states we expect to recover most quickly from the recession.
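One way to picture "making assumptions about how long the recession will last" is to extend a 0/1 recession indicator into the forecast period under a chosen scenario. The sketch below is only an illustration: the function name, the monthly frequency, and the sample values are assumptions, and it does not download anything from FRED.

```python
def extend_recession_indicator(history, recession_months_left, horizon):
    """Append assumed future values to a 0/1 monthly recession indicator.

    history: past values, 1 = recession month, 0 = non-recession month.
    recession_months_left: assumed number of future months still in recession.
    horizon: total number of future months to generate.
    """
    if recession_months_left < 0 or horizon < 0:
        raise ValueError("month counts must be non-negative")
    in_recession = min(recession_months_left, horizon)
    future = [1] * in_recession + [0] * (horizon - in_recession)
    return list(history) + future


# Made-up history and three assumed scenarios for the recession's length.
history = [0, 0, 1, 1]
scenarios = {
    months: extend_recession_indicator(history, months, horizon=6)
    for months in (2, 4, 6)
}
```

Each scenario can then be supplied as an external regressor, so the same model produces a different outlook depending on the assumed recession length.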

This modeling process will be done in Python in a Jupyter Notebook, so it’s a good idea to have Anaconda installed on your computer so you can follow along. The rpy2 library for Python will be used, so R must be installed in your local environment.

What You Will Learn

  • Create classes in Python

  • Make predictions with machine learning and common forecasting models

  • Manipulate data with Pandas

  • Store data in base Python structures (lists, dictionaries)

  • Create APIs and visual aids

Who This Video Is For

Data scientists with experience who are looking to take their forecasting skills to the next level.

About The Author

Michael Keith

Michael Keith works for the Utah Department of Health as a data scientist, where much of his role involves predicting the economic outlook of the state in order to forecast Medicaid enrollment. He previously used this general approach to forecasting as an omni-channel analyst for Disney Parks and Resorts, where he won an award from upper management for delivering forecasts that were timely, reproducible, and accurate. He also consults part-time for an online university and is in the process of writing labs and course materials that will be used in the Master of Science in Data Analytics program at Western Governors University. He was previously published by Apress for his video entitled Machine Learning with Regression in Python.

Supporting material

View source code at GitHub.

About this video

Author(s)
Michael Keith
DOI
https://doi.org/10.1007/978-1-4842-6893-3
Online ISBN
978-1-4842-6893-3
Total duration
1 hr 20 min
Publisher
Apress
Copyright information
© Michael Keith 2021

Video Transcript

[MUSIC PLAYING]

Hello. I’m Michael Keith, and this is the tutorial titled Scaled Forecasting with Python and R. This is an advanced course focused on implementing forecasting techniques using object-oriented programming in Python. Specifically, we will build a Forecaster object in Python capable of storing forecast information, including different model types, actual forecasted figures, and accuracy metrics for 150 total economic indicators, three indicators for each state in the USA.

When you only have to worry about one or a couple of time series to forecast, you might spend some time examining and deconstructing the statistical properties of those series. By contrast, the approach I will be demonstrating here uses many models to forecast each series, regardless of whether any particular model is really best for any particular series.

We use both autoregressive time series methods from R libraries and machine learning models from scikit-learn with external regressors, and we examine the sample accuracy of each model and visualize the results to decide which model to use for each series. You might think of this approach as throwing everything at the wall and seeing what sticks, and there are certainly valid reasons to criticize such an approach. But I have found there’s no better way to automatically make predictions on many time series.
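The "try every model on every series and keep the most accurate" idea can be sketched in a few lines. This is a simplified illustration under stated assumptions: three toy models written in plain Python stand in for the R and scikit-learn models, accuracy is measured as MAPE on a held-out tail of each series, and all function names and sample series are made up for the example.

```python
from statistics import mean


def naive(train, h):
    # Repeat the last observed value.
    return [train[-1]] * h


def moving_average(train, h, window=3):
    # Repeat the mean of the last `window` observations.
    return [mean(train[-window:])] * h


def linear_trend(train, h):
    # Ordinary least squares fit of y on the time index 0..n-1, then extrapolate.
    n = len(train)
    x_bar, y_bar = (n - 1) / 2, mean(train)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in enumerate(train)) / sum(
        (x - x_bar) ** 2 for x in range(n)
    )
    return [y_bar + slope * (n + step - x_bar) for step in range(h)]


MODELS = {"naive": naive, "moving_average": moving_average, "linear_trend": linear_trend}


def mape(actual, predicted):
    # Mean absolute percentage error; assumes no actual value is zero.
    return mean(abs(a - p) / abs(a) for a, p in zip(actual, predicted))


def pick_best(series, test_length):
    """Score every model on the held-out tail and return (best_name, scores)."""
    train, test = series[:-test_length], series[-test_length:]
    scores = {name: mape(test, model(train, test_length)) for name, model in MODELS.items()}
    return min(scores, key=scores.get), scores


# Made-up series; in practice there would be one entry per series to forecast.
all_series = {
    "trending": [10, 12, 14, 16, 18, 20, 22, 24],
    "level_shift": [5, 5, 5, 5, 9, 9, 9, 9],
}
best = {name: pick_best(values, test_length=2)[0] for name, values in all_series.items()}
```

Here the steadily rising series selects the trend model and the series with a level shift selects the naive model, which is the per-series selection the approach relies on. The loop does not care how many series or models there are, which is what lets it scale.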

In short, if you need to scale your forecasting approach to hundreds or even thousands of series, this is the way to do it. Again, this is an advanced tutorial, so I’m assuming you already have pretty decent R and Python programming skills and you already know terms such as autoregression, stationarity, serial correlation, integration, and seasonality.

If you don’t feel like you are at that level, there are several Apress tutorial videos you can review before returning to this one, including my own on machine learning with regression in Python.

Here are some of the concepts that will be covered in this course: creating classes in Python; creating an API to connect to live data feeds; storing data in primitive Python data structures, such as lists and dictionaries, and the advantages of this approach; forecasting with machine learning and other common forecasting models, such as ARIMA; running R as a subprocess in Python; evaluating models; and visualizing results.

To get started, there is a link to the GitHub repository with the code. There is an init branch and a master branch. The init branch has blank coding files that allow you to fill in much of the code if you want to practice on your own. The master branch contains the full application. The hardest part of this approach, in my opinion, is installing the rpy2 library in Python. It requires having R installed and on your path if you are on Windows. Sometimes it can give you errors. And if you have errors with any of the installation steps, Google is your friend.
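Before troubleshooting an rpy2 installation, it can help to confirm the basics. The following is a rough pre-flight check using only the Python standard library; the function name is an assumption, and passing this check does not guarantee that rpy2 will load R correctly, only that the pieces appear to be present.

```python
import importlib.util
import shutil


def check_r_setup():
    """Report whether R and rpy2 appear to be available (a rough check only)."""
    return {
        # shutil.which searches the directories on PATH for the executable.
        "R_on_path": shutil.which("R") is not None,
        "Rscript_on_path": shutil.which("Rscript") is not None,
        # find_spec returns None when the package cannot be located.
        "rpy2_installed": importlib.util.find_spec("rpy2") is not None,
    }


status = check_r_setup()
for item, ok in status.items():
    print(f"{item}: {'found' if ok else 'missing'}")
```

If R is reported as missing on Windows, adding the R installation's bin folder to the PATH environment variable is usually the first thing to try.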