To learn what an RNN (Recurrent Neural Network) is, why not understand ARIMA and SARIMA first? – RNN Learning – Part 5 – day 59

A Deep Dive into ARIMA, SARIMA, and Their Relationship with Deep Learning for Time Series Forecasting

In recent years, deep learning has become a dominant force in many areas of data analysis, and time series forecasting is no exception. Traditional models like ARIMA (Autoregressive Integrated Moving Average) and its seasonal extension SARIMA have long been the go-to solutions for forecasting time-dependent data. However, newer models based on Recurrent Neural Networks (RNNs), particularly Long Short-Term Memory (LSTM) networks, have emerged as powerful alternatives. Both approaches have their strengths and applications, and understanding their relationship helps in choosing the right tool for the right problem. In this blog post, we'll explore ARIMA and SARIMA models in detail, discuss how they compare to deep learning-based models like RNNs, and demonstrate their practical implementation.

Deep Learning and Time Series Forecasting

Deep learning is a subset of machine learning in which models learn hierarchical features from data using multiple layers of neural networks. For time series forecasting, one of the most common deep learning architectures is the Recurrent Neural Network (RNN). RNNs are particularly well suited to time series because they are designed to handle sequential data, where the output at each time step depends not only on the current input but also on the previous inputs. This is achieved by maintaining a hidden state that gets updated at each time step, allowing the model to "remember" past information.

Here are the key components of RNNs and their relevance to time series forecasting (a minimal code sketch follows at the end of this section):

- Sequential Memory: RNNs are built to retain information across time steps. This makes them suitable for forecasting problems where patterns are spread across time, such as stock prices or weather data.
- Backpropagation Through Time (BPTT): Unlike traditional feedforward neural networks, RNNs are trained using a variant of backpropagation known as BPTT, where the network adjusts its weights by considering errors over multiple time steps.
- Long Short-Term Memory (LSTM): A variant of RNNs, LSTMs are particularly useful in long-term forecasting because they are designed to overcome the vanishing gradient problem, allowing them to capture long-term dependencies in data.

While ARIMA and SARIMA focus on modeling the linear relationships in time series data, RNNs and LSTMs can capture complex non-linear dependencies. This makes RNNs more flexible, but they also require larger datasets and more computational power to train effectively.

How RNNs Relate to ARIMA and SARIMA Models

Although RNNs and ARIMA/SARIMA models operate differently, they share common ground in the context of time series forecasting:

- Time Dependence: Both are designed to forecast time-dependent data, meaning they consider historical information to predict future values.
- Lagged Features: ARIMA uses lagged features (i.e., past values) directly, while RNNs learn patterns through sequential memory.
- Complexity vs. Simplicity: ARIMA/SARIMA models are simpler and more interpretable but may struggle with complex, non-linear patterns. RNNs, on the other hand, can model non-linearity but require more data and computational resources.

In this article, we will primarily focus on ARIMA and SARIMA models, their theoretical foundations, and how they are practically applied to time series forecasting. We'll compare their strengths to RNNs and understand when to use which approach.
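To make the hidden-state idea concrete before moving on, here is a minimal sketch of an LSTM forecaster in Keras. It is an illustration, not this post's reference implementation: the noisy sine wave standing in for real data, the 30-step window, and the layer sizes are all assumed choices.

import numpy as np
import tensorflow as tf

# Assumption: a univariate series; a noisy sine wave stands in for real data here.
series = np.sin(np.arange(1000) * 0.1) + np.random.randn(1000) * 0.1

window = 30  # number of past time steps fed to the network (illustrative choice)

# Build (input window, next value) training pairs from the series.
X = np.array([series[i : i + window] for i in range(len(series) - window)])
y = series[window:]
X = X[..., np.newaxis]  # (samples, time steps, 1 feature), the layout RNN layers expect

model = tf.keras.Sequential([
    tf.keras.layers.LSTM(32, input_shape=[window, 1]),  # hidden state carries past information
    tf.keras.layers.Dense(1),                           # predict the next value
])
model.compile(loss="mae", optimizer="adam")
model.fit(X, y, epochs=5, verbose=0)

# One-step-ahead forecast from the most recent window.
print(model.predict(X[-1:], verbose=0))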
Understanding ARIMA and SARIMA Models

Time Series Fundamentals

At the heart of time series forecasting is the ability to recognize and model patterns in data that evolve over time. This includes:

- Trend: A long-term upward or downward movement in the data.
- Seasonality: Cyclical patterns that repeat at regular intervals, such as daily, weekly, or yearly fluctuations.
- Stationarity: A stationary time series has a constant mean and variance over time. Many forecasting models, including ARIMA, require the data to be stationary to perform well.
- Autocorrelation: The correlation between a time series and its lagged values. ARIMA models rely heavily on autocorrelation to predict future values.

ARIMA: Autoregressive Integrated Moving Average

The ARIMA model is a well-established statistical approach to time series forecasting. It combines three components:

- Autoregressive (AR): The model regresses the target variable against its own previous values.
- Integrated (I): This step involves differencing the data to remove trends and make the series stationary.
- Moving Average (MA): The model includes a moving average component to account for the errors of past predictions.

The general ARIMA model is expressed as ARIMA(p, d, q), where:

- p: The number of autoregressive terms.
- d: The degree of differencing.
- q: The number of lagged forecast errors used in the prediction.

SARIMA: Seasonal ARIMA

While ARIMA works well for non-seasonal data, time series data often contain seasonal patterns. SARIMA extends ARIMA by incorporating seasonal components:

- P: Seasonal autoregressive terms.
- D: Seasonal differencing.
- Q: Seasonal moving average terms.
- s: The length of the season (e.g., 7 for weekly seasonality).

A SARIMA model is expressed as ARIMA(p, d, q) x (P, D, Q, s), where both non-seasonal and seasonal components are considered.

Steps in Building ARIMA and SARIMA Models

1. Data Preparation: Ensure the time series data is stationary. If not, apply differencing to make it stationary.
2. Model Identification: Use tools like autocorrelation plots (ACF) and partial autocorrelation plots (PACF) to choose appropriate values for p, d, and q (a short code sketch of this step follows below).
3. Model Fitting: Train the ARIMA/SARIMA model on historical data.
4. Forecasting: Use the fitted model to predict future data points.
5. Model Evaluation: Measure the accuracy of the forecast using metrics like Mean Absolute Error (MAE).

Deep Learning vs Traditional Models

When deciding between RNNs and ARIMA/SARIMA models, it's important to consider the complexity and nature of the data:

- ARIMA/SARIMA: Best suited for small to medium-sized datasets with linear patterns and clear seasonality. They require minimal data preprocessing but struggle with non-linearity.
- RNN/LSTM: Better suited for large datasets with complex, non-linear patterns. They excel at capturing long-term dependencies but need more data and computation to be effective. This is particularly useful for multi-step forecasts.
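As a sketch of the identification step, the snippet below runs the Augmented Dickey-Fuller stationarity test and draws ACF/PACF plots with statsmodels. It assumes the pandas Series rail_series used in the next section; the p-value threshold is a common heuristic, not a hard rule.

import matplotlib.pyplot as plt
from statsmodels.tsa.stattools import adfuller
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

# Augmented Dickey-Fuller test: a low p-value suggests the series is stationary.
adf_stat, p_value = adfuller(rail_series.dropna())[:2]
print(f"ADF statistic: {adf_stat:.3f}, p-value: {p_value:.3f}")
# Heuristic: if the p-value is above 0.05, difference the series once (d = 1) and retest.

diffed = rail_series.diff().dropna()

# The PACF suggests the AR order p; the ACF suggests the MA order q.
fig, axes = plt.subplots(2, 1, figsize=(8, 6))
plot_acf(diffed, lags=30, ax=axes[0])
plot_pacf(diffed, lags=30, ax=axes[1])
plt.show()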
Code Implementation of ARIMA and SARIMA with RNN Comparison

In this section, we implement both models and compare ARIMA and SARIMA with Recurrent Neural Networks (RNNs) for time series forecasting.

1. ARIMA Basic Forecast Code

The ARIMA model is used to forecast rail ridership for the next day (June 1, 2019), assuming the data ends on May 31, 2019.

from statsmodels.tsa.arima.model import ARIMA
import pandas as pd

# Define the origin and end date for the dataset
# This specifies the time range of interest
origin, today = "2019-01-01", "2019-05-31"

# Assume the time series is stored in a pandas DataFrame named 'df'
# Ensure that "rail" is a column in the DataFrame
# Select the data within the specified date range and ensure daily frequency
rail_series = df.loc[origin:today, "rail"].asfreq("D")

# Build the ARIMA model
# The 'order' parameter specifies (p, d, q):
#   - p: the number of lag observations included in the model (autoregression)
#   - d: the number of times the data is differenced to make it stationary
#   - q: the size of the moving average window
# The 'seasonal_order' parameter specifies seasonal effects, with (P, D, Q, s):
#   - P: seasonal autoregressive terms
#   - D: seasonal differencing
#   - Q: seasonal moving average terms
#   - s: seasonal periodicity (e.g., 7 for weekly seasonality)
model = ARIMA(rail_series,
              order=(1, 0, 0),
              seasonal_order=(0, 1, 1, 7))

# Fit the ARIMA model to the rail ridership data
# This step estimates the model parameters
model = model.fit()

# Forecast the rail ridership for a single future time point (June 1, 2019)
# The 'forecast' method predicts the next value based on the fitted model
y_pred = model.forecast()  # returns the predicted value
print(y_pred)

Explanation:

- ARIMA Setup: order=(1, 0, 0) sets up the model with one autoregressive term (p=1), no differencing (d=0), and no moving average term (q=0).
- Seasonal Component: seasonal_order=(0, 1, 1, 7) adds seasonal differencing (D=1), a seasonal moving average term (Q=1), and a seasonal period of 7 days (weekly seasonality).
- Forecast: After fitting the model, the predicted ridership for June 1, 2019, is 427,758.6 passengers.
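The snippet above assumes a DataFrame df holding daily rail ridership, which this post never constructs. If you just want the code to run end to end, a synthetic stand-in with the same shape can be built as follows; the level, seasonality, and noise values are invented purely for illustration.

import numpy as np
import pandas as pd

# Synthetic stand-in for the daily rail ridership DataFrame assumed above.
idx = pd.date_range("2019-01-01", "2019-05-31", freq="D")
rng = np.random.default_rng(42)
weekly = 50_000 * np.sin(2 * np.pi * np.arange(len(idx)) / 7)  # weekly cycle
noise = rng.normal(0, 10_000, len(idx))
df = pd.DataFrame({"rail": 400_000 + weekly + noise}, index=idx)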
2. SARIMA with Daily Retraining and MAE Calculation

In this code, we extend the SARIMA model by retraining it daily for each day from March 1 to May 31, 2019. The forecasts are then compared to the actual values, and the Mean Absolute Error (MAE) is calculated to evaluate performance.

import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

# Define the time period and data range
# 'origin': start of the data used for training
# 'start_date' and 'end_date': range for testing and evaluation
origin, start_date, end_date = "2019-01-01", "2019-03-01", "2019-05-31"

# Generate a date range for the testing period
time_period = pd.date_range(start_date, end_date)

# Select the rail data from the dataset, ensuring daily frequency
rail_series = df.loc[origin:end_date]["rail"].asfreq("D")

# Initialize an empty list to store predictions
y_preds = []

# Loop through each day in the testing period:
# retrain the model each day and forecast the next value
for today in time_period.shift(-1):
    # Build the model with parameters:
    # (1, 0, 0): one autoregressive term, no differencing, no moving average
    # (0, 1, 1, 7): weekly seasonality
    model = ARIMA(rail_series[origin:today],
                  order=(1, 0, 0),
                  seasonal_order=(0, 1, 1, 7))
    # Fit the model to the training data
    model = model.fit()
    # Forecast the next day's value and append it to the predictions list
    y_pred = model.forecast()[0]
    y_preds.append(y_pred)

# Convert the predictions into a pandas Series with corresponding dates
y_preds = pd.Series(y_preds, index=time_period)

# Calculate the Mean Absolute Error (MAE)
# MAE measures the average magnitude of forecast errors
mae = (y_preds - rail_series[time_period]).abs().mean()

# Output the calculated MAE
print(f"Mean Absolute Error (MAE): {mae}")  # example output: 32,040.7

Explanation:

- Daily Retraining: The SARIMA model is retrained every day on data up to the current day, allowing it to adapt to recent trends.
- Time Period: Forecasts are made for each day between March 1 and May 31, 2019, and stored in the list y_preds.
- Evaluation (MAE): The Mean Absolute Error measures the average error in the predictions. Here, the model produces an MAE of 32,040.7, the average ridership error over the test period.

Comparison with Recurrent Neural Networks (RNNs)

Now that we have implemented ARIMA and SARIMA models, let's explore how they compare with Recurrent Neural Networks (RNNs) for time series forecasting.

Strengths of ARIMA and SARIMA:

- Simplicity: ARIMA and SARIMA models are relatively straightforward to implement and interpret, particularly for linear, seasonal data.
- Data Requirements: These models perform well on small to medium-sized datasets without requiring extensive computational resources.
- Seasonality: SARIMA can handle seasonal patterns explicitly, which is useful for datasets with known seasonality (e.g., weekly or monthly patterns).

Limitations of ARIMA and SARIMA:

- Linear Assumptions: Both models assume linear relationships in the data and may struggle with complex, non-linear patterns.
- Long-term Dependencies: They work well for short-term forecasts but may not capture long-term dependencies as effectively.

Why Use RNNs for Time Series Forecasting?

Recurrent Neural Networks (RNNs) are designed to handle sequential data like time series, where future values depend on previous values. Unlike ARIMA and SARIMA, RNNs can model both linear and non-linear relationships, making them powerful for complex time series forecasting.

Strengths of RNNs:

- Sequential Memory: RNNs have a hidden state that retains information from previous time steps, allowing the model to "remember" past values and make better forecasts for long sequences.
- Non-linearity: RNNs can model non-linear patterns in the data, which is critical for complex time series with intricate patterns.
- Handling Long-term Dependencies: With variants like Long Short-Term Memory (LSTM) and Gated Recurrent Units (GRU), RNNs can capture long-term dependencies that are difficult for ARIMA/SARIMA to handle.

Limitations of RNNs:

- Data Requirement: RNNs typically require larger datasets to train effectively compared to ARIMA/SARIMA models.
- Complexity: RNNs are computationally intensive, requiring more resources for training and tuning.
- Interpretability: Unlike ARIMA/SARIMA models, RNNs behave as black-box models; it is harder to interpret the relationships they learn.

When to Use ARIMA/SARIMA vs. RNNs?

- ARIMA/SARIMA: Better suited for small datasets with linear relationships and seasonal patterns. They are easier to interpret and require fewer computational resources.
- RNNs (LSTM/GRU): If your time series data is large, has non-linear relationships, or involves long-term dependencies, RNNs or their variants (LSTMs, GRUs) may provide better accuracy.

Recap: ARIMA with a Small Worked Example

Recap again: what is ARIMA? ARIMA (AutoRegressive Integrated Moving Average) is a time series forecasting model with three components:

- Autoregressive (AR): Predicts future values based on past values.
- Integrated (I): Applies differencing to make the series stationary.
- Moving Average (MA): Uses past forecast errors to improve predictions.

ARIMA is typically represented as ARIMA(p, d, q), where p is the number of autoregressive terms, d is the degree of differencing, and q is the number of moving average terms.

Example: ARIMA(1,1,1) Step by Step

Given this data:

Time (t)   Value (y)
t = 1      50
t = 2      55
t = 3      54
t = 4      57

We aim to predict y_5 using ARIMA(1,1,1).

Step 1: Differencing (I = 1). First, apply first differencing to remove trends: Δy_t = y_t − y_{t−1}. For our data: Δy_2 = 55 − 50 = 5, Δy_3 = 54 − 55 = −1, Δy_4 = 57 − 54 = 3. The differenced series is (5, −1, 3).

Step 2: Autoregression (AR = 1). In AR(1), we predict the next differenced value from Δy_4:

Δŷ_5 = c + φ · Δy_4

where c is a constant and φ is the autoregressive coefficient. In practice, φ is estimated from the lag-1 autocorrelation of the differenced series: find the mean of the differenced series ((5 − 1 + 3) / 3 ≈ 2.33), then divide the lag-1 covariance of the series by its variance. For this toy example, we simply assume c = 1 and φ ≈ 0.7. Using Δy_4 = 3:

Δŷ_5 = 1 + 0.7 × 3 = 3.1

The predicted differenced value is 3.1.

Step 3: Moving Average (MA = 1). The MA(1) component adjusts the prediction using the previous forecast error ε_4:

Δŷ_5 (adjusted) = Δŷ_5 + θ · ε_4

Assuming θ = 0.3 (moving average coefficient) and ε_4 = −1 (previous error), the adjusted prediction is:

Δŷ_5 (adjusted) = 3.1 + 0.3 × (−1) = 2.8

The adjusted prediction is 2.8.

Step 4: Reverse Differencing to Get the Final Prediction. Finally, reverse the differencing to bring the prediction back to the original scale:

ŷ_5 = y_4 + Δŷ_5 (adjusted) = 57 + 2.8 = 59.8

Final Prediction: the predicted value for y_5 is 59.8.

Key Notes:

- Differencing removes trends in the data.
- Autoregression (AR) predicts the next value from the previous differenced value.
- Moving Average (MA) adjusts the prediction using past forecast errors.
- Reversing the differencing brings the prediction back to the original scale.
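The whole recap fits in a few lines of NumPy; the sketch below simply re-traces the arithmetic. The values of c, φ, θ, and the previous error are the assumed example values from above, not coefficients fitted to data.

import numpy as np

# Toy series from the worked example.
y = np.array([50.0, 55.0, 54.0, 57.0])

# Step 1: first differencing (d = 1).
dy = np.diff(y)  # -> [ 5., -1.,  3.]

# Step 2: AR(1) on the differenced series, with assumed c = 1 and phi = 0.7.
c, phi = 1.0, 0.7
dy_pred = c + phi * dy[-1]  # 1 + 0.7 * 3 = 3.1

# Step 3: MA(1) adjustment, with assumed theta = 0.3 and previous error -1.
theta, prev_error = 0.3, -1.0
dy_adj = dy_pred + theta * prev_error  # 3.1 - 0.3 = 2.8

# Step 4: reverse the differencing to return to the original scale.
y5_pred = y[-1] + dy_adj  # 57 + 2.8 = 59.8
print(y5_pred)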
Overview

Objective: Now let's use a Recurrent Neural Network (RNN) to predict y_5 from exactly the same values as the previous example (50, 55, 54, and 57), and demonstrate how the model improves over multiple training iterations.

Given time series data: 50, 55, 54, 57.

Step 1: Data Preparation

1.1 Organize Data into Sequences. We create input-output pairs for training: training input sequence (50, 55, 54), training target output 57, and prediction input sequence (55, 54, 57) for forecasting y_5.

1.2 Reshape Data. Reshape for RNN input into the (samples, time steps, features) layout, here (1, 3, 1).

Step 2: Define the RNN Model

2.1 Model Architecture: input size 1, hidden units 1, output size 1.
2.2 Initialize Weights and Biases: small starting values for the input weight W_x, recurrent weight W_h, output weight W_y, and the biases b_h and b_y (the sketch below assumes W_x = 0.5, W_h = 0.1, W_y = 1.0, b_h = b_y = 0).
2.3 Activation Function: use tanh for the hidden state activation.
2.4 Initial Hidden State: h_0 = 0.

Step 3: Training Iterations

We will perform three training iterations to observe how the model improves.

Iteration 1

3.1 Forward Propagation

Time Step t = 1: take the input x_1, compute the hidden state h_1 = tanh(W_x · x_1 + W_h · h_0 + b_h), then compute the output ŷ_1 = W_y · h_1 + b_y. Time Step…
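To make the forward pass concrete, here is a minimal NumPy sketch of this one-unit RNN on the training sequence. The weights are the assumed values listed in step 2.2, and the inputs are scaled down by 100 so the tanh activation does not saturate (an illustrative choice); training would continue by computing the error and updating the weights with backpropagation through time (BPTT).

import numpy as np

# Training input sequence (50, 55, 54) and target 57, scaled to a tanh-friendly range.
x_seq = np.array([50.0, 55.0, 54.0]) / 100.0
target = 57.0 / 100.0

# Assumed initial parameters (illustrative, not fitted).
W_x, W_h, W_y = 0.5, 0.1, 1.0  # input, recurrent, and output weights
b_h, b_y = 0.0, 0.0            # hidden and output biases

h = 0.0  # initial hidden state h_0
for t, x in enumerate(x_seq, start=1):
    h = np.tanh(W_x * x + W_h * h + b_h)  # hidden state carries past information forward
    print(f"t = {t}: x = {x:.2f}, h = {h:.4f}")

y_hat = W_y * h + b_y  # output after the final time step
print(f"prediction: {y_hat * 100:.2f} (target 57.00), error: {(y_hat - target) * 100:.2f}")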

