Predictions in Energy Economics - Which Methods Are Suitable?

This article once again focuses on predictions and forecasts in the energy industry. In the previous article, we discussed metrics that can be used for both regression and classification tasks. Here, we look specifically at the prediction of time series, which are common in the energy sector. Generally, a time series is understood as a sequence of real numbers x_t₁, …, x_tₙ recorded at times t₁, …, tₙ. Examples in the energy industry include energy prices, the production or consumption of energy, whether an electric vehicle is plugged in at a certain time, or the water level of a pumped storage power plant. To provide guidance in selecting suitable methods for predicting a specific time series, we have created a decision tree at the end of the article. To use it, we recommend examining the time series closely and becoming familiar with its characteristics.

Parametric / Non-Parametric

Before making assumptions about a time series, it is worth considering whether one wants to make any assumptions at all. Parametric models make assumptions about the time series, such as its underlying distribution or functional form, which can be described by a fixed number of parameters. This makes parametric models easy to interpret, and they generally require less computational effort. However, a parametric model can only make accurate predictions if its assumptions actually hold, which is why it may be easier or preferable to make no assumptions about the time series. In that case, non-parametric models are used, which do not depend on such assumptions being correct but are computationally more intensive and less interpretable. When choosing a non-parametric model, the focus is on the characteristics of the model, while for parametric models the focus is on the characteristics of the time series. We discuss the characteristics of the time series first.
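The contrast can be illustrated with a minimal sketch on synthetic data. The parametric side assumes an AR(1) form and estimates its single coefficient. The non-parametric side is a simple nearest-neighbour forecast that assumes no functional form. The function names, the simulated series, and the choice of these two models are illustrative assumptions and are not taken from the article.

```python
import random


def simulate_ar1(phi, n, seed=0):
    """Simulate x_t = phi * x_{t-1} + noise with standard normal noise."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0.0, 1.0))
    return x


def fit_ar1(x):
    """Parametric: least-squares estimate of the single AR(1) coefficient."""
    num = sum(prev * cur for prev, cur in zip(x[:-1], x[1:]))
    den = sum(prev * prev for prev in x[:-1])
    return num / den


def knn_forecast(x, k=5):
    """Non-parametric: average the successors of the k past values
    that are closest to the most recent value."""
    last = x[-1]
    # Only points with a known successor can serve as neighbours.
    neighbours = sorted(range(len(x) - 1), key=lambda i: abs(x[i] - last))[:k]
    return sum(x[i + 1] for i in neighbours) / k


series = simulate_ar1(phi=0.7, n=2000)

phi_hat = fit_ar1(series)                    # one interpretable parameter
parametric_forecast = phi_hat * series[-1]   # one multiplication per forecast
nonparametric_forecast = knn_forecast(series)  # scans the whole history

print(f"estimated phi:           {phi_hat:.3f}")
print(f"parametric forecast:     {parametric_forecast:.3f}")
print(f"non-parametric forecast: {nonparametric_forecast:.3f}")
```

The parametric forecast reduces the whole series to one number that can be read directly, while the nearest-neighbour forecast must search the full history for every prediction, which mirrors the trade-off between interpretability and computational effort.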

Linear / Non-Linear

The dependencies within a time series are an important characteristic for predicting its future behavior. A time series is called linear if there is a linear relationship between each point and the previous points. A time series is considered non-linear if the relationship takes any other form. It can be challenging to determine whether a time series is linear or non-linear. Useful diagnostics include the autocorrelation function (ACF) and the partial autocorrelation function (PACF), which typically decay rapidly for linear time series (Figure 1). Another option is the Box-Pierce test, which tests the residuals of a fitted linear model for autocorrelation. If significant autocorrelations remain, the linear model has not captured all dependencies, which may indicate that the time series is non-linear.

Figure 1: Comparison of the ACF between an AR(1) process (linear) and a sinusoidal wave with period 100 and random disturbances (non-linear)
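The sample ACF and the Box-Pierce statistic Q = n · Σ r_k² (summed over lags k = 1, …, h) can be computed in a few lines. The sketch below is a plain-Python illustration on synthetic data, not a replacement for a statistics package. The series, seeds, and function names are assumptions made for this example. Under the null hypothesis of no autocorrelation, Q approximately follows a chi-square distribution with h degrees of freedom, so for h = 10 the 5 % critical value is about 18.31. When the test is applied to the residuals of a fitted model, the degrees of freedom are reduced by the number of fitted parameters.

```python
import random


def acf(x, max_lag):
    """Sample autocorrelation r_0, ..., r_max_lag of the series x."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / c0
        for k in range(max_lag + 1)
    ]


def box_pierce(x, max_lag):
    """Box-Pierce statistic Q = n * sum of squared autocorrelations r_1..r_h."""
    r = acf(x, max_lag)
    return len(x) * sum(v * v for v in r[1:])


rng = random.Random(1)
n = 1000

# White noise: stands in for the residuals of a well-specified model.
noise = [rng.gauss(0.0, 1.0) for _ in range(n)]

# AR(1) series: strongly autocorrelated, so Q should be large.
ar1 = [0.0]
for _ in range(n - 1):
    ar1.append(0.7 * ar1[-1] + rng.gauss(0.0, 1.0))

q_noise = box_pierce(noise, max_lag=10)
q_ar1 = box_pierce(ar1, max_lag=10)

CRITICAL_5_PERCENT = 18.307  # chi-square quantile, 10 degrees of freedom

print(f"Q (white noise): {q_noise:.1f}")
print(f"Q (AR(1)):       {q_ar1:.1f}")
print("AR(1) autocorrelated:", q_ar1 > CRITICAL_5_PERCENT)
```

In practice, the statistic would be computed on the residuals left over after fitting a linear model to the series of interest, rather than on the raw series as in this sketch.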

Uni- / Multivariate

In most cases, a forecast targets a single variable per time step, such as energy production. Such a time series prediction is called univariate. However, it can also be useful to predict multiple interdependent target variables. For example, for electric vehicle charging it is important to know not only when a vehicle is connected or disconnected but also its expected load demand. Such time series predictions are called multivariate.

Exogenous Data

Information beyond the time series itself is often useful, because other variables may be correlated with the target. Therefore, exogenous data is frequently used to improve time series predictions. One example is using predicted wind speeds when forecasting energy production from wind power. (Sometimes, however, exogenous data may not be available, or it may be preferable to leave its influence out of the prediction.)
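One simple way to include an exogenous variable is an autoregressive model with an extra input term, y_t = a · y_{t-1} + b · u_t + noise, fitted by least squares. The sketch below uses synthetic data. The coefficients, the standardized exogenous signal u (which could stand for a wind speed forecast), and the function name fit_arx are assumptions chosen for illustration.

```python
import random

rng = random.Random(42)
n = 2000
true_a, true_b = 0.5, 2.0  # assumed ground truth for the synthetic data

# Exogenous signal, e.g. a standardized wind speed forecast (synthetic).
exog = [rng.gauss(0.0, 1.0) for _ in range(n)]

# Target series driven by its own past and by the exogenous signal.
y = [0.0]
for t in range(1, n):
    y.append(true_a * y[t - 1] + true_b * exog[t] + rng.gauss(0.0, 1.0))


def fit_arx(y, exog):
    """Least-squares fit of y_t = a * y_{t-1} + b * u_t (no intercept),
    solved via the 2x2 normal equations."""
    y_prev, u, y_now = y[:-1], exog[1:], y[1:]
    s_yy = sum(p * p for p in y_prev)
    s_yu = sum(p * v for p, v in zip(y_prev, u))
    s_uu = sum(v * v for v in u)
    r_y = sum(p * c for p, c in zip(y_prev, y_now))
    r_u = sum(v * c for v, c in zip(u, y_now))
    det = s_yy * s_uu - s_yu * s_yu
    a = (r_y * s_uu - r_u * s_yu) / det
    b = (s_yy * r_u - s_yu * r_y) / det
    return a, b


a_hat, b_hat = fit_arx(y, exog)

# One-step forecast, given an assumed forecast of the exogenous variable.
next_exog = 1.2
forecast = a_hat * y[-1] + b_hat * next_exog

print(f"a_hat = {a_hat:.3f}, b_hat = {b_hat:.3f}, forecast = {forecast:.3f}")
```

Note that the forecast needs a value of the exogenous variable for the future time step, which is why forecasts such as predicted wind speeds are used rather than measurements.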

Stationary

The stationarity of a time series describes how its underlying stochastic process evolves. More specifically, a time series is stationary if its probability distribution remains invariant under time translation. In particular, the mean and variance of the underlying distribution do not change over time. One way to test whether a time series is stationary is the Augmented Dickey-Fuller (ADF) test, whose null hypothesis is that the time series has a unit root and is therefore non-stationary. Typical non-stationary characteristics include trends or seasonal patterns. However, these characteristics can often be removed through suitable transformations, such as differencing, which allows the use of methods that are otherwise only suitable for stationary time series.
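One common transformation of this kind is differencing, which replaces each value by its change relative to the previous value. The following sketch applies it to a synthetic series with a linear trend and compares the means of the first and second half before and after. Comparing half means is only a crude check chosen for illustration and is not the ADF test. The series and the function names are assumptions.

```python
import random
from statistics import mean

rng = random.Random(7)
n = 1000

# Synthetic non-stationary series: linear trend plus Gaussian noise.
series = [0.05 * t + rng.gauss(0.0, 1.0) for t in range(n)]


def difference(x, lag=1):
    """Return x_t - x_{t-lag}. A lag equal to the season length would
    remove a seasonal pattern instead of a trend."""
    return [x[t] - x[t - lag] for t in range(lag, len(x))]


def half_mean_gap(x):
    """Absolute difference between the means of the two halves of x."""
    mid = len(x) // 2
    return abs(mean(x[mid:]) - mean(x[:mid]))


raw_gap = half_mean_gap(series)
diff_gap = half_mean_gap(difference(series))

print(f"mean shift between halves, raw series:         {raw_gap:.2f}")
print(f"mean shift between halves, differenced series: {diff_gap:.2f}")
```

The mean of the raw series drifts strongly between the two halves, while the differenced series fluctuates around a constant level. Forecasts made on the differenced series must be summed up again to return to the original scale.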

Trends and Seasonality

Many time series exhibit trends or seasonal patterns, as illustrated in Figure 2. A trend describes a long-term pattern of change over time. For instance, the energy generated by photovoltaics in recent years shows a clear upward trend. Seasonality describes variations that recur at regular intervals. This can be a typical daily or weekly profile, but it can also encompass seasonal differences over a year or multiple years. Examples of seasonal patterns in the energy industry include higher heating demand in winter and electricity prices that drop noticeably around midday due to photovoltaic feed-in.

Figure 2: Plot of a time series divided into its constituent components: a trend, a seasonal component, and Gaussian noise.
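A decomposition of this kind can be sketched with a classical additive scheme: the trend is estimated with a centered moving average over one full period, the seasonal component is the average detrended value at each position within the period, and the remainder is treated as noise. The code below is a simplified illustration on synthetic daily data with a weekly pattern. It assumes an odd period, and the series and function name are assumptions rather than the method used to produce the figure.

```python
import math
import random


def decompose_additive(x, period):
    """Classical additive decomposition (simplified; period must be odd).
    Returns trend, seasonal (one value per position), and residual.
    Trend and residual are None at the edges where the window does not fit."""
    n, half = len(x), period // 2
    trend = [None] * n
    for t in range(half, n - half):
        trend[t] = sum(x[t - half:t + half + 1]) / period

    # Average the detrended values for each position in the period.
    buckets = [[] for _ in range(period)]
    for t in range(n):
        if trend[t] is not None:
            buckets[t % period].append(x[t] - trend[t])
    raw = [sum(b) / len(b) for b in buckets]
    offset = sum(raw) / period
    seasonal = [v - offset for v in raw]  # centered so it sums to zero

    residual = [
        x[t] - trend[t] - seasonal[t % period] if trend[t] is not None else None
        for t in range(n)
    ]
    return trend, seasonal, residual


rng = random.Random(3)
period = 7  # weekly pattern in daily data
series = [
    0.1 * t + 3.0 * math.sin(2 * math.pi * t / period) + rng.gauss(0.0, 0.5)
    for t in range(210)
]

trend, seasonal, residual = decompose_additive(series, period)

print("seasonal component:", [round(s, 2) for s in seasonal])
print("trend at t=100:    ", round(trend[100], 2))
```

For an even period, such as 24 hours or 12 months, the moving average needs half weights at both ends of the window, which this sketch leaves out for brevity.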

Characteristics of Non-Parametric Models

For non-parametric models, both the model requirements and the amount of available data are important considerations. The following decision criteria arise:

The number of data points in the time series. Here, a dataset is considered small if it contains fewer than several thousand data points.
The computational effort required.
The implementation effort required to create the model. However, there are also packages available that simplify implementation.

Furthermore, we specify which models are designed to predict sequences and to accept them as input. A sequence in this context refers to multiple consecutive time steps within a time series. Using sequences can be advantageous in time series forecasting because the series may contain interdependencies or patterns that span several time steps.
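Preparing sequences usually means cutting the series into overlapping windows, each consisting of an input sequence and the sequence that follows it as the prediction target. A minimal sketch is shown below. The function name make_windows and the sample values are hypothetical.

```python
def make_windows(series, input_len, output_len):
    """Split a series into (input sequence, target sequence) pairs
    using a sliding window that advances one time step at a time."""
    pairs = []
    last_start = len(series) - input_len - output_len
    for start in range(last_start + 1):
        split = start + input_len
        inputs = series[start:split]
        targets = series[split:split + output_len]
        pairs.append((inputs, targets))
    return pairs


# Hypothetical hourly load values.
hourly_load = [10, 12, 15, 14, 13, 11, 9, 8]

pairs = make_windows(hourly_load, input_len=3, output_len=2)
for inputs, targets in pairs:
    print(inputs, "->", targets)
```

Each pair can then serve as one training example for a model that maps the last three observations to the next two.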

Decision Tree – Which Methods are Suitable?

The decision tree is intended as a guide to finding an appropriate method for a given time series. You can select the properties that the time series or the method should have and thereby arrive at the methods that match them.
