SMOOTHING TECHNIQUES & TIME SERIES DECOMPOSITION

Time series models can be used to forecast values over a time period. There are many ways to do this.

Table of contents

1 Exponential Smoothing

1.1 Single Exponential

2 Double or Triple Exponential

Exponential Smoothing

Exponential smoothing models are also known as ETS models (Error, Trend, Seasonality). The triple exponential variant is also known as the Holt-Winters method.

The prerequisite for smoothing is that the data must be ‘stationary.’ If the data is not stationary, it is first converted into stationary data. If such a conversion fails or is impossible, then volatility models such as ARCH or GARCH are used instead. The same applies to ARIMA methods.

There are many types of exponential smoothing, including Single Exponential and Double Exponential Smoothing.

The general equation for exponential smoothing is Yt = f(Yt-1, Et-1), where Yt represents the current value, Yt-1 is the value from the last time period, and Et-1 is the error from the last period.

Simply put, the current value is a function of the past value and the past error.

Note also that when the errors are not independent of each other, they can show a pattern.

To understand the workings of exponential smoothing, we will use a dataset. Below is a dataset where the actual values (Price) are represented as Yt.

Single Exponential

The forecast values in exponential smoothing are represented by Ft, while the difference between Yt and Ft is represented by Et (the error). The current value is a function of both the past value and the past error: Yt = f(Yt-1, Et-1).

The formula for exponential smoothing is:

Ft+1 = Ft + a(Yt - Ft)

where,

Ft+1 = forecast for the new period

Ft = forecast for the last period

Yt = actual value of the last period

a = smoothing constant (a number between 0 and 1)

You can also write the same formula in another way:

Ft+1 = aYt + (1 - a)Ft

where,

Ft+1 = new forecast

aYt = alpha multiplied by the last actual value

Ft = last forecast value

The dataset mentioned above will be used to apply these formulas. Note that the best value of alpha is not known in advance. A smaller value of alpha produces heavier, more visible smoothing, while a larger value responds faster to changes in the time series but smooths less. You can either determine alpha by trial and error, or use the optimization routines in statistical software to find the value that minimizes the forecast error. For the following calculations, we use an alpha value of 0.2.

We use the following formula: Ft+1 = aYt + (1 - a)Ft.

Notice that 2014-Q1 is the first entry. There is no earlier forecast to build on, so we initialize the forecast with the actual value: F1 = Y1. The first forecast and the first actual value are, therefore, identical. The second forecast is also the same as the first actual value, because F2 = aY1 + (1 - a)F1 = Y1. From the third entry onward, the formula combines the previous actual value with the previous forecast value. The MSE is 744.
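
As a rough illustration of the calculation described above, the following Python sketch applies the formula with F1 = Y1. The price values are hypothetical placeholders (they are not the dataset used in this article), and the function name is an assumption made for this example.

```python
def single_exponential_smoothing(actuals, alpha):
    """Return forecasts F1..Fn+1 for the actual values Y1..Yn."""
    if not 0 <= alpha <= 1:
        raise ValueError("alpha must be between 0 and 1")
    forecasts = [actuals[0]]  # initialization: F1 = Y1
    for y in actuals:
        last = forecasts[-1]
        # Ft+1 = Ft + a(Yt - Ft), which equals aYt + (1 - a)Ft
        forecasts.append(last + alpha * (y - last))
    return forecasts


# Hypothetical quarterly prices, for illustration only.
prices = [100, 110, 105, 120]
forecasts = single_exponential_smoothing(prices, alpha=0.2)
for period, forecast in enumerate(forecasts, start=1):
    print(f"F{period} = {forecast:.2f}")
```

The last element of the list is the forecast for the period that follows the final actual value.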

Double or Triple Exponential

Double exponential smoothing adds a second smoothing equation that tracks the trend. Here, we require both a (alpha) and b (beta). Triple exponential smoothing is similar: it adds a third equation for seasonality and requires alpha, beta, and gamma.

If we take the single exponential smoothing method above and apply exponential smoothing to its result, we get double exponential smoothing. If we apply it once more, we end up with triple exponential smoothing. This is where ETS models, which include the Holt-Winters method, come in. Single exponential smoothing relies on stationary data, double exponential smoothing is capable of capturing linear trends, and triple exponential smoothing can also handle seasonality.
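
One common formulation of double exponential smoothing is Holt's linear method, which keeps a smoothed level (controlled by alpha) and a smoothed trend (controlled by beta). The sketch below assumes that formulation and a simple initialization (level = Y1, trend = Y2 - Y1). The initialization, the function name, and the sample values are assumptions for illustration, not something taken from this article's dataset.

```python
def holt_linear(actuals, alpha, beta):
    """Holt's linear method: one-step-ahead forecasts for periods 2..n+1."""
    level = actuals[0]
    trend = actuals[1] - actuals[0]  # assumed initialization
    forecasts = []
    for y in actuals[1:]:
        forecasts.append(level + trend)  # forecast made before seeing y
        previous_level = level
        # Smooth the level, then smooth the change in level (the trend).
        level = alpha * y + (1 - alpha) * (level + trend)
        trend = beta * (level - previous_level) + (1 - beta) * trend
    forecasts.append(level + trend)  # forecast for period n+1
    return forecasts


# Hypothetical series with an upward trend, for illustration only.
series = [50, 54, 59, 63, 70, 74]
print(holt_linear(series, alpha=0.5, beta=0.3))
```

On a perfectly linear series this method reproduces the line exactly, which is one way to sanity-check an implementation.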

While the smoothing techniques can be very useful, there’s another frequently used technique called Time Series Decomposition.

Decomposition

Time Series Decomposition can be described as a pattern-based technique. As explained in Introduction To Time Series Data, the main components of time series data are trend, seasonality, cyclicity, and irregularity.

To put it all in a formula, we can say that the current value is a function of these four components: Yt = f(Tt, St, Ct, It), where Yt represents the value in the current time period, Tt indicates trend, St indicates seasonality, Ct indicates cyclicity, and It indicates irregularity.

There are two types of decomposition models:

1) Additive Model: Yt = Tt + St + Ct + It

Here, Yt is the sum of the four components: Trend, Seasonality, Cyclicity, and Irregularity.

2) Multiplicative Model: Yt = Tt x St x Ct x It

Here, Yt is the product of the four components: Trend, Seasonality, Cyclicity, and Irregularity.

Time Series Decomposition splits a data set into these components.

We have, for example, the following dataset.

Here, Yt is the ‘Price’ variable. Under the multiplicative model, ‘Price’ = Tt x St x Ct x It.

Now, we will build a multiplicative time series decomposition model.

1st Step:

First, we add the variable ‘t.’ This is nothing more than a time code (1, 2, 3, and so on) that will be helpful in the next steps.

2nd Step:

Our data can contain up to four components: trend, seasonality, cyclicity, and irregularity. Which of them are present can be checked by creating a line graph from the data.

The graph shows that the price rises and falls in a repeating pattern and that there is an upward trend. The fourth quarter of each year sees the highest peak. There is also some irregularity. Cyclicity, however, usually shows up only over long time spans. Because our data is limited and we are forecasting for the short term, we don’t include the cyclicity component.

Our Yt is therefore made up of three components: trend, seasonality, and irregularity.

3rd Step:

This step aims to smooth the data. The data contains seasonality and irregularity. We can smooth it by evening out the peaks and the slumps, which is done with a moving average. Because each of our seasons is divided into four quarters, the moving average is calculated over four periods. The calculation of the moving average is shown below. Each value is the average of four consecutive quarters.

We don’t calculate the moving average for the last row (2017-Q4) because we don’t have the 17th value needed to compute it.
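
The moving average described in this step can be sketched in Python as follows. The sketch assumes that each average is stored at the third of its four quarters, which matches the first average being placed at 2014 Q3 and the last row having no value. The prices are hypothetical and the function name is an assumption.

```python
def moving_average_4(values):
    """4-period moving average, stored at the third of the four periods."""
    n = len(values)
    result = [None] * n
    # Index i (0-based) holds the average of values[i-2] .. values[i+1].
    for i in range(2, n - 1):
        result[i] = sum(values[i - 2 : i + 2]) / 4
    return result


# Hypothetical quarterly prices (two years), for illustration only.
prices = [50, 55, 62, 80, 60, 66, 74, 95]
for t, (price, ma) in enumerate(zip(prices, moving_average_4(prices)), start=1):
    print(t, price, ma)
```

The first two rows and the last row stay empty because a full window of four quarters is not available for them.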

4th Step:

As we used an even number of periods for our moving average in the previous step, we are now required to calculate a Centered Moving Average. This can be understood intuitively by looking at the first moving average we calculated, which is 67 (2014 Q3). It technically represents the center of the 2014 Q1-Q4 quarters, as we averaged their values. The value of 67 should therefore sit between the first pair of quarters (2014 Q1, 2014 Q2) and the second pair (2014 Q3, 2014 Q4). The same holds for every other moving average, each of which should correspond to the exact center of its period.

As such, the value of 67 does not represent 2014 Q3 but rather the point between Q2 and Q3. The averages are not centered on a quarter because they fall between the quarters they are averaging. This is because the number of periods used to compute the average is even. We would not have needed to center the averages if the number of periods had been odd. However, here we do. We calculate the Centered Moving Average by averaging two adjacent Moving Average values in order to return to the center.

If we take the mean of the 2014 Q3 and 2014 Q4 moving averages, then we can use that value to represent 2014 Q3.

We don’t have 2017-Q4’s moving average, so we cannot calculate the centered average for 2017-Q3.
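
The centering can be sketched in Python as below. This is a minimal sketch that assumes the same alignment as the text (each 4-period average stored at the third quarter of its window, then averaged with the next one). The prices and the function name are hypothetical.

```python
def centered_moving_average(values):
    """Centered 4-period moving average; None where it cannot be computed."""
    n = len(values)
    # Plain 4-period moving average, stored at the third of the four periods.
    ma = [None] * n
    for i in range(2, n - 1):
        ma[i] = sum(values[i - 2 : i + 2]) / 4
    # Center it by averaging each moving average with the next one.
    cma = [None] * n
    for i in range(2, n - 2):
        cma[i] = (ma[i] + ma[i + 1]) / 2
    return cma


# Hypothetical quarterly prices (two years), for illustration only.
prices = [50, 55, 62, 80, 60, 66, 74, 95]
print(centered_moving_average(prices))
```

With this alignment the first two and the last two rows have no centered value, so a series of 16 quarters yields 12 baseline points.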

These centered moving averages can now be plotted. This will give us a baseline that is data without seasonality or irregularity.

5th Step:

The graph above shows the differences between the orange line (the original data) and the blue line (the baseline). These differences can be used to extract the seasonality and irregularity.

According to the multiplicative model, Yt = Tt x St x Ct x It. Since we are not including cyclicity, Yt = Tt x St x It.

To extract the seasonality and irregularity components, we simply divide Yt by the Centered Moving Average. In other words, we get the combined seasonality and irregularity component by dividing the original data points by the smoothed data points.

To put this in context: the St x It value of 1.07 for 2014-Q3 indicates that the seasonality/irregularity component in 2014 Quarter 3 was 7% higher than the smoothed data or baseline, while the St x It value of 0.80 for 2015-Q1 signifies that the component was 20% lower than the baseline for that time period.
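
The division in this step is simple enough to express in a few lines of Python. The sketch below assumes that missing baseline values are stored as None. The numbers are hypothetical and chosen only so that the ratios resemble the 1.07 and 0.80 mentioned above.

```python
def seasonal_irregular(values, baseline):
    """St x It = Yt / baseline; None where the baseline is missing."""
    return [
        None if base is None else y / base
        for y, base in zip(values, baseline)
    ]


# Hypothetical actual values and centered moving averages.
actual = [60.0, 64.0, 107.0, 80.0]
baseline = [None, None, 100.0, 100.0]
print(seasonal_irregular(actual, baseline))
```

A ratio above 1 means the quarter sits above the baseline, and a ratio below 1 means it sits below.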

6th Step:

This step extracts the Seasonality component from the Seasonality and Irregularity column. We use the Seasonal Index to do this. Each of our “cycles” (not to be confused with cyclicity) is a season made up of four quarters. To calculate the Seasonal Index, we take the Seasonality and Irregularity values of the same quarter across all years and average them. This averages out the irregularity.

These values are then added to the main table.

The Seasonal Index of a quarter is this mean. For example, the 2015 Q1 Seasonal Index is 0.78. This means that the seasonal component in 2015 Q1 is 22% lower than the baseline, while in 2015 Q4 it is 19% higher.
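
A minimal Python sketch of the Seasonal Index calculation follows. It assumes the series starts in Q1, that missing ratios are stored as None, and that every quarter has at least one ratio. The ratios and the function name are hypothetical.

```python
def seasonal_index(ratios, period=4):
    """Average the St x It ratios of each quarter across all years."""
    buckets = [[] for _ in range(period)]
    for i, ratio in enumerate(ratios):
        if ratio is not None:
            buckets[i % period].append(ratio)  # position 0 is Q1, 1 is Q2, ...
    return [sum(bucket) / len(bucket) for bucket in buckets]


# Hypothetical St x It ratios for three years (None where no baseline exists).
ratios = [None, None, 1.0, 1.2, 0.8, 1.0, 1.1, 1.3, 0.9, 0.9, None, None]
print(seasonal_index(ratios))
```

The result holds one index per quarter, which is then repeated for every year in the table.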

7th Step:

This step is called Deseasonalizing. First, we calculated the baseline, which is free from seasonality and irregularity. We then used the baseline and the original values to extract the combined seasonality and irregularity. After that, we isolated the seasonality. Since we know that Yt = Tt x St x It, we can now use the following formula: Tt x It = Yt / St

We can see the difference by plotting a line graph showing the Price variable and the deseasonalized variable. The orange line (Yt) contains all three components, while the red line (Deseasonalized) lacks the peaks and slumps because the seasonal component has been removed. This line is moving upwards, which shows that it still has the trend, along with the irregularity component.
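
In code, deseasonalizing is a single division per data point. The sketch below assumes the series starts in Q1 and that the seasonal index list is ordered Q1 to Q4. All values and names are hypothetical.

```python
def deseasonalize(values, index, period=4):
    """Tt x It = Yt / St, using the seasonal index of each value's quarter."""
    return [y / index[i % period] for i, y in enumerate(values)]


# Hypothetical prices and seasonal indices (Q1..Q4), for illustration only.
prices = [80.0, 100.0, 110.0, 130.0, 88.0]
index = [0.8, 1.0, 1.1, 1.3]
print(deseasonalize(prices, index))
```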

8th Step:

We have so far isolated the seasonality component. This step extracts the trend component. We do this by running a simple linear regression. The deseasonalized variable is our Y variable, while the t variable is our X variable. We get the following equation from our regression:

y = 3.746x + 57.25

Here, 3.746 is the coefficient of the x variable and 57.25 is the intercept. This equation is used to calculate the trend line values. The first data point has x = 1, the second has x = 2, and so on.

The trend line is a simple regression in which the x variable represents the time code and the y variable the deseasonalized values. The deseasonalized and trend lines are different because of the irregularities in the deseasonalized lines.
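
The regression can be sketched with ordinary least squares written out by hand, so no external library is needed. The deseasonalized values below are hypothetical, which means the fitted coefficients will differ from the 3.746 and 57.25 reported above.

```python
def fit_trend(deseasonalized):
    """Least-squares line through (t, value) with t = 1, 2, ..., n."""
    n = len(deseasonalized)
    t = range(1, n + 1)
    t_mean = sum(t) / n
    y_mean = sum(deseasonalized) / n
    covariance = sum((ti - t_mean) * (yi - y_mean)
                     for ti, yi in zip(t, deseasonalized))
    variance = sum((ti - t_mean) ** 2 for ti in t)
    slope = covariance / variance
    intercept = y_mean - slope * t_mean
    return slope, intercept


# Hypothetical deseasonalized values, for illustration only.
values = [61.0, 64.5, 69.0, 71.5, 76.0, 80.5, 83.0, 87.5]
slope, intercept = fit_trend(values)
trend_line = [slope * t + intercept for t in range(1, len(values) + 1)]
print(round(slope, 3), round(intercept, 3))
```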

9th Step:

We have been able to extract the trend and seasonality using the time series decomposition method. Comparing all the lines, the orange line is Yt, which has all three components (cyclicity is not considered in this example). The blue line is the baseline, calculated using a centered moving average. It is not affected by seasonality or irregularity, but it cannot be considered a trend line. Dividing Yt by the baseline gives the grey line, which holds the seasonality and irregularity components. From it we obtain the purple line, which is the seasonality line. Dividing Yt by the seasonality component then gives the trend and irregularity (red line, deseasonalized). Finally, running a simple regression on the deseasonalized values gives the black line (trend line).

Now we can make predictions using the multiplicative model, Yt = Tt x St. First, we forecast values for the time periods where we already have actual values. This allows us to compute some error measures.

We can measure the error using, for example, the mean squared error (MSE). To do this, we subtract the forecasted values from the original values and square the differences. The average of these squared differences is the MSE. In our case, it is 6.2.
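
The MSE calculation can be written in a few lines of Python. The actual and forecasted values below are hypothetical, so the result will not match the 6.2 reported above.

```python
def mean_squared_error(actuals, forecasts):
    """Average of the squared differences between actuals and forecasts."""
    if len(actuals) != len(forecasts):
        raise ValueError("both lists must have the same length")
    squared_errors = [(a - f) ** 2 for a, f in zip(actuals, forecasts)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical actual and forecasted prices, for illustration only.
actual = [50.0, 55.0, 62.0, 80.0]
forecast = [48.5, 57.0, 63.0, 77.5]
print(mean_squared_error(actual, forecast))
```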

We also forecast the next 4 quarters. This is done by taking the trend and seasonality components and multiplying them to get the following numbers.
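
As a sketch of this step, the trend value for each future time code is multiplied by the seasonal index of its quarter. The slope and intercept are the regression coefficients reported in the 8th step, and the indices 0.78 (Q1) and 1.19 (Q4) come from the 6th step. The Q2 and Q3 indices are hypothetical placeholders, so the printed numbers are illustrative only.

```python
def forecast_quarters(slope, intercept, index, start_t, horizon, period=4):
    """Forecast = trend(t) x seasonal index, assuming t = 1 is a Q1."""
    forecasts = []
    for t in range(start_t, start_t + horizon):
        trend = slope * t + intercept      # Tt from the regression line
        seasonal = index[(t - 1) % period]  # St for the quarter of t
        forecasts.append(trend * seasonal)
    return forecasts


slope, intercept = 3.746, 57.25
quarter_index = [0.78, 0.95, 1.08, 1.19]  # Q2 and Q3 are hypothetical
# Time codes 17 to 20 are the four quarters after the 16 observed ones.
print(forecast_quarters(slope, intercept, quarter_index, start_t=17, horizon=4))
```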

We can visualize the future values by plotting the actual values together with the forecasted values for 2018 and 2019.

In this example, the multiplicative time series decomposition model forecasts the values closely. The additive model is the better choice when the seasonal swings stay roughly constant in size as the trend changes, while the multiplicative model works better when the seasonal swings grow or shrink in proportion to the level of the series.

ETS and time series decomposition are medium-level techniques for forecasting values. They should be used when the data shows trend and seasonality. More advanced techniques, including those from the ARIMA family, are discussed in the next blog.