A trend $T$ is a smoothed version of a time series; it captures the global tendencies. To get a trend line, here is what you have to do:
- Determine the seasonal periodicity of the time series (if there is one). These periodic patterns are usually visible, but if you cannot see them from the plotted chart, there are also methods based on the Fourier transform to determine them. In our example we see a yearly periodicity; as the values for the airplane passengers come in monthly, the period is $m=12$.
- With this number $m$, use a method like the moving average of order $m$ to determine the values of the trend.
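The Fourier-based approach mentioned in the first step can be sketched in base R. This is only an illustration on a synthetic monthly series (the names `y`, `n`, `spec` and the sine-shaped data are assumptions made for this example, not the air passenger data): the strongest peak of the periodogram marks the dominant frequency, and its inverse is the period.

```r
# Synthetic series: 10 years of monthly values with a yearly cycle (assumed data)
n <- 120
t <- seq_len(n)
y <- 10 + 3 * sin(2 * pi * t / 12)

# Periodogram: squared magnitude of the discrete Fourier transform
# (the mean is removed so that the zero frequency does not dominate)
spec <- Mod(fft(y - mean(y)))^2

# spec[k + 1] belongs to the frequency "k cycles per n observations";
# only frequencies 1 .. n/2 are needed because the spectrum is symmetric
k <- which.max(spec[2:(n / 2 + 1)])

# Convert the dominant frequency into a period length
m <- n / k
print(m)  # 12 for this synthetic series
```

On real data with a strong trend the low frequencies tend to dominate the periodogram, so the series would usually be de-trended or differenced before looking for the peak.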
Now what is the moving average?
Once you understand the concept, it is easy to remember: Imagine that your dataset consists of 5 values $y_1$, ..., $y_5$. To determine the value of the trend of order $m = 3$ at a given point, you take the original value, the value of its predecessor and the value of its successor and average them. In the simplest approach you take the sum of the values divided by the number of values (the so-called Simple Moving Average, SMA).
In the example of 5 values this would look like:
$y_1$ 3
$y_2$ 5 -> (3+5+4)/3 = 4
$y_3$ 4 -> (5+4+1)/3 = 3.33
$y_4$ 1 -> (4+1+3)/3 = 2.67
$y_5$ 3
You already see the problem here: at the beginning and at the end no trend values are available, and the number of missing values depends on $m$ (for odd $m$, $(m-1)/2$ values are lost at each end).
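The small example can be reproduced in base R. This is a minimal sketch that uses `stats::filter` with equal weights as one possible way to compute a simple moving average; the variable names `y` and `sma3` are chosen for this illustration only.

```r
# The five example values y_1 .. y_5
y <- c(3, 5, 4, 1, 3)

# Simple moving average of order 3: equal weights 1/3 on the predecessor,
# the value itself and the successor (sides = 2 centers the window)
sma3 <- as.numeric(stats::filter(y, rep(1 / 3, 3), sides = 2))

# Values: NA, 4.00, 3.33, 2.67, NA
print(round(sma3, 2))
```

The `NA` entries at both ends are exactly the missing values described above.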
What if we choose an even order such as $m=4$? We would have to take more points from the past than from the future (or vice versa), so the window is not symmetric anymore. Usually you therefore either switch to the next odd number (for example $m=13$ instead of $m=12$) or you choose a so-called centered moving average. In the centered moving average you first use a simple moving average of order 2 to determine intermediate values like $y_{1.5}$. Then you apply the SMA of order $m$ to these intermediate values.
$y_1$ 3
-> $y_{1.5}$ = (3+5)/2 = 4
$y_2$ 5
-> $y_{2.5}$ = (5+4)/2 = 4.5
$y_3$ 4 -> (4+4.5+2.5+2) / 4 = 3.25
-> $y_{3.5}$ = (4+1)/2 = 2.5
$y_4$ 1
-> $y_{4.5}$ = (1+3)/2 = 2
$y_5$ 3
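The two steps of the centered moving average can be sketched in base R as follows. The names `half`, `cma_y3`, `w` and `cma` are illustrative; the one-pass variant relies on the fact that averaging pairs first and then four neighbours is the same as applying the weights 1/8, 1/4, 1/4, 1/4, 1/8 to the original values.

```r
y <- c(3, 5, 4, 1, 3)

# Step 1: order-2 averages of neighbouring values (y_1.5, y_2.5, y_3.5, y_4.5)
half <- (head(y, -1) + tail(y, -1)) / 2

# Step 2: order-4 average of the four intermediate values around y_3
cma_y3 <- mean(half)

# Same result in one pass: weights 1/8, 1/4, 1/4, 1/4, 1/8 on y_1 .. y_5
w <- c(0.5, 1, 1, 1, 0.5) / 4
cma <- as.numeric(stats::filter(y, w, sides = 2))

print(half)    # 4.0 4.5 2.5 2.0
print(cma_y3)  # 3.25
print(cma)     # only the third entry (3.25) is defined, the others are NA
```

With only five values and $m=4$, two values are lost at each end, so $y_3$ is the only point that receives a trend value.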
In our earlier example of air passengers we determined $m=12$ (even) and the trend values in yellow are determined by a centered moving average of order 12.
Here is the command in R (ma stands for moving average; the function is provided by the forecast package):
lines(ma(x, order = 12, centre = TRUE))
After successfully determining the trend, we can remove it from the original data. In an additive model we get the de-trended time series by subtracting the trend ($Y_t - T_t$), in a multiplicative model by dividing by it ($Y_t / T_t$). This de-trended time series of our AirPassengers example looks like this:
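Both ways of removing the trend can be sketched in base R on the built-in `AirPassengers` dataset. This sketch assumes the trend is a centered moving average of order 12, computed here with `stats::filter` and half weights on the two outer months so that no extra package is needed; the variable names are illustrative.

```r
# AirPassengers ships with base R (package datasets)
y <- AirPassengers

# Centered moving average of order 12: half weights on the two outer months
w <- c(0.5, rep(1, 11), 0.5) / 12
trend <- stats::filter(y, w, sides = 2)

# De-trended series for both model types
detrended_additive       <- y - trend   # Y_t - T_t
detrended_multiplicative <- y / trend   # Y_t / T_t

# Six months at each end have no trend value and therefore stay NA
print(sum(is.na(trend)))
```

Calling `plot(detrended_additive)` or `plot(detrended_multiplicative)` afterwards draws the respective de-trended series.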
The next step is to extract the seasonality component and the random component from the de-trended time series.