Different kinds of stationarity

Stationarity can mean different things, and it is crucial to understand which kind of stationarity is required for the task at hand. For simplicity, we will just look at two kinds of stationarity here: mean stationarity and variance stationarity. The following image shows four time series with different degrees of (non-)stationarity:

[Figure: Different kinds of stationarity]

Mean stationarity refers to the level of a series being constant. Here, individual data points can deviate, of course, but the long-run mean should be stable. Variance stationarity refers to the variance from the mean being constant. Again, there may be outliers and short sequences whose variance seems higher, but the overall variance should be at the same level. A third kind of stationarity, which is difficult to visualize and is not shown here, is covariance stationarity. This refers to the covariance between different lags being constant. When people refer to covariance stationarity, they usually mean the special condition in which mean, variance, and covariances are stationary. Many econometric models, especially in risk management, operate under this covariance stationarity assumption.

Why stationarity matters

Many classic econometric methods assume some form of stationarity. A key reason for this is that inference and hypothesis testing work better when time series are stationary. However, even from a pure forecasting point of view, stationarity helps because it takes some work away from our model. Take a look at the Not Mean Stationary series in the preceding charts. You can see that a major part of forecasting the series is to recognize the fact that the series moves upward. If we can capture this fact outside of the model, the model has to learn less and can use its capacity for other purposes. Another reason is that it keeps the values we feed into the model in the same range. Remember that we need to standardize data before using a neural network. If a stock price grows from $1 to $1,000, we end up with non-standardized data, which will in turn make training difficult.

Making a time series stationary

The standard method of achieving mean stationarity in financial data (especially prices) is called differencing, that is, computing the change, or return, from one period to the next. In the following image, you can see the raw and differenced versions of the S&P 500. The raw version is not mean stationary, as its level grows over time, but the differenced version is roughly mean stationary.

[Figure: Raw and differenced versions of the S&P 500]
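As a minimal sketch with made-up prices, differencing and return computation can be done with NumPy:

```python
import numpy as np

# Hypothetical price series
prices = np.array([100.0, 101.5, 101.0, 103.0])

# First differences: absolute change between consecutive prices
diffs = np.diff(prices)

# Simple returns: relative change, the usual choice for prices
returns = diffs / prices[:-1]
```

Note that both `diffs` and `returns` are one element shorter than `prices`, since the first observation has no predecessor to difference against.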

Another approach to mean stationarity is based on linear regression, in which we fit a linear model to the data and subtract the fitted trend. A popular library for this kind of classical modeling is statsmodels, which includes a built-in linear regression model. The following example shows how to use statsmodels to remove a linear trend from data:

import numpy as np
import statsmodels.api as sm

time = np.linspace(0, 10, 1000)
series = time + np.random.randn(1000) * 0.2

# Regress the series on time (with an intercept) and
# subtract the fitted trend to detrend the series
X = sm.add_constant(time)
mdl = sm.OLS(series, X).fit()
trend = mdl.predict(X)
detrended = series - trend
[Figure: The series before and after removing the linear trend]

It is worth emphasizing that any transformation used to achieve stationarity is part of the model and should therefore be fit on the training set only. This is not a big issue with differencing, but it can lead to look-ahead bias with linear detrending: a trend fitted on the full dataset leaks information about future values into the training data.

Removing variance non-stationarity is harder. A typical approach is to compute a rolling variance estimate and divide new values by it. On the training set, you can also studentize the data: compute the daily variance, and then divide all values by its square root. Again, you may do this only on the training set, as the variance computation requires that the values are already known.
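A rough sketch of the rolling approach with pandas is shown below; the 50-observation window is an arbitrary choice, and shifting the rolling estimate by one step ensures each value is scaled only by past information:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# A series whose volatility doubles halfway through
s = pd.Series(np.concatenate([rng.normal(scale=1.0, size=500),
                              rng.normal(scale=2.0, size=500)]))

# Rolling standard deviation, shifted one step so each value is
# divided only by an estimate built from earlier observations
rolling_std = s.rolling(window=50).std().shift(1)
scaled = s / rolling_std
```

The first 50 values of `scaled` are NaN because the rolling window has not yet filled; in practice you would drop or backfill this warm-up period.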

When to ignore stationarity issues

There are times when you should not worry about stationarity; for instance, when forecasting a sudden change, a so-called structural break. In the Wikipedia example, we are interested in knowing when sites start to be visited much more frequently than before. In this case, removing differences in level would stop our model from learning to predict such changes. Equally, we might be able to incorporate the non-stationarity into our model directly, or handle it at a later stage in the pipeline. We usually train a neural network only on small subsequences of the entire dataset. If we standardize each subsequence, the shift in mean within any one subsequence may be negligible, so we would not have to worry about it. Forecasting is a much more forgiving task than inference and hypothesis testing, so we might get away with a few non-stationarities if our model can pick up on them.
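Per-subsequence standardization can be sketched as follows; the window length of 100 and the non-overlapping split are illustrative assumptions, not a prescription:

```python
import numpy as np

def standardize_windows(series, window=100):
    # Split the series into non-overlapping subsequences and
    # standardize each one independently; a slow drift across
    # the whole series then matters much less within a window
    n = len(series) // window * window
    windows = series[:n].reshape(-1, window)
    means = windows.mean(axis=1, keepdims=True)
    stds = windows.std(axis=1, keepdims=True)
    return (windows - means) / stds
```

Each row of the result has zero mean and unit standard deviation, regardless of where in the overall trend it was taken from.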
