Persistence model forecast

A persistence forecast is a good baseline for a time series with a linearly increasing trend. In a persistence forecast, the observation from the prior time step (t-1) is used as the prediction for the current time step (t). We can implement this by taking the last observation from the training data, plus the history accumulated by walk-forward validation, and using it as the prediction for the current time step:

# now we can see the patterns
X = sample.values
X
# first 400 observations for training, last 100 for testing
train, test = X[0:-100], X[-100:]
train
test
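The snippet above assumes that sample is a pandas Series holding the first 500 observations of the dataset loaded earlier in the chapter. If you are running this block in isolation, a minimal setup might look like the following sketch (the file name and column name are hypothetical placeholders, not from the original code):

import pandas as pd

# hypothetical setup: load a univariate series and keep its first 500 rows
data = pd.read_csv('dataset.csv')   # placeholder file name
series = data['value']              # placeholder column name
sample = series[0:500]              # first 500 observations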

In the sample of the first 500 observations, we use the first 400 observations as the training data and the last 100 as the test data. After training and predicting, we obtain a one-step-ahead forecast, plotted below alongside the original data:

import math
from sklearn.metrics import mean_squared_error
from matplotlib import pyplot

# walk-forward validation
history = [x for x in train]
predictions = list()
for i in range(len(test)):
    # make prediction: use the last known observation (persistence)
    predictions.append(history[-1])
    # add the real observation to the history for the next step
    history.append(test[i])
# report performance
rmse = math.sqrt(mean_squared_error(test, predictions))
print('RMSE: %.3f' % rmse)
# line plot of observed vs predicted
pyplot.plot(test)
pyplot.plot(predictions)
pyplot.show()
rmse

This snippet plots the predicted data against the original test data so that we can see the difference between them. The output is shown in the following screenshot:

Screenshot 14.5: Original data and predicted data

The Root Mean Squared Error (RMSE) in our case was 0.5302435957527816.
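RMSE is simply the square root of the average squared difference between the observed and predicted values. As a quick sanity check, the same value can be computed by hand with NumPy (a minimal sketch, assuming test and predictions from the snippet above):

import numpy as np

# RMSE from its definition: sqrt(mean((observed - predicted)^2))
errors = np.array(test) - np.array(predictions)
rmse_manual = np.sqrt(np.mean(errors ** 2))
print('RMSE (manual): %.3f' % rmse_manual)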

For a sample size of 50, we use the first 40 observations as the training data and the last 10 as the test data. After training and predicting, we obtain the one-step-ahead forecast, plotted below alongside the original data:

# a smaller sample of 50
sample2 = sample[0:50]
X2 = sample2.values
X2
# first 40 observations for training, last 10 for testing
train2, test2 = X2[0:-10], X2[-10:]
train2
test2

In this case, we use a smaller sample so that the effect of sample size can be seen more clearly. Now, let's make a prediction and see the difference:

# walk-forward validation
history2 = [x for x in train2]
predictions2 = list()
for i in range(len(test2)):
    # make prediction: use the last known observation (persistence)
    predictions2.append(history2[-1])
    # add the real observation to the history for the next step
    history2.append(test2[i])
# report performance
rmse2 = math.sqrt(mean_squared_error(test2, predictions2))
print('RMSE: %.3f' % rmse2)
# line plot of observed vs predicted
pyplot.plot(test2)
pyplot.plot(predictions2)
pyplot.show()
rmse2

The output graph should look like the following screenshot:

Screenshot 14.6: Original data versus predicted data when the sample size is small

In this case, the RMSE is 0.6220490938792729, which is noticeably higher than in our original case.
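Since the same walk-forward procedure is repeated for each sample size, it can be wrapped in a small helper function. The following is a minimal sketch (the function name and signature are my own, not from the original code):

import math
from sklearn.metrics import mean_squared_error

def persistence_rmse(series_values, n_test):
    # split: everything except the last n_test points is training data
    train, test = series_values[:-n_test], series_values[-n_test:]
    history = [x for x in train]
    predictions = []
    for obs in test:
        # persistence forecast: predict the last known observation
        predictions.append(history[-1])
        # walk forward: add the real observation to the history
        history.append(obs)
    return math.sqrt(mean_squared_error(test, predictions))

# usage: reproduce both experiments above
print('RMSE (500 obs): %.3f' % persistence_rmse(X, 100))
print('RMSE (50 obs): %.3f' % persistence_rmse(X2, 10))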
