Moving block bootstrapping time series data

If you followed along with the Block bootstrapping time series data recipe, you have already seen a simple bootstrapping scheme for time series data. The moving block bootstrap is slightly more involved. In this scheme, we generate overlapping blocks by sliding a fixed-size window along the series, much as a moving average does. We then draw blocks at random, with replacement, and concatenate them to create new data samples.
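
To make the scheme concrete, here is a minimal NumPy-only sketch. The function name `moving_block_bootstrap` and its parameters are hypothetical and not part of the book's code bundle; like the recipe, it draws `n // block_size` blocks, so the result can be slightly shorter than the input when the length is not a multiple of the block size.

```python
import numpy as np


def moving_block_bootstrap(values, block_size, rng):
    """Resample a 1-D series by joining randomly chosen overlapping blocks.

    Hypothetical helper for illustration only.
    """
    values = np.asarray(values)
    n = len(values)
    # Every index at which a full block of length block_size can start.
    starts = np.arange(n - block_size + 1)
    # Draw block starts with replacement, one per block to assemble.
    chosen = rng.choice(starts, size=n // block_size)
    # Slice out each block and concatenate them into one new sample.
    return np.concatenate([values[s:s + block_size] for s in chosen])


rng = np.random.default_rng(42)
series = np.arange(24)  # stand-in for two years of monthly values
resampled = moving_block_bootstrap(series, block_size=12, rng=rng)
print(resampled)
```

Because the blocks are contiguous slices, the ordering within each block is preserved; only the order of the blocks is randomized.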

In this recipe, we will apply the moving block bootstrap to annual temperature data to generate lists of second-difference medians and slopes of an AR(1) model, that is, an autoregressive model with lag 1. We will also try to neutralize outliers and noise with a median filter.
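
The two statistics can be sketched with plain NumPy. This is only an illustration under stated assumptions: the helper names `ar1_slope` and `second_difference_median` are hypothetical, and the book's own `dl.ts.ar1` and `ch6util.diff_median` may be implemented differently. The slope here is the least-squares coefficient from regressing each value on its predecessor.

```python
import numpy as np


def ar1_slope(x):
    """Least-squares slope of x[t] regressed on x[t-1] (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    slope, _intercept = np.polyfit(x[:-1], x[1:], deg=1)
    return slope


def second_difference_median(x):
    """Median of the second differences x[t] - 2*x[t-1] + x[t-2]."""
    return np.median(np.diff(x, n=2))


# Simulate an AR(1) process with a known coefficient of 0.7
# (synthetic data, not the recipe's temperature series).
rng = np.random.default_rng(0)
x = np.zeros(5000)
for t in range(1, len(x)):
    x[t] = 0.7 * x[t - 1] + rng.normal()

print(ar1_slope(x))  # should land close to 0.7
# The squares 1, 4, 9, 16, 25 have constant second differences of 2.
print(second_difference_median(np.array([1.0, 4.0, 9.0, 16.0, 25.0])))  # 2.0
```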

How to do it...

The following code snippets are from the moving_boot.ipynb file in this book's code bundle:

The imports are as follows:

import dautil as dl
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import ch6util
from scipy.signal import medfilt
from IPython.display import HTML

Define the following function to bootstrap the data:

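def shuffle(temp):
    # Draw n // 12 random block starts (with replacement); each block spans 12 months
    indices = np.random.choice(start, n // 12)
    sample = dl.collect.flatten([temp.values[i: i + 12] for i in indices])
    # Median filter (default kernel size of 3) to suppress outliers and noise
    sample = medfilt(sample)
    df = pd.DataFrame({'TEMP': sample}, index=temp.index[:len(sample)])
    # Aggregate to annual medians (the how= argument is gone from recent pandas)
    df = df.resample('A').median()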

    return df
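
The effect of the median filter used in the function above can be seen on a toy array. This is a small sketch with made-up numbers, not data from the recipe; it relies only on `scipy.signal.medfilt` with its default kernel size of 3.

```python
import numpy as np
from scipy.signal import medfilt

# A flat signal with a single spike (made-up values).
noisy = np.array([1.0, 1.0, 10.0, 1.0, 1.0])
# Each value is replaced by the median of itself and its two neighbors.
smoothed = medfilt(noisy)
print(smoothed)  # [1. 1. 1. 1. 1.]
```

Note that `medfilt` pads the array with zeros at both ends, so the first and last values of a filtered series can be pulled toward zero.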

Load the data as follows:

# Monthly medians of the temperature series
temp = dl.data.Weather.load()['TEMP'].resample('M').median().dropna()
n = len(temp)
# Valid start indices for a full 12-month block
start = np.arange(n - 11)
np.random.seed(2609787)

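Generate 240 bootstrap samples, collecting the AR(1) slope and the second difference median of each, and plot the first five realizations as a sanity check: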

# context is not defined in this excerpt; presumably it is set up earlier in moving_boot.ipynb
sp = dl.plotting.Subplotter(2, 2, context)
cp = dl.plotting.CyclePlotter(sp.ax)
medians = []
slopes = []

for i in range(240):
    df = shuffle(temp)
    slopes.append(dl.ts.ar1(df.values.flatten())['slope'])
    medians.append(ch6util.diff_median(df, 2))

    if i < 5:
        cp.plot(df.index, df.values)
        
sp.label(ylabel_params=dl.data.Weather.get_header('TEMP'))

Plot the distribution of the second difference medians using the bootstrapped data:
```
sns.distplot(medians, ax=sp.next_ax())
sp.label()
```
Plot the distribution of the AR(1) model slopes using the bootstrapped data:
```
sns.distplot(slopes, ax=sp.next_ax())
sp.label()
```

Plot the confidence intervals for a varying number of bootstraps:

mins = []
tops = []
xrng = range(30, len(medians))

for i in xrng:
    # Use names that do not shadow the built-ins min and max
    low, high = dl.stats.outliers(medians[:i])
    mins.append(low)
    tops.append(high)

    
cp = dl.plotting.CyclePlotter(sp.next_ax())
cp.plot(xrng, mins, label='5 %')
cp.plot(xrng, tops, label='95 %')
sp.label()
HTML(sp.exit())
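
The idea behind the last step can be sketched without the book's helpers. The snippet below assumes, based only on the '5 %' and '95 %' plot labels, that `dl.stats.outliers` returns lower and upper percentile bounds; that is an assumption, and the data here are synthetic stand-ins rather than the recipe's bootstrapped medians.

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic stand-in for the 240 bootstrapped medians.
boot_stats = rng.normal(size=240)

# Recompute the 5th and 95th percentiles using only the first i bootstraps.
sizes = range(30, len(boot_stats))
lower = [np.percentile(boot_stats[:i], 5) for i in sizes]
upper = [np.percentile(boot_stats[:i], 95) for i in sizes]

print(lower[-1], upper[-1])
```

If the bounds level off as the number of bootstraps grows, adding more resamples is unlikely to change the interval much.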

Refer to the following screenshot for the end result:
