Configuring a data generator

As with ARIMA, we want our LSTM model to use lagged historical data to predict the actual value at a given point in time. To feed this data to an LSTM model, however, we must format it so that a fixed number of columns hold the lagged values and one column holds the target value. This used to be a slightly tedious process, but a data generator now makes the task much simpler. In our case, we will use a time-series generator that produces the sequences we need for our LSTM model.
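To make the target layout concrete, here is a minimal base R sketch of the manual approach, with no generator involved. The series values and the names series, n_lags, and lagged are made up for illustration and are not part of the chapter's data.

```r
# Toy series standing in for the real data (values are invented).
series <- c(10, 11, 13, 12, 15, 14)
n_lags <- 3

# embed() returns one row per usable time step, most recent value first:
# (x[t], x[t-1], ..., x[t-n_lags]).
windows <- embed(series, n_lags + 1)

# The first column is the value to predict.
target <- windows[, 1]

# The remaining columns are the lagged values; reorder them oldest first.
lags <- windows[, (n_lags + 1):2, drop = FALSE]
colnames(lags) <- paste0("lag_", n_lags:1)

# One row per sample: three lag columns followed by the target column.
lagged <- cbind(lags, target = target)
print(lagged)
```

Each row holds three consecutive values followed by the value that comes next, which is the structure the generator builds for us.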

The first two arguments we pass to the generator are the data object and the target object. In this case, we can use the same object for both. This is possible because of the next argument, called length, which sets the number of time steps to look back in order to populate the lagging price values: each target is a value from the series, and its inputs are the values from the same series that precede it. Afterward, we define the sampling rate and the stride. The sampling rate is the number of time steps between consecutive lagging values within a sequence, and the stride is the number of time steps between consecutive sequences, that is, between the target values of successive rows. We also define the starting index value and the ending index value, which bound the portion of the data that is used. We then decide whether the samples should be shuffled or kept in chronological order, and whether the data should be put in reverse chronological order or keep its current sort order. Finally, we select a batch size, which specifies how many time-series samples go into each batch passed to the model.
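The effect of the look-back length, sampling rate, and stride is easier to see on a small example. The following base R sketch imitates the windowing by hand. The helper make_windows and its lookback argument (which plays the role of length) are hypothetical names introduced here and are not part of keras, and the sketch uses R's 1-based positions.

```r
# Hypothetical helper (not part of keras): builds (lags, target) pairs by hand.
make_windows <- function(x, lookback, sampling_rate = 1, stride = 1) {
  # Target positions: the first one with `lookback` earlier steps available,
  # then every `stride`-th position after it.
  target_idx <- seq(from = lookback + 1, to = length(x), by = stride)
  lapply(target_idx, function(i) {
    # Lagging values: the `lookback` steps before position i,
    # keeping every `sampling_rate`-th one.
    lag_idx <- seq(from = i - lookback, to = i - 1, by = sampling_rate)
    list(lags = x[lag_idx], target = x[i])
  })
}

x <- 1:10

# Every time step is used: lags (1, 2, 3) -> target 4, lags (2, 3, 4) -> target 5, ...
dense <- make_windows(x, lookback = 3)

# A sampling rate of 2 thins the lags inside each window;
# a stride of 3 skips ahead between windows.
sparse <- make_windows(x, lookback = 4, sampling_rate = 2, stride = 3)

str(dense[[1]])
str(sparse)
```

With both values left at 1, as in this chapter, no time steps are skipped either within a window or between windows.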

For this model, we will generate our time-series data with a length value of 3, meaning that we will look back 3 days to predict a given day. We will keep the sampling rate and stride at 1 so that no time steps are skipped. Next, we will split the data into train and test sets at index 1258: the train set ends at index 1258 and the test set starts at index 1259. We will not shuffle or reverse the data, but rather maintain its chronological order, and we will set the batch size to 1 so that we model one price at a time. We create our train and test sets using these parameter values via the following code:

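# timeseries_generator() comes from the keras package
library(keras)

# Training generator: covers indices 1 to 1258
train_gen <- timeseries_generator(
  closing_deltas,       # data: source of the lagging values
  closing_deltas,       # targets: same series, one step after each window
  length = 3,           # look back 3 time steps
  sampling_rate = 1,    # use every time step within a window
  stride = 1,           # advance one time step between windows
  start_index = 1,
  end_index = 1258,
  shuffle = FALSE,      # keep chronological order
  reverse = FALSE,
  batch_size = 1        # one sample per batch
)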

# Test generator: same settings, covers indices 1259 to 1507
test_gen <- timeseries_generator(
  closing_deltas,
  closing_deltas,
  length = 3,
  sampling_rate = 1,
  stride = 1,
  start_index = 1259,
  end_index = 1507,
  shuffle = FALSE,
  reverse = FALSE,
  batch_size = 1
)

After running this code, you will see two new objects, train_gen and test_gen, in your Environment pane. Now that we have configured our data generator and used it to create the train and test sequences, we are ready to model our data using LSTM.
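As a sanity check, we can estimate how many batches each generator should yield. The sketch below is an assumption rather than documented behavior: it supposes that the index values are passed unchanged to the underlying Python generator, that the first usable target sits length steps after start_index, and that end_index is inclusive. The helper count_batches is a hypothetical name introduced here.

```r
# Hypothetical helper: expected number of batches under the assumptions above.
count_batches <- function(start_index, end_index, length,
                          stride = 1, batch_size = 1) {
  # The first target needs `length` earlier values inside the index range.
  first_target <- start_index + length
  # Integer division counts how many full strides of batches fit in the range.
  (end_index - first_target + batch_size * stride) %/% (batch_size * stride)
}

train_batches <- count_batches(1, 1258, length = 3)
test_batches  <- count_batches(1259, 1507, length = 3)

print(c(train = train_batches, test = test_batches))
```

If the generators report different numbers of batches, the index convention differs from the one assumed here.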
