You might remember Convolutional Neural Networks (ConvNets, or CNNs) from Chapter 3, Utilizing Computer Vision, where we looked briefly at roofs and insurance. In computer vision, convolutional filters slide over an image in two dimensions. There is also a version of the convolutional filter that slides over a sequence in one dimension. Its output is another sequence, just as the output of a two-dimensional convolution is another image. In every other respect, one-dimensional convolutions work exactly like their two-dimensional counterparts.
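To make the sliding concrete, here is a minimal NumPy sketch of a one-dimensional convolution over a single-channel sequence (no padding, stride 1). The `edge_detector` kernel and the toy signal are made up for illustration; Keras's `Conv1D` does the same sliding dot product, but with many learned kernels over many channels:

```python
import numpy as np

def conv1d(sequence, kernel):
    """Slide a 1D kernel over a sequence (no padding, stride 1)."""
    k = len(kernel)
    return np.array([np.dot(sequence[i:i + k], kernel)
                     for i in range(len(sequence) - k + 1)])

signal = np.array([0., 0., 1., 1., 1., 0., 0.])
edge_detector = np.array([-1., 1.])   # responds to steps in the signal

print(conv1d(signal, edge_detector))  # [ 0.  1.  0.  0. -1.  0.]
```

Note that the output is one element shorter per kernel step: a length-7 input convolved with a length-2 kernel yields a length-6 sequence, which is why the layer shapes shrink as we go deeper.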
In this section, we're going to start by building a ConvNet that expects a fixed input length:
```python
from keras.models import Sequential
from keras.layers import Conv1D, Activation, MaxPool1D, Flatten, Dense

n_features = 29
max_len = 100

model = Sequential()
model.add(Conv1D(16, 5, input_shape=(max_len, n_features)))
model.add(Activation('relu'))
model.add(MaxPool1D(5))
model.add(Conv1D(16, 5))
model.add(Activation('relu'))
model.add(MaxPool1D(5))
model.add(Flatten())
model.add(Dense(1))
```
Notice that next to Conv1D and Activation, there are two more layer types in this network. MaxPool1D works exactly like MaxPooling2D, which we used earlier in the book: it takes a window of the sequence with a specified length and returns the maximum element in that window, just as it returned the maximum element of a small two-dimensional window in image networks. Take note that max pooling always returns the maximum element for each channel separately.

Flatten transforms the two-dimensional sequence tensor into a one-dimensional flat tensor. To use Flatten in combination with Dense, we need to specify the sequence length in the input shape; here, we set it with the max_len variable. We do this because Dense expects a fixed input shape, and the size of the tensor Flatten returns depends on the size of its input.
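We can check by hand why fixing max_len makes the Flatten output predictable, by tracing the time dimension through the network. The two helper functions below assume Keras's defaults ('valid' padding for Conv1D and a pooling stride equal to the pool size for MaxPool1D):

```python
def conv_len(n, kernel):
    """Output length of a 'valid' convolution with stride 1."""
    return n - kernel + 1

def pool_len(n, pool):
    """Output length of non-overlapping max pooling."""
    return n // pool

steps = 100                  # max_len
steps = conv_len(steps, 5)   # Conv1D(16, 5)  -> 96 time steps
steps = pool_len(steps, 5)   # MaxPool1D(5)   -> 19
steps = conv_len(steps, 5)   # Conv1D(16, 5)  -> 15
steps = pool_len(steps, 5)   # MaxPool1D(5)   -> 3
flat = steps * 16            # Flatten over 16 channels
print(flat)                  # 48
```

So the Dense layer receives a vector of exactly 48 values, but only because the input length was fixed at 100; a different max_len would change this number, which is what Dense cannot handle.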
An alternative to using Flatten is GlobalMaxPool1D, which returns the maximum element of the entire sequence for each channel. Since its output size is fixed regardless of how long the sequence is, you can use a Dense layer afterward without fixing the input length.
Our model compiles just as you would expect:
```python
model.compile(optimizer='adam', loss='mean_absolute_percentage_error')
```
We then train it on the generator that we wrote earlier. To obtain separate train and validation sets, we must first split the overall dataset and then create two generators based on the two datasets. To do this, run the following code:
```python
from sklearn.model_selection import train_test_split

batch_size = 128
train_df, val_df = train_test_split(train, test_size=0.1)

train_gen = generate_batches(train_df, batch_size=batch_size)
val_gen = generate_batches(val_df, batch_size=batch_size)

n_train_samples = train_df.shape[0]
n_val_samples = val_df.shape[0]
```
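If you want to experiment without the chapter's actual generate_batches (which works on the real DataFrame), a hypothetical stand-in that yields batches of the shape the model expects could look like this. The array shapes and names here are assumptions for illustration only:

```python
import numpy as np

def generate_batches(data, targets, batch_size=128):
    """Hypothetical stand-in generator: yields random (X, y) batches
    forever from pre-built sequence arrays."""
    n = len(data)
    while True:
        idx = np.random.randint(0, n, size=batch_size)
        yield data[idx], targets[idx]

# Fake data with the shapes the model expects: (samples, max_len, n_features)
data = np.random.rand(1000, 100, 29)
targets = np.random.rand(1000)

X, y = next(generate_batches(data, targets, batch_size=128))
print(X.shape, y.shape)  # (128, 100, 29) (128,)
```

The important invariant is that every batch has shape (batch_size, max_len, n_features), matching the input_shape we gave the first Conv1D layer.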
Finally, we can train our model on a generator, just like we did in computer vision:
```python
model.fit_generator(train_gen,
                    epochs=20,
                    steps_per_epoch=n_train_samples // batch_size,
                    validation_data=val_gen,
                    validation_steps=n_val_samples // batch_size)
```
Your validation loss will still be quite high, around 12,798,928. The absolute loss value is never a good guide to how well your model is doing; it's better to use other metrics to see whether your forecasts are useful. Note, however, that we will reduce the loss significantly later in this chapter.
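One simple way to judge usefulness is to compute the metric yourself and compare the model against a naive baseline. The loss we chose, mean absolute percentage error, is straightforward to compute; the values below are made up purely to show the calculation:

```python
import numpy as np

def mape(y_true, y_pred):
    """Mean absolute percentage error, the loss the model optimizes."""
    return np.mean(np.abs((y_true - y_pred) / y_true)) * 100

y_true = np.array([100., 200., 400.])
naive = np.array([110., 180., 500.])  # e.g. a "repeat last value" baseline

print(mape(y_true, naive))  # 15.0
```

If your model cannot beat such a baseline on this metric, its large raw loss is a symptom rather than the diagnosis.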