Conv1D

You might remember convolutional neural networks (ConvNets, or CNNs) from Chapter 3, Utilizing Computer Vision, where we looked briefly at roofs and insurance. In computer vision, convolutional filters slide over the image two-dimensionally. There is also a version of the convolutional filter that slides over a sequence one-dimensionally. The output is another sequence, much like the output of a two-dimensional convolution is another image. Everything else about one-dimensional convolutions is exactly the same as for two-dimensional convolutions.
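To make the sliding operation concrete, here is a minimal sketch in NumPy, rather than Keras, of a single one-dimensional filter sliding over a toy sequence. The sequence values and filter weights are made up purely for illustration:

import numpy as np

sequence = np.array([1., 2., 3., 4., 5., 6.])   # toy input sequence
kernel = np.array([0.5, 1., 0.5])               # one filter of length 3

# Slide the filter over the sequence; each step yields one output value
output = np.array([np.sum(sequence[i:i + 3] * kernel)
                   for i in range(len(sequence) - 3 + 1)])
print(output)  # four values: the output is again a (shorter) sequence

A Conv1D layer does the same thing, except that it learns many such filters at once and applies them across all input channels.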

In this section, we're going to start by building a ConvNet that expects a fixed input length:

from keras.models import Sequential
from keras.layers import Conv1D, Activation, MaxPool1D, Flatten, Dense

n_features = 29
max_len = 100

model = Sequential()

# First convolutional block: 16 filters, each spanning 5 time steps
model.add(Conv1D(16, 5, input_shape=(max_len, n_features)))
model.add(Activation('relu'))
model.add(MaxPool1D(5))

# Second convolutional block
model.add(Conv1D(16, 5))
model.add(Activation('relu'))
model.add(MaxPool1D(5))

# Flatten the remaining sequence and regress a single value
model.add(Flatten())
model.add(Dense(1))

Notice that in addition to Conv1D and Activation, there are two more layer types in this network. MaxPool1D works exactly like MaxPooling2D, which we used earlier in the book. It takes a window of the sequence with a specified length and returns the maximum element in that window, similar to how MaxPooling2D returned the maximum element of a small two-dimensional window.

Take note that max pooling always returns the maximum element for each channel separately. Flatten transforms the two-dimensional sequence tensor into a one-dimensional flat tensor. To use Flatten in combination with Dense, we need to specify the sequence length in the input shape; here, we set it with the max_len variable. We do this because Dense expects a fixed input size, and the size of the tensor that Flatten returns depends on the size of its input.
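If you want to see how the sequence shrinks layer by layer, you can print the model summary. The shapes below are what you should expect for this architecture, with the batch dimension shown as None:

model.summary()
# Expected output shapes, layer by layer:
# Conv1D     -> (None, 96, 16)   # 100 - 5 + 1 = 96 steps, 16 channels
# MaxPool1D  -> (None, 19, 16)   # 96 pooled in windows of 5
# Conv1D     -> (None, 15, 16)   # 19 - 5 + 1 = 15
# MaxPool1D  -> (None, 3, 16)    # 15 pooled in windows of 5
# Flatten    -> (None, 48)       # 3 * 16 = 48
# Dense      -> (None, 1)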

An alternative to using Flatten is GlobalMaxPool1D, which returns the maximum element over the entire sequence. Since its output has a fixed size, one value per channel, regardless of how long the sequence is, you can use a Dense layer afterward without fixing the input length.
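As a minimal sketch of this alternative, reusing the layer sizes from above, the same architecture could be written with GlobalMaxPool1D and a variable-length input; passing None as the sequence length is how Keras expresses an unspecified input length:

from keras.models import Sequential
from keras.layers import Conv1D, Activation, MaxPool1D, GlobalMaxPool1D, Dense

model = Sequential()

# None as the sequence length allows inputs of varying length
model.add(Conv1D(16, 5, input_shape=(None, n_features)))
model.add(Activation('relu'))
model.add(MaxPool1D(5))

model.add(Conv1D(16, 5))
model.add(Activation('relu'))

# One maximum per channel, so the output size no longer depends on the sequence length
model.add(GlobalMaxPool1D())
model.add(Dense(1))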

Our model compiles just as you would expect:

model.compile(optimizer='adam', loss='mean_absolute_percentage_error')

We then train it on the generator that we wrote earlier. To obtain separate train and validation sets, we must first split the overall dataset and then create two generators, one for each split. To do this, run the following code:

from sklearn.model_selection import train_test_split

batch_size = 128

# Hold out 10% of the data for validation
train_df, val_df = train_test_split(train, test_size=0.1)

# One generator per split
train_gen = generate_batches(train_df, batch_size=batch_size)
val_gen = generate_batches(val_df, batch_size=batch_size)

n_train_samples = train_df.shape[0]
n_val_samples = val_df.shape[0]

Finally, we can train our model on a generator, just like we did in computer vision:

model.fit_generator(train_gen,
                    epochs=20,
                    steps_per_epoch=n_train_samples // batch_size,
                    validation_data=val_gen,
                    validation_steps=n_val_samples // batch_size)

Your validation loss will still be quite high, around 12,798,928. The absolute loss value on its own is never a good guide to how well your model is doing; you'll find that it's better to use other metrics to see whether your forecasts are actually useful. However, note that we will reduce the loss significantly later in this chapter.
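As one example of such a check, a quick sketch could compare the model's predictions against the targets of a single validation batch using the mean absolute error, which is often easier to interpret than a percentage error. This assumes, as before, that the generator yields (features, targets) tuples:

import numpy as np

# Pull one batch from the validation generator
X_val, y_val = next(val_gen)
preds = model.predict(X_val)

# Mean absolute error on this batch
mae = np.mean(np.abs(np.ravel(preds) - np.ravel(y_val)))
print(mae)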
