Creating the dot model

Our first model will consist of two embedding layers, one for the movie index and one for the user index, as well as their dot product. We will use the keras.layers package, which contains the necessary layer implementations, as well as the Model implementation from the keras.models package. The layers that we will utilize are as follows:

  • The Input layer, which is responsible for creating Keras tensors from more conventional Python data types
  • The Embedding layer, which is the implementation of embedding layers
  • The Flatten layer, which flattens each sample of an n-dimensional Keras tensor into a one-dimensional vector (the batch dimension is kept)
  • The Dot layer, which implements the dot product

Furthermore, we will utilize train_test_split and metrics from sklearn:

from keras.layers import Input, Embedding, Flatten, Dot, Dense, Concatenate
from keras.models import Model
from sklearn.model_selection import train_test_split
from sklearn import metrics

import numpy as np
import pandas as pd

Apart from setting the random seed of numpy (not shown in the listing), we define a function that loads and preprocesses our data. We read the data from the .csv file and drop the timestamp column. We then re-map the user and movie IDs to consecutive integers starting from zero, so that they can be used directly as indices into the Embedding layers. Next, we shuffle the rows by utilizing the sample method of pandas with frac=1.0. Finally, we create a train/test split of 80%/20%:

def get_data():
    # Read the data and drop timestamp
    data = pd.read_csv('ratings.csv')
    data.drop('timestamp', axis=1, inplace=True)

    # Re-map the indices
    users = data.userId.unique()
    movies = data.movieId.unique()
    # Create maps from old to new indices
    moviemap = {movie: i for i, movie in enumerate(movies)}
    usermap = {user: i for i, user in enumerate(users)}

    # Change the indices
    data.movieId = data.movieId.apply(lambda x: moviemap[x])
    data.userId = data.userId.apply(lambda x: usermap[x])

    # Shuffle the data
    data = data.sample(frac=1.0).reset_index(drop=True)

    # Create a train/test split
    train, test = train_test_split(data, test_size=0.2)

    n_users = len(users)
    n_movies = len(movies)

    return train, test, n_users, n_movies


train, test, n_users, n_movies = get_data()
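
The re-mapping step matters because an Embedding layer created with n_movies rows only accepts indices from 0 to n_movies - 1, while the raw IDs in the file need not be consecutive. The following is a minimal sketch of the idea in plain Python, using made-up IDs. The remap_ids helper is hypothetical and is not part of the listing above; it assigns new indices in order of first appearance, which is also the order in which the unique method of pandas returns values:

```python
def remap_ids(ids):
    """Map arbitrary IDs to consecutive integers, in order of first appearance."""
    mapping = {}
    for old in ids:
        if old not in mapping:
            # The next free index equals the number of IDs seen so far
            mapping[old] = len(mapping)
    return [mapping[old] for old in ids], mapping


# Made-up movie IDs with gaps and repeats (illustrative only)
raw_movie_ids = [318, 2571, 318, 50, 2571]
new_ids, movie_map = remap_ids(raw_movie_ids)

print(new_ids)
print(movie_map)
```

After re-mapping, the number of distinct IDs equals the size of the map, which is the value we pass to the Embedding layer as its number of possible indices.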

In order to create the network, we first define the movie part of the input. We create an Input layer, which will act as the interface to our pandas dataset by accepting its data and transforming it into Keras tensors. Following this, the layer's output is fed into the Embedding layer, in order to map the integer to a five-dimensional space. We define the number of possible indices as n_movies (first parameter), and the number of features as fts (second parameter). Finally, we flatten the output. The same process is repeated for the user part:

fts = 5

# Movie part. Input accepts the index as input
# and passes it to the Embedding layer. Finally,
# Flatten transforms Embedding's output to a
# one-dimensional tensor.
movie_in = Input(shape=[1], name="Movie")
mov_embed = Embedding(n_movies, fts, name="Movie_Embed")(movie_in)
flat_movie = Flatten(name="FlattenM")(mov_embed)

# Repeat for the user.
user_in = Input(shape=[1], name="User")
user_embed = Embedding(n_users, fts, name="User_Embed")(user_in)
flat_user = Flatten(name="FlattenU")(user_embed)
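
To make the shapes concrete, the Input, Embedding, and Flatten steps can be imitated with plain numpy. This is only an illustrative sketch with toy sizes and random weights, not what Keras executes internally: the embedding is treated as a lookup table with one row of fts weights per index, and embedding_table and movie_idx are names introduced here for the example:

```python
import numpy as np

rng = np.random.default_rng(0)
n_movies, fts = 4, 5  # toy sizes, not the dataset's

# One row of `fts` weights per movie index (random here, learned in Keras)
embedding_table = rng.normal(size=(n_movies, fts))

# A batch of three movie indices, shaped (batch, 1) like Input(shape=[1])
movie_idx = np.array([[2], [0], [2]])

# Embedding: look up the row for each index -> shape (batch, 1, fts)
embedded = embedding_table[movie_idx]

# Flatten: collapse everything except the batch dimension -> (batch, fts)
flat = embedded.reshape(embedded.shape[0], -1)

print(embedded.shape, flat.shape)
```

Note that the same index always yields the same vector, so the first and third rows of flat are identical in this example.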

Next, we define the dot product layer, with the two flattened embeddings as inputs. We then define Model by specifying the user_in and movie_in (Input) layers as inputs, and the prod (Dot) layer as the output. After defining the model, Keras needs to compile it in order to create the computational graph. During compilation, we specify the optimizer (Adam) and the loss function (mean squared error):

# Calculate the dot-product of the two embeddings
prod = Dot(name="Mult", axes=1)([flat_movie, flat_user])

# Create and compile the model
model = Model([user_in, movie_in], prod)
model.compile('adam', 'mean_squared_error')
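
With axes=1, the dot product is taken row by row, so each (movie, user) pair in a batch yields a single predicted rating. The following numpy sketch reproduces that computation and the mean squared error loss on hand-picked toy values. The arrays are assumptions made up for the example and only mirror the names used above:

```python
import numpy as np

# Two (movie, user) pairs with three features each (toy values)
flat_movie = np.array([[1.0, 0.0, 2.0],
                       [0.5, 0.5, 0.5]])
flat_user = np.array([[2.0, 3.0, 1.0],
                      [2.0, 4.0, 6.0]])

# Row-wise dot product, output shape (batch, 1)
# Row 0: 1*2 + 0*3 + 2*1 = 4; row 1: 0.5*2 + 0.5*4 + 0.5*6 = 6
pred = np.sum(flat_movie * flat_user, axis=1, keepdims=True)

# Mean squared error against made-up true ratings
ratings = np.array([[5.0], [4.0]])
mse = np.mean((ratings - pred) ** 2)  # ((5-4)^2 + (4-6)^2) / 2

print(pred.ravel(), mse)
```

Since the prediction is nothing more than this sum of products, the only way the network can lower the loss is by adjusting the embedding vectors themselves.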

By calling model.summary(), we can see that the model has around 52,000 trainable parameters. All of these parameters are in the Embedding layers. This means that the network will only learn how to map the user and movie indices to the five-dimensional space. The function's output is as follows:

The model's summary
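
The parameter count follows directly from the sizes of the two embedding tables, as neither Flatten nor Dot has any weights of its own. A small sketch of the arithmetic, using toy sizes and a hypothetical helper named embedding_params:

```python
def embedding_params(n_users, n_movies, fts):
    """Trainable weights in the two embedding tables of the dot model."""
    # Each user and each movie owns one vector of `fts` weights
    return (n_users + n_movies) * fts


# Toy sizes, not the dataset's: (3 + 4) * 5 = 35
print(embedding_params(n_users=3, n_movies=4, fts=5))
```

With fts = 5, a total of around 52,000 parameters implies that the number of distinct users and movies adds up to roughly 10,400.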

Finally, we fit the model to our train set and evaluate it on the test set. The inputs are passed in the same order as in the Model definition, with users first and movies second. We train the network for ten epochs in order to observe how it behaves, as well as how long it takes to train. The following code trains the network and prints its MSE on the test set:

# Train the model on the train set
model.fit([train.userId, train.movieId], train.rating, epochs=10, verbose=1)

# Evaluate on the test set
print(metrics.mean_squared_error(test.rating,
      model.predict([test.userId, test.movieId])))

Take a look at the following screenshot:

Training progress of the dot product network

The model is able to achieve an MSE of 1.28 on the test set. In order to improve the model, we could increase the number of features each Embedding layer is able to learn, but the main limitation is the dot product layer. Instead of increasing the number of features, we will give the model the freedom to choose how to combine the two layers.
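
To get a feel for what this error means, the MSE can be converted to a root mean squared error (RMSE), which is expressed in the same units as the ratings. The sketch below simply takes the square root of the figure reported above:

```python
import math

test_mse = 1.28  # test-set MSE reported above
rmse = math.sqrt(test_mse)

print(round(rmse, 2))
```

An MSE of 1.28 therefore corresponds to an RMSE of about 1.13 rating points.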
