Neural recommendation systems

Instead of explicitly defining similarity metrics, we can utilize deep learning to learn good representations and mappings of the feature space. There are a number of ways to employ neural networks to build recommendation systems. In this chapter, we will present two of the simplest, in order to demonstrate how ensemble learning can be incorporated into the system. The most important building block that we will utilize in our networks is the embedding layer. This layer type accepts an integer index as input and maps it to an n-dimensional space. For example, a two-dimensional embedding could map the index 1 to the vector [0.5, 0.5]. Using these layers, we will be able to feed the user's index and the movie's index to our network, and the network will predict the rating for that specific user-movie combination.
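As a minimal illustration of what an embedding layer does, the following sketch assumes the Keras API; the vocabulary size and output dimension are arbitrary example values:

```python
# Minimal sketch of an embedding layer (Keras assumed; sizes are illustrative).
import numpy as np
from tensorflow.keras.layers import Embedding, Input
from tensorflow.keras.models import Model

n_users = 1000       # hypothetical number of distinct user indices
embedding_dim = 2    # each index is mapped to a 2-dimensional vector

user_in = Input(shape=(1,))                        # a single integer index
user_vec = Embedding(n_users, embedding_dim)(user_in)

model = Model(user_in, user_vec)
# Index 1 is mapped to a (randomly initialized) 2-dimensional vector,
# for example something like [0.5, 0.5] after training
print(model.predict(np.array([[1]])))
```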

The first architecture that we will test consists of two embedding layers whose outputs are combined with a dot product in order to predict the user's rating of the movie. The architecture is depicted in the following diagram and sketched in code after it. Although it is not a traditional neural network, we will utilize backpropagation in order to train the parameters of the two embedding layers:

Simple dot product architecture
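A hedged Keras-style sketch of this architecture follows; the user count, movie count, and embedding dimension are placeholder values, not figures from the dataset:

```python
# Sketch of the dot-product architecture (names and sizes are illustrative).
from tensorflow.keras.layers import Input, Embedding, Flatten, Dot
from tensorflow.keras.models import Model

n_users, n_movies = 1000, 1700   # hypothetical dataset sizes
emb_dim = 5                      # embedding dimension

user_in = Input(shape=(1,))
movie_in = Input(shape=(1,))

# One embedding per entity type; Flatten turns the (1, emb_dim) output into a vector
user_vec = Flatten()(Embedding(n_users, emb_dim)(user_in))
movie_vec = Flatten()(Embedding(n_movies, emb_dim)(movie_in))

# The predicted rating is the dot product of the two embedding vectors
rating = Dot(axes=1)([user_vec, movie_vec])

dot_model = Model([user_in, movie_in], rating)
```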

The second architecture is a more traditional neural network. Instead of relying on a predefined operation to combine the outputs of the embedding layers (the dot product), we will allow the network to find the optimal way to combine them by feeding the embeddings' outputs to a series of fully connected (dense) layers. The architecture is depicted in the following diagram and sketched in code after it:

The fully connected architecture
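A possible sketch of this variant, again assuming Keras, concatenates the two embeddings and passes them through dense layers; the layer sizes are illustrative assumptions:

```python
# Sketch of the fully connected variant: the embeddings are concatenated and
# passed through dense layers instead of a dot product (layer sizes assumed).
from tensorflow.keras.layers import Input, Embedding, Flatten, Concatenate, Dense
from tensorflow.keras.models import Model

n_users, n_movies, emb_dim = 1000, 1700, 5

user_in = Input(shape=(1,))
movie_in = Input(shape=(1,))

user_vec = Flatten()(Embedding(n_users, emb_dim)(user_in))
movie_vec = Flatten()(Embedding(n_movies, emb_dim)(movie_in))

# Let the network learn how to combine the two embeddings
x = Concatenate()([user_vec, movie_vec])
x = Dense(128, activation='relu')(x)
x = Dense(32, activation='relu')(x)
rating = Dense(1)(x)

dense_model = Model([user_in, movie_in], rating)
```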

In order to train the networks, we will utilize the Adam optimizer, with the mean squared error (MSE) as the loss function. Our goal will be to predict the ratings of movies for any given user as accurately as possible. As each embedding layer has a predetermined output dimension, we will train a number of networks with different embedding dimensions in order to create a stacking ensemble. Each individual network will be a separate base learner, and a relatively simple machine learning algorithm will be utilized to combine the individual predictions, as sketched below.
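The following sketch outlines how such a stacking ensemble might be assembled, assuming Keras and scikit-learn; build_model, the training arrays (train_users, train_movies, train_ratings), and the hold-out arrays (val_users, val_movies, val_ratings) are hypothetical placeholders standing in for a real ratings dataset:

```python
# Hedged sketch of training base learners with different embedding dimensions
# and stacking their predictions with a simple meta-learner.
import numpy as np
from tensorflow.keras.layers import Input, Embedding, Flatten, Concatenate, Dense
from tensorflow.keras.models import Model
from sklearn.linear_model import LinearRegression

def build_model(n_users, n_movies, emb_dim):
    # Same fully connected architecture as above, parameterized by embedding size
    user_in, movie_in = Input(shape=(1,)), Input(shape=(1,))
    user_vec = Flatten()(Embedding(n_users, emb_dim)(user_in))
    movie_vec = Flatten()(Embedding(n_movies, emb_dim)(movie_in))
    x = Dense(32, activation='relu')(Concatenate()([user_vec, movie_vec]))
    model = Model([user_in, movie_in], Dense(1)(x))
    model.compile(optimizer='adam', loss='mean_squared_error')
    return model

# One base learner per embedding dimension (dimensions chosen arbitrarily here)
base_learners = [build_model(n_users=1000, n_movies=1700, emb_dim=d) for d in (5, 10, 15)]

for model in base_learners:
    # train_users, train_movies, train_ratings are placeholders for the real data
    model.fit([train_users, train_movies], train_ratings, epochs=10, verbose=0)

# Stack the base learners' predictions on held-out data and fit a simple meta-learner
meta_features = np.hstack([m.predict([val_users, val_movies]) for m in base_learners])
meta_learner = LinearRegression().fit(meta_features, val_ratings)
```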
