Text Generation Using Recurrent Neural Networks

In this chapter, we will describe some of the most exciting techniques in modern (at the time of writing, late 2017) machine learning: recurrent neural networks. They are, however, not new; they have been around since the 1980s, but they have become popular again thanks to the numerous records they have set on language-related tasks in recent years.

Why do we need a different type of architecture for text? Consider the following example:

"I live in Prague since 2015"

and 

"Since 2015 I live in Prague"

If we wanted to teach a traditional feed-forward network, such as a perceptron or a multi-layer perceptron, to identify the date on which I moved to Prague, the network would have to learn a separate set of parameters for each input position. Since the date can appear anywhere in the sentence, the network would effectively have to relearn the rules of grammar at every position just to answer this simple question. This is undesirable in many applications. Similar issues motivated machine learning researchers and statisticians in the 1980s to introduce the idea of sharing parameters across different parts of a model. This idea is the secret sauce of recurrent neural networks, our next deep learning architecture.
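To make parameter sharing concrete, here is a minimal sketch of the forward pass of a vanilla recurrent cell in plain NumPy. The names and dimensions (W_xh, W_hh, a sequence of 6 tokens, 16 hidden units) are illustrative choices for this sketch, not a prescribed implementation; the point is that the same weights are applied at every position of the sequence:

import numpy as np

def rnn_forward(inputs, W_xh, W_hh, b_h):
    # inputs has shape (seq_len, input_dim); we return the hidden
    # state after each step. Crucially, the SAME W_xh, W_hh, and b_h
    # are reused at every time step; this is the parameter sharing
    # that a feed-forward network lacks.
    h = np.zeros(W_hh.shape[0])
    states = []
    for x_t in inputs:
        h = np.tanh(W_xh @ x_t + W_hh @ h + b_h)
        states.append(h)
    return np.stack(states)

# Toy usage: a sequence of 6 tokens, each encoded as a
# 10-dimensional vector, processed by a cell with 16 hidden units.
rng = np.random.default_rng(0)
seq = rng.normal(size=(6, 10))
W_xh = 0.1 * rng.normal(size=(16, 10))
W_hh = 0.1 * rng.normal(size=(16, 16))
b_h = np.zeros(16)
print(rnn_forward(seq, W_xh, W_hh, b_h).shape)  # (6, 16)

Because the weights do not depend on the position, whatever the cell learns about a token such as "2015" transfers to every other position in the sentence.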

By design, recurrent neural networks are well-suited for processing sequential data. In general, machine learning applied to sequential data can be roughly divided into four main areas:

  • Sequence prediction: Given x_1, x_2, ..., x_t, predict the next element of the sequence, x_{t+1}
  • Sequence classification: Given x_1, x_2, ..., x_t, predict a category or label y for the whole sequence
  • Sequence generation: Given x_1, x_2, ..., x_t, generate a new, plausible element of the sequence, x_{t+1} (iterating this step generates entire new sequences)
  • Sequence-to-sequence prediction: Given an input sequence x_1, x_2, ..., x_t, generate an equivalent output sequence y_1, y_2, ..., y_m

Applications of sequence prediction include weather forecasting and stock market prediction. For classification, we can think, for example, of sentiment analysis and document classification. Automatic image captioning and text generation belong to the sequence generation family of problems, whereas machine translation is perhaps the most familiar example of sequence-to-sequence prediction in our everyday lives.

Our focus in this chapter is on applications of recurrent neural networks to text generation. Since, as we saw above, text generation is part of a much larger set of problems, many of our algorithms are portable to other contexts.
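As a preview of the kind of model we will build, the following is a minimal character-level generation sketch in Keras (one possible toolkit; the chapter's own code may differ). The toy corpus, window size, and hyperparameters are placeholders chosen so the example runs in seconds, not the datasets or architectures used later. The network reads a window of characters and learns to predict the next one; generation then proceeds by sampling a character and sliding the window:

import numpy as np
import tensorflow as tf

# Toy corpus; in practice you would load a real text file.
corpus = "I have lived in Prague since 2015. " * 40
chars = sorted(set(corpus))
char_to_ix = {c: i for i, c in enumerate(chars)}

seq_len = 20
# Training pairs: a window of seq_len characters and the
# character that follows it.
X = np.array([[char_to_ix[c] for c in corpus[i:i + seq_len]]
              for i in range(len(corpus) - seq_len)])
y = np.array([char_to_ix[corpus[i + seq_len]]
              for i in range(len(corpus) - seq_len)])

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(len(chars), 16),
    tf.keras.layers.SimpleRNN(64),
    tf.keras.layers.Dense(len(chars), activation="softmax"),
])
model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")
model.fit(X, y, epochs=5, verbose=0)

# Generate 40 characters by sampling from the predicted
# distribution and sliding the input window one step at a time.
window = [char_to_ix[c] for c in corpus[:seq_len]]
generated = []
for _ in range(40):
    probs = model.predict(np.array([window]), verbose=0)[0]
    probs = probs.astype("float64")
    probs /= probs.sum()  # guard against floating-point drift
    next_ix = np.random.choice(len(chars), p=probs)
    generated.append(chars[next_ix])
    window = window[1:] + [next_ix]
print("".join(generated))

Sampling from the predicted distribution, rather than always taking the most likely character, keeps the generated text from collapsing into a single repeated phrase; we will return to this trade-off when we build larger models.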

Training deep learning models is often time-consuming, and recurrent neural networks are no exception. Our focus is on ideas over data, which we will illustrate with smaller datasets than those you might encounter later in the wild. This is for the sake of clarity: we want to make it easier for you to get started on any standard laptop. Once you grasp the basics, you can spin up your own cluster with your favorite cloud provider.
