Thought vectors

At the heart of the text encoding and decoding process is the generation of a thought vector. The thought vector, a term popularized by the godfather of deep learning himself, Dr. Geoffrey Hinton, is a vector that captures the context of one element in relation to many other elements.

For instance, the word "hello" could have a high relational context to many words or phrases, such as "hi," "how are you?," "hey," "goodbye," and so on. Likewise, words such as "red," "blue," "fire," and "old" would have a low relational context when associated with the word "hello," at least in regular day-to-day speech. The word or character contexts are based on the pairings we have in the machine translation file. In this example, we are using the French translation pairings, but the pairings could be anything.
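As a rough sketch of what those pairings look like in code, the following snippet loads tab-separated English/French lines; the file name fra.txt and the exact format are assumptions for illustration rather than the book's actual data pipeline:

def load_pairs(path, max_pairs=10000):
    pairs = []
    with open(path, encoding="utf-8") as f:
        for i, line in enumerate(f):
            if i >= max_pairs:
                break
            parts = line.rstrip("\n").split("\t")
            if len(parts) >= 2:
                pairs.append((parts[0], parts[1]))  # (English phrase, French translation)
    return pairs

pairs = load_pairs("fra.txt")   # hypothetical file of tab-separated pairings
print(pairs[:3])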

This process takes place in the first model, the encoder, which compresses the input sequence into the thought vector or, in this case, a vector of probabilities. The LSTM layer calculates the probability or context of how the words/characters are related. You will often come across the following equation, which describes this transformation:

p(y_1, \ldots, y_{T'} \mid x_1, \ldots, x_T) = \prod_{t=1}^{T'} p(y_t \mid v, y_1, \ldots, y_{t-1})

Consider the following:

  • y_1, \ldots, y_{T'} = output sequence
  • x_1, \ldots, x_T = input sequence
  • v = vector representation of the input sequence (the thought vector)
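
To make the encoder-to-thought-vector step concrete, the following is a minimal Keras sketch, not the book's exact model: the LSTM's final hidden and cell states are kept as the thought vector, and the vocabulary size and latent dimension are illustrative assumptions.

from tensorflow.keras.layers import Input, LSTM
from tensorflow.keras.models import Model

num_encoder_tokens = 71   # assumed size of the input character vocabulary
latent_dim = 256          # assumed size of the thought vector

encoder_inputs = Input(shape=(None, num_encoder_tokens))
encoder_outputs, state_h, state_c = LSTM(latent_dim, return_state=True)(encoder_inputs)

# The thought vector: the encoder LSTM's final hidden and cell states,
# which are later handed to the decoder as its initial states.
encoder_states = [state_h, state_c]
encoder_model = Model(encoder_inputs, encoder_states)
encoder_model.summary()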

The ∏ is the product form of sigma (Σ) and is used to combine the individual word/character probabilities, each conditioned on the thought vector, into the probability of the whole output sequence. This is a big simplification of the whole process, and the interested reader is encouraged to Google more about sequence-to-sequence learning on their own. For our purposes, the critical thing to remember is that each word/character has a probability or context that relates it to another. Generating this thought vector can be time-consuming and memory-intensive, as you may have already noticed. Therefore, in the next section, we will look at a more comprehensive set of natural language tools in order to create a neural conversational bot.
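
As a quick numeric sketch of that product, using made-up per-step probabilities purely for illustration:

import numpy as np

step_probs = [0.9, 0.7, 0.8, 0.95]     # p(y_t | v, y_1..y_{t-1}) for four steps (made-up values)
sequence_prob = np.prod(step_probs)    # the product from the equation above
log_prob = np.sum(np.log(step_probs))  # log form, which avoids numerical underflow

print(sequence_prob)                   # ~0.479
print(np.exp(log_prob))                # the same value, computed more stably

In practice, frameworks work with log probabilities for exactly this stability reason.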
