Summary

The attention mechanism is one of the most active topics in deep learning today and sits at the center of many cutting-edge algorithms, both in current research and in probable future applications. Problems such as image captioning and visual question answering have been solved effectively with this approach. In fact, attention is not limited to visual tasks; it was conceived earlier for problems such as neural machine translation and other sophisticated NLP problems. Understanding the attention mechanism is therefore vital to mastering many advanced deep learning techniques.

CNNs are used not only for vision; combined with attention, they also power solutions to complex NLP problems, such as modeling sentence pairs and machine translation. This chapter covered the attention mechanism and its application to several NLP problems, along with image captioning and recurrent models of visual attention. In the Recurrent Attention Model (RAM), we did not use a CNN; instead, we applied an RNN with attention to reduced-size representations of an image produced by the Glimpse Sensor. There is also recent work applying attention to CNN-based visual models.
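The Glimpse Sensor mentioned above can be illustrated with a minimal NumPy sketch: it extracts patches of increasing size around a fixation point and downsamples each to a common resolution, so the model sees the center sharply and the periphery coarsely. The function name, parameters, and block-averaging downsampler here are illustrative assumptions, not the chapter's exact implementation.

```python
import numpy as np

def glimpse_sensor(image, center, patch_size=8, num_scales=3):
    """Illustrative sketch of a Glimpse Sensor: extract num_scales
    patches centered at `center` (row, col) from a 2D grayscale image,
    each downsampled to patch_size x patch_size. Larger scales cover a
    wider field of view at lower resolution."""
    patches = []
    for s in range(num_scales):
        size = patch_size * (2 ** s)   # the patch doubles in width at each scale
        half = size // 2
        r, c = center
        # pad the image so patches near the border stay in bounds
        padded = np.pad(image, half, mode="constant")
        # in padded coordinates the center is (r + half, c + half)
        patch = padded[r:r + size, c:c + size]
        # downsample by block-averaging back to patch_size x patch_size
        k = size // patch_size
        patch = patch.reshape(patch_size, k, patch_size, k).mean(axis=(1, 3))
        patches.append(patch)
    return np.stack(patches)  # shape: (num_scales, patch_size, patch_size)
```

In the RAM architecture, the stacked patches returned here would be flattened, combined with the glimpse location, and fed to the RNN core that decides where to look next.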

Readers are highly encouraged to go through the original papers in the references and to explore advanced concepts in attention, such as multi-level attention, stacked attention models, and the use of RL models (for example, the Asynchronous Advantage Actor-Critic (A3C) model for the hard attention control problem).
