One-dimensional ConvNets for sentiment analysis are discussed in detail in the paper Convolutional Neural Networks for Sentence Classification, Yoon Kim, EMNLP 2014 (https://arxiv.org/abs/1408.5882). Note that the model proposed in the paper retains some information about word position, thanks to the filter windows operating over consecutive words. The following image, extracted from the paper, graphically represents the key intuitions behind the network. At the beginning, each word of the text is represented as a vector via standard embeddings, so the sentence becomes a matrix in a compact, low-dimensional dense space. This matrix is then processed with multiple standard one-dimensional convolutional layers.
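The core computation can be sketched with plain numpy. This is a minimal, hypothetical toy version of the idea, not the paper's exact architecture or hyperparameters: each filter slides over windows of consecutive word embeddings, and a max-over-time pooling step keeps the strongest activation per filter.

```python
import numpy as np

# Hypothetical toy setup: a sentence of 7 words, each embedded in 5 dimensions.
rng = np.random.default_rng(0)
seq_len, embed_dim = 7, 5
sentence = rng.standard_normal((seq_len, embed_dim))

def conv1d_max_over_time(x, filters, window):
    """One bank of Kim-style 1-D convolution + max-over-time pooling.

    x:       (seq_len, embed_dim) embedded sentence
    filters: (n_filters, window * embed_dim) flattened filter weights
    window:  number of consecutive words each filter sees
    """
    n_positions = x.shape[0] - window + 1
    # Slide the window over consecutive words, preserving their order.
    feature_map = np.stack(
        [filters @ x[i:i + window].ravel() for i in range(n_positions)],
        axis=1,
    )                                          # (n_filters, n_positions)
    feature_map = np.maximum(feature_map, 0.0) # ReLU nonlinearity
    return feature_map.max(axis=1)             # max over time -> (n_filters,)

# Three window sizes (3, 4, 5 words), each with its own small filter bank,
# mirroring the paper's use of multiple filter widths.
features = np.concatenate([
    conv1d_max_over_time(sentence, rng.standard_normal((2, h * embed_dim)), h)
    for h in (3, 4, 5)
])
print(features.shape)  # (6,) - this vector feeds the final softmax classifier
```

Note how the window `x[i:i + window]` spans consecutive words: this is the mechanism through which the model retains partial positional information.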
A couple of observations that help in understanding the model:
- Filters typically convolve over a continuous space. For images, this space is the pixel matrix, which is spatially continuous over height and width. For text, the continuous space is simply the sequential dimension naturally induced by consecutive words. If words are represented only with one-hot encoding, this space is sparse; if embeddings are used, the resulting space is dense, because similar words are mapped close together.
- Images typically have three channels (RGB), while text naturally has only one channel, since there is no color to represent.
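The two observations above can be made concrete with a small sketch. The vocabulary, sentence, and embedding dimension below are hypothetical, chosen only to contrast the sparse one-hot representation with the dense embedding lookup, and to show the single-channel input shape a 1D convolution receives.

```python
import numpy as np

# Hypothetical 4-word vocabulary and a 3-word sentence.
vocab = ["good", "bad", "movie", "great"]
sentence = ["great", "movie", "good"]
ids = np.array([vocab.index(w) for w in sentence])

# One-hot: sparse and high-dimensional; no notion of word similarity.
one_hot = np.eye(len(vocab))[ids]      # shape (3, 4), mostly zeros
print(one_hot.shape, one_hot.sum())    # (3, 4) 3.0 - one nonzero per word

# Embedding lookup: dense and low-dimensional; similar words can end up
# close together once the table is learned.
embed_dim = 2
embedding_table = np.random.default_rng(0).standard_normal((len(vocab), embed_dim))
dense = embedding_table[ids]           # shape (3, 2), every entry meaningful

# A 1-D convolution sees input of shape (seq_len, embed_dim) with a single
# channel, unlike an RGB image of shape (height, width, 3).
print(dense.shape)                     # (3, 2)
```

The embedding dimensions here play a role analogous to the feature axis of an image, while the sequence axis is the one the filters slide along.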