Convolutional and max-pooling layers are at the heart of the LeNet family of models: multilayer feed-forward networks specialized for visual pattern recognition.
While the exact details vary greatly from one model to another, the following figure shows the general schema of a LeNet network:
In a LeNet model, the lower layers alternate between convolution and max-pooling, while the last layers are fully connected and correspond to a traditional feed-forward network (fully connected layers followed by a softmax layer).
The input to the first fully-connected layer is the set of all feature maps at the layer below.
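To make this concrete, the following minimal sketch traces tensor shapes through two convolution and max-pooling stages and computes how many values reach the first fully-connected layer. The layer sizes (a 28x28 grayscale input, 5x5 kernels, 20 and 50 feature maps, 2x2 pooling, no padding) are assumptions chosen for illustration, and the helper names `conv_out`, `pool_out`, and `trace_shapes` are hypothetical; none of them come from the text above.

```python
# Shape arithmetic only; no deep learning library is needed.
# All layer sizes below are assumed values for illustration.

def conv_out(size, kernel, stride=1):
    """Spatial size after a convolution with no padding."""
    return (size - kernel) // stride + 1


def pool_out(size, window=2):
    """Spatial size after non-overlapping max-pooling."""
    return size // window


def trace_shapes(height, width, channels, conv_layers):
    """Return (name, (H, W, C)) for the input and every conv/pool stage.

    conv_layers is a list of (kernel_size, number_of_feature_maps) pairs.
    """
    shapes = [("input", (height, width, channels))]
    for i, (kernel, n_maps) in enumerate(conv_layers, start=1):
        height, width = conv_out(height, kernel), conv_out(width, kernel)
        channels = n_maps
        shapes.append((f"conv{i}", (height, width, channels)))
        height, width = pool_out(height), pool_out(width)
        shapes.append((f"pool{i}", (height, width, channels)))
    return shapes


# Assumed example: 28x28 grayscale image, two stages with 5x5 kernels.
shapes = trace_shapes(28, 28, 1, [(5, 20), (5, 50)])
for name, shape in shapes:
    print(name, shape)

# Every value of every feature map in the last pooling layer
# becomes one input of the first fully-connected layer.
h, w, c = shapes[-1][1]
flat_features = h * w * c
print("inputs to first fully-connected layer:", flat_features)
```

With these assumed sizes, the last pooling layer holds 50 feature maps of 4x4 values each, so the first fully-connected layer receives 800 inputs per example.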
From a TensorFlow implementation point of view, this means the lower layers operate on 4D tensors, typically laid out as (batch, height, width, channels). Their output is then flattened to a 2D matrix of shape (batch, features) to be compatible with a feed-forward implementation.
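The flattening step can be sketched as follows. NumPy stands in for TensorFlow here so that the example stays small, and the shapes (a batch of 8 examples, 50 feature maps of 4x4, 500 hidden units) are assumed values, not taken from the text.

```python
import numpy as np

# Assumed shapes for illustration: (batch, height, width, channels).
batch_size, height, width, n_maps = 8, 4, 4, 50
n_values = batch_size * height * width * n_maps
feature_maps = np.arange(n_values, dtype=np.float32)
feature_maps = feature_maps.reshape(batch_size, height, width, n_maps)

# Keep the batch axis and collapse the other three axes into one,
# so that each row holds all feature-map values of one example.
flat = feature_maps.reshape(batch_size, -1)

# A fully-connected layer is then a plain matrix product.
# The weights are zeros here only as a placeholder.
n_hidden = 500
weights = np.zeros((flat.shape[1], n_hidden), dtype=np.float32)
hidden = flat @ weights

print(feature_maps.shape)  # (8, 4, 4, 50)
print(flat.shape)          # (8, 800)
print(hidden.shape)        # (8, 500)
```

In TensorFlow, the same reshape is commonly written as `tf.reshape(x, [batch_size, -1])`, where `-1` asks the library to infer the size of the flattened axis.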