Chapter 1. Packets of thought (NLP overview)
1.1. Natural language vs. programming language
1.4. Language through a computer’s “eyes”
1.5. A brief overflight of hyperspace
Chapter 2. Build your vocabulary (word tokenization)
2.1. Challenges (a preview of stemming)
2.2. Building your vocabulary with a tokenizer
2.2.2. Measuring bag-of-words overlap
Chapter 3. Math with words (TF-IDF vectors)
Chapter 4. Finding meaning in word counts (semantic analysis)
4.1. From word counts to topic scores
4.1.1. TF-IDF vectors and lemmatization
4.3. Singular value decomposition
4.3.1. U—left singular vectors
4.4. Principal component analysis
4.4.2. Stop horsing around and get back to NLP
4.4.3. Using PCA for SMS message semantic analysis
4.4.4. Using truncated SVD for SMS message semantic analysis
4.5. Latent Dirichlet allocation (LDiA)
4.5.2. LDiA topic model for SMS messages
Part 2. Deeper learning (neural networks)
Chapter 5. Baby steps with neural networks (perceptrons and backpropagation)
5.1. Neural networks, the ingredient list
5.1.4. Let’s go skiing—the error surface
5.1.5. Off the chair lift, onto the slope
5.1.6. Let’s shake things up a bit
Chapter 6. Reasoning with word vectors (Word2vec)
6.1. Semantic queries and analogies
6.2.1. Vector-oriented reasoning
6.2.2. How to compute Word2vec representations
6.2.3. How to use the gensim.word2vec module
6.2.4. How to generate your own word vector representations
6.2.5. Word2vec vs. GloVe (Global Vectors)
Chapter 7. Getting words in order with convolutional neural networks (CNNs)
7.3. Convolutional neural nets
7.4.1. Implementation in Keras: prepping the data
7.4.2. Convolutional neural network architecture
7.4.5. The cherry on the sundae
7.4.6. Let’s get to learning (training)
Chapter 8. Loopy (recurrent) neural networks (RNNs)
8.1. Remembering with recurrent networks
8.1.1. Backpropagation through time
8.3. Let’s get to learning our past selves
Chapter 9. Improving retention with long short-term memory networks
9.1.1. Backpropagation through time
9.1.2. Where does the rubber hit the road?
9.1.5. Words are hard. Letters are easier.
9.1.7. My turn to speak more clearly
Chapter 10. Sequence-to-sequence models and attention
10.1. Encoder-decoder architecture
10.2. Assembling a sequence-to-sequence pipeline
10.2.1. Preparing your dataset for the sequence-to-sequence training
10.3. Training the sequence-to-sequence network
10.4. Building a chatbot using sequence-to-sequence networks
10.4.1. Preparing the corpus for your training
10.4.2. Building your character dictionary
10.4.3. Generate one-hot encoded training sets
10.4.4. Train your sequence-to-sequence chatbot
Part 3. Getting real (real-world NLP challenges)
Chapter 11. Information extraction (named entity extraction and question answering)
11.1. Named entities and relations
11.3. Information worth extracting
11.4. Extracting relationships (relations)
11.4.1. Part-of-speech (POS) tagging
11.4.2. Entity name normalization
11.4.3. Relation normalization and extraction
Chapter 12. Getting chatty (dialog engines)
12.2. Pattern-matching approach
12.8.1. Ask questions with predictable answers
Chapter 13. Scaling up (optimization, parallelization, and batch processing)
13.1. Too much of a good thing (data)
13.2. Optimizing NLP algorithms
13.2.3. Advanced indexing with Annoy
13.4. Parallelizing your NLP computations
13.5. Reducing the memory footprint during model training
13.6. Gaining model insights with TensorBoard
Appendix B. Playful Python and regular expressions
B.2. Mapping in Python (dict and OrderedDict)
Appendix C. Vectors and matrices (linear algebra fundamentals)
Appendix D. Machine learning tools and techniques
D.1. Data selection and avoiding bias
D.3. Knowing is half the battle
E.1. Steps to create your AWS GPU instance
F.1. High-dimensional vectors are different
F.2. High-dimensional indexing
Applications and project ideas
Open source full-text indexers