—Artificial general intelligence
Machine intelligence capable of solving a variety of problems that human brains can solve
—Artificial intelligence
Machine behavior that is impressive enough to be called intelligent by scientists or corporate marketers
—Artificial Intelligence Markup Language
A pattern matching and templated response specification language in XML that was invented during the building of A.L.I.C.E., one of the first conversational chatbots
—Approximate nearest neighbors
Finding the M closest vectors to a single vector in a set of N high-dimensional vectors is an O(N) problem, because you have to calculate your distance metric between every other vector and the target vector. This makes clustering an intractable O(N^2).
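To make the O(N) cost concrete, here is a minimal sketch of the exact, brute-force search that approximate methods try to avoid. The function name `nearest_neighbors`, the Euclidean metric, and the toy vectors are assumptions chosen for illustration.

```python
import heapq
import math


def nearest_neighbors(target, vectors, m=2):
    """Return the indices of the m vectors closest to target (exact search)."""
    # One distance computation per candidate vector: O(N) work per query
    distances = ((math.dist(target, v), i) for i, v in enumerate(vectors))
    return [i for _, i in heapq.nsmallest(m, distances)]


vectors = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0), (0.1, 0.0, 0.1), (5.0, 5.0, 5.0)]
closest = nearest_neighbors((0.0, 0.1, 0.0), vectors, m=2)
print(closest)
```

Running this query once for each of the N vectors is what produces the O(N^2) cost of clustering.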
—Artificial neural network
—Application programmer interface
A user interface for your customers who are developers, usually a command line tool, source code library, or web interface that they can interact with programmatically
—Amazon Web Services
Amazon invented the concept of cloud services when they exposed their internal infrastructure to the world.
—Bag of words
A data structure (usually a vector) that retains the counts (frequencies) of words but not their order
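A minimal sketch of a bag of words, using Python's `collections.Counter`. The whitespace tokenizer and the helper name `bag_of_words` are simplifying assumptions.

```python
from collections import Counter


def bag_of_words(text):
    # Naive tokenizer (an assumption): lowercase and split on whitespace
    return Counter(text.lower().split())


bow = bag_of_words("the cat chased the other cat")
shuffled = bag_of_words("cat the other chased the cat")
print(bow["cat"], bow == shuffled)
```

The two sentences produce the same bag because only the counts are kept, not the word order.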
—Constant error carousel
A neuron that outputs its input delayed by one time step. Used within an LSTM or GRU memory unit. This is the memory register for an LSTM unit and can only be reset to a new value by the forget gate interrupting this “carousel.”
—Convolutional neural network
A neural network that is trained to learn filters, also known as kernels, for feature extraction in supervised learning
—Compute Unified Device Architecture
A proprietary Nvidia software platform and set of libraries optimized for running general computations/algorithms on a GPU
—Directed acyclic graph
A directed network topology without any cycles (paths that follow the connections and loop back to where they started)
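One way to check whether a directed graph is acyclic is to attempt a topological sort. This sketch assumes Python 3.9 or later for the standard library `graphlib` module. The helper name `is_dag` and the example graphs are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter


def is_dag(dependencies):
    """dependencies maps each node to the set of nodes that point to it."""
    try:
        # A topological order exists only if the graph has no cycles
        list(TopologicalSorter(dependencies).static_order())
        return True
    except CycleError:
        return False


chain = {"b": {"a"}, "c": {"b"}}   # a -> b -> c
loop = {"a": {"b"}, "b": {"a"}}    # a -> b -> a
print(is_dag(chain), is_dag(loop))
```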
—Deterministic finite automaton
A finite state machine that doesn’t make random choices. The re package in Python compiles regular expressions to create a DFA, but the regex package can compile fuzzy regular expressions into an NDFA (nondeterministic FA).
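As an illustration of what "deterministic" means here, the following sketch hand-builds a small DFA that accepts binary strings containing an even number of 1s. The helper `make_dfa` and the state names are hypothetical, and the code does not show how any regular expression library works internally.

```python
def make_dfa(transitions, start, accepting):
    """Build an accept/reject function from a DFA transition table."""
    def accepts(string):
        state = start
        for symbol in string:
            # Exactly one next state per (state, symbol) pair: no choices to make
            state = transitions[(state, symbol)]
        return state in accepting
    return accepts


even_ones = make_dfa(
    transitions={
        ("even", "0"): "even", ("even", "1"): "odd",
        ("odd", "0"): "odd", ("odd", "1"): "even",
    },
    start="even",
    accepting={"even"},
)
print(even_ones("1001"), even_ones("1011"))
```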
—Finite-state machine
Kyle Gorman and Wikipedia can explain this better than I can (https://en.wikipedia.org/wiki/Finite-state_machine).
—Finite-state transducer
Like regular expressions, but they can output a new character to replace each character they matched. Kyle Gorman explains them well (www.openfst.org).
—Geographic information system
A database for storing, manipulating, and displaying geographic information, usually involving latitude, longitude, and altitude coordinates and traces.
—Graphical processing unit
The graphics card in a gaming rig, a cryptocurrency mining server, or a machine learning server
—Gated recurrent unit
A variation of long short-term memory networks with shared parameters to cut computation time
—Hierarchical Navigable Small World
A graph data structure that enables efficient search and robust approximate nearest neighbor search (see the Hierarchical Navigable Small World graphs paper by Yu A. Malkov and D. A. Yashunin, https://arxiv.org/vc/arxiv/papers/1603/1603.09320v1.pdf)
—High performance computing
The study of systems that maximize throughput, usually by parallelizing computation with separate map and reduce computation stages
—Integrated development environment
A desktop application for software development, such as PyCharm, Eclipse, Atom, or Sublime Text 3
—Information retrieval
The study of document and web search engine algorithms. This is what brought NLP to the forefront of important computer science disciplines in the 90s.
—India Technical University
A top-ranking technical university. The Georgia Tech of India.
—Internationalization
Preparing an application for use in more than one country (locale)
—Linear discriminant analysis
A classification algorithm with linear boundaries between classes (see chapter 4)
—Latent semantic analysis
Truncated SVD applied to TF-IDF or bag-of-words vectors to create topic vectors in a vector space language model (see chapter 4)
—Locality sensitive hash
A hash that works as an efficient but approximate mapping/clustering index on dense, continuous, high-dimensional vectors (see chapter 13). Think of them as ZIP Codes that work for more than just 2D (latitude and longitude).
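The ZIP Code analogy can be sketched with random hyperplane hashing, one common locality sensitive scheme for cosine similarity (not necessarily the one used in chapter 13). The names `make_lsh` and `lsh`, the 8-bit hash length, and the seed are assumptions.

```python
import random


def make_lsh(num_dims, num_bits=8, seed=0):
    """Return a hash function built from num_bits random hyperplanes."""
    rng = random.Random(seed)
    planes = [[rng.gauss(0, 1) for _ in range(num_dims)]
              for _ in range(num_bits)]

    def lsh(vector):
        bits = ""
        for plane in planes:
            # Each bit records which side of a random hyperplane the vector is on
            dot = sum(p * x for p, x in zip(plane, vector))
            bits += "1" if dot >= 0 else "0"
        return bits

    return lsh


lsh = make_lsh(num_dims=4)
a = [1.0, 2.0, 3.0, 4.0]
print(lsh(a), lsh([2 * x for x in a]))
```

Vectors that point in similar directions tend to share most of their bits, so the bit string can act as a bucket key in an ordinary hash table.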
—Latent semantic indexing
An old-school way of describing latent semantic analysis (see LSA), but it’s a misnomer, since LSA vector-space models don’t lend themselves to being easily indexed.
—Long short-term memory
An enhanced form of a recurrent neural network that maintains a memory of state that itself is trained via backpropagation (see chapter 9)
—Multi-index hashing
A hashing and indexing approach for high-dimensional dense vectors
—Machine learning
Programming a machine with data rather than hand-coded algorithms
—Mean squared error
The mean of the squared differences between the desired output of a machine learning model and the actual output of the model
—Never Ending Language Learning
A Carnegie Mellon knowledge extraction project that has been running continuously for years, scraping web pages and extracting general knowledge about the world (mostly “IS-A” categorical relationships between terms)
—Natural language generation
Composing text automatically, algorithmically; one of the most challenging tasks of natural language processing (NLP)
—Natural language processing
You probably know what this is by now. If not, see the introduction in chapter 1.
—Natural language understanding
Often used in recent papers to refer to natural language processing with neural networks
—Nonnegative matrix factorization
A matrix factorization similar to SVD, but one that constrains all elements in the matrix factors to be greater than or equal to zero
—National Science Foundation
A US government agency tasked with funding scientific research
—New York City
The US city that never sleeps
—Open source software
—Pip installs pip
The official Python package manager that downloads and installs packages automatically from the “Cheese Shop” (pypi.python.org)
—Pull request
The right way to request that someone merge your code into theirs. GitHub has some buttons and wizards to make this easy. This is how you can build your reputation as a conscientious contributor to open source.
—Principal component analysis
Truncated SVD on any numerical data, typically images or audio files
—Quadratic discriminant analysis
Similar to LDA, but allows for quadratic (curved) boundaries between classes
—Rectified linear unit
A piecewise-linear neural net activation function that forces the output of a neuron to be nonnegative. Equivalent to y = np.maximum(x, 0). The most popular and efficient activation function for image processing and NLP, because it allows backpropagation to work efficiently on extremely deep networks without “vanishing the gradients.”
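A minimal NumPy sketch of this activation function follows. It uses the element-wise `np.maximum` rather than the reducing `np.max`, whose second positional argument is an axis. The function name `relu` and the sample inputs are illustrative.

```python
import numpy as np


def relu(x):
    # Element-wise maximum of x and 0: negative inputs are clipped to zero
    return np.maximum(x, 0)


x = np.array([-2.0, -0.5, 0.0, 1.5])
y = relu(x)
print(y)
```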
—Read–evaluate–print loop
The typical workflow of a developer of any scripting language that doesn’t need to be compiled. The ipython, jupyter console, and jupyter notebook REPLs are particularly powerful, with their help, ?, ??, and % magic commands, plus auto-complete, and Ctrl-R history search.
Python’s REPLs even allow you to execute any shell command (including pip) installed on your OS (such as !git commit -am 'fix 123'). This lets your fingers stay on the keyboard and away from the mouse, minimizing cognitive load from context switches.
—Root mean square error
The square root of the mean squared error. A common regression error metric. It can also be used for binary and ordinal classification problems. It provides an intuitive estimate of the 1-sigma uncertainty in a model’s predictions.
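The relationship between the mean squared error and its square root can be sketched in a few lines of plain Python. The function names and the sample values are made up for illustration.

```python
import math


def mse(desired, actual):
    # Mean of the squared differences between desired and actual outputs
    return sum((d - a) ** 2 for d, a in zip(desired, actual)) / len(desired)


def rmse(desired, actual):
    # Same units as the outputs themselves, which makes it easier to interpret
    return math.sqrt(mse(desired, actual))


desired = [3.0, 5.0, 2.0, 7.0]
actual = [2.0, 5.0, 4.0, 8.0]
print(mse(desired, actual), rmse(desired, actual))
```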
—Recurrent neural network
A neural network architecture that feeds the outputs of one layer into the input of an earlier layer. RNNs are often “unfolded” into equivalent feed forward neural networks for diagramming and analysis.
—Sequential minimal optimization
A support vector machine training approach and algorithm
—Singular value decomposition
A matrix factorization that produces a diagonal matrix of singular values and two orthogonal matrices containing the left and right singular vectors. It’s the math behind LSA and PCA (see chapter 4).
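A short NumPy sketch of the factorization, assuming a small made-up matrix `A`. It also keeps only the largest singular value to show the "truncated" variant that LSA and PCA rely on.

```python
import numpy as np

A = np.array([[3.0, 1.0, 1.0],
              [-1.0, 3.0, 1.0]])

# U and Vt hold the singular vectors; s holds the singular values, largest first
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Multiplying the three factors back together recovers A
A_rebuilt = U @ np.diag(s) @ Vt

# Truncation: keep only the largest singular value for a rank-1 approximation
A_rank1 = s[0] * np.outer(U[:, 0], Vt[0])
print(np.allclose(A, A_rebuilt), A_rank1.shape)
```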
—Support vector machine
A machine learning algorithm usually used for classification
—Term frequency * inverse document frequency
A normalization of word counts that improves information retrieval results (see chapter 3)
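TF-IDF has many variants. The sketch below assumes one of the simplest: term frequency is the count divided by document length, and inverse document frequency is log(N / df). The whitespace tokenizer, the function name `tfidf`, and the toy documents are also assumptions.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document."""
    tokenized = [doc.lower().split() for doc in docs]
    num_docs = len(tokenized)
    # Document frequency: the number of documents that contain each term
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(num_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


docs = ["the cat sat", "the dog sat", "the cat ran away"]
vectors = tfidf(docs)
print(vectors[1])
```

A term that appears in every document, such as "the" here, gets a weight of zero, while rarer terms are weighted more heavily.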
—User interface
The “affordances” you offer your user through your software, often the graphical web pages or mobile application screens that your user must interact with to use your product or service
—User experience
The nature of a customer’s interaction with your product or company, from purchase all the way through to their last contact with you. This includes the UI of your website or API and all other interactions with your company.
—Vector space model
A vector representation of the objects in your problem, such as words or documents in an NLP problem (see chapter 4 and chapter 6)
—Your mileage may vary
You may not get the same results that we did.