Gaussian mixture models (GMMs) are a popular technique for soft clustering. A GMM models the data as a finite mixture of Gaussian distributions: every point is assumed to be drawn from one of K Gaussian component distributions, each with its own mixing probability, and the model computes, for each point, the probability that it belongs to each cluster along with the per-cluster statistics. In short, the functionality of GMM can be described in a three-step pseudocode:
- Objective function: Compute and maximize the log-likelihood, using expectation–maximization (EM) as the framework
- EM algorithm:
- E step: Compute the posterior probability of membership (the responsibility) of each component for each data point
- M step: Optimize the model parameters given those probabilities
- Assignment: Perform soft assignment as part of the E step
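The steps above can be sketched in plain Python for a one-dimensional mixture. This is a toy illustration of the E and M steps, not Spark MLlib's implementation; the function name `em_gmm_1d` and its quantile-based initialization are assumptions made for the example.

```python
import math
import random

def em_gmm_1d(xs, k=2, iters=50):
    """Illustrative EM fit of a 1-D Gaussian mixture (toy sketch, not MLlib)."""
    xs_sorted = sorted(xs)
    n = len(xs)
    # Initialization (an assumption for this sketch): quantile-spaced means,
    # unit variances, uniform mixing weights
    mu = [xs_sorted[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    var = [1.0] * k
    w = [1.0 / k] * k
    for _ in range(iters):
        # E step: posterior membership probability (responsibility) of each
        # component for each point -- this is the soft assignment
        resp = []
        for x in xs:
            dens = [w[j] * math.exp(-(x - mu[j]) ** 2 / (2 * var[j]))
                    / math.sqrt(2 * math.pi * var[j]) for j in range(k)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M step: re-estimate weights, means, and variances from the
        # responsibilities, which maximizes the expected log-likelihood
        for j in range(k):
            nj = sum(r[j] for r in resp)
            w[j] = nj / n
            mu[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var[j] = max(sum(r[j] * (x - mu[j]) ** 2
                             for r, x in zip(resp, xs)) / nj, 1e-6)
    return w, mu, var

# Two well-separated synthetic clusters around 0 and 10
rng = random.Random(1)
data = ([rng.gauss(0.0, 1.0) for _ in range(300)]
        + [rng.gauss(10.0, 1.0) for _ in range(300)])
weights, means, variances = em_gmm_1d(data, k=2)
```

Running the sketch on the synthetic data recovers component means close to 0 and 10 and mixing weights close to 0.5 each.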
Technically, given a statistical model, its parameters are estimated from a data set using maximum-likelihood estimation (MLE). The EM algorithm, in turn, is an iterative procedure for finding such a maximum-likelihood estimate.
The Spark MLlib implementation uses the expectation–maximization algorithm to induce the maximum-likelihood model from a given set of data points. The current implementation uses the following parameters:
- k is the number of desired clusters into which to group the data points
- convergenceTol is the maximum change in log-likelihood at which convergence is considered achieved
- maxIterations is the maximum number of iterations to perform if convergence is not reached first
- initialModel is an optional starting point from which to start the EM algorithm. If this parameter is omitted, a random starting point will be constructed from the data
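The parameters above map directly onto MLlib's RDD-based `GaussianMixture.train` call. The following is a usage sketch assuming a running Spark environment; the input path `data/gmm_points.txt` is a placeholder, and `initialModel` is left at its default so a random starting point is constructed from the data.

```python
from pyspark import SparkContext
from pyspark.mllib.clustering import GaussianMixture

sc = SparkContext(appName="GMMExample")

# Parse a whitespace-separated file of numeric feature vectors
# (the path is a placeholder for this sketch)
data = sc.textFile("data/gmm_points.txt") \
         .map(lambda line: [float(t) for t in line.split()])

# k, convergenceTol, and maxIterations correspond to the parameters
# described above; initialModel is omitted here
model = GaussianMixture.train(data, k=3,
                              convergenceTol=1e-3,
                              maxIterations=100)

# Inspect the fitted mixture: one weight and one Gaussian per component
for i in range(model.k):
    print("weight=%f, mu=%s, sigma=%s" %
          (model.weights[i], model.gaussians[i].mu, model.gaussians[i].sigma))
```

Calling `model.predictSoft(data)` then returns, for each point, its membership probabilities across the k components, which is the soft assignment discussed earlier.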