Gaussian Mixture Models (GMMs) Explained

Step-by-step follow-along | Data Series | Episode 8.7

Mazen Ahmed
4 min read · Feb 20, 2024

Overview

Gaussian Mixture Models (GMMs) are used for clustering and probability density estimation.

A GMM rests on the fundamental assumption that the observed data is generated by multiple Gaussian distributions, each of which represents a cluster.
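To make this assumption concrete, here is a minimal sketch that generates 1-dimensional data the way a GMM assumes it arises: first pick a cluster, then draw a point from that cluster's Gaussian. The function name `sample_mixture`, the mixing weights, and all parameter values are illustrative assumptions, not taken from this article.

```python
import random

def sample_mixture(n, weights, means, stds, seed=0):
    """Draw n points from a 1-D Gaussian mixture; returns (points, labels)."""
    rng = random.Random(seed)
    points, labels = [], []
    for _ in range(n):
        # Pick which Gaussian (cluster) generates this point.
        k = rng.choices(range(len(weights)), weights=weights)[0]
        # Draw the point from that cluster's Gaussian.
        points.append(rng.gauss(means[k], stds[k]))
        labels.append(k)
    return points, labels

# Two hypothetical clusters: a small, tight one and a larger, wider one.
points, labels = sample_mixture(
    1000, weights=[0.3, 0.7], means=[-2.0, 3.0], stds=[0.5, 1.0]
)
```

In a real clustering problem only `points` would be observed. The labels are hidden, and recovering them (along with the means and standard deviations) is the job of the GMM.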

Each data point is assigned a probability of belonging to each cluster. Unlike hard clustering algorithms such as K-means and hierarchical clustering, this means a single data point can belong to multiple clusters, each with some probability.
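As a rough illustration of this soft assignment, the sketch below computes the probability that a point belongs to each cluster using Bayes' rule, for a 1-dimensional mixture whose parameters are already known. The helper `responsibilities`, the mixing weights, and the parameter values are assumptions made for the example. `NormalDist` comes from Python's standard `statistics` module.

```python
from statistics import NormalDist

def responsibilities(x, weights, means, stds):
    """Probability that point x belongs to each cluster (Bayes' rule)."""
    # Weight each cluster's Gaussian density at x by how common the cluster is.
    weighted = [
        w * NormalDist(m, s).pdf(x) for w, m, s in zip(weights, means, stds)
    ]
    total = sum(weighted)
    # Normalise so the probabilities across clusters sum to 1.
    return [v / total for v in weighted]

# Two hypothetical, equally likely clusters centred at -1 and +1.
weights, means, stds = [0.5, 0.5], [-1.0, 1.0], [1.0, 1.0]

probs_middle = responsibilities(0.0, weights, means, stds)
probs_left = responsibilities(-1.0, weights, means, stds)
```

A point exactly halfway between the two cluster centres is split evenly between them, while a point sitting on the left centre belongs mostly, but not entirely, to the left cluster. K-means would instead assign each point to exactly one cluster.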

In one dimension, the Gaussian distribution is given by the following formula:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$

Figure: Gaussian distribution with mean (μ) 0 and standard deviation (σ) 1. From Wikimedia Commons.

Where:

  • μ is the mean of the distribution.
  • σ is the standard deviation, which measures how spread out the data is.
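The 1-dimensional Gaussian density can be written out in a few lines of code. This is a minimal sketch of the standard formula with μ as `mu` and σ as `sigma`. The function name `gaussian_pdf` is an assumption for illustration, and the result is cross-checked against `NormalDist` from Python's standard `statistics` module.

```python
import math
from statistics import NormalDist

def gaussian_pdf(x, mu=0.0, sigma=1.0):
    """Density of a 1-D Gaussian with mean mu and standard deviation sigma."""
    coeff = 1.0 / (sigma * math.sqrt(2 * math.pi))
    exponent = -((x - mu) ** 2) / (2 * sigma ** 2)
    return coeff * math.exp(exponent)

# Height of the standard Gaussian (mu = 0, sigma = 1) at its mean.
peak = gaussian_pdf(0.0)

# A larger sigma spreads the data out, so the peak is lower.
wide_peak = gaussian_pdf(0.0, mu=0.0, sigma=2.0)

# Cross-check against the standard library's implementation.
reference = NormalDist(mu=0.0, sigma=1.0).pdf(0.0)
```

Here `peak` is about 0.399, and doubling σ halves the height of the curve at the mean.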

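This can be extended to multiple dimensions, giving the following formula (the multivariate Gaussian distribution):

$$f(\mathbf{x}) = \frac{1}{\sqrt{(2\pi)^n |\Sigma|}} \exp\left(-\frac{1}{2} (\mathbf{x}-\boldsymbol{\mu})^{T} \Sigma^{-1} (\mathbf{x}-\boldsymbol{\mu})\right)$$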

Where:

  • μ is an n-dimensional vector giving the mean point of the distribution.
  • Σ is an n-dimensional covariance…
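For a feel of how μ and Σ enter the calculation, here is a small sketch of the standard multivariate Gaussian density for the special case n = 2, using only the standard library. The function name `mvn_pdf_2d` and the example values are assumptions for illustration. A general implementation would use a linear algebra library instead of the hand-written 2×2 inverse.

```python
import math

def mvn_pdf_2d(x, mu, cov):
    """Density of a 2-D Gaussian at point x, with 2x2 covariance matrix cov."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    # Inverse of a 2x2 matrix, written out by hand.
    inv = [[d / det, -b / det], [-c / det, a / det]]
    dx = [x[0] - mu[0], x[1] - mu[1]]
    # Quadratic form (x - mu)^T Sigma^{-1} (x - mu).
    quad = sum(dx[i] * inv[i][j] * dx[j] for i in range(2) for j in range(2))
    # Normalising constant for n = 2: 1 / (2 * pi * sqrt(|Sigma|)).
    norm = 1.0 / (2 * math.pi * math.sqrt(det))
    return norm * math.exp(-0.5 * quad)

# Density at the mean of a 2-D Gaussian with identity covariance.
peak = mvn_pdf_2d([0.0, 0.0], [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])

# Uncorrelated dimensions with different variances (1 and 4).
value = mvn_pdf_2d([1.0, 2.0], [0.0, 0.0], [[1.0, 0.0], [0.0, 4.0]])
```

When the off-diagonal entries of the covariance matrix are zero, as in both calls above, the 2-dimensional density is simply the product of two 1-dimensional Gaussian densities.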
