mixture of Gaussians models
(50 minutes to learn)
Summary
Mixture of Gaussians is a probabilistic model commonly used for clustering: partitioning a set of data points into clusters such that points within a cluster are similar to one another. Each cluster is modeled as a Gaussian distribution, and the model is typically fit with the EM algorithm.
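To make the idea concrete, here is a minimal, numpy-only EM sketch for a mixture of spherical Gaussians. The function name `fit_gmm`, the simple spread-along-one-axis initialization, and the spherical-covariance restriction are assumptions made for this illustration (the core readings cover the general full-covariance case):

```python
import numpy as np

def fit_gmm(X, k, n_iter=50):
    """Minimal EM for a mixture of k spherical Gaussians (illustrative sketch)."""
    n, d = X.shape
    # Simple illustrative init: spread the initial means along the first
    # coordinate. Real implementations usually initialize with k-means.
    order = np.argsort(X[:, 0])
    mu = X[order[np.linspace(0, n - 1, k).astype(int)]].copy()
    var = np.ones(k)          # one shared variance per component
    w = np.full(k, 1.0 / k)   # mixing weights
    for _ in range(n_iter):
        # E-step: responsibilities r[i, j] ∝ w_j * N(x_i | mu_j, var_j * I),
        # computed in log space for numerical stability.
        sq = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(-1)        # (n, k)
        log_p = np.log(w) - 0.5 * d * np.log(2 * np.pi * var) - sq / (2 * var)
        log_p -= log_p.max(axis=1, keepdims=True)
        r = np.exp(log_p)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and variances from responsibilities.
        nk = r.sum(axis=0)
        w = nk / n
        mu = (r.T @ X) / nk[:, None]
        sq = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(-1)
        var = (r * sq).sum(axis=0) / (d * nk)
    return w, mu, var, r

# Usage: two well-separated 2-D clusters.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(8, 1, (100, 2))])
w, mu, var, r = fit_gmm(X, k=2)
labels = r.argmax(axis=1)  # hard cluster assignments from soft responsibilities
```

Unlike K-means, the E-step produces soft assignments (each point gets a probability of belonging to each cluster), which is what makes the model probabilistic.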
Context
This concept has the prerequisites:
Core resources (read/watch one of the following)
-Paid-
→ Pattern Recognition and Machine Learning
A textbook for a graduate machine learning course, with a focus on Bayesian methods.
Location:
Section 9.2, up to 9.2.1, pages 430-432
→ Machine Learning: a Probabilistic Perspective
A very comprehensive graduate-level machine learning textbook.
Location:
Section 11.2, pages 337-342
Supplemental resources (the following are optional, but you may find them useful)
-Free-
→ Bayesian Reasoning and Machine Learning
A textbook for a graduate machine learning course.
Additional dependencies:
- Bayesian networks
See also
- K-means is a simpler clustering model that is faster to fit and is often used to initialize a mixture of Gaussians.
- When there isn't enough data to fit a general mixture of Gaussians, here are some alternative models:
- Bayesian mixture of Gaussians
- mixture of factor analyzers
- co-clustering
- Bayesian clustered tensor factorization