This project implements a comprehensive analysis of Gaussian Mixture Models (GMMs) using Python and the scikit-learn library. The goal is to explore key properties of GMMs, including:
- Data Generation: Synthetic 2D datasets from multiple Gaussian components with configurable means, covariances, and mixing weights.
- Model Fitting: EM algorithm convergence via `GaussianMixture`.
- Parameter Estimation: Visualization of learned means, covariances, and component weights.
- Model Selection: BIC and AIC criteria for optimal number of components.
- Visualization: 2D contour plots with 95% confidence ellipses and ground-truth vs. predicted clustering.
- Clean Project Structure: `src/` for code, `results/` for outputs, `.venv/` isolated via `uv`.
- Modern Tooling:
  - Dependency management with `uv`
  - Code formatting with `black`
  - Environment reproducibility via `pyproject.toml` and `uv.lock`
- One-Click Execution: Run `uv run python main.py` to generate plots and save results.
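The data generation, fitting, and model selection steps listed above can be sketched as follows. This is a minimal illustration, not the project's actual code: the helper names `sample_gmm` and `select_n_components`, the component parameters, and the candidate range of 1 to 6 components are all assumptions made for the example. Only `GaussianMixture` and its `fit`, `bic`, and `aic` methods come from scikit-learn.

```python
import numpy as np
from sklearn.mixture import GaussianMixture


def sample_gmm(means, covs, weights, n_samples, seed=0):
    """Draw 2D points from a mixture of Gaussians (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # Pick a component for every sample according to the mixing weights.
    labels = rng.choice(len(weights), size=n_samples, p=weights)
    X = np.empty((n_samples, means.shape[1]))
    for k in range(len(weights)):
        mask = labels == k
        X[mask] = rng.multivariate_normal(means[k], covs[k], size=int(mask.sum()))
    return X, labels


def select_n_components(X, candidates=range(1, 7), seed=0):
    """Fit one GMM per candidate size and return the size with the lowest BIC."""
    scores = {}
    for k in candidates:
        gmm = GaussianMixture(
            n_components=k, covariance_type="full", random_state=seed
        ).fit(X)  # fit() runs the EM algorithm
        scores[k] = (gmm.bic(X), gmm.aic(X))
    best_k = min(scores, key=lambda k: scores[k][0])
    return best_k, scores


# Illustrative ground-truth parameters (assumed, not taken from the project).
means = np.array([[0.0, 0.0], [6.0, 6.0], [-6.0, 6.0]])
covs = np.array(
    [
        [[1.0, 0.0], [0.0, 1.0]],
        [[1.5, 0.5], [0.5, 1.0]],
        [[1.0, -0.3], [-0.3, 0.8]],
    ]
)
weights = np.array([0.5, 0.3, 0.2])

X, y_true = sample_gmm(means, covs, weights, n_samples=900)
best_k, scores = select_n_components(X)
for k, (bic, aic) in scores.items():
    print(f"k={k}: BIC={bic:.1f}  AIC={aic:.1f}")
print("lowest-BIC component count:", best_k)
```

Lower BIC or AIC indicates a better trade-off between fit and model complexity. BIC penalizes extra parameters more heavily than AIC, so the two criteria can disagree on the preferred number of components.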
Figure: Ground truth (left) vs. GMM fit using EM algorithm (right). Ellipses represent 2σ confidence regions.
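One common way to obtain such an ellipse from a fitted covariance matrix is an eigendecomposition: the eigenvectors give the axis directions and the square roots of the eigenvalues give the standard deviations along them. The sketch below assumes this approach; `covariance_ellipse` is a hypothetical helper name, and the project's own plotting code may differ.

```python
import numpy as np


def covariance_ellipse(cov, n_std=2.0):
    """Return (width, height, angle_deg) of the n_std ellipse for a 2x2 covariance."""
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = eigvals.argsort()[::-1]  # put the major axis first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Full axis lengths: n_std standard deviations on each side of the mean.
    width, height = 2.0 * n_std * np.sqrt(eigvals)
    # Rotation of the major axis relative to the x-axis, in degrees.
    angle = np.degrees(np.arctan2(eigvecs[1, 0], eigvecs[0, 0]))
    return width, height, angle


cov = np.array([[4.0, 0.0], [0.0, 1.0]])
width, height, angle = covariance_ellipse(cov, n_std=2.0)
print(width, height, angle)
```

The returned values match the `width`, `height`, and `angle` arguments of `matplotlib.patches.Ellipse`, with the component mean passed as `xy`. Note that for a 2D Gaussian a 2σ ellipse encloses about 86.5% of the probability mass; a 95% region corresponds to roughly 2.45σ.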
Author: Zhixi Hu
Email: [email protected]
Python: 3.11 | Managed with: uv