Probability Distributions and their Stories

Continuous Multivariate distributions

Story. This is a generalization of the univariate Gaussian.
Example. Finch beaks are measured for beak depth and beak length. The resulting distribution of depths and length is Gaussian distributed. In this case, the Gaussian is bivariate, with \(μ=(μ_d,μ_l)\) and \(\begin{align}
\mathsf{\Sigma} = \begin{pmatrix}
\sigma_\mathrm{d}^2 & \sigma_\mathrm{dl} \\
\sigma_\mathrm{dl} & \sigma_\mathrm{l}^2
\end{pmatrix}
\end{align}\).
Parameters. There is one vector-valued parameter, \(μ\), and a matrix-valued parameter, \(Σ\), referred to respectively as the mean and covariance matrix. The covariance matrix is symmetric and strictly positive definite.
Support. The K-variate Gaussian distribution is supported on \(\mathbb{R}^K\).
Probability density function.

\(\begin{align}
f(\mathbf{y};\boldsymbol{\mu}, \mathsf{\Sigma}) = \frac{1}{\sqrt{(2\pi)^K \det \mathsf{\Sigma}}}\,\mathrm{exp}\left[-\frac{1}{2}(\mathbf{y} - \boldsymbol{\mu})^T \cdot \mathsf{\Sigma}^{-1}\cdot (\mathbf{y} - \boldsymbol{\mu})\right].
\end{align}\)
Usage The usage below assumes that mu is a length K array, Sigma is a K×K symmetric positive definite matrix, and L is a K×K lower-triangular matrix with strictly positive values on teh diagonal that is a Cholesky factor.

Package	Syntax
NumPy	`np.random.multivariate_normal(mu, Sigma)`
SciPy	`scipy.stats.multivariate_normal(mu, Sigma)`
Stan	`multi_normal(mu, Sigma)`
NumPy Cholesky	`np.random.multivariate_normal(mu, np.dot(L, L.T))`
SciPy Cholesky	`scipy.stats.multivariate_normal(mu, np.dot(L, L.T))`
Stan Cholesky	`multi_normal_cholesky(mu, L)`

Related distributions.
- The Multivariate Gaussian is a generalization of the univariate Gaussian.
Notes.
The covariance matrix may also be written as \(Σ=S⋅C⋅S\), where
\(S=\sqrt{diag(Σ)}\),
and entry \(i\), \(j\) in the correlation matrix C is

\(C_{ij}=σ_{ij}/σ_iσ_j\).

Furthermore, because \(Σ\) is symmetric and strictly positive definite, it can be uniquely defined in terms of its Cholesky decomposition, \(L\), which satisfies \(Σ=L ⋅ LT\). In practice, you will almost always use the Cholesky representation of the Multivariate Gaussian distribution in Stan.