Notation
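- We say $X$ is distributed according to distribution $\mathcal{D}$ with parameters $\theta$ using the notation $X \sim \mathcal{D}(\theta)$.
- Alternatively, we may notate this as $p(x) = \mathcal{D}(x \mid \theta)$.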
Distributions
- The Bernoulli $\mathrm{Ber}(x \mid \theta)$ denotes the probability of an experiment succeeding ($x = 1$) given that $\theta$ is the probability of success.
- The Binomial $\mathrm{Bin}(k \mid n, \theta)$ denotes the probability of an experiment succeeding $k$ times given that $n$ trials were conducted and $\theta$ is the probability of success.
- The Multinomial $\mathrm{Mu}(\mathbf{x} \mid n, \boldsymbol{\theta})$: $\mathbf{x}$ is a vector such that $x_j$ denotes the number of times the $j$-th outcome occurs and $\theta_j$ denotes the probability of it occurring. The multinomial denotes the probability of $\mathbf{x}$ being observed in $n$ trials.
- The Categorical $\mathrm{Cat}(\mathbf{x} \mid \boldsymbol{\theta})$ is a special case of the Multinomial where $n = 1$ and $\mathbf{x}$ is a one-hot encoded vector.
- The Poisson $\mathrm{Poi}(k \mid \lambda)$ is the limit of the binomial distribution for a large number of trials $n$ and a small probability of success $\theta = \lambda / n$. It denotes the probability of $k$ successes being observed over a fixed interval for very unlikely events, where $\lambda$ is the expected number of successes.
- The Empirical / Sample Distribution is a distribution where, given a set of data $\mathcal{D} = \{x_1, \dots, x_N\}$, we have the probability of each $x_i$ as being $1/N$. It is the empirical measure associated with the sample.
- The Geometric $\mathrm{Geo}(k \mid \theta)$ gives the probability that the $k$-th trial is the first success observed given that the experiment succeeds with probability $\theta$.
- The Hypergeometric $\mathrm{HG}(k \mid N, K, n)$ describes the probability of $k$ successes (i.e., an object is drawn with a desired feature) in $n$ draws without replacement from a population of size $N$ with $K$ objects with that feature.
- The Negative Hypergeometric $\mathrm{NHG}(k \mid N, K, r)$ describes the probability distribution on the number of draws (without replacement) needed before $r$ failures are observed (see Hypergeometric).
- The Uniform $\mathrm{Unif}(x \mid a, b)$ is a distribution where all events from $a$ to $b$ are equally likely and all other events never occur.
- The Normal / Gaussian $\mathcal{N}(x \mid \mu, \sigma^2)$ is a distribution following a bell curve whose mean, median and mode are $\mu$ and whose variance is $\sigma^2$. It is of the form
  $$\mathcal{N}(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$$
  One interpretation of the above is that it is a limiting distribution of the Binomial Distribution over many trials.
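The two limits of the Binomial mentioned above (Poisson for rare events, Gaussian for many trials) can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names and the particular values of `n`, `lam` and `theta` are illustrative assumptions, not part of the original notes.

```python
import math

def binom_pmf(k, n, theta):
    """Bin(k | n, theta), computed in log space to avoid overflow for large n."""
    log_coeff = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return math.exp(log_coeff + k * math.log(theta) + (n - k) * math.log1p(-theta))

def poisson_pmf(k, lam):
    """Poi(k | lam) = exp(-lam) * lam**k / k!"""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def normal_pdf(x, mu, sigma2):
    """N(x | mu, sigma2), where sigma2 is the variance."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)

# Poisson limit: many trials, small success probability theta = lam / n.
lam, n = 2.0, 10_000
poisson_gap = abs(binom_pmf(3, n, lam / n) - poisson_pmf(3, lam))

# Gaussian limit: many trials with fixed theta; match the Binomial's
# mean n * theta and variance n * theta * (1 - theta).
n2, theta = 1_000, 0.5
normal_gap = abs(
    binom_pmf(500, n2, theta) - normal_pdf(500, n2 * theta, n2 * theta * (1 - theta))
)

print(poisson_gap, normal_gap)  # both gaps shrink as n grows
```

Both comparisons are pointwise checks at a single value of $k$, so they illustrate the limits rather than prove them.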
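- The Dirac Function $\delta(x)$ is a function that is infinite at $x = 0$ and $0$ everywhere else, while integrating to $1$. It can be viewed as the limit of a Gaussian as its variance goes to $0$.
- The Student $t$ is a variation of the Gaussian which is more robust to outliers (i.e., it has heavier tails). It is denoted $\mathcal{T}(x \mid \mu, \sigma^2, \nu)$ with mean $\mu$, scale parameter $\sigma^2$ and degrees of freedom $\nu$; its variance is $\nu\sigma^2 / (\nu - 2)$ for $\nu > 2$. As $\nu \to \infty$, it becomes the Gaussian.
- The Cauchy is a variant of the Student $t$ distribution with $\nu = 1$ degree of freedom. It has such heavy tails that it does not have a mean. It can be interpreted as the distribution of the quotient of two independent zero-mean normal random variables.
- The Laplace $\mathrm{Lap}(x \mid \mu, b)$ is defined as
  $$\mathrm{Lap}(x \mid \mu, b) = \frac{1}{2b} \exp\left(-\frac{|x - \mu|}{b}\right)$$
  It is robust to outliers and puts more probability density at the center than the Gaussian.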
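To make the comparison between the Laplace and the Gaussian concrete, here is a small sketch that evaluates both densities at the center, in the shoulder, and in the tail. It assumes the location-scale parameterization $\mathrm{Lap}(x \mid \mu, b) = \frac{1}{2b}\exp(-|x - \mu| / b)$, whose variance is $2b^2$, and matches the two variances so the comparison is fair; the helper names are illustrative.

```python
import math

def normal_pdf(x, mu, sigma2):
    """N(x | mu, sigma2), where sigma2 is the variance."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)

def laplace_pdf(x, mu, b):
    """Lap(x | mu, b) with location mu and scale b (assumed parameterization)."""
    return math.exp(-abs(x - mu) / b) / (2 * b)

# Match variances: the Laplace variance is 2 * b**2, so b = sqrt(sigma2 / 2).
sigma2 = 1.0
b = math.sqrt(sigma2 / 2)

for x in (0.0, 1.0, 4.0):
    print(x, normal_pdf(x, 0.0, sigma2), laplace_pdf(x, 0.0, b))
```

With equal variances, the Laplace is higher at the center and in the far tail, and lower in between, which is what "more peaked with heavier tails" means in practice.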
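- The Gamma $\mathrm{Ga}(x \mid a, b)$ is a distribution parameterized with shape $a$ and rate $b$. It is the limiting distribution of …
- The Exponential is denoted $\mathrm{Expon}(x \mid \lambda) = \mathrm{Ga}(x \mid 1, \lambda)$, where $\lambda$ is the rate parameter. It describes the distribution of the times between events in a Poisson process.
- The Erlang Distribution is denoted $\mathrm{Erlang}(x \mid k, \lambda)$ and is defined as the Gamma Distribution $\mathrm{Ga}(x \mid k, \lambda)$ with integer shape $k$. It is the distribution of the time until the $k$-th event of a Poisson Process.
- The Chi-Squared $\chi^2(x \mid \nu) = \mathrm{Ga}(x \mid \nu/2, 1/2)$ is the distribution of the sum of $\nu$ squared independent standard Gaussian random variables.
- The Inverse Gamma $\mathrm{IG}(x \mid a, b)$ is the distribution of the reciprocal of a random variable following the Gamma distribution: if $X \sim \mathrm{Ga}(a, b)$, then $1/X \sim \mathrm{IG}(a, b)$.
- The (scaled) inverse chi-squared distribution is defined as
  $$\chi^{-2}(x \mid \nu, \sigma^2) = \mathrm{IG}\left(x \,\middle|\, \frac{\nu}{2}, \frac{\nu\sigma^2}{2}\right)$$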
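The relationships among the Gamma, Exponential, Erlang and Chi-Squared distributions listed above can be verified by comparing densities directly. This sketch assumes the shape/rate parameterization $\mathrm{Ga}(x \mid a, b) = \frac{b^a}{\Gamma(a)} x^{a-1} e^{-bx}$; the function names and test values are illustrative choices.

```python
import math

def gamma_pdf(x, a, b):
    """Ga(x | shape a, rate b) for x > 0, computed in log space."""
    return math.exp(a * math.log(b) - math.lgamma(a) + (a - 1) * math.log(x) - b * x)

def expon_pdf(x, lam):
    """Exponential density with rate lam."""
    return lam * math.exp(-lam * x)

def erlang_pdf(x, k, lam):
    """Density of the waiting time until the k-th event of a rate-lam Poisson process."""
    return lam ** k * x ** (k - 1) * math.exp(-lam * x) / math.factorial(k - 1)

def chi2_pdf(x, nu):
    """Chi-squared density with nu degrees of freedom."""
    return x ** (nu / 2 - 1) * math.exp(-x / 2) / (2 ** (nu / 2) * math.gamma(nu / 2))

x = 1.7  # arbitrary positive evaluation point
checks = {
    "Expon(lam) == Ga(1, lam)": math.isclose(expon_pdf(x, 0.8), gamma_pdf(x, 1, 0.8)),
    "Erlang(k, lam) == Ga(k, lam)": math.isclose(erlang_pdf(x, 3, 0.8), gamma_pdf(x, 3, 0.8)),
    "Chi2(nu) == Ga(nu/2, 1/2)": math.isclose(chi2_pdf(x, 5), gamma_pdf(x, 2.5, 0.5)),
}
for name, ok in checks.items():
    print(name, ok)
```

Each check compares the two densities at a single point, so it is a sanity test of the parameterization rather than a proof of the identity.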
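- The Beta $\mathrm{Beta}(x \mid a, b)$ is a family of distributions of the form
  $$\mathrm{Beta}(x \mid a, b) = \frac{1}{B(a, b)} x^{a - 1} (1 - x)^{b - 1}, \qquad B(a, b) = \frac{\Gamma(a)\Gamma(b)}{\Gamma(a + b)}$$
  Where $a$ acts as a count of the successes, and $b$ of the failures.
- The Pareto $\mathrm{Pareto}(x \mid k, m)$ is used to model long tailed distributions where most items do not occur often. These distributions exhibit power laws. $m$ is a threshold that the input $x$ must exceed, but not by much; how much is too much is determined by $k$.
- The Multivariate Gaussian / Multivariate Normal is a generalization of the Gaussian over $D$ random variables. It is defined as
  $$\mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}, \boldsymbol{\Sigma}) = \frac{1}{(2\pi)^{D/2} |\boldsymbol{\Sigma}|^{1/2}} \exp\left[-\frac{1}{2} (\mathbf{x} - \boldsymbol{\mu})^T \boldsymbol{\Sigma}^{-1} (\mathbf{x} - \boldsymbol{\mu})\right]$$
  - The quadratic form in the exponent is the squared Mahalanobis distance between vector $\mathbf{x}$ and mean $\boldsymbol{\mu}$.
  - This means that the contours of the probability distribution lie in ellipsoids (see Murphy Ch. 4.1.2). Informally, this means the distribution involves translating by $\boldsymbol{\mu}$ and rotating by $\mathbf{U}$, the matrix of eigenvectors of $\boldsymbol{\Sigma}$.
  - The eigenvalues of $\boldsymbol{\Sigma}$ determine how stretched the ellipsoid contours are. The eigenvectors determine the axes of the ellipsoid.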
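The link between the Mahalanobis distance and the eigen-decomposition of the covariance can be seen in a small two-dimensional case. The sketch below is restricted to $2 \times 2$ covariance matrices so that it needs only the standard library; the helper names and the example covariance are assumptions made for illustration.

```python
import math

def mahalanobis_sq(x, mu, cov):
    """Squared Mahalanobis distance (x - mu)^T cov^{-1} (x - mu) for a 2x2 covariance."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx, dy = x[0] - mu[0], x[1] - mu[1]
    # Closed-form inverse of a 2x2 matrix: (1 / det) * [[d, -b], [-c, a]].
    return (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det

def eigenvalues_sym2(cov):
    """Eigenvalues (largest first) of a symmetric 2x2 matrix."""
    (a, b), (_, d) = cov
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    return mean + radius, mean - radius

cov = [[2.0, 1.0], [1.0, 2.0]]  # eigenvectors are (1, 1) and (1, -1)
mu = (0.0, 0.0)
lam_max, lam_min = eigenvalues_sym2(cov)

# Two points at the same Euclidean distance from the mean, one along each eigenvector.
d_long = mahalanobis_sq((1.0, 1.0), mu, cov)    # along the stretched axis
d_short = mahalanobis_sq((1.0, -1.0), mu, cov)  # along the compressed axis

print(lam_max, lam_min, d_long, d_short)
```

Along an eigenvector, the squared Mahalanobis distance equals the squared Euclidean distance divided by the corresponding eigenvalue, so the point along the larger-eigenvalue axis is "closer" to the mean and the density contour stretches further in that direction.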
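- The Multivariate Student $t$, $\mathcal{T}(\mathbf{x} \mid \boldsymbol{\mu}, \boldsymbol{\Sigma}, \nu)$, is a generalization of the $t$ distribution over $D$ random variables. It is defined as
  $$\mathcal{T}(\mathbf{x} \mid \boldsymbol{\mu}, \boldsymbol{\Sigma}, \nu) = \frac{\Gamma(\nu/2 + D/2)}{\Gamma(\nu/2)} \frac{|\boldsymbol{\Sigma}|^{-1/2}}{\nu^{D/2} \pi^{D/2}} \left[1 + \frac{1}{\nu} (\mathbf{x} - \boldsymbol{\mu})^T \boldsymbol{\Sigma}^{-1} (\mathbf{x} - \boldsymbol{\mu})\right]^{-\frac{\nu + D}{2}} = \frac{\Gamma(\nu/2 + D/2)}{\Gamma(\nu/2)} |\pi \mathbf{V}|^{-1/2} \left[1 + (\mathbf{x} - \boldsymbol{\mu})^T \mathbf{V}^{-1} (\mathbf{x} - \boldsymbol{\mu})\right]^{-\frac{\nu + D}{2}}$$
  where $\boldsymbol{\Sigma}$ is the scale matrix rather than the covariance matrix and $\mathbf{V} = \nu \boldsymbol{\Sigma}$. As with the $t$ distribution, as $\nu \to \infty$, the distribution tends to the Multivariate Gaussian.
- The Dirichlet is a generalization of the Beta Distribution. It has support over the simplex (generalization of triangular surface) given by
  $$S_K = \left\{\mathbf{x} : 0 \le x_k \le 1, \ \sum_{k=1}^{K} x_k = 1\right\}$$
  The distribution is defined as
  $$\mathrm{Dir}(\mathbf{x} \mid \boldsymbol{\alpha}) = \frac{1}{B(\boldsymbol{\alpha})} \prod_{k=1}^{K} x_k^{\alpha_k - 1} \, \mathbb{I}(\mathbf{x} \in S_K), \qquad B(\boldsymbol{\alpha}) = \frac{\prod_{k=1}^{K} \Gamma(\alpha_k)}{\Gamma(\alpha_0)}$$
  And if $\alpha_0 = \sum_{k=1}^{K} \alpha_k$, then
  $$\mathbb{E}[x_k] = \frac{\alpha_k}{\alpha_0}, \qquad \mathrm{mode}[x_k] = \frac{\alpha_k - 1}{\alpha_0 - K}, \qquad \mathrm{var}[x_k] = \frac{\alpha_k (\alpha_0 - \alpha_k)}{\alpha_0^2 (\alpha_0 + 1)}$$
  And $\alpha_0$ controls the strength of the distribution (i.e., how peaked it is) and $\alpha_k$ controls where the peak occurs.
- The Wishart is a generalization of the Gamma distribution to positive definite matrices. Its pdf is defined as
  $$\mathrm{Wi}(\boldsymbol{\Lambda} \mid \mathbf{S}, \nu) = \frac{1}{Z_{\mathrm{Wi}}} |\boldsymbol{\Lambda}|^{(\nu - D - 1)/2} \exp\left(-\frac{1}{2} \mathrm{tr}(\boldsymbol{\Lambda} \mathbf{S}^{-1})\right)$$
  Where $Z_{\mathrm{Wi}}$ is simply a normalization constant, $\nu$ is the degrees of freedom and $\mathbf{S}$ is the scale matrix.
- The Inverse Wishart $\mathrm{IW}(\boldsymbol{\Sigma} \mid \mathbf{S}, \nu)$ is the generalization of the inverse gamma to positive definite matrices. It is defined for $\nu > D - 1$ and positive definite $\mathbf{S}$.
- The Normal Inverse Wishart is of the following form
  $$\mathrm{NIW}(\boldsymbol{\mu}, \boldsymbol{\Sigma} \mid \mathbf{m}_0, \kappa_0, \nu_0, \mathbf{S}_0) = \mathcal{N}\left(\boldsymbol{\mu} \,\middle|\, \mathbf{m}_0, \frac{1}{\kappa_0} \boldsymbol{\Sigma}\right) \times \mathrm{IW}(\boldsymbol{\Sigma} \mid \mathbf{S}_0, \nu_0)$$
  Where
  - $\mathbf{m}_0$ is the prior mean for $\boldsymbol{\mu}$
  - $\kappa_0$ is the belief in $\mathbf{m}_0$
  - $\mathbf{S}_0$ is proportional to the prior mean for $\boldsymbol{\Sigma}$
  - $\nu_0$ is the belief in $\mathbf{S}_0$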
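- The Normal Inverse Chi-Squared is defined as
  $$N\chi^{-2}(\mu, \sigma^2 \mid m_0, \kappa_0, \nu_0, \sigma_0^2) = \mathcal{N}\left(\mu \,\middle|\, m_0, \frac{\sigma^2}{\kappa_0}\right) \chi^{-2}(\sigma^2 \mid \nu_0, \sigma_0^2)$$
  Where
  - $m_0$ is the prior mean for $\mu$
  - $\kappa_0$ is the belief in $m_0$
  - $\sigma_0^2$ is proportional to the prior mean for $\sigma^2$
  - $\nu_0$ is the belief in $\sigma_0^2$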
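- The Inverse Gaussian (or Wald) is given by
  $$\mathrm{Wald}(x \mid \mu, \lambda) = \sqrt{\frac{\lambda}{2\pi x^3}} \exp\left(-\frac{\lambda (x - \mu)^2}{2\mu^2 x}\right)$$
  Where
  - $\mu$ is the mean
  - $\lambda$ is the shape parameter
  - Input $x > 0$

  If the Gaussian describes Brownian motion at a fixed time, the Inverse Gaussian describes the time it takes for a Brownian motion with positive drift to first reach a fixed positive level.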
- The Normal Inverse Gaussian combines a Normal distribution with a Wald (Inverse Gaussian) distribution. It has four parameters:
  - the mean of the Normal distribution
  - a scaling factor applied to the variance of the Normal distribution
  - the mean of the Wald distribution
  - the shape parameter of the Wald distribution
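- The Boltzmann is a probability distribution that gives the probability of a certain state as a function of energy and temperature
  $$p_i = \frac{1}{Q} \exp\left(-\frac{\varepsilon_i}{kT}\right) = \frac{\exp(-\varepsilon_i / kT)}{\sum_{j=1}^{M} \exp(-\varepsilon_j / kT)}$$
  Where
  - $p_i$ is the probability of state $i$
  - $\varepsilon_i$ is the energy of state $i$
  - $k$ is the Boltzmann constant
  - $T$ is the absolute temperature of the system
  - $M$ is the number of states accessible to the system
  - $Q$ is the normalization constant

  The Boltzmann distribution is the distribution that maximizes entropy subject to a fixed average energy.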
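As a quick illustration of how temperature shapes the Boltzmann distribution, the sketch below computes $p_i \propto \exp(-\varepsilon_i / kT)$ for a handful of states. The energies are hypothetical values chosen for illustration, and `kT` is passed as a single number in the same units as the energies rather than as separate $k$ and $T$.

```python
import math

def boltzmann(energies, kT):
    """Return p_i = exp(-e_i / kT) / Q for each state energy e_i."""
    # Subtract the minimum energy before exponentiating for numerical stability;
    # the shift cancels between the numerator and the normalizer.
    e_min = min(energies)
    weights = [math.exp(-(e - e_min) / kT) for e in energies]
    q = sum(weights)  # normalization constant (of the shifted weights)
    return [w / q for w in weights]

energies = [0.0, 1.0, 2.0]  # hypothetical state energies, same units as kT

cold = boltzmann(energies, kT=0.1)    # low temperature: mass concentrates on the lowest energy
hot = boltzmann(energies, kT=100.0)   # high temperature: close to uniform over states

print(cold)
print(hot)
```

At low temperature nearly all probability sits on the lowest-energy state, while at high temperature the distribution approaches uniform, which is the highest-entropy distribution over the $M$ states.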
Links
- Univariate Distribution Relationships - a graph that shows probability distributions, and how and why they are related to each other.
- Murphy Ch. 2
- Murphy 4.5 - on the Wishart and Inverse Wishart