2. Maximum likelihood estimation#

2.1. Reading materials#

2.2. Definition#

  • In statistics, maximum likelihood estimation (MLE) is a method of estimating the parameters of an assumed probability distribution, given some observed data.

  • Therefore, we must first assume a distribution. Taking the normal distribution as an example, we assume

    1. The mean (average) has the highest probability density

    2. The distribution is roughly symmetric around the mean (no skewness)


2.3. Steps in MLE#

  1. Write the likelihood function, where

    • \(x_i\) is the observed value

    • \(\theta_j\) is a parameter of the assumed distribution

    • \(f(x_i;\theta)\) is the probability function

    \[ L(\theta) = L(x_1,x_2,...,x_n;\theta_1,\theta_2,...,\theta_m) = \prod_{i=1}^n f(x_i;\theta_1,\theta_2,...,\theta_m) \]


  2. Take the logarithm of the likelihood function

    • The goal is to maximize the likelihood function, but the likelihood function is a product of many probabilities, which makes the derivatives hard to compute.

    • Taking the logarithm does not change the location of the maximum or minimum, and it turns the product into a summation.


  3. Take the partial derivative with respect to each distribution parameter \(\theta\)

    • Solve the equation below, then check which solution makes \(ln(L)\) attain its maximum
      
      • If there is no solution (no stationary point), the maximum is attained at a boundary of the parameter range, i.e., at \(min(\theta)\) or \(max(\theta)\)

\[ \frac{\partial ln(L)}{\partial \theta} = 0 \]
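
The same recipe can also be carried out numerically. Below is a minimal sketch, assuming numpy and scipy are available and using synthetic data as a hypothetical stand-in for real observations: it builds the negative log-likelihood of a normal model and minimizes it (equivalent to maximizing \(ln(L)\)).

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

# Synthetic observations (hypothetical stand-in for real data)
rng = np.random.default_rng(0)
x = rng.normal(loc=28, scale=2, size=500)

def neg_log_likelihood(params, data):
    """Negative of ln(L); scipy only minimizes, so we flip the sign."""
    mu, sigma = params
    if sigma <= 0:  # sigma must stay positive
        return np.inf
    return -np.sum(norm.logpdf(data, loc=mu, scale=sigma))

# Steps 2-3 done numerically: maximize ln(L) over (mu, sigma)
result = minimize(neg_log_likelihood, x0=[20.0, 1.0], args=(x,),
                  method="Nelder-Mead")
mu_hat, sigma_hat = result.x
print(mu_hat, sigma_hat)  # close to the true values 28 and 2
```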

2.4. MLE for normal distribution#

2.4.1. Understand the parameter of normal distribution#

\[ P(x|\mu,\sigma) = \frac{1}{\sqrt{2\pi}\sigma} e ^{-\frac{(x-\mu)^2}{2\sigma^2}} \]
  • What is the probability density of observing \(x=32\) from a normal distribution \(N(\mu=28,\sigma=2)\)?

\[ P(x=32|\mu=28,\sigma=2) = \frac{1}{\sqrt{2\pi}*2} e ^{-\frac{(32-28)^2}{2*2^2}} = 0.03 \]
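
As a quick check, assuming scipy is available, the same value can be computed directly:

```python
from scipy.stats import norm

# Density of x = 32 under N(mu=28, sigma=2); matches the hand calculation above
print(norm.pdf(32, loc=28, scale=2))  # ~0.027, i.e., 0.03 after rounding
```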

2.4.2. Get logarithm of likelihood function#

  1. \[ ln(\frac{1}{\sqrt{2\pi}\sigma} e ^{-\frac{(x-\mu)^2}{2\sigma^2}}) \to ln(\frac{1}{\sqrt{2\pi}\sigma}) + ln(e ^{-\frac{(x-\mu)^2}{2\sigma^2}}) \]
  2. \[ ln(\frac{1}{\sqrt{2\pi}\sigma}) + ln(e ^{-\frac{(x-\mu)^2}{2\sigma^2}}) \to -\frac{1}{2}ln(2\pi\sigma^2) - \frac{(x-\mu)^2}{2\sigma^2} \]
  3. \[ -\frac{1}{2}ln(2\pi\sigma^2) - \frac{(x-\mu)^2}{2\sigma^2} \to -\frac{1}{2}ln(2\pi) - ln(\sigma) - \frac{(x-\mu)^2}{2\sigma^2} \]
  4. Sum the logarithm of the likelihood function over all observations

\[ ln[L(\mu,\sigma|x_1,x_2,...,x_n)] = \sum_{i=1}^n ln(f(x_i)) \]
\[ \sum_{i=1}^n ln(f(x_i)) = -\frac{n}{2}ln(2\pi) - n*ln(\sigma) - \frac{(x_1-\mu)^2}{2\sigma^2}-...-\frac{(x_n-\mu)^2}{2\sigma^2} \]
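
To sanity-check this expansion, the sketch below (assuming numpy and scipy, with a small hypothetical sample) evaluates both the expanded formula and the direct sum of log-densities; the two agree:

```python
import numpy as np
from scipy.stats import norm

x = np.array([27.1, 29.4, 28.2, 31.0, 26.5])  # hypothetical observations
mu, sigma = 28.0, 2.0
n = len(x)

# Expanded form from step 4
ll_formula = (-n / 2 * np.log(2 * np.pi)
              - n * np.log(sigma)
              - np.sum((x - mu) ** 2) / (2 * sigma ** 2))

# Direct sum of log-densities
ll_direct = np.sum(norm.logpdf(x, loc=mu, scale=sigma))

print(np.isclose(ll_formula, ll_direct))  # True
```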

2.4.3. Estimate \(\mu\) and \(\sigma\)#

  1. Get the partial derivative with respect to \(\mu\)

\[ \frac{\partial}{\partial \mu} ln[L(\mu,\sigma|x_1,x_2,...,x_n)] = 0-0+ \frac{(x_1-\mu)}{\sigma^2}+...+\frac{(x_n-\mu)}{\sigma^2} \]
\[ \frac{\partial}{\partial \mu} ln[L(\mu,\sigma|x_1,x_2,...,x_n)] = \frac{1}{\sigma^2}[(x_1+...+x_n)-n*\mu] \]
  2. Set the partial derivative to 0 and solve for \(\mu\)

\[ \frac{1}{\sigma^2}[(x_1+...+x_n)-n*\mu] = 0 \to \mu = \frac{(x_1+...+x_n)}{n} \]
  3. Get the partial derivative with respect to \(\sigma\)

\[ \frac{\partial}{\partial \sigma} ln[L(\mu,\sigma|x_1,x_2,...,x_n)] = - \frac{n}{\sigma} + \frac{1}{\sigma^3}[(x_1-\mu)^2+...+(x_n-\mu)^2] \]
  4. Set the partial derivative to 0 and solve for \(\sigma\)

\[ -\frac{n}{\sigma} + \frac{1}{\sigma^3}[(x_1-\mu)^2+...+(x_n-\mu)^2] = 0 \to -n + \frac{1}{\sigma^2}[(x_1-\mu)^2+...+(x_n-\mu)^2] = 0 \]
\[ -n + \frac{1}{\sigma^2}[(x_1-\mu)^2+...+(x_n-\mu)^2] = 0 \to n * \sigma^2 = [(x_1-\mu)^2+...+(x_n-\mu)^2] \]
\[ n * \sigma^2 = [(x_1-\mu)^2+...+(x_n-\mu)^2] \to \sigma = \sqrt{\frac{[(x_1-\mu)^2+...+(x_n-\mu)^2]}{n}} \]
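
Both closed-form estimates are easy to verify in code. A minimal check, assuming numpy and a synthetic sample: note that the MLE of \(\sigma\) divides by \(n\), not \(n-1\), so it matches np.std with ddof=0.

```python
import numpy as np

x = np.random.default_rng(1).normal(loc=28, scale=2, size=1000)

mu_hat = np.mean(x)                               # MLE of mu: the sample mean
sigma_hat = np.sqrt(np.mean((x - mu_hat) ** 2))   # MLE of sigma: n in the denominator

print(np.isclose(sigma_hat, np.std(x, ddof=0)))   # True: same n-denominator
print(mu_hat, sigma_hat)                          # close to 28 and 2
```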