2. Maximum likelihood estimation

2.1. Reading materials

2.2. Definition

  • In statistics, maximum likelihood estimation (MLE) is a method of estimating the parameters of an assumed probability distribution, given some observed data.

  • Therefore, we first have to assume a distribution. Taking the normal distribution as an example, we assume that

    1. The mean (average) has the highest probability

    2. The data are roughly symmetric around the mean (no skewness)


2.3. Steps in MLE

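  1. Write the likelihood function. Assuming the observations are independent, it is the product of the probability function evaluated at each observation:

    $$L(\theta)=L(x_1,x_2,\dots,x_n;\theta_1,\theta_2,\dots,\theta_m)=\prod_{i=1}^{n}f(x_i;\theta_1,\theta_2,\dots,\theta_m)$$

    • $x_i$ is the observed value

    • $\theta_j$ is a parameter of the assumed distribution

    • $f(x_i;\theta)$ is the probability function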

  2. Take the logarithm of the likelihood function

    • The goal is to maximize the likelihood function, but the likelihood function is a product of many probabilities, which makes its derivatives hard to calculate.

    • The logarithm is monotonically increasing, so taking the logarithm of the likelihood function does not change the positions of its maxima and minima, and it turns the product into a sum:

      $$\ln L(\theta)=\sum_{i=1}^{n}\ln f(x_i;\theta_1,\theta_2,\dots,\theta_m)$$

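  3. Take the partial derivative with respect to the distribution parameter $\theta$ and set it to zero

    $$\frac{\partial \ln(L)}{\partial \theta}=0$$

    • Solve the equation above, then find which solution gives the maximum of $\ln(L)$

      • If there is no solution (no flat point), the maximum lies at a boundary of the allowed range, so min(θ) or max(θ) gives the maximum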
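The three steps above can be sketched numerically. This is only a minimal illustration under stated assumptions: the model is a normal distribution with a known σ = 2, the sample `data` is made up, the helper names (`normal_pdf`, `likelihood`, `log_likelihood`) are hypothetical, and a coarse grid search stands in for solving the derivative equation analytically.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density f(x; mu, sigma) of the assumed normal distribution."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

def likelihood(data, mu, sigma):
    """Step 1: product of the densities of all observations."""
    return math.prod(normal_pdf(x, mu, sigma) for x in data)

def log_likelihood(data, mu, sigma):
    """Step 2: the logarithm turns the product into a sum."""
    return sum(math.log(normal_pdf(x, mu, sigma)) for x in data)

# Illustrative sample (made up for this sketch); sigma is assumed known.
data = [26.0, 27.5, 28.0, 29.0, 31.5]
sigma = 2.0

# Step 3 (numerical stand-in): evaluate ln(L) on a grid of candidate mu
# values and keep the candidate with the largest value.
candidates = [20 + 0.1 * k for k in range(161)]  # 20.0, 20.1, ..., 36.0
best_mu = max(candidates, key=lambda mu: log_likelihood(data, mu, sigma))

print(round(best_mu, 1))
```

For this sample the grid maximum coincides with the sample mean, which is what the analytical derivation for the normal distribution predicts. A grid search is used here only because it needs no extra libraries; in practice a numerical optimizer or the closed-form solution would normally replace it.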

2.4. MLE for normal distribution

2.4.1. Understand the parameters of the normal distribution

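$$P(x|\mu,\sigma)=\frac{1}{\sqrt{2\pi}\,\sigma}e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

  • What is the probability (density) of observing $x=32$ from a normal distribution $N(\mu=28,\sigma=2)$?

$$P(x=32|\mu=28,\sigma=2)=\frac{1}{\sqrt{2\pi}\cdot 2}e^{-\frac{(32-28)^2}{2\cdot 2^2}}\approx 0.03$$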
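The arithmetic can be checked with a few lines of Python. This is a small sketch: `normal_pdf` is a hypothetical helper written for this example, and `statistics.NormalDist` from the standard library is used only as a cross-check.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Normal density with mean mu and standard deviation sigma."""
    coefficient = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coefficient * math.exp(exponent)

density = normal_pdf(32, mu=28, sigma=2)
reference = NormalDist(mu=28, sigma=2).pdf(32)  # standard-library cross-check

print(f"{density:.4f}")  # 0.0270, which rounds to 0.03
```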

2.4.2. Get logarithm of likelihood function

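  1. $\ln\left(\frac{1}{\sqrt{2\pi}\,\sigma}e^{-\frac{(x-\mu)^2}{2\sigma^2}}\right)=\ln\left(\frac{1}{\sqrt{2\pi}\,\sigma}\right)+\ln\left(e^{-\frac{(x-\mu)^2}{2\sigma^2}}\right)$
  2. $\ln\left(\frac{1}{\sqrt{2\pi}\,\sigma}\right)+\ln\left(e^{-\frac{(x-\mu)^2}{2\sigma^2}}\right)=-\frac{1}{2}\ln[(2\pi\sigma^2)]-\frac{(x-\mu)^2}{2\sigma^2}$
  3. $-\frac{1}{2}\ln[(2\pi\sigma^2)]-\frac{(x-\mu)^2}{2\sigma^2}=-\frac{1}{2}\ln(2\pi)-\ln(\sigma)-\frac{(x-\mu)^2}{2\sigma^2}$
  4. Sum the logarithm of the likelihood function over all observations:

$$\ln[L(\mu,\sigma|x_1,x_2,\dots,x_n)]=\sum_{i=1}^{n}\ln(f(x_i))$$

$$\sum_{i=1}^{n}\ln(f(x_i))=-\frac{n}{2}\ln(2\pi)-n\ln(\sigma)-\frac{(x_1-\mu)^2}{2\sigma^2}-\dots-\frac{(x_n-\mu)^2}{2\sigma^2}$$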
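As a sanity check on the algebra, the simplified expression can be compared with the logarithm of the density computed directly. The sketch below assumes a made-up sample and arbitrary parameter values; the function names are hypothetical and chosen only for this example.

```python
import math

def log_pdf_direct(x, mu, sigma):
    """ln of the normal density, computed without any simplification."""
    pdf = math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)
    return math.log(pdf)

def log_pdf_simplified(x, mu, sigma):
    """Result of step 3: -1/2 ln(2*pi) - ln(sigma) - (x - mu)^2 / (2 sigma^2)."""
    return -0.5 * math.log(2 * math.pi) - math.log(sigma) - (x - mu) ** 2 / (2 * sigma ** 2)

def log_likelihood(data, mu, sigma):
    """Result of step 4: closed-form sum over all observations."""
    n = len(data)
    squared_terms = sum((x - mu) ** 2 for x in data) / (2 * sigma ** 2)
    return -n / 2 * math.log(2 * math.pi) - n * math.log(sigma) - squared_terms

# Illustrative values (assumptions for this sketch, not from the text).
data = [26.0, 27.5, 28.0, 29.0, 31.5]
mu, sigma = 28.0, 2.0

term_by_term = sum(log_pdf_direct(x, mu, sigma) for x in data)
closed_form = log_likelihood(data, mu, sigma)

print(math.isclose(term_by_term, closed_form))
```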

2.4.3. Estimate μ and σ

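  1. Take the partial derivative with respect to μ

$$\frac{\partial}{\partial\mu}\ln[L(\mu,\sigma|x_1,x_2,\dots,x_n)]=0-0+\frac{(x_1-\mu)}{\sigma^2}+\dots+\frac{(x_n-\mu)}{\sigma^2}$$

$$\frac{\partial}{\partial\mu}\ln[L(\mu,\sigma|x_1,x_2,\dots,x_n)]=\frac{1}{\sigma^2}[(x_1+\dots+x_n)-n\mu]$$

  2. Set the partial derivative equal to 0 to get the solution

$$\frac{1}{\sigma^2}[(x_1+\dots+x_n)-n\mu]=0\;\Rightarrow\;\mu=\frac{(x_1+\dots+x_n)}{n}$$

  3. Take the partial derivative with respect to σ

$$\frac{\partial}{\partial\sigma}\ln[L(\mu,\sigma|x_1,x_2,\dots,x_n)]=-\frac{n}{\sigma}+\frac{1}{\sigma^3}[(x_1-\mu)^2+\dots+(x_n-\mu)^2]$$

  4. Set the partial derivative equal to 0 to get the solution

$$-\frac{n}{\sigma}+\frac{1}{\sigma^3}[(x_1-\mu)^2+\dots+(x_n-\mu)^2]=0\;\Rightarrow\;-n+\frac{1}{\sigma^2}[(x_1-\mu)^2+\dots+(x_n-\mu)^2]=0$$

$$-n+\frac{1}{\sigma^2}[(x_1-\mu)^2+\dots+(x_n-\mu)^2]=0\;\Rightarrow\;n\sigma^2=[(x_1-\mu)^2+\dots+(x_n-\mu)^2]$$

$$n\sigma^2=[(x_1-\mu)^2+\dots+(x_n-\mu)^2]\;\Rightarrow\;\sigma=\sqrt{\frac{[(x_1-\mu)^2+\dots+(x_n-\mu)^2]}{n}}$$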
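The two closed-form estimates are easy to compute and to check. The following is a minimal sketch, assuming a made-up sample; `normal_mle` and `log_likelihood` are hypothetical helper names. Note that the σ estimate divides by n, so it matches the population standard deviation (`statistics.pstdev`), not the sample standard deviation that divides by n − 1.

```python
import math
from statistics import fmean, pstdev

def normal_mle(data):
    """Closed-form maximum likelihood estimates for a normal distribution."""
    n = len(data)
    mu_hat = sum(data) / n
    sigma_hat = math.sqrt(sum((x - mu_hat) ** 2 for x in data) / n)
    return mu_hat, sigma_hat

def log_likelihood(data, mu, sigma):
    """Log-likelihood of the sample under N(mu, sigma)."""
    n = len(data)
    squared_terms = sum((x - mu) ** 2 for x in data) / (2 * sigma ** 2)
    return -n / 2 * math.log(2 * math.pi) - n * math.log(sigma) - squared_terms

# Illustrative sample (an assumption for this sketch).
data = [26.0, 27.5, 28.0, 29.0, 31.5]
mu_hat, sigma_hat = normal_mle(data)

# Moving either parameter away from its estimate should lower ln(L).
best = log_likelihood(data, mu_hat, sigma_hat)
perturbed = [
    log_likelihood(data, mu_hat + 0.1, sigma_hat),
    log_likelihood(data, mu_hat - 0.1, sigma_hat),
    log_likelihood(data, mu_hat, sigma_hat + 0.1),
    log_likelihood(data, mu_hat, sigma_hat - 0.1),
]

print(mu_hat, sigma_hat, all(value < best for value in perturbed))
```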