Yalin(Eric) Yang's Notes

Mutual-Information

Materials

Mutual Information, Clearly Explained!!!

Definition

We want to predict the value of "Loves Troll 2" and find out which predictors provide the most information for that prediction.

In theory we could use $R^2$, but $R^2$ only works well with continuous data.
MI helps us quantify the relationship between a mixture of discrete and continuous variables.
- To handle continuous data, the MI calculation first groups it into bins, which turns it into discrete data
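
The binning step can be sketched as follows. This is a minimal illustration that assumes equal-width bins, which is only one common choice; the helper name `equal_width_bins`, the number of bins, and the height values are all hypothetical.

```python
def equal_width_bins(values, n_bins):
    """Map each continuous value to a bin index in 0..n_bins-1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Every value is identical, so everything falls into one bin.
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # min(...) keeps the largest value inside the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

# Hypothetical continuous predictor (heights in metres).
heights = [1.52, 1.60, 1.68, 1.75, 1.83, 1.91]
height_bins = equal_width_bins(heights, 3)
```

After this step `height_bins` holds one discrete label per observation, so it can be treated like any other categorical predictor. The number of bins affects the resulting MI, so it is worth trying more than one setting.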

Steps in calculating MI
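
A minimal sketch of the calculation for two discrete variables, assuming the standard definition $MI = \sum_{x}\sum_{y} p(x,y)\,\log\frac{p(x,y)}{p(x)\,p(y)}$ with the natural logarithm. The function name `mutual_information` and the sample values are hypothetical, chosen only for illustration.

```python
from collections import Counter
from math import log

def mutual_information(xs, ys):
    """MI in nats between two equal-length sequences of discrete values."""
    n = len(xs)
    # Step 1: count how often each (x, y) pair and each single value occurs.
    joint_counts = Counter(zip(xs, ys))
    x_counts = Counter(xs)
    y_counts = Counter(ys)
    # Step 2: convert counts to probabilities and sum
    # p(x, y) * log(p(x, y) / (p(x) * p(y))) over the observed pairs.
    mi = 0.0
    for (x, y), count in joint_counts.items():
        p_xy = count / n
        p_x = x_counts[x] / n
        p_y = y_counts[y] / n
        mi += p_xy * log(p_xy / (p_x * p_y))
    return mi

# Hypothetical data (not taken from the notes): one predictor and the target.
likes_popcorn = ["yes", "yes", "yes", "no", "no"]
loves_troll_2 = ["yes", "yes", "no", "no", "no"]
mi_popcorn = mutual_information(likes_popcorn, loves_troll_2)
```

Only pairs that actually occur are summed, so the sketch never evaluates $\log 0$. Repeating the call for each predictor and comparing the results shows which predictor is most informative about the target within this dataset.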

Property

$MI=0$ when the predictor has only one value (it never changes → it provides no information)
A larger MI means the predictor provides more information about the target variable. However, MI also depends on the dataset, so MI values from different datasets are not comparable.
- MI is like a sign-agnostic correlation: a predictor that is 100% positively correlated with the target variable gives the same MI as one that is 100% negatively correlated
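
These properties can be checked numerically. The sketch below assumes the standard definition of MI for discrete variables with the natural logarithm; the compact helper `mutual_information` and the toy values are hypothetical.

```python
from collections import Counter
from math import log, isclose

def mutual_information(xs, ys):
    """MI in nats between two equal-length sequences of discrete values."""
    n = len(xs)
    joint, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    # (c / n) is p(x, y); c * n / (px[x] * py[y]) is p(x, y) / (p(x) * p(y)).
    return sum((c / n) * log(c * n / (px[x] * py[y]))
               for (x, y), c in joint.items())

target = [0, 0, 1, 1, 1, 0]            # hypothetical target values
constant = [7] * len(target)           # predictor that never changes
same = list(target)                    # perfectly positively related
flipped = [1 - t for t in target]      # perfectly negatively related

mi_constant = mutual_information(constant, target)
mi_same = mutual_information(same, target)
mi_flipped = mutual_information(flipped, target)
```

Here `mi_constant` is zero, while `mi_same` and `mi_flipped` are equal and positive: MI reflects how strongly the two variables depend on each other, not the direction of the relationship.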

Relationship between MI and Entropy