Entropy

Entropy is a measure of the amount of uncertainty or randomness in a set of data.

Specifically, entropy is a function that measures the average amount of information required to represent an outcome of a random variable.

How to calculate

In information theory, the entropy of a discrete random variable $X$ with probability mass function $p(x)$ can be calculated using the following equation:

\begin{equation} H(X) = -\sum_{x \in \mathcal{X}} p(x) \log_2 p(x) \end{equation}

where $\mathcal{X}$ is the set of possible outcomes of X, $p(x)$ is the probability of each outcome, and $\log_2$ is the base-2 logarithm.
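
As a quick illustration, the formula can be evaluated directly from a list of outcome probabilities. This is a minimal Python sketch (the helper name entropy and the example distributions are illustrative, not taken from the text above); it skips zero-probability outcomes, since $0 \log_2 0$ is taken to be 0:

```python
import math

def entropy(probs):
    """Shannon entropy (in bits) of a discrete distribution given as probabilities."""
    # Skip zero-probability outcomes: 0 * log2(0) is treated as 0.
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))  # fair coin -> 1.0 bit
print(entropy([0.9, 0.1]))  # skewed coin -> about 0.47 bits
```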

  • The quantity $-\log_2 p(x)$ is the information content (or surprisal) of outcome x, so the entropy is the expected information content over all outcomes. Because $\log_2 p(x) \le 0$ for any probability $p(x)$, the negative sign ensures that the entropy is non-negative.

  • The interpretation of entropy in machine learning is that it measures the impurity or disorder of a set of labels.

    • For example, in a binary classification problem with labels 0 and 1, if all the data points have label 0, the entropy is 0, indicating perfect purity.

    • Conversely, if the labels are evenly split between 0 and 1, the entropy is 1 bit, indicating maximum impurity (see the sketch after this list).
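
The impurity calculation for a set of labels can be sketched the same way (a hypothetical helper label_entropy, assuming plain Python and the standard library only, not an established API):

```python
from collections import Counter
import math

def label_entropy(labels):
    """Entropy (in bits) of a collection of class labels, used as an impurity measure."""
    total = len(labels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )

print(label_entropy([0, 0, 0, 0]))  # all one class -> 0.0 (perfectly pure)
print(label_entropy([0, 0, 1, 1]))  # even 0/1 split -> 1.0 (maximum impurity)
```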

Related Topics