Softmax

Softmax is a function commonly used in the final layer of a neural network to convert a vector of real numbers into a probability distribution.

It generalizes the logistic sigmoid function, which handles binary classification, to problems with any number of classes.
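To see the connection to the sigmoid, consider the two-class case. With inputs z1 and z2, the softmax probability of the first class depends only on the difference of the inputs and takes exactly the logistic sigmoid form:

p_1 = \frac{e^{z_1}}{e^{z_1} + e^{z_2}} = \frac{1}{1 + e^{-(z_1 - z_2)}} = \sigma(z_1 - z_2)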

How to calculate

The softmax function takes as input a vector of real numbers, z = [z1, z2, …, zk], and returns a vector of the same size, p = [p1, p2, …, pk], where each element pi represents the probability of the input belonging to the ith class.

The softmax function is defined as:

p_i = \frac{e^{z_i}}{\sum_{j=1}^{k} e^{z_j}}

  • In this equation, pi is the predicted probability for the ith class, zi is the ith input value, and k is the total number of classes.
  • The denominator of the equation is the sum of the exponential values of all input values, which ensures that the probabilities sum up to 1.
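The definition above translates directly into code. The sketch below is a minimal pure-Python implementation (the function name `softmax` is our own); it subtracts the maximum input before exponentiating, a standard trick that avoids overflow for large inputs without changing the result, since the shift cancels between numerator and denominator.

```python
import math

def softmax(z):
    """Convert a list of real numbers into a probability distribution."""
    m = max(z)  # shift by the max for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)  # the normalizing denominator
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
print(probs)  # probabilities are positive and sum to 1
```

Note that the output preserves the ordering of the inputs: the largest input value receives the largest probability.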
