Softmax is a function commonly used in the final layer of a neural network to convert a vector of real-valued numbers into a probability distribution.
It is a generalization of the logistic sigmoid function, which is used for binary classification problems.
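To make that connection explicit, here is a short worked step (a standard identity, not quoted from the original): setting $k = 2$ in the definition given below collapses softmax to the logistic sigmoid $\sigma$ applied to the difference of the two scores:

$$p_1 = \frac{e^{z_1}}{e^{z_1} + e^{z_2}} = \frac{1}{1 + e^{-(z_1 - z_2)}} = \sigma(z_1 - z_2)$$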
## How to calculate
The softmax function takes as input a vector of real numbers, $z = [z_1, z_2, \ldots, z_k]$, and returns a vector of the same size, $p = [p_1, p_2, \ldots, p_k]$, where each element $p_i$ represents the probability of the input belonging to the $i$th class.
The softmax function is defined as:
$$p_i = \frac{e^{z_i}}{\sum_{j=1}^{k} e^{z_j}}$$
- In this equation, $p_i$ is the predicted probability for the $i$th class, $z$ is the vector of input values, and $k$ is the total number of classes.
- The denominator is the sum of the exponentials of all the input values, which ensures that the probabilities sum to 1.
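Below is a minimal sketch of this calculation in Python with NumPy; the function name `softmax`, the example scores, and the max-subtraction step are illustrative additions, not part of the definition itself:

```python
import numpy as np

def softmax(z):
    """Convert a vector of real-valued scores into a probability distribution."""
    # Subtracting the max before exponentiating prevents overflow for large
    # scores; the shift cancels in the ratio, so the result is unchanged.
    shifted = z - np.max(z)
    exp_z = np.exp(shifted)
    return exp_z / exp_z.sum()

# Example: three class scores mapped to probabilities.
z = np.array([2.0, 1.0, 0.1])
p = softmax(z)
print(p)        # approximately [0.659 0.242 0.099]
print(p.sum())  # 1.0 (up to floating-point rounding)
```

Subtracting the maximum is a common implementation detail: mathematically it changes nothing, but it keeps `np.exp` from overflowing when the input scores are large.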