Tuesday, January 23, 2018
Understanding Softmax Function 101
The softmax function is one of the most important output functions used in deep learning, a popular branch of machine learning. Softmax turns a vector of numbers, called logits, into probabilities that sum to one: 0.7 + 0.2 + 0.1 = 1. (The image above is from Udacity's deep learning nanodegree.) The formula raises the special number e to the power of each individual y_i, then divides by the sum of e raised to every y_j — which is why the outputs sum to one!
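Written out in math notation, the formula described above is:

```latex
\mathrm{softmax}(y)_i = \frac{e^{y_i}}{\sum_{j} e^{y_j}}
```

Because every term in the numerator also appears in the denominator's sum, the outputs are all between 0 and 1 and add up to exactly 1.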
First of all, we have a vector of y_i values: the outputs of the connected layers of neurons, i.e., the dot product of weights and features.
[1, 2, 3, 4]
sum_of_all_e_exp = e^1 + e^2 + e^3 + e^4
The first output is
p_0 = e^1 / sum_of_all_e_exp
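The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a production implementation — real deep learning libraries provide their own softmax. The max-subtraction trick is a standard addition for numerical stability; it does not change the result because softmax is shift-invariant.

```python
import math

def softmax(logits):
    # Subtract the max logit before exponentiating; this avoids overflow
    # for large inputs and leaves the probabilities unchanged.
    m = max(logits)
    exps = [math.exp(y - m) for y in logits]        # e^(y_i - m) for each logit
    total = sum(exps)                               # sum_of_all_e_exp
    return [e / total for e in exps]                # p_i = e^(y_i) / sum

probs = softmax([1, 2, 3, 4])
print([round(p, 4) for p in probs])  # → [0.0321, 0.0871, 0.2369, 0.6439]
print(sum(probs))                    # → 1.0 (up to floating-point rounding)
```

Note that p_0 is small (about 0.032) even though y_0 = 1 is not that far from y_3 = 4: exponentiation exaggerates differences between logits, which is exactly why softmax is a "soft" version of picking the maximum.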