softmax operation useful in classification?
Suppose we give our digit classifier an image of a 3, and it outputs a score (logit) of 1 for every digit.
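To make the all-equal-logits case concrete, here is a minimal sketch in NumPy. It assumes a 10-class digit classifier (digits 0 through 9); the `softmax` helper and the variable names are illustrative, not taken from the course notebooks.

```python
import numpy as np

def softmax(logits):
    """Turn a vector of logits into probabilities that sum to 1."""
    # Subtracting the max does not change the result but avoids overflow in exp.
    shifted = logits - np.max(logits)
    exps = np.exp(shifted)
    return exps / np.sum(exps)

# Assumption: 10 classes, one per digit, each with a logit of 1.
logits = np.ones(10)
probs = softmax(logits)

print(probs)        # every class gets the same probability
print(probs.sum())  # the probabilities sum to 1
```

Because softmax depends only on the differences between logits, equal logits always give equal probabilities, whatever their common value. The classifier here is expressing no preference for any digit, including the correct one.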
Chop off the negative part of its input.
y = max(0, x)
(Gradient is 1 for positive inputs, 0 for negative inputs)
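The definition and its gradient can be sketched in a few lines of NumPy. The function names `relu` and `relu_grad` are illustrative, and the gradient at exactly x = 0 is set to 0 here as an assumed convention, since the statement above only covers positive and negative inputs.

```python
import numpy as np

def relu(x):
    """y = max(0, x), applied elementwise."""
    return np.maximum(0.0, x)

def relu_grad(x):
    """Gradient of ReLU: 1 for positive inputs, 0 for negative inputs.

    Assumption: the gradient at x == 0 is taken to be 0.
    """
    return (x > 0).astype(float)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
y = relu(x)
dy_dx = relu_grad(x)

print(y)      # negative inputs are chopped to 0, positive inputs pass through
print(dy_dx)  # 0 where the input was not positive, 1 where it was
```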
ReLU interactive (notebook: u04n00-relu.ipynb)
Logistic Regression
Consider first logistic regression, then the MLP. For each:
What’s the difference between the two?
At what probability do you decide that a class is present?
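As a starting point for the threshold question, the sketch below applies a cutoff to sigmoid outputs. The threshold of 0.5 is an assumption (a common default, not something the notes prescribe), and the logit values are made up for illustration.

```python
import numpy as np

def sigmoid(z):
    """Map a logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Assumption: 0.5 is used as the decision threshold; other choices are possible.
THRESHOLD = 0.5

# Hypothetical logits from a binary classifier.
logits = np.array([-2.0, 0.0, 1.5])
probs = sigmoid(logits)
predictions = probs >= THRESHOLD

print(probs)
print(predictions)
```

With this threshold, deciding on the probability is the same as checking the sign of the logit, because sigmoid(0) = 0.5. Moving the threshold trades false positives against false negatives.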