Softmax and Sigmoid

Task: get more practice using the softmax function, and connect it with the sigmoid function.

Setup

In [1]:
import torch
from torch import tensor
import matplotlib.pyplot as plt
%matplotlib inline
In [2]:
def softmax(x):
    # exponentiate each entry and normalize along dimension 0 so the result sums to 1
    return torch.softmax(x, dim=0)

Task

Try this example:

In [3]:
x1 = tensor([0.1, 0.2, 0.3])
x2 = tensor([0.1, 0.2, 100])
In [4]:
softmax(x1)
Out[4]:
tensor([0.3006, 0.3322, 0.3672])
  1. Write a block of code that assigns p = softmax(x1) and then evaluates p.sum(). Before you run it, predict what the output will be.
In [ ]:
# your code here
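One possible sketch (not the only answer): softmax divides each exponential by the sum of exponentials, so the entries should add up to 1, up to floating-point rounding.
In [ ]:
# sketch: the softmax entries should sum to 1 (up to rounding)
p = softmax(x1)
p.sum()  # expect a value very close to tensor(1.)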
  2. Write a block of code that evaluates p2 = softmax(x2) and displays the result. Before you run it, predict what it will output.
In [ ]:
# your code here
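One possible sketch: the third entry (100) is so much larger than the others that its exponential dominates the sum, so the output is essentially one-hot.
In [ ]:
# sketch: the huge third value dominates, so the result is nearly [0, 0, 1]
p2 = softmax(x2)
p2  # the first two entries are vanishingly small (they may print in scientific notation)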
  3. Evaluate torch.sigmoid(tensor(0.1)). Write an expression that uses softmax to get the same output. Hint: give softmax a two-element tensor([num1, num2]), where one of the numbers is 0.
In [ ]:
# your code here
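One possible sketch: sigmoid(z) = 1 / (1 + exp(-z)), which is exactly the first entry of softmax applied to [z, 0], since exp(z) / (exp(z) + exp(0)) = 1 / (1 + exp(-z)).
In [ ]:
# sketch: sigmoid(0.1) equals the first entry of softmax over [0.1, 0]
torch.sigmoid(tensor(0.1)), softmax(tensor([0.1, 0.0]))[0]  # both about 0.5250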

Analysis

  1. A valid probability distribution has no negative numbers and sums to 1. Is softmax(x) a valid probability distribution? Why or why not?

your answer here
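If it helps, here is a quick numeric check you could run alongside your written answer (a sketch):
In [ ]:
# sketch: check that every entry is non-negative and that the entries sum to 1
p = softmax(x1)
(p >= 0).all(), p.sum()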

  2. Jargon alert: x is sometimes called the "logits", and x.softmax(dim=0).log() (or x.log_softmax(dim=0)) is called the "logprobs", short for "log probabilities". Complete the following expressions for x1 (from the example above).
In [ ]:
logits = ...
logprobs = ...
probabilities = ...
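One possible completion (a sketch): logprobs is the elementwise log of the softmax output, and exponentiating it recovers the probabilities.
In [ ]:
# sketch completion for x1
logits = x1
logprobs = x1.log_softmax(dim=0)  # equivalent to x1.softmax(dim=0).log(), but more numerically stable
probabilities = logprobs.exp()    # same values as softmax(x1)
logits, logprobs, probabilities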
  3. In light of your observations about the difference between softmax(x1) and softmax(x2), why might softmax be an appropriate name for this function?

your answer here