compute-grad-PyTorch

Task: compute the gradient of a simple function using PyTorch

Setup

In [1]:
import torch
from torch import tensor
import matplotlib.pyplot as plt
%matplotlib inline

We now define a function f of two variables, built from two helper functions:

In [2]:
def square(x):
    return x * x

def double(x):
    return 2 * x

def f(x1, x2):
    return double(x1) + square(x2) + 5.0

We evaluate it at a few values.

In [3]:
f(0.0, 0.0)
Out[3]:
5.0
In [4]:
f(0.1, 0.0)
Out[4]:
5.2
In [5]:
f(0.0, 0.1)
Out[5]:
5.01
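The evaluations above already hint at the gradient: nudge one input by a small amount, divide the change in output by the size of the nudge, and you get a finite-difference estimate of the partial derivative. A minimal sketch, using plain Python and no autograd:

```python
def square(x):
    return x * x

def double(x):
    return 2 * x

def f(x1, x2):
    return double(x1) + square(x2) + 5.0

h = 1e-5  # a small step; the estimate improves as h shrinks
# finite-difference estimate of df/dx1 at (0.0, 0.0)
fd_x1 = (f(h, 0.0) - f(0.0, 0.0)) / h
print(fd_x1)  # close to 2.0
```

Autograd, which we use next, computes the same quantity exactly instead of approximating it.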

Task 1

Compute the gradient of f with respect to x1, when x1 = 1.0 and x2 = 1.0.

Steps:

  1. Initialize the input tensors. Tell PyTorch to track their gradients.
In [6]:
x1 = torch.tensor(1.0, requires_grad=True)
x2 = ...
  2. Call the function to get the output.
In [7]:
result = f(x1, x2)
  3. Call backward on the result.
In [8]:
result.backward()

The gradient is now stored in x1.grad.

In [9]:
x1.grad
Out[9]:
tensor(2.)
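As an aside (not needed for the tasks), torch.autograd.grad computes the same derivative and returns it directly, rather than accumulating it into .grad. A small sketch with f inlined so the block is self-contained:

```python
import torch

def f(x1, x2):
    return 2 * x1 + x2 * x2 + 5.0  # same function as above, inlined

x1 = torch.tensor(1.0, requires_grad=True)
x2 = torch.tensor(1.0, requires_grad=True)
# returns a tuple with one gradient per requested input
(grad_x1,) = torch.autograd.grad(f(x1, x2), x1)
print(grad_x1)  # tensor(2.)
```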

Task 2

Compute the gradient of f with respect to x2, when x1 = 1.0 and x2 = 1.0.

In [10]:
x1 = torch.tensor(1.0, requires_grad=True)
x2 = torch.tensor(1.0, requires_grad=True)
In [11]:
# your code here
Out[11]:
tensor(2.)

Analysis

Repeat both tasks above for several other values of x1 and x2. Then look at the definition of f and recall what you learned about derivatives in calculus. Based on that:

  1. Write a simple mathematical expression that evaluates to the value of x1.grad for any values of x1 and x2. Use only basic math operations like + or *; don't use any autograd functionality (like .backward()).
In [12]:
x1_grad = ...
  2. Write a simple mathematical expression that evaluates to the value of x2.grad for any values of x1 and x2.

Make sure that you understand why this is different from the value of x1.grad.

In [13]:
x2_grad = ...
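Once you have filled in both expressions, one way to sanity-check them is to compare them against autograd over a grid of inputs. A sketch, where x1_grad_fn and x2_grad_fn are hypothetical functions wrapping your own expressions:

```python
import torch

def f(x1, x2):
    return 2 * x1 + x2 * x2 + 5.0  # same function as above, inlined

def check(x1_grad_fn, x2_grad_fn, values=(-2.0, 0.0, 0.5, 3.0)):
    """Compare candidate gradient expressions against autograd."""
    for a in values:
        for b in values:
            x1 = torch.tensor(a, requires_grad=True)
            x2 = torch.tensor(b, requires_grad=True)
            f(x1, x2).backward()
            # your expressions should match autograd at every point
            assert torch.isclose(x1.grad, torch.tensor(x1_grad_fn(a, b)))
            assert torch.isclose(x2.grad, torch.tensor(x2_grad_fn(a, b)))
    return True
```

If check raises an AssertionError for some input pair, that pair tells you which of your two expressions disagrees with autograd and where.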