Differentiable Simulation

ODEs

Let’s consider ODEs of the form $y'(x) = f(x, y)$. Assuming a fixed step size $\Delta x$, the most common integration schemes amount to functions $g$ such that

$y(x + \Delta x) \approx g(x, y)$

For instance, Euler integration uses $g(x, y) = y + (\Delta x) f(x, y)$.

The function $g$ is applied recursively to produce a series of points, starting from some initial condition $y_0$.

$\begin{aligned} y_1 &= g(0, y_0) \\ y_2 &= g(\Delta x, y_1) \\ y_3 &= g(2 \Delta x, y_2) \\ \vdots \end{aligned}$

We can write this more compactly as the recurrence relation

$y_{n+1} = g(n \Delta x, y_n)$
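As a minimal sketch of this recurrence, the Python below builds the Euler step function $g$ from $f$ and applies it repeatedly. The names `euler_step` and `integrate` are illustrative choices, not taken from any library, and the example assumes a scalar $y$.

```python
def euler_step(f, dx):
    """Build the Euler step function g(x, y) = y + dx * f(x, y)."""
    def g(x, y):
        return y + dx * f(x, y)
    return g


def integrate(g, y0, dx, n_steps):
    """Apply y_{n+1} = g(n * dx, y_n) and return [y_0, y_1, ..., y_{n_steps}]."""
    ys = [y0]
    for n in range(n_steps):
        ys.append(g(n * dx, ys[-1]))
    return ys
```

For example, `integrate(euler_step(lambda x, y: y, 0.1), 1.0, 0.1, 10)` approximates the solution of $y' = y$ on $[0, 1]$ with ten Euler steps.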

By differentiating both sides of this relation and applying the chain rule, we find that

$\frac{\partial y_{n+1}}{\partial y_0} = \frac{\partial g}{\partial y_n} (n \Delta x, y_n) \frac{\partial y_n}{\partial y_0}$

Note that we can take the partial derivative of any component of $y_n$ with respect to any component of $y_0$, so the equation above relates Jacobian matrices. The left-hand side is the Jacobian of $y_{n + 1}$ as a function of $y_0$. The right-hand side is the product of the Jacobian of $g$ as a function of $y_n$ with the Jacobian of $y_n$ as a function of $y_0$. Note that $g$ necessarily involves the derivative of $y$ with respect to $x$ (time, in a physical simulation), i.e. $f$, the object that an ordinary non-differentiable simulation integrates forward. The Jacobian of $g$ therefore involves second-order derivatives of $y$ (first with respect to time, then with respect to the prior state).

Assuming that we can compute $\partial g / \partial y$ on demand, this gives a convenient recurrence relation for $\partial y_{n+1} / \partial y_0$ in terms of $\partial y_{n} / \partial y_0$. So to differentiate through such a simulation, all we need to do is work out $\partial g / \partial y$ and compute the series of derivative values $\partial y_{n} / \partial y_0$ alongside the values $y_n$.
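Unrolling the recurrence makes its structure explicit. Starting from the base case $\partial y_0 / \partial y_0 = I$ (the identity matrix), each step multiplies on the left by one more Jacobian of $g$, so

$\frac{\partial y_n}{\partial y_0} = \frac{\partial g}{\partial y} ((n-1) \Delta x, y_{n-1}) \, \cdots \, \frac{\partial g}{\partial y} (\Delta x, y_1) \, \frac{\partial g}{\partial y} (0, y_0)$

Accumulating this product from right to left, in step with the simulation itself, is the forward-mode ordering of the chain rule.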

Euler

As mentioned above, for Euler integration $g(x, y) = y + (\Delta x) f(x, y)$. Thus

$\frac{\partial g}{\partial y} (x, y) = I + (\Delta x) \frac{\partial f}{\partial y} (x, y)$

where $I$ is the identity matrix (simply $1$ when $y$ is a scalar). Substituting this into the recurrence gives

$\begin{aligned} \frac{\partial y_{n+1}}{\partial y_0} &= \frac{\partial g}{\partial y_n} (n \Delta x, y_n) \frac{\partial y_n}{\partial y_0} \\ &= \left( I + (\Delta x) \frac{\partial f}{\partial y} (n \Delta x, y_n) \right) \frac{\partial y_n}{\partial y_0} \end{aligned}$
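A minimal sketch of this for scalar $y$ might look like the following. The function name `euler_with_sensitivity` and the argument `df_dy` (a user-supplied $\partial f / \partial y$) are assumptions made for illustration. The sensitivity starts at $\partial y_0 / \partial y_0 = 1$ and is updated with the pre-step value of $y$, matching the recurrence above.

```python
def euler_with_sensitivity(f, df_dy, y0, dx, n_steps):
    """Euler-integrate y' = f(x, y) and track d y_n / d y_0 alongside y_n.

    f and df_dy are callables taking (x, y); y is assumed to be a scalar.
    Returns the final value y_n and the final sensitivity d y_n / d y_0.
    """
    y = y0
    dy_dy0 = 1.0  # base case: d y_0 / d y_0 = 1
    for n in range(n_steps):
        x = n * dx
        # The right-hand side is evaluated before assignment, so both
        # updates use the pre-step state (x_n, y_n).
        y, dy_dy0 = y + dx * f(x, y), (1.0 + dx * df_dy(x, y)) * dy_dy0
    return y, dy_dy0
```

For the linear test problem $f(x, y) = -k y$, each step multiplies the sensitivity by $1 - k \Delta x$, so the result can be checked against $(1 - k \Delta x)^n$ directly. For nonlinear $f$, a finite-difference comparison is a reasonable sanity check.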

N-Body Problem

A simple test case is the n-body problem. Let’s take our phase space coordinates to be position and velocity. Given positions $x_i$ and masses $m_i$, Newton’s law of universal gravitation (along with Newton’s second law of motion) tells us how to compute the acceleration experienced by a particular particle.

$a_i = \sum_{j \neq i} G m_j \frac{x_j - x_i}{|x_j - x_i|^3}$
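The sum above translates fairly directly into code. The sketch below is one straightforward (quadratic-time) way to evaluate it; the function name `accelerations`, the list-of-lists layout for positions, and the default `G=1.0` are illustrative assumptions.

```python
import math


def accelerations(positions, masses, G=1.0):
    """Return a_i = sum_{j != i} G * m_j * (x_j - x_i) / |x_j - x_i|^3 for every i.

    positions is a list of coordinate lists (any dimension), masses a list of floats.
    """
    n = len(positions)
    result = []
    for i in range(n):
        a_i = [0.0] * len(positions[i])
        for j in range(n):
            if j == i:
                continue  # a particle exerts no force on itself
            diff = [xj - xi for xi, xj in zip(positions[i], positions[j])]
            dist = math.sqrt(sum(d * d for d in diff))
            scale = G * masses[j] / dist ** 3
            a_i = [a + scale * d for a, d in zip(a_i, diff)]
        result.append(a_i)
    return result
```

With two particles of masses 100 and 1 placed one unit apart, the lighter one accelerates 100 times harder than the heavier one, and the mass-weighted accelerations cancel, as momentum conservation requires.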

To start, let’s consider two particles. I’ll make one 100 times heavier than the other, and our goal will be to find a stable circular orbit by varying the initial velocity of the satellite. As a baseline, here’s a really bad trajectory, in which the satellite gains escape velocity and doesn’t end up orbiting at all.

Differentiating through the simulation makes it very easy to find the desired solution. This orbit was found by gradient descent, but Nelder-Mead finds the same optimum.
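To make the idea concrete, here is a simplified sketch of such an optimization. It is not the setup used for the results above, and it rests on several assumptions: the heavy body is pinned at the origin, units are chosen so that `MU` (the product of $G$ and the heavy mass) equals 1, the satellite starts at $(R_0, 0)$ with velocity $(0, v_0)$, and the loss is the mean squared deviation of $|r|^2$ from $R_0^2$ along the trajectory. The names `loss_and_grad` and `fit_speed`, and the capped backtracking step, are likewise illustrative choices. The gradient comes from propagating $\partial (\text{state}) / \partial v_0$ through each Euler step with the recurrence derived earlier.

```python
import math

MU = 1.0       # assumed: G * (mass of the heavy body), in arbitrary units
DT = 0.01      # assumed step size
N_STEPS = 300  # assumed number of Euler steps
R0 = 1.0       # assumed target orbit radius


def loss_and_grad(v0):
    """Return (loss, d loss / d v0) for initial state (R0, 0) with velocity (0, v0)."""
    x, y, vx, vy = R0, 0.0, 0.0, v0
    dx, dy, dvx, dvy = 0.0, 0.0, 0.0, 1.0  # d(state) / d(v0) at step 0
    loss = grad = 0.0
    for _ in range(N_STEPS):
        r2 = x * x + y * y
        r3 = r2 * math.sqrt(r2)
        r5 = r3 * r2
        ax, ay = -MU * x / r3, -MU * y / r3
        # Jacobian of the acceleration with respect to position
        axx = -MU * (1.0 / r3 - 3.0 * x * x / r5)
        axy = 3.0 * MU * x * y / r5
        ayy = -MU * (1.0 / r3 - 3.0 * y * y / r5)
        dax, day = axx * dx + axy * dy, axy * dx + ayy * dy
        # Euler step for the state and its sensitivity (both use pre-step values)
        x, y, vx, vy = x + DT * vx, y + DT * vy, vx + DT * ax, vy + DT * ay
        dx, dy, dvx, dvy = dx + DT * dvx, dy + DT * dvy, dvx + DT * dax, dvy + DT * day
        err = x * x + y * y - R0 * R0
        loss += err * err
        grad += 4.0 * err * (x * dx + y * dy)
    return loss / N_STEPS, grad / N_STEPS


def fit_speed(v0, iterations=50, max_move=0.1):
    """Gradient descent on v0, halving the step until the loss decreases."""
    loss, grad = loss_and_grad(v0)
    for _ in range(iterations):
        if grad == 0.0:
            break
        step = min(1.0, max_move / abs(grad))  # cap how far v0 moves per iteration
        accepted = False
        while step > 1e-12:
            candidate = v0 - step * grad
            new_loss, new_grad = loss_and_grad(candidate)
            if new_loss < loss:
                v0, loss, grad = candidate, new_loss, new_grad
                accepted = True
                break
            step *= 0.5
        if not accepted:
            break
    return v0, loss
```

Calling `fit_speed(1.5)` starts above the escape speed $\sqrt{2 \mu / R_0}$ for these units and returns the fitted speed together with its loss. For an exact circular orbit the speed would be $\sqrt{\mu / R_0}$, but because the loss is defined on the Euler-discretized trajectory, the fitted value need not match it exactly.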