
I am reading the well-known paper “Neural Ordinary Differential Equations” (Chen et al., 2018, with David Duvenaud as a co-author), and I have come across this definition of the adjoint state (page 2):

$$a(t) = \frac{\mathrm dL}{\mathrm dz}(t)$$

I don't understand exactly what this notation means. Is it the directional derivative of $L$ with respect to the vector $z(t)$? Could someone give me the exact definition of this quantity? I tried several interpretations myself in order to derive the formula for $\mathrm{d}a/\mathrm{d}t$, but none of them match the formula in the article (also on page 2).

Thank you so much

MarianD
sosamm

1 Answer


You're on the right track. The adjoint is the gradient of the loss, where the derivatives are taken with respect to the components of $\mathbf{z}(t)$.

$L$ maps vectors to scalars (states of the system $\mathbf{z}$ to loss values), so $L : \mathbb{R}^n \to \mathbb{R}$. Taking the derivative with respect to the input (at least from the context) means that $\partial L / \partial \mathbf{z}(t)$ is a vector of the same dimension as the state.

Writing this out more explicitly,

$$ \mathbf{a}(t)_j = \dfrac{\partial L}{\partial \mathbf{z}_j(t)}$$

Since the state is a function of time, the adjoint will take different values at different times.
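To make this concrete, here is a minimal sketch in NumPy. It uses a toy quadratic loss $L(\mathbf{z}) = \|\mathbf{z}\|^2$ (an assumption for illustration, standing in for the paper's actual loss) and approximates each component $a_j = \partial L / \partial \mathbf{z}_j$ by central finite differences:

```python
import numpy as np

def L(z):
    # Toy scalar-valued loss, R^n -> R (hypothetical; stands in for the
    # paper's loss on the ODE solution). Analytic gradient is 2*z.
    return np.sum(z ** 2)

def adjoint(z, eps=1e-6):
    # a_j = dL/dz_j, approximated component-wise by central differences
    a = np.zeros_like(z)
    for j in range(len(z)):
        zp, zm = z.copy(), z.copy()
        zp[j] += eps
        zm[j] -= eps
        a[j] = (L(zp) - L(zm)) / (2 * eps)
    return a

z_t = np.array([1.0, -2.0, 0.5])  # state z(t) at some time t
a_t = adjoint(z_t)                # numerically close to 2 * z_t
```

Since the adjoint is evaluated at the state $\mathbf{z}(t)$, plugging in the state at a different time gives a different adjoint vector, which is exactly why $\mathbf{a}(t)$ is itself a function of time.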

Demetri Pananos