The MAML paper uses the following gradient descent update (see page 3, Algorithm 1):

$$ \theta' = \theta - \alpha \nabla_{\theta} \mathcal{L}_{\mathcal{T}_i}(f_{\theta}) $$

My question is: what is the reason for the $\theta$ subscript after the gradient (nabla) symbol?

Specifically, why is it not written like this:

$$ \theta' = \theta - \alpha \nabla \mathcal{L}_{\mathcal{T}_i}(f_{\theta}) $$

In other words, why is it not enough to just write the gradient of the loss function?

gunes
Jan Musil

1 Answer

The subscript means the gradient is taken with respect to $\theta$. In this context there is little ambiguity, and most readers of the paper would understand that the gradient is with respect to $\theta$ even without the subscript. Still, there is no harm in being precise, especially in published work.

It's like writing $\frac{\partial f}{\partial \theta}$, instead of $f'$.
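To see why the subscript can matter, here is a minimal sketch in plain Python. The toy loss, the helper `partial_derivative`, and all the numbers are my own illustration and are not taken from the paper. The loss depends on a parameter `theta` and on an input `x`, so "the gradient of the loss" could refer to either one, and only the derivative with respect to `theta` belongs in the update rule.

```python
# Toy loss depending on a parameter theta and on data (x, y).
# This is a hypothetical example, not the loss used in MAML.
def loss(theta, x, y):
    return (theta * x - y) ** 2


def partial_derivative(fn, args, index, eps=1e-6):
    """Central finite-difference derivative of fn with respect to args[index]."""
    up = list(args)
    down = list(args)
    up[index] += eps
    down[index] -= eps
    return (fn(*up) - fn(*down)) / (2 * eps)


theta, x, y = 2.0, 3.0, 1.0

# Derivative with respect to theta: analytically 2 * (theta * x - y) * x
grad_theta = partial_derivative(loss, (theta, x, y), 0)

# Derivative with respect to x: analytically 2 * (theta * x - y) * theta
grad_x = partial_derivative(loss, (theta, x, y), 1)

# The update in the question uses the theta-derivative only.
alpha = 0.01
theta_new = theta - alpha * grad_theta
```

The two derivatives are different quantities, which is exactly what the subscript in $\nabla_{\theta}$ disambiguates. Automatic differentiation libraries ask for the same information explicitly, since you have to say which argument to differentiate with respect to.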

gunes