Questions tagged [automatic-differentiation]
13 questions
40 votes, 1 answer
Step-by-step example of reverse-mode automatic differentiation
Not sure if this question belongs here, but it's closely related to gradient methods in optimization, which seem to be on-topic here. Anyway, feel free to migrate if you think some other community has better expertise on the topic.
In short, I'm…

ffriend
- 9,380
- 5
- 24
- 29
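A minimal sketch of the idea behind reverse-mode AD, in plain Python; this is an illustration of the technique the question asks about, not the accepted answer's walkthrough:

```python
import math

class Var:
    """Toy reverse-mode AD node: stores a value plus (parent, local derivative) pairs."""
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents
        self.grad = 0.0

    def __add__(self, other):
        return Var(self.value + other.value, [(self, 1.0), (other, 1.0)])

    def __mul__(self, other):
        return Var(self.value * other.value,
                   [(self, other.value), (other, self.value)])

def sin(v):
    return Var(math.sin(v.value), [(v, math.cos(v.value))])

def backward(node, adjoint=1.0):
    # Chain rule in reverse: each path's contribution accumulates in .grad.
    # (A real implementation would topologically sort the tape instead of
    # recursing once per path through shared subexpressions.)
    node.grad += adjoint
    for parent, local in node.parents:
        backward(parent, adjoint * local)

x, y = Var(2.0), Var(3.0)
z = x * y + sin(x)         # z = xy + sin(x)
backward(z)
print(x.grad, y.grad)      # dz/dx = y + cos(x) = 3 + cos(2), dz/dy = x = 2
```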
7 votes, 2 answers
In GD optimisation, if the gradient of the error function is taken w.r.t. the weights, isn't the target value dropped since it's a lone constant?
Suppose we have the absolute difference as an error function:
$\mathit{loss}(w) = |m_x(w) - t|$
where $m_x$ is simply some model with input $x$ and weight setting $w$, and $t$ is the target value.
In gradient-descent optimisation, the initial idea…

mesllo
- 579
- 2
- 16
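For concreteness, one worked step under the question's own definitions (my addition): wherever the loss is differentiable, i.e. $m_x(w) \neq t$,
$$\frac{\partial\,\mathit{loss}}{\partial w} = \operatorname{sign}\big(m_x(w) - t\big)\,\frac{\partial m_x(w)}{\partial w},$$
so $\partial t / \partial w = 0$, but $t$ itself is not dropped: it survives inside the nonlinearity $|\cdot|$.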
7 votes, 1 answer
What is an example use of auto differentiation, such as implemented in TensorFlow, and why is it important?
I have a decent grasp of neural networks, backpropagation, and the chain rule; however, I am struggling to understand auto differentiation.
The questions below refer to auto differentiation outside the context of backpropagation:
How does auto differentiation…

Greg
- 335
- 1
- 4
- 9
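A minimal sketch of the kind of use the question asks about, using TensorFlow's standard tf.GradientTape API (illustrative, not taken from the question):

```python
import tensorflow as tf

x = tf.Variable(3.0)
with tf.GradientTape() as tape:
    y = x ** 2 + tf.sin(x)       # build the computation; the tape records it
dy_dx = tape.gradient(y, x)      # auto differentiation: 2*x + cos(x) at x = 3
print(dy_dx.numpy())
```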
6 votes, 1 answer
Mathematical notation for suppressing differentiation
Basic question
Is there some existing mathematical notation that means "treat this term as a constant when differentiating"? This would be the equivalent of detach in PyTorch or stop_gradient in TensorFlow and JAX.
When I asked this on Twitter, a helpful…

Dennis Prangle
- 531
- 2
- 9
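For contrast with the notation question, this is what the code-level operators do; a sketch using jax.lax.stop_gradient (the function name is JAX's, the example is mine):

```python
import jax

def f(x):
    c = jax.lax.stop_gradient(x ** 2)   # "treat this term as a constant when differentiating"
    return c * x

# d/dx of (const * x) = const = x**2 = 9.0 at x = 3;
# without stop_gradient, d/dx of x**3 would give 27.0.
print(jax.grad(f)(3.0))
```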
3 votes, 1 answer
How to know the number of dimensions of a Jacobian?
My question comes from a comment on this question: Vector Jacobian product in automatic differentiation.
The question states...
$$
t = Wz, \quad z \in \mathbb{R}^{m \times 1}, \; t \in \mathbb{R}^{n \times 1}, \; W \in \mathbb{R}^{n \times m}
$$

Joff
- 599
- 2
- 13
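For the linear map in the excerpt, the shape follows from the standard convention that the Jacobian has one row per output component and one column per input component (a general fact, not the linked answer's wording):
$$\left(\frac{\partial t}{\partial z}\right)_{ij} = \frac{\partial t_i}{\partial z_j} = W_{ij}, \qquad \text{so } \frac{\partial t}{\partial z} = W \in \mathbb{R}^{n \times m}.$$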
3 votes, 1 answer
Vector-Jacobian Product Computational Cost
The paper FFJORD: Free-form Continuous Dynamics for Scalable Reversible Generative Models presents a continuous-time flow as a generative model which uses Hutchinson's trace estimator to give an unbiased estimate of the log-density, allowing for…

Lashoun
- 31
- 2
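A sketch of the estimator the question refers to, written with jax.vjp; the dynamics function f below is a hypothetical stand-in, not FFJORD's model:

```python
import jax
import jax.numpy as jnp

def f(z):
    return jnp.tanh(z)   # hypothetical stand-in for the flow's dynamics

def hutchinson_trace(f, z, key, n_samples=64):
    # E[eps^T J eps] = tr(J) when eps has zero mean and identity covariance.
    _, vjp = jax.vjp(f, z)
    eps = jax.random.rademacher(key, (n_samples, z.shape[0])).astype(z.dtype)
    # Each sample costs one VJP, i.e. roughly one extra pass through f,
    # instead of the d VJPs needed for the exact trace.
    eps_J = jax.vmap(lambda e: vjp(e)[0])(eps)
    return jnp.mean(jnp.sum(eps * eps_J, axis=1))

z = jnp.ones(5)
print(hutchinson_trace(f, z, jax.random.PRNGKey(0)))   # ≈ 5 * (1 - tanh(1)**2)
```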
1 vote, 0 answers
Derivation of ELBO in ADVI Paper, Jacobian of Elliptical Transformation
I've been following the ELBO derivations in the paper Automatic Differentiation Variational Inference and have a few questions. With the model $p(x,\theta)$, they first transform $\theta$ so that it lies in real coordinate space. Let $\zeta =…

James
- 11
- 2
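For reference, the change-of-variables identity the derivation rests on, stated generically for the paper's transform $\zeta = T(\theta)$ (my paraphrase, not the paper's exact line):
$$p\big(x, \zeta\big) = p\big(x, T^{-1}(\zeta)\big)\,\left|\det J_{T^{-1}}(\zeta)\right|.$$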
1 vote, 0 answers
Reverse-Mode Automatic Differentiation with respect to a Matrix: How to "Matrix Multiply" 4D Tensors?
This is a follow-up question to this excellent answer: https://stats.stackexchange.com/a/235758/307400. I will spare myself writing down the details of reverse-mode automatic differentiation; the linked answer gives a nice introduction. I also…

cherrywoods
- 111
- 2
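A sketch of the shape bookkeeping at issue, assuming $Y = WX$ and an upstream gradient $G = \partial L / \partial Y$ (hypothetical shapes; this is the standard way the 4-D Jacobian is avoided, not necessarily the asker's setting):

```python
import numpy as np

n, m, k = 3, 4, 5
W = np.random.randn(n, m)
X = np.random.randn(m, k)
G = np.random.randn(n, k)   # upstream gradient dL/dY for Y = W @ X

# The 4-D tensor dY/dW (shape n x k x n x m) is never materialized:
# reverse mode contracts it with G directly.
grad_W = G @ X.T            # dL/dW, shape (n, m)
grad_X = W.T @ G            # dL/dX, shape (m, k)
```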
1 vote, 1 answer
Automatic differentiation (autograd): when is the explicit definition of the gradient function needed?
In PyTorch and similar machine learning software, the autograd module computes the gradient of a function without needing to explicitly declare the derivative of each individual function that composes the main function. However, it is possible to…

volperossa
- 625
- 5
- 9
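One case where an explicit gradient is needed is numerical stability; a sketch with PyTorch's torch.autograd.Function (the ClampedLog example is hypothetical, not from the question):

```python
import torch

class ClampedLog(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.log(x)

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        # Hand-written derivative: clamping avoids the exploding 1/x near zero
        # that the automatic rule would produce.
        return grad_output / x.clamp(min=1e-6)

x = torch.tensor([1e-9, 2.0], requires_grad=True)
ClampedLog.apply(x).sum().backward()
print(x.grad)
```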
1 vote, 0 answers
Auto Differentiation in Deep Learning Libraries
It is said that auto-diff is very efficient at generating the derivatives for backpropagation algorithms. Then why is it that some of the most widely used deep learning libraries like Theano and TensorFlow do not use this functionality? Is it because…

m1cro1ce
- 748
- 6
- 13
0 votes, 0 answers
Computing the Jacobian $J_F$ with $F = h \circ f$
Let
$$
f: \mathbb{R}^l \to \mathbb{R}^m, \qquad h: \mathbb{R}^m \to \mathbb{R}^o
$$
and let
$$
F = h \circ f \quad (F: \mathbb{R}^l \to \mathbb{R}^o).
$$
I want to compute the Jacobian using forward-mode accumulation in…

lalaland
- 131
- 3
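A sketch of forward-mode accumulation for such a composition, using jax.jvp; the maps f and h below are hypothetical instances with $l = 2$, $m = 3$, $o = 2$:

```python
import jax
import jax.numpy as jnp

f = lambda x: jnp.array([x[0] * x[1], jnp.sin(x[0]), x[1] ** 2])   # R^2 -> R^3
h = lambda y: jnp.array([y[0] + y[1], y[1] * y[2]])                # R^3 -> R^2
F = lambda x: h(f(x))                                              # R^2 -> R^2

x = jnp.array([1.0, 2.0])
# Forward mode builds J_F one column at a time: one JVP per input dimension,
# each propagating the tangent through f first, then h (the chain rule).
columns = [jax.jvp(F, (x,), (e,))[1] for e in jnp.eye(2)]
J_F = jnp.stack(columns, axis=1)   # equals J_h(f(x)) @ J_f(x), shape (2, 2)
```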
0 votes, 0 answers
Vector Jacobian product in automatic differentiation
My question is related to this post: Higher Order of Vectorization in Backpropagation in Neural Network by @shimao.
I don't really get the following claim (I know how the chain rule works and what the essence of reverse-mode automatic differentiation…

jdeJuan
- 97
- 5
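For context, in the notation of the linked question ($t = Wz$), a vector-Jacobian product contracts a cotangent $v$ with the Jacobian without ever forming it (a generic statement of the idea, since the quoted claim is cut off):
$$v^\top \frac{\partial t}{\partial z} = v^\top W, \qquad v \in \mathbb{R}^{n \times 1},$$
i.e. one matrix-vector product, $O(nm)$ work.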
0 votes, 2 answers
Automatic differentiation for a function without representation
I have been studying AD these past few days and I think I understand how it works, but all the functions to which AD has been applied in the lectures I've studied are elementary in the mathematical sense, I mean, given by a formula with elementary…

Mads C
- 3
- 1
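On the question's theme: AD needs a program, not a formula. A sketch differentiating an iterative algorithm with jax.grad (the Babylonian square root is my own illustrative choice):

```python
import jax

def babylonian_sqrt(x, n_iter=20):
    # No closed-form representation, just a loop of elementary operations,
    # which is exactly what AD differentiates step by step.
    y = x
    for _ in range(n_iter):
        y = 0.5 * (y + x / y)
    return y

print(jax.grad(babylonian_sqrt)(4.0))   # ≈ 0.25 = 1 / (2 * sqrt(4))
```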