In the continuous time-function formalism, and accepting the framework of distributions (generalized functions), the answer is direct. Taking $\delta$ to be the Dirac delta distribution, for a sufficiently well-behaved function $f$:
$$\delta' * f = \delta * f' = f'\,.$$
Therefore, the convolution mask is obvious: it is the derivative of the Dirac delta. The derivative operator is linear and time-invariant, just like convolution.
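For illustration, here is a minimal sketch (assuming NumPy) of the simplest discrete stand-in for $\delta'$: a central-difference mask applied by convolution:

```python
import numpy as np

# Sampled signal: f(t) = sin(t) on a uniform grid of step h.
h = 0.01
t = np.arange(0.0, 2.0 * np.pi, h)
f = np.sin(t)

# Central-difference mask, a discrete stand-in for delta'.
# Convolution reverses the mask, so with [1, 0, -1]/(2h) the output
# at sample n is (f[n+1] - f[n-1]) / (2h).
mask = np.array([1.0, 0.0, -1.0]) / (2.0 * h)

df = np.convolve(f, mask, mode="same")  # endpoints are unreliable

# Interior samples should be close to the true derivative cos(t),
# with an error of order h^2 for this mask.
print(np.max(np.abs(df[1:-1] - np.cos(t[1:-1]))))
```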
Issues arise in practice when the function is not continuous, or is not fully known: finding a discrete equivalent of the derivative of the Dirac delta is not obvious.
Therefore, numerous finite difference approximations have been proposed in many domains, to adapt to discrete data, non-uniform sampling, knowledge of only one side of the data (causality), or measurement disturbances. They often combine:
- evaluation of the data on a finite-support window,
- regularization or smoothing,
- optimization so that the result is "close enough" to some expected behavior of the "discrete derivative".
Smoothing and optimization are often performed in a least-squares sense, with interpolation or extrapolation, and hence yield linear, time-invariant, discrete "convolution-like" operators with masks. Solutions are numerous, owing to the degrees of freedom above (support size, smoothing shape, domain of interpolation). Methods range from Lagrange, Bessel, Newton-Gregory, Gauss, and Stirling interpolating polynomials to FIR filter approximations.
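As one concrete instance of this least-squares family (a sketch assuming SciPy is available), the Savitzky-Golay differentiation filter fits a polynomial over a finite window and returns the derivative of the fit; the whole operation collapses into a fixed convolution mask:

```python
import numpy as np
from scipy.signal import savgol_coeffs, savgol_filter

h = 0.01
t = np.arange(0.0, 2.0 * np.pi, h)
# Noisy samples of sin(t).
f = np.sin(t) + 0.01 * np.random.default_rng(0).normal(size=t.size)

# Least-squares fit of a cubic over 11 samples, evaluated as a first
# derivative: a linear, time-invariant operator, i.e. a convolution mask.
mask = savgol_coeffs(11, 3, deriv=1, delta=h)
print(mask)  # the smoothed, finite stand-in for the derivative of delta

# Equivalent direct application; df approximates cos(t) while
# attenuating the added noise.
df = savgol_filter(f, 11, 3, deriv=1, delta=h)
```

The window length and polynomial order are exactly the degrees of freedom mentioned above: enlarging the window smooths more, at the cost of a less local derivative estimate.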
Note, however, that some approaches use non-linear, non-time-invariant, or non-space-invariant finite differentiation, for instance in real-time computing to limit instabilities or overshoot (see the references in *CHOPtrey: contextual online polynomial extrapolation for enhanced multi-core co-simulation of complex systems*), or in image processing, as in mathematical morphology, where finite derivatives vary spatially or rely on non-linear min/max operators. Such operators are not implemented by convolutions.
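As an illustration of the morphological case (a sketch assuming SciPy's `ndimage` module), the morphological gradient estimates local variation with min/max operators rather than a convolution:

```python
import numpy as np
from scipy.ndimage import grey_dilation, grey_erosion

# A 1-D step signal; the same idea extends directly to images.
x = np.concatenate([np.zeros(10), np.ones(10)])

# Morphological gradient: dilation (local max) minus erosion (local min)
# over a 3-sample window. This is non-linear, so no convolution mask
# can reproduce it.
grad = grey_dilation(x, size=3) - grey_erosion(x, size=3)
print(grad)  # peaks around the step edge, zero elsewhere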