Proper regression for determining correlations between derivatives of functions

Question

Say we have a permanent-magnet DC motor that roughly obeys the system equation $\ddot{x}(t) = \alpha \dot{x}(t) + \beta u(t) + \gamma$, where $x(t)$ is the displacement of the rotor, and $u(t)$ the applied voltage, at time $t$.

Say we wish to determine the values of $\alpha$, $\beta$, and $\gamma$ experimentally. If we can only directly measure $x$, not $\dot{x}$ or $\ddot{x}$, how should we go about estimating these parameters from a set of time-series measurements of $u$ and $x$?

One naive approach is to compute the derivatives through some central finite difference scheme, and then perform an OLS regression - but it is unobvious how the derivative calculation interacts with the regression. Additionally, I have found in practice that this suffers from a significant amount of regression dilution if the test is allowed to run too long at steady-state (the derivatives vanish here, and so all that's left is the noise).

Is there any more "complete" method for analyzing systems like this that handles the differentiation as part of the construction of the regression model? Is there a good theory of correlations between derivatives of time-series data?

Update: I figured I'd share some actual plots to compare some of the methods that have been suggested. In the time-domain plots below, the green line represents the path of a simulated motor that has been stepped along with the characterization constants determined from the regression. Note that the constants here are presented in slightly different form - the presented parameters match the "feedforward equation" form:

$u(t) = K_a \ddot{x}(t) + K_v \dot{x}(t) + K_s \text{sgn}(\dot{x}(t))$

Strategy 1: Central finite difference followed by direct regression of acceleration against other variables:

Strategy 2: Central finite difference followed by direction regression of next velocity from previous velocity (slightly different UI - this is from an old version of the tool):

Strategy 3: "Accumulated" formulation - regression of the next position against previous position and integrals of other terms:

Alas, for our test procedure this methodology seems more or less useless.

I have yet to successfully run a Fourier-domain analysis; I might give that a shot eventually, but it's substantially more of a chore to implement than the ones I've tried above.

did you try brute force approach: estimate the finite difference approximations of derivatives, then regress the second derivative on its first and the voltage? this may actually work — Aksakal, Jan 17 '22 at 01:11
Yes, i did try that; i even mentioned it in the post. It works, but not perfectly. — user3716267, Jan 17 '22 at 02:03
What's the nature of our measurements of the applied voltage $u(t)$? Is this measured discretely, like $x(t)$, and with the same or different sample times? Do you expect the measurement uncertainty in the voltage to be large enough to have an appreciable effect on the RHS value of the ODE? — jwimberley, Jan 17 '22 at 03:47
Is $u(t)$ a driving sinusoidal, and the $\alpha$ term a damping or friction term of some kind (and so presumably $\alpha < 0$)? — jwimberley, Jan 17 '22 at 04:19
actually, what is this $\gamma$ term? why is it there? I get the first two, physics wise — Aksakal, Jan 17 '22 at 04:43
The driving signal is usually a quasistatic linear ramp or a step function (data from both tests are combined), or a stairstep. — user3716267, Jan 17 '22 at 12:04
Should your equation not also include $x\left(t\right)$? What happens if you drive it with oscillating $u\left(t\right)$ that is on average zero? It would seem to me that both derivatives $\dot{x}$ and $\ddot{x}$ will also be zero on average, and the equation of motion will not be satisfied. On the other hand if derivatives are not zero on average, I think you may get a divergent model. So I think it should be $\ddot{x}\left(t\right)=\alpha\dot{x}\left(t\right)+\kappa x\left(t\right) + u\left(t\right) + \gamma$ — Cryo, Jan 17 '22 at 14:09
Looking up at previous questions. $\gamma$ is not a friction term. In normal settings $\alpha$ would be the friction term, whilst $\gamma$ is more like a constant force that shifts the system away from zero and keeps it there. It may help to drop $u=0$ and consider how your system behaves by starting it with some speed ($\dot{x}$) and position ($x$) and letting it follow your equation of motion. Any numerical solver will do it for you. Probably can also do it analytically. It would also help in building intuition about what your constants are likely to be (i.e. your prior) — Cryo, Jan 17 '22 at 14:24
@Cryo, I don't think $x_t$ should be there. there's no term like that in Newton or Kirgof equation for a motor. the fact that he has a static $\gamma$ baffles me, it must be some approximation, because he only has Newton equation here. — Aksakal, Jan 17 '22 at 15:00
In this case (no $x$ term), $\gamma$ is irrelevant for $x$, and sets conditions on $u$. The frequency-domain treatment makes it obvious — Cryo, Jan 17 '22 at 15:05
@Cryo this looks like Newton equation to me. you have the acceleration $\ddot x_t$ proportional to electric force $u_t$ and mechanical friction of the rotor $\dot x_t$. then $\gamma$ makes no sense. why is it there if the voltage is zero? what about electric magnetic parts, i.e. kirhgof equation? — Aksakal, Jan 17 '22 at 15:06
It is an approximation of dry friction on shaft bearings, and it is both reasonably accurate and nontrivial. Other friction forces are absorbed into the other regression parameters. If the test includes motion in both directions, the term is instead $\gamma \text{sgn}(\dot{x})$. — user3716267, Jan 17 '22 at 15:12
so, you have no induction force at all? the one proportional $\dot u_t/R$ — Aksakal, Jan 17 '22 at 15:13
That force is not important on the timescales we care about. — user3716267, Jan 17 '22 at 15:17
I think I got your setup. the $\gamma$ represents DC voltage applied. $u_t$ is the voltage you measure on the motor, which truly represents the current that goes through the motor. in this case $\dot x_t$ captures both the mechanical friction for Newton equation and the electromagnetic force in Kirhgof equation. then this should be a fairly standard set of equations to estimate — Aksakal, Jan 17 '22 at 15:19
I think at this point if you shared some measurements that you're fitting to, it could be helpful. I suspect that this should be modeled as state-space model, because there are hidden variables that you're not measuring and that could be important to dynamic of the process — Aksakal, Jan 17 '22 at 15:27
I’ll share some data when I get back to a suitable work computer. The parameters from the regression are ultimately used (among other things) to determine optimal state space feedback gains via LQR. I’ve tried to keep the identification routine itself a bit simpler than that. — user3716267, Jan 17 '22 at 15:51
here's a set of sample data from two motors: https://gist.github.com/Oblarg/e08d6557efa621b6a327f2577e9febd9. The columns go: timestamp, voltage1, voltage2, position1, position2, — user3716267, Jan 18 '22 at 00:31
The question has been cross posted on Math SE: https://math.stackexchange.com/questions/4359960/determining-correlations-of-derivatives-of-a-function-given-only-measurements-of — Tyberius, Jan 18 '22 at 19:30
The problem with fitting the equation directly is the messy error structure: any estimate of $\dot{x}(t)$ involves a difference of two errors and an estimate of $\ddot{x}(t)$ Involves at least three. This yields a regression in which both the explanatory and response variables have errors and those errors are correlated. Instead, *rewrite the equation.* Assuming a unit time step, one integration and a numerical first derivative give $$x_{t+1}=(\alpha+1)x_t+\beta\int^t u(x)\,\mathrm{d}x+\gamma t+C.$$ Fit this using standard ARIMA techniques with $\int^t u(x)\mathrm{d}x$ and $t$ as covariates. — whuber, Jan 18 '22 at 19:54
@whuber What would be the likely results of doing the accumulation to avoid taking differences, but then proceeding with an OLS estimate instead of an ARIMA fit? We don't have any high-powered statistical dependencies in the analysis tool at the moment, and I'm not eager to write one in C++ from scratch — user3716267, Jan 18 '22 at 20:06
That's a good question (which I was just pondering...) It depends on the magnitude of the measurement error in $x_t.$ If that's small, OLS regression ought to work well. If it's large, it will affect estimation of $\alpha.$ — whuber, Jan 18 '22 at 20:23
I'll go ahead and write a simple OLS implementation and find out. It shouldn't be hard. I'll be back with an update once I've progressed. — user3716267, Jan 18 '22 at 20:26

Aksakal · Answer 1 · 2022-01-17T03:02:56.943

If your measurements are at equidistant intervals, you can try converting this into ARIMAX(1,1,0) model, which can be estimated with OLS. Despite similarity to OLS that you described in the question, the improvement is that here you only need the first difference. Avoiding the second difference helps with stability.

I assume the time sampling $\Delta t=1$. You can always change the unit of measurement of time to make it so.

First, explode the discrete derivatives $$\Delta x_t-\Delta x_{t-1}=\gamma+\alpha\Delta x_t+\beta u_t$$ $$\Delta x_t=\frac 1 {1-\alpha}(\gamma+\Delta x_{t-1}+\beta u_t)$$ $$\Delta x_t=\underbrace{\frac 1 {1-\alpha}\gamma}_{\beta_0}+\underbrace{\frac 1 {1-\alpha}}_{\beta_1}\Delta x_{t-1}+\underbrace{\frac 1 {1-\alpha}\beta}_{\beta_2} u_t$$ This looks like ARIMAX(1,1,0) model which is ARIMA model with exogenous covariates: $$\Delta x_t=\beta_0+\beta_1\Delta x_{t-1}+\beta_2 u_t+\varepsilon_t$$

The problem is errors: ideally, it would be great to write this as $x_t=s_t+\varepsilon_t$, i.e. with true displacement and a measurement error component, and the same for $u_t$. Also, there are not only measurement errors, but perhaps noise entering displacement and voltage, which will impact the next measurement. Then the true model is probably with MA terms. However, this makes everything more complicated. Therefore, I'd start with the above simple equation, and see where it leads to. Maybe, it'll be good enough.

A simplified version above can be easily estimated by OLS in lagged first differences, i.e. $y=X\beta+e$, where $y_t=\Delta x_t$ and $X_t=(1,\Delta x_{t-1},u_t)$. And once you got the coefficients, the original parameters can be backed out with simple algebra: $$\alpha=1-1/\beta_1\\ \gamma=(1-\alpha)\beta_0\\ \beta=(1-\alpha)\beta_2$$

You can estimate this with statistical packages, such as MATLAB's arima, but it sounds like overkill to me. If you choose to use not MATLAB, but R or Python, be careful, see Hyndman's post here: https://robjhyndman.com/hyndsight/arimax/ There's a lot of confusion about ARIMAX model, and major packages estimate regression with ARIMA errors in ARIMAX functions!

We’ve tried the OLS approach you suggest (with a slightly more nuanced back-calculation than the first order algebraic expressions here). It works more or less equivalently to the other finite difference scheme I mentioned originally. I’m unfamiliar with ARIMA - is there any reason to think it’d address the dilution problem I mentioned? — user3716267, Jan 17 '22 at 03:36
the errors of measurement impact the calculation of differences in next periods, and will probably show up as moving average terms in ARIMA. so, in principle, it may help with estimation if you're able to write them into equation correctly. I cut guarantee they will though — Aksakal, Jan 17 '22 at 03:44
What happened to the second derivative in the differential equation?? — whuber, Jan 18 '22 at 19:55
I guess I don't understand your notation. Also, I haven't been able to figure out where the factor of $1/(1-\alpha)$ came from in your second equation. My suspicion is that your intended approach is the same as that in a comment I have posted to the question, but maybe the algebra needs to be revisited. — whuber, Jan 18 '22 at 20:27
@whuber the full equations for the motor have more terms. OP tried to reduce them to estimate with OLS some parameters of the full system. I think using the current equation all regressions will lead to about the same results — Aksakal, Jan 18 '22 at 20:31

score 8 · Answer 2 · answered Jan 17 '22 at 00:19

8

If you have long-enough time-series you can try to tackle it in Fourier domain. If you do that it may be a good idea to apply a window function, which would force your signal $x=x\left(t\right)$ to zero at the edges of the window, and then make your signal periodic by repeating gated signal. For what is to follow I will assume that $x\left(t\right)$ that is known for $t=-\infty...\infty$. More subtle choices can be incorporated into analysis as well.

Let:

$$ x=x\left(t\right)=\frac{1}{\sqrt{2\pi}}\int d\nu \exp\left(-i2\pi\cdot\nu\cdot t\right)\,\tilde{x}\left(\nu\right) $$

You can extract $\tilde{x}\left(\nu\right)$ using Fourier Transform, FFT will give you a discrete analogue, which requires quite similar treatment. I will stay continuous for now.

$$ \tilde{x}\left(\nu\right) = \frac{1}{\sqrt{2\pi}}\int dt \exp\left(i2\pi\cdot\nu\cdot t\right)\,x\left(t\right) $$

Taking a FT of your equation of motion then gives:

$$ \left[4\pi^2\nu^2 - i2\pi\nu\cdot\alpha\right]\tilde{x} + \beta\cdot\tilde{u}+\gamma\sqrt{2\pi}\delta\left(\nu\right)=0 $$

This is now an algebraic equation, well-suited for regression. The delta-function in the last term will not appear in a discrete treatment (it will be something better behaved).

Of course, other types of orthogonal basis functions are available. Perhaps, it would make more sense to use wavelets, really depends on your signal. Fourier methods are a good place to start though IMHO

answered Jan 17 '22 at 00:19

Cryo

336
1
6

Are you sure the delta function won’t show up in a discrete treatment? We’ve been using step (or stairstep) signals, mostly, and the DFT of a step function has a delta function in it, I believe. – user3716267 Jan 17 '22 at 03:31
DFT of a constant is a [Kroenecker delta](https://en.wikipedia.org/wiki/Kronecker_delta), which is something that is well-behaved and has finite values. Unlike the Dirac delta ($\delta\left(\nu\right)$; above) which takes infinite values. Basically in DFT you will have $\gamma\delta_{\nu,0}$ only showing up in the zero-frequency component of your calculations, non-zero frequency components will not be affected by it. – Cryo Jan 17 '22 at 09:42
Ah; that makes sense. Might that run into numerical issues if the large-but-finite delta ends up being too large? – user3716267 Jan 17 '22 at 12:52
2

It may, but that would really be an anomalous situation, which is probably worth careful consideration by subject matter expert. As in: "why do we end up with such weird model? Is is reasonable?". – Cryo Jan 17 '22 at 14:02
Also see my comment to the main question. Sometimes numerical issues in the model are a sign of unsuitable equations of motion, i.e. computer is doing its job, but you are asking to model an unrealistic scenario – Cryo Jan 17 '22 at 14:12
Perhaps a dumb question, but is there a standard convention behind time scaling in DFT implementations? The regression equation requires that I multiply by some terms involving the frequency, but none of the APIs I'm seeing take a timestep parameter - just an array of values, so I'm not precisely sure what frequency to multiply by. – user3716267 Jan 17 '22 at 23:46
I can do a more DFT centered treatment, but before this, can you confirm that your time-series are with regular spacing? That is an important requirement for FFT. If they are not, you can still do FT treatment, but without the FFT – Cryo Jan 18 '22 at 00:11
Timing jitter's a lingering problem - some of our supported configurations have data coming from languages with automated GC, which does not help things. The ideal setup (which does thankfully constitute most of our users) has pretty reliable 5ms sampling (plus or minus 10%, eyeballing from the diagnostic plots). – user3716267 Jan 18 '22 at 00:15
Right, in this case you might be better off using more manual FT-based techniques. There is no need for periodic time-series to get FT, but automated libraries expect this. Is there a single well-defined driving frequency? I understand that $u$ may not be a sine-wave, and you are not sampling it at at constant rate, but can you at least rely on the driver completing period within the same time? What is the jitter there? – Cryo Jan 18 '22 at 00:20
Ordinarily the control signal isn't driven at a particular frequency. Data are usually accumulated from two tests: a slow voltage ramp and step function. Alternatively, we've also done single "stairstep" voltage tests. Forwards and backwards tests are run separately, trimmed, and combined. In these setups, timing accuracy for the driving signal hasn't been extremely crucial (I think it is likely about as accurate as the measurement timing). – user3716267 Jan 18 '22 at 00:28

score 5 · Answer 3 · answered Jan 18 '22 at 21:42

5

The differential equation can be rewritten

$$x_{t+1}=(\alpha+1)x_t+\beta\int^t u(x)\,\mathrm{d}x+\gamma t+C.$$

This eliminates many of the problematic aspects of the direct formulation and computing numerical derivatives, but it still leaves one untouched: because all the $x_t$ may be subject to measurement error, the explanatory variable $x_t$ on the right hand side might be subject to enough measurement error to bias the estimates.

There is a quick fix: fit the model using the data $(x_t, U_t, t, x_{t+1})$ (where $U(t) = \int_t u(x)\,\mathrm{d}x$) and then refit the model after replacing $x_t$ by the predicted values at those times.

Here, for example, is a dataset of $x_t$ represented by the $50$ dots on the right.

On the right, the derivative $\dot{x}_t$ is shown in blue and the second derivative $\ddot{x}_t$ in red.

Such data were independently generated $500$ times (with unit measurement variance), fit, and re-fit. To assess these fits, I plotted histograms of the parameter estimates.

In each histogram a vertical red line marks the true parameter value while the vertical dashed black line indicates the mean of the estimated values. In this example there is appreciable bias in evidence: the red lines are out in the tails of the distributions.

After refitting, the situation has improved, but at substantial cost in uncertainty: the estimates are much more spread out.

If you are limited to code for ordinary least squares solutions, as indicated in comments, you will have to pick your poison: precise biased estimates or much less precise unbiased estimates. Perhaps an initial study of simulated data will suggest how much bias and imprecision are present in your setting and help you choose.

answered Jan 18 '22 at 21:42

whuber

281,159
54
637
1,101

Unfortunately, after testing this seems poorly suited for our setup - the accumulators result in the regression very strongly weighting points that are highly-displaced from the start position, and it appears to badly misjudge $\beta$ in particular. – user3716267 Jan 18 '22 at 22:45
It sounds like you won't be able to make do with your OLS regression code, then. – whuber Jan 18 '22 at 23:45
I'll update the post with some actual analysis in a bit that compares a few different algorithms. – user3716267 Jan 19 '22 at 00:16
@whuber as always it would be nice to have also the underlying code attached. – Maximilian Jan 19 '22 at 17:22
1

@Max Because the OP has dismissed this solution as inapplicable to their situation, I don't want to belabor the point. The motivation for this post was to provide a helpful response to a request made by the OP in comments to the question. I will note, however, that it is fascinating to see such a stark example of a bias-variance tradeoff between two closely related (and arguably reasonable) statistical procedures in a relatively simple setting. That would be interesting to investigate further... . – whuber Jan 19 '22 at 17:32
Thanks for info @whuber – Maximilian Jan 19 '22 at 18:00
Definitely agree that the differences that have shown up here are worth investigating. For reference, if you want to play around with the stuff yourself, it is part of the [WPILib System Identification toolkit](https://github.com/wpilibsuite/sysid). The code's open-source, and I can provide some example data files. – user3716267 Jan 19 '22 at 18:12

Proper regression for determining correlations between derivatives of functions

3 Answers3