avoiding model overfitting when fitting parameters/models to an ordinary differential equation

Question

I am working on fitting an ODE model to some data. I have a vector of time series data $\textbf{x} = [x_1, x_2, \dots, x_N]$ and an ODE model $\dot{x} = f(x, \theta)$, where $\theta$ is a vector of parameters. I have defined a simple squared loss function and can include additional regularizers, depending on convergence speed, etc. When I integrate the ODE model, I obtain a function $F(x, \theta)$ which ideally would resemble the data.

So the loss function looks something like:

$$ \mathcal{L}(x, \theta) = \sum_{i=1}^N (x_{i} - F(x_i, \theta))^2 $$
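For concreteness, here is a minimal sketch of how such a loss might be evaluated numerically. Everything specific in it is an assumption made for illustration: the logistic right-hand side `f`, the parameter values, the initial condition `x0`, and the observation times `t_obs` are hypothetical, and $F$ is read as the integrated trajectory evaluated at the observation times.

```python
import numpy as np
from scipy.integrate import solve_ivp


def f(t, x, theta):
    """Hypothetical ODE right-hand side: logistic growth with theta = (r, K)."""
    r, K = theta
    return r * x * (1.0 - x / K)


def trajectory(theta, t_obs, x0):
    """Integrate the ODE and return the model state at the observation times."""
    sol = solve_ivp(
        f,
        (t_obs[0], t_obs[-1]),
        [x0],
        t_eval=t_obs,
        args=(theta,),
        rtol=1e-8,
        atol=1e-10,
    )
    return sol.y[0]


def squared_loss(theta, t_obs, x_obs, x0):
    """Sum of squared differences between the data and the integrated model."""
    residuals = x_obs - trajectory(theta, t_obs, x0)
    return float(np.sum(residuals ** 2))


# Synthetic, noise-free "data" generated from the model itself (illustration only).
t_obs = np.linspace(0.0, 10.0, 21)
theta_true = (0.8, 5.0)
x_obs = trajectory(theta_true, t_obs, x0=0.5)

loss_true = squared_loss(theta_true, t_obs, x_obs, x0=0.5)
loss_wrong = squared_loss((0.4, 5.0), t_obs, x_obs, x0=0.5)
```

Because the synthetic data come from the same solver call, `loss_true` is zero, while a mis-specified growth rate gives a strictly larger loss. In a real fit, `squared_loss` would be handed to an optimizer such as `scipy.optimize.minimize`.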

Now, I have a few different variations on the ODE model that could work, and I want to understand the right criteria to use for model selection. I come from statistics, where we generally use something like AIC or BIC to measure goodness of fit penalized by model complexity (meaning the number of parameters). Of course, AIC and BIC use a likelihood function instead of a simple loss function.

Hence I was wondering what the equivalent criterion to AIC/BIC would be for fitting an ODE to some data. Can I just use the AIC or BIC criterion, but with the loss function in place of the likelihood function? Or are there other concerns that I might not have accounted for?
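One way to make the link between the squared loss and a likelihood explicit is to assume i.i.d. Gaussian observation errors with unknown variance. Under that assumption the maximized log-likelihood depends on the data only through the sum of squared errors (SSE), so AIC and BIC can be computed from the minimized loss. The sketch below relies on that assumption, which may not hold for autocorrelated time series residuals or when regularizers are added to the loss. The function names and the SSE values are hypothetical and chosen only for illustration.

```python
import math


def gaussian_log_likelihood(sse, n):
    """Maximized log-likelihood under i.i.d. Gaussian errors.

    The error variance is replaced by its maximum likelihood estimate sse / n.
    """
    sigma2_hat = sse / n
    return -0.5 * n * (math.log(2.0 * math.pi * sigma2_hat) + 1.0)


def aic(sse, n, n_params):
    """AIC = 2k - 2 log L, counting the error variance as one extra parameter."""
    k = n_params + 1
    return 2.0 * k - 2.0 * gaussian_log_likelihood(sse, n)


def bic(sse, n, n_params):
    """BIC = k log(n) - 2 log L, with the same parameter count as aic()."""
    k = n_params + 1
    return k * math.log(n) - 2.0 * gaussian_log_likelihood(sse, n)


# Made-up numbers: model B fits slightly better but uses two more parameters.
n_obs = 50
aic_a = aic(sse=4.0, n=n_obs, n_params=2)
aic_b = aic(sse=3.9, n=n_obs, n_params=4)
bic_a = bic(sse=4.0, n=n_obs, n_params=2)
bic_b = bic(sse=3.9, n=n_obs, n_params=4)
```

With these made-up numbers, the small drop in SSE does not offset the penalty for the two extra parameters, so both criteria prefer the smaller model. Only differences in AIC or BIC between models fitted to the same data are meaningful, not the absolute values.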

Any suggestions would be helpful.

I do not see anything in your description that varies at all from a standard regression framework: it is a [nonlinear model with additive errors](https://stats.stackexchange.com/a/148713/919). What do you perceive to be the difference or complication? — whuber, Feb 11 '22 at 20:09
@whuber, It is a good point. Yeah, I guess there were only two questions. First, the ODE model can be nonlinear. And second the ODE model is going to assume continuous time--even though the observations are captured at discrete time increments. Sounds like you are saying that I could use a criterion like AIC for model selection still, just use the loss function instead of a likelihood. — krishnab, Feb 11 '22 at 21:24
I guess part of my question is that I have not really seen any discussion of these model selection issues in reading applied math papers. I have not seen anyone discuss things like using AIC/BIC when developing ODE models. Coming from a stats background, these questions are natural to me, but I was confused when I did not see this discussion in applied math papers. — krishnab, Feb 11 '22 at 21:26
