I have a model $f$ that predicts human performance on a simple perceptual task (performance is quantified as $Y$) as a function of some information about the stimuli ($X$) and parameters $\theta$. The model is nonlinear and deterministic (edit: by deterministic I mean that $Y$ is fully determined given $X$ and $\theta$):
$Y = f(X,\theta)$
I would like to infer the parameters of this model from a dataset I have collected, $(x_1,y_1),\ldots,(x_n,y_n)$. I can think of two ways to do this:
1) Pick the parameters that minimize the sum of squared errors between the predicted and measured performance scores, $\sum_{i=1}^n(\hat y_i-y_i)^2$ (equivalently, the mean squared error).
2) Add a noise term to the model, $Y = f(X,\theta) + N(0,\sigma)$, and maximize the log likelihood of $\theta$. My logic is that with the noise term added, the log likelihood can be computed as the sum of the log probabilities of the residuals $(\hat y_i-y_i)$ under $N(0,\sigma)$. (Both approaches are sketched in code after this list.)
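For concreteness, here is a minimal sketch of what I mean by the two approaches. Everything below is hypothetical: the particular $f$, the simulated data, and the starting values are just stand-ins for my actual model and measurements.

```python
# Sketch of the two fitting approaches (hypothetical model and data).
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(0)

def f(x, theta):
    # Hypothetical nonlinear, deterministic model: a saturating function of the stimulus.
    a, b = theta
    return a * (1.0 - np.exp(-b * x))

# Hypothetical data standing in for (x_1, y_1) ... (x_n, y_n).
x = np.linspace(0.1, 5.0, 50)
y = f(x, (0.9, 1.2)) + rng.normal(0.0, 0.05, size=x.size)

# Approach 1: minimize the sum of squared errors.
def sse(theta):
    return np.sum((f(x, theta) - y) ** 2)

fit_lsq = minimize(sse, x0=[0.5, 0.5], method="Nelder-Mead")

# Approach 2: maximize the Gaussian log likelihood of the residuals,
# with sigma treated as a free parameter (optimized on a log scale to keep it positive).
def negloglik(params):
    theta, log_sigma = params[:-1], params[-1]
    resid = y - f(x, theta)
    return -np.sum(norm.logpdf(resid, loc=0.0, scale=np.exp(log_sigma)))

fit_mle = minimize(negloglik, x0=[0.5, 0.5, np.log(0.1)], method="Nelder-Mead")

print("least squares theta:", fit_lsq.x)
print("MLE theta, sigma:   ", fit_mle.x[:-1], np.exp(fit_mle.x[-1]))
```

In the second approach I have simply let the optimizer fit $\sigma$ alongside $\theta$, which is exactly the choice my third question below is about.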
I have three questions:
1) Are either of these methods clearly incorrect?
2) Is one of these methods more correct than the other?
3) If maximizing the log likelihood is the way to go, what is the best way to choose $\sigma$? My intuition is that this parameter shouldn't be fit.
A similar topic has come up before, but it didn't address exactly this question.