Suppose I want to fit a function to measured data by minimizing the chi^2. "It is known" that if the function has too many correlated parameters, the fit will not only have very large expected statistical errors on the fitted parameters, but will often fail outright (e.g. get stuck in a minimum more than 10 standard errors away from the "true" minimum). Here's my question: how can the "robustness" of a function to be fitted be defined and/or studied?

Example. Suppose my data are described by the function y=a*x+b*sin(x), where x is in the range [-0.1,0.1] and a,b ~1. I take 6 measurements in this range. I can compute the expected correlation matrix, and the correlation between a and b is something like 0.9xxx. However, if the errors on the measured y are 1e-7, I can fit this function without any problems. If the errors on y are about 1e-2, there is no possibility of converging, and the actual chi^2 minimum will typically be much farther away than the expected error.
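To make the example concrete, here is a minimal sketch of the expected covariance calculation. It assumes 6 equally spaced points and the same uncorrelated error sigma on every y; the helper names `expected_covariance` and `correlation` are hypothetical, chosen only for this illustration.

```python
import numpy as np

def expected_covariance(x, sigma):
    """Expected covariance of (a, b) for y = a*x + b*sin(x), equal errors sigma."""
    # The model is linear in a and b, so the Jacobian does not depend on them.
    J = np.column_stack([x, np.sin(x)])
    # chi^2 = |y - J p|^2 / sigma^2, hence cov = sigma^2 * (J^T J)^{-1}.
    return sigma**2 * np.linalg.inv(J.T @ J)

def correlation(cov):
    """Convert a covariance matrix into a correlation matrix."""
    err = np.sqrt(np.diag(cov))
    return cov / np.outer(err, err)

x = np.linspace(-0.1, 0.1, 6)  # assumption: 6 equally spaced measurements
for sigma in (1e-7, 1e-2):
    cov = expected_covariance(x, sigma)
    err_a, err_b = np.sqrt(np.diag(cov))
    rho = correlation(cov)[0, 1]
    print(f"sigma={sigma:g}: err_a={err_a:.3g}, err_b={err_b:.3g}, rho={rho:.9f}")
```

Under these assumptions the correlation coefficient does not depend on sigma at all, while the expected errors scale linearly with it, so the correlation alone cannot tell the 1e-7 case from the 1e-2 case.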

How can I study a-priori whether a function to be fitted is "robust" or not?

user26067
  • One way is to [look at the objective function](http://stats.stackexchange.com/questions/7308/can-the-empirical-hessian-of-an-m-estimator-be-indefinite/7629#7629) (which is usually a negative log likelihood or a measure of goodness of fit). – whuber May 24 '13 at 15:08
  • That's not exactly the point. I was talking about determining the fittability a priori. I've stated that I'm using chi^2. The usual assumptions hold, therefore it is the negative log-likelihood. I can compute the expected value of the Hessian (and therefore the expected covariance matrix). But how can I say a priori whether this function can be robustly fitted or not? As a rule of thumb, if there are high correlations (>0.85? >0.95? >0.99?) it's "probably" bad, but not necessarily. It also depends on the error magnitude – user26067 May 24 '13 at 15:26
  • You can determine the "fittability" *a priori* by studying the chi-squared function. You will immediately see it depends on the data, so at that point you have to make assumptions about what data values you will get: but it's still a prior evaluation. For instance, in your example of a linear+sine term, if you know the data will cover a wider range you can find the fitting is robust. Thus there is no single omnibus data-independent answer to your question. – whuber May 24 '13 at 16:22
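The dependence on the x range mentioned in the last comment can be explored with a small scan. This is only a sketch: it assumes 6 equally spaced points on a symmetric interval, and the helpers `design_matrix` and `error_per_unit_sigma` are hypothetical names introduced here, not part of any library.

```python
import numpy as np

def design_matrix(half_width, n_points=6):
    """Columns x and sin(x) for equally spaced x in [-half_width, half_width]."""
    x = np.linspace(-half_width, half_width, n_points)
    return np.column_stack([x, np.sin(x)])

def error_per_unit_sigma(half_width, n_points=6):
    """Expected errors on (a, b) divided by the measurement error sigma."""
    J = design_matrix(half_width, n_points)
    return np.sqrt(np.diag(np.linalg.inv(J.T @ J)))

for half_width in (0.1, 1.0, 3.0):
    # A large condition number means x and sin(x) are nearly indistinguishable.
    cond = np.linalg.cond(design_matrix(half_width))
    err_a, err_b = error_per_unit_sigma(half_width)
    print(f"half_width={half_width}: cond={cond:.3g}, "
          f"err_a/sigma={err_a:.3g}, err_b/sigma={err_b:.3g}")
```

Multiplying the printed ratios by the actual sigma gives the expected parameter errors for each range, which can then be compared with the parameter values a,b ~1.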

0 Answers