5

I want to estimate the parameters that minimize a multivariate function with noise based on realized observations.

As a simple example, say I have observed the univariate sample below:

[Figure: scatter plot of the observed univariate sample]

The samples were generated from a quadratic function plus noise, but in general the function will be unknown, and I don't want to make any assumptions about it other than that it is "smooth". In this case, the sample happens to attain its minimum at $x \approx 5.8$, but I want to make a statistical argument that the underlying function is minimized at $x=5$.

How should I approach this? Should I fit some kind of nonparametric model to the sample data and then optimize the fitted, smoothed values? Or are there algorithms that tackle the optimization of noisy functions directly?

user2303
  • A starting point may be to simply regress $y$ on $x$ and $x^2$, including a constant. You could then apply robust regression techniques to this approach if necessary. – John Oct 03 '12 at 18:38
  • The data in the question was only an example. I am looking for a general method to minimize a function without making model assumptions about the relationship between the dependent and the independent variables. – user2303 Oct 03 '12 at 18:46
  • You might be interested in the approach described by [Jones *et al.* (1998)](http://www.ressources-actuarielles.net/EXT/ISFA/1226.nsf/9c8e3fd4d8874d60c1257052003eced6/f84f7ac703bf5862c12576d8002f5259/$FILE/Jones98.pdf). Although it focuses on functions that are expensive to evaluate, its approach is the right one in general because it aims to find the optimal point rather than to achieve some kind of good average fit overall. – whuber Oct 04 '12 at 17:27

2 Answers

4

If you truly have a smooth function observed with noise, there are many ways to fit it. Parametric models, linear and nonlinear, are possibilities, but from how you ask the question it sounds like you are looking for some sort of nonparametric regression; splines and loess are options. In the end, though, you are searching for a minimum. For parametric models, local and global minima can be estimated and confidence intervals constructed for them. In the nonparametric situation, Bill Huber described a procedure in his answer to a similar question on CV involving time series data, and I think the idea can apply to your problem as well.
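
The "fit a smoother, then minimize the fitted values" idea can be sketched as below. A Nadaraya-Watson kernel smoother stands in for loess here just to keep the example dependency-free; the data, bandwidth, and grid are all illustrative choices, simulating the question's setup (quadratic with true minimum at $x=5$ plus noise):

```python
# Minimal sketch: fit a nonparametric smoother to noisy (x, y) samples,
# then take the grid point minimizing the smoothed values.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 200)
y = (x - 5.0) ** 2 + rng.normal(scale=2.0, size=x.size)  # true minimum at x = 5

def kernel_smooth(x_obs, y_obs, x_grid, bandwidth):
    """Gaussian-kernel local average of y_obs, evaluated on x_grid."""
    w = np.exp(-0.5 * ((x_grid[:, None] - x_obs[None, :]) / bandwidth) ** 2)
    return (w @ y_obs) / w.sum(axis=1)

grid = np.linspace(0.0, 10.0, 501)
fitted = kernel_smooth(x, y, grid, bandwidth=0.5)  # bandwidth is a tuning choice
x_hat = grid[np.argmin(fitted)]
print("estimated minimizer:", x_hat)
```

The bandwidth trades bias against variance: too small and the smoother chases the noise, creating spurious local minima; too large and it flattens the true curvature.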

But the way you pose the problem is strange and somewhat contradictory. If you have multivariate data with noise and no way to filter the noise out perfectly, you cannot know exactly where the minimum is. No matter what fitting technique you use, you can only estimate the point where the minimum occurs. If your estimate is 5.8, a confidence interval could, in a statistical sense, rule out 5 given sufficient data, but it could never prove that the minimum is at 5, and your data really seem to be telling you otherwise. Maybe there is an assumption you are making that you haven't told us about.

Now that you have edited your question and the picture is clear, I see that since the data are simulated you know the truth: the true model is a quadratic function with a U shape and hence a unique global minimum at $x=5$. Even a statistical technique that assumes a quadratic model will, with that much noise, give you an estimate different from 5. Nonparametric models that make less restrictive assumptions will probably do worse.
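
A quick simulation (sample size and noise level are made-up illustrative values) shows the point: even fitting the *correct* quadratic model to noisy draws recovers a vertex close to, but not exactly, 5:

```python
# Sketch: least-squares fit of the true quadratic form to noisy data;
# the fitted parabola's vertex estimates the minimizer.
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0.0, 10.0, 50)
y = (x - 5.0) ** 2 + rng.normal(scale=2.0, size=x.size)  # true minimum at x = 5

b2, b1, b0 = np.polyfit(x, y, deg=2)  # fit y = b2*x^2 + b1*x + b0
x_hat = -b1 / (2.0 * b2)              # vertex (minimizer) of the fitted parabola
print("estimated minimizer:", x_hat)  # near 5, but not exactly 5
```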

For real data, as I said, you can't perfectly separate the signal from the noise, and you will never get an "exactly right" answer. There may be estimators of the minimum that are consistent in a probabilistic sense, but none will be exact for any finite sample size.

Michael R. Chernick
  • Thanks for your answer. I do want to use a nonparametric model and not make any model assumptions other than its smoothness. So if I understand correctly, you are suggesting just fitting with something like LOESS and minimizing the fitted values? – user2303 Oct 03 '12 at 18:56
  • That is one way. Huber's suggestion was something like that, but I don't remember all the details. For the time series problem we were actually looking for a minimum using seasonal data, and I recommended harmonic regression and pointed out some advantages. At the time I hadn't read Bill's answer thoroughly and didn't contrast the advantages and disadvantages of each properly. His approach was a little different because of the time series context, and amounted to being efficient because it went through the series successively, keeping track of the current global minimum. – Michael R. Chernick Oct 03 '12 at 19:06
2

I also believe that Michael's idea about lo(w)ess smoothing is probably the obvious answer, that is, if you feel confident enough to make assumptions about the smoothness of your underlying function. Nevertheless, your "smoothness with noise" optimization problem can still get tricky and send a gradient-based solver into local optima, even on the loess-smoothed function.

If the objective function is low-dimensional, has a "low" evaluation cost, and you don't want to assume any smoothness properties, you might want to use simulated annealing (practically a random search) to get good parameter estimates. It is dead easy to implement anyway, and you save yourself the worries about "smoothness".
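
To show how little code it takes, here is a hand-rolled simulated-annealing sketch on a noisy 1-D toy objective (true minimum at 5). Every tuning choice below, the proposal scale, cooling schedule, and iteration count, is illustrative, not a prescription:

```python
# Simulated annealing on a noisy objective: random-walk proposals, accept
# worse moves with Boltzmann probability under a linear cooling schedule.
import numpy as np

rng = np.random.default_rng(2)

def noisy_objective(x):
    """Quadratic with minimum at x = 5, observed through Gaussian noise."""
    return (x - 5.0) ** 2 + rng.normal(scale=2.0)

def anneal(f, x0, n_iter=4000, step=0.5, t0=5.0):
    x, fx = x0, f(x0)
    trace = []
    for i in range(n_iter):
        t = t0 * (1.0 - i / n_iter) + 1e-9  # linear cooling schedule
        cand = x + rng.normal(scale=step)   # random-walk proposal
        fc = f(cand)
        # accept improvements always; accept worse moves with Boltzmann prob.
        if fc < fx or rng.random() < np.exp(min(0.0, (fx - fc) / t)):
            x, fx = cand, fc
        trace.append(x)
    # with a noisy objective, averaging the cold tail of the chain is more
    # stable than trusting the single best (possibly just lucky) evaluation
    return float(np.mean(trace[-n_iter // 5:]))

x_hat = anneal(noisy_objective, x0=0.0)
print("estimated minimizer:", x_hat)
```

One caveat with noisy objectives: the incumbent value `fx` can be a lucky low noise draw that freezes the chain, so periodically re-evaluating the incumbent is a common extra guard.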

(Sorry I just saw you accepted the previous answer!)

usεr11852
  • Good points! @user11852. I was thinking about response surface methodology as part of my answer but left it off, because maybe it is more useful for designing experiments than for searching for global extremes of a function. But the idea of a locally quadratic function to search for extremes might make some sense. As you say, though, with noisy data and no real idea of a functional form, it can look like a function with several local minima, and search algorithms can trap you at a local minimum. Simulated annealing is certainly one approach that can pull you out of a local minimum. – Michael R. Chernick Oct 03 '12 at 19:58