
Suppose we have a function $f(x)$ that we can only observe through some noise. We cannot compute $f(x)$ directly, only $f(x) + \eta$, where $\eta$ is some random noise. (In practice, I compute $f(x)$ using some Monte Carlo method.)

What methods are available for finding roots of $f$, i.e. computing $x$ so that $f(x) = 0$?

I am looking for methods that minimize the number of evaluations of $f(x)+\eta$, as each evaluation is computationally expensive.

I am particularly interested in methods that generalize to multiple dimensions (i.e. solve $f(x,y) = 0, g(x,y) = 0$).

I'm also interested in methods that can make use of some information about the variance of $\eta$, as an estimate of this may be available when computing $f(x)$ using MCMC.
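For concreteness, here is a toy version of the setup (the particular $f$, the noise model, and all names below are purely illustrative): each evaluation returns a Monte Carlo average of $f(x)$ together with an estimate of the variance of that average.

```python
# Toy setup sketch (illustrative only): f(x) is observed as a Monte Carlo
# average, so each call returns f(x) + eta plus an estimate of var(eta).
import numpy as np

rng = np.random.default_rng(0)

def noisy_f(x, n_samples=1000):
    """Estimate f(x) by averaging noisy samples; return (estimate, variance of the estimate)."""
    # Hypothetical example: true f(x) = x**2 - 2, so the root is sqrt(2).
    samples = x**2 - 2 + rng.normal(size=n_samples)
    est = samples.mean()
    var_of_mean = samples.var(ddof=1) / n_samples
    return est, var_of_mean
```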

Szabolcs
  • I am not sure what are the right tags for this question, please help in re-tagging. – Szabolcs Apr 08 '14 at 23:22
  • To be fair, I found [Stochastic approximation](https://en.wikipedia.org/wiki/Stochastic_approximation), but very little *practical* information with examples or practical discussion of when it works well and when it doesn't. Most of the info is in academic papers which seem to require quite a bit of work to convert into a practical application. Another thing I found is the keyword [Likelihood-free estimation](http://arxiv.org/pdf/1001.2058.pdf), which solves a very similar problem and has more practical information available online. **Is there anything else?** References are welcome! (A minimal Robbins–Monro sketch follows these comments.) – Szabolcs Apr 09 '14 at 00:38
  • Interesting problem. I suppose all the gradient methods go out the window. – Aksakal Apr 09 '14 at 02:10
  • Also, in your case the problem is more difficult: you may control $\operatorname{var}[\eta]$ through MC. – Aksakal Apr 09 '14 at 02:19
  • **I'll add an extra 50 to Glen_b's bounty for a good answer.** – Szabolcs Apr 13 '14 at 20:35
  • @Aksakal: Gradient methods are not necessarily out of the window. If one uses a higher threshold for the minimum change in variables for the finite-difference gradients (usually it equates to zero), finite differences work "fine" (I am not saying this is a solution here, just that I have seen it in the past). – usεr11852 Apr 13 '14 at 20:51
  • @Szabolcs: Have you thought of gradient free methods like Sim. Anneal and/or Genetic Algorithms? It is a totally different train of thought in comparison to "gradient" methods but actually given you are computing $f(x)$ using MCMC methods you might be able to incorporate the optimization within the evaluation of $f(x)$ itself. – usεr11852 Apr 13 '14 at 20:54
  • @Szabolcs I don't think you'll be able to do 50. To do a second bounty you have to do more than the first one (in fact I think you might have to double it, but either way a second bounty would be 100). – Glen_b Apr 17 '14 at 16:20
  • @Glen_b Alright, if the references in the paper turn out to solve my problem, I'll give an extra 100. If they don't I'll offer 100 more for another solution ... I'm past the point where I care about reputation, I don't mind losing 100. – Szabolcs Apr 17 '14 at 17:19
  • This is not simply a stochastic root-finding problem; it is also a stochastic control problem, because by controlling $\eta$ you may control the speed of convergence of the root-finding part. Imagine that in the beginning you go for a less precise execution, which is faster, in order to quickly figure out the area of the root search; then you switch to more precise but slower MCMC to find the exact solution. – Aksakal Apr 17 '14 at 21:29
  • I think for these kinds of problems you can use stochastic gradient descent, see e.g. http://finzi.psych.upenn.edu/R/library/sgd/html/sgd.html – Tom Wenseleers Nov 15 '18 at 14:28
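To make the stochastic-approximation idea from these comments concrete, here is a minimal Robbins–Monro sketch (a toy illustration, not a vetted implementation). It assumes $f$ is increasing near the root, uses the classic step sizes $a_n = c/n$, and follows the precision-control idea above by spending few Monte Carlo samples early and more later; it expects an evaluator like the toy `noisy_f` above.

```python
# Minimal Robbins-Monro iteration (illustrative only).  Assumes f is increasing,
# so stepping against the sign of the noisy estimate moves toward the root.
def robbins_monro(noisy_f, x0, n_iter=200, c=1.0):
    x = x0
    for n in range(1, n_iter + 1):
        # Coarse-to-fine precision: few Monte Carlo samples early, more later.
        est, _ = noisy_f(x, n_samples=100 * n)
        x = x - (c / n) * est
    return x

# Example with the toy noisy_f above (true root sqrt(2) ~ 1.414):
# print(robbins_monro(noisy_f, x0=1.0))
```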

1 Answer


You might find the following references useful:

Pasupathy, R. and Kim, S. (2011) The stochastic root-finding problem: Overview, solutions, and open questions. *ACM Transactions on Modeling and Computer Simulation*, 21(3). [DOI] [preprint]

Waeber, R. (2013) Probabilistic Bisection Search for Stochastic Root-Finding. Ph.D. dissertation, Cornell University, Ithaca. [pdf]
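To give a flavor of the idea behind the second reference, here is a toy probabilistic bisection sketch (my own illustration, not code from either reference). It makes strong simplifying assumptions: $f$ is increasing on the search interval with a single root inside, and each noisy sign observation is correct with a known probability $p > 1/2$; Waeber's dissertation treats the harder case where this probability is unknown and depends on the query point.

```python
# Toy probabilistic bisection sketch (illustrative only).  Maintain a discretized
# posterior over the root location, query the noisy sign at the posterior median,
# and reweight the two halves by p and 1 - p.
import numpy as np

def probabilistic_bisection(noisy_sign, lo, hi, p=0.7, n_iter=200, grid=10_000):
    xs = np.linspace(lo, hi, grid)
    post = np.full(grid, 1.0 / grid)             # uniform prior over root location
    for _ in range(n_iter):
        x = xs[np.searchsorted(np.cumsum(post), 0.5)]   # posterior median
        s = noisy_sign(x)                        # +1 if noisy f(x) > 0, else -1
        left = xs <= x                           # f increasing: s > 0 suggests the root is left of x
        if s > 0:
            post = np.where(left, p * post, (1 - p) * post)
        else:
            post = np.where(left, (1 - p) * post, p * post)
        post /= post.sum()
    return xs[np.searchsorted(np.cumsum(post), 0.5)]

# Example with a noisy sign oracle built from a Monte Carlo estimate of f:
# root = probabilistic_bisection(lambda x: 1 if noisy_f(x)[0] > 0 else -1, 0.0, 3.0)
```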

QuantIbex
  • (+1) Having a question answered with a dissertation citation from 2013 is pretty awesome. – Sycorax Apr 13 '14 at 21:30
  • This one's google-fu is strong – bdeonovic Apr 14 '14 at 01:17
  • The first paper you cite is useful, but it should be noted that there's still quite a bit of work needed to put the methods into practice. – Szabolcs May 13 '14 at 14:16
  • It would be really nice if someone who went through the methods could give an estimate of how much work it takes to go from the paper to the simplest implementation. I took a glance at the first paper and it seems quite dense. – Heberto Mayorquin Aug 17 '18 at 09:55