
My understanding of expected improvement (EI) is that the next point $x$ to evaluate is chosen as $$\arg\max_{x}\,\mathbb{E}[f(x) - f^{\max}],$$

where $f(x)$ denotes the Gaussian process posterior distribution at location $x$ and $f^{\max}$ denotes the current best (maximum) value found so far.

Since $f(x)$ is random and $f^{\max}$ is fixed, I thought the above could be reduced to $$\begin{aligned} EI(x) &= \arg\max_x \mathbb{E}[f(x)] - f^{\max}\\ &= \arg\max_x \mu(x) - f^{\max}. \end{aligned}$$

$\mu(x)$ is the posterior mean of the Gaussian process, which can be derived analytically. However, in this other Stack Exchange post, the EI is derived in a more involved way, and I do not really understand the derivation behind it.

Is my understanding of EI as described above wrong, or am I missing something?

calveeen
I realised my error. The expectation is taken over $\max(f(x) - f^*, 0)$, which accounts for the fact that if a sample drawn from $f(x)$ is lower than the current optimum, the improvement for that sample is 0 rather than negative. – calveeen Dec 28 '20 at 15:11
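
For concreteness, here is a small numerical sketch of the difference between the two quantities; the posterior mean, standard deviation, and incumbent value below are made-up illustrative numbers, not from the original post:

```python
import numpy as np

# Hypothetical GP posterior at a single candidate point x: f(x) ~ N(mu, sigma^2)
mu, sigma = 0.4, 1.0
f_max = 0.5  # current best observed value (illustrative)

rng = np.random.default_rng(0)
samples = rng.normal(mu, sigma, size=1_000_000)  # draws from the posterior f(x)

naive = samples.mean() - f_max                   # E[f(x)] - f^max  (can be negative)
ei_mc = np.maximum(samples - f_max, 0.0).mean()  # E[max(f(x) - f^max, 0)]  (always >= 0)

print(f"E[f(x)] - f^max  = {naive:.3f}")   # roughly -0.10
print(f"EI (Monte Carlo) = {ei_mc:.3f}")   # roughly  0.35
```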

1 Answer


As you have already addressed in your comment, the EI acquisition function is

\begin{equation} \begin{array}{lll} a_\text{EI}(\mathbf{x}) &=& \mathbb{E}[\max\left(y, f(\mathbf{x}_\text{best})\right) - f(\mathbf{x}_\text{best})] \\ &=& \mathbb{E}[\max(y - f(\mathbf{x}_\text{best}), 0)] \\ &=& \mathbb{E}[(y - f(\mathbf{x}_\text{best}))^+] \\ &=& \sigma(\mathbf{x}) \left( \gamma(\mathbf{x}) \Phi(\gamma(\mathbf{x})) + \phi(\gamma(\mathbf{x})) \right), \end{array} \end{equation} where $\gamma(\mathbf{x}) = \frac{\mu(\mathbf{x}) - f(\mathbf{x}_{\text{best}})}{\sigma(\mathbf{x})}$, and $\phi(\cdot)$ and $\Phi(\cdot)$ are the PDF and CDF of the standard normal distribution, respectively.
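
To make the closed form concrete, here is a minimal sketch that evaluates the expression above and checks it against a Monte Carlo estimate of $\mathbb{E}[(y - f(\mathbf{x}_\text{best}))^+]$; the values of $\mu(\mathbf{x})$, $\sigma(\mathbf{x})$, and $f(\mathbf{x}_\text{best})$ are arbitrary illustrative numbers:

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, f_best):
    """Closed-form EI for a Gaussian posterior N(mu, sigma^2) at a single point."""
    gamma = (mu - f_best) / sigma
    return sigma * (gamma * norm.cdf(gamma) + norm.pdf(gamma))

mu, sigma, f_best = 0.4, 1.0, 0.5
print(expected_improvement(mu, sigma, f_best))  # ~0.351

# Monte Carlo estimate of E[(y - f_best)^+] with y ~ N(mu, sigma^2)
rng = np.random.default_rng(1)
y = rng.normal(mu, sigma, size=1_000_000)
print(np.maximum(y - f_best, 0.0).mean())       # should agree to ~3 decimal places
```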

kensaii