As with many techniques in machine learning, a good deal of heuristics is involved, and this is perhaps no different. What follows is my interpretation of what's going on.
Using the notation $CPV = \text{CurrentPValue}$, $T = \text{Threshold}$, and $s = \text{score}$, we can reorganize $CPV(i) < T$ into
\begin{align*}
\max_{i_1 \le i_2 \le i}[0.99^{i - i_1} (s(i_2) - s(i_1))] < \underbrace{(-2\log T)}_{\text{Another Threshold}}\left([\max_{j \le i}s(j)] - s(i)\right) \tag{$\spadesuit$}
\end{align*}
The original formulation using the threshold $T$ instead of $\text{Another Threshold}$ is nice because $T$ can be set to something between 0 and 1, in line with what we might interpret as a p-value. The reorganization, however, makes it clearer what is going on.
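To get a feel for the substitution, here is a tiny sketch of the mapping from a p-value-style $T \in (0, 1)$ to the reorganized threshold $-2\log T$ (the helper name `another_threshold` is mine, not from the original). Note the direction: a stricter p-value (smaller $T$) yields a *larger* threshold on the RHS of ($\spadesuit$).

```python
import math

def another_threshold(T: float) -> float:
    """Map a p-value-style threshold T in (0, 1) to the
    reorganized threshold -2 * log(T) on the RHS of (spadesuit)."""
    return -2.0 * math.log(T)

print(another_threshold(0.5))   # ≈ 1.386
print(another_threshold(0.05))  # ≈ 5.991
print(another_threshold(0.01))  # ≈ 9.210
```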
I interpret the LHS of ($\spadesuit$) as a "near-current" maximum spread of our max metric $s(i)$. Without the $0.99^{i - i_1}$ factor, it would be the maximum spread of our max metric, which gives weight to earlier iterations for which our fit was still nascent. Over time, the LHS forgives these early misdemeanors, 1% per iteration, but of course, if the earlier crimes were egregiously bad (the initial parameters were way off), it will take time to forgive them. The 0.99 should be another adjustable tuning parameter in the model, something akin to a clemency factor.
Continuing with this crime analogy, I interpret the RHS of ($\spadesuit$) as information due to new crimes. As in, if $s(i)$ hits a new max, then the RHS is 0, and it will take more time to recover from this. How heavily this new crime weighs on the thoughts of the authorities (the overfitting detector $\text{IncToDec}$) is determined by $\text{Another Threshold}$, and how long old crimes linger with the authorities is determined by the 0.99.
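Putting the two sides together, here is a naive (quadratic-time) sketch of the reorganized check ($\spadesuit$). The function names, the `decay` parameter (the 0.99 "clemency factor"), and the list `scores` standing in for the sequence $s(0), \dots, s(i)$ are my own conventions, not from the original; this is just a direct transcription of the inequality, equivalent to testing $CPV(i) < T$.

```python
import math

def lhs_spread(scores, decay=0.99):
    """LHS of (spadesuit): the decayed maximum spread
    max over i1 <= i2 <= i of decay**(i - i1) * (s[i2] - s[i1])."""
    i = len(scores) - 1
    best = float("-inf")
    for i1 in range(i + 1):
        for i2 in range(i1, i + 1):
            best = max(best, decay ** (i - i1) * (scores[i2] - scores[i1]))
    return best

def rhs_allowance(scores, threshold=0.05):
    """RHS of (spadesuit): (-2 log T) times the gap between the
    running max of s and the current s(i); zero at a new max."""
    return -2.0 * math.log(threshold) * (max(scores) - scores[-1])

def criterion_holds(scores, threshold=0.05, decay=0.99):
    """True when (spadesuit) holds, i.e. when CPV(i) < T."""
    return lhs_spread(scores, decay) < rhs_allowance(scores, threshold)

# When the current score sits below the running max, there is slack:
print(criterion_holds([1.0, 0.5, 0.8]))  # True
# When s(i) hits a new max, the RHS collapses to 0 and the check fails:
print(criterion_holds([0.1, 0.2, 0.3]))  # False
```

The second call illustrates the "new crime" reading above: a fresh maximum zeroes out the RHS, so any positive spread on the LHS trips the detector until the decay forgives it.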