The Durbin-Watson statistic, used to detect autocorrelation in the error terms, ranges from 0 to 4. I am currently trying to show analytically why it cannot exceed 4. The lower bound is obvious from the statistic itself, which is
$$DW = \frac{\sum\limits_{t=2}^{T}(\hat{\epsilon}_t - \hat{\epsilon}_{t-1})^2}{\sum\limits_{t=1}^{T} \hat{\epsilon}_t^2};$$
but for the upper bound, i.e. $DW \leq 4$, I ran into one particular difficulty. Expanding the numerator, I get
$$\sum\limits_{t=2}^{T}(\hat{\epsilon}_t - \hat{\epsilon}_{t-1})^2 = \sum\limits_{t=2}^{T} \left(\hat{\epsilon}_t^2 - 2\hat{\epsilon}_t\hat{\epsilon}_{t-1} + \hat{\epsilon}_{t-1}^2\right) = \sum\limits_{t=2}^{T} \hat{\epsilon}_t^2 -2 \sum\limits_{t=2}^{T} \hat{\epsilon}_t\hat{\epsilon}_{t-1} + \sum\limits_{t=2}^{T} \hat{\epsilon}_{t-1}^2.$$
Clearly, the first and last terms are each less than or equal to $\sum\limits_{t=1}^{T}\hat{\epsilon}_t^2$, since each omits one non-negative term. Hence,
$$\sum\limits_{t=2}^{T} \hat{\epsilon}_t^2 + \sum\limits_{t=2}^{T} \hat{\epsilon}_{t-1}^2 \leq 2\sum\limits_{t=1}^{T} \hat{\epsilon}_t^2.$$
What I can't seem to figure out analytically is why necessarily
$$-2\sum\limits_{t=2}^{T}\hat{\epsilon}_t\hat{\epsilon}_{t-1} \leq 2 \sum\limits_{t=1}^{T} \hat{\epsilon}_t^2$$
must hold, i.e. why
$$0 \leq \sum\limits_{t=1}^{T} \hat{\epsilon}_t^2 + \sum\limits_{t=2}^{T} \hat{\epsilon}_t\hat{\epsilon}_{t-1},$$
given that the cross products $\hat{\epsilon}_t\hat{\epsilon}_{t-1}$ might be negative? Surely there is some inequality that yields the result, but which one?
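
For what it's worth, the claim can be probed numerically. The following is only a minimal sketch, not a proof: it uses simulated Gaussian draws as stand-ins for residuals (an assumption, since they do not come from an actual fitted regression), and the helper names `durbin_watson` and `cross_term_gap` are my own, not from any library. It tracks the largest DW value and the smallest value of $\sum_{t=1}^{T} \hat{\epsilon}_t^2 + \sum_{t=2}^{T} \hat{\epsilon}_t\hat{\epsilon}_{t-1}$ seen over many random sequences, and also evaluates a strictly alternating sequence, which should be close to the worst case.

```python
import random


def durbin_watson(resid):
    """Durbin-Watson statistic for a sequence of residuals."""
    num = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
    den = sum(e ** 2 for e in resid)
    return num / den


def cross_term_gap(resid):
    """sum_t e_t^2 + sum_{t>=2} e_t * e_{t-1}, the quantity in question."""
    squares = sum(e ** 2 for e in resid)
    cross = sum(resid[t] * resid[t - 1] for t in range(1, len(resid)))
    return squares + cross


random.seed(0)  # fixed seed so the experiment is repeatable
worst_dw = 0.0
smallest_gap = float("inf")
for _ in range(2000):
    T = random.randint(2, 30)
    # Simulated stand-ins for residuals (assumption: not from a real fit)
    resid = [random.gauss(0.0, 1.0) for _ in range(T)]
    worst_dw = max(worst_dw, durbin_watson(resid))
    smallest_gap = min(smallest_gap, cross_term_gap(resid))

# Alternating signs +1, -1, +1, ...: every consecutive product is -1
alternating = [(-1) ** t for t in range(50)]
dw_alt = durbin_watson(alternating)

print("largest DW over random trials:", worst_dw)
print("smallest gap over random trials:", smallest_gap)
print("DW for alternating sequence:", dw_alt)
```

For the alternating sequence with $T = 50$, the numerator is $49 \cdot 4 = 196$ and the denominator is $50$, so $DW = 3.92$: close to 4 but still below it, which is consistent with the bound I am trying to prove.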