This bound can be derived as follows. Assume first that the value function estimate $\hat{V}^0$ is initialized to zero for all states, where the superscript indicates the iteration at which the estimate is considered. Initially, we then have
$$\max_{s\in\mathcal{S}}|V(s)-\hat{V}^0(s)|\le\sum\limits_{t=0}^{\infty}\lambda^tR_{\rm{max}}=\frac{R_{\rm{max}}}{1-\lambda}.$$
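For instance, with $R_{\rm{max}}=1$ and $\lambda=0.9$, this initial error can be as large as $\frac{1}{1-0.9}=10$.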
In subsequent iterations it holds that
$$e^{k+1}\triangleq\max_{s\in\mathcal{S}}|V(s)-\hat{V}^{k+1}(s)|=\max_{s\in\mathcal{S}}|r(s)+\lambda V'(s)-r(s)-\lambda\hat{V}'^k(s)|\\=\lambda\max_{s\in\mathcal{S}}|V'(s)-\hat{V}'^k(s)|\le\lambda e^k,$$
where I assumed a deterministic policy and deterministic transition dynamics; here $V'(s)$ denotes the value of the state that follows $s$, and $\hat{V}'^k(s)$ is defined analogously. The final inequality holds because the maximum over successor states can be no larger than the maximum over all states. Unrolling the recursion gives $e^N\le\lambda^N e^0\le\lambda^N\frac{R_{\rm{max}}}{1-\lambda}$. If we now want $e^N\le\varepsilon$, it suffices that $\lambda^N\frac{R_{\rm{max}}}{1-\lambda}\le\varepsilon$, i.e. $\lambda^N\le\frac{\varepsilon(1-\lambda)}{R_{\rm{max}}}$. Taking logarithms yields $N\log{\lambda}\le\log{\frac{\varepsilon(1-\lambda)}{R_{\rm{max}}}}=-\log{\frac{R_{\rm{max}}}{\varepsilon(1-\lambda)}}$, and since $\log{\lambda}=-\log{\frac{1}{\lambda}}<0$, dividing by it flips the inequality. And finally $N\ge\frac{\log{\frac{R_{\rm{max}}}{\varepsilon(1-\lambda)}}}{\log{\frac{1}{\lambda}}}$.
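If it helps, here is a minimal numerical sketch of this bound, assuming a toy deterministic MDP of my own choosing (states on a cycle with a fixed policy and rewards bounded by $R_{\rm{max}}$); none of the specific names or constants (`n`, `lam`, `eps`, `R_max`) come from your question:

```python
# Minimal check of the iteration bound on a toy deterministic MDP (my assumption):
# n states on a cycle, deterministic policy s -> (s+1) % n, rewards in [0, R_max].
import numpy as np

n, lam, eps, R_max = 5, 0.9, 1e-3, 1.0      # illustrative choices, not from the question
rng = np.random.default_rng(0)
r = rng.uniform(0.0, R_max, size=n)          # per-state rewards, bounded by R_max
nxt = (np.arange(n) + 1) % n                 # deterministic successor of each state

# Exact value function: V = r + lam * V[nxt]  <=>  (I - lam * P) V = r
P = np.zeros((n, n))
P[np.arange(n), nxt] = 1.0
V = np.linalg.solve(np.eye(n) - lam * P, r)

# Number of sweeps predicted by the derivation above
N = int(np.ceil(np.log(R_max / (eps * (1 - lam))) / np.log(1 / lam)))

V_hat = np.zeros(n)                          # \hat{V}^0 = 0 for all states
for _ in range(N):
    V_hat = r + lam * V_hat[nxt]             # one evaluation sweep

err = np.max(np.abs(V - V_hat))
print(N, err, err <= eps)                    # N = 88 for these settings; err <= eps holds
```

On a cycle every state is some state's successor, so the error shrinks by exactly a factor of $\lambda$ per sweep, and the final error lands just under $\varepsilon$, as the bound predicts.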
I am not sure where the factor 2 comes from. Also, you probably want to consider the more general case with stochastic transition dynamics and a stochastic policy; the derivation above should nevertheless serve as a good starting point.