So far, the books I've consulted give only hand-waving explanations of why, in a linear model, one can simply neglect the mutual correlation of the residuals and use a QQ-plot to assess their distribution qualitatively.
A comment on this other question of mine states that the average mutual correlation of the residuals is $-1/(n-1)$, and I am not sure why.
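For what it's worth, here is a small numerical check of that figure. It is only a sketch under assumptions of mine: i.i.d. errors with common variance $\sigma^2$, so that $\operatorname{Cov}(e) = \sigma^2 (I - H)$ with $H = X(X^\top X)^{-1}X^\top$, and an intercept-only design, which I am guessing is the setting the comment has in mind. The helper name `residual_correlation` is my own. In that setting every pair of residuals has correlation exactly $-1/(n-1)$, not just the average.

```python
import numpy as np

def residual_correlation(X):
    """Theoretical correlation matrix of OLS residuals for design X.

    Assumes i.i.d. errors, so Cov(e) = sigma^2 * (I - H); sigma^2 cancels
    when converting the covariance to a correlation.
    """
    n = X.shape[0]
    H = X @ np.linalg.pinv(X)      # hat matrix X (X'X)^{-1} X' for full-rank X
    M = np.eye(n) - H              # residual covariance up to sigma^2
    d = np.sqrt(np.diag(M))        # residual standard deviations up to sigma
    return M / np.outer(d, d)

n = 10
X0 = np.ones((n, 1))               # intercept-only model (my assumption)
R = residual_correlation(X0)

# All off-diagonal correlations, to compare with -1/(n-1)
off_diag = R[~np.eye(n, dtype=bool)]
print(off_diag.mean(), -1 / (n - 1))
```

With additional regressors the pairwise correlations become $-h_{ij}/\sqrt{(1-h_{ii})(1-h_{jj})}$ and are no longer all equal, so I would not expect the same exact value there.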
In this other question, StasK's answer makes the following statement: "The reason I am saying that the off-diagonal values are small is because $\sum_{j \neq i} h_{ij}^2 + h_{ii}^2 = h_{ii}$, and in fact either the diagonal or off-diagonal entries are roughly of order $O(1/n)$ although this is not a very strict statement that is easily thrown off by the high leverage points."
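I did try to check the quoted identity numerically. The sketch below assumes that $h_{ij}$ denotes the entries of the hat matrix $H = X(X^\top X)^{-1}X^\top$, and it uses a simulated design (intercept plus two Gaussian regressors) that I made up purely for illustration. Since $H$ is symmetric and idempotent, the $i$-th diagonal entry of $H^2 = H$ is $\sum_j h_{ij}^2 = h_{ii}$, and since $\operatorname{tr}(H) = p$ the average leverage is $p/n$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 3

# Simulated design: intercept plus (p - 1) Gaussian regressors (illustrative only)
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])

# Hat matrix H = X (X'X)^{-1} X'
H = X @ np.linalg.solve(X.T @ X, X.T)

h_diag = np.diag(H)
row_sq_sums = (H ** 2).sum(axis=1)               # sum_j h_ij^2 for each row i

mean_leverage = h_diag.mean()                    # equals trace(H) / n = p / n
max_offdiag = np.abs(H - np.diag(h_diag)).max()  # largest |h_ij| with i != j

print(np.allclose(row_sq_sums, h_diag))          # checks the quoted identity
print(mean_leverage, p / n)
print(max_offdiag, h_diag.max())
```

This confirms the identity for one well-behaved design, but it does not tell me how badly the $O(1/n)$ heuristic can fail when there are high-leverage points.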
I'm looking for well-laid-out explanations of the two statements above, in simple terms, for a plain, simple man. Please do include equations.