Understanding Covariance after Variance (visually)

Question

Two points I understood from the derivation of variance:

A) To calculate variance, we do not sum the signed differences from the mean (or their absolute values); instead we sum the squared differences of all points from the mean.

B) The variance of 1,2,3 will be lower than that of 0,2,4, because the former has a smaller spread around the mean (despite having the same mean).
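As a quick check of A) and B), here is a minimal sketch in plain Python. The helper names `signed_sum` and `sum_sq_dev` are made up for this illustration; they are not from any library.

```python
def signed_sum(xs):
    # Sum of signed deviations from the mean: always cancels to zero,
    # which is why it cannot measure spread.
    mean = sum(xs) / len(xs)
    return sum(x - mean for x in xs)

def sum_sq_dev(xs):
    # Sum of squared deviations from the mean (divide by n for the
    # population variance).
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs)

a = [1, 2, 3]
b = [0, 2, 4]

print(signed_sum(a), signed_sum(b))   # both are zero
print(sum_sq_dev(a), sum_sq_dev(b))   # b is larger than a
```

Both sets have mean 2, and the signed deviations cancel in each case, but the squared deviations are larger for 0,2,4, which matches B).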

I am trying to take the same idea of variance (the spread of points around the mean) and apply it in 2D instead of 1D.

I came across this explanation of covariance, in which the areas of rectangles (from the mean to $(x_i, y_i)$) are added or subtracted depending on the signs of the x and y deviations from the mean.

However, we wanted to know the spread around the mean, just as with points on the 1D x-axis in the case of variance. So why aren't we summing the areas of the rectangles drawn from the mean?

And won't this contradict B)? The two sets of points (1,1), (1,-1), (-1,-1), (-1,1) and (2,2), (2,-2), (-2,2), (-2,-2) have the same mean and also the same covariance, even though the points in the latter set are clearly more spread out from the mean (0,0).
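The two point sets can be checked numerically. The sketch below computes the population covariance (signed rectangle areas, averaged) next to the unsigned alternative the question proposes. `mean_unsigned_area` is a hypothetical helper written only for this comparison; it is not a standard statistic.

```python
def centered(points):
    # Deviations of each point from the mean of the set.
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    return [(x - mx, y - my) for x, y in points]

def covariance(points):
    # Population covariance: average of the signed rectangle areas.
    devs = centered(points)
    return sum(dx * dy for dx, dy in devs) / len(devs)

def mean_unsigned_area(points):
    # Hypothetical alternative from the question: average of the
    # absolute rectangle areas, ignoring direction.
    devs = centered(points)
    return sum(abs(dx * dy) for dx, dy in devs) / len(devs)

small = [(1, 1), (1, -1), (-1, -1), (-1, 1)]
large = [(2, 2), (2, -2), (-2, 2), (-2, -2)]

print(covariance(small), covariance(large))                  # both zero
print(mean_unsigned_area(small), mean_unsigned_area(large))  # differ
```

The signed areas cancel in both sets, so the covariance is zero for each, while the unsigned areas do grow with the spread. This suggests the signed sum is tracking whether x and y deviate in the same direction rather than how far the points lie from the mean.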

If you want to know the 2D spread of $(X,Y)$ data from the mean, you will study the distribution of $X^2+Y^2$ (or some function of it). The covariance between $X$ and $Y$ tells you little about that. — whuber, Nov 29 '21 at 16:59
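One simple function of $X^2+Y^2$ is its average, taken after centering the data at the mean. The sketch below assumes that reading of the comment; `mean_sq_distance` is a name invented for this illustration.

```python
def mean_sq_distance(points):
    # Average squared Euclidean distance of the points from their mean,
    # i.e. the mean of X^2 + Y^2 for centered data.
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    return sum((x - mx) ** 2 + (y - my) ** 2 for x, y in points) / n

small = [(1, 1), (1, -1), (-1, -1), (-1, 1)]
large = [(2, 2), (2, -2), (-2, 2), (-2, -2)]

print(mean_sq_distance(small), mean_sq_distance(large))
```

Under this assumption the second set scores higher than the first, so this quantity does separate the two sets that share a covariance.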
