Sorry it's not typeset, but hopefully it's readable. I've recently been making a GeoGebra file to show linear regression and the correlation coefficient.

https://www.geogebra.org/m/kuzw2hyk

I included some elements to help show things geometrically and learnt a few things in the process, which got me wondering further. I took the covariance of $x$ and $y$ (up to a factor of $1/n$) as $$\sum xy-\frac{\sum x\sum y}{n}$$

I realised that this can be written as $$\sum xy-n\frac{\sum x}{n}\frac{\sum y}{n}$$ which is $$\sum xy-n\bar x \bar y$$

Visually, I showed this as the sum of the rectangles formed by each point with the origin, minus $n$ times the area of the rectangle formed by the double mean point and the origin. Subsequently I've also realised this could more easily be seen by representing $\sum xy-n\bar x \bar y$ as $\sum(xy-\bar x \bar y)$, which has the same total as $\sum(x-\bar x)(y-\bar y)$ and so could be shown graphically as the sum of the signed areas of the rectangles formed by each point with its opposite vertex at the double mean point.
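The equivalence of these pictures can be checked numerically. The following is a minimal sketch in plain Python; the function name `sxy_three_ways` and the sample data are assumptions made up for illustration and are not taken from the GeoGebra file.

```python
def sxy_three_ways(xs, ys):
    """Return the sum of products about the mean, computed in three ways."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n

    # Rectangles from the origin to each point, minus n mean rectangles.
    origin_form = sum(x * y for x, y in zip(xs, ys)) - n * x_bar * y_bar

    # Per-point difference between the point's rectangle and the mean rectangle.
    difference_form = sum(x * y - x_bar * y_bar for x, y in zip(xs, ys))

    # Signed rectangles with one vertex at the point and the opposite
    # vertex at the double mean point.
    centred_form = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

    return origin_form, difference_form, centred_form


# Hypothetical sample data, chosen only for illustration.
xs = [1.0, 2.0, 4.0, 7.0]
ys = [2.0, 3.0, 7.0, 8.0]

print(sxy_three_ways(xs, ys))  # (22.0, 22.0, 22.0)
```

Dividing any of the three values by $n$ gives the population covariance, or by $n-1$ the sample covariance.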

This leads me to wonder: is this process similar with 3 variables, i.e. using the volumes formed with the triple mean point? Is the covariance of 3 variables three-dimensional, and if so, how does this scale to more dimensions?
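One way to explore this is to compute the two candidate "volume" sums directly. The sketch below is only an illustration under assumptions: the function name `triple_sums` and the data are hypothetical, and the centred quantity it computes is usually described as a third-order mixed central moment rather than a covariance. It suggests that the origin-based and mean-based pictures, which agree for two variables, no longer agree in general for three.

```python
def triple_sums(xs, ys, zs):
    """Return the origin-based and mean-centred triple product sums."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    z_bar = sum(zs) / n

    # Boxes from the origin to each point, minus n boxes to the triple mean point.
    origin_form = (
        sum(x * y * z for x, y, z in zip(xs, ys, zs)) - n * x_bar * y_bar * z_bar
    )

    # Signed boxes with opposite vertices at each point and the triple mean point.
    centred_form = sum(
        (x - x_bar) * (y - y_bar) * (z - z_bar) for x, y, z in zip(xs, ys, zs)
    )

    return origin_form, centred_form


# Hypothetical sample data, chosen only for illustration.
xs = [1.0, 2.0, 4.0, 7.0]
ys = [2.0, 3.0, 7.0, 8.0]
zs = [1.0, 0.0, 2.0, 5.0]

origin_form, centred_form = triple_sums(xs, ys, zs)
print(origin_form, centred_form)  # the two values differ for this data
```

Expanding the centred product shows why: the cross terms such as $\bar x\sum yz$ do not collapse into a single $n\bar x\bar y\bar z$ term as they do in the two-variable case.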

@WHuber Your link is very interesting, and I can see how you've avoided using the double mean point, but computationally, connecting all the points using rectangles seems like a lot of work. Both of my geometric interpretations are equivalent, but my main question is how this generalizes to the covariance of 3 variables and beyond.

Cliff
  • For an alternative geometric explanation of covariance, please see https://stats.stackexchange.com/a/18200/919. – whuber Jul 19 '18 at 12:40
  • @Stubbornatom. Did you typeset the formulas for me or is it automatic? – Cliff Jul 19 '18 at 12:51
  • @Cliff It is not automatic. – StubbornAtom Jul 19 '18 at 14:03
  • re your edit: please note that there is no such thing as the "covariance of three variables": *by definition,* a covariance is a property of an ordered *pair* of variables. The only way I have been able to understand your question, then, is that it's an attempt to ask about possible generalizations to three or more variables: that's what the duplicate thread addresses. – whuber Jul 28 '18 at 18:39
