
Follow-up to this older post (have to make it a question since I can't post comments yet).

Specifically, could anyone kindly show how $$\operatorname{Var}[\operatorname E[X\mid K]]$$ (from the law-of-total-variance method) is equivalent to $$\sum_k{n_k(\bar x_k - \bar x )^2}$$ (from the "more direct" method)?

When I try to do this from first principles (MIT course, bottom right slide of page 1), I end up with $$\sum_k\frac{n_k}{n}(\bar x_k - \bar x )^2,$$ which is the same "error/typo" that the OP made, so there must be something I'm missing. Is it something about the "weight function"? I can't see how the example in the slide is any different from this clustering case.

Thanks a lot in advance.

Michael Hardy
Tim
  • What you ended up with looks better to me. For example, if each $n_k=1$, you would want a $\frac1n$ term – Henry Jun 03 '19 at 00:14
  • I guess then the question becomes: how is the example on the MIT slide different from the k-means clustering setting? – Tim Jun 03 '19 at 00:26
  • The question in your first link is talking about the "total sum of squares" of differences from the mean, rather than the variance (the expected squared difference from the mean). Hence the $\frac1n$ factor – Henry Jun 03 '19 at 00:49
  • Just figured it out... Thank you for taking time! – Tim Jun 03 '19 at 00:51

1 Answer


OK, got it. The confusion arose from two different definitions of "total variation": the k-means problem uses the total sum of squares $\sum_{i=1}^n(x_i - \bar x)^2$, while the conventional definition, which the slide (and in fact all our previous training) uses, is the variance $\frac1n\sum_{i=1}^n(x_i - \bar x)^2$. The two differ by exactly a factor of $n$, so $n\operatorname{Var}[\operatorname E[X\mid K]] = \sum_k n_k(\bar x_k - \bar x)^2$.
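For anyone who wants to check the factor of $n$ numerically, here is a minimal sketch in plain Python. The three toy clusters and the helper name `between_cluster_terms` are made up for illustration and are not taken from the linked post or the MIT slide. It computes the between-cluster term both ways (weights $\frac{n_k}{n}$ versus weights $n_k$) and also the total and within-cluster sums of squares.

```python
import math

# Hypothetical toy data: cluster label -> points in that cluster.
clusters = {
    0: [1.0, 2.0, 3.0],
    1: [10.0, 12.0],
    2: [20.0],
}


def between_cluster_terms(clusters):
    """Return (variance form, sum-of-squares form, within SS, total SS)."""
    points = [x for xs in clusters.values() for x in xs]
    n = len(points)
    grand_mean = sum(points) / n

    var_form = 0.0   # sum_k (n_k / n) * (xbar_k - xbar)^2
    ss_form = 0.0    # sum_k  n_k      * (xbar_k - xbar)^2
    within_ss = 0.0  # sum_k sum_{i in k} (x_i - xbar_k)^2
    for xs in clusters.values():
        n_k = len(xs)
        mean_k = sum(xs) / n_k
        dev2 = (mean_k - grand_mean) ** 2
        var_form += n_k / n * dev2
        ss_form += n_k * dev2
        within_ss += sum((x - mean_k) ** 2 for x in xs)

    total_ss = sum((x - grand_mean) ** 2 for x in points)
    return var_form, ss_form, within_ss, total_ss


n = sum(len(xs) for xs in clusters.values())
var_form, ss_form, within_ss, total_ss = between_cluster_terms(clusters)

# The two between-cluster expressions differ by exactly a factor of n.
print(math.isclose(ss_form, n * var_form))
# In sum-of-squares form, total = within + between.
print(math.isclose(total_ss, within_ss + ss_form))
```

Both checks print `True` for this toy data; dividing every sum of squares by $n$ gives the same decomposition in the variance convention used on the slide.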

Tim