This is a follow-up to this older post (I have to post it as a question since I can't comment yet).
Specifically, could anyone kindly show how $$\operatorname{Var}[\operatorname E[X\mid K]]$$ (in the total variance method) is equivalent to $$\sum_k n_k(\bar x_k - \bar x)^2$$ (in the "more direct" method)?
When I try to derive this from first principles (MIT course, bottom-right slide of page 1), I end up with $$\sum_k\frac{n_k}{n}(\bar x_k - \bar x )^2$$, which is the same "error/typo" that the OP made, so there must be something I'm missing. Is it something about the "weight function"? I can't see how the example in the slide differs from this clustering case.
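To make my computation concrete, here is a minimal numerical sketch of what I get. It assumes that $K$ is the cluster label of a point drawn uniformly from the sample, so each observation has weight $1/n$ and $P(K=k) = n_k/n$. The toy data and all variable names are made up for illustration; they do not come from the original post or the slides.

```python
import numpy as np

# Hypothetical toy data (invented for this check): values x with cluster labels k.
x = np.array([1.0, 2.0, 3.0, 10.0, 12.0, 20.0])
k = np.array([0, 0, 0, 1, 1, 2])

n = x.size
grand_mean = x.mean()

# E[X | K] evaluated at every observation: each point is replaced by its cluster mean.
cluster_means = {c: x[k == c].mean() for c in np.unique(k)}
cond_mean = np.array([cluster_means[c] for c in k])

# Var[E[X | K]] under the empirical distribution (assumption: weight 1/n per point).
var_cond_mean = np.mean((cond_mean - grand_mean) ** 2)

# The two candidate expressions from the question.
weighted = sum((k == c).sum() / n * (m - grand_mean) ** 2
               for c, m in cluster_means.items())
unweighted = sum((k == c).sum() * (m - grand_mean) ** 2
                 for c, m in cluster_means.items())

print(var_cond_mean, weighted, unweighted, unweighted / n)
```

Under this assumption, `var_cond_mean` matches the $n_k/n$-weighted sum, and the $n_k$-weighted sum comes out exactly $n$ times larger. So the two expressions appear to differ only by the constant factor $n$, as a variance versus a sum of squares would, unless the weighting is meant to be something other than $n_k/n$.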
Thanks a lot in advance.