So usually the logic for all chi-squared problems is as follows:
- Formulate the null and alternative hypotheses.
- Calculate the Pearson residuals.
- Observe that the sum of the squared residuals approximately follows a chi-squared distribution.
- Use the chi-squared distribution to get the probability of a result at least as extreme as the observed data.
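To make the steps concrete, here is a minimal sketch in plain Python. The counts and the 1:2:1 null ratio are made-up numbers for illustration only, and the tail formula used at the end is the closed form that holds specifically for 2 degrees of freedom, so this is not a general-purpose implementation.

```python
import math

# Hypothetical data: 100 observations falling into three categories.
observed = [30, 50, 20]

# Step 1: the null hypothesis says the categories occur in a 1:2:1 ratio.
null_probs = [0.25, 0.50, 0.25]

n = sum(observed)
expected = [n * p for p in null_probs]

# Step 2: Pearson residuals, (observed - expected) / sqrt(expected).
residuals = [(o - e) / math.sqrt(e) for o, e in zip(observed, expected)]

# Step 3: the sum of squared residuals is the test statistic.
statistic = sum(r * r for r in residuals)
df = len(observed) - 1  # 3 categories, so 2 degrees of freedom

# Step 4: upper-tail probability under the chi-squared distribution.
# For df = 2 the survival function has the closed form exp(-x / 2).
def chi2_sf_df2(x):
    return math.exp(-x / 2)

p_value = chi2_sf_df2(statistic)

print("residuals:", residuals)
print("statistic:", statistic, "df:", df)
print("p-value:", p_value)
```

With these made-up counts the residuals are 1, 0 and -1, so the statistic is 2 and the p-value is exp(-1), roughly 0.37.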
I've read and watched many explanations of how the chi-squared distribution is built. That part is more or less clear.
But I can't figure out the transition between steps 1 and 2: WHY do we calculate (observed - expected)/sqrt(expected) in the first place? Why don't we use some other (arbitrary) function of "observed" and "expected"? Why not (observed - expected)^3/log(observed)? Then it wouldn't fit the chi-squared distribution, but maybe it would fit another type of test...
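One way to check the "it wouldn't fit" part empirically is a small simulation. The sketch below is only an illustration under assumed values: the null proportions, the sample size and the names `pearson_stat` and `odd_stat` are all hypothetical, and the Pearson statistic is taken in its standard form with the expected count in the denominator. It draws many samples with the null hypothesis true and computes both statistics for each one.

```python
import math
import random

random.seed(0)  # fixed seed so the run is repeatable

null_probs = [0.25, 0.50, 0.25]  # hypothetical null proportions
n = 100                          # hypothetical sample size
expected = [n * p for p in null_probs]

def draw_counts():
    """Draw one set of category counts with the null hypothesis true."""
    draws = random.choices(range(len(null_probs)), weights=null_probs, k=n)
    return [draws.count(i) for i in range(len(null_probs))]

def pearson_stat(observed):
    # Sum of squared Pearson residuals.
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def odd_stat(observed):
    # The alternative from the question; undefined when a count is 0 or 1.
    return sum((o - e) ** 3 / math.log(o) for o, e in zip(observed, expected))

samples = [draw_counts() for _ in range(5000)]
samples = [s for s in samples if min(s) > 1]  # keep log(o) defined and nonzero

pearson = [pearson_stat(s) for s in samples]
odd = [odd_stat(s) for s in samples]

df = len(null_probs) - 1
print("mean Pearson statistic:", sum(pearson) / len(pearson), "vs df =", df)
print("share of negative 'odd' statistics:", sum(x < 0 for x in odd) / len(odd))
```

A chi-squared variable is never negative and has a mean equal to its degrees of freedom. The simulated Pearson statistic behaves that way, while the cubed version regularly comes out negative, so it cannot be chi-squared distributed. This only shows that the alternative does not fit, not why the Pearson form was chosen.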