
With regard to this question about classifications of groups which differ significantly, I have two somewhat newbie / general questions:

  1. Let's say I'm doing a hypothesis test (the difference between two means) and I reject the null at the 5% significance level (alpha = 0.05, i.e. 95% confidence). Does this imply that I can predict/classify new values into A or B with a success rate of 95%? If so, how can this be done?
  2. I'm not sure what realm my question falls into, but does this somehow relate to the explanatory vs. predictive modeling discussion?
Dov

1 Answer

  1. No, unfortunately not. Rejecting the null tells you that the two population means differ, not that individual observations can be told apart. To see why, consider class A as being drawn from a Normal(0,1) distribution and class B as being drawn from a Normal(0.01,1) distribution. The two are virtually indistinguishable on an observation-by-observation basis, so, without other information, you won't be able to assign an observation to A or B with a success rate of much over 50%. However, if your sample size is large enough (for this example, on the order of a million observations per class), you'll have little problem rejecting the null hypothesis that the two means are the same if you test at the 5% significance level. (Whether or not you should test at the 5% level with a sample that large is a different question.)

  2. That's an excellent question, and the answer seems to me to be yes. Assume A vs B is some factor related to the outcome. From the explanatory side, we can establish that A vs B makes a (small) difference to the outcome; from the predictive side, knowing that is of very little help, because the difference is tiny compared with the spread within each class.
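
To make the gap between "significant" and "predictable" concrete, here is a rough simulation of the Normal(0,1) vs Normal(0.01,1) example. It is only a sketch under assumptions that are mine, not part of the argument above: equal class sizes, 1,000,000 draws per class, a two-sample z-test, a fixed seed, and the simple rule "predict B when the value is above the midpoint of the two means". The names `simulate` and `best_accuracy` are made up for illustration, and it uses only the Python standard library.

```python
import math
import random
from statistics import NormalDist


def best_accuracy(delta):
    """Accuracy of the midpoint rule for N(0, 1) vs N(delta, 1), equal class sizes."""
    # With equal priors and equal variances, thresholding at delta / 2 is the
    # best single-observation rule, and it is right with probability Phi(delta / 2).
    return NormalDist().cdf(delta / 2)


def simulate(n_per_class=1_000_000, delta=0.01, seed=0):
    """Return (z, p_value, accuracy) for simulated classes A and B.

    A ~ N(0, 1), B ~ N(delta, 1). z and p_value come from a two-sample
    z-test of equal means; accuracy is the share of draws that the rule
    "predict B when x > delta / 2" labels correctly.
    """
    rng = random.Random(seed)
    threshold = delta / 2
    sum_a = sum_b = sumsq_a = sumsq_b = 0.0
    correct = 0
    for _ in range(n_per_class):
        a = rng.gauss(0.0, 1.0)    # one draw from class A
        b = rng.gauss(delta, 1.0)  # one draw from class B
        sum_a += a
        sumsq_a += a * a
        sum_b += b
        sumsq_b += b * b
        # A is classified correctly when it falls at or below the threshold,
        # B when it falls above it.
        correct += (a <= threshold) + (b > threshold)

    mean_a = sum_a / n_per_class
    mean_b = sum_b / n_per_class
    var_a = (sumsq_a - n_per_class * mean_a ** 2) / (n_per_class - 1)
    var_b = (sumsq_b - n_per_class * mean_b ** 2) / (n_per_class - 1)

    # Standard error of the difference in means, then a two-sided p-value.
    se = math.sqrt((var_a + var_b) / n_per_class)
    z = (mean_b - mean_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    accuracy = correct / (2 * n_per_class)
    return z, p_value, accuracy


if __name__ == "__main__":
    z, p_value, accuracy = simulate()
    print(f"z = {z:.2f}, p-value = {p_value:.3g}")
    print(f"classification accuracy = {accuracy:.4f}")
    print(f"theoretical best accuracy = {best_accuracy(0.01):.4f}")
```

With these settings the test should reject the null comfortably, while the classification accuracy stays close to the theoretical ceiling of about 0.502, i.e. barely better than a coin flip. Shrinking `n_per_class` changes how often the test rejects, but it does nothing to that ceiling, which depends only on how far apart the means are relative to the spread.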

jbowman