I am currently doing some Student's t-test assignments, and I am having a hard time understanding the following question:

  • What probability is there of obtaining these particular mean values purely by chance?

I will upload a picture of the complete text, so you have a better understanding of what is referred to.

[Image: picture of the complete assignment text]

Do they mean that if the significance level is 0.05, then there is a 5% chance of that happening?

Hope you can help me understand the question. It is mentioned several times in the assignment.

  • Your assignment seems to be using a common misunderstanding about what a p-value means. This makes it tough to answer, because I believe that the “full credit” answer to #2 is the p-value, but that’s because the question writer has a common misconception. – Dave May 29 '21 at 13:40
  • Thanks for the answer. What is the common misconception you mention? – Valdemar May 29 '21 at 13:43
  • I'm still not quite sure what they mean, but this is what I wrote: Probabilities like the p-value range from zero (no certainty) to one (certainty). The statistical significance determines how big a chance there is. If the value is 0.05, it is only considered borderline statistically significant. If the p-value is below 0.01, that is considered statistically significant. If the p-value is below even 0.005, it would be considered highly statistically significant. – Valdemar May 29 '21 at 13:43
  • The common misconception is that the p-value is the probability of obtaining your results purely by chance. // If those are the conventions in your class (0.05 is meh, 0.01 is good, 0.005 is really good), then that sounds like a decent answer that will earn you the points. Did you come up with those thresholds, or are they from your class? // The change I would make is that a probability of zero is certainty…that it won’t happen, not a lack of certainty. – Dave May 29 '21 at 13:49
  • Thanks for the suggestion, I will change it to "will not happen". The thresholds (0.05, 0.01, 0.005) are something I found in a source, not something from the class. I actually thought 0.05 was enough, but it makes sense that it can be even lower. – Valdemar May 29 '21 at 14:21

1 Answer

Dave was alluding to this in the comments, but I think it's important to spell this out in black and white:

The probability that these results are due to chance is not the p-value.

The p-value is the probability of observing these data, or data more extreme, IF (and only if) the null hypothesis is true.

The distinction may appear subtle, but it is important. One way to think about it is as follows. In the case of a t-test, the null hypothesis $H_0$ is that there is no difference in means between the 2 groups. The probability you are being asked to compute is $P(H_0|D)$: that is, the probability that there is no actual difference in means, given the data $D$ actually observed. However, in a frequentist setting this does not make sense, because $H_0$ is either true or false and is not assigned a probability. The p-value that you will compute using classical frequentist methods is instead $P(D|H_0)$, where $D$ is understood as "these data, or data more extreme".
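To make the direction of the conditioning concrete, here is a minimal sketch in Python. It is a permutation test, not the Student t-test from your assignment, and the two groups are hypothetical numbers invented purely for illustration. The point is only that the calculation starts by assuming $H_0$ (the group labels are interchangeable) and then asks how often a difference in means at least as extreme as the observed one turns up.

```python
import random


def mean_difference(a, b):
    """Difference between the sample means of two groups."""
    return sum(a) / len(a) - sum(b) / len(b)


def permutation_p_value(a, b, n_perm=5000, seed=0):
    """Estimate P(|difference| at least as large as observed | H0).

    H0 is imposed by shuffling the pooled values, which makes the
    group labels arbitrary, i.e. "no difference between the groups".
    """
    rng = random.Random(seed)
    observed = abs(mean_difference(a, b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        shuffled = abs(mean_difference(pooled[:len(a)], pooled[len(a):]))
        if shuffled >= observed:
            hits += 1
    # The +1 terms count the observed labelling as one of the arrangements.
    return (hits + 1) / (n_perm + 1)


# Hypothetical measurements, NOT taken from the assignment.
group_a = [5.1, 4.9, 5.6, 5.8, 6.0, 5.3]
group_b = [4.2, 4.8, 4.5, 5.0, 4.1, 4.7]

p = permutation_p_value(group_a, group_b)
print(f"estimated p-value: {p:.4f}")
```

Nothing in this calculation ever produces the probability that $H_0$ is true. $H_0$ is assumed from the first line to the last, which is why the result is a statement about the data given $H_0$ and not about $H_0$ given the data.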

Robert Long
  • I am glad to see someone expanding on my comment, but I do not agree that the homework problem asks for the posterior probability of the null hypothesis; the homework almost seems to be asking for the usual p-value but without the usual “or more extreme” business. // I still think the “full credit” answer is just to give the p-value, though. – Dave May 29 '21 at 16:54
  • @Dave I take your point, and I agree that the homework is asking for the p-value, but I worry that this is one of those cases where students are not taught the correct definition of a p-value, and that's what I was trying to get at. This reminds me of a paper I once read where a number of academics, some of whom were statistics lecturers, were surveyed with some statements about p-values, and none of them got all the answers right. Perhaps you've also come across it? – Robert Long May 29 '21 at 17:42
  • @Dave Also, there is this question with a number of answers here on CV that I think is very relevant: [Why is it bad to teach students that p-values are the probability that findings are due to chance?](https://stats.stackexchange.com/questions/16939/) – Robert Long May 29 '21 at 17:43
  • Another issue with the kind of question in the homework is what is meant by "by chance". For me this is very imprecise language and should be avoided. – Robert Long May 29 '21 at 17:50
  • Does this answer your question ? If so please consider marking it as the accepted answer. If not, please let us know why. Also, if you haven't already, please consider upvoting it. – Robert Long Jun 26 '21 at 12:25