5

Are there non-hypothesis testing uses for Student's t statistic?

Jeremy Miles
Alexis
  • 1
    Are you interested in alternative uses of Student's t-distribution in and of itself, or are you only focused on the concrete object of the test statistic $\frac {{\bar {X}}-\mu }{S/{\sqrt {n}}}$ ? – klumbard Jul 24 '20 at 19:13
  • @klumbard I would be happy to entertain answers that bring in the distribution, but am more specifically interested in the statistic. – Alexis Jul 24 '20 at 19:15
  • 1
    Hmm, but the $\mu$ in the statistic is typically set in reference to some null hypothesis, right? I don't see how you can talk about "the statistic" without reference to the null hypothesis to which it is linked. Even setting $\mu=0$ is implicitly testing against the null $H_{0}: \mu = 0$. The distribution in and of itself is interesting because it's symmetric and bell-shaped with fatter tails, so we can free it from the hypothesis testing land a bit by just using it to model extreme outcomes, but that's using a parametric form to fit data, and not physically calculating t-statistics. – klumbard Jul 24 '20 at 19:25
  • 1
    @klumbard Amplify that into an answer! (And be sure to point out the implicit $\mu=0$ for $t= \frac{\theta}{\hat{\sigma}_{\theta}}$.) – Alexis Jul 24 '20 at 19:33
  • 1
    When you're not testing hypotheses, $t$ is the same as $z.$ For its many uses search here (or the Web) for *standardization.* – whuber Jul 24 '20 at 20:12
  • @whuber I am not sure I understand: standardizing by a population versus sample SE should still differentiate $z$ from $t$? – Alexis Jul 24 '20 at 20:17
  • 1
    The pivotal quantity arises also in construction of confidence intervals for $\mu$; does that count? – Glen_b Jul 25 '20 at 06:25
  • In a $z$ test, the SD is also the (same) *estimate* of the SD used in a $t$ test. The tests differ concerning which *distribution* to use for computing p-values. – whuber Jul 26 '20 at 15:13
  • 1
    @whuber Permit me to say . – Alexis Jul 26 '20 at 16:10

2 Answers

4

A "hypothesis test" in the strictest sense always results in a binary outcome of either rejecting or failing to reject a null hypothesis. T-statistics are generally turned into p-values, which are then compared against some pre-defined threshold to make that binary determination. It is possible to use the t-statistic itself, however, as a general measure of "deviation from the null", without ever having to take the final step of testing whether the null hypothesis should be rejected or not. Using a t-statistic in this way is still derived from a hypothesis testing framework, but does not actually result in a test of whether the null should be rejected or not, so I'd argue this is not strictly a "hypothesis test".

As an example, the t-statistic can be used as a means of ranking features by significance, while accounting for the directionality of the differences. Gene set enrichment analysis, for example, searches for sets of consistently up- or down-regulated genes, so the directionality of differences is important for this method. Ranking features by their p-value will draw no distinction between up- and down-regulated genes, and simply put the most significant genes at the top of the list. Ranking by the t-statistic, on the other hand, will put the most significant up-regulated genes at one end of the list, and the most significant down-regulated genes at the other end. Although the magnitude of the t-statistic is directly related to the p-value, the sign of the t-statistic is lost when calculating a p-value for a hypothesis test. Ranking genes in this way respects the directionality and how incompatible with the null hypothesis each gene is, but does not actually make any determination if any gene is "significantly dysregulated" or not.
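
To illustrate the ranking idea, here is a minimal sketch in plain Python. The gene names and expression values are made up for illustration, and the equal-variance two-sample t-statistic is just one choice of statistic; GSEA implementations offer several ranking metrics. Because every hypothetical gene has the same group sizes, the degrees of freedom are identical, so ranking by two-sided p-value gives the same order as ranking by $|t|$.

```python
import math
import statistics


def pooled_t(group_a, group_b):
    """Student's two-sample t-statistic (equal-variance form) for mean(a) - mean(b)."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(group_b)
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    se = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / se


# Hypothetical expression values: gene -> (treated samples, control samples)
expression = {
    "gene_up": ([9.1, 8.8, 9.4, 9.0], [5.0, 5.3, 4.9, 5.2]),
    "gene_down": ([2.1, 1.8, 2.2, 2.0], [6.0, 6.4, 5.9, 6.1]),
    "gene_flat": ([5.1, 4.7, 5.3, 4.9], [5.0, 5.2, 4.8, 5.1]),
    "gene_mild_up": ([5.9, 6.3, 5.6, 6.2], [5.0, 5.4, 4.8, 5.2]),
}

t_stats = {
    gene: pooled_t(treated, control)
    for gene, (treated, control) in expression.items()
}

# Signed ranking: up-regulated genes at one end, down-regulated at the other.
ranked_by_t = sorted(t_stats, key=t_stats.get, reverse=True)

# Unsigned ranking: same order a two-sided p-value would give here (equal df),
# so strongly up- and down-regulated genes end up next to each other.
ranked_by_abs_t = sorted(t_stats, key=lambda g: abs(t_stats[g]), reverse=True)

print("by t:  ", ranked_by_t)
print("by |t|:", ranked_by_abs_t)
```

In the signed list the strongly down-regulated gene sits at the opposite end from the strongly up-regulated one; in the unsigned list the two are adjacent at the top, which is the information a p-value ranking throws away.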

Nuclear Hoagie
  • 1
    You more or less ignored the "non-hypothesis testing uses" which is the substance of my question. – Alexis Jul 24 '20 at 17:21
  • @Alexis I'd argue the analysis I'm describing isn't quite a hypothesis test, since there's no determination of statistical significance and no pre-defined probability threshold, and it does not seek to reject or fail to reject a null hypothesis. Here, I'm using the t-statistic in a way that does not directly test the hypothesis of equal means. But I do see your point that the t-statistic itself comes from a hypothesis-testing framework of comparing two groups under a set of assumptions. I can't think of another way to get to the t-stat without that kind of comparison, though. – Nuclear Hoagie Jul 24 '20 at 17:37
  • 1
    Ranking by p-values is so controversial and so antithetical to the meaning of p-values that its meaning and applicability are often dismissed out of hand. – whuber Jul 24 '20 at 18:09
  • Contrast: "since there's no determination of statistical significance" with "a means of ranking features by significance" – Alexis Jul 24 '20 at 18:14
  • @whuber Ranking genes by some measure of differential expression, whether that's the t-stat or p-value or effect size is a cornerstone of GSEA-style analysis and has been applied successfully many times over, with the original paper having been cited over 20,000 times. I too tend to view p-values as a continuous "measure of surprise" rather than the binary Neyman-Pearson "significant or not", as described at https://stats.stackexchange.com/questions/137702/are-smaller-p-values-more-convincing. But that's a whole other can of worms I'd rather not open here. – Nuclear Hoagie Jul 24 '20 at 18:56
  • *p*-values are ***only*** a "measure of surprise" when situated against a specific null hypothesis. *p*-values measure the probability of observing some value (e.g., a $t$ statistic) **conditional on $\boldsymbol{H_{0}}$ being true**. – Alexis Jul 24 '20 at 19:16
  • @Alexis Agree, the t-stats computed here implicitly require a null and alternative hypothesis, so it's not totally divorced from a hypothesis testing framework. But I do think it's a little different since we don't determine, or even care, whether or not to reject the null hypothesis. To me, a true hypothesis test always results in a binary outcome of rejecting or failing to reject the null, but that's not what the t-stat is used for here. – Nuclear Hoagie Jul 24 '20 at 19:52
  • Color me mollified! – Alexis Jul 24 '20 at 20:19
4

When you ask about "the t-statistic", I think about the concrete quantity $$\frac {{\bar {X}}-\mu }{S/{\sqrt {n}}}$$

To actually calculate this quantity, we have to specify $\mu$. This is typically chosen in reference to some given null hypothesis. So to me it seems awkward to try to disentangle "the statistic" from the null hypothesis to which it is implicitly linked by $\mu$. Setting $\mu$ to 0, for example, which is what happens by default when you type `t.test(rnorm(10))$statistic` into R, corresponds to the hypothesis test $H_0: \mu = 0$.
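
As a rough sketch of why $\mu$ cannot be left out, here is the same quantity computed by hand in plain Python. The function name `one_sample_t`, its `mu0` argument, and the sample values are all my own illustration; the default of 0 is chosen to mirror what the R call does.

```python
import math
import statistics


def one_sample_t(sample, mu0=0.0):
    """(mean - mu0) / (S / sqrt(n)), with S the sample standard deviation."""
    n = len(sample)
    se = statistics.stdev(sample) / math.sqrt(n)
    return (statistics.mean(sample) - mu0) / se


# Hypothetical data, made up for illustration
sample = [0.3, -1.2, 0.8, 1.5, -0.4, 0.9, 2.1, -0.7, 0.2, 1.1]

t_vs_zero = one_sample_t(sample)          # reference value mu0 = 0
t_vs_one = one_sample_t(sample, mu0=1.0)  # same data, different reference value

print(t_vs_zero, t_vs_one)
```

The data are identical in both calls; only the reference value changes, and the statistic changes with it (here it even flips sign). That is the sense in which the number always carries some $\mu$ along with it.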

Where I do think Student's t-distribution is useful is as a parametric form for fitting to data. At the end of the day, it's just another symmetric, bell-shaped distribution. It just has fatter tails than a Gaussian. So it can be used to model things for which you'd like to preserve that symmetry and bell shape, but give the extreme outcomes more probability mass than a Gaussian does. I know it's used in finance to model asset returns (link 1, link 2) for example, but I can't speak to how successful or useful these kinds of models are because I don't use them myself.

I'd suspect them to be of particular use to hierarchical modelers who have some prior knowledge that points to fat tails. Gelman briefly discusses using the t instead of the Gaussian in fat-tail situations in section 17.2 of *Bayesian Data Analysis*.
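
To put a number on "fatter tails", here is a small sketch comparing the standard t density with the standard normal density. The choice of 3 degrees of freedom and the evaluation points are arbitrary, and this compares the unit-scale t (not a variance-matched one), so treat it as an illustration rather than a modelling recipe.

```python
import math


def student_t_pdf(x, df):
    """Density of the standard Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)


# The ratio is below 1 near the centre and grows quickly further out.
for x in (0.0, 2.0, 4.0, 6.0):
    ratio = student_t_pdf(x, df=3) / normal_pdf(x)
    print(f"x = {x}: t(3) density / normal density = {ratio:.3g}")
```

The t density is slightly lower than the Gaussian at the centre and far higher a few units out, which is exactly the extra mass on extreme outcomes described above.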

klumbard
  • 1
    +1 Thank you! I will wait a bit to check "accepted," in case anyone else would like to weigh in. – Alexis Jul 24 '20 at 19:45
  • 1
    +1 If the question is extended to the "t-distribution" from the "t-statistic," there are many [Bayesian uses](https://en.wikipedia.org/wiki/Student%27s_t-distribution#In_Bayesian_statistics) that certainly wouldn't be considered hypothesis tests. – EdM Jul 24 '20 at 19:55