
I read somewhere (but can't find it again) that when using a t-distribution as the null distribution in a hypothesis test, you shouldn't estimate the degrees of freedom from your data but should instead use a low number like 3 or 4.

Is this correct, and what is the intuition and reasoning behind this?

badmax
  • No. For a one-sample problem the degrees of freedom are n-1, where n is the sample size. In general, it is the sample size minus the number of parameters estimated, so in a two-sample problem with sample sizes n and m it is n+m-2. – Michael R. Chernick May 12 '17 at 02:52
  • @Michael I think you're mistaking what the question is asking (it's not clearly phrased, so that's not so surprising). I believe it's asking about *fitting* a t-distribution to data (a distribution with parameters $\mu,\sigma^2$ and $\nu$), but assuming a low d.f. like $\nu=4$ rather than fitting the d.f. parameter. – Glen_b May 12 '17 at 03:14
  • @Glen_b The question is too poorly written to know whether your interpretation is right, mine is, or neither. He mentioned a null hypothesis with a t distribution but doesn't give any details about the test. – Michael R. Chernick May 12 '17 at 03:19
  • The problem alluded to is quite well known, which makes it easier to discern the intent, I think. – Glen_b May 12 '17 at 03:20
  • @badmax Where did you read this? Can you give the context more clearly? – Glen_b May 12 '17 at 03:21
  • Sorry for being vague, it just popped back into my head and this is all I remember. I'm not exactly clear on what the original issue was - I'm basically trying to find a problem for the solution. I do remember Andrew Gelman's name somewhere in there but his website wasn't very helpful. – badmax May 12 '17 at 03:27
  • Similar: https://stats.stackexchange.com/questions/120776/why-should-we-use-t-errors-instead-of-normal-errors/120787#120787 – kjetil b halvorsen Jan 09 '20 at 01:19

1 Answer


The reason is that the d.f. parameter is very hard to estimate well from data, particularly if you're also estimating the scale parameter. Indeed, you can often end up with either silly or unstable estimates (e.g. from a ridge in parameter space).

Better properties are often obtained in practice by simply assuming some low d.f. (I've also seen 5, 7 and 8 used) rather than estimating it, at least at the typical sample sizes seen in financial data, for example (which are often fairly large but not large enough to make the estimation problem easy or well-behaved).

[Note that 8 is the lowest d.f. for which the sample kurtosis has finite variance, which may have been a factor in why it was used in the instance I saw it.]
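To see the estimation difficulty described above for yourself, here is a minimal simulation sketch. It is an illustration only, not something from the original discussion: the function names `fitted_df_spread` and `fit_with_fixed_df`, the true d.f. of 4, the sample size and the number of replications are all assumptions chosen for the example. It uses SciPy's generic maximum-likelihood `fit` method, where `f0` holds the first shape parameter (here the d.f.) fixed.

```python
import numpy as np
from scipy import stats


def fitted_df_spread(true_df=4.0, n=200, reps=20, seed=0):
    """Fit (df, loc, scale) by maximum likelihood to repeated t samples.

    Returns the array of estimated d.f. values, one per simulated sample.
    All default values are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    estimates = []
    for _ in range(reps):
        x = stats.t.rvs(true_df, size=n, random_state=rng)
        df_hat, _loc_hat, _scale_hat = stats.t.fit(x)  # all three parameters free
        estimates.append(df_hat)
    return np.array(estimates)


def fit_with_fixed_df(x, df=4.0):
    """Fit only location and scale, holding the d.f. fixed at `df`.

    `f0` fixes the first shape parameter of the distribution, which for
    scipy.stats.t is the degrees of freedom.
    """
    return stats.t.fit(x, f0=df)


if __name__ == "__main__":
    est = fitted_df_spread()
    # Print a few quantiles of the d.f. estimates to show how widely they vary.
    print("quantiles of estimated d.f. (5%, 50%, 95%):",
          np.quantile(est, [0.05, 0.5, 0.95]))
```

Comparing the spread of the free-d.f. estimates with the single assumed value passed to `fit_with_fixed_df` gives a feel for the trade-off: a fixed low d.f. may be somewhat wrong, but it removes one poorly determined parameter from the fit.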

Glen_b