It's similar to the post Interpretation of R's lm() output.

lm(formula = iris$Sepal.Width ~ iris$Petal.Width)

However, there is one point I can't understand in the explanation of the t-value.

[image of the explanation]

It shows that the t-value is the ratio of the first two values:

t-value = estimate_mean/std.error
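
As a quick check of this ratio, here is a minimal sketch in R using the built-in iris data. The object names (`fit`, `coefs`, `t_manual`) are illustrative only, and the formula is written with `data = iris`, which should give the same fit as the `iris$...` form above.

```r
# Fit the same regression as in the question (iris ships with base R).
fit   <- lm(Sepal.Width ~ Petal.Width, data = iris)
coefs <- summary(fit)$coefficients

# Recompute the "t value" column as Estimate / Std. Error.
t_manual <- coefs[, "Estimate"] / coefs[, "Std. Error"]

# Compare with the column reported by summary().
all.equal(t_manual, coefs[, "t value"])
```

Note that no `sqrt(n)` appears here: the "Std. Error" column is already the standard error of the estimated coefficient.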

Question: Is this t-value exactly the t-score in Student's t distribution?

Based on my understanding of the definition, the t-score is calculated as follows: [orange image showing the one-sample t-score formula]

If we assume a null hypothesis that the mean of the response residuals is 0, then the correct t-score in this lm() case, as I understand it, should be as follows.

 t-score given H_null = estimated_mean / (std.error/sqrt(n)) 
                      = sqrt(n) * estimated_mean/std.error

Therefore, the t-score I derived is sqrt(n) times larger than the t-value given by lm(). Does anyone know which part above is wrong? Thanks!

HappyCoding
  • 1. What is the source of the orange image? 2. You seem to be confusing the standard deviation and the standard error in the ordinary one sample t-test (the orange image shows one, then you say the other; they're different). 3. Testing a regression coefficient is not (usually) the same as performing an ordinary one-sample t-test. – Glen_b Nov 08 '15 at 15:10
  • Although asked in terms of R, this is clearly a statistical question. IMO, it should be considered on topic here. – gung - Reinstate Monica Nov 08 '15 at 15:19
  • @Glen_b, the orange image is the one I summarized from http://stattrek.com/probability-distributions/t-distribution.aspx. could you provide some more details on how to test a regression coefficient please? – HappyCoding Nov 08 '15 at 15:34
  • @HappyCoding No, because you have a more fundamental problem as I already explained in my previous point (2). The fact that you just ignored it completely concerns me. Given (3), it might be better to look at the one sample t-statistic throughout, get that sorted out and then if you still have a question about regression, to ask about it separately (since dealing with the other issue will change what you ask). – Glen_b Nov 08 '15 at 15:37
  • I've posted an answer about your misunderstandings in the one-sample t-statistic.. – Glen_b Nov 08 '15 at 15:54

1 Answer

The equation in the orange image is for a single sample. If a sample of size $n$ is drawn from a normal distribution with mean $\mu$ and standard deviation $\sigma$, then the sample mean $\bar{x}$ is normal with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$.

Hence $\frac{\bar{x}-\mu}{\sigma/\sqrt{n}}$ will have a standard normal distribution. However, if we estimate $\sigma/\sqrt{n}$ from the sample, by the estimated standard error of the mean, $s/\sqrt{n}$, (where $s$ is the standard deviation of the sample), then

$\frac{\bar{x}-\mu}{std.err(\bar{x})}=\frac{\bar{x}-\mu}{s/\sqrt{n}}$ has a $t$ distribution with $n-1$ degrees of freedom.
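
To make the one-sample case concrete, here is a minimal sketch in R on simulated data. The sample size, seed, and the values of `mu0`, mean and sd are arbitrary assumptions chosen for illustration; they do not come from the question.

```r
set.seed(1)                        # arbitrary seed, for reproducibility only
x   <- rnorm(25, mean = 5, sd = 2) # simulated sample (hypothetical values)
mu0 <- 5                           # hypothesised mean under the null

n  <- length(x)
s  <- sd(x)                        # standard deviation of the sample
se <- s / sqrt(n)                  # estimated standard error of the mean

# One-sample t-statistic: (xbar - mu) / (s / sqrt(n))
t_manual <- (mean(x) - mu0) / se

# t.test() should report the same statistic, with n - 1 degrees of freedom.
tt <- t.test(x, mu = mu0)
all.equal(t_manual, unname(tt$statistic))
unname(tt$parameter) == n - 1
```

The `sqrt(n)` enters exactly once, when converting the standard deviation `s` into the standard error `se`; dividing `se` by `sqrt(n)` again would apply it twice.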

The orange image is consistent with what I just said.

But then you said:

t-score given H_null = estimated_mean / (std.error/sqrt(n))

This is not correct. Note that the standard error of the mean is $s/\sqrt{n}$, where $s$ is the standard deviation of the data.

The t-statistic in regression is slightly different (though analogous in form; it is $t=\frac{b-\beta}{s.e.(b)}$ where $b$ is the estimated coefficient).
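
As a rough illustration of the regression case, the sketch below recomputes the slope's t-statistic by hand for the iris model from the question. It relies on the standard textbook formula for the slope's standard error in simple linear regression, which is an addition here and not something stated above; the variable names are illustrative.

```r
x <- iris$Petal.Width
y <- iris$Sepal.Width
n <- length(y)

fit <- lm(y ~ x)
b   <- unname(coef(fit)["x"])      # estimated slope

# Residual standard deviation, using n - 2 degrees of freedom.
s_res <- sqrt(sum(residuals(fit)^2) / (n - 2))

# Standard error of the slope in simple linear regression.
se_b <- s_res / sqrt(sum((x - mean(x))^2))

# t-statistic for the null hypothesis beta = 0.
t_slope <- (b - 0) / se_b
all.equal(t_slope, summary(fit)$coefficients["x", "t value"])
```

Here the standard error of $b$ is not $s/\sqrt{n}$; it depends on the spread of the predictor as well as on the residual variation, which is one way the regression statistic differs from the one-sample one.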

It's important to understand the problem here with the more basic case first.

Glen_b
  • Thanks @Glen_b for pointing out that the t distribution is different from the t-statistic in regression. Following your suggestion, I did some searching and read http://stattrek.com/regression/slope-test.aspx?Tutorial=AP. It's much clearer now. – HappyCoding Nov 09 '15 at 01:48
  • I just updated my question by replacing n with sqrt(n) – HappyCoding Nov 09 '15 at 01:51
  • @HappyCoding Thanks for the heads up; I'll edit my answer a tiny bit; that should make things clearer for later readers (because there's fewer sources of confusion, the central issue about the standard error is clearer) – Glen_b Nov 09 '15 at 01:52
  • thanks @Glen_b! That's great! It could be even better if you can also explain a bit on the differences and connections between t-statistic in sampling and regression, besides the formula difference as you pointed out above. – HappyCoding Nov 09 '15 at 01:55
  • Thanks, your comments are very helpful. I just highlighted this here: **Note that the standard error of the mean is s/sqrt(n), where s is the standard deviation of the data.** – HappyCoding Nov 09 '15 at 02:13
  • I plan to go back and add the additions you mentioned earlier when I get a chance. – Glen_b Nov 09 '15 at 04:21