How to understand t-value in R's lm()?

Question

It's similar to the post Interpretation of R's lm() output.

lm(formula = iris$Sepal.Width ~ iris$Petal.Width)

However, there is one point I can't understand in the explanation of the t-value.

It says that the t-value is the ratio of the first two values:

t-value = estimate_mean/std.error

Question: Is this t-value exactly the t-score in Student's t distribution?

Based on my understanding, from the definition, t-score is calculated as follows.

Assuming a null hypothesis that the response residual mean is 0, the correct t-score in this lm() case should, in my understanding, be as follows.

 t-score given H_null = estimated_mean / (std.error/sqrt(n)) 
                      = sqrt(n) * estimated_mean/std.error

Therefore, the t-score I derived is sqrt(n) times larger than the t-value given by lm(). Does anyone know which part of the above is wrong? Thanks!

1. What is the source of the orange image? 2. You seem to be confusing the standard deviation and the standard error in the ordinary one sample t-test (the orange image shows one, then you say the other; they're different). 3. Testing a regression coefficient is not (usually) the same as performing an ordinary one-sample t-test. — Glen_b, Nov 08 '15 at 15:10
Although asked in terms of R, this is clearly a statistical question. IMO, it should be considered on topic here. — gung - Reinstate Monica, Nov 08 '15 at 15:19
@Glen_b, the orange image is the one I summarized from http://stattrek.com/probability-distributions/t-distribution.aspx. could you provide some more details on how to test a regression coefficient please? — HappyCoding, Nov 08 '15 at 15:34
@HappyCoding No, because you have a more fundamental problem as I already explained in my previous point (2). The fact that you just ignored it completely concerns me. Given (3), it might be better to look at the one sample t-statistic throughout, get that sorted out and then if you still have a question about regression, to ask about it separately (since dealing with the other issue will change what you ask). — Glen_b, Nov 08 '15 at 15:37
I've posted an answer about your misunderstandings in the one-sample t-statistic.. — Glen_b, Nov 08 '15 at 15:54

Glen_b · Accepted Answer · 2015-11-09T01:53:49.160

2

The equation in the orange image is for a single sample. If a sample of size $n$ is drawn from a normal distribution with mean $\mu$ and standard deviation $\sigma$, then $\bar{x}$ is normal with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$.

Hence $\frac{\bar{x}-\mu}{\sigma/\sqrt{n}}$ will have a standard normal distribution. However, if we estimate $\sigma/\sqrt{n}$ from the sample, by the estimated standard error of the mean, $s/\sqrt{n}$, (where $s$ is the standard deviation of the sample), then

$\frac{\bar{x}-\mu}{std.err(\bar{x})}=\frac{\bar{x}-\mu}{s/\sqrt{n}}$ has a $t$ distribution with $n-1$ degrees of freedom.
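As a quick check of the one-sample statistic above, here is a minimal R sketch. The data (`iris$Sepal.Width`) and the hypothesized mean `mu0 = 3` are arbitrary choices made for illustration, not something taken from the question; the sketch computes the statistic by hand and compares it with `t.test()`.

```r
# One-sample t-statistic computed by hand.
# Assumption: mu0 = 3 is an arbitrary hypothesized mean chosen for illustration.
x   <- iris$Sepal.Width
n   <- length(x)
mu0 <- 3

se_mean  <- sd(x) / sqrt(n)            # estimated standard error of the mean, s / sqrt(n)
t_manual <- (mean(x) - mu0) / se_mean  # t-statistic with n - 1 degrees of freedom

# Built-in test for comparison
tt <- t.test(x, mu = mu0)

all.equal(t_manual, unname(tt$statistic))  # the two statistics should agree
unname(tt$parameter) == n - 1              # degrees of freedom reported by t.test()
```

Note that only the sample standard deviation `sd(x)` is divided by `sqrt(n)`; the standard error itself is not divided again.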

The orange image is consistent with what I just said.

But then you said:

t-score given H_null = estimated_mean / (std.error/sqrt(n))

This is not correct. Note that the standard error of the mean is $s/\sqrt{n}$, where $s$ is the standard deviation of the data.
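Spelling out the algebra may help (taking the null value $\mu = 0$, as in the question). Dividing the standard error by $\sqrt{n}$ a second time gives

$$\frac{\bar{x}}{\text{std.err}(\bar{x})/\sqrt{n}} = \frac{\bar{x}}{(s/\sqrt{n})/\sqrt{n}} = \frac{n\,\bar{x}}{s} = \sqrt{n}\cdot\frac{\bar{x}}{s/\sqrt{n}},$$

which is $\sqrt{n}$ times the correct statistic $\frac{\bar{x}}{s/\sqrt{n}}$. The factor $\sqrt{n}$ is already inside the standard error, so it should not be applied again.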

The t-statistic in regression is slightly different (though analogous in form; it's $t=\frac{b-\beta}{s.e.(b)}$, where $b$ is the estimated coefficient).
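To see this in the `lm()` output from the question, the following sketch recomputes the `t value` column. It assumes the usual null hypothesis $\beta = 0$ that `summary()` tests, and it uses the textbook simple-regression formula for the slope's standard error; the variable names are chosen here for illustration.

```r
# Regression t-statistic for the model in the question.
fit   <- lm(Sepal.Width ~ Petal.Width, data = iris)
coefs <- coef(summary(fit))   # matrix: Estimate, Std. Error, t value, Pr(>|t|)

# Under H0: beta = 0, t = b / s.e.(b), with no extra sqrt(n) factor
t_manual <- coefs[, "Estimate"] / coefs[, "Std. Error"]
all.equal(t_manual, coefs[, "t value"])

# The slope's standard error by hand (simple linear regression):
# residual standard deviation divided by sqrt(sum((x - mean(x))^2))
x         <- iris$Petal.Width
sigma_hat <- sqrt(sum(resid(fit)^2) / df.residual(fit))
se_slope  <- sigma_hat / sqrt(sum((x - mean(x))^2))
all.equal(se_slope, coefs["Petal.Width", "Std. Error"])

# Two-sided p-value from the t distribution with n - 2 degrees of freedom
p_manual <- 2 * pt(abs(t_manual), df = df.residual(fit), lower.tail = FALSE)
all.equal(p_manual, coefs[, "Pr(>|t|)"])
```

The sample size enters the slope's standard error through the sum of squares in the denominator, which grows with $n$, so the `Std. Error` column needs no further division by `sqrt(n)`.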

It's important to understand the problem here with the more basic case first.

edited Nov 09 '15 at 01:53

answered Nov 08 '15 at 15:53

Glen_b


Thanks @Glen_b for pointing out that the t distribution is different from the t-statistic in regression. Following your suggestion, I did some searching and read http://stattrek.com/regression/slope-test.aspx?Tutorial=AP. It's much clearer now. – HappyCoding Nov 09 '15 at 01:48
I just updated my question by replacing n with sqrt(n) – HappyCoding Nov 09 '15 at 01:51
@HappyCoding Thanks for the heads up; I'll edit my answer a tiny bit; that should make things clearer for later readers (because there's fewer sources of confusion, the central issue about the standard error is clearer) – Glen_b Nov 09 '15 at 01:52
thanks @Glen_b! That's great! It could be even better if you can also explain a bit on the differences and connections between t-statistic in sampling and regression, besides the formula difference as you pointed out above. – HappyCoding Nov 09 '15 at 01:55
Thanks, the comments are very helpful. I just highlighted this here: **Note that the standard error of the mean is s/sqrt(n), where s is the standard deviation of the data.** – HappyCoding Nov 09 '15 at 02:13
I plan to go back and add the additions you mentioned earlier when I get a chance. – Glen_b Nov 09 '15 at 04:21
