
I've seen this overall F-test several times, and whether they compare the F-statistic with the critical value at $\alpha=5\%$ or compare the p-value with $\alpha$, they always use the right tail, whereas the hypothesis is formulated as

$$H_0:\beta_0=\beta_1=\beta_2=\beta_3=0 \\ H_1: \text{at least one of the tested parameters is nonzero}$$

[Four figures: F-distribution plots from two examples, each marking the critical value at $\alpha=0.05$ and the p-value of the observed $F$-statistic.]

The first two graphs are from example one; the second two are from example two. Pay attention to the critical value corresponding to $\alpha=0.05$ and the p-value corresponding to the $F$-statistic: both numbers come from the right tail only. Why not two-tailed?
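
For concreteness, here is a minimal sketch (with a hypothetical $F$ value and hypothetical degrees of freedom, not the numbers from my examples) of how such right-tail quantities are computed in SciPy:

```python
# Minimal sketch (made-up numbers) of where the right-tail
# critical value and p-value in regression output come from.
from scipy import stats

dfn, dfd = 3, 26  # hypothetical numerator/denominator degrees of freedom
f_stat = 4.7      # hypothetical observed F-statistic

crit = stats.f.ppf(0.95, dfn, dfd)    # critical value: 95th percentile, right tail
p_val = stats.f.sf(f_stat, dfn, dfd)  # p-value: P(F >= f_stat), right tail only
print(f"critical value = {crit:.3f}, p-value = {p_val:.4f}")
```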

LJNG
  • I may expand on this tomorrow if there are no answers (though there is no “calling dibs” on Cross Validated), but the gist is that the F-test is asking if one model is better than another. If that is the case, then you deem “significant” the parameters in the larger model but not in the smaller model. If you do a two-sided test, you are considering that the larger model is worse than the smaller model, which makes no sense from the standpoint of trying to show that a predictor impacts the response variable. – Dave Oct 26 '21 at 01:56
  • @Dave Thank you for your reply. I can understand comparing the short vs. the long model: if significant, we reject the short model. But what do you mean by the two-sided test implying the long model is worse than the short model? – LJNG Oct 26 '21 at 02:10
  • The explanation is similar to the explanation for one-way ANOVA here: [Why do we use a one-tailed test F-test in analysis of variance (ANOVA)?](https://stats.stackexchange.com/a/73993/805) -- which is to say, when the assumptions of the test hold, the test is organized so that large F values will tend to occur when $H_0$ is false in any "direction"; unusually small values will indicate a situation *highly* consistent with the null (that might be surprising in some sense but not in a way that would lead you to reject the null). – Glen_b Oct 26 '21 at 02:10
  • @Dave: Seems you have an answer! – kjetil b halvorsen Oct 26 '21 at 14:06
  • Consistent with @Glen_b, you can think of it this way: if you are interested in quantifying evidence for differences, you look at the right tail. If you want to test whether means are more similar than chance would likely produce under the null, then compute the left-tail probability. – Frank Harrell Oct 26 '21 at 14:40
  • @Glen_b Yeah, that ANOVA post is really good. A low F-statistic may not reveal differences in means, so we abandon the left tail and focus on the right tail. If that's the argument, why do we use a two-tailed variance test? Since a low F-statistic may also fail to indicate a true difference in variance, shouldn't that test also focus on the right tail? – LJNG Oct 26 '21 at 17:09
  • The low $F$-stat does suggest a difference in variance, just not in a way that is useful for testing in regression. It suggests that the variance of the error term in the larger model is greater than the variance of the error term in the smaller model. If you use an $F$-test to compare the variances of two distributions, instead of comparing their means, then you might care if cats have a lower variance than dogs just as much as if dogs have a lower variance than cats. – Dave Oct 26 '21 at 17:16
  • @Dave Tremendously helpful. Thank you so much! – LJNG Oct 26 '21 at 17:18

1 Answer


The gist is that the $F$-test is asking if one model is better than another. If that is the case, then you deem “significant” the parameters in the larger model but not in the smaller model. If you do a two-sided test, you are considering that the larger model is worse than the smaller model, which makes no sense from the standpoint of trying to show that a predictor impacts the response variable.

In a bit more detail, the $F$-test compares two models on the squared differences between predicted and observed values (square loss), mathematically expressed as $\sum_i (y_i - \hat y_i)^2$. The models are nested, meaning that the larger one has all of the parameters that the smaller one has, plus some extra ones (maybe just one, maybe a bunch). The test then asks if the reduction in square loss is enough to convince us that the additional parameters are not zero (reject the null hypothesis). If those parameters are not zero, then the variables to which they correspond can be considered contributors to the values of $y$.
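
As a concrete illustration (my own sketch, with simulated data, not part of the original question), the statistic can be computed directly from the two square losses and its p-value taken from the right tail:

```python
# Sketch of the nested-model F-test computed from the two square losses.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 100
X = rng.normal(size=(n, 3))                   # three candidate predictors
y = 1.0 + 2.0 * X[:, 0] + rng.normal(size=n)  # only the first one matters

def square_loss(design, y):
    """SSE of an OLS fit with an intercept; also return the parameter count."""
    X1 = np.column_stack([np.ones(len(y)), design])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return resid @ resid, X1.shape[1]

sse_small, k_small = square_loss(X[:, :1], y)  # smaller model: intercept + x1
sse_large, k_large = square_loss(X, y)         # larger model: all predictors

q = k_large - k_small                          # number of extra parameters tested
f_stat = ((sse_small - sse_large) / q) / (sse_large / (n - k_large))
p_val = stats.f.sf(f_stat, q, n - k_large)     # right tail only
print(f"F = {f_stat:.2f}, p = {p_val:.4f}")
```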

This is why we only care about the one-sided test. If the evidence is that the larger model provides a worse fit than the smaller model, then that does not help us show that the additional parameters are non-zero.

EDIT

A low $F$-stat does suggest a difference in variance, just not in a way that is useful for testing in regression. It suggests that the variance of the error term in the larger model is greater than the variance of the error term in the smaller model;$^{\dagger}$ this is not compelling evidence that a feature influences $y$ and, therefore, is "significant". If you use an $F$-test to compare the variances of two distributions, instead of comparing their means (as a $t$-test does), then you might care if cats have a lower variance than dogs just as much as if dogs have a lower variance than cats.
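
To make the contrast concrete, here is a sketch of my own (with made-up "cat" and "dog" data) of such a two-tailed variance comparison:

```python
# Sketch of a two-distribution variance comparison, where both tails
# of the F distribution matter.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
cats = rng.normal(0.0, 1.0, size=30)  # hypothetical "cat" measurements
dogs = rng.normal(0.0, 1.5, size=25)  # hypothetical "dog" measurements

f_stat = np.var(cats, ddof=1) / np.var(dogs, ddof=1)
dfn, dfd = len(cats) - 1, len(dogs) - 1

# Two-sided p-value: a very small ratio is just as interesting as a very
# large one, so take the smaller tail probability and double it.
p_two_sided = 2 * min(stats.f.cdf(f_stat, dfn, dfd),
                      stats.f.sf(f_stat, dfn, dfd))
print(f"F = {f_stat:.3f}, two-sided p = {p_two_sided:.3f}")
```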

$^{\dagger}$This is a bizarre occurrence. A larger model's error-term variance is smaller than (or at most equal to) that of the smaller (nested) model, except under specific circumstances. The only time I have heard of this happening is when there is extreme multicollinearity, such as including polynomial terms that wind up being highly multicollinear.
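
A small simulation of my own (not from the original answer) shows how this can happen under those circumstances:

```python
# Over a narrow range of x, the columns x and x^2 are nearly collinear, and
# adding the useless x^2 term can raise the estimated error variance
# SSE / (n - p), even though adding a term can never increase the raw SSE.
import numpy as np

rng = np.random.default_rng(2)
n = 30
x = rng.uniform(10.0, 11.0, size=n)     # narrow range -> x, x^2 nearly collinear
y = 3.0 + 0.5 * x + rng.normal(size=n)  # x^2 plays no role in the truth

def error_variance(design, y):
    """Estimated error variance: SSE / (n - number of parameters)."""
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    return (resid @ resid) / (len(y) - design.shape[1])

X_small = np.column_stack([np.ones(n), x])
X_large = np.column_stack([np.ones(n), x, x**2])  # adds the unneeded term

# The large-model estimate exceeds the small-model one exactly when the
# F-statistic for the extra term falls below 1.
print(f"small-model estimate: {error_variance(X_small, y):.4f}")
print(f"large-model estimate: {error_variance(X_large, y):.4f}")
```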

Dave
  • Thank you for your reply. Would you please elaborate on “If you do a two-sided test, you are considering that the larger model is worse than the smaller model”? I don't get this argument. – LJNG Oct 26 '21 at 17:15
  • @LJNG I just posted a comment to your question. See if that helps you. – Dave Oct 26 '21 at 17:16
  • Would you please also check out this question of mine? https://stats.stackexchange.com/questions/549416/why-use-dummy-variable-if-just-focusing-one-specific-group-can-i-avoid-dummy-a – LJNG Oct 26 '21 at 18:00