An ANOVA can be described as a regression on dummy variables. You can, for example, calculate the treatment sum of squares in an ANOVA table from the coefficients of a linear model:

> y <- rnorm(10)
> x1 <- as.factor(c(0,0,0,0,0,0,1,1,1,1))
> y.bar <- mean(y)
> f1 <- lm(y ~ x1)
> sum(((f1$coef[1]) - y.bar)^2)*6 + sum(((f1$coef[1] + f1$coef[2]) - y.bar)^2)*4
[1] 1.784887
> anova(f1)
Analysis of Variance Table

Response: y
          Df Sum Sq Mean Sq F value Pr(>F)
x1         1 1.7849  1.7849   1.596  0.242
Residuals  8 8.9470  1.1184
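
The same quantity can be computed without hard-coding the group sizes, since for a linear model with an intercept the model sum of squares equals the sum of squared deviations of the fitted values from the grand mean:

## equivalent check, reusing the objects defined above
sum((fitted(f1) - y.bar)^2)

which again gives the 1.7849 shown in the x1 row.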

However, it is less clear how this works when using two or more continuous predictors:

> x2 <- rnorm(10)
> x3 <- rnorm(10)
> f2 <- lm(y ~ x2 + x3)
> anova(f2)
Analysis of Variance Table

Response: y
          Df Sum Sq Mean Sq F value Pr(>F)
x2         1 0.7797 0.77970  0.5959 0.4654
x3         1 0.7934 0.79336  0.6064 0.4617
Residuals  7 9.1588 1.30841

How are the sums of squares calculated in this case, and how should they be interpreted?


1 Answer


https://rcompanion.org/rcompanion/d_04.html explains this well, in particular how you can get apparently inconsistent results depending on whether Type I, II, or III sums of squares are used when the model contains interactions.
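
One way to see the order dependence of Type I (sequential) sums of squares is to fit the same model with the terms listed in either order; with correlated predictors the two tables differ. A small sketch with made-up data (the seed, sample size, and variable names here are arbitrary):

set.seed(1)
a <- rnorm(20)
b <- a + rnorm(20)        # deliberately correlated with a
z <- rnorm(20)
anova(lm(z ~ a + b))      # SS for a ignores b; SS for b is adjusted for a
anova(lm(z ~ b + a))      # reversing the order generally changes both sums of squares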

In a model like yours with only main effects it's pretty easy. anova() in R reports sequential (Type I) sums of squares: the sum of squares for a term is the difference between the residual sum of squares (SSE) of the model without that term and the SSE once the term is added, with terms entered in the order they appear in the formula.

set.seed(123) ## never forget this when using rnorm
x1 <- rnorm(10)
x2 <- rnorm(10)
y  <- rnorm(10)
f2 <- lm(y ~ x1 + x2)  ## full model with both predictors
anova(f2)              ## sequential (Type I) ANOVA table

gives:

> anova(f2)
Analysis of Variance Table

Response: y
          Df Sum Sq Mean Sq F value Pr(>F)
x1         1 1.2851 1.28508  1.7246 0.2305
x2         1 1.2965 1.29652  1.7399 0.2287
Residuals  7 5.2161 0.74515 

and

f1 <- lm(y ~ x1)
sum(residuals(f1)^2)  - sum(residuals(f2)^2) 

gives

> sum(residuals(f1)^2)  - sum(residuals(f2)^2) 
[1] 1.296519

which is the same 1.2965 displayed in the Sum Sq cell of the x2 row.
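
Because the sums of squares are sequential, the x1 row can be reproduced the same way by comparing the intercept-only model with the model containing only x1, and the term and residual sums of squares add up to the total sum of squares. A short check, reusing f1 and f2 from above (f0 is just a new name for the intercept-only fit):

f0 <- lm(y ~ 1)                              # intercept-only model
sum(residuals(f0)^2) - sum(residuals(f1)^2)  # reproduces 1.2851, the x1 Sum Sq
sum(residuals(f0)^2)                         # total SS = 1.2851 + 1.2965 + 5.2161

In other words, anova() decomposes the total sum of squares of y around its mean into pieces attributed to each term in the order entered, plus a residual.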
