I have carried out this linear regression that includes month coded as a dummy variable:

library(plyr)
set.seed(1)
y <- rnorm(120)
x1 <- c(rep("adult", 60), rep("juvenile", 60))
x2 <- c(rep("male", 60), rep("female", 60))
x3 <- unlist(llply(month.abb, function(x) rep(x, 10)))

summary(lm(y ~ x1 + x2 + x3))

Call:
lm(formula = y ~ x1 + x2 + x3)

Residuals:
     Min       1Q   Median       3Q      Max 
-2.46354 -0.51524 -0.03981  0.57625  1.95041 

Coefficients: (2 not defined because of singularities)
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.12073    0.28564   0.423    0.673
x1juvenile   0.00663    0.40396   0.016    0.987
x2male            NA         NA      NA       NA
x3Aug       -0.37510    0.40396  -0.929    0.355
x3Dec       -0.24718    0.40396  -0.612    0.542
x3Feb        0.12812    0.40396   0.317    0.752
x3Jan        0.01147    0.40396   0.028    0.977
x3Jul        0.32385    0.40396   0.802    0.424
x3Jun        0.02273    0.40396   0.056    0.955
x3Mar       -0.25440    0.40396  -0.630    0.530
x3May        0.01341    0.40396   0.033    0.974
x3Nov        0.22012    0.40396   0.545    0.587
x3Oct       -0.01502    0.40396  -0.037    0.970
x3Sep             NA         NA      NA       NA

Residual standard error: 0.9033 on 108 degrees of freedom
Multiple R-squared:  0.04703,   Adjusted R-squared:  -0.05003 
F-statistic: 0.4845 on 11 and 108 DF,  p-value: 0.9093

I now want to present the results of this linear regression in a table. Instead of reporting the beta for every month, is there a way to summarise the overall effect of month on y within the same table? For example, would it be acceptable to summarise the beta, SE, t value and p value of x3 by taking their mean values across months?

luciano
  • I think this is mostly (if not as directly) covered by Macro's answer in [this post](http://stats.stackexchange.com/questions/31690/how-to-test-the-statistical-significance-for-categorical-variable-in-linear-regr?rq=1). It even includes discussion of the use of `anova`. If for some reason that doesn't cover it, I can undelete my answer here. – Glen_b Dec 21 '13 at 20:32
  • question amended with more details – luciano Dec 21 '13 at 21:03
  • Now I have no clue what you're even asking. What information do you want to show? If you're not interested in summarizing contribution to the sum of squares, what *are* you interested in showing? – Glen_b Dec 21 '13 at 21:08
  • The most economical way to present this regression in a table is to show an empty table, because the regression indicates there are no significant linear relationships between the response and *any* of the variables you have included! – whuber Dec 21 '13 at 21:32
  • question again edited – luciano Dec 21 '13 at 21:40
  • No, you cannot average the t values or p values across months. I would also worry about the "two not defined because of singularities" and the odd pattern of coefficients for months. – Peter Flom Dec 21 '13 at 23:44
  • @PeterFlom: the singularities are caused by how `x1`, `x2` and `x3` are set up: for example, all males are adults observed in the first six months, and all females are juveniles observed in the last six months. – Henry Dec 21 '13 at 23:59
  • Where is the month of April? – Peter Flom Dec 22 '13 at 00:01
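
The points raised in the comments can be illustrated with a short sketch. It is not a definitive answer, only one common way to report a categorical predictor as a single row: compare the model with and without `x3` using `anova`, which gives one F statistic, its degrees of freedom and one p-value for month as a whole. The object names `full`, `reduced`, `month_test` and `aliased` are made up for this example, and base R's `rep(..., each = )` is assumed to be an acceptable stand-in for the `plyr` call in the question.

```r
# Simulated data matching the question (base R replaces the plyr call)
set.seed(1)
y  <- rnorm(120)
x1 <- rep(c("adult", "juvenile"), each = 60)
x2 <- rep(c("male", "female"), each = 60)
x3 <- rep(month.abb, each = 10)

full    <- lm(y ~ x1 + x2 + x3)  # model from the question
reduced <- lm(y ~ x1 + x2)       # same model without month

# Joint F-test: do the month dummies, taken together, improve the fit?
month_test <- anova(reduced, full)
print(month_test)

# Coefficients lm() could not estimate because of collinearity
aliased <- names(coef(full))[is.na(coef(full))]
print(aliased)

# Reference level of month, absorbed into the intercept
print(levels(factor(x3))[1])
```

With this design the comparison uses 10 numerator degrees of freedom rather than 11, because `x1` is fully determined by month and so one month contrast is already accounted for. The last line shows that `Apr` is the reference level (R orders factor levels alphabetically by default), which is why no `x3Apr` row appears in the coefficient table.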
