
EDIT2: Possibly the short version of the question is just: how would I test cat2 = cat3 in the Stata example linked below?

I would like to test the hypothesis that $\beta_{typeprof} = -\beta_{typewc}$. At first glance this seems a pretty straightforward thing to do in R using the linearHypothesis function in the car package. However, the presence of intercepts and multiple (categorical) variables made the interpretation a bit trickier for me.

Here's my reproducible example:

library(car)
# no intercept present: type gets full dummy coding (typebc, typeprof, typewc)
mod.duncan0 <- lm(prestige ~ 0 + income + education + type, data = Duncan)
linearHypothesis(mod.duncan0, "typeprof = -typewc")

will return

Hypothesis: typeprof + typewc = 0

Model 1: restricted model
Model 2: prestige ~ 0 + income + education + type

 Res.Df    RSS Df Sum of Sq      F Pr(>F)
1    41 3798.8                           
2    40 3798.0  1   0.86657 0.0091 0.9244

so we clearly cannot reject $H_0$. However, as far as I understood the linearHypothesis method after stepping through linearHypothesis.default with debug(), the hypothesis matrix/formula needs to be adjusted in order to test the same thing in a model with an intercept: under reference coding, typeprof and typewc are contrasts against the reference level typebc, so the intercept has to be folded into both sides to recover the cell values being compared (see also this Stata-related discussion here):

# with intercept present: typebc is absorbed into the intercept as the reference level
mod.duncan1 <- lm(prestige ~ income + education + type, data = Duncan)
linearHypothesis(mod.duncan1, "typeprof + (Intercept) = -typewc - (Intercept)")

which will return exactly the same result. Now assume that, still using reference coding, I add another categorical variable to the mix. The Duncan dataset doesn't come with one, so I made a bogus variable bs up:

set.seed(123)
Duncan$bs <- as.factor(rbinom(45, 4, .4))  # made-up categorical variable
mod.duncanM0 <- lm(prestige ~ 0 + income + education + bs + type, data = Duncan)

Now the question is:

Can I meaningfully test $\beta_{typeprof} = -\beta_{typewc}$ just by leaving the general intercept out, even though there are bs-specific intercepts?
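Concretely, the call I have in mind is just the direct analogue of the first, no-intercept example (a sketch; whether it is meaningful here is exactly what I'm asking):

# sketch only: same syntax as for mod.duncan0 above; with bs-specific
# intercepts in the model it is unclear to me whether typeprof and typewc
# can still be read as the quantities I want to compare
linearHypothesis(mod.duncanM0, "typeprof = -typewc")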

Note, this is a made-up example and I understand that testing this hypothesis doesn't make much sense with this dataset, but in the original data type is a variable that clearly has a neutral reference category plus a positive and a negative category. Btw, here's a summary of the model itself, just for the sake of completeness and to see the estimates themselves:

[screenshot of the model summary]
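The screenshot is just the standard summary output; for a text version it can be reproduced with:

summary(mod.duncanM0)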

As you can see, with reference coding all of the bs categories are included while type leaves one category (typebc) out. Does this hamper the desired interpretation in any way?

EDIT: This question and particularly @gung's answer are related.

  • Do not use linear regression without an intercept. It is almost always wrong, and introduces bias and inconsistency – Repmat Oct 11 '15 at 13:27
  • Well, I know R-squared etc. isn't comparable. Though you're right that it's not advisable most of the time, that's a bit general; here it's just bs-specific intercepts. Plus in this case it's only about testing the hypothesis that the effects of typeprof and typewc are symmetric. – hans0l0 Oct 11 '15 at 13:48
  • That effect is biased for sure if you omit the intercept, hence so is the test – Repmat Oct 11 '15 at 13:49
  • @Repmat I think this discussion http://stats.stackexchange.com/questions/7948/when-is-it-ok-to-remove-the-intercept-in-lm (in particular Joshua's answer) gives some insight w.r.t. categorical variables. – hans0l0 Oct 11 '15 at 15:45
  • Sorry, I still don't see why you would like to exclude the intercept. Also I do not understand why you wanna test b1 = -b2 and not b1 = b2... – Repmat Oct 11 '15 at 17:51
  • If you look at the summary you can see that typeprof = typewc is not gonna happen, whereas typeprof = -typewc might be more interesting. The reason I am thinking about excluding the intercept is that I don't see an easy way of testing the linear hypothesis, no matter if it's b1 = b2 or b1 = -b2. Testing that would just mean adding the intercepts to the test to obtain the same result. I am not really keen on excluding the intercept because it's just harder to convince people. So I am happy to learn about a better way to test my hypothesis. – hans0l0 Oct 11 '15 at 18:03
  • @Repmat I see your point about the test being biased, but that only holds for the categorical variable that is fully included, in this case bs. The type categorical variable and its standard error don't change a bit compared to the model with intercept. And all I want to do is control for bs, not interpret the single coefficients of bs. – hans0l0 Oct 11 '15 at 18:36
