Unlike the other questions, this one is about a three-way interaction and whether it can be fitted without the second-order (two-way) terms while keeping the main variables in the equation. In fact, the other answers suggest that this is possible. I am not here to find the best solution, because I know it and have already included it in my question, but to know whether this is possible, regardless of whether it is preferable. Thank you, and please open my question for discussion.

The widely known regression equation for assessing the three-way interaction is

$$ Y= B_1 X+B_2 Z+B_3 W +B_4XZ+B_5XW+B_6ZW+B_7XZW+B_0 $$

All lower-order terms are included in the regression equation so that the $B_7$ coefficient represents the effect of the three-way interaction on $Y$.

Is there a way to skip the lower-order interaction terms and include only the highest-order term, as in:

$$ Y= B_1 X+B_2 Z+B_3 W +B_4XZW+B_0 $$

And how many observations do I need to fit such an equation if $X$ and $Z$ are continuous variables and $W$ is a dummy variable?

I would be thankful for any suggestions.

Funn Me
  • Why would you do that? – abaumann Aug 28 '14 at 17:15
  • I am only interested in the higher-order term, so such a long equation is just noise for me – Funn Me Aug 28 '14 at 17:23
  • Do you have reason to believe that the lower-order terms are not valid? Why are you interested only in the higher-order term? Knowing this would help us answer your question better. – Eric Aug 28 '14 at 17:26
  • If the lower order terms are in the structural model, omitting them will give you omitted variable bias. – abaumann Aug 28 '14 at 17:28
  • This is generally a bad idea. Try reading the linked thread for more information. If there is something more you need to know or still don't quite understand after reading through that, come back here and edit your Q to explain what you learned & what you are still confused about; we can re-open this to address the needed issues. – gung - Reinstate Monica Aug 28 '14 at 17:53
  • If you just want to know if it is possible to do, despite being a bad idea, you don't really need to ask; just try it. (As mentioned below, yes it can be done.) – gung - Reinstate Monica Aug 28 '14 at 18:22
  • Please don't prevent others from discussing this further. There is no point in closing the question, and you can see it is different. I hope you understand – Funn Me Aug 28 '14 at 18:27
  • Since you know it can be done, please give me a source or open my question – Funn Me Aug 28 '14 at 18:39
  • Of course it *can* be done (or there'd be no point telling people not to do it) - just add a new variable *V=XZW* to the design matrix. That you have to ask rather suggests you need to think through carefully your reasons for wanting to - the stated one of only being interested in the higher order term doesn't really cut it. – Scortchi - Reinstate Monica Aug 29 '14 at 13:59
  • Thanks, @Scortchi, for your valuable answer. Do you have any source that I can read on this matter? This could actually be helpful if there are several variables with three-way interactions – Funn Me Aug 30 '14 at 15:47
  • A source for what, exactly? If you know how to multiply & how to carry out multiple regression then you know how to carry out a multiple regression in which one of the predictors happens to be the product of some others. Why you should or shouldn't want to is quite thoroughly dealt with in the post you linked to. [Venables (1998),"Exegeses on Linear Models", S-Plus User’s Conference, Washington DC](http://www.stats.ox.ac.uk/pub/MASS3/Exegeses.pdf) also discusses the empirical modelling of interactions, & the marginality principle. – Scortchi - Reinstate Monica Sep 01 '14 at 12:46
  • Considering your second model: if $W$ is a 0/1 dummy variable, then note that for one category there's no interaction between $X$ & $Z$; for the other there's a two-way interaction between $X$ & $Z$. Note also that when $Z=0$, the slope of $Y$ on $X$ is equal for the two categories. Are you *really* sure these constraints make sense for whatever you're modelling? - that the two categories distinguished by $W$ differ in this way & that zero is such a special value for $Z$? – Scortchi - Reinstate Monica Sep 01 '14 at 13:03
  • Thanks a lot, @Scortchi. The third variable, W, is actually a year dummy that is added to the equation along with the other year dummies (4 dummies), so it will be $$ Y = B_1 X + B_2 Z + B_3\,(X \times Z \times \text{2010 dummy}) + \text{year dummies} + B_0 $$ I don't want to assess the effect of the year 2010 for all variables, only for Z with X – Funn Me Sep 01 '14 at 16:24
  • What are $Y$, $X$, & $Z$? – Scortchi - Reinstate Monica Sep 01 '14 at 16:48
  • Continuous variables; financial data – Funn Me Sep 01 '14 at 17:31
  • So for the reference, or any other, year the relation between financial variables $Y$ & $X$ is linear with a slope that doesn't depend on the value of financial variable $Z$. And for 2010 the slope does depend on the value of $Z$, but is constrained to be the same as that for other years when $Z$ is zero. If that makes sense in this particular situation, fair enough; your rather vague comments on what terms are interesting & on assessing the effect of this on that only for the other aren't in themselves very illuminating, & I hope you're relying on the mathematical formulation of the models .. – Scortchi - Reinstate Monica Sep 02 '14 at 12:02
  • ... rather than such verbal descriptions when deciding which to fit. If you've any doubts, plotting the predicted response for different predictor & coefficient values is very helpful. – Scortchi - Reinstate Monica Sep 02 '14 at 12:02
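
Scortchi's suggestion above (add a new variable *V=XZW* to the design matrix) can be sketched in a few lines. This is only an illustration under stated assumptions: the data are simulated, the "true" coefficients and the helper `slope_on_x` are hypothetical, and ordinary least squares via NumPy stands in for whatever estimator (for example, a fixed-effects panel model) would actually be used.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Simulated data (purely illustrative): X and Z continuous, W a 0/1 dummy.
X = rng.normal(size=n)
Z = rng.normal(size=n)
W = rng.integers(0, 2, size=n).astype(float)

# Hypothetical coefficients for Y = B0 + B1*X + B2*Z + B3*W + B4*X*Z*W.
true_beta = np.array([1.0, 2.0, -1.0, 0.5, 3.0])

# The product term is just another column of the design matrix.
V = X * Z * W
design = np.column_stack([np.ones(n), X, Z, W, V])
Y = design @ true_beta + rng.normal(scale=0.1, size=n)

# Ordinary least squares fit of the reduced model.
beta_hat, *_ = np.linalg.lstsq(design, Y, rcond=None)


def slope_on_x(beta, z, w):
    """Slope of Y on X implied by the reduced model: B1 + B4 * z * w."""
    return beta[1] + beta[4] * z * w


print(beta_hat)
print(slope_on_x(beta_hat, z=2.0, w=0.0))  # W = 0: slope ignores Z
print(slope_on_x(beta_hat, z=0.0, w=1.0))  # W = 1, Z = 0: same slope as W = 0
print(slope_on_x(beta_hat, z=2.0, w=1.0))  # W = 1, Z != 0: slope depends on Z
```

The three slopes make the constraints in the comments concrete: the reduced model forces the slope of $Y$ on $X$ to be $B_1$ whenever $W=0$ or $Z=0$, and lets it vary with $Z$ only when $W=1$. Whether that restriction is sensible depends on the application, not on the fit.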

1 Answer

In short, yes.

A little longer answer: you need to consider how you got to this particular collection of explanatory variables and their combinations.

If you used any kind of model selection/likelihood ratio test/... then the final p-values are conditional on what you did before. See, for example, Efron's paper.

If, in your specific data, the three-way interaction was meaningful before you saw or collected the data, then your model represents what you want and can be used.

Additional thought:

If your data are discrete, a three-way interaction is not just one thing, $XZW$; it can also be $(1-X)Z(1-W)$, and the results will be very different.

If your data are continuous, it is even messier.
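
For the discrete case, a quick enumeration shows what "not just one thing" means. The sketch below assumes all three variables are 0/1 indicators; the function names `xzw` and `alt` are made up for illustration.

```python
from itertools import product


def xzw(x, z, w):
    """The usual three-way product term."""
    return x * z * w


def alt(x, z, w):
    """An alternative three-way term built from recoded indicators."""
    return (1 - x) * z * (1 - w)


# All 2**3 = 8 combinations of three 0/1 variables.
cells = list(product([0, 1], repeat=3))

# Each product term is nonzero in exactly one cell, and not the same one.
xzw_cells = [c for c in cells if xzw(*c) == 1]
alt_cells = [c for c in cells if alt(*c) == 1]

print(xzw_cells)  # [(1, 1, 1)]
print(alt_cells)  # [(0, 1, 0)]
```

Without the lower-order terms, the single interaction coefficient therefore contrasts a different cell against the rest depending on how the indicators happen to be coded, which is one reason the fitted results can differ so much.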

Scortchi - Reinstate Monica
pes
  • Thanks a lot for your answer. I am using fixed effects with panel data, but sorry, I didn't understand your last sentence. Do you have any book or article that supports or has used this approach? All the books I have read follow the traditional way (the first equation). Thanks again, @pes – Funn Me Aug 28 '14 at 17:30
  • I just mean that XZW is a very particular interaction; why not use X log(Z) tan(W), or a product of other transformations? But for discrete variables X, Z, W, say each in {0,1}, there are exactly 8 combinations, and the interaction XZW picks out exactly one of them, (1,1,1); another form, say (1-X)Z(1-W), will point to another combination, (0,1,0) – pes Aug 28 '14 at 18:23