
I am analysing a dataset on the performance of several species as a function of an environmental variable.

To do so, I fit a linear model with the lm function from R's stats package.

Model 1

formula_All_Species = Perf ~ -1 + Species + Species:Env + 
           Species:Env_sq

Species is a factor; Env and Env_sq are numeric, with Env_sq = Env^2.
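
Written out as an equation, Model 1 can be read as one quadratic curve per species. The symbols $\alpha$, $\beta$, $\gamma$, $\varepsilon$ and the index $s(i)$ (the species of observation $i$) are notation introduced here for illustration; they do not appear in the original post.

$$
\text{Perf}_i = \alpha_{s(i)} + \beta_{s(i)}\,\text{Env}_i + \gamma_{s(i)}\,\text{Env}_i^2 + \varepsilon_i
$$

Here $\alpha_s$ corresponds to the Species coefficient for species $s$, $\beta_s$ to its Species:Env coefficient, and $\gamma_s$ to its Species:Env_sq coefficient. Because lm fits a single model, the errors $\varepsilon_i$ are treated as having one residual variance shared by all species.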

I look at the results for a single species (for instance, the coefficients Species01, Species01:Env and Species01:Env_sq).

I wonder whether Model 1 differs from the following Model 2, fitted separately to each species:

Model 2

formula_Single_Species = Perf ~ 1 + Env + Env_sq

In summary, how does a model with interactions between a categorical variable and numeric variables differ from a set of separate models without interactions (one per category)?

kjetil b halvorsen
Camchou
  • If you have a categorical variable, the two models shouldn't differ. If you pick one particular species and simplify Model 1 by removing all the other species, you should end up with Model 2. I think your question is similar to this one: https://stats.stackexchange.com/questions/547577/multiple-regression-r-output-how-to-interpret-the-intercept/547588#547588 – Dave2e Jan 20 '22 at 22:48
  • Thank you. I was simply wondering whether there was a difference in the inference of the parameters or something similar. I am happy to have confirmation that both models are equivalent for a single species. – Camchou Jan 25 '22 at 12:40
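
The equivalence discussed in the comments can be checked on simulated data. The sketch below is only an illustration: the species names, sample sizes, coefficients and noise levels are made up, and the data frame name dat is hypothetical. It fits Model 1 to all species and Model 2 to Species01 alone, then compares the point estimates and the residual standard deviations.

```r
# Simulated data: all values below are invented for illustration.
set.seed(1)
n_per_species <- 50
species_levels <- c("Species01", "Species02", "Species03")
dat <- data.frame(
  Species = factor(rep(species_levels, each = n_per_species)),
  Env = runif(n_per_species * length(species_levels), min = 0, max = 10)
)
dat$Env_sq <- dat$Env^2

# Species-specific quadratic response with species-specific noise levels
idx <- as.integer(dat$Species)
dat$Perf <- c(1, 2, 3)[idx] +
  c(0.5, -0.2, 0.8)[idx] * dat$Env +
  c(-0.05, 0.02, -0.08)[idx] * dat$Env_sq +
  rnorm(nrow(dat), sd = c(0.5, 1, 2)[idx])

# Model 1: all species at once, one curve per species
fit_all <- lm(Perf ~ -1 + Species + Species:Env + Species:Env_sq, data = dat)

# Model 2: Species01 only
fit_one <- lm(Perf ~ 1 + Env + Env_sq,
              data = subset(dat, Species == "Species01"))

# Point estimates for Species01 from each fit
# (lm prefixes each factor level with the variable name "Species")
coef_all <- coef(fit_all)[c("SpeciesSpecies01",
                            "SpeciesSpecies01:Env",
                            "SpeciesSpecies01:Env_sq")]
coef_one <- coef(fit_one)
print(cbind(model_1 = unname(coef_all), model_2 = unname(coef_one)))

# Residual standard deviation: pooled over species vs. Species01 only
print(c(model_1 = sigma(fit_all), model_2 = sigma(fit_one)))
```

The point estimates agree because the Model 1 design matrix has a separate block of columns for each species. The residual standard deviation is pooled across all species in Model 1 but estimated from one species in Model 2, so standard errors, t-tests and confidence intervals will generally differ between the two approaches even though the coefficients do not.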
