R lm: difference between interactions with a category and one model per category

Question

I am analysing a dataset of the performance of several species as a function of an environmental variable.

To do so, I use a simple linear model with the lm function from the stats package.

Model 1

formula_All_Species = Perf ~ -1 + Species + Species:Env + 
           Species:Env_sq

Species is a factor; Env and Env_sq are numeric, with Env_sq = Env^2.

I look at the results for a single species (for instance, I look at the coefficients Species01, Species01:Env and Species01:Env_sq).

I wonder whether Model 1 differs from the following Model 2, fitted separately for each single species:

Model 2

formula_Single_Species = Perf ~ 1 + Env + Env_sq

In summary, how is a model with interactions between categories and variables different from several models without interaction (one for each category)?

If you have a categorical variable, the two models shouldn't be different. If you choose one particular species and simplify Model 1 by removing all of the other species, you should end up with Model 2. I think your question is similar to this one: https://stats.stackexchange.com/questions/547577/multiple-regression-r-output-how-to-interpret-the-intercept/547588#547588 - Dave2e, Jan 20 '22 at 22:48
Thank you. I was simply wondering whether there was a difference in the inference of the parameters, or something like this. I am happy to have confirmation that both models are equivalent for a single species. - Camchou, Jan 25 '22 at 12:40
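
The equivalence discussed in the comments can be checked on simulated data. The sketch below is only an illustration: the data frame `dat`, the three species labels and the noise levels are made up, not taken from the question. With this parametrisation the point estimates for one species should match between the joint fit and the per-species fit. The standard errors will generally not match, because the joint model estimates a single residual variance pooled over all species and has more residual degrees of freedom.

```r
set.seed(1)

# Hypothetical data: 3 species, 40 observations each. The noise level
# differs between species so that the effect of pooling is visible.
n_per_species <- 40
dat <- data.frame(
  Species = factor(rep(c("01", "02", "03"), each = n_per_species)),
  Env     = runif(3 * n_per_species, min = 0, max = 10)
)
dat$Env_sq <- dat$Env^2
noise_sd <- c(0.5, 2, 5)[as.integer(dat$Species)]
dat$Perf <- 1 + 0.8 * dat$Env - 0.05 * dat$Env_sq +
  rnorm(nrow(dat), sd = noise_sd)

# Model 1: joint fit with one intercept, slope and curvature per species
fit_all <- lm(Perf ~ -1 + Species + Species:Env + Species:Env_sq, data = dat)

# Model 2: separate fit, here for species "01" only
fit_one <- lm(Perf ~ 1 + Env + Env_sq, data = subset(dat, Species == "01"))

# Terms of Model 1 that belong to species "01", in the same order as
# (Intercept), Env, Env_sq in Model 2
terms_01 <- c("Species01", "Species01:Env", "Species01:Env_sq")

coef_all <- unname(coef(fit_all)[terms_01])
coef_one <- unname(coef(fit_one))

# Point estimates: expected to agree up to numerical tolerance
print(all.equal(coef_all, coef_one))

se_all <- unname(summary(fit_all)$coefficients[terms_01, "Std. Error"])
se_one <- unname(summary(fit_one)$coefficients[, "Std. Error"])

# Standard errors: scaled by the ratio of the residual standard errors,
# since Model 1 pools the residual variance over all species
print(cbind(se_all, se_one, ratio = se_all / se_one))
print(sigma(fit_all) / sigma(fit_one))

# Residual degrees of freedom also differ between the two fits
print(c(joint = df.residual(fit_all), single = df.residual(fit_one)))
```

So for a single species the fitted curve is the same either way, while t-tests and confidence intervals can differ. If the residual variance is plausibly the same across species, the joint fit uses more data to estimate it; if not, the per-species fits avoid that assumption.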
