Regression when predictors are correlated by design (single measurement vs. long-term average)

Question

I am interested in determining if a biological response variable is more strongly related to the value of an environmental predictor variable measured the same year in which the response was measured, or to the average value of that predictor calculated over a multi-year period. The single-year (annual) value and the multi-year average value at a given site predictably show strong collinearity, so it doesn't make sense to use them both as explanatory variables in the same linear regression model.

I was planning to use the annual deviation from the average as a single predictor variable instead, so that 0 corresponds to the long-term average at the site (basically mean-centering on the predictor's long-term mean value rather than on the sample mean, as user cbeleites describes in their answer here). If I have this right, I should then be able to interpret the intercept as the expected value of the response when the predictor takes its long-term average value, and interpret the beta coefficient of the predictor as the change in the response variable for each 1-unit deviation from that average.
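To make the deviation predictor concrete, here is a minimal sketch in R using simulated data. Everything in it is an assumption for illustration only: the data frame `dat`, the column names `x_annual`, `x_avg` and `x_dev`, the number of sites and years, and the coefficients used to simulate the response are all hypothetical and not taken from my real data.

```r
# Simulated data; all names and values are hypothetical.
set.seed(1)
n_sites <- 20
n_years <- 5
site <- rep(seq_len(n_sites), each = n_years)

# Long-term (multi-year) average of the predictor, one value per site
x_avg <- runif(n_sites, min = 5, max = 15)[site]

# Annual value = long-term average + a year-specific departure
x_annual <- x_avg + rnorm(n_sites * n_years, mean = 0, sd = 2)

# Response simulated with arbitrary coefficients
y <- 2 + 0.5 * x_avg + 1.5 * (x_annual - x_avg) + rnorm(n_sites * n_years)

dat <- data.frame(site, y, x_annual, x_avg)

# Centre each annual value on its site's long-term average,
# so that 0 means "this year equals the long-term average"
dat$x_dev <- dat$x_annual - dat$x_avg

# Deviation as the single predictor
fit_dev <- lm(y ~ x_dev, data = dat)
coef(fit_dev)
```

In this sketch the intercept of `fit_dev` estimates the expected response when `x_dev` is 0, averaged over sites, and the slope estimates the change in the response per 1-unit departure from the long-term average.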

However, I still don't think that will allow me to determine if the average value has a stronger effect than the annual value because the average value isn’t included as a separate predictor, so I couldn’t compare models fitted with and without it using AIC or anova() in R.

My questions are:

1. Would it be reasonable to include both the long-term average (which takes continuous positive values) and the annual deviation from that average (+/0/-) as separate predictor variables in the model, and then assess the significance of each predictor separately as usual? Will it be a problem if the deviations are not equally distributed around 0 (i.e. if it turns out that I've sampled years and sites with higher-than-average values, so I have more positive deviations than negative ones)?
2. Is there a better approach for determining whether a dependent variable is more strongly related to the annual vs. average value of the same predictor?
3. Will having a centered predictor be a problem if I include interaction term(s) in the regression (as suggested by bluepole's answer here)?
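For reference, the model with both the long-term average and the annual deviation as predictors could be sketched in R as below. This is only a sketch on simulated data: the names `dat`, `x_avg`, `x_annual`, `x_dev` and the simulation settings are hypothetical. It also fits the version with the raw annual value alongside the average, because the two parameterisations are linear re-expressions of each other and so should give the same fitted values.

```r
# Simulated data; every name and coefficient below is hypothetical.
set.seed(42)
n_sites <- 30
n_years <- 6
site <- rep(seq_len(n_sites), each = n_years)
x_avg <- runif(n_sites, min = 5, max = 15)[site]
x_annual <- x_avg + rnorm(n_sites * n_years, sd = 2)
x_dev <- x_annual - x_avg
y <- 1 + 0.8 * x_avg + 0.3 * x_dev + rnorm(n_sites * n_years)
dat <- data.frame(site, y, x_annual, x_avg, x_dev)

# Compare the two correlations; in this simulation the deviations are
# drawn independently of the site averages
cor(dat$x_annual, dat$x_avg)
cor(dat$x_dev, dat$x_avg)

# Candidate models
fit_both <- lm(y ~ x_avg + x_dev, data = dat)    # average + deviation
fit_avg  <- lm(y ~ x_avg, data = dat)            # average only
fit_dev  <- lm(y ~ x_dev, data = dat)            # deviation only
fit_raw  <- lm(y ~ x_avg + x_annual, data = dat) # average + raw annual value

# Per-predictor estimates, nested F-tests, and AIC
summary(fit_both)$coefficients
anova(fit_avg, fit_both)
anova(fit_dev, fit_both)
AIC(fit_avg, fit_dev, fit_both)

# Both parameterisations describe the same fitted model
all.equal(fitted(fit_both), fitted(fit_raw))
```

Because `x_annual = x_avg + x_dev`, the slope on `x_dev` in `fit_both` equals the slope on `x_annual` in `fit_raw`, and the slope on `x_avg` in `fit_both` equals the sum of the two slopes in `fit_raw`. Only the interpretation of the `x_avg` coefficient changes between the two.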

I've searched through SE but am not sure if I'm using the correct search terms. I've read a number of the popular threads on multicollinearity, but several of the questions that sound most similar to mine (e.g. this one or maybe this) have never been answered.

I'm having issues with your main objective: "determine if the average value has a stronger effect than the annual value". I think the actual questions should be "are there lagged effects?" and "is there strong auto-correlation?". — Roland, Aug 23 '21 at 08:50
@Roland I may have been incorrect to use the word "effect" instead of "relationship" there. I don't have a biological reason to believe that there is a lagged effect of the IV. However, I do have a reason to believe that the average value over a long period of time may have influenced the evolution of the DV I'm measuring, so my question in a regression context is "Does the average value of the predictor or the annual value of the predictor explain more of the variance in the annual value of the DV?" if that clarifies at all. — user33333, Aug 23 '21 at 17:48
I would still try to model that as auto-correlation (e.g., a moving-average process). — Roland, Aug 23 '21 at 17:53
