I am trying to analyze the effect of different land-use cover predictors (% of land area) on my response of interest. The land-use predictors cover the dominant types of land-use and hence collectively almost sum to 100% of the land area. At the moment, I am fitting a standard multiple regression in R:
lm(response ~ land_use1 + land_use2 + land_use3 + land_use4 + land_use5, data = mydata)
However, my land-use predictors are inherently correlated: when one land-use cover increases, the others must decrease (since together they sum to almost 100%). So I am not sure whether this standard regression model is appropriate.
The pairwise correlations among the land-use predictors aren't very high (all |r| < 0.7). Variance inflation factors suggest some multicollinearity, but again it isn't severe (VIFs of about 4).
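For concreteness, here is a minimal sketch of how these two diagnostics can be computed in base R. The simulated data frame is a hypothetical stand-in for `mydata` (it is not my real data, and the column names and the small unmeasured sixth cover type are assumptions), so the numbers it produces will not match the ones I quote above. The VIF for predictor j is computed by hand as 1 / (1 - R²_j), where R²_j comes from regressing that predictor on the others.

```r
# Hypothetical stand-in for mydata: five land-use percentages per site,
# plus a small unmeasured sixth cover type so the five sum to almost 100.
set.seed(1)
n <- 200
shapes <- rep(c(2, 2, 2, 2, 2, 0.2), each = n)   # sixth column is the minor cover
raw <- matrix(rgamma(n * 6, shape = shapes), nrow = n)
pct <- 100 * raw / rowSums(raw)

mydata <- as.data.frame(pct[, 1:5])
names(mydata) <- paste0("land_use", 1:5)
mydata$response <- rnorm(n)                       # placeholder response

predictors <- paste0("land_use", 1:5)

# Pairwise correlations among the land-use predictors
cor_mat <- cor(mydata[, predictors])
max_abs_r <- max(abs(cor_mat[upper.tri(cor_mat)]))

# VIF_j = 1 / (1 - R^2_j), with R^2_j from regressing predictor j on the rest
vif <- sapply(predictors, function(p) {
  others <- setdiff(predictors, p)
  r2 <- summary(lm(reformulate(others, response = p), data = mydata))$r.squared
  1 / (1 - r2)
})

print(round(cor_mat, 2))
print(round(vif, 2))
```

The closer the predictors come to summing to exactly 100%, the closer each R²_j gets to 1 and the larger the VIFs become.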
But it still troubles me that my land-use predictors sum to almost 100%. I am not sure how to interpret the regression coefficient for each land-use predictor, since it doesn't make sense to estimate the effect of a 1% increase in one land cover without accounting for the associated decreases in the other land covers.
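To make the interpretation problem concrete, here is a small sketch on simulated data (again a hypothetical stand-in for `mydata`, with made-up effect sizes). Under the linear model, moving one percentage point of area from `land_use2` to `land_use1` changes the fitted value by the difference of the two coefficients, not by the `land_use1` coefficient alone, so what a "1% increase" means depends on which cover gives up the area.

```r
# Hypothetical data: five land-use percentages summing to almost 100 per site
set.seed(2)
n <- 200
shapes <- rep(c(2, 2, 2, 2, 2, 0.2), each = n)   # sixth column = unmeasured minor cover
raw <- matrix(rgamma(n * 6, shape = shapes), nrow = n)
pct <- 100 * raw / rowSums(raw)

mydata <- as.data.frame(pct[, 1:5])
names(mydata) <- paste0("land_use", 1:5)
# Made-up effects, purely for illustration
mydata$response <- 0.5 * mydata$land_use1 - 0.2 * mydata$land_use2 + rnorm(n)

fit <- lm(response ~ land_use1 + land_use2 + land_use3 + land_use4 + land_use5,
          data = mydata)
b <- coef(fit)

# Take one site and shift one percentage point from land_use2 to land_use1
site <- mydata[1, ]
swapped <- site
swapped$land_use1 <- site$land_use1 + 1
swapped$land_use2 <- site$land_use2 - 1

change <- unname(predict(fit, swapped) - predict(fit, site))
coef_diff <- unname(b["land_use1"] - b["land_use2"])

# The predicted change equals the coefficient difference
print(c(change = change, coef_diff = coef_diff))
```

Shifting the same point of area from `land_use3` instead would give a different predicted change (the `land_use1` coefficient minus the `land_use3` coefficient), which is exactly the ambiguity I am worried about.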
Can anyone suggest types of models that might be more appropriate?