I have a dataset derived in the fashion of a zero-sum game (RNA-Seq data: the total number of reads is fixed, so including a read for one feature means excluding a read for another feature). I imagine such a situation yields dependency between variables and violates the linearity assumption required for a linear model. Right?
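To make the structure concrete, here is a minimal sketch (in Python, with made-up Poisson counts; the features and numbers are purely illustrative) of the zero-sum constraint I have in mind:

```python
# Illustrative only: three hypothetical features under a fixed sequencing depth.
import numpy as np

rng = np.random.default_rng(0)
counts = rng.poisson(lam=[50, 30, 20], size=(100, 3))  # raw read counts
props = counts / counts.sum(axis=1, keepdims=True)     # per-sample proportions

print(props.sum(axis=1))     # every row sums to exactly 1.0 by construction
print(np.corrcoef(props.T))  # off-diagonal entries tend to be negative:
                             # a gain in one feature's share is a loss
                             # in the others'
```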
Not quite. What it violates is not linearity (i.e., that the response variable is a linear function of the parameters) but the assumption of no perfect collinearity (that no linear combination of the predictor variables adds up to a constant).
Lack of collinearity is not a necessary assumption of linear modeling, although it makes some analytical approaches more convenient. Strong (but not perfect) collinearity is not as much of a problem as people think (especially for predictive models). Perfect collinearity (as in the case where a set of the predictor variables adds up to 1.0) is often handled automatically by statistical software (essentially by throwing out one of the predictors), but you can always drop one category yourself. (See this related question ...)
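To see the perfect-collinearity point concretely, here is a minimal sketch in Python (synthetic data; `props` stands for any set of proportions that sum to 1, as in the question): with an intercept included, the design matrix is rank-deficient, and dropping one proportion column restores full column rank.

```python
# Sketch with synthetic proportions: intercept + shares that sum to 1
# is perfectly collinear; dropping one share column fixes it.
import numpy as np

rng = np.random.default_rng(0)
counts = rng.poisson(lam=[50, 30, 20], size=(100, 3))
props = counts / counts.sum(axis=1, keepdims=True)   # rows sum to 1

X_full = np.column_stack([np.ones(100), props])          # intercept + all 3 shares
X_drop = np.column_stack([np.ones(100), props[:, :2]])   # intercept + 2 shares

# The intercept column equals the sum of the three share columns, so one
# column is redundant: rank 3 instead of 4.
print(np.linalg.matrix_rank(X_full))  # 3
print(np.linalg.matrix_rank(X_drop))  # 3 (full column rank for 3 columns)
```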

Ben Bolker
Thanks for the clarification. I sense these linear model assumptions (independence, linearity, and absence of multicollinearity) are somehow connected, right? In such a zero-sum situation, there is a dependency between variables, and this dependency causes collinearity and consequently a non-linear relationship to Y. Do I have that right? – unicorn Dec 27 '18 at 05:08