4

I was reading this when I came across the term collinearity. I tried looking up what it is, but the top results are all about multicollinearity.

Here is what I could find about multicollinearity:

multicollinearity refers to predictors that are correlated with other predictors in the model

My assumption (based on the names) is that multicollinearity is a type of collinearity, but I am not sure. Do these two terms differ, or are they synonyms?

kjetil b halvorsen
Aseem Bansal

2 Answers

11

In statistics, the terms collinearity and multicollinearity overlap. Collinearity is a linear association between two explanatory variables. Multicollinearity, in a multiple regression model, is a strong linear association among two or more explanatory variables.

In the case of perfect multicollinearity, the design matrix $X$ has less than full rank, and therefore the moment matrix $X^{\mathsf{T}}X$ cannot be inverted. Under these circumstances, for a general linear model $y = X \beta + \epsilon$, the ordinary least-squares estimator $\hat{\beta}_{OLS} = (X^{\mathsf{T}}X)^{-1}X^{\mathsf{T}}y$ does not exist.
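
As a rough illustration of the rank-deficient case, here is a minimal sketch in plain Python. The toy design matrix and the helper names `gram` and `det2` are made up for this example and are not from any library; the second column is deliberately set to exactly twice the first, so $X^{\mathsf{T}}X$ should come out singular.

```python
def gram(X):
    """Return the moment matrix X^T X for X given as a list of rows."""
    cols = list(zip(*X))  # columns of X
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]


def det2(M):
    """Determinant of a 2x2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


# Hypothetical toy data: column 2 is exactly 2 * column 1 (perfect collinearity).
X_perfect = [[1, 2], [2, 4], [3, 6]]

G_perfect = gram(X_perfect)
det_perfect = det2(G_perfect)

print(G_perfect)    # moment matrix X^T X
print(det_perfect)  # zero determinant: X^T X has no inverse
```

Because the determinant is exactly zero, the inverse in the OLS formula is undefined for this matrix, which is the situation described above.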

Carl
  • The restriction of collinearity to just two variables is not standard: see https://en.wikipedia.org/wiki/Collinearity#Collinearity_of_points_whose_coordinates_are_given. "Multicollinearity" is a perfect synonym used exclusively in a multiple regression context. – whuber Jan 06 '17 at 16:47
  • @whuber I rarely disagree with you. However, the restriction of collinearity to just two variables is common usage to avoid grammatical number disagreement. It is a grammar thing, not a true statistical difference. Indeed, on the same web page you cited note that for [statistical language](https://en.wikipedia.org/wiki/Collinearity#Usage_in_statistics_and_econometrics) we do differentiate between them. And, admittedly, for geometry, we would not. – Carl Jan 06 '17 at 17:19
  • Thank you for pointing out that special case in the Wikipedia article; I cede your point (+1). I respect the fact that because many different intellectual communities use and contribute to statistics, many technical terms can have varied meanings. In this particular instance, though, "multicollinearity" carries such echoes of bombastic self-importance (*hey guys--to make sure people know we're serious, let's use a long, redundant, complicated technical word to describe a simple concept that already has a perfectly fine name*) that I try to avoid its use. – whuber Jan 06 '17 at 17:31
  • @whuber Agreed, looking back on [this](https://www.researchgate.net/publication/6707938_An_improved_method_for_determining_renal_sufficiency_using_volume_of_distribution_and_weight_from_bolus_Tc-99m-DTPA_two_blood_sample_paediatric_data), I used the term *collinearity* six times in Table 3, without using the term *multicollinearity* even once. However, the question was not about us, but about what to understand when we do see the term multicollinearity being used. – Carl Jan 06 '17 at 18:48
0

There are indeed slight inconsistencies in the usage of the terms, depending on whom you ask. The most common distinction I've seen (and the one I tend to use) is that we have collinearity if $\det(X^T X)=0$, and multicollinearity if $\det(X^T X)\approx 0$. The latter obviously includes the former, which is why we also say "perfect multicollinearity" for "collinearity".
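
To make the "$\approx 0$" case concrete, here is a small sketch in plain Python with a made-up two-column matrix whose second column is almost, but not exactly, twice the first. The data and the helper name `gram` are illustrative assumptions. Since the raw determinant depends on the scale of the columns, the sketch also reports it relative to the product of the diagonal entries of $X^T X$; for two columns that ratio equals $\sin^2$ of the angle between them, so a value near zero means the columns are nearly parallel.

```python
def gram(X):
    """Return X^T X for X given as a list of rows."""
    cols = list(zip(*X))  # columns of X
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]


# Hypothetical toy data: column 2 is roughly 2 * column 1, plus small perturbations.
X_near = [[1.0, 2.01], [2.0, 3.99], [3.0, 6.01]]

G_near = gram(X_near)
det_near = G_near[0][0] * G_near[1][1] - G_near[0][1] * G_near[1][0]

# Scale-free version: det / (product of diagonal) = sin^2(angle between columns).
ratio_near = det_near / (G_near[0][0] * G_near[1][1])

print(det_near)    # small but nonzero
print(ratio_near)  # very close to 0: the columns are nearly parallel
```

Here $X^T X$ is technically invertible, but only barely, which is the near-singular situation the second definition refers to.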

user11130854