24

When fitting a GLM, if you get the "not defined because of singularities" message in the anova output, how can you prevent it from happening?

Some have suggested that it is due to collinearity between covariates or that one of the levels is not present in the dataset (see: interpreting "not defined because of singularities" in lm)

If I wanted to see which "particular treatment" is driving the model and I have 4 levels of treatment: Treat 1, Treat 2, Treat 3 & Treat 4, which are recorded in my spreadsheet as: when Treat 1 is 1 the rest are zero, when Treat 2 is 1 the rest are zero, etc., what would I have to do?

chl
Platypezid
  • I see many people have this problem. Does anyone understand the response to this person's query? https://stat.ethz.ch/pipermail/r-help/2006-April/103836.html – Platypezid Jul 25 '11 at 16:31

3 Answers

36

You're probably getting that error because two or more of your independent variables are perfectly collinear (e.g. mis-coding dummy variables to make identical copies).

Use cor() on your data or alias() on your model for closer inspection.
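As a rough illustration of that inspection step, here is a minimal sketch on simulated data. The data frame `d` and its values are made up for the example, and it uses lm() for simplicity. It reproduces the situation from the question (four 0/1 treatment columns plus an intercept) and shows what alias() and cor() report.

```r
# Simulated data: column names mirror the question, values are illustrative only
set.seed(1)
treat <- factor(rep(paste0("Treat", 1:4), each = 5))
d <- data.frame(
  y      = rnorm(20),
  Treat1 = as.numeric(treat == "Treat1"),
  Treat2 = as.numeric(treat == "Treat2"),
  Treat3 = as.numeric(treat == "Treat3"),
  Treat4 = as.numeric(treat == "Treat4")
)

# The four indicators sum to 1 in every row, which duplicates the intercept
# column, so one coefficient cannot be estimated
fit <- lm(y ~ Treat1 + Treat2 + Treat3 + Treat4, data = d)
coef(fit)    # the Treat4 coefficient is NA

# alias() reports which term is a linear combination of the others
alias(fit)

# cor() only shows pairwise relationships; no single pair is perfectly
# correlated here, even though the set as a whole is linearly dependent
cor(d[, c("Treat1", "Treat2", "Treat3", "Treat4")])
```

Note that in a case like this cor() alone would not flag the problem, because the dependence involves all four columns together rather than any one pair.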

Peter
2

The "not defined because of singularities" error occurs when your independent variables are perfectly linearly dependent, not merely strongly correlated. With dummy variables, it can be avoided by using n-1 dummies for a variable with n levels. In your case, for the Treatment variable, you should use 3 binary dummy variables (Treat1, Treat2, Treat3); the omitted level (Treat4) becomes the baseline.

In R, the linear regression function lm() reports NA as the coefficient for any variable that is a linear combination of other variables in the model.
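A minimal sketch of the n-1 dummy approach, assuming a data frame laid out like the spreadsheet in the question (the data frame `d` and the column name `treatment` are hypothetical). It shows two equivalent options: collapsing the indicator columns into a single factor so that R builds the n-1 dummies itself, or dropping one indicator by hand.

```r
# Simulated one-hot data: values are illustrative only
set.seed(1)
d <- data.frame(
  y      = rnorm(20),
  Treat1 = rep(c(1, 0, 0, 0), each = 5),
  Treat2 = rep(c(0, 1, 0, 0), each = 5),
  Treat3 = rep(c(0, 0, 1, 0), each = 5),
  Treat4 = rep(c(0, 0, 0, 1), each = 5)
)

# Option 1: collapse the indicators into one factor; for each row, pick the
# name of the column that holds the 1
ind <- as.matrix(d[, c("Treat1", "Treat2", "Treat3", "Treat4")])
d$treatment <- factor(colnames(ind)[max.col(ind, ties.method = "first")],
                      levels = colnames(ind))
fit_factor <- glm(y ~ treatment, data = d)
coef(fit_factor)   # intercept plus 3 contrasts; Treat1 is the baseline

# Option 2: keep the indicators but leave one out; Treat4 is the baseline
fit_manual <- glm(y ~ Treat1 + Treat2 + Treat3, data = d)
coef(fit_manual)

# Overall test of the treatment effect
anova(fit_factor, test = "F")
```

Both fits describe the same model with a different reference level, so the fitted values agree; only the interpretation of the individual coefficients (difference from the chosen baseline) changes.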

1

In my (somewhat limited) experience, it is most likely due to high levels of collinearity between two or more variables. I would suggest using the VIF function to identify which variables are the most significant contributors in this respect, and taking this into consideration when selecting variables for elimination in the next iteration of your model.

In terms of interpreting VIF values, some suggest that a value under 5 is okay, with a value of 5-10 acceptable in some circumstances. I usually don't rely too much on specific thresholds; the values still give you a relative indication of potential collinearity issues.
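To make the VIF idea concrete, here is a small base-R sketch that computes it by hand as 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j on the other predictors. The variables `x1`, `x2`, `x3` are simulated for illustration. Packaged implementations exist (for example vif() in the car package), but this version needs no extra packages.

```r
# Simulated predictors: x2 is nearly a copy of x1, x3 is unrelated
set.seed(42)
n  <- 100
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.1)
x3 <- rnorm(n)
X  <- data.frame(x1, x2, x3)

# Regress each predictor on the remaining ones and convert R^2 to a VIF
vif_manual <- sapply(names(X), function(v) {
  others <- setdiff(names(X), v)
  r2 <- summary(lm(reformulate(others, response = v), data = X))$r.squared
  1 / (1 - r2)
})
vif_manual   # large for x1 and x2, close to 1 for x3
```

With exact collinearity R_j^2 equals 1 and the VIF is infinite, so VIF is mainly informative for strong but imperfect collinearity; for the exact case, the NA coefficients themselves already identify the redundant terms.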

user300086