2

I am using a generalized linear model in R with categorical independent variables. The model is calibrated and validated, but the results are not of good practical use, because the differences in the response variable vary too much across segments. In other words, the predictions are too far apart. Is there a way to constrain the GLM coefficients so that they represent only half (or some other fraction) of their estimated contributions? That way the differences should be smaller.

kjetil b halvorsen
gregorp
    One way to reduce prediction variation is to use shrinkage methods. http://stats.stackexchange.com/questions/20295/what-problem-do-shrinkage-methods-solve – Metariat Jul 27 '16 at 09:29
  • @Matemattica Is there a way, with these shrinkage methods, to specify the shrinkage factor? I've tried shrink.glm (from package shrink), but it only shrank all the coefficients by 0.9. I'd like to shrink the coefficients more, for example by 0.5. – gregorp Jul 27 '16 at 09:56
  • 2
    I think the more popular package is glmnet. You can specify the shrinkage level as you want, but notice that there is a trade-off. – Metariat Aug 01 '16 at 14:54
  • 1
    @Matemattica thank you, I'll try the glmnet package. – gregorp Aug 05 '16 at 11:35
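The comment thread above is about R's glmnet; as a language-agnostic illustration of the same idea, here is a minimal sketch in Python with scikit-learn (my own example data, not from the thread), showing that increasing the ridge penalty `alpha` shrinks the level coefficients toward zero and pulls the per-level predictions closer together:

```python
# Shrinkage demo: a larger ridge penalty shrinks the coefficients of a
# one-hot-encoded categorical predictor, reducing the spread of the
# per-level predictions.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

# Three-level categorical predictor, one-hot encoded (first level dropped).
levels = rng.integers(0, 3, size=200)
X = np.column_stack([(levels == 1).astype(float),
                     (levels == 2).astype(float)])
y = 1.0 + 4.0 * X[:, 0] + 8.0 * X[:, 1] + rng.normal(0, 1, size=200)

# Design rows for the three levels, to compare their predictions.
level_X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])

for alpha in [0.0, 10.0, 100.0]:
    model = Ridge(alpha=alpha).fit(X, y)
    spread = np.ptp(model.predict(level_X))  # max - min across levels
    print(f"alpha={alpha:6.1f}  coef={model.coef_.round(2)}  spread={spread:.2f}")
```

In glmnet the analogous tuning parameter is `lambda` (often chosen with `cv.glmnet`); you cannot dictate an exact factor such as 0.5 directly, so in practice you increase the penalty until the spread of predictions is acceptable.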

1 Answer

4

If you suspect that the differences between the levels of your categorical predictors are too big, you can try the fused lasso algorithm.

Instead of finding the MLE estimates, the fused lasso minimizes the following quantity:

$$\sum_{i=1}^N(y_i - x_i^T\beta)^2 + \lambda_N^{(1)} \sum_{j=1}^p\mid \beta_j \mid + \lambda_N^{(2)} \sum_{j=2}^p \mid \beta_j - \beta_{j-1}\mid$$

Interpretation:

  • When you choose $\lambda_N^{(1)} = \lambda_N^{(2)} = 0$, this is equal to the MLE approach.
  • When the $\lambda$s are non-zero, you have two penalization terms. The first shrinks the coefficients of bad predictors towards 0. The second term is what you are interested in: it penalizes the differences between the coefficients of adjacent levels of the categorical predictors. A large value of $\lambda_N^{(2)}$ results in small differences across levels of categorical predictors.
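To make the objective concrete, here is a small numerical sketch in Python. It is not a dedicated fused-lasso solver (in R, the genlasso package provides one); it simply hands the objective above to a generic optimizer, and the data and penalty values (`lam1`, `lam2`) are made up for illustration:

```python
# Fused lasso as a plain optimization problem: residual sum of squares
# plus an L1 penalty on the coefficients plus an L1 penalty on the
# differences between adjacent coefficients.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
n, p = 200, 4
X = rng.normal(size=(n, p))
beta_true = np.array([2.0, 2.1, 5.0, 5.1])  # adjacent pairs nearly equal
y = X @ beta_true + rng.normal(0, 1, size=n)

def fused_lasso_objective(beta, lam1, lam2):
    rss = np.sum((y - X @ beta) ** 2)
    l1 = lam1 * np.sum(np.abs(beta))             # shrinks coefficients toward 0
    fuse = lam2 * np.sum(np.abs(np.diff(beta)))  # shrinks adjacent differences
    return rss + l1 + fuse

def fit(lam1, lam2):
    # Powell is derivative-free, so the non-differentiable |.| terms are fine.
    res = minimize(fused_lasso_objective, np.zeros(p), args=(lam1, lam2),
                   method="Powell")
    return res.x

b_mle = fit(0.0, 0.0)      # both penalties off: ordinary least squares / MLE
b_fused = fit(1.0, 50.0)   # heavy fusion penalty
print("OLS   adjacent diffs:", np.abs(np.diff(b_mle)).round(2))
print("Fused adjacent diffs:", np.abs(np.diff(b_fused)).round(2))
```

With the fusion penalty switched on, the adjacent coefficients are pulled toward each other, which is exactly the "smaller differences across levels" behavior asked about in the question.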
Metariat