
I know that for OLS, "copying" each of the $N$ observations $(X_i, Y_i)$ once to get a dataset of size $2N$ has no effect on the values of the coefficients (related question).

Does this still hold true for Ridge and Lasso regressions? I've run experiments in which the coefficients turn out to differ by a small amount, but I'm not sure whether that's due to numerical complications. Could someone please give a theoretical explanation of why the coefficients would change (or why not)?
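To make the comparison concrete, here is a minimal sketch of such an experiment in Python with scikit-learn (the toy data and variable names are my own illustration, not from the question). sklearn's `Ridge` minimizes $\|y - Xw\|_2^2 + \alpha\|w\|_2^2$, so stacking a copy of the rows doubles the error term but not the penalty; refitting with `2 * alpha` recovers the original coefficients:

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(size=50)

# Copy each observation once: dataset of size 2N.
X2 = np.vstack([X, X])
y2 = np.concatenate([y, y])

alpha = 1.0
b_orig = Ridge(alpha=alpha).fit(X, y).coef_
b_dup  = Ridge(alpha=alpha).fit(X2, y2).coef_      # same alpha: small but real difference
b_dup2 = Ridge(alpha=2 * alpha).fit(X2, y2).coef_  # doubled alpha: matches the original

print(np.abs(b_dup - b_orig).max())   # nonzero
print(np.abs(b_dup2 - b_orig).max())  # ~0, up to solver tolerance
```

Note that penalty conventions differ between estimators: sklearn's `Lasso`, for example, averages the squared-error term over `n_samples`, so under that parameterization the duplicated dataset reproduces the original coefficients at the *same* `alpha`. Small experimental discrepancies can therefore depend on the implementation and solver tolerances rather than on the theory.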

wwyws
  • If you fix the penalty at some value, like 42, then no, because the log-likelihood grows by a factor of 2 when you "double" your dataset. You can choose a bigger penalty (multiplied by 2) to get the same result. If you choose the penalty through some automated procedure, like cross-validation, to optimize the cross-validated R², you'll still likely pick a penalty less than 2 times the original CV-optimal penalty. – AdamO Feb 24 '22 at 23:14
  • @AdamO Thanks. I just worked through a one-variable example for ridge, since ridge has a closed-form solution, and it became clear that the coefficient stays the same iff I double the penalty term $\lambda$ (the calculation is sketched after these comments). – wwyws Feb 25 '22 at 02:43
  • My [data augmentation characterization of Ridge Regression](https://stats.stackexchange.com/a/164546/919) gives useful insight. Indeed, if you were to copy the *augmented* values also, the ridge solution would be identical. Since copying them is tantamount to doubling each value, the only change will be the cosmetic one of doubling the relaxation parameter. – whuber Feb 25 '22 at 14:22
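For reference, a sketch of the one-variable ridge calculation mentioned in the comments, assuming no intercept and penalty $\lambda$:

$$\hat\beta(\lambda) = \arg\min_\beta \sum_{i=1}^{N} (Y_i - \beta X_i)^2 + \lambda\beta^2 = \frac{\sum_{i=1}^{N} X_i Y_i}{\sum_{i=1}^{N} X_i^2 + \lambda}.$$

Copying each observation once doubles both sums but leaves $\lambda$ alone,

$$\hat\beta_{\text{dup}}(\lambda) = \frac{2\sum_i X_i Y_i}{2\sum_i X_i^2 + \lambda},$$

so $\hat\beta_{\text{dup}}(\lambda') = \hat\beta(\lambda)$ exactly when $\lambda' = 2\lambda$: the coefficient is unchanged iff the penalty is doubled.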

0 Answers