
When performing linear regression, GLMnet apparently standardizes the dependent variable ($y$) vector to have unit variance before it runs the regression, and then unstandardizes the resulting intercept and coefficients. I assume the standardization is achieved by dividing each $y_i$ by the standard deviation of the $y$ vector.

If I run glmnet with a pre-standardized $y$ how do I unstandardize the resulting equation?

(Note that I am currently running my program/GLMnet on pre-standardized x variables, so I don't have to worry about reversing the x variable standardization that GLMnet also performs.)

I thought that I could simply unstandardize by multiplying each coefficient and the intercept by the standard deviation of the $y$ vector. This does not work - the "unstandardized" equation does not match the result I get when I run glmnet with the same non-standardized $y$. The only time multiplying by the standard deviation works is when I run glmnet with lambda=0. (This effectively runs the program as an ordinary least squares fit.)

I am recreating glmnet in another language as an exercise. When I run my program and glmnet on pre-standardized $y$, I get the same result. I do not get the same result when $y$ is not pre-standardized.

My information on standardization comes from the glmnet vignette:

"Note that for “family=gaussian”, glmnet standardizes y to have unit variance before computing its lambda sequence (and then unstandardizes the resulting coefficients); if you wish to reproduce/compare results with other software, best to supply a standardized y first (Using the “1/N” variance formula)."
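To make the quote concrete, here is a small numpy sketch of my own illustrating the "1/N" variance formula the vignette mentions (numpy's default `std()` happens to use the 1/N convention, unlike R's `sd()`):

```python
import numpy as np

y = np.array([3.0, 5.0, 7.0, 9.0])

# numpy's std() divides by N by default (ddof=0), which matches the
# "1/N" variance formula mentioned in the glmnet vignette
sd_n = y.std()                 # sqrt(mean((y - mean(y))**2)) = sqrt(5)
sd_n_minus_1 = y.std(ddof=1)   # the usual sample sd, which divides by N - 1

y_standardized = y / sd_n      # unit variance under the 1/N formula
```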

user5064
  • Glmnet also centers the data (translates it to have mean zero), are you doing this? – Matthew Drury Jun 03 '15 at 17:50
  • Note that the response is only standardized for `family="gaussian"`. The DV is left unchanged in other regression families. – Sycorax Jun 03 '15 at 17:53
  • @Matthew - I've also converted the X matrix into z-scores - for each independent variable I subtract its mean and divide by its standard deviation. To keep things simple I am currently running both my program and GLMnet on pre-standardised xs - I'll deal with unstandardized xs once I've worked out how to deal with unstandardized ys! – user5064 Jun 03 '15 at 18:17

2 Answers


This is mostly a case of carefully working out the math. I'll handle the case of two predictors plus an intercept; it should be clear how to generalize it.

The standardized elastic net model results in the following relationship:

$$\frac{y - \mu(y)}{\sigma(y)} = \beta_1 \frac{x_1 - \mu(x_1)}{\sigma(x_1)} + \beta_2 \frac{x_2 - \mu(x_2)}{\sigma(x_2)}$$

If you very carefully move terms around until only $y$ is on the left hand side, you'll get

$$ y = \frac{\beta_1 \sigma(y)}{\sigma(x_1)} x_1 + \frac{\beta_2 \sigma(y)}{\sigma(x_2)} x_2- \left( \frac{\beta_1 \mu(x_1)}{\sigma(x_1)} + \frac{\beta_2 \mu(x_2)}{\sigma(x_2)} \right) \sigma(y) + \mu(y) $$

which gives the relationship between the standardized and unstandardized coefficients.

Here's a quick demonstration you can use to test this:

library(glmnet)

X <- matrix(runif(100, 0, 1), ncol=2)
y <- 1 - 2*X[,1] + X[,2]

Xst <- scale(X)
yst <- scale(y)

# Fit on the raw data for comparison
enet <- glmnet(X, y, lambda=0)
coefficients(enet)

# Fit on the pre-standardized data
enetst <- glmnet(Xst, yst, lambda=0)
coef <- coefficients(enetst)

# Un-standardized betas
coef[2]*sd(y)/sd(X[,1]) # = -2
coef[3]*sd(y)/sd(X[,2]) # = 1

# Unstandardized intercept (= 1)
-(coef[2]*mean(X[,1])/sd(X[,1]) + coef[3]*mean(X[,2])/sd(X[,2]))*sd(y) + mean(y)
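The same algebra can also be checked independently of glmnet: with lambda = 0 the elastic net is just least squares, so the unstandardisation formula can be verified with a plain numpy sketch (the helper `ols` is my own, not a glmnet function):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.uniform(size=(100, 2))
y = 1 - 2 * X[:, 0] + X[:, 1] + rng.normal(scale=0.1, size=100)

def ols(X, y):
    # least squares with an explicit intercept column
    A = np.column_stack([np.ones(len(y)), X])
    return np.linalg.lstsq(A, y, rcond=None)[0]

# Fit on fully standardized data (z-scored X, centred and scaled y)
Xs = (X - X.mean(axis=0)) / X.std(axis=0)
ys = (y - y.mean()) / y.std()
b = ols(Xs, ys)                      # b[0] is ~0 by construction

# Unstandardise with the formula above
beta = b[1:] * y.std() / X.std(axis=0)
intercept = -(b[1:] * X.mean(axis=0) / X.std(axis=0)).sum() * y.std() + y.mean()

# Matches a direct fit on the raw data
print(np.allclose(np.concatenate([[intercept], beta]), ols(X, y)))  # True
```

Note that the ddof convention cancels here: the formula only involves ratios and products of standard deviations computed the same way on both sides.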
Matthew Drury
  • The code as given works for lambda = 1 and lambda = 0. However, for other lambda, the unstandardising does not work. After some experimentation I think what glmnet is doing is: – user5064 Jun 05 '15 at 13:35
  • I found that glmnet also 'standardises' the lambda (by dividing the lambda by the '1/N' formula standard deviation of y) when it standardises the y values. With this alteration the equation above works for unstandardisation. – user5064 Jun 05 '15 at 13:44
  • Thanks for this answer—wondering if you could lend your expertise to a related question: https://stackoverflow.com/questions/68147109/r-how-to-translate-lasso-lambda-values-from-the-cv-glmnet-function-into-the-s – Mark White Jun 27 '21 at 18:01

The process GLMnet follows when it calculates coefficients for linear regression seems to be as follows:

1. Standardise each $x$ by subtracting its mean and dividing by its standard deviation (when calculating the standard deviation, divide by $N$, not $N-1$).

2. Standardise $y$ by dividing it by its standard deviation (again using the 'divide by $N$' formula). Do not subtract its mean.

3. Divide the target lambda by the standard deviation calculated for $y$.

4. Calculate the $\beta$s using the formula in http://www.jstatsoft.org/v33/i01/paper.

5. Unstandardise the $\beta$s using a variant of the formula in Matthew's answer. Since the mean of $y$ was never subtracted, you do not need to add it back at the end.

6. Calculate the intercept from the unstandardised $\beta$s and the means of the unstandardised $x$s and $y$:

$$\text{intercept} = \bar y - \sum_j \beta_j \bar x_j$$
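The steps above can be sketched end to end. This is a rough numpy sketch of my own: the function name `glmnet_style_fit` and the naive coordinate-descent inner loop are illustrations for the lasso (alpha = 1) case only, not glmnet's actual implementation:

```python
import numpy as np

def glmnet_style_fit(X, y, lam, n_sweeps=200):
    """Lasso fit following the steps above (numpy's std() uses the 1/N formula)."""
    n, p = X.shape
    # 1. Standardise each x; the 1/N standard deviation is numpy's default
    xm, xs = X.mean(axis=0), X.std(axis=0)
    Xs = (X - xm) / xs
    # 2. Scale y by its 1/N standard deviation; do not centre it
    ys = y.std()
    yst = y / ys
    # 3. Divide the target lambda by sd(y)
    lam = lam / ys
    # 4. Naive coordinate descent for the lasso on the standardised data
    #    (the centred x columns make the mean of y irrelevant here)
    b = np.zeros(p)
    for _ in range(n_sweeps):
        for j in range(p):
            r = yst - Xs @ b + Xs[:, j] * b[j]          # partial residual
            z = Xs[:, j] @ r / n
            b[j] = np.sign(z) * max(abs(z) - lam, 0.0)  # soft-threshold
    # 5. Unstandardise the betas; no mu(y) term since y was never centred
    beta = b * ys / xs
    # 6. Intercept from the unstandardised betas and the raw means
    intercept = y.mean() - xm @ beta
    return intercept, beta

# With lambda = 0 this reduces to ordinary least squares:
rng = np.random.default_rng(0)
X = rng.uniform(size=(50, 2))
y = 1 - 2 * X[:, 0] + X[:, 1]
print(glmnet_style_fit(X, y, lam=0.0))  # intercept ~ 1, betas ~ (-2, 1)
```

Whether this matches glmnet exactly for lambda > 0 also depends on reproducing its objective and convergence criteria, so treat it as an illustration of the standardisation bookkeeping rather than a drop-in replacement.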

user5064