Data centering for logistic regression estimation

Question

I read in some articles that we can use data centering as a pre-processing step for logistic regression.

Is centering $X-Mean(X)$ for every value of the input $X$? What is the interpretation of the coefficients in this case?
Can we use $(abs(X-Mean(X)))/STD(X)$? What is the interpretation of the coefficients in this case?
Which one is better for logistic regression? I have financial ratios as inputs (highly correlated inputs).

score 2 · Accepted Answer · edited Apr 13 '17 at 12:44

See UCLA's website for the interpretation of non-transformed coefficients as an initial reference.

The interpretation of your coefficients changes little after transformation. Say that in your model you have an independent variable called "GDP" and its coefficient is 1.482498.

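CENTERING: For a one-unit increase in GDP (take into account the scale you use for the GDP) FROM ITS MEAN, we expect a 1.482498 increase in the log-odds of the dependent variable, Y, holding all other independent variables constant. The slope itself is not changed by centering; what changes is the constant, which becomes the log-odds of Y when GDP is at its mean.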

STANDARDIZING: For a one-standard-deviation increase in GDP (the original scale of GDP no longer matters), we expect a 1.482498 increase in the log-odds of the dependent variable, Y, holding all other independent variables constant.
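
Both readings can be traced back to the linear predictor. As a sketch, assume only GDP is transformed and Y stays binary, and write $\bar{x}$ for the mean and $s$ for the standard deviation of GDP (notation introduced here for illustration, not taken from the question):

$$\beta_0 + \beta_1 x = (\beta_0 + \beta_1 \bar{x}) + \beta_1 (x - \bar{x}) = (\beta_0 + \beta_1 \bar{x}) + (\beta_1 s)\,\frac{x - \bar{x}}{s}$$

Under these assumptions the centered model keeps the slope $\beta_1$, the standardized model has slope $\beta_1 s$, and in both cases the constant is the log-odds at mean GDP, $\beta_0 + \beta_1 \bar{x}$.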

Which one is better depends on what you are interested in.

If you opt for standardization, you have to standardize all the variables in your model, which, as far as I understand, implies you lose your constant in its original meaning (this might be important if the constant represents some reference level of GDP, maybe the GDP of a reference country). Standardization is scale neutral, and this is good if you want an idea of which of your independent variables contribute the most to determining the levels of your dependent variable.

If you opt for centering, you do not need to transform all the variables and you do not lose the constant.
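
A minimal sketch for checking these statements numerically, assuming a single predictor and a small made-up data set. The `gdp` and `y` values, the helper `fit_logit`, and the use of the sample standard deviation are all illustrative assumptions, not taken from the question. It fits the same logistic regression on the raw, centered, and standardized predictor with plain Newton-Raphson:

```python
import math


def fit_logit(x, y, iterations=50):
    """Fit log-odds(y = 1) = b0 + b1 * x by Newton-Raphson; return (b0, b1)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0          # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0  # information matrix (negative Hessian)
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (information matrix) * step = gradient
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Hypothetical data: eight GDP values and a binary outcome with some overlap,
# so that the maximum-likelihood estimate exists (no perfect separation).
gdp = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [0, 0, 1, 0, 1, 1, 1, 1]

mean = sum(gdp) / len(gdp)
sd = math.sqrt(sum((v - mean) ** 2 for v in gdp) / (len(gdp) - 1))  # sample SD

centered = [v - mean for v in gdp]
standardized = [(v - mean) / sd for v in gdp]

raw_fit = fit_logit(gdp, y)
cen_fit = fit_logit(centered, y)
std_fit = fit_logit(standardized, y)

print("raw          (constant, slope):", raw_fit)
print("centered     (constant, slope):", cen_fit)
print("standardized (constant, slope):", std_fit)
```

In this sketch the centered slope matches the raw slope, the standardized slope equals the raw slope times the sample standard deviation, and the constant after either transformation is the fitted log-odds at mean GDP, which is not zero for this data set.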

I have tried to accurately understand the philosophy and repercussions of the two types of transformations, but I have not had any luck. See this question.

edited Apr 13 '17 at 12:44

Community

answered Jun 27 '15 at 16:16

Fuca26


Thank you for the answer. For `STANDARDIZING`: "For a one-standard-deviation increase in GDP `FROM ITS STANDARD DEVIATION`", is this true? Can we use this interpretation? --- http://www.theanalysisfactor.com/how-to-get-standardized-regression-coefficients/ . What do you think about this link? It says: `You can then interpret your odds ratios in terms of one standard deviation increases in each X, rather than one-unit increases.` – user2991243 Jun 27 '15 at 16:25

No, not FROM its standard deviation. As the sentence you posted states, the coefficient represents the effect on Y of an increase in X by one standard deviation (not FROM). – Fuca26 Jun 27 '15 at 16:28
So why have `FROM ITS MEAN` in CENTERING? And why do we lose the constant in STANDARDIZING? Should we remove the constant from the logistic regression when using STANDARDIZING? – user2991243 Jun 27 '15 at 16:30

When you standardize, as far as I have understood, you have to transform (i.e., standardize) all of the independent variables in your model. You see that if you subtract its mean from the constant, the transformation returns 0. – Fuca26 Jun 27 '15 at 16:33

I think that, after standardization, if you want to be redundant, you could say: "the coefficient represents the effect on Y of an increase in X by one standard deviation from its mean". But the "from its mean" is redundant; I have never seen anybody write it. – Fuca26 Jun 27 '15 at 16:35
Is there any reference for `subtract from the constant its mean` that I can cite? I haven't seen this procedure. How can we do this in software like Stata, since it automatically adds a constant to our model: `logistic output inp1 inp2` – user2991243 Jun 27 '15 at 16:38

To standardize, see http://www.ats.ucla.edu/stat/stata/faq/standardize.htm; you transform each variable. There is no automatic way. Then you type the regression command: `logit y x1 x2, noconstant`. The `noconstant` option tells Stata not to include the constant. I am not sure, though; I am no expert on logit. – Fuca26 Jun 27 '15 at 16:51
Thank you, but this link only discusses standardization. I'm talking about standardizing the constant term. – user2991243 Jun 27 '15 at 16:54
see above the sentence about the "noconstant" option – Fuca26 Jun 27 '15 at 16:59