I am running a logistic regression model to test the effects of task variables on choice (left/right). I fit one logistic regression model per subject and later test the regression coefficients against zero across subjects. One predictor is continuous, and I normalize it to account for different possible values across subjects. One regressor is binary, and I don't normalize it. One regressor can take on four different values (10, 20, 30, 40), whose order and distances are meaningful; however, it is still a discrete parameter. Would you normalize the regressor in this case? The results differ depending on whether I do or don't, and I wanted to hear your opinion.

I use MATLAB's `glmfit` to regress `y` on the design matrix `x` with the following options: `betas = glmfit(x, y, 'binomial', 'link', 'logit')`. When I normalize all variables, the respective regression weights for one example subject are (-7.14 4.283 -0.47 -0.49; intercept included). When I only normalize the continuous variable `x1`, the respective weights are (-5.51 4.283 -0.088 -1.01).

The t values against zero across all participants are [41.52 -3.985 -0.032] if I normalize all values. If I only normalize the continuous variable, they are [20.14 -3.89 -0.48].

kjetil b halvorsen
Laurie
  • Can you specify **exactly** what you did to *normalize* the data? Normally (pun intended) it should not matter for logistic regression, *unless you are using regularization or something else you didn't tell us*. Please tell us more of the context. – kjetil b halvorsen Aug 05 '19 at 09:04
  • Sure! I z-score the data and I do not use any kind of regularization. – Laurie Aug 05 '19 at 09:36
  • Then, can you explain in which sense the results differ? They should not ... Edit your post to include some computer output – kjetil b halvorsen Aug 05 '19 at 09:39
  • Okay I did that. Thanks – Laurie Aug 05 '19 at 10:31

1 Answer

From your latest edit we can see that the estimated coefficients (which you call weights) have changed. They must, since their role is to be multiplied with the $x$'s, which were changed by the normalization (which I would have called standardization). But the models are equivalent, in the sense that the fitted probabilities (logistic regression is a regression for probabilities) will be the same.
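
To spell out why the two fits are equivalent, suppose each predictor $x_j$ is replaced by $z_j = (x_j - m_j)/s_j$, where $m_j$ and $s_j$ (notation introduced here only for illustration) are the mean and standard deviation used for the standardization. The linear predictor can then be rewritten as

$$
\beta_0 + \sum_j \beta_j x_j \;=\; \Big(\beta_0 + \sum_j \beta_j m_j\Big) + \sum_j (s_j \beta_j)\, z_j .
$$

So the standardized fit should have slopes $s_j \beta_j$ and an intercept shifted by $\sum_j \beta_j m_j$, while the linear predictor, and hence every fitted probability, is unchanged. A predictor that is left unstandardized simply has $m_j = 0$ and $s_j = 1$.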

To check that, ask your software for the fitted probabilities, and compare them. A simple way is to get the two sets of fitted probabilities and plot them against each other. I don't know how you do that in MATLAB, but it should be simple.
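
One possible way to run this check is sketched below. It assumes the Statistics and Machine Learning Toolbox (for `glmfit`, `glmval`, and `zscore`) and uses simulated data for a single hypothetical subject; the variable names, sample size, and coefficients in `eta` are made up for illustration and are not taken from the question.

```matlab
% Simulated single-subject data (hypothetical values, for illustration only)
rng(1);                                  % make the random numbers reproducible
n  = 500;
x1 = randn(n, 1);                        % continuous predictor
x2 = double(rand(n, 1) > 0.5);           % binary predictor (0/1)
x3 = 10 * randi(4, n, 1);                % discrete predictor: 10, 20, 30 or 40
x  = [x1, x2, x3];

eta = -1 + 1.5*x1 - 0.5*x2 + 0.03*x3;    % assumed "true" linear predictor
y   = double(rand(n, 1) < 1 ./ (1 + exp(-eta)));   % simulated choices (0/1)

% Fit A: z-score only the continuous predictor
xa = [zscore(x1), x2, x3];
ba = glmfit(xa, y, 'binomial', 'link', 'logit');

% Fit B: z-score every predictor (zscore works column-wise on a matrix)
xb = zscore(x);
bb = glmfit(xb, y, 'binomial', 'link', 'logit');

% Fitted probabilities from each fit
pa = glmval(ba, xa, 'logit');
pb = glmval(bb, xb, 'logit');

% The coefficients differ, but the fitted probabilities should agree
% up to the convergence tolerance of the fitting routine
fprintf('max |pa - pb| = %g\n', max(abs(pa - pb)));

% Plot one set against the other; points should fall on the identity line
plot(pa, pb, '.'); hold on;
plot([0 1], [0 1], '-'); hold off;
xlabel('fitted probability (only x1 z-scored)');
ylabel('fitted probability (all predictors z-scored)');
```

If the points do not fall on the identity line for the real data, the two design matrices probably differ by more than a per-column shift and rescaling, which would be worth checking before comparing coefficients.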

kjetil b halvorsen
  • Thanks for your answer. I get why the regression weights are different. What I don't understand is why the t values across participants differ that much. Sure, the weights should be different across participants, but shouldn't the variance of the distribution of weights differ as well? – Laurie Aug 06 '19 at 07:34
  • Can you please post complete output? The $t$-values should not differ, but I cannot guess more at what has happened without looking at the output. – kjetil b halvorsen Aug 06 '19 at 07:51