What is held constant in the case of categorical variables when multiple levels are present?

Question

I performed a negative binomial regression and here is my output (variable names changed from my original output):

                               Estimate Std. Error z value Pr(>|z|)    
(Intercept)                   4.041e+00  8.978e-02  45.006  < 2e-16 ***
langB                        -1.143e-01  1.181e-01  -0.968 0.333137    
langC                        -6.581e-02  1.080e-01  -0.609 0.542311    
langD                         5.237e-01  9.540e-02   5.489 4.03e-08 ***
langE                        -1.603e-01  1.076e-01  -1.490 0.136289    
langF                         9.649e-02  1.042e-01   0.926 0.354362    
langG                         1.775e-01  1.043e-01   1.702 0.088696 .  
num_users.m                   5.675e-02  7.949e-03   7.139 9.39e-13 ***
num_attributes.m              3.030e-04  9.860e-05   3.073 0.002116 ** 
num_lines.m                   7.902e-05  4.538e-05   1.741 0.081679 .  
num_distractions.m            1.892e-02  3.182e-02   0.595 0.552041    
type_freq.m                   1.613e-06  4.183e-07   3.855 0.000116 ***
prop_attended.m               1.222e+00  4.645e-02  26.299  < 2e-16 ***

I am working through the example at http://www.ats.ucla.edu/stat/r/dae/nbreg.htm to check whether I am interpreting this correctly. As I understand it, we generally interpret the effect of one predictor while holding the other variables constant. In my case, taking num_users.m as an example, I would say that for a one-unit increase in the number of users, the expected log count of my response variable increases by about 0.06 (the estimate is 0.05675), holding all else constant, for example at their means. However, I am wondering what "constant" means for a categorical variable like lang. Would it be langA here, since it is the reference level? But when I did

records$lang <- relevel(records$lang, "C")

the coefficients of my other (non-lang) predictors did not seem to change. So does that mean that, for the two instances I compare, I should hold the language constant, but it doesn't matter which language that constant is?
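
Writing out the linear predictor might make the question concrete. This is only a sketch in my own notation, assuming the default treatment (dummy) coding with langA as the reference level and a log link; $x$ stands for num_users.m, $\gamma$ for its coefficient, and $\mathbf{z}^\top \boldsymbol{\delta}$ for the remaining numeric predictors (none of these symbols come from the output above):

```latex
\begin{align*}
% Linear predictor with langA as the reference level (so \beta_A = 0)
\log \mu(x, \ell, \mathbf{z})
  &= \beta_0 + \sum_{k \in \{B,\dots,G\}} \beta_k \, \mathbf{1}[\ell = k]
     + \gamma \, x + \mathbf{z}^\top \boldsymbol{\delta} \\[4pt]
% Two records with the same language \ell and the same \mathbf{z},
% differing by one unit in x: \beta_0 and \beta_\ell cancel
\log \mu(x + 1, \ell, \mathbf{z}) - \log \mu(x, \ell, \mathbf{z})
  &= \gamma \\[4pt]
% After releveling to C, only the intercept and the lang contrasts are re-expressed
\beta_0' = \beta_0 + \beta_C, \qquad
\beta_k' &= \beta_k - \beta_C, \qquad
\gamma' = \gamma
\end{align*}
```

If this is right, $\beta_0$ and $\beta_\ell$ cancel in the difference whichever level $\ell$ is, which would be consistent with the num_users.m estimate (0.05675, i.e. a multiplicative factor of about $e^{0.05675} \approx 1.058$ on the expected count) not changing after relevel(); only the intercept and the lang coefficients would be re-expressed relative to C.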

I've read the wonderful explanation given by @gung in the answer to "What does 'all else equal' mean in multiple regression?", but I find no mention of categorical variables there. Could somebody please clarify this?
