
I am using the statsmodels formula API to fit a GLM.

import statsmodels.api as sm 
import statsmodels.formula.api as smf

model = smf.glm(
    formula="cost_tarif_median ~ age + anc_veh + C(formule) + C(veh_usage) + C(categorie) + C(groupe_sra) + C(zonier)",
    family=sm.families.Gamma(link=sm.genmod.families.links.log()),
    data=df_train,
)

model_fit = model.fit()

Once the model is fitted, I call summary() to inspect the estimated coefficients, p-values, and so on.

I was expecting every category of each categorical variable to have its own coefficient, but that is not the case.

For example, the "groupe" variable has 3 categories: groupe_1, groupe_2 and groupe_3. The summary only reports coefficients for groupe_2 and groupe_3.
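The same pattern shows up without my data. Here is a minimal sketch, assuming a toy DataFrame with a hypothetical `groupe` column (the values are made up for illustration). It only builds the design matrix with patsy, which as far as I understand is the formula engine statsmodels relies on, and lists the resulting columns:

```python
import pandas as pd
import patsy

# Hypothetical toy data; the column name and levels mirror the example above.
toy = pd.DataFrame({"groupe": ["groupe_1", "groupe_2", "groupe_3", "groupe_1"]})

# Build the right-hand-side design matrix for a single categorical term.
design = patsy.dmatrix("C(groupe)", data=toy, return_type="dataframe")

# One column per estimated coefficient.
columns = list(design.columns)
print(columns)
```

The list contains an intercept plus one column each for groupe_2 and groupe_3, but no column for groupe_1, which matches what I see in the GLM summary.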

Could someone explain why one category is missing?

kjetil b halvorsen
