Suppose I have a dataset with two classes, A and B, where class A makes up 1% of cases and class B the other 99%. For example, class A might be a loan default.
Suppose I want to "understand" which factors distinguish class A from class B by fitting a logistic regression on independent variables X and then looking at the model coefficients. Does it make sense to put a higher class_weight on class A, such as a weight of 99 on class A and 1 on class B? Or does the intercept already take care of this?
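To make the question concrete, here is a minimal sketch of the comparison I have in mind. I am assuming scikit-learn's LogisticRegression and its class_weight parameter, and the data are synthetic: the true coefficients, sample size, and variable names are made up for illustration. A very large C is used only to approximate an unregularized fit.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic data (hypothetical): two predictors, class A (y = 1) in roughly 1% of rows.
rng = np.random.default_rng(0)
n = 200_000
X = rng.normal(size=(n, 2))
true_intercept = -5.0                 # assumed value, chosen to make class A rare
true_coef = np.array([1.0, -0.5])     # assumed values
p = 1.0 / (1.0 + np.exp(-(true_intercept + X @ true_coef)))
y = (rng.random(n) < p).astype(int)

# Very large C makes the L2 penalty negligible (approximately unregularized).
unweighted = LogisticRegression(C=1e6, max_iter=1000).fit(X, y)
weighted = LogisticRegression(
    C=1e6, max_iter=1000, class_weight={0: 1, 1: 99}
).fit(X, y)

print("share of class A:", y.mean())
print("unweighted intercept, slopes:", unweighted.intercept_, unweighted.coef_)
print("weighted   intercept, slopes:", weighted.intercept_, weighted.coef_)
print("intercept shift:", weighted.intercept_[0] - unweighted.intercept_[0])
print("log(99):", np.log(99))
```

Comparing the printed slopes and intercepts of the two fits should show how much of the reweighting is absorbed by the intercept alone, at least when the model is correctly specified as it is in this simulation.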
What if the logistic regression has a regularization parameter? Would class imbalance matter more in that scenario, because the model would be more inclined towards a constant "Predict B" model to reduce the penalty on coefficient size?
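The regularized case could be probed with a similar sketch, again assuming scikit-learn's LogisticRegression, where a smaller C means a stronger L2 penalty. The data, coefficient values, and the grid of C values are hypothetical. The script records the size of the slope vector for each penalty strength, with and without class weights.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic data (hypothetical): class A (y = 1) occurs in roughly 1% of rows.
rng = np.random.default_rng(1)
n = 50_000
X = rng.normal(size=(n, 2))
logits = -5.0 + X @ np.array([1.0, -0.5])   # assumed true model
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logits))).astype(int)

slope_norms = {}
for C in (1e-4, 1e-2, 1e6):                 # smaller C = stronger penalty
    for label, cw in (("unweighted", None), ("weighted", {0: 1, 1: 99})):
        # With the default lbfgs solver the intercept itself is not penalized.
        model = LogisticRegression(C=C, class_weight=cw, max_iter=1000).fit(X, y)
        slope_norms[(C, label)] = float(np.linalg.norm(model.coef_))
        print(f"C={C:g} {label}: slope norm = {slope_norms[(C, label)]:.3f}, "
              f"intercept = {model.intercept_[0]:.3f}")
```

Reading the slope norms across the rows for a fixed C gives a rough sense of whether the weighted fit is shrunk towards the constant model less than the unweighted one.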
I've seen many economics papers in which an unregularized logistic regression is run on data and the authors then interpret coefficient sizes and significance. I'm just wondering how valid this is.