Suppose I have the fitted output of a logistic regression: Y ~ 2.5X + 1.6Y - 3.5Z. My goal is to understand the impact of the variable Y.
- All variables have p < .05
- I see a large change in AIC when I add the variable Y to the model
- Intuitively, it makes sense from a business perspective that Y is important, so I believe Y belongs in the model.
I want to understand the effect of Y on the log-odds (which I'll convert to odds and probability later). At first, I thought the effect was 1.6. However, as I ran more tests and introduced a variable W, my model output became:
Y ~ 2.8X + 3.5Y + .05Z + 1.5W.
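For context, the conversion to odds and probability that I have in mind looks roughly like the sketch below. The two coefficients of Y are the ones from the models above; the baseline log-odds of 0.0 is a made-up value for illustration only, and the function names are my own.

```python
import math

def odds_ratio(beta):
    """Multiplicative change in the odds for a one-unit increase in the predictor."""
    return math.exp(beta)

def probability(log_odds):
    """Inverse logit: convert log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))

# Coefficients of Y from the two fitted models above.
beta_without_w = 1.6
beta_with_w = 3.5

# Hypothetical baseline log-odds (an assumption, not taken from the fitted model).
baseline_log_odds = 0.0

for label, beta in [("without W", beta_without_w), ("with W", beta_with_w)]:
    p_before = probability(baseline_log_odds)
    p_after = probability(baseline_log_odds + beta)
    print(f"{label}: odds ratio = {odds_ratio(beta):.2f}, "
          f"probability {p_before:.3f} -> {p_after:.3f}")
```

The change in probability depends on the baseline I start from, which is why the baseline is an explicit input here.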
Next, I ran a likelihood ratio test comparing the original model without W against the alternative model with W. Based on the test, I reject the null hypothesis and conclude that W should be included in the model.
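The comparison I ran is roughly equivalent to the sketch below. The two log-likelihood values are placeholders, not my actual results, and the function name is my own. The p-value uses the exact chi-square tail for one degree of freedom, which applies here because the two models differ only by the single parameter for W.

```python
import math

def likelihood_ratio_test_1df(ll_reduced, ll_full):
    """Likelihood ratio test for nested models that differ by one parameter.

    Returns the test statistic and its p-value. For 1 degree of freedom the
    chi-square survival function equals erfc(sqrt(x / 2)).
    """
    stat = 2.0 * (ll_full - ll_reduced)
    stat = max(stat, 0.0)  # guard against tiny negative values from rounding
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Placeholder log-likelihoods (hypothetical, for illustration only).
ll_without_w = -520.0
ll_with_w = -510.0

stat, p_value = likelihood_ratio_test_1df(ll_without_w, ll_with_w)
print(f"LR statistic = {stat:.2f}, p-value = {p_value:.3g}")
```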
So, going back to the impact of Y: is the "true" impact of Y actually 3.5 and not 1.6? Does a better overall model fit imply more accurate coefficients for the individual variables?
If not, what is the best way to interpret my coefficients and gain more confidence in them?