Using the Boston housing dataset as an example, I'm comparing the regression coefficients between scikit-learn's LinearRegression() and xgboost's XGBRegressor().
For XGBRegressor, I'm using booster='gblinear'
so that it uses a linear booster rather than a tree-based booster. According to this page, gblinear uses "delta with elastic net regularization (L1 + L2 + L2 bias) and parallel coordinate descent optimization."
Thus, I assume my comparison is apples to apples, since I am not comparing OLS to a tree-based learner.
Is my assumption correct? If so, would the interpretation of the coefficients in XGBoost be the same as in linear regression? That is, they represent "the mean change in the response variable for one unit of change in the predictor variable while holding other predictors in the model constant."
The coefficients from the two models are different. Why would this be? Is it because the regularization and optimization that XGBRegressor applies make them different?
import pandas as pd
from sklearn.datasets import load_boston

boston = load_boston()
X = pd.DataFrame(boston.data, columns = boston.feature_names)
Y = pd.DataFrame(boston.target)
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size = 0.2, random_state=5)
Linear Model:
from sklearn.linear_model import LinearRegression
linear_model = LinearRegression()
linear_model.fit(X_train, y_train)
print(linear_model.coef_)
Output:
[[-1.30799852e-01 4.94030235e-02 1.09535045e-03 2.70536624e+00
-1.59570504e+01 3.41397332e+00 1.11887670e-03 -1.49308124e+00
3.64422378e-01 -1.31718155e-02 -9.52369666e-01 1.17492092e-02
-5.94076089e-01]]
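Since the question is about interpreting each coefficient as the effect of one predictor, it may help (purely as a convenience, reusing the variables defined above) to pair the OLS coefficients with their feature names:

import pandas as pd

# linear_model.coef_ has shape (1, n_features) because Y is a DataFrame,
# so ravel() flattens it before pairing with the column names.
ols_coefs = pd.Series(linear_model.coef_.ravel(), index=X.columns)
print(ols_coefs)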
XGBoost Regression with gblinear:
from xgboost import XGBRegressor
xgb_model = XGBRegressor(n_estimators=100, learning_rate=0.06, gamma=1, subsample=0.8, objective='reg:squarederror', booster='gblinear', n_jobs=-1)
xgb_model.fit(X_train, y_train)
print(xgb_model.coef_)
Output:
[-0.192631 0.0966579 -0.00972393 0.34198 0.159105 1.09779
0.039317 0.289027 -0.00622574 0.00236915 0.171237 0.0164343
-0.398639 ]
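One rough way to check whether the gap comes from gblinear's regularization is to turn the penalties off and give coordinate descent more boosting rounds; the settings below (reg_alpha=0, reg_lambda=0, larger n_estimators, higher learning_rate) are my own assumptions for such a check, not part of the original comparison. If the coefficients move toward the OLS ones, the difference is largely down to the L1/L2 shrinkage; any remaining gap would reflect the boosted coordinate-descent fit itself.

from xgboost import XGBRegressor
import pandas as pd

# Same linear booster, but with the L1/L2 penalties disabled and more
# boosting rounds so the coordinate-descent updates have time to converge.
xgb_unreg = XGBRegressor(
    booster='gblinear',
    objective='reg:squarederror',
    n_estimators=1000,
    learning_rate=0.5,
    reg_alpha=0,      # no L1 penalty
    reg_lambda=0,     # no L2 penalty
    n_jobs=-1,
)
xgb_unreg.fit(X_train, y_train)

# Side-by-side comparison with the OLS coefficients from above.
comparison = pd.DataFrame({
    'OLS': linear_model.coef_.ravel(),
    'gblinear_no_penalty': xgb_unreg.coef_,
}, index=X.columns)
print(comparison)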