I am trying to control overfitting in xgboost in R by lowering eta, but when I compare the readout from xgb.cv with the readout from xgb.train, I don't understand why xgb.cv doesn't seem to overfit while xgb.train does. How can I get the same steady downward progression of test mlogloss in xgb.train (see the sketch after my code below)? I balanced my classes before running the model.
[1] "########### i is 1 and j 1 ##################"
[1] "Creating cv..."
# this part is good -------------------
[0] train-mlogloss:1.609325+0.000006 test-mlogloss:1.609315+0.000009
[100] train-mlogloss:1.601508+0.001238 test-mlogloss:1.602480+0.001071
[200] train-mlogloss:1.594359+0.002151 test-mlogloss:1.596278+0.001812
[300] train-mlogloss:1.587120+0.002100 test-mlogloss:1.589944+0.001546
[400] train-mlogloss:1.580558+0.001839 test-mlogloss:1.584062+0.001251
[1] "Took 160 seconds to cv train with 500 rounds..."
[1] "Creating model..."
# this part is bad -------------------
[0] train-mlogloss:1.609341 test-mlogloss:1.609383
[100] train-mlogloss:1.602439 test-mlogloss:1.609435
[200] train-mlogloss:1.594991 test-mlogloss:1.609580
[300] train-mlogloss:1.587814 test-mlogloss:1.609732
My parameters and my code for both the cv and the train call are:
param = list("objective" = "multi:softprob"
, "eval_metric" = "mlogloss"
, 'num_class' = 5
, 'eta' = 0.001)
bst.cv = xgb.cv(param = param
, data = ce.dmatrix
, nrounds = nrounds
, nfold = 4
, stratified = T
, print.every.n = 100
, watchlist = watchlist
, early.stop.round = 10
)
bst = xgb.train(param = param
, data = ce.dmatrix
, nrounds = nrounds
, print.every.n = 100
, watchlist = watchlist
# , early.stop.round = 10
)
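
For reference, xgb.cv computes its test-mlogloss column on held-out folds, so the closest single-model analogue would be to give xgb.train a watchlist whose test entry is an explicitly held-out DMatrix. Below is a minimal sketch of that setup, not my actual code: ce.matrix (feature matrix) and ce.labels (0-based integer class labels) are hypothetical stand-ins for whatever ce.dmatrix was built from, and the 25% holdout mirrors nfold = 4.

library(xgboost)

# Hypothetical names: ce.matrix and ce.labels stand in for the raw
# objects ce.dmatrix was constructed from.
set.seed(1)
n <- nrow(ce.matrix)
holdout <- sample(n, n %/% 4)  # hold out 25% of rows, mirroring nfold = 4

dtrain <- xgb.DMatrix(ce.matrix[-holdout, ], label = ce.labels[-holdout])
dtest  <- xgb.DMatrix(ce.matrix[holdout, ], label = ce.labels[holdout])

# Train on the reduced set and monitor mlogloss on rows the booster
# never sees during training, which is how xgb.cv produces its test column.
bst <- xgb.train(params = param
                 , data = dtrain
                 , nrounds = nrounds
                 , print.every.n = 100
                 , watchlist = list(train = dtrain, test = dtest)
                 , early.stop.round = 10
)

If the test curve still climbs while the train curve falls under a setup like this, the divergence is genuine overfitting on unseen rows rather than an artifact of how the watchlist was assembled.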