
In this StackOverflow post I asked whether there was something wrong with my syntax when training an XGBoost model (in R) with the native pseudo-Huber loss `reg:pseudohubererror`, since neither the training nor the test error improves (both remain constant). It doesn't seem to be a syntax error, since custom objectives such as log-cosh loss also show the same effect.
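To make that concrete, the log-cosh objective follows the usual custom-objective pattern; a minimal sketch (not my exact code), using the residual r = preds - label, with grad = tanh(r) and hess = 1/cosh(r)^2:

    logcosh_obj <- function(preds, dtrain) {
      r <- preds - getinfo(dtrain, "label")  # residuals
      grad <- tanh(r)                        # first derivative of log(cosh(r))
      hess <- 1 / cosh(r)^2                  # second derivative of log(cosh(r))
      list(grad = grad, hess = hess)         # what xgb.train's obj argument expects
    }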

I am interested in understanding why it doesn't work: training with an absolute-error-like loss is fairly popular because of its insensitivity to outliers, and could therefore do better than squared loss - and it is a native objective, so it must be good for something, right? Does it have something to do with the fact that XGBoost requires both a gradient and a Hessian? In what context (what kind of data), if at all, would it work?
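For reference, a custom objective in XGBoost has to supply exactly those two quantities per observation. A minimal pseudo-Huber sketch (assuming delta = 1, i.e. L(r) = sqrt(1 + r^2) - 1; not my exact code):

    pseudohuber_obj <- function(preds, dtrain) {
      r <- preds - getinfo(dtrain, "label")  # residuals
      grad <- r / sqrt(1 + r^2)              # first derivative, bounded in (-1, 1)
      hess <- (1 + r^2)^(-3/2)               # second derivative, vanishes for large |r|
      list(grad = grad, hess = hess)
    }
    
    # used in place of the native objective, e.g.
    # xgb.train(data = train, obj = pseudohuber_obj, nrounds = 100, ...)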

So far I haven't found any example where XGBoost with Huber loss is used on a concrete learning problem.


Here's the code from the post above for reference:

Code:

    library(xgboost)
    
    # simulate a simple linear signal: y = 2*x1 + 3*x2 + noise
    n = 1000
    X = cbind(runif(n, 10, 20), runif(n, 0, 10))
    y = X %*% c(2, 3) + rnorm(n, 0, 1)
    
    # first n-1 rows for training, the last row as a one-observation test set
    train = xgb.DMatrix(data  = X[-n,],
                        label = y[-n])
    
    test = xgb.DMatrix(data   = t(as.matrix(X[n,])),
                       label = y[n])
    
    watchlist = list(train = train, test = test)
    
    xbg_test = xgb.train(data = train,
                         objective = "reg:pseudohubererror",
                         eval_metric = "mae",
                         watchlist = watchlist,
                         gamma = 1,
                         eta = 0.01,
                         nrounds = 10000,
                         early_stopping_rounds = 100)

Result:

    [1] train-mae:44.372692 test-mae:33.085709 
    Multiple eval metrics are present. Will use test_mae for early stopping.
    Will train until test_mae hasn't improved in 100 rounds.
    
    [2] train-mae:44.372692 test-mae:33.085709 
    [3] train-mae:44.372688 test-mae:33.085709 
    [4] train-mae:44.372688 test-mae:33.085709 
    [5] train-mae:44.372688 test-mae:33.085709 
    [6] train-mae:44.372688 test-mae:33.085709 
    [7] train-mae:44.372688 test-mae:33.085709 
    [8] train-mae:44.372688 test-mae:33.085709 
    [9] train-mae:44.372688 test-mae:33.085709 
    [10]    train-mae:44.372692 test-mae:33.085709 
PaulG
  • Have you tried implementing mean squared error as a custom objective function? – wdkrnls Aug 27 '21 at 22:30
  • 1
    @wdkrnls: Yes I have, to check if the rmse corresponds to the native function - and it did as far as I remember. I have tried multiple custom objective functions and they all improved, except the Huber loss. – PaulG Aug 29 '21 at 08:33
  • That's good. I thought in the other question you had tried to use `log_cosh` and didn't have much luck with that either. I tried that one myself and couldn't get it to work, so you beat me there. – wdkrnls Aug 30 '21 at 15:04
    @wdkrnls: Sorry, I meant Huber loss and related functions such as log-cosh did *not* improve (log-cosh and, I believe, 2 other Huber-loss implementations were the only custom ones I tried). Other custom losses I used (not Huber-related), like RMSE, MSE, earth-mover-distance-based losses, etc., behaved as expected. – PaulG Aug 31 '21 at 16:41

0 Answers