I am studying the CatBoost paper https://arxiv.org/pdf/1706.09516.pdf (particularly Function BuildTree on page 16), and noticed that it does not mention regularization.

In particular, split selection is based on minimizing the loss of a new candidate tree, measured by the cosine distance between the previous iteration's gradients and the tree's outputs. I don't see a "lambda" parameter going in to penalize new splits.
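For concreteness, here is a toy version of the scoring I am describing, with my own names and a simplified leaf-value rule (each leaf outputs the mean gradient of its documents); this is not CatBoost's actual code:

```python
import numpy as np

def split_score(gradients, leaf_assignment):
    """Cosine similarity between the gradient vector and the candidate
    tree's outputs on the training documents; higher is better."""
    _, inverse = np.unique(leaf_assignment, return_inverse=True)
    # Simplified leaf values: the mean gradient of the documents in each leaf.
    leaf_values = np.array([gradients[inverse == i].mean()
                            for i in range(inverse.max() + 1)])
    outputs = leaf_values[inverse]
    return gradients @ outputs / (
        np.linalg.norm(gradients) * np.linalg.norm(outputs))

# A candidate split is chosen to maximize this score; note there is
# no lambda penalty anywhere in it.
```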

However, the CatBoost package has the parameter l2_leaf_reg, described as the "Coefficient at the L2 regularization term of the cost function". How does that parameter work?

Poland Spring
    As a comment: the formal `catboost` documentation is messy (at best). I would suggest you read the code. It might do exactly what `xgboost` does by penalising the leaf weights' $l_2$ norm, or they might have done something slightly different. – usεr11852 Aug 15 '19 at 22:12

1 Answer


The value of the parameter is added to the leaf denominator, for each leaf, at every step. Since it is added to the denominator, the higher l2_leaf_reg is, the smaller the value the leaf will obtain.
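A minimal sketch of that computation (my own simplification, not CatBoost's actual code; the count-plus-lambda denominator follows the line linked below, and 3.0 mirrors the library's default):

```python
import numpy as np

def leaf_value(leaf_gradients, l2_leaf_reg=3.0):
    # Numerator: sum of the gradients of the documents in the leaf.
    # Denominator: document count plus l2_leaf_reg, so a larger
    # l2_leaf_reg shrinks the leaf value toward zero.
    return leaf_gradients.sum() / (len(leaf_gradients) + l2_leaf_reg)

g = np.array([0.8, 1.2, 1.0])
print(leaf_value(g, l2_leaf_reg=0.0))  # 1.0 -- no shrinkage
print(leaf_value(g, l2_leaf_reg=3.0))  # 0.5 -- shrunk toward zero
```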

This is quite intuitive when you think about how $L_2$ regularization works in a typical (ridge) linear regression setting.
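To make the analogy concrete: fitting a single constant $\mu$ by ridge regression means minimizing $\sum_i (y_i - \mu)^2 + \lambda \mu^2$, which gives

$$\hat{\mu} = \frac{\sum_i y_i}{n + \lambda},$$

the same "sum over count plus lambda" shape as the regularized leaf value above.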

https://github.com/catboost/catboost/blob/560e1dbcb2fb960f1eb40f4b3102b269a8d15257/catboost/private/libs/documents_importance/tree_statistics.cpp#L75

Halil Bilgin