I am studying the CatBoost paper https://arxiv.org/pdf/1706.09516.pdf (particularly Function BuildTree on page 16), and noticed that it does not mention regularization.
In particular, split selection is based on minimizing the loss of a new candidate tree, measured by the cosine distance between the previous iteration's gradients and the tree's outputs. I don't see any "lambda" parameter entering this score to penalize new splits.
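To make sure I'm reading the paper correctly, here is a small sketch of how I understand the split-scoring step, in my own notation (not CatBoost's actual code): `g` is the gradient vector from the previous iteration and `a` is the vector of per-example outputs of a candidate tree.

```python
import numpy as np

def cosine_score(g: np.ndarray, a: np.ndarray) -> float:
    """Cosine similarity between previous-iteration gradients and candidate tree outputs.

    My reading of the split-selection criterion in Function BuildTree; the split with
    the best score is kept. No regularization term appears here as far as I can tell.
    """
    return float(np.dot(g, a) / (np.linalg.norm(g) * np.linalg.norm(a)))
```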
However, the CatBoost package has the parameter l2_leaf_reg, documented as the "Coefficient at the L2 regularization term of the cost function". How does that parameter work?
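For context, my assumption (based on other GBDT implementations such as XGBoost-style Newton boosting, not on the CatBoost source) is that an L2 coefficient usually enters the leaf value estimate as a denominator shrinkage, roughly like this sketch:

```python
import numpy as np

def regularized_leaf_value(grads: np.ndarray, hess: np.ndarray, l2_leaf_reg: float) -> float:
    """L2-regularized leaf value as typically done in Newton-style boosting.

    This is my assumption of what l2_leaf_reg might correspond to, i.e.
    leaf_value = -sum(gradients) / (sum(hessians) + lambda),
    which shrinks leaf values toward zero; it is not taken from CatBoost's code.
    """
    return float(-grads.sum() / (hess.sum() + l2_leaf_reg))
```

Is this roughly how l2_leaf_reg is applied, or does it enter the split-selection score itself?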