I am reading Chris Burges's paper on LambdaRank and LambdaMART for learning to rank. In these methods we only need to compute the lambdas, which play the role of gradients, and use them to update the model parameters; we never need an explicit cost function. LambdaMART carries out this optimization with gradient boosting machines (GBM).
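To make my reading of the paper concrete, here is a toy sketch (my own simplification, not code from the paper) of how the per-document lambdas accumulate pairwise terms weighted by the NDCG change from swapping two documents. The `sigma` parameter, the unnormalized `delta_dcg` helper, and all names are illustrative assumptions on my part.

```python
import numpy as np

def delta_dcg(relevance, ranks, i, j):
    """|change in DCG| if documents i and j swapped rank positions.
    (NDCG would divide by the ideal DCG; omitted here for brevity.)"""
    gain = lambda r: 2.0 ** r - 1.0
    discount = lambda pos: 1.0 / np.log2(pos + 2.0)   # positions are 0-based
    return abs((gain(relevance[i]) - gain(relevance[j])) *
               (discount(ranks[i]) - discount(ranks[j])))

def lambdas_for_query(scores, relevance, sigma=1.0):
    """Accumulate lambda_i over all pairs where one document is more relevant than the other."""
    ranks = np.argsort(np.argsort(-scores))           # rank position of each doc under current scores
    lam = np.zeros_like(scores, dtype=float)
    for i in range(len(scores)):
        for j in range(len(scores)):
            if relevance[i] > relevance[j]:            # i should be ranked above j
                rho = 1.0 / (1.0 + np.exp(sigma * (scores[i] - scores[j])))
                l_ij = -sigma * rho * delta_dcg(relevance, ranks, i, j)
                lam[i] += l_ij                         # contribution from pairs where i is more relevant
                lam[j] -= l_ij                         # opposite-sign contribution for the less relevant doc
    return lam

# Example: three documents for one query, with current model scores and relevance labels.
print(lambdas_for_query(np.array([0.2, 1.5, -0.3]), np.array([2, 0, 1])))
```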
My question is: can we apply GBM to any machine learning problem, as long as we have gradients? The post Gradient in Gradient Boosting explains this for regression: the prediction target for each new tree is the negative gradient of the loss function. For a regression problem with squared-error cost $C = \frac{1}{2}(y − \hat{y})^2$, the sequential regression trees fit the residual $z = y − \hat{y} = -\frac{\partial C}{\partial \hat{y}}$.
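Here is a minimal sketch of what I mean by the regression case, fitting each tree to the negative gradient of the squared-error cost; `n_rounds`, `lr`, and `max_depth` are my own illustrative choices, not anything from the paper or the post.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def gbm_squared_error(X, y, n_rounds=50, lr=0.1, max_depth=3):
    """Boost regression trees on the negative gradient of C = 1/2 * (y - yhat)^2."""
    pred = np.full(len(y), y.mean(), dtype=float)     # F_0: constant initial model
    trees = []
    for _ in range(n_rounds):
        neg_grad = y - pred                           # -dC/dyhat = y - yhat, i.e. the residual
        tree = DecisionTreeRegressor(max_depth=max_depth).fit(X, neg_grad)
        pred += lr * tree.predict(X)                  # gradient step in function space
        trees.append(tree)
    return trees, pred
```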
But the loss function in LambdaRank is not that simple (it optimizes NDCG, which is not differentiable with respect to the model scores). Is there a generic, simple picture of how gradient boosting works here?