Short Summary: An overview of pointwise, pairwise, and listwise learning-to-rank models.
Problem: given a query \(q\) and a set of documents \(D = \{d_1, \dots, d_n\}\), rank the documents by relevance. Put simply, ranking is a sorting problem, where the challenge is to find the "key" by which to sort. That key is typically produced by a Machine Learning (ML) model.
Unfortunately, rank information is available only after sorting, and sorting is non-differentiable.
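To make the sorting view concrete, here is a minimal sketch in which a toy `score` function stands in for a learned model \(M(q, d)\); the term-overlap scoring is purely illustrative:

```python
# Toy stand-in for a learned relevance model M(q, d):
# count how many query terms appear in the document.
def score(query: str, doc: str) -> float:
    return float(sum(term in doc for term in query.split()))

def rank(query: str, docs: list[str]) -> list[str]:
    # Sorting by the model score is the non-differentiable step:
    # the final ranks only exist after this sort.
    return sorted(docs, key=lambda d: score(query, d), reverse=True)

docs = ["cats and dogs", "quantum computing", "dogs playing chess"]
print(rank("dogs chess", docs))
# → ['dogs playing chess', 'cats and dogs', 'quantum computing']
```

The model only ever emits per-document scores; the discrete ranking appears downstream of the sort, which is why the loss cannot be written directly in terms of ranks.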
Pointwise Methods
- Considers a single candidate \(x_i = (q, d_i)\)
- Predicts a relevance score \(S_i = M(x_i)\) for the query–document pair
- The predicted relevance score is then compared to the ground truth
- Hence, we reformulate the problem as a regression problem
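The regression reformulation can be sketched as follows; the linear model and synthetic features are assumptions for illustration, standing in for any ML model \(M\) and any feature representation of \((q, d_i)\):

```python
import numpy as np

# Pointwise learning-to-rank as regression: each (query, document)
# pair is one feature vector x_i with a graded relevance label y_i.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                     # features of (q, d_i) pairs
w_true = np.array([1.0, -2.0, 0.5])               # synthetic "true" weights
y = X @ w_true + rng.normal(scale=0.1, size=100)  # noisy relevance labels

# Fit a linear scorer by gradient descent on the MSE loss.
w = np.zeros(3)
for _ in range(500):
    grad = 2 * X.T @ (X @ w - y) / len(y)  # gradient of mean squared error
    w -= 0.1 * grad

# At inference time, documents are simply sorted by predicted score.
scores = X @ w
ranking = np.argsort(-scores)
```

Note that the MSE loss never sees the ranking itself: two models with very different MSE can induce the same document order, which is the main criticism of pointwise methods.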
Pairwise Methods
- Considers a pair of candidates \(x_i = (q, d_i)\) and \(x_j = (q, d_j)\)
- Predicts whether \(d_i\) should rank above \(d_j\), i.e., whether \(S_i = M(x_i) > S_j = M(x_j)\)
- The predicted value lies in the \((0, 1)\) interval and is compared to the ground-truth preference
- Rank information is still only available after sorting, which is not differentiable; fortunately, Gradient Descent does not really need a loss function, it only needs its gradient!
- Hence, we reformulate the problem as a binary classification problem
- Examples:
- LambdaRank defines the gradients of an implicit loss function, so that pairs whose swap would cause a large change in the evaluation metric (e.g., NDCG) receive much bigger gradients
- Having gradients is also enough to build a Gradient Boosting model; this is the idea behind LambdaMART, a boosted-tree version of LambdaRank
- Fun fact: both use "Lambda" as you don’t need the costs, only the gradients (\(\lambda\)) of the cost with respect to the model score
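A minimal sketch of these lambdas, assuming the RankNet-style pairwise probability \(P_{ij} = \sigma(s_i - s_j)\) and the \(|\Delta \text{NDCG}|\) scaling used by LambdaRank; the toy relevance list and scores are made up for illustration:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# RankNet-style lambda: for a pair where d_i is preferred over d_j,
# the gradient of the implicit cross-entropy loss w.r.t. s_i depends
# only on the score difference (and lambda_ji = -lambda_ij).
def lambda_ij(s_i, s_j, sigma=1.0):
    return -sigma * sigmoid(-sigma * (s_i - s_j))

# LambdaRank additionally scales each lambda by |delta NDCG|: the
# change in NDCG if documents i and j were swapped in the ranking.
def dcg(rels):
    return sum(r / np.log2(pos + 2) for pos, r in enumerate(rels))

def delta_ndcg(rels, i, j):
    ideal = dcg(sorted(rels, reverse=True))
    swapped = list(rels)
    swapped[i], swapped[j] = swapped[j], swapped[i]
    return abs(dcg(swapped) - dcg(rels)) / ideal

rels = [3, 2, 0, 1]            # graded relevance in the current ranking
scores = [2.0, 1.5, 1.0, 0.5]  # current model scores
lam = lambda_ij(scores[0], scores[2]) * delta_ndcg(rels, 0, 2)
```

This is exactly why no explicit cost is ever written down: the \(\lambda\) values are handed to the optimizer (Gradient Descent in LambdaRank, Gradient Boosting in LambdaMART) as if they were gradients of some loss.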
Listwise Methods
- Considers the whole list of candidates \(x_1 = (q, d_1), \dots, x_n = (q, d_n)\)
- Predicts the entire order of documents
- Directly optimizes the evaluation metric (or a smoothed surrogate of it, since metrics like NDCG are not differentiable)
- Hence, we reformulate the problem as a metric maximization problem
- Examples:
- ListNet
- SetRank
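A minimal sketch of the listwise idea, assuming ListNet's top-one approximation: both the ground-truth relevances and the predicted scores are mapped to probability distributions over the list, which are then compared with cross-entropy:

```python
import numpy as np

def softmax(x):
    z = np.exp(x - np.max(x))  # shift for numerical stability
    return z / z.sum()

# ListNet-style listwise loss (top-one approximation): compare the
# distribution induced by the labels with the one induced by the
# predicted scores, over the entire list at once.
def listnet_loss(scores, rels):
    p_true = softmax(np.asarray(rels, dtype=float))
    p_pred = softmax(np.asarray(scores, dtype=float))
    return -np.sum(p_true * np.log(p_pred))

rels = [3.0, 1.0, 0.0]
# Scores matching the label order yield a lower loss than reversed scores.
print(listnet_loss([3.0, 1.0, 0.0], rels) < listnet_loss([0.0, 1.0, 3.0], rels))
# → True
```

Unlike the pointwise and pairwise losses, this objective is a function of the whole list, so the gradient reflects every document's position at once.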