Learning to Rank
--A Brief Review
Yunpeng Xu
1
Ranking and sorting
Ranking: items fall into K ordered categories
Sorting: each sample receives a distinct rank
Generally, there is no need to differentiate them
2
Overview
Rank aggregation
Label ranking
Query and rank by example
Preference learning
Open problems, and what we can do
3
Ranking aggregation
The need to combine different ranking results
Voting systems, welfare economics, decision making
1. Hillary Clinton > John Edwards > Barack Obama
2. Barack Obama >John Edwards > Hillary Clinton
=> ?
4
Ranking aggregation (cont.)
Arrow’s impossibility theorem
Kenneth Arrow, 1951
If the decision-making body has at least
two members and at least three options to
decide among, then it is impossible to
design a social welfare function that
satisfies all these conditions at once.
5
Ranking aggregation (cont.)
Arrow’s impossibility theorem
Five fairness conditions:
non-dictatorship
unrestricted domain (universality)
independence of irrelevant alternatives
positive association of social and individual values (monotonicity)
non-imposition (citizen sovereignty)
These cannot all be satisfied simultaneously
6
Ranking aggregation (cont.)
Borda's method (de Borda, 1770)
Given k ranked lists τ1, ..., τk, each over the same n items
For each list τi and item j, define Bτi(j) as the number of items ranked below j in τi
Rank all items by the total score B(j) = Σi Bτi(j)
Example (from the two lists above): Hillary Clinton: 2, John Edwards: 2, Barack Obama: 2 (a three-way tie)
7
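The Borda count above can be sketched in a few lines of Python (a minimal illustration written for this review, using the candidate lists from the earlier slide):

```python
def borda_scores(lists):
    """Borda count: an item's score in one list is the number of
    items ranked below it; scores are summed across lists."""
    scores = {}
    for ranking in lists:
        n = len(ranking)
        for pos, item in enumerate(ranking):
            # pos 0 is the top rank, so n - 1 - pos items are below it
            scores[item] = scores.get(item, 0) + (n - 1 - pos)
    return scores

votes = [["Clinton", "Edwards", "Obama"],
         ["Obama", "Edwards", "Clinton"]]
print(borda_scores(votes))  # every candidate ties at 2
```

With the two reversed ballots, every candidate ends up with the same total, which is exactly the tie shown on the slide.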
Ranking aggregation (cont.) -- Borda
Condorcet criterion
If a majority prefers x to y, then x must be ranked above y
Borda's method does not satisfy it, nor does any method that assigns fixed weights to rank positions
8
Ranking aggregation (cont.)
Assumption relaxation
Maximize a consensus criterion
Equivalent to minimizing disagreement (Kemeny; social choice theory)
NP-hard!
Sub-optimal solutions via heuristics
9
Ranking aggregation (cont.)
Basic idea
Assign different weights to different experts
Supervised aggregation
Weighting according to a final judger (ground truth)
Unsupervised aggregation
Aims to minimize the disagreement measured by
certain distances
10
Ranking aggregation (cont.)
Distance measures
Spearman footrule distance: F(σ, π) = Σi |σ(i) − π(i)|
Kendall tau distance: K(σ, π) = |{(i, j) | i < j, σ(i) < σ(j) but π(i) > π(j)}|
Kendall tau distance for multiple lists: sum the pairwise distances to each input list
Scaled footrule distance: positions are scaled by list length, so lists of different lengths become comparable
11
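Both distances are easy to compute directly from the rank positions; this small sketch (written for this review, with ranks represented as item-to-position dicts) mirrors the two definitions:

```python
def footrule(sigma, pi):
    """Spearman footrule: sum of absolute positional displacements."""
    return sum(abs(sigma[x] - pi[x]) for x in sigma)

def kendall_tau(sigma, pi):
    """Number of item pairs ordered one way by sigma and the other by pi."""
    items = list(sigma)
    return sum(1 for i, a in enumerate(items) for b in items[i + 1:]
               if (sigma[a] - sigma[b]) * (pi[a] - pi[b]) < 0)

# sigma ranks a > b > c, pi ranks c > b > a (rank 1 = top)
sigma = {"a": 1, "b": 2, "c": 3}
pi = {"c": 1, "b": 2, "a": 3}
print(footrule(sigma, pi))     # |1-3| + |2-2| + |3-1| = 4
print(kendall_tau(sigma, pi))  # all 3 pairs disagree
```

The quadratic pair loop is fine for illustration; an O(n log n) merge-sort based count exists for large lists.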
Ranking aggregation (cont.)
-- Distance Measures
Kemeny optimal ranking
Minimizes the total Kendall tau distance to the input lists
Still NP-hard to compute
Local Kemenization (locally optimal aggregation)
Can be computed in O(kn log n)
12
Ranking aggregation (cont.)
Supervised Rank Aggregation (SRA, WWW 2007)
Ground truth: a preference matrix H
Goal: rank items by a learned weighted score
The weights are chosen to agree with H; the constraints can be relaxed for tractability
13
Ranking aggregation (cont.) -- SRA
Method: use Borda's scores as the rankers' inputs
Objective: fit the weighted combination to the ground-truth preference matrix H
14
Ranking aggregation (cont.)
Markov Chain Rank Aggregation (MCRA, WWW 2005)
Map the ranked lists to a Markov chain M
Compute the stationary distribution π of M
Rank items by π
Example lists:
B > C > D
A > D > E
A > B > E
15
Ranking aggregation (cont.) - MCRA
Different transition strategies
MC1: all outgoing edges have uniform probabilities
MC2: choose a list uniformly, then choose the next item from that list; ...
For a disconnected graph, define transition probabilities based on item similarity
16
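The MC1 strategy can be sketched as follows (a minimal illustration written for this review, not the papers' exact construction: here an edge goes from item i to every item that some list ranks above i, with uniform probability, plus a small teleportation term to keep the chain ergodic):

```python
def mc1_aggregate(lists, items, damping=0.15, iters=200):
    """MC1 sketch: from item i, move uniformly to any item that some
    list ranks above i; items with high stationary probability rank first."""
    idx = {x: k for k, x in enumerate(items)}
    n = len(items)
    beats = [set() for _ in range(n)]        # beats[i] = items ranked above i
    for ranking in lists:
        for p, lower in enumerate(ranking):
            for upper in ranking[:p]:
                beats[idx[lower]].add(idx[upper])
    # row-stochastic transition matrix with teleportation
    P = [[damping / n] * n for _ in range(n)]
    for i in range(n):
        targets = beats[i] or {i}            # never-beaten items self-loop
        for j in targets:
            P[i][j] += (1 - damping) / len(targets)
    pi = [1.0 / n] * n
    for _ in range(iters):                   # power iteration
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return sorted(items, key=lambda x: -pi[idx[x]])

lists = [["B", "C", "D"], ["A", "D", "E"], ["A", "B", "E"]]
print(mc1_aggregate(lists, ["A", "B", "C", "D", "E"]))  # A comes first
```

On the slide's example, A is never ranked below anything, so probability mass accumulates on it and it tops the aggregate ranking.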
Ranking aggregation (cont.)
Unsupervised Learning Algorithm for Rank Aggregation (ULARA; Dan Roth et al., ECML 2007)
Goal: learn a weight for each input ranker without ground truth
Method: maximize agreement among the weighted rankers
17
Ranking aggregation (cont.) -- ULARA
Method
Algorithm: iterative gradient descent
Initially, w is uniform, then updated iteratively
18
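The iterative reweighting idea can be sketched like this (a hedged, simplified sketch written for this review, not the paper's exact objective: each ranker's weight is pushed toward agreement with the current weighted consensus, with disagreement measured by a footrule-style distance):

```python
def ulara_weights(lists, items, lr=0.1, iters=50):
    """ULARA-style sketch: start with uniform weights, then repeatedly
    lower the weight of rankers that disagree with the weighted
    consensus, renormalizing each step."""
    ranks = [{x: r for r, x in enumerate(l)} for l in lists]
    w = [1.0 / len(lists)] * len(lists)
    for _ in range(iters):
        # weighted average rank = current consensus position of each item
        cons = {x: sum(wi * ri[x] for wi, ri in zip(w, ranks)) for x in items}
        dis = [sum(abs(ri[x] - cons[x]) for x in items) for ri in ranks]
        w = [max(wi - lr * d / len(items), 0.0) for wi, d in zip(w, dis)]
        s = sum(w) or 1.0
        w = [wi / s for wi in w]
    return w

# two agreeing rankers and one dissenter: the dissenter loses weight
lists = [["a", "b", "c"], ["a", "b", "c"], ["c", "b", "a"]]
w = ulara_weights(lists, ["a", "b", "c"])
print(w)
```

This captures the slide's two points: w starts uniform and is updated iteratively so that outlier experts contribute less.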
Overview
Rank aggregation
Label ranking
Query and rank by example
Preference learning
Open problems, and what we can do
19
Label Ranking
Goal: map from the input space to the set of total orders over a finite set of labels
Related to multi-label or multi-class problems
Input: Customer information
Output: Porsche > Toyota > Ford
Mountain > Sea> Beach
20
Label Ranking (cont.)
Pairwise ranking (ECML 2003)
Train a classifier for each pair of labels
When judging an example x: if the classifier for the pair (λi, λj) predicts λi > λj, count it as a vote for λi
Then rank all labels according to their votes
Total: k(k − 1)/2 classifiers
21
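The voting scheme is straightforward; this sketch (written for this review, with toy stand-in classifiers since no trained models appear in the slides) shows the k(k − 1)/2 pairwise votes being tallied:

```python
from itertools import combinations

def rank_by_pairwise_votes(labels, classifiers, x):
    """Pairwise label ranking: one classifier per label pair; the winner
    of each pairwise prediction gets a vote, and labels are ranked by
    total votes.  classifiers[(a, b)](x) returns the preferred label."""
    votes = {l: 0 for l in labels}
    for a, b in combinations(labels, 2):
        votes[classifiers[(a, b)](x)] += 1
    return sorted(labels, key=lambda l: -votes[l])

# toy classifiers that always prefer the alphabetically earlier label
labels = ["Ford", "Porsche", "Toyota"]
clf = {pair: (lambda x, p=pair: min(p)) for pair in combinations(labels, 2)}
print(rank_by_pairwise_votes(labels, clf, x=None))
```

With three labels there are exactly three classifiers, matching the k(k − 1)/2 count on the slide.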
Label Ranking (cont.)
Constraint Classification (NIPS 2002)
Consider a linear sorting function: label i is scored by wi · x
Goal: learn the weight vectors w1, ..., wk
Rank all labels by their scores wi · x
22
Label Ranking (cont.) -- CC
Expand the feature vector into R^(kd)
Generate positive/negative samples in the expanded space
23
Label Ranking (cont.) -- CC
Learn a separating hyperplane
Can be solved with an SVM
24
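The feature expansion is the heart of the reduction; this sketch (written for this review, following my understanding of the constraint classification embedding) builds the R^(kd) vector for one constraint "label i ranked above label j":

```python
def expand(x, i, j, k):
    """Constraint classification embedding: for the constraint
    'label i above label j', build a vector in R^(k*d) with x in
    block i and -x in block j.  A separating hyperplane
    w = (w_1, ..., w_k) on such vectors satisfies w_i . x > w_j . x."""
    d = len(x)
    v = [0.0] * (k * d)
    for t in range(d):
        v[i * d + t] = x[t]
        v[j * d + t] = -x[t]
    return v

v = expand([1.0, 2.0], i=0, j=2, k=3)
print(v)  # [1.0, 2.0, 0.0, 0.0, -1.0, -2.0]
```

Negating each expanded vector gives the matching negative sample, so any binary linear learner (e.g. an SVM) can fit the constraints.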
Overview
Rank aggregation
Label ranking
Query and rank by example
Preference learning
Open problems, and what we can do
25
Query and rank by example
Given a query, rank the retrieved items according to their relevance w.r.t. the query.
26
Query and rank by example (cont.)
Rank on manifold
Iterate f ← αSf + (1 − α)y, which converges to f* = (1 − α)(I − αS)^(-1) y
Essentially, this is a one-class semi-supervised method
27
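The manifold-ranking iteration is a few lines of code (a minimal sketch written for this review; S is assumed to be a symmetrically normalized similarity matrix and y marks the query item):

```python
def manifold_rank(S, y, alpha=0.9, iters=500):
    """Manifold-ranking iteration f <- alpha*S*f + (1-alpha)*y.
    Converges toward f* = (1-alpha)(I - alpha*S)^(-1) y when the
    spectral radius of alpha*S is below 1."""
    n = len(y)
    f = list(y)
    for _ in range(n and iters):
        f = [alpha * sum(S[i][j] * f[j] for j in range(n))
             + (1 - alpha) * y[i] for i in range(n)]
    return f

S = [[0.0, 0.5, 0.0],
     [0.5, 0.0, 0.5],
     [0.0, 0.5, 0.0]]        # a 3-node chain, normalized
y = [1.0, 0.0, 0.0]          # node 0 is the query
f = manifold_rank(S, y)
print(f)                     # scores decay with distance from the query
```

Items closer to the query on the manifold receive higher scores, which is exactly the one-class semi-supervised behavior the slide describes.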
Preference learning
Given a set of items and a set of user preferences over these items, rank all items according to those preferences.
Motivated by the need for personalized search.
28
Preference learning
Input: items x ∈ X
Preferences: a set of partial orders on X
Output: a total order on X
or, a map from X onto a structured label space Y
Preference function PREF(u, v): the confidence that u should be ranked above v
29
Existing methods
Learning to Order Things [W. Cohen 98]
Large Margin Ordinal Regression [R. Herbrich 98]
Pranking with Ranking [K. Crammer 01]
Optimizing Search Engines Using Clickthrough Data [T. Joachims 02]
An Efficient Boosting Algorithm for Combining Preferences [Y. Freund 03]
Classification Approach towards Ranking and Sorting Problems [S. Rajaram 03]
30
Existing methods
Learning to Rank Using Gradient Descent [C. Burges 05]
Stability and Generalization of Bipartite Ranking [S. Agarwal 05]
Generalization Bounds for k-Partite Ranking [S. Rajaram 05]
Ranking with a P-Norm Push [C. Rudin 05]
Magnitude-Preserving Ranking Algorithms [C. Cortes 07]
From Pairwise Approach to Listwise Approach [Z. Cao 07]
31
Large Margin Ordinal Regression
Map each item to the real line by the inner product w · x
32
Large Margin Ordinal Regression
Consider a pair (x1, x2) with x1 ranked above x2
Then require w · x1 − w · x2 ≥ 1
Introduce soft margins: w · (x1 − x2) ≥ 1 − ξ, ξ ≥ 0
Solve using an SVM
33
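The pairwise reduction underlying this method can be sketched as a data transform (written for this review; the Herbrich-style reduction turns ordinal regression into binary classification on difference vectors):

```python
def pairwise_transform(X, y):
    """For every pair with y_i > y_j, emit the difference vector
    x_i - x_j with label +1 (and its negation with label -1); a linear
    separator w on this data satisfies w . (x_i - x_j) > 0, i.e. the
    margin constraint from the slide."""
    Xd, yd = [], []
    for i in range(len(X)):
        for j in range(len(X)):
            if y[i] > y[j]:
                diff = [a - b for a, b in zip(X[i], X[j])]
                Xd.append(diff)
                yd.append(+1)
                Xd.append([-d for d in diff])
                yd.append(-1)
    return Xd, yd

X = [[0.0], [1.0], [2.0]]
y = [1, 2, 3]
Xd, yd = pairwise_transform(X, y)
print(len(Xd))  # 3 ordered pairs -> 6 signed examples
```

Any soft-margin SVM can then be trained on (Xd, yd), which is the "solve using an SVM" step.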
Learn to order things
A greedy ordering algorithm to order things
Calculate a score for each item: ρ(v) = Σu PREF(v, u) − Σu PREF(u, v)
34
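The greedy ordering can be sketched directly from that score (a minimal sketch written for this review; PREF is given here as a sparse dict of pairwise preference values):

```python
def greedy_order(items, pref):
    """Greedy ordering in the style of Cohen et al.: repeatedly emit the
    remaining item whose out-preference minus in-preference is largest,
    remove it, and recompute on the rest."""
    remaining = set(items)
    order = []
    while remaining:
        def score(v):
            out = sum(pref.get((v, u), 0.0) for u in remaining if u != v)
            inc = sum(pref.get((u, v), 0.0) for u in remaining if u != v)
            return out - inc
        best = max(remaining, key=score)
        order.append(best)
        remaining.remove(best)
    return order

pref = {("a", "b"): 1.0, ("b", "c"): 1.0, ("a", "c"): 1.0}
print(greedy_order(["a", "b", "c"], pref))  # ['a', 'b', 'c']
```

The greedy pass gives an approximately optimal ordering; exact maximization of total agreement is the NP-hard problem mentioned earlier in the deck.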
Learn to order things (cont.)
Combine different ranking functions
Learn the weights iteratively
35
Learn to order things
Combine preference functions
Do ranking aggregation
Update the weights based on feedback
36
Initially, w is uniform
At each step
Compute a combined ranking function
Produce a ranking aggregation
Measure the loss
37
RankBoost
Bipartite ranking problems
Combine weak rankers: H(x) = Σt αt ht(x)
Sort items by the values of H(x)
38
RankBoost (cont.)
Bipartite ranking problem
Initialize the sampling distribution over (negative, positive) pairs
Repeat: learn a weak ranker; update the sampling distribution (upweighting misordered pairs); normalize
Finally, combine the weak rankers
39
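The bipartite RankBoost loop can be sketched end to end (a simplified sketch written for this review: weak rankers are 0/1 threshold functions, and the pair distribution, alpha, and update rule follow the standard RankBoost recipe rather than any code from the slides):

```python
import math

def rankboost(pos, neg, weak_rankers, T=10):
    """Bipartite RankBoost sketch.  D is a distribution over
    (negative, positive) pairs; each round picks the weak ranker h
    maximizing r = sum_D D(n,p) * (h(p) - h(n)), sets
    alpha = 0.5 * ln((1+r)/(1-r)), and upweights pairs the current
    ensemble still misorders."""
    pairs = [(n, p) for n in neg for p in pos]
    D = {pair: 1.0 / len(pairs) for pair in pairs}
    ensemble = []
    for _ in range(T):
        r, h = max(((sum(D[(n, p)] * (hk(p) - hk(n)) for n, p in pairs), hk)
                    for hk in weak_rankers), key=lambda t: t[0])
        if abs(r) >= 1.0:            # perfect ranker: avoid log(0)
            r = math.copysign(0.999, r)
        alpha = 0.5 * math.log((1 + r) / (1 - r))
        ensemble.append((alpha, h))
        for n, p in pairs:           # boost weight of misordered pairs
            D[(n, p)] *= math.exp(alpha * (h(n) - h(p)))
        z = sum(D.values())
        D = {k: v / z for k, v in D.items()}
    return lambda x: sum(a * h(x) for a, h in ensemble)

# toy data on the real line: positives are the large values
pos, neg = [3.0, 4.0], [1.0, 2.0]
weak = [lambda x, t=t: 1.0 if x > t else 0.0 for t in (1.5, 2.5, 3.5)]
H = rankboost(pos, neg, weak, T=5)
print(sorted(pos + neg, key=H, reverse=True))
```

Sorting by the combined score H(x) places every positive above every negative on this toy problem, which is the final "sort based on values of H(x)" step.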
Stability and Generalization
Bipartite ranking problems
Expected rank error: the probability that a random positive-negative pair is misordered
Empirical rank error: the fraction of misordered pairs on the training sample
40
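The empirical rank error is simple to compute; this sketch (written for this review, with the standard convention that ties count as half an error) makes the definition concrete:

```python
def empirical_rank_error(f, pos, neg):
    """Empirical bipartite rank error: the fraction of
    (positive, negative) pairs that the scoring function f misorders,
    counting ties as half an error."""
    bad = sum(1.0 if f(p) < f(n) else 0.5 if f(p) == f(n) else 0.0
              for p in pos for n in neg)
    return bad / (len(pos) * len(neg))

score = lambda x: x              # score = identity
print(empirical_rank_error(score, pos=[3.0, 1.0], neg=[2.0, 1.0]))  # (1 + 0.5)/4
```

The expected rank error is the same quantity taken over the underlying distribution instead of the sample.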
Stability and Generalization (cont.)
Stability: how much the output changes when one training sample is removed
Generalization: bounds the expected rank error in terms of the empirical error and the stability
Generalizes to the k-partite ranking problem...
41
Rank on graph data
Objective
42
P-norm push
Focus on the topmost ranked items
The top of the list (the top-left region of the ROC curve) is the most important
43
P-norm push (cont.)
Height of a negative example k: the number of positive examples ranked below k
Cost of example k: g(Height(k))
g is convex and monotonically increasing, e.g. g(r) = r^p
44
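The objective is easy to state in code (a minimal sketch written for this review, using the polynomial price g(r) = r^p from the previous slide):

```python
def pnorm_push_objective(f, pos, neg, p=4):
    """P-norm push objective sketch: the 'height' of a negative example
    is the number of positives it outscores; the convex, increasing
    price g(r) = r**p makes negatives near the top of the list dominate
    the objective as p grows."""
    heights = [sum(1 for x in pos if f(x) < f(k)) for k in neg]
    return sum(h ** p for h in heights)

score = lambda x: x
print(pnorm_push_objective(score, pos=[1.0, 2.0, 3.0], neg=[0.5, 2.5], p=4))  # 0 + 2**4
```

With p = 1 this reduces to counting misordered pairs (the usual rank error); large p pushes high-scoring negatives down, which is the "p-norm push".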
P-norm push (cont.)
Run RankBoost to solve the problem
45
Thanks!
46