Learning Random Walks to Rank Nodes in Graphs

Alekh Agarwal and Soumen Chakrabarti
IIT Bombay
http://www.cse.iitb.ac.in/~soumen/doc/netrank
Learning to Rank
▶ Web search, recommender systems, database search, . . .
▶ Vector spaces: learn a good scoring function ⟨β, ψ(x)⟩ from training
  preferences, for a fixed feature mapping ψ on instances x
▶ Graphs: of great recent interest for modelling relationships
  [Figure: example entity-relation graph with edges labeled hasWord,
  cites, wrote, sent, received, worksFor, isA]
▶ Need to exploit/respect information from links

Two Roles of Edges in Graph Ranking
Associative Networks
▶ Edges encode similarity
▶ Preference for smooth scoring functions
▶ Typically edge weights indicate the extent of similarity
Random Walk Approach
▶ Edges indicate endorsement
▶ Motivated by Pagerank, widely used
▶ Each edge (u, v) has a transition probability Q(v, u) = Pr(v | u)
  (Note: equivalent for undirected graphs)
▶ Typically Q is fixed by hand and the stationary π = Qπ is found;
  a minimal sketch of this follows below
▶ We have an "inverse problem": given desired properties of π, find a
  transition probability matrix that produces them
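To make the fixed-Q direction concrete, here is a minimal power-iteration
sketch (not from the slides; the 3-node graph is made up) for finding
π = Qπ with a column-stochastic Q:

    import numpy as np

    def stationary_distribution(Q, tol=1e-10, max_iter=10_000):
        """Power iteration for pi = Q pi, where Q is column-stochastic:
        Q[v, u] = Pr(v | u), so each column of Q sums to 1."""
        n = Q.shape[0]
        pi = np.full(n, 1.0 / n)          # start from the uniform distribution
        for _ in range(max_iter):
            pi_next = Q @ pi              # one step of the walk
            if np.abs(pi_next - pi).sum() < tol:
                return pi_next
            pi = pi_next
        return pi

    # Tiny 3-node example: column u holds the out-transition probabilities of u.
    Q = np.array([[0.0, 0.5, 1.0],
                  [0.5, 0.0, 0.0],
                  [0.5, 0.5, 0.0]])
    print(stationary_distribution(Q))     # approx (4/9, 2/9, 3/9)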
Training and Testing
▶ Fixed directed graph G
▶ Training set of some node-pair preferences:
  "u ≺ v" means we want score(u) < score(v)
▶ More node-pair preferences in the test set
▶ The sampling distribution over node pairs is not necessarily uniform,
  but it is the same for training and testing
▶ Transductive, in the sense that the node-pair space is finite
▶ Performance is measured by the number of incorrect test-set
  predictions (sketched below)
▶ The learner must assign scores that satisfy the training pairs without
  overfitting
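A minimal sketch of the test-time measurement, assuming a score vector f
and a list of sampled preference pairs (both hypothetical):

    import numpy as np

    def preference_error(f, test_pairs):
        """Fraction of test preferences u ≺ v with f[u] >= f[v].
        f: array of node scores; test_pairs: list of (u, v) with u ≺ v."""
        wrong = sum(1 for u, v in test_pairs if f[u] >= f[v])
        return wrong / len(test_pairs)

    # Hypothetical scores and test pairs:
    f = np.array([0.1, 0.4, 0.2, 0.3])
    test_pairs = [(0, 1), (2, 3), (3, 1), (1, 0)]   # last pair is violated
    print(preference_error(f, test_pairs))          # 0.25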
Ranking Using Random Walk (Agarwal+ 2006)
▶ The Pagerank vector π = Qπ, with Π = diag(π), induces a flow
  q_uv = π(u) Q(v, u) along each graph edge (u, v)
▶ Goal: learn a flow p close to q that satisfies the training preferences
    min_{p_uv ≥ 0, s_uv ≥ 0}   Σ_{(u,v)∈E} p_uv log(p_uv / q_uv) + B Σ_{u≺v} s_uv    (KL)

    s.t.  Σ_{(u,v)∈E} p_uv = 1                                                       (Sum)

          ∀v ∈ V:  Σ_{(u,v)∈E} p_uv − Σ_{(v,w)∈E} p_vw = 0                           (Balance)

          ∀u ≺ v:  Σ_{(w,u)∈E} p_wu − Σ_{(w,v)∈E} p_wv − s_uv ≤ 0                    (Pref)
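The program above is convex (a KL objective with linear constraints), so
small instances can be handed to a generic solver. A sketch using scipy's
SLSQP on a made-up 3-node path graph, with q set to the flow of its
natural random walk; B and the preference are illustrative choices, not
values from the talk:

    import numpy as np
    from scipy.optimize import minimize

    # Path 0 - 1 - 2: from 0 go to 1; from 1 go to 0 or 2 w.p. 1/2 each;
    # from 2 go to 1. Then pi = (1/4, 1/2, 1/4) and every edge carries
    # pagerank flow q_uv = pi(u) * Pr(v|u) = 1/4.
    edges = [(0, 1), (1, 0), (1, 2), (2, 1)]
    q = np.full(len(edges), 0.25)
    prefs = [(1, 2)]                      # want score(1) < score(2); q violates this
    B = 10.0                              # slack penalty
    nE, nP, nV = len(edges), len(prefs), 3

    def inflow(p, v):
        return sum(p[i] for i, (a, b) in enumerate(edges) if b == v)

    def outflow(p, v):
        return sum(p[i] for i, (a, b) in enumerate(edges) if a == v)

    def objective(x):                     # KL(p || q) + B * sum of slacks
        p, s = x[:nE], x[nE:]
        return np.sum(p * np.log(p / q)) + B * np.sum(s)

    cons = [{'type': 'eq', 'fun': lambda x: np.sum(x[:nE]) - 1.0}]      # (Sum)
    for v in range(nV - 1):               # (Balance); the last one is redundant
        cons.append({'type': 'eq',
                     'fun': lambda x, v=v: inflow(x[:nE], v) - outflow(x[:nE], v)})
    for k, (u, v) in enumerate(prefs):    # (Pref), written as fun(x) >= 0
        cons.append({'type': 'ineq',
                     'fun': lambda x, u=u, v=v, k=k:
                            x[nE + k] - inflow(x[:nE], u) + inflow(x[:nE], v)})

    x0 = np.concatenate([q, np.zeros(nP)])            # start from the pagerank flow
    bounds = [(1e-9, 1.0)] * nE + [(0.0, None)] * nP  # p_uv >= 0, s_uv >= 0
    res = minimize(objective, x0, bounds=bounds, constraints=cons, method='SLSQP')
    print("learned scores:", [inflow(res.x[:nE], v) for v in range(nV)])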
Limitations of Markovian Flow Approach
▶ Only intuitive motivation for the use of q
▶ No known generalization bounds
▶ No margin in the training constraints

Ranking in Associative Networks (Agarwal 2006)
▶ Directed graph Laplacian

      L = I − (Π^{1/2} Qᵀ Π^{−1/2} + Π^{−1/2} Q Π^{1/2}) / 2

▶ f^⊤ L f = Σ_{(u,v)∈E} π(u) Q̂_uv ( f(u)/√π(u) − f(v)/√π(v) )²
  enforces smoothness
▶ Scoring algorithm:

      min_{f : V→ℝ, s = {s_uv ≥ 0 : u≺v}}   (1/2) f^⊤ L f + B Σ_{u≺v} s_uv      (Lap)

      subject to   f_v − f_u ≥ 1 − s_uv   ∀u ≺ v

▶ Generalization proved using algorithmic stability (Bousquet+ 2002)
▶ f(v) ∝ √π(v) minimizes f^⊤ L f,
  i.e. the Pagerank ordering is preferred in the absence of training data
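A small numpy sketch of this Laplacian on a made-up triangle graph,
checking the claim that f(v) ∝ √π(v) has zero roughness; the placement of
the transpose follows the column-stochastic convention Q(v, u) = Pr(v | u):

    import numpy as np

    def directed_laplacian(Q, pi):
        """L = I - (Pi^{1/2} Q^T Pi^{-1/2} + Pi^{-1/2} Q Pi^{1/2}) / 2
        for column-stochastic Q (Q[v, u] = Pr(v | u)) with pi = Q pi."""
        d = np.sqrt(pi)
        S = Q * (d[np.newaxis, :] / d[:, np.newaxis])   # Pi^{-1/2} Q Pi^{1/2}
        return np.eye(len(pi)) - (S + S.T) / 2.0        # S.T is the other summand

    # Uniform walk on the complete directed triangle; pi is uniform.
    Q = 0.5 * (np.ones((3, 3)) - np.eye(3))
    pi = np.full(3, 1.0 / 3.0)
    L = directed_laplacian(Q, pi)
    f = np.sqrt(pi)
    print(f @ L @ f)                                    # ~0: the minimizer
    print(np.array([1., 0., 0.]) @ L @ np.array([1., 0., 0.]))  # rougher f, > 0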
Limitations of the Laplacian Approach
▶ "Link as similarity hint" is not a universal view:
  millions of obscure pages u link to v = http://kernel-machines.org,
  with score(u) ≪ score(v)
▶ The score f_u can be arbitrary, even negative:
  no intuitive meaning, unlike the probability interpretation of π(u)
▶ Generalization depends on κ = max_{u∈V} L⁺(u, u), which is hard to
  interpret
▶ Typical QP solvers compute and store the large, dense L⁺ in RAM

Our Contributions
▶ Relating Laplacian regularization to KL regularization
▶ Stability-based generalization bounds for random walk ranking
▶ Key parameters that affect generalization
▶ Incorporation of a margin in random walk ranking
▶ A cost-sensitive ranking framework

Laplacian-KL Correspondence
▶ Algorithm (KL) returns a flow {p_uv}
▶ Define node scores using the flow {p_uv}:

      f_p(u) = √( Σ_{w:(w,u)∈E} p_wu )

▶ Same rank order as the Pagerank-style node score Σ_{w:(w,u)∈E} p_wu
▶ We show that

      KL(p ∥ q) ≤ ε   ⇒   f_p^⊤ L f_p ≤ 4 (2 ln 2)² ε

▶ ∴ If (KL) achieves a small KL distance, we can find scores with a low
  Laplacian roughness penalty too
▶ First hint that KL(· ∥ q) is a good regularizer
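A tiny sketch of this score definition (the flow values are made up):

    import numpy as np

    def flow_scores(p, edges, n):
        """f_p(u) = sqrt(total flow into u); monotone in the inflow, so it
        induces the same rank order as the raw inflow score."""
        inflow = np.zeros(n)
        for (u, v), puv in zip(edges, p):
            inflow[v] += puv
        return np.sqrt(inflow)

    edges = [(0, 1), (1, 2), (2, 0)]
    p = np.array([0.5, 0.3, 0.2])
    print(flow_scores(p, edges, 3))   # node 1 ranks above 2, which ranks above 0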
Generalization Bound for (KL)
▶ For any m ≥ 1 and any δ ∈ (0, 1), the following holds w.p. ≥ 1 − δ
  over random draws of a sample ≺ of size m:

      R ≤ R_emp + 2β + (4mβ + 1) √( ln(1/δ) / (2m) )

▶ Graph-agnostic case: β = (2 ln 2)^{1/2} / (λm)
▶ G may not reduce the function class (a)
▶ But G can be useful too (b)
▶ A degree bound D is not enough (c)
▶ Key parameter: eccentricity

      ρ = max_{u∈V}  ( max_{v:(u,v)∈E} p_uv ) / ( min_{v:(u,v)∈E} p_uv )

▶ The modified β worsens with increasing D and ρ
▶ Together, ρ and D control the influence of any single node;
  both the bound and ρ are easy to evaluate (see the sketch below)

  [Figure: example graphs (a), (b), (c) illustrating these cases]
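Both ρ and the bound are direct formulas; a sketch with placeholder
numbers (β, m, δ are illustrative, not values from the talk):

    import numpy as np

    def eccentricity(p, edges, n):
        """rho = max over nodes u of (largest out-flow of u) / (smallest)."""
        rho = 1.0
        for u in range(n):
            out = [puv for (a, b), puv in zip(edges, p) if a == u]
            if out:
                rho = max(rho, max(out) / min(out))
        return rho

    def stability_bound(R_emp, beta, m, delta):
        """R <= R_emp + 2*beta + (4*m*beta + 1) * sqrt(ln(1/delta) / (2*m))."""
        return (R_emp + 2 * beta
                + (4 * m * beta + 1) * np.sqrt(np.log(1 / delta) / (2 * m)))

    print(stability_bound(R_emp=0.10, beta=1e-3, m=500, delta=0.05))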
Loss Functions and Problem with Margin
▶ Thus far, our loss function has been

      ℓ₀(f, u, v) = 0                  if f(u) − f(v) < 0
                  = f(u) − f(v)        if 0 ≤ f(u) − f(v)

▶ Ideally, we want an upper bound on the 0/1 loss, e.g. the hinge loss

      ℓ₁(f, u, v) = 0                  if f(u) − f(v) < −1
                  = 1 + f(u) − f(v)    if −1 ≤ f(u) − f(v)

▶ In (KL), Σ_{(u,v)} p_uv = 1 means all node scores lie in [0, 1]
▶ Typically most node scores are very small
▶ An arbitrary additive margin (like "1") is unattainable except with
  "meaningless" slacks
▶ Fix: let the flows float to a variable scale: Σ_{(u,v)∈E} p_uv = F
▶ Luckily, KL(p ∥ q) is well-defined even if p is not normalized
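The two losses as a direct sketch, for a preference u ≺ v (we want
f(u) < f(v)):

    def loss0(fu, fv):
        """No-margin loss used so far: penalize only when f(u) >= f(v)."""
        return max(0.0, fu - fv)

    def loss1(fu, fv):
        """Hinge loss with additive margin 1: upper-bounds the 0/1 loss."""
        return max(0.0, 1.0 + fu - fv)

    print(loss0(0.2, 0.5), loss1(0.2, 0.5))  # 0.0, 0.7: order right, margin not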
(KL) with Additive Margin
    min_{{p_uv}, {s_uv}, F ≥ 1}   Σ_{(u,v)∈E} p_uv log(p_uv / q_uv) + B Σ_{u≺v} s_uv + B₁ F²

    subject to   1 + Σ_{(w,u)∈E} p_wu − Σ_{(w,v)∈E} p_wv − s_uv ≤ 0   ∀u ≺ v
▶ A small F enforces a large margin
▶ Can show a generalization bound (polynomial in 1/δ)
▶ Generalization worsens as the upper bound on F increases
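Relative to the earlier SLSQP sketch, the margin variant changes the
objective and the preference constraint as below. Treating (Sum) as
Σ p_uv = F is my assumption; the slide shows only the margin constraint:

    import numpy as np

    # Variable vector grows to x = [p (nE entries), s (nP entries), F (1)],
    # with bound F >= 1 replacing the fixed normalization.

    def objective_margin(x, q, nE, B=10.0, B1=1.0):
        p, s, F = x[:nE], x[nE:-1], x[-1]
        return np.sum(p * np.log(p / q)) + B * np.sum(s) + B1 * F ** 2

    def pref_margin(x, inflow, u, v, k, nE):
        """SLSQP 'ineq' form (>= 0) of: 1 + inflow(u) - inflow(v) - s_uv <= 0."""
        return x[nE + k] - 1.0 - inflow(x[:nE], u) + inflow(x[:nE], v)

    def sum_to_F(x, nE):
        """Assumed rescaling of (Sum): total flow equals the free scale F."""
        return np.sum(x[:nE]) - x[-1]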
Experimental Results
  [Figure: bar chart of TestError (0.20–0.45) for the NoMargin,
  AddMargin, and Laplacian methods]
▶ Evaluated on real and synthetic social networks
▶ First computed a reference (unweighted) Pagerank
▶ Secretly perturbed the conductance of some edges
▶ Computed the perturbed Pagerank, considered to be the "true" hidden
  score
▶ Sampled training and test pairs from agreements and disagreements
  between the two rankings (sketched below)
▶ The additive margin in (KL) helps, and gives the best results
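A sketch of this synthetic protocol; the graph size, damping factor,
perturbation range, and edge-selection rate are all made-up illustrative
choices:

    import numpy as np

    rng = np.random.default_rng(0)

    def pagerank(W, d=0.85, iters=200):
        """Power iteration on the column-normalized weights, teleport 1-d."""
        n = W.shape[0]
        Q = W / W.sum(axis=0, keepdims=True)          # column-stochastic
        pi = np.full(n, 1.0 / n)
        for _ in range(iters):
            pi = d * (Q @ pi) + (1 - d) / n
        return pi

    n = 20
    W = (rng.random((n, n)) < 0.3) + 1e-6 * rng.random((n, n))  # random digraph
    pi_ref = pagerank(W)                                        # reference ranking

    W_pert = W.copy()
    mask = rng.random((n, n)) < 0.1                  # secretly perturb some edges
    W_pert[mask] *= rng.uniform(0.1, 10.0, size=mask.sum())
    pi_true = pagerank(W_pert)                       # hidden "true" scores

    # Preference pairs come from comparing nodes under the hidden score;
    # agreements and disagreements with pi_ref can be sampled separately.
    pairs = [(u, v) for u in range(n) for v in range(n) if pi_true[u] < pi_true[v]]
    disagree = [(u, v) for u, v in pairs if pi_ref[u] >= pi_ref[v]]
    print(len(pairs), "pairs,", len(disagree), "where the rankings disagree")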
Cost Sensitivity in Ranking
▶ The tester wants no mistakes at the top of the ranked list
▶ Excessive cognitive burden on the trainer to provide total orders or
  "true" scores
▶ Main intuition: use the score estimate as a surrogate
▶ High confidence in nodes predicted to be ranked high
▶ With high probability, the loss is small for pairs with large f_u or f_v
▶ Generalization bounds follow from stability w.r.t. g
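The slides leave the exact surrogate weighting g unspecified; the sketch
below is one hypothetical instantiation of the intuition, weighting each
pair by the larger of its two estimated scores:

    import numpy as np

    def pair_weights(g, pairs):
        """Hypothetical cost-sensitive weights: pairs involving nodes
        predicted to rank high (large estimated score g) get larger weight,
        so mistakes near the top cost more than mistakes deep in the tail."""
        w = np.array([max(g[u], g[v]) for u, v in pairs])
        return w / w.sum()            # normalized weights over training pairs

    g = np.array([0.05, 0.40, 0.15, 0.20])   # estimated node scores (surrogate)
    print(pair_weights(g, [(0, 1), (2, 3)])) # [0.667, 0.333]: top pair dominates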
Cost-Sensitive Ranking Experiments
  [Figure: average Prec and PrecCost (0–0.7) per dataset for the
  cost-sensitive and cost-ignorant formulations]
▶ The number of position violations in the test set is no longer
  appropriate; we need a cost-sensitive performance measure
▶ Use precision at rank k (sketched below)
▶ The cost-sensitive formulation does better than its cost-ignorant
  counterpart
▶ Better on other accuracy measures too, like Kendall's tau
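A sketch of precision at rank k; the scores and the "true" top set are
made up:

    import numpy as np

    def precision_at_k(f, true_top, k):
        """Fraction of the k highest-scoring nodes (by learned f) that are
        in the true top-k set."""
        pred_top = set(np.argsort(-f)[:k])
        return len(pred_top & set(true_top)) / k

    f = np.array([0.9, 0.1, 0.8, 0.3, 0.7])
    true_top = [0, 2, 3]                   # hidden "true" top-3 nodes
    print(precision_at_k(f, true_top, 3))  # 2/3: predicted top-3 is {0, 2, 4}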
Summary
▶ Connections between the Laplacian and random walk setups
▶ Generalization bounds for random walk ranking in terms of intuitive
  graph parameters
▶ A margin for random walk ranking, which beats the Laplacian approach
  in experiments
▶ Optimization is more scalable for the random walk approach
▶ A general cost-sensitive ranking framework
▶ Effective experimental results in the modified Laplacian framework

References I
A. Agarwal, S. Chakrabarti, and S. Aggarwal, "Learning to rank networked
entities," in SIGKDD Conference, 2006, pp. 14–23.
http://www.cse.iitb.ac.in/~soumen/doc/netrank

S. Agarwal, "Ranking on graph data," in ICML, 2006, pp. 25–32.
http://web.mit.edu/shivani/www/Papers/2006/icml06-graph-ranking.pdf

O. Bousquet and A. Elisseeff, "Stability and generalization," Journal of
Machine Learning Research, vol. 2, pp. 499–526, 2002.
http://www.cmap.polytechnique.fr/~bousquet/papers/BouEli01_stability_jmlr.ps