From ranking to intransitive preference learning: rock-paper-scissors and beyond
Tapio Pahikkala (1), Willem Waegeman (2), Evgeni Tsivtsivadze (3), Tapio Salakoski (1), Bernard De Baets (2)
(1) TUCS, University of Turku, Finland
(2) KERMIT, Ghent University, Belgium
(3) Institute for Computing and Information Sciences, Radboud University Nijmegen, The Netherlands
Pahikkala et al. (TUCS, UGent)
Intransitive Preference Learning
September, 11 2009
1 / 25
Introduction
Outline
1. Introduction
2. Stochastic transitivity and ranking representability
3. Learning intransitive reciprocal relations
4. Experiments
The transitivity property: a classical example
Examples of intransitivity are found in many fields...
Stochastic transitivity and ranking representability
$Q(x, x') = 5/9$, $Q(x', x'') = 5/9$, $Q(x'', x) = 5/9$
$Q(x', x) = 4/9$, $Q(x'', x') = 4/9$, $Q(x, x'') = 4/9$
Definition
A relation $Q : X^2 \to [0, 1]$ is called a reciprocal relation if
$$Q(x, x') + Q(x', x) = 1 \quad \forall (x, x') \in X^2 .$$
Ranking representability
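Definition
A reciprocal relation $Q : X^2 \to [0, 1]$ is called weakly ranking representable if there exists a ranking function $f : X \to \mathbb{R}$ such that for any $(x, x') \in X^2$ it holds that
$$Q(x, x') \le \tfrac{1}{2} \;\Leftrightarrow\; f(x) \le f(x') .$$
$Q(x, x') = 5/9 \;\Leftrightarrow\; x \succ x'$
$Q(x', x'') = 5/9 \;\Leftrightarrow\; x' \succ x''$
$Q(x'', x) = 5/9 \;\Leftrightarrow\; x'' \succ x$
These three preferences form a cycle, so no ranking function $f$ can represent them.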
Weak stochastic transitivity
Proposition (Luce and Suppes, 1965)
A reciprocal relation $Q$ is weakly ranking representable if and only if it satisfies weak stochastic transitivity, i.e., for any $(x, x', x'') \in X^3$ it holds that
$$Q(x, x') \ge 1/2 \;\wedge\; Q(x', x'') \ge 1/2 \;\Rightarrow\; Q(x, x'') \ge 1/2 .$$
$Q(x, x') = 6/9$, $Q(x', x'') = 5/9$, $Q(x'', x) = 2/9$ $\;\Leftrightarrow\;$ $x \succ x' \succ x''$
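The two numeric examples (the 5/9 cycle and the 6/9, 5/9, 2/9 triple) can be checked mechanically. The following is a minimal sketch, not the authors' code: the helper names `reciprocal_matrix` and `is_weakly_stochastic_transitive` are hypothetical, and the relation is stored as a small matrix over the three objects, with the diagonal set to 1/2 as reciprocity requires.

```python
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)

def reciprocal_matrix(n, given):
    """Build an n x n reciprocal relation from the entries listed in `given`.

    Each listed Q[i][j] = q also fixes Q[j][i] = 1 - q (reciprocity);
    the diagonal is 1/2.
    """
    Q = [[HALF] * n for _ in range(n)]
    for (i, j), q in given.items():
        Q[i][j] = q
        Q[j][i] = 1 - q
    return Q

def is_weakly_stochastic_transitive(Q):
    """Check Q(i,j) >= 1/2 and Q(j,k) >= 1/2  =>  Q(i,k) >= 1/2 for all triples."""
    n = len(Q)
    return all(
        Q[i][k] >= HALF
        for i, j, k in product(range(n), repeat=3)
        if Q[i][j] >= HALF and Q[j][k] >= HALF
    )

# Indices 0, 1, 2 stand for x, x', x''.
cyclic = reciprocal_matrix(
    3, {(0, 1): Fraction(5, 9), (1, 2): Fraction(5, 9), (2, 0): Fraction(5, 9)}
)
ranked = reciprocal_matrix(
    3, {(0, 1): Fraction(6, 9), (1, 2): Fraction(5, 9), (2, 0): Fraction(2, 9)}
)

print(is_weakly_stochastic_transitive(cyclic))  # False
print(is_weakly_stochastic_transitive(ranked))  # True
```

By the proposition, the first relation therefore admits no ranking function, while the second is represented by any f with f(x) > f(x') > f(x'').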
Learning intransitive reciprocal relations
Definition of our framework
Training data $E = (e_i, y_i)_{i=1}^{N}$
Each training example is a couple: $e = (x, x')$
Labels $y_i = 2Q(x_i, x'_i) - 1$
Minimizing the regularized empirical error:
$$A(E) = \operatorname*{argmin}_{h \in F} \; \frac{1}{N} \sum_{i=1}^{N} L(h(e_i), y_i) + \lambda \|h\|_F^2$$
Least-squares loss function: regularized least-squares
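With the least-squares loss, the minimizer has a closed form in the dual coefficients: for a kernel matrix K over the training couples, a = (K + λN I)^{-1} y, and h(e) = Σ_i a_i K(e_i, e). The sketch below illustrates this under stated assumptions; it is not the authors' implementation. The function names are hypothetical, the toy kernel is an arbitrary linear kernel on random features, and the labels are assumed to be rescaled to [-1, 1] via y = 2Q - 1, in line with the training labels in {-1, 0, 1} used in the experiments.

```python
import numpy as np

def labels_from_relation(q):
    """Rescale relation values Q in [0, 1] to labels y = 2Q - 1 in [-1, 1]."""
    return 2.0 * np.asarray(q, dtype=float) - 1.0

def rls_fit(K, y, lam):
    """Dual coefficients of regularized least-squares.

    Minimizes (1/N) * sum_i (h(e_i) - y_i)^2 + lam * ||h||^2
    over h(e) = sum_i a_i K(e_i, e), which gives (K + lam * N * I) a = y.
    """
    N = K.shape[0]
    return np.linalg.solve(K + lam * N * np.eye(N), y)

def rls_predict(K_new, a):
    """Predict for new couples; K_new[r, i] = K(new couple r, training couple i)."""
    return K_new @ a

# Toy usage with made-up data (illustration only).
rng = np.random.default_rng(0)
features = rng.normal(size=(20, 3))
K = features @ features.T                      # linear kernel matrix
y = labels_from_relation(rng.uniform(size=20)) # pretend relation values
a = rls_fit(K, y, lam=0.1)
predictions = rls_predict(K, a)
```

Solving the linear system directly costs O(N^3) time, which is fine for a few hundred couples.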
Reciprocal relations are learned by defining a specific kernel construction
Consider the following joint feature representation for a couple:
$$\Phi(e_i) = \Phi(x_i, x'_i) = \Psi(x_i, x'_i) - \Psi(x'_i, x_i) .$$
This yields the following kernel defined on couples:
$$\begin{aligned}
K^{\Phi}(e_i, e_j) &= K^{\Phi}(x_i, x'_i, x_j, x'_j) \\
&= \langle \Psi(x_i, x'_i) - \Psi(x'_i, x_i),\; \Psi(x_j, x'_j) - \Psi(x'_j, x_j) \rangle \\
&= K^{\Psi}(x_i, x'_i, x_j, x'_j) + K^{\Psi}(x'_i, x_i, x'_j, x_j) \\
&\quad - K^{\Psi}(x'_i, x_i, x_j, x'_j) - K^{\Psi}(x_i, x'_i, x'_j, x_j) .
\end{aligned}$$
And the model becomes:
$$h(x, x') = \langle w, \Psi(x, x') - \Psi(x', x) \rangle = \sum_{i=1}^{N} a_i K^{\Phi}(x_i, x'_i, x, x') .$$
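The point of this construction is that swapping the two members of a couple flips the sign of the kernel, so h(x, x') = -h(x', x) and the learned relation is reciprocal by design. A minimal sketch of the four-term expansion follows; `k_phi` is a hypothetical helper name, and `k_psi_gauss` is an arbitrary base kernel on couples chosen only for illustration (the slides do not prescribe it).

```python
import numpy as np

def k_phi(k_psi, e_i, e_j):
    """Kernel on couples induced by Phi(x, x') = Psi(x, x') - Psi(x', x).

    `k_psi(a, b, c, d)` is any base kernel between the couples (a, b) and (c, d).
    """
    (xi, xi2), (xj, xj2) = e_i, e_j
    return (k_psi(xi, xi2, xj, xj2) + k_psi(xi2, xi, xj2, xj)
            - k_psi(xi2, xi, xj, xj2) - k_psi(xi, xi2, xj2, xj))

def k_psi_gauss(a, b, c, d):
    """Illustrative base kernel: Gaussian on the concatenated couples (an assumption)."""
    diff = np.concatenate([a, b]) - np.concatenate([c, d])
    return float(np.exp(-np.dot(diff, diff)))

rng = np.random.default_rng(0)
x, x2, z, z2 = rng.normal(size=(4, 2))

forward = k_phi(k_psi_gauss, (x, x2), (z, z2))
swapped = k_phi(k_psi_gauss, (x, x2), (z2, z))

# Swapping the members of one couple negates the kernel value.
print(np.isclose(forward, -swapped))  # True
```

The sign flip holds for any base kernel, because the swap merely exchanges the positive and negative terms of the expansion.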
Ranking can be considered as a specific case in this framework
Consider the following joint feature representation $\Psi$ for a couple:
$$\Psi_T(x, x') = \phi(x) .$$
This yields the following kernel $K^{\Psi}$:
$$K_T^{\Psi}(x_i, x'_i, x_j, x'_j) = K^{\phi}(x_i, x_j) = \langle \phi(x_i), \phi(x_j) \rangle .$$
And the model becomes:
$$h(x, x') = \langle w, \phi(x) \rangle - \langle w, \phi(x') \rangle = f(x) - f(x') .$$
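As a worked step, the transitive choice can be substituted into the general four-term expansion of the kernel on couples. The derivation below uses only the definitions on these slides; the symbol $K_T^{\Phi}$ for the resulting kernel is notation introduced here for convenience.

```latex
\begin{align*}
K_T^{\Phi}(x_i, x'_i, x_j, x'_j)
  &= K_T^{\Psi}(x_i, x'_i, x_j, x'_j) + K_T^{\Psi}(x'_i, x_i, x'_j, x_j) \\
  &\quad - K_T^{\Psi}(x'_i, x_i, x_j, x'_j) - K_T^{\Psi}(x_i, x'_i, x'_j, x_j) \\
  &= K^{\phi}(x_i, x_j) + K^{\phi}(x'_i, x'_j) - K^{\phi}(x'_i, x_j) - K^{\phi}(x_i, x'_j) \\
  &= \langle \phi(x_i) - \phi(x'_i),\; \phi(x_j) - \phi(x'_j) \rangle , \\[4pt]
h(x, x') &= \sum_{i=1}^{N} a_i K_T^{\Phi}(x_i, x'_i, x, x') = f(x) - f(x') ,
  \qquad f(x) = \sum_{i=1}^{N} a_i \bigl( K^{\phi}(x_i, x) - K^{\phi}(x'_i, x) \bigr) .
\end{align*}
```

Because $h(x, x') \ge 0$ exactly when $f(x) \ge f(x')$, any relation predicted with this kernel is weakly ranking representable and therefore cannot reproduce a cycle such as rock-paper-scissors.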
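Using the Kronecker product, intransitive relations can be learned, unlike with existing approaches
Consider the following joint feature representation $\Psi$ for a couple:
$$\Psi_I(x, x') = \phi(x) \otimes \phi(x') ,$$
where $\otimes$ denotes the Kronecker product:
$$A \otimes B = \begin{pmatrix} A_{1,1} B & \cdots & A_{1,n} B \\ \vdots & \ddots & \vdots \\ A_{m,1} B & \cdots & A_{m,n} B \end{pmatrix} .$$
This yields the following kernel $K^{\Psi}$:
$$\begin{aligned}
K_I^{\Psi}(x_i, x'_i, x_j, x'_j) &= \langle \phi(x_i) \otimes \phi(x'_i),\; \phi(x_j) \otimes \phi(x'_j) \rangle \\
&= \langle \phi(x_i), \phi(x_j) \rangle \otimes \langle \phi(x'_i), \phi(x'_j) \rangle \\
&= K^{\phi}(x_i, x_j) \, K^{\phi}(x'_i, x'_j) .
\end{aligned}$$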
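The identity between the explicit Kronecker features and the product of base kernels can be confirmed numerically, which also shows that the kernel never requires forming the (potentially very large) Kronecker feature vectors. The sketch below assumes a linear base kernel, so that φ is the identity map; the function names are hypothetical. It also checks the closed form obtained by plugging the product kernel into the four-term expansion of the kernel on couples, namely 2(K^φ(x_i, x_j) K^φ(x'_i, x'_j) - K^φ(x'_i, x_j) K^φ(x_i, x'_j)).

```python
import numpy as np

def k_lin(a, b):
    """Linear base kernel K^phi(a, b) = <a, b> (phi is the identity map)."""
    return float(np.dot(a, b))

def k_psi_kron(xi, xi2, xj, xj2, k=k_lin):
    """Kronecker-product kernel on couples: K(xi, xj) * K(xi', xj')."""
    return k(xi, xj) * k(xi2, xj2)

def k_phi_intransitive(xi, xi2, xj, xj2, k=k_lin):
    """Antisymmetrized kernel: 2 * (K(xi,xj) K(xi',xj') - K(xi',xj) K(xi,xj'))."""
    return 2.0 * (k(xi, xj) * k(xi2, xj2) - k(xi2, xj) * k(xi, xj2))

rng = np.random.default_rng(0)
xi, xi2, xj, xj2 = rng.normal(size=(4, 3))

# Explicit 9-dimensional Kronecker features versus the kernel shortcut.
explicit_psi = float(np.dot(np.kron(xi, xi2), np.kron(xj, xj2)))
print(np.isclose(explicit_psi, k_psi_kron(xi, xi2, xj, xj2)))  # True

# Explicit antisymmetrized features Phi = Psi(x, x') - Psi(x', x).
phi_i = np.kron(xi, xi2) - np.kron(xi2, xi)
phi_j = np.kron(xj, xj2) - np.kron(xj2, xj)
explicit_phi = float(np.dot(phi_i, phi_j))
print(np.isclose(explicit_phi, k_phi_intransitive(xi, xi2, xj, xj2)))  # True
```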
Experiments
Reciprocal relations in rock-paper-scissors
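Convert probabilities to a reciprocal relation:
$$\begin{aligned}
Q(x, x') &= P(r \mid x) P(s \mid x') + \tfrac{1}{2} P(r \mid x) P(r \mid x') \\
&\quad + P(p \mid x) P(r \mid x') + \tfrac{1}{2} P(p \mid x) P(p \mid x') \\
&\quad + P(s \mid x) P(p \mid x') + \tfrac{1}{2} P(s \mid x) P(s \mid x') .
\end{aligned}$$
Example:
Player 1: $x = (r = 1/2,\; p = 1/2,\; s = 0)$
Player 2: $x' = (r = 0,\; p = 1/2,\; s = 1/2)$
$$\Rightarrow\; Q(x, x') = \tfrac{1}{2}\left(\tfrac{1}{2} + \tfrac{0}{2}\right) + \tfrac{1}{2}\left(0 + \tfrac{1}{4}\right) + 0\left(\tfrac{1}{2} + \tfrac{1}{4}\right) = 3/8$$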
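The conversion is easy to reproduce in code: each move of the first player contributes its probability times the probability that the opponent plays the move it beats, plus half the probability of a tie. The following is a small sketch using exact fractions; the names `BEATS` and `rps_relation` are hypothetical.

```python
from fractions import Fraction

# For each move, the move it defeats: rock beats scissors, and so on.
BEATS = {"r": "s", "p": "r", "s": "p"}

def rps_relation(x, x2):
    """Q(x, x'): probability that x wins, plus half the probability of a tie."""
    q = Fraction(0)
    for move, prob in x.items():
        q += prob * x2[BEATS[move]]  # x plays `move` and wins
        q += prob * x2[move] / 2     # both play `move`: a tie counts half
    return q

player1 = {"r": Fraction(1, 2), "p": Fraction(1, 2), "s": Fraction(0)}
player2 = {"r": Fraction(0), "p": Fraction(1, 2), "s": Fraction(1, 2)}

print(rps_relation(player1, player2))  # 3/8
print(rps_relation(player2, player1))  # 5/8
```

The two values sum to 1, as reciprocity requires.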
Rock-paper-scissors: experimental setup
100 players for training (100 games)
100 players for testing (1000 games)
features are the mixed strategies
training labels $y \in \{-1, 0, 1\}$
test labels $y \in [0, 1]$
$K^{\phi}$: linear kernel
three different settings
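To make the setup concrete, training data of this kind could be simulated as follows. This is only a sketch under loud assumptions: the slides do not say how the mixed strategies are drawn in each of the three settings or how opponents are paired, so the Dirichlet distribution, its `alpha` parameter, the uniform random pairing, and the function names are all hypothetical choices.

```python
import numpy as np

MOVES = 3  # 0 = rock, 1 = paper, 2 = scissors

def play(rng, x, x2):
    """Play one game between mixed strategies x and x'.

    Returns +1 if the first player wins, 0 for a tie, -1 for a loss.
    """
    a = rng.choice(MOVES, p=x)
    b = rng.choice(MOVES, p=x2)
    # (a - b) mod 3 is 0 for a tie, 1 if a beats b, 2 if b beats a.
    return [0, 1, -1][(a - b) % 3]

def sample_games(rng, n_players, n_games, alpha=(1.0, 1.0, 1.0)):
    """Draw mixed strategies (assumed Dirichlet) and simulate random pairings."""
    strategies = rng.dirichlet(alpha, size=n_players)
    pairs = rng.integers(0, n_players, size=(n_games, 2))
    labels = np.array([play(rng, strategies[i], strategies[j]) for i, j in pairs])
    return strategies, pairs, labels

rng = np.random.default_rng(0)
strategies, pairs, labels = sample_games(rng, n_players=100, n_games=100)
print(strategies.shape, pairs.shape, labels.shape)  # (100, 3) (100, 2) (100,)
```

Each training couple is then the pair of strategy vectors `(strategies[i], strategies[j])` with a single observed game outcome as its label, whereas the test labels are the exact relation values.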
Rock-paper-scissors: setting 1
[Figure: plot with corners labeled Paper, Rock, and Scissors]
Method     MSE
Intrans.   0.000209
Trans.     0.000162
Naive      0.000001
Rock-paper-scissors: setting 2
[Figure: plot with corners labeled Paper, Rock, and Scissors]
Method     MSE
Intrans.   0.000445
Trans.     0.006804
Naive      0.006454
Rock-paper-scissors: setting 3
[Figure: plot with corners labeled Paper, Rock, and Scissors]
Method     MSE
Intrans.   0.000076
Trans.     0.131972
Naive      0.125460
Simulation of competition between species results in stable populations after many iterations
Experiment 2: competition between species in theoretical biology
[Figure: two panels, each with horizontal axis "Strong point" and vertical axis "Weak point", both ranging from 0.0 to 1.0]
$$y = \operatorname{sign}\bigl( d(s(x'), w(x)) - d(s(x), w(x')) \bigr)$$
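The labeling rule can be sketched in a few lines. The slide does not define the distance d, so the absolute difference used below is an assumption, as are the function name and the three example species; each species is written as a pair (strong point, weak point). A label of +1 is read here as x dominating x', since the strong point of x then lies closer to the weak point of x' than the other way round.

```python
def dominance_label(x, x2, d=lambda a, b: abs(a - b)):
    """y = sign(d(s(x'), w(x)) - d(s(x), w(x'))) for species given as (s, w) pairs.

    The default distance d(a, b) = |a - b| is an assumption for illustration.
    """
    s, w = x
    s2, w2 = x2
    diff = d(s2, w) - d(s, w2)
    return (diff > 0) - (diff < 0)  # sign as an int in {-1, 0, 1}

# Three hypothetical species whose strong points sit on each other's weak points.
A = (0.2, 0.8)
B = (0.5, 0.2)
C = (0.8, 0.5)

print(dominance_label(A, B))  # 1
print(dominance_label(B, C))  # 1
print(dominance_label(C, A))  # 1
```

Under this choice of d, A dominates B, B dominates C, and C dominates A, which is an intransitive cycle of the same kind as rock-paper-scissors.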
The intransitive kernel clearly beats the traditional transitive kernel
[Figure: two panels with both axes ranging from 0.0 to 0.9]
Transitive kernel accuracy = 0.615 versus intransitive kernel accuracy = 0.850
Discussion
Existing kernel-based ranking methods cannot predict intransitive relations.
With our framework it is possible to represent and predict intransitive relations in an adequate way.
Empirical results on two problems confirm that our framework is able to learn intransitive relations, unlike ranking methods.
Many applications are possible (e.g. in the life sciences), but there are no publicly available datasets.
http://staff.cs.utu.fi/~aatapa/software/RPS