The PageRank Axioms
Based on Ranking Systems: The
PageRank Axioms, by Alon Altman and
Moshe Tennenholtz.
Presented by Aron Matskin
“Judge and be prepared to be judged.”
Ayn Rand
“Rabbi Shimon says: there are three crowns: the crown of Torah, the crown of priesthood, and the crown of kingship; but the crown of a good name surpasses them all.”
Pirkei Avot
Talking Points




Ranking and reputation in general
Connections to the Internet world
PageRank web ranking system
PageRank representation theorem
Ranking: What




Abilities
Choices
Reputation
Quality
  Quality of information



Popularity
Good looks
What not?
Ranking: How






Voting
Reputation systems
Peer review
Performance reviews
Sporting competition
Intuitive or ad-hoc
Ranking Systems’ Properties







Ad-hoc or systematic
Centralized or distributed
Feedback or indicator-based
Peer, “second-party”, or third-party
Update period
Volatility
Other?
Agents Ranking Themselves





Community reputation
Professional associations
Peer review
Performance reviews (in part)
Web page ranking
Ranking: Problems and Issues




Eliciting information
Information aggregation
Information distribution
Truthfulness




Strategic considerations
Fear of retribution / expectation of kick-backs
Coalition formation
Agent identification (pseudonym problem)
Need analysis!
Ranking Systems: Analysis

Empirical


Because theories often lack
Theoretical

Because theoreticians
need to eat, too
Provides
valuable insight
Social Choice Theory

Two approaches:


Normative – from properties to
implementations. Example: Arrow’s
Impossibility Theorem
Descriptive – from implementation to
properties. The Holy Grail: representation
theorems (uniqueness results)
PageRank Method


A method for computing a popularity
(or importance) ranking for every web
page based on the graph of the web.
Has applications in search, browsing,
and traffic estimation.
PageRank: Intuition




Internet pages form a directed graph
A node’s popularity measure is a positive real number; a higher number means higher popularity. Let’s call it the node’s weight
A node’s weight is distributed equally among the nodes it links to
We look for a stationary solution: the
sum of weights a page receives from
its backlinks is equal to its weight
[Figure: example graph with weights a = 2, b = 2, c = 1 and edge labels 1]
PageRank as Random Walk


Suppose you land on a random page
and proceed by clicking on hyper-links
uniformly randomly
Then the (normalized) rank of a page is
the probability of visiting it
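The random-walk reading can be tried out with a small simulation. This is only a sketch: the edge list `LINKS` is an assumption (a links to b and c, b links to a, c links to b), chosen to be consistent with the example weights a = 2, b = 2, c = 1, and `visit_frequencies` is a hypothetical helper name.

```python
import random

# Assumed example graph: a -> b, a -> c, b -> a, c -> b
LINKS = {"a": ["b", "c"], "b": ["a"], "c": ["b"]}

def visit_frequencies(links, steps, seed=0):
    """Follow uniformly random out-links and count how often each page is visited."""
    rng = random.Random(seed)
    page = rng.choice(sorted(links))      # land on a random page
    counts = {p: 0 for p in links}
    for _ in range(steps):
        page = rng.choice(links[page])    # click a uniformly random hyperlink
        counts[page] += 1
    return {p: counts[p] / steps for p in links}

freq = visit_frequencies(LINKS, steps=200_000)
print(freq)  # should be close to the normalized weights 2/5, 2/5, 1/5
```

The observed frequencies only approximate the normalized ranks; a longer walk gives a closer match.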
PageRank: Some Math
Represent the graph as a matrix:
[Figure: the example graph on nodes a, b, c]
      a    b    c
a     0    1    0
b     ½    0    1
c     ½    0    0
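As a sketch of how such a matrix can be built, the snippet below derives it from an edge list. The edge list is an assumption read off the matrix (each column is a source page, each row a target page), and `link_matrix` is a hypothetical helper name.

```python
from fractions import Fraction

# Assumed edges: a -> b, a -> c, b -> a, c -> b
LINKS = {"a": ["b", "c"], "b": ["a"], "c": ["b"]}

def link_matrix(links):
    """Return (pages, A) where A[i][j] is the share of page j's weight sent to page i."""
    pages = sorted(links)
    index = {p: i for i, p in enumerate(pages)}
    A = [[Fraction(0)] * len(pages) for _ in pages]
    for src, targets in links.items():
        for dst in targets:
            # a page splits its weight equally among the pages it links to
            A[index[dst]][index[src]] += Fraction(1, len(targets))
    return pages, A

pages, A = link_matrix(LINKS)
for page, row in zip(pages, A):
    print(page, [str(x) for x in row])
```

Every column sums to 1, because each page gives away its whole weight.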
PageRank: Some Math
Find a solution of the equation:
A_G r = r
The solution r is the rank vector.
Under the assumption that the graph is strongly connected, there is only one normalized solution
The real PageRank algorithm does not rely on this assumption; it uses workarounds for graphs that are not strongly connected
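A quick check of the equation on the three-page example, as a sketch: the matrix entries are copied from the example, while the helper name `step` and the use of exact fractions are choices made here for illustration.

```python
from fractions import Fraction

half = Fraction(1, 2)
# Example matrix: column = source page, row = target page (order a, b, c)
A = [[0, 1, 0],
     [half, 0, 1],
     [half, 0, 0]]

def step(A, r):
    """Compute the matrix-vector product A r."""
    return [sum(a * x for a, x in zip(row, r)) for row in A]

r = [2, 2, 1]                      # weights a = 2, b = 2, c = 1
assert step(A, r) == r             # r solves A r = r

total = sum(r)
normalized = [Fraction(x, total) for x in r]
assert step(A, normalized) == normalized
print([str(x) for x in normalized])  # ['2/5', '2/5', '1/5']
```

Any positive multiple of r also solves the equation, which is why uniqueness is stated for the normalized solution.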

Calculating PageRank
Take any non-negative, non-zero vector r_0
Let r_{i+1} = A_G r_i
Then the sequence r_k converges to r
Since the Internet graph is an expander, the convergence is very fast: O(log n) steps to reach a given precision
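The iteration can be sketched in a few lines. The function name `power_iteration`, the tolerance, and the step limit are illustrative assumptions; the matrix is the three-page example, for which the iteration does converge.

```python
def power_iteration(A, r0, tol=1e-12, max_steps=10_000):
    """Repeat r <- A r until successive vectors differ by less than tol."""
    r = list(r0)
    for steps in range(1, max_steps + 1):
        nxt = [sum(a * x for a, x in zip(row, r)) for row in A]
        if max(abs(p - q) for p, q in zip(nxt, r)) < tol:
            return nxt, steps
        r = nxt
    raise RuntimeError("no convergence within max_steps")

# Example matrix: column = source page, row = target page (order a, b, c)
A = [[0.0, 1.0, 0.0],
     [0.5, 0.0, 1.0],
     [0.5, 0.0, 0.0]]

r, steps = power_iteration(A, [1.0, 0.0, 0.0])
print(r, steps)  # r is close to [0.4, 0.4, 0.2]
```

Because each column of the matrix sums to 1, the iteration preserves the sum of the entries, so starting from a vector that sums to 1 yields the normalized rank vector directly.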
PageRank: The Good News





Intuitive
Relatively easy to calculate
Hard to manipulate
Great for common case searches
May be used to assess quality of
information (assuming popularity ≈
trust)
PageRank: The Bad News

PageRank is proprietary to [logo]
Webmasters can’t manipulate it, but [logo] can
Every change in the algorithm is good for
someone and is bad for someone else
Popular become more popular
Popularity ≠ quality of information
The Representation Theorem




We next present a set of axioms (i.e.
properties) for ranking procedures
Some of the axioms are more intuitive than
others, but all are satisfied by PageRank
We then show that PageRank is the only
ranking algorithm that satisfies the axioms
We try to be informal, but convincing
Ranking Systems Defined
A ranking system F is a functional that
maps every finite strongly connected
directed graph (SCDG) G=(V,E) into a
reflexive, transitive, and complete binary relation ≤ on V (an ordering in which ties are allowed)
Ranking Systems: Example

MyRank ranks vertices in G in ascending
order of the number of incoming links
[Figure: graph G on nodes a, b, c]
MyRank(G): c = a < b
PageRank(G): c < a = b
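The two orderings can be reproduced with a short sketch. The edge list is an assumption (the same three-page graph: a links to b and c, b links to a, c links to b), and `my_rank` and `ordering` are hypothetical helper names; the PageRank weights 2, 2, 1 are taken from the earlier example rather than recomputed.

```python
# Assumed edges: a -> b, a -> c, b -> a, c -> b
LINKS = {"a": ["b", "c"], "b": ["a"], "c": ["b"]}

def my_rank(links):
    """Score each page by its number of incoming links."""
    score = {p: 0 for p in links}
    for targets in links.values():
        for dst in targets:
            score[dst] += 1
    return score

def ordering(score):
    """Render scores as an ascending ordering with ties, e.g. 'a = c < b'."""
    groups = {}
    for page, value in score.items():
        groups.setdefault(value, []).append(page)
    return " < ".join(" = ".join(sorted(groups[v])) for v in sorted(groups))

print(ordering(my_rank(LINKS)))             # a = c < b
print(ordering({"a": 2, "b": 2, "c": 1}))   # c < a = b
```

MyRank only counts links, so a and c tie; PageRank also weighs who the links come from, which lifts a to the level of b.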
Axiom 1: Isomorphism (ISO)

F satisfies ISO iff it is independent of
vertex names

Consequence: symmetric vertices have the
same rank
[Figure: example graph on nodes a, b, e, f, g, h, i, j, in which a = b and e = f = g = h = i = j]
Axiom 2: Self Edge (SE)


Node v has a self-edge (v,v) in G’, but does
not in G. Otherwise G and G’ are identical. F
satisfies SE iff for all u,w ≠ v:
(u ≤ v ⇒ u <’ v) and (u ≤ w ⇔ u ≤’ w)
PageRank satisfies SE:
Suppose v has k outgoing edges in G. Let
(r_1,…,r_v,…,r_N) be the rank vector of G; then
(r_1,…,r_v(1+1/k),…,r_N) is a rank vector of G’ (before normalization)
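The effect of a self-edge can be checked on the three-page example. This is a sketch under assumptions: the edge lists are illustrative, `is_stationary` is a hypothetical helper, and the weight of v is scaled by (1 + 1/k), that is, increased by r_v/k in the unnormalized vector.

```python
from fractions import Fraction

def is_stationary(links, weight):
    """Check that every page receives exactly its own weight from its backlinks."""
    received = {p: Fraction(0) for p in links}
    for src, targets in links.items():
        for dst in targets:
            received[dst] += Fraction(weight[src]) / len(targets)
    return received == weight

# Assumed example graph G and the same graph with a self-edge added at v = a
G = {"a": ["b", "c"], "b": ["a"], "c": ["b"]}
G_prime = {"a": ["a", "b", "c"], "b": ["a"], "c": ["b"]}

r = {"a": 2, "b": 2, "c": 1}
k = len(G["a"])                                        # v = a has k = 2 outgoing edges in G
r_prime = dict(r, a=r["a"] * (1 + Fraction(1, k)))     # only v's weight changes

assert is_stationary(G, r)
assert is_stationary(G_prime, r_prime)
print({p: str(w) for p, w in r_prime.items()})  # {'a': '3', 'b': '2', 'c': '1'}
```

In G the pages a and b are tied; after the self-edge is added, b is ranked strictly below a, while the relative ranking of b and c is unchanged.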
Axiom 3: Vote by Committee (VBC)
[Figure: before and after graphs for VBC on nodes a, b, c]
1. In the example page a links only to b and c, but
there may be more successors of a
2. Incoming links of a and all other links of the
successors of a remain the same
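A sketch of why PageRank is compatible with this axiom, under one assumed reading of VBC: page a stops linking to its successors directly and instead links to a committee of new pages (here `u1` and `u2`, hypothetical names), each of which links to all original successors of a. The helper `is_stationary` and the example edges are likewise assumptions.

```python
from fractions import Fraction

def is_stationary(links, weight):
    """Check that every page receives exactly its own weight from its backlinks."""
    received = {p: Fraction(0) for p in links}
    for src, targets in links.items():
        for dst in targets:
            received[dst] += Fraction(weight[src]) / len(targets)
    return received == weight

# Assumed example graph: a -> b, a -> c, b -> a, c -> b
G = {"a": ["b", "c"], "b": ["a"], "c": ["b"]}
# Same graph, but a now votes through a committee of two new pages
G_prime = {"a": ["u1", "u2"], "u1": ["b", "c"], "u2": ["b", "c"],
           "b": ["a"], "c": ["b"]}

r = {"a": 2, "b": 2, "c": 1}
r_prime = {"a": 2, "b": 2, "c": 1, "u1": 1, "u2": 1}

assert is_stationary(G, r)
assert is_stationary(G_prime, r_prime)
print(all(r_prime[p] == r[p] for p in r))  # True: original pages keep their weights
```

The committee members split a's weight among themselves and pass it on unchanged, so the original pages keep their weights and hence their relative ranking.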
Axiom 4: Collapsing (COL)
[Figure: before and after graphs for COL; pages a and b before, page b after]
1. The sets of predecessors of a and b are disjoint
2. Pages a and b must not link to each other or have
self-links
3. The sets of successors of a and b coincide
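A sketch of collapsing on a small made-up graph, under one assumed reading of COL: page a is merged into page b, so the predecessors of a link to b instead. The pages `p`, `q`, `s`, the edges, and the helper `is_stationary` are all illustrative assumptions.

```python
from fractions import Fraction

def is_stationary(links, weight):
    """Check that every page receives exactly its own weight from its backlinks."""
    received = {p: Fraction(0) for p in links}
    for src, targets in links.items():
        for dst in targets:
            received[dst] += Fraction(weight[src]) / len(targets)
    return received == weight

# a and b have disjoint predecessors (p and q), the same successor (s),
# no self-links, and no links to each other
G = {"p": ["a"], "q": ["b"], "a": ["s"], "b": ["s"], "s": ["p", "q"]}
# After collapsing a into b: p now links to b, and a is gone
G_prime = {"p": ["b"], "q": ["b"], "b": ["s"], "s": ["p", "q"]}

r = {"p": 1, "q": 1, "a": 1, "b": 1, "s": 2}
r_prime = {"p": 1, "q": 1, "b": r["a"] + r["b"], "s": 2}

assert is_stationary(G, r)
assert is_stationary(G_prime, r_prime)
print(r_prime)  # {'p': 1, 'q': 1, 'b': 2, 's': 2}
```

The merged page carries the combined weight of a and b, and every other page keeps its weight, so their relative ranking is preserved.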
Axiom 5: Proxy (PRO)
[Figure: node x with its predecessors and successors]
1. All predecessors of x have the same rank
2. |P(x)| = |S(x)|
3. x is the only successor of each of its predecessors
Useful Properties: DEL
[Figure: before and after graphs for DEL on nodes a, b, c, d]
1. |P(b)|=|S(b)|=1
2. There is no direct edge between a and c
3. a and c are otherwise unrestricted
DEL: Proof
[Figures: a sequence of graph transformations on nodes a, b, c, d, applying in order: VBC, VBC, ISO and PRO, PRO, PRO, VBC, VBC]
DEL for Self-Edge
It can also be shown that DEL holds
for self-edges:
[Figure: node a before and after the transformation]
Useful Properties: DELETE
[Figure: node x with its predecessors and successors, before and after DELETE]
1. Nodes in P(x) have no other outgoing
edges
2. x has no other edges
DELETE: Proof
[Figures: graph transformations on nodes x and y, applying in order: COL, PRO]
Useful Properties: DUPLICATE
[Figure: before and after graphs for DUPLICATE on nodes a, b, c, d]
1. All successors of a are duplicated the
same number of times
2. There are no edges from S(a) to S(a)
DUPLICATE: Proof
[Figures: a sequence of graph transformations on nodes a, b, c, d, applying in order: VBC, VBC, COL, ISO and PRO, COL⁻¹ (COL in reverse), VBC⁻¹ (VBC in reverse)]
The Representation Theorem Proof



Given a SCDG G=(V,E) and a,b in V, we
eliminate all other nodes in G while
preserving the relative ranking of a and b
In the resulting graph G’ the relative ranking
of a and b given by the axioms can be
uniquely determined. Therefore the axioms
rank any SCDG uniquely
It follows that all ranking systems satisfying
the axioms coincide
Proof by Example on b and d
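[Figure: the example graph on nodes a, b, c, d]
Matrix A_G (column = source page, row = target page):
      a    b    c    d
a     ⅓    0    0    ½
b     ⅓    0    0    ½
c     ⅓    0    0    0
d     0    1    1    0
Rank vector: a = 3, b = 3, c = 1, d = 4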
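The rank values of this example can be checked directly. The edge list below is a reading of the matrix entries (a links to itself, b, and c; b and c link to d; d links to a and b) and should be treated as an assumption, as should the helper name `is_stationary`.

```python
from fractions import Fraction

def is_stationary(links, weight):
    """Check that every page receives exactly its own weight from its backlinks."""
    received = {p: Fraction(0) for p in links}
    for src, targets in links.items():
        for dst in targets:
            received[dst] += Fraction(weight[src]) / len(targets)
    return received == weight

# Assumed edges of the four-page example; note the self-edge at a
G = {"a": ["a", "b", "c"], "b": ["d"], "c": ["d"], "d": ["a", "b"]}
r = {"a": 3, "b": 3, "c": 1, "d": 4}

assert is_stationary(G, r)
print(r["b"] < r["d"], r["a"] < r["d"], r["a"] == r["b"])  # True True True
```

These are the three comparisons (b versus d, a versus d, a versus b) that the axiomatic elimination procedure has to reproduce.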
Step 1: Insert Nodes
[Figure: the graph on a, b, c, d before and after inserting nodes]
By DEL the relative ranking is preserved
Step 2: Choose Node to Remove
[Figure: graph on a, b, c, d]
Step 3: Remove “self-edges”
[Figure: graph on a, b, c, d]
Step 4: Duplicate Predecessors
[Figure: graph on a, b, c, d]
Step 5: DELETE the Node
[Figure: graph on b, c, d]
Step 6: DELETE the Extras
[Figure: graph on b, c, d]
There are still nodes to delete: back to Step 2
Step 2: Choose Node to Remove
[Figure: graph on b, c, d]
Steps 3, 4: no changes
Step 5: DELETE the Node
[Figure: graph on b, d]
Step 6: DELETE the Extras
[Figure: graph on b, d]
No original nodes to remove: proceed to Step 7
Step 7: Balance by Duplication
[Figure: graph on b, d]
This is our G’
Step 8: Equalize by Reverse DEL
[Figure: graph on b, d]
In the equalized graph, b = d by ISO. Then, by DEL and SE, b < d in G’.
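Example for a and d
[Figure: the graph on a, b, c, d before and after inserting nodes]
After Removal of c
[Figure: graph on a, b, d]
Duplicate Predecessors of b
[Figure: graph on a, b, d]
DELETE b
[Figure: graph on a, d]
DELETE Extras
[Figure: graph on a, d]
Before Balancing
[Figure: graph on a, d]
After Balancing
[Figure: graph on a, d]
Conclusion: a<d.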
What about a and b?
[Figures: six steps, from a graph on a, b, d down to a graph on a, b]
Conclusion: a=b.
Concluding Remarks

‘Representation theorems isolate the
“essence” of particular ranking systems,
and provide means for the evaluation
(and potential comparison) of such
systems’ – Altman & Tennenholtz
The End