PageRank

A On the Effect of Group
Structures on Ranking Strategies
in Folksonomies
SWSM2008 - WWW2008 Workshop on Social Web Search and Mining
WWW 2008, April 22, 2008, Beijing, China.
Presented by Jun-Ming Chen 10/06/2008
1

The paper is organized as follows:




We describe the GroupMe! System
It defines the folksonomy model, and explains the necessary
extensions of this model to reflect the grouping mechanism
Ranking algorithms, for both the folksonomy model with and
without groups, are discussed
The analysis of our experimental findings
2
Outline






Introduction
GroupMe! Tagging System
Folksonomies
Ranking Strategies
Evaluations
Conclusions
3
Pagerank


Larry Page在1996年間發明了 Pagerank的演算法
從許多優質的網頁鏈接過來的網頁,必定還是優質網頁

使用者的行為模式 -假設有一位網路使用者在網路上隨意瀏覽
 稱為隨意的網路衝浪者 (Random surfer)

Random surfer 會隨意點選網頁上的鏈結,
連結到不同網頁上去瀏覽,
直到覺得無聊時再重新隨意的選取另一個網頁作為開始,
再依照它的鏈結指引連結出去。

PageRank 就是 Random surfer 隨意選擇一個開始網頁的機率
 運用 PageRank 函式


可以依據網頁間的相互連結關係,找出重要的網頁,
提供使用者參考,或作為搜尋結果的排序用。
4
Backlink




Link Structure of the Web
Approximation of importance / quality
Pages with lots of backlinks are important
Backlinks coming from important pages convey more importance to
a page
5
Pagerank
6
Pagerank

PageRank Algo




以機率 1-d 從一個新頁面開始訪問,
或以機率d 順著超鏈結的點擊進行訪問,
該模型下,頁面 Vj 被訪問到的機率為 Pr(Vj)
Out(Vj) 是指向 Vj 的頁面集合
Pr(Vj) 即為 Vj 的 PageRank 機率值
d指damping factor, 其值在0~1, 一般設為0.85
PR(Vi) :為Vi這個頁面的PR值
In(Vi) :為連進Vi這個頁面的link數目
Out(Vj) :為Vj這個頁面連出去的link數目
公式中的 d 就是 Random surfer 感覺到無聊時重新選擇一個開始網頁的機率
7
Pagerank

如果有3個頁面A,B,C

A如果連到B,C ; B如果連到C


IF PR(A) = 4
A
PR(B) = (1-0.85) + 0.85 * 4/2 = 1.85
B
C


PR(C) = (1-0.85) + 0.85 * (4/2 + 1.85) = 3.4225
B,C會平均繼承A的PR值, 但C會單獨繼承B的PR值
8
Pagerank



PageRank is a global ranking based on the web's
graph structure
PageRank use backlinks information to bring order to
the web
A great variety of applications
9
A On the Effect of Group
Structures on Ranking Strategies
in Folksonomies
SWSM2008 - WWW2008 Workshop on Social Web Search and Mining
WWW 2008, April 22, 2008, Beijing, China.
Presented by Jun-Ming Chen 9/29/2008
10
Introduction

Recent folksonomy systems investigate on



We investigate on


combine Web resources with annotations (tags)
the users that have created the annotations.
the effect of grouping resources,
 using this additional structure for search in folksonomies.
Our experiments show that the quality of search result
ranking can be significantly improved by introducing and
exploiting the grouping of resources
11
GroupMe! Tagging System

similar to del.icio.us

Organizing resources by arranging
them in groups

Users can use groups to structure their
resources according to different topics,

users can get a quick overview of
multiple resources by using the
GroupMe! system

put new resources into a group via
simple drag & drop operations
12
Folksonomies
GroupMe! introduces groups as a new dimension in folksonomies
[6] A. Hotho, R. J¨aschke, C. Schmitz, and G. Stumme. BibSonomy: A Social Bookmark and Publication Sharing
System. In A. de Moor, S. Polovina, and
H. Delugach, editors, Proc. First Conceptual Structures Tool Interoperability Workshop, pages 87–102, Aalborg, 2006.
13
Folksonomies

traditional folksonomies


relations between tags mainly rely on their co-occurrences
(i.e. two tags are assigned to the same resource)
A GroupMe! folksonomy gains new relations between tags:


A relation between tags assigned from (possibly) different users to
different resources, where the resources are contained in the same
group.
A relation between tags assigned to a group g and tags assigned to
resources that are contained in g.
14
Ranking Strategies


GroupMe! tag assignments are exploited in the graph construction
process.
The challenge of adapting the FolkRank algorithm to GroupMe!
folksonomies is


to identify semantically appropriate strategies for constructing a graph,
adjacency matrix serves as input for the PageRank-based FolkRank
algorithm.
15
PageRank Algorithm (directed graph)
to transform the hypergraph formed
FolkRank Algorithm (undirected graph) by the traditional tag assignments
FolkRank Adaptions
GroupMe! Ranking (add group)
(Based on FolkRank Algo)
•A: Traditional Tag Assignments
•B: Groups as Tags
•C: Group Context-based Tags
(serves as input for the FolkRank algorithm)
Increase the amount of input data
•Propagation of Group Tags
•Propagation of all Tags
GF represented by the real matrix A, which is obtained from the adjacency matrix
16
PageRank Algorithm
(directed graph)
FolkRank Algorithm
(undirected graph)
GroupMe! Ranking (add group)
(Based on FolkRank Algo)
GF represented by the real matrix A, which is obtained from the adjacency matrix
PageRank iterates as follows:
17
FolkRank Adaptions

In order to adapt the FolkRank algorithm to GroupMe! Folksonomies,
we confine ourself on adapting the process of constructing the graph
GF from the hypergraph formed by the GroupMe! tag assignments.

Therefore, we introduce three main strategies:



Traditional Tag Assignments
Groups as Tags
Group Context-based Tags
18
Traditional Tag Assignments





corresponds to the normal FolkRank algorithm
constructs a 3-uniform hypergraph
Reduces GroupMe! tag assignments to traditional tag assignments
Groups are just taken into account as resources which might or
might not be tagged
Computing the weight of each edge also corresponds to the
FolkRank approach
19
Groups as Tags




create artificial tags
use a constant value wc to weight these
edges because a resource is usually added
only once to a certain group
groups g1 and g2
 are treated as normal resources,
artificial tags tg1 and tg2
are assigned to those resources which are
member of the corresponding group.
20
Group Context-based Tags



If users assign a certain tag to
resources in different groups then the
meaning of the tag may differ.
replaces every tag t with a tag ttg,
which indicates that tag t was used in
group g.
transforms all GroupMe! Tag
assignments into normal tag
assignment triples.




(u1, t2, r2, g1),  (u1, tt2g1 , r2) (=tas1)
(u1, t2, r2, g2),  (u1, tt2g2 , r2) (=tas2)
a 3-uniform hypergraph is build, which
serves as input for the construction of
GC.
The construction of GC is done as in
the normal FolkRank algorithm
21

In addition to the three strategies that can be applied to generate the
graph G, which serves as input for the FolkRank algorithm,

we present two further strategies to exploit a GroupMe! folksonomy
 Propagation of Group Tags
 Propagation of all Tags

propagate tags assigned to one resource/group to other resources
or groups.

Such techniques
 increase the amount of input data
 do not require to change the strategies described above
substantially.
22
Evaluations
In the tau
evaluation
we (τ)
concentrate
on ranking
of resources and
The Kendall
coefficient
has the following
properties:
groups as between
search for
is is
the
most(i.e.,
common
use
case ofare the
•If the agreement
theresources
two rankings
perfect
the two
rankings
same)ranking
the coefficient
has value systems.
1.
in folksonomy
•If the
disagreement between the two rankings is perfect (i.e., one ranking is the
 In order to measure the quality of our ranking strategies we used the
reverse of the other) the coefficient has value −1.
OSim and KSim metrics as proposed in [5].
•increasing values imply increasing agreement between the rankings.
 rankings
OSim( are
, ) completely
enables usindependent,
to determinethe
thecoefficient
overlap between
k
•If the
has valuethe
0 ontop
average.
resources of two rankings,
and
235 users (u), 978 tags (t), 1351 resources (r),
273 groups (g),
1758 tag assignments (tas)

KSim( , ), which is based on Kendall’s distance measure,
indicates the degree of pairwise distinct resources, ru and rv, within
the top k that have the same relative order in both rankings.
23
Measurements and Discussion
24
Measurements and Discussion
imprecise
detect adequate resources
7個
8個
25
Measurements and Discussion
26
Results

We have chosen the FolkRank algorithm, and have developed
search algorithms that extend FolkRank to exploit the group context

Our evaluation indicated that the grouping of resources significantly
improves the quality of search in folksonomies.

Grouping is an easy and well-received feature for folksonomies


improve the quality of search
seems to be a very promising approach to improve current folksonomy
systems.
27
Conclusions

GroupMe! system we have created an intuitive Web 2.0 system that
allows users to organize and maintain Web resources very easily

enables users to group Web resources they consider interesting together,
and tag also the groups.

exploit the dynamic and evolving grouping information for search

verify that the grouping information improves the quality of ranking in
folksonomy systems
[7] A. Hotho, R. J¨aschke, C. Schmitz, and G. Stumme. FolkRank: A Ranking Algorithm for Folksonomies. In
28
Proc. FGIR 2006, 2006.