Institute of Computer Graphics and Algorithms, Vienna

Tag Ranking
Presented by Jie Xiao
Dept. of Computer Science
Univ. of Texas at San Antonio
Outline
Problem
Probabilistic tag relevance estimation
Random walk tag relevance refinement
Experiment
Conclusion
[email protected]
Problem
There are millions of social images on the
internet, which makes them an attractive
resource for research.
The tags associated with an image are not
ordered by relevance.
Problem (Cont.)
Tag relevance
There are two types of relevance to be
considered.
The relevance between a tag and an image
The relevance between two tags for the same
image.
Probabilistic Tag Relevance Estimation
Similarity between a tag and an image
x : an image
t : a tag associated with image x
P(t|x) : the probability of tag t given image x
P(t) : the prior probability that tag t occurs in the dataset
After applying Bayes’ rule, we can derive that
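The equation itself is not reproduced in the slide text. A possible reconstruction from the definitions above, assuming the score s(t, x) (a symbol introduced here for illustration) is the ratio of P(t|x) to the prior P(t):

```latex
s(t, x) = \frac{p(t \mid x)}{p(t)}
        = \frac{p(x \mid t)\, p(t)}{p(x)\, p(t)}
        = \frac{p(x \mid t)}{p(x)}
```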
Probabilistic Relevance Estimation (Cont.)
Since the goal is to rank the tags of an individual
image, and p(x) is identical for all of these tags, we
refine it as
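The refined expression is not reproduced in the slide text. Assuming the score before refinement has the form p(x|t)/p(x), dropping the factor p(x), which is the same for every tag of image x, would leave a quantity that preserves the ordering of the tags (s is a symbol introduced here for illustration):

```latex
s(t, x) \doteq p(x \mid t)
```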
Density Estimation
Let (x1, x2, …, xn) be an iid sample drawn from
some distribution with an unknown density ƒ.
Two types of methods to estimate the density:
Histogram
Kernel density estimator
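To make the two options concrete, here is a minimal sketch that estimates the density of a one-dimensional sample at a single point with each method. The function names, the bin count, the Gaussian kernel, and the bandwidth h are illustrative assumptions, not choices taken from the slides.

```python
import numpy as np

def histogram_density(sample, x, bins=10):
    """Histogram estimate: piecewise-constant density over equal-width bins."""
    heights, edges = np.histogram(sample, bins=bins, density=True)
    # Find the bin that holds x; the right-most edge belongs to the last bin.
    idx = np.searchsorted(edges, x, side="right") - 1
    if x == edges[-1]:
        idx = len(heights) - 1
    # Points outside the sample range get density 0.
    if idx < 0 or idx >= len(heights):
        return 0.0
    return float(heights[idx])

def gaussian_kernel_density(sample, x, h):
    """Kernel estimate: average of Gaussian bumps of width h centred on each sample."""
    u = (x - np.asarray(sample, dtype=float)) / h
    return float(np.mean(np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)) / h)

rng = np.random.default_rng(0)
sample = rng.normal(size=500)          # iid draws from a standard normal
hist_at_zero = histogram_density(sample, 0.0)
kde_at_zero = gaussian_kernel_density(sample, 0.0, h=0.3)
```

Both estimates should land near the true density at zero (about 0.4); the histogram value changes in jumps as x crosses bin edges, while the kernel estimate varies smoothly.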
Histogram
Credit: All of Nonparametric Statistics via UTSA library
Kernel Density Estimation
A smooth kernel function K is used to estimate the density
Kernel Density Estimation (Cont.)
Its kernel density estimator is
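The formula is not reproduced in the slide text. The standard form of the estimator for an iid sample (x1, x2, …, xn) is given here, where the bandwidth h > 0 is a symbol introduced for illustration and K is a kernel that integrates to 1:

```latex
\hat{f}_h(x) = \frac{1}{n h} \sum_{i=1}^{n} K\!\left( \frac{x - x_i}{h} \right)
```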
Probabilistic Relevance Estimation (Cont.)
Kernel Density Estimation (KDE) is adopted to
estimate the probability density function p(x|t).
Xi : the image set containing tag ti
xk : one of the top-k nearest-neighbor images in the image set Xi
K : the kernel function used to estimate the density
|Xi| : the cardinality of Xi
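One reading of these symbols is that p(x|ti) is estimated as the sum of K(x − xk) over the neighbor images xk, divided by |Xi|. The sketch below follows that reading. The Gaussian kernel, the radius `sigma`, the optional cut-off `k`, and the function and tag names are assumptions made for illustration; the slide's own equation is not available in the text.

```python
import numpy as np

def tag_relevance(x, tagged_images, sigma=1.0, k=None):
    """Estimate p(x | t_i) by summing a Gaussian kernel over images tagged t_i.

    x             : feature vector of the query image, shape (d,)
    tagged_images : feature vectors of the images in X_i, shape (m, d)
    sigma         : kernel radius (hypothetical default)
    k             : if given, only the k nearest neighbours of x contribute
    """
    X = np.asarray(tagged_images, dtype=float)
    sq_dist = np.sum((X - np.asarray(x, dtype=float)) ** 2, axis=1)
    if k is not None:
        sq_dist = np.sort(sq_dist)[:k]
    # Sum of kernel values, normalised by the cardinality of X_i.
    return float(np.sum(np.exp(-sq_dist / sigma**2)) / len(X))

def rank_tags(x, images_by_tag, sigma=1.0, k=None):
    """Order the tags of image x from most to least relevant."""
    scores = {t: tag_relevance(x, imgs, sigma, k) for t, imgs in images_by_tag.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Toy example: "cat" images lie close to the query, "car" images far away.
query = [0.0, 0.0]
images_by_tag = {
    "car": [[5.0, 5.0], [6.0, 6.0]],
    "cat": [[0.0, 0.0], [0.1, 0.0]],
}
ranking = rank_tags(query, images_by_tag)
```

In practice the feature vectors would come from a visual descriptor of each image; the two-dimensional points here only keep the example small.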
Relevance between tags
ti : tag i associated with image x
tj : tag j associated with image x
Xi : the image set containing tag i
Xj : the image set containing tag j
N : the number of nearest neighbors retrieved for image x
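The similarity formula for this slide is not reproduced in the text. As a rough sketch of how a visual similarity between two tags could be computed from these quantities, the code below takes the N nearest neighbors of x from each tag's image set and maps the mean squared distance between the two neighbor sets through a Gaussian-style function. The exponential form, the radius `sigma`, and all function names are assumptions, not the presenter's definition.

```python
import numpy as np

def nearest_neighbors(x, images, n):
    """Return the n images (rows of `images`) closest to x in Euclidean distance."""
    images = np.asarray(images, dtype=float)
    dist = np.linalg.norm(images - np.asarray(x, dtype=float), axis=1)
    return images[np.argsort(dist)[:n]]

def exemplar_similarity(neighbors_i, neighbors_j, sigma=1.0):
    """Similarity in (0, 1]: close to 1 when the two neighbor sets look alike."""
    a = np.asarray(neighbors_i, dtype=float)
    b = np.asarray(neighbors_j, dtype=float)
    # Squared distance between every pair (one image from each set).
    sq_dist = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=2)
    return float(np.exp(-sq_dist.mean() / sigma**2))

# Toy example with 2-D features and N = 2.
x = np.array([0.0, 0.0])
set_i = nearest_neighbors(x, [[0.0, 0.0], [0.2, 0.0], [9.0, 9.0]], 2)
set_j_close = nearest_neighbors(x, [[0.1, 0.0], [0.0, 0.1], [8.0, 8.0]], 2)
set_j_far = nearest_neighbors(x, [[5.0, 5.0], [6.0, 5.0], [7.0, 7.0]], 2)
sim_close = exemplar_similarity(set_i, set_j_close)
sim_far = exemplar_similarity(set_i, set_j_far)
```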
Relevance between tags (Cont.)
Relevance between tags (Cont.)
Co-occurrence similarity between tags
f(ti) : the # of images containing tag ti
f(ti,tj) : the # of images containing both tag ti and tag tj
G : the total # of images in Flickr
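The three counts above are the inputs of a Normalized Google Distance style measure, so a co-occurrence similarity could be sketched as follows. The slide's formula is not reproduced in the text; the distance form, the exp(-d) mapping to a similarity, and the function names are assumptions.

```python
import math

def cooccurrence_distance(f_i, f_j, f_ij, G):
    """Normalized-Google-Distance-style distance from tag counts.

    f_i, f_j : number of images containing tag i / tag j
    f_ij     : number of images containing both tags (must be > 0)
    G        : total number of images in the collection
    """
    log_i, log_j = math.log(f_i), math.log(f_j)
    return (max(log_i, log_j) - math.log(f_ij)) / (math.log(G) - min(log_i, log_j))

def cooccurrence_similarity(f_i, f_j, f_ij, G):
    """Map the distance to a similarity in [0, 1]; tags that never co-occur get 0."""
    if f_ij == 0:
        return 0.0
    return math.exp(-cooccurrence_distance(f_i, f_j, f_ij, G))

# Two tags that always appear together versus two that rarely do.
always_together = cooccurrence_similarity(100, 100, 100, 10_000)
rarely_together = cooccurrence_similarity(100, 100, 2, 10_000)
```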
Relevance between tags (Cont.)
Relevance between tags (Cont.)
Relevance score between two tags
where
Random walk over tag graph
P: n by n transition matrix.
pij : the probability of the transition from node i to j
rk(j): relevance score of node j at iteration k
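The update rule is not reproduced in the slide text. A common form for this kind of random walk, assumed here, mixes the propagated scores with the initial scores v: rk(j) = α Σi rk-1(i) pij + (1 − α) vj. The sketch below builds P by row-normalizing a tag similarity matrix S; the names `S`, `v`, `alpha`, and the stopping rule are illustrative assumptions.

```python
import numpy as np

def random_walk_refine(S, v, alpha=0.5, tol=1e-10, max_iter=1000):
    """Refine initial tag scores v by a damped random walk over the tag graph.

    S     : (n, n) non-negative tag-to-tag similarity matrix
    v     : (n,) initial relevance scores (e.g. tag-image relevance)
    alpha : weight of the propagated scores, 0 <= alpha < 1
    """
    S = np.asarray(S, dtype=float)
    # Transition matrix: p_ij = s_ij / sum_k s_ik, so each row sums to 1.
    P = S / S.sum(axis=1, keepdims=True)
    v = np.asarray(v, dtype=float)
    r = v.copy()
    for _ in range(max_iter):
        # (r @ P)[j] = sum_i r[i] * p_ij
        r_next = alpha * (r @ P) + (1 - alpha) * v
        if np.max(np.abs(r_next - r)) < tol:
            return r_next
        r = r_next
    return r

S = np.array([[0.0, 1.0, 1.0],
              [1.0, 0.0, 0.5],
              [1.0, 0.5, 0.0]])
v = np.array([0.5, 0.3, 0.2])
refined = random_walk_refine(S, v, alpha=0.5)
final_order = np.argsort(-refined)   # tag indices, most relevant first
```

Because alpha < 1, the iteration is a contraction and converges to a unique fixed point regardless of the starting scores.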
Random walk
Random walk over tag graph (Cont.)
Experiments
Dataset: 50,000 images crawled from Flickr
Popular tags:
Raw tags: more than 100,000 unique tags
Filtered tags: 13,330 unique tags
Performance Metric
Normalized Discounted Cumulative Gain
(NDCG)
r(i) : the relevance level of the i-th tag
Zn : a normalization constant that is chosen so that the optimal
ranking’s NDCG score is 1.
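The NDCG formula is not reproduced in the slide text. The sketch below uses a widely used variant, with gain 2^r(i) − 1 and a log2(1 + i) discount, and obtains Zn by dividing by the score of the ideal ordering. The gain and discount choices and the function names are assumptions.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain of a ranked list of relevance levels."""
    return sum((2**r - 1) / math.log2(1 + i)
               for i, r in enumerate(relevances, start=1))

def ndcg(relevances):
    """DCG divided by the DCG of the best possible ordering (the 1/Z_n factor)."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

perfect = ndcg([3, 2, 1])    # tags already in the best order
reversed_ = ndcg([1, 2, 3])  # most relevant tag ranked last
```

A ranking that places the most relevant tags first scores 1, and any other ordering of the same tags scores lower.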
Experimental Result
Comparison among different tag ranking
approaches
Conclusion
Estimate the tag-image relevance by kernel
density estimation.
Estimate the tag-tag relevance by visual
similarity and tag co-occurrence.
A random-walk-based approach is used to
refine the relevance scores and improve the ranking.
Thank you!