Estimate the Number of Relevant
Images Using Two-Order Markov
Chain
Presented by: WANG Xiaoling
Supervisor: Clement LEUNG
Outline
Introduction
Objective
Methodology
Experiment Results
Conclusion and Future Work
Introduction
Large collections of images have been
made available on the web.
Retrieval effectiveness has become one of the
most important measures of the performance
of image retrieval systems.
Measures:
Precision
Recall
$$P = \frac{\text{number of relevant images retrieved}}{\text{total number of images returned}}$$
$$R = \frac{\text{number of relevant images retrieved}}{\text{total number of relevant images in the database}}$$
Significant Challenge: the total number of
relevant images is not directly observable
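A minimal sketch of the two measures defined above. The function names and the counts are hypothetical and chosen only for illustration. Note that recall needs the total number of relevant images in the database, which is exactly the quantity that cannot be observed directly.

```python
def precision(relevant_retrieved: int, total_returned: int) -> float:
    """Fraction of the returned images that are relevant."""
    if total_returned <= 0:
        raise ValueError("total_returned must be positive")
    return relevant_retrieved / total_returned


def recall(relevant_retrieved: int, total_relevant: int) -> float:
    """Fraction of all relevant images in the database that were retrieved."""
    if total_relevant <= 0:
        raise ValueError("total_relevant must be positive")
    return relevant_retrieved / total_relevant


# Hypothetical counts, for illustration only.
p = precision(relevant_retrieved=18, total_returned=20)  # 18 / 20
r = recall(relevant_retrieved=18, total_relevant=60)     # 18 / 60
```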
Basic Models
Regression Model
Markov Chain
Two-Order Markov Chain
Objective
To investigate the probabilistic behavior of
the distribution of relevant images among
the results returned by image search
engines, using a two-order Markov chain
Methodology
Test Image Search Engines: Google, Yahoo, Bing
Query Design
70% provided by authors
One-word query
Two-word query
Three-word query
30% suggested terms
Suggested term with the most returned results
Suggested term with the fewest returned results
Methodology
Database Setup:
Stochastic process {X_1, X_2, …, X_J},
where X_J denotes the aggregate relevance
of all the images on page J
Equation:
$$X_J = \sum_{i=1}^{20} Y_{Ji}$$
where Y_{Ji} = 1 if the i-th image on page J is
relevant, and Y_{Ji} = 0 if the i-th image on
page J is not relevant.
Example:
$$X_J = \sum_{i=1}^{20} Y_{Ji} = 18$$
Page J   X_J
1        18
2        19
3        20
4        19
5        20
6        19
7        20
8        18
9        19
10       18
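The aggregate relevance of a page can be sketched as a simple sum of binary judgments. The helper name `aggregate_relevance` and the sample judgments below are hypothetical; only the page size of 20 and the 0/1 coding come from the slides.

```python
from typing import List

IMAGES_PER_PAGE = 20  # page size used in the slides


def aggregate_relevance(judgments: List[int]) -> int:
    """Return X_J, the number of relevant images on one page.

    `judgments` holds Y_J1 .. Y_J20, each 1 (relevant) or 0 (not relevant).
    """
    if len(judgments) != IMAGES_PER_PAGE:
        raise ValueError("expected one judgment per image on the page")
    if any(y not in (0, 1) for y in judgments):
        raise ValueError("judgments must be 0 or 1")
    return sum(judgments)


# Hypothetical judgments for one page: two images are marked not relevant.
page = [1] * IMAGES_PER_PAGE
page[4] = 0
page[11] = 0
x_j = aggregate_relevance(page)
```

With two non-relevant images on the page, `x_j` is 18, matching the form of the example above.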
Forecast Using Two-Order Markov Chain
Markov Chain: stochastic process {X_J, J ≥ 1} with state
space S = {0, 1, 2, …, 20}.
Two-Order Markov Chain: the state space changes to S^2,
the set of ordered pairs of states from S.
Forecast the state probability distribution of page J,
π(J), from the initial state probability distribution
π(1) and the transition probability matrix P
(see the worked example after the Q&A slide).
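A minimal sketch of one forecast step over pair states. Two things here are assumptions, not taken from the slides: the transition probabilities are estimated by relative frequency of observed triples, and the point forecast is the expected number of relevant images. The names `estimate_transitions`, `step` and `expected_relevant` are hypothetical. The training sequence reuses the ten X_J values from the Database Setup table purely as toy data.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Pair = Tuple[int, int]  # (X_{J-1}, X_J), one state of the pair space S^2
Matrix = Dict[Pair, Dict[Pair, float]]


def estimate_transitions(sequences: List[List[int]]) -> Matrix:
    """Estimate P[(a, b)][(b, c)] by relative frequency of runs a, b, c."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        for a, b, c in zip(seq, seq[1:], seq[2:]):
            counts[(a, b)][(b, c)] += 1
    return {
        src: {dst: n / sum(row.values()) for dst, n in row.items()}
        for src, row in counts.items()
    }


def step(pi: Dict[Pair, float], P: Matrix) -> Dict[Pair, float]:
    """One forecast step, pi(J) = pi(J-1) * P, over pair states.

    Simplification: mass on a pair never seen as a source is dropped.
    """
    nxt = defaultdict(float)
    for src, mass in pi.items():
        for dst, prob in P.get(src, {}).items():
            nxt[dst] += mass * prob
    return dict(nxt)


def expected_relevant(pi: Dict[Pair, float]) -> float:
    """Expected X_J under a distribution over pairs (X_{J-1}, X_J)."""
    return sum(mass * pair[1] for pair, mass in pi.items())


pages = [18, 19, 20, 19, 20, 19, 20, 18, 19, 18]  # toy training sequence
P = estimate_transitions([pages])
pi = {(18, 19): 1.0}          # the last two observed pages were 18 and 19
pi_next = step(pi, P)         # distribution over the next pair state
forecast = expected_relevant(pi_next)
```

In the toy sequence the pair (18, 19) is followed once by 20 and once by 18, so the next-page distribution splits evenly between those two values.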
Model Test
Mean Absolute Error
Experiment Results
Forecast Results Using Two-Order Markov Chain
Page   Google   Yahoo   Bing
1      20       20      20
2      20       20      20
3      20       20      20
4      20       20      20
5      20       20      20
6      20       20      17
7      20       20      17
8      20       20      17
9      20       20      17
10     20       20      17
Test Results--Google
[Figure: panels (a) to (j), each plotting Number of Relevant Images against Page Number (pages 1 to 10)]
Test Results--Yahoo
[Figure: panels (a) to (j), each plotting Number of Relevant Images against Page Number (pages 1 to 10)]
Test Results--Bing
[Figure: panels (a) to (j), each plotting Number of Relevant Images against Page Number (pages 1 to 10)]
Measure for Forecast Accuracy
Mean Absolute Deviation (MAD):
$$\text{MAD} = \frac{\sum |\text{forecast error}|}{n}$$
Engine   One-word         Two-word         Three-word
Google   2.7  2.3  1.1    0.8  0.1  0.8    1.7  1.9   0.6
Yahoo    2.0  0.1  1.1    0.5  4.8  1.1    2.2  2.2   0.4
Bing     1.3  1.9  1.2    4.5  2.0  1.2    1.2  10.5  1.1
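The MAD measure above can be sketched as follows. The function name and the forecast and observed values are hypothetical and serve only to illustrate the arithmetic; they are not results from the experiments.

```python
from typing import Sequence


def mean_absolute_deviation(forecast: Sequence[float],
                            actual: Sequence[float]) -> float:
    """MAD = sum(|forecast - actual|) / n over n paired observations."""
    if len(forecast) != len(actual) or len(forecast) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    errors = [abs(f - a) for f, a in zip(forecast, actual)]
    return sum(errors) / len(errors)


# Hypothetical values for pages 1 to 5, for illustration only.
forecast = [20, 20, 20, 20, 20]
actual = [18, 19, 20, 19, 20]
mad = mean_absolute_deviation(forecast, actual)  # (2 + 1 + 0 + 1 + 0) / 5
```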
Comparative Results
[Figure: bar chart of Mean Absolute Error (scale 0 to 5) by Image Search Engine (Google, Yahoo, Bing) for three models: Regression Model, Markov Chain, Two-Order Markov Chain]
Best Model: Two-Order Markov Chain
Worst Model: Regression Model
Conclusion
The two-order Markov chain represents well
the distribution of relevant images among
the result pages of the major web image
search engines.
The two-order Markov chain is the best of the
three models we compared.
Future Work
Future work will apply a hidden
Markov model to this problem.
Thank you!
Q&A
Two-Order Markov Chain
An example (cont’)
Suppose the stochastic process {X_t, t ≥ 0} has
a state space S = {A, B, C}.
For the two-order Markov chain, the state space becomes:
S^2 = {AA, AB, AC, BA, BB, BC, CA, CB, CC}
The state probability distribution at period zero:
π(0) = (π_AA, π_AB, π_AC, π_BA, π_BB, π_BC, π_CA, π_CB, π_CC)
An example (cont’)
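The transition probability matrix (rows and columns ordered AA, AB, AC, BA, BB, BC, CA, CB, CC):
$$
P = \begin{pmatrix}
p_{AA,AA} & p_{AA,AB} & p_{AA,AC} & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & p_{AB,BA} & p_{AB,BB} & p_{AB,BC} & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & p_{AC,CA} & p_{AC,CB} & p_{AC,CC} \\
p_{BA,AA} & p_{BA,AB} & p_{BA,AC} & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & p_{BB,BA} & p_{BB,BB} & p_{BB,BC} & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & p_{BC,CA} & p_{BC,CB} & p_{BC,CC} \\
p_{CA,AA} & p_{CA,AB} & p_{CA,AC} & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & p_{CB,BA} & p_{CB,BB} & p_{CB,BC} & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & p_{CC,CA} & p_{CC,CB} & p_{CC,CC}
\end{pmatrix}
$$
A transition is possible only when the second symbol of the current state equals the first symbol of the next state, so for example p_{AA,BA} = 0.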
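The zero pattern of the matrix can be sketched by enumerating the pair states and marking which transitions are structurally possible. The names `S2`, `allowed` and `mask` are hypothetical, and the 0/1 mask only shows where nonzero probabilities may appear; it says nothing about their values.

```python
from itertools import product

S = ["A", "B", "C"]
# Pair states in the order AA, AB, AC, BA, BB, BC, CA, CB, CC.
S2 = ["".join(pair) for pair in product(S, repeat=2)]


def allowed(src: str, dst: str) -> bool:
    """A move src -> dst is possible only if the pairs overlap:
    the second symbol of src must equal the first symbol of dst."""
    return src[1] == dst[0]


# mask[i][j] is 1 where p_{S2[i], S2[j]} may be nonzero, else 0.
mask = [[1 if allowed(s, d) else 0 for d in S2] for s in S2]

for state, row in zip(S2, mask):
    print(state, row)
```

Each row has exactly three possible destinations, one for each value the next symbol can take.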
An example
Therefore, the probability distribution of
states for page J is computed as:
π(J) = π(J-1) P
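The update rule can be sketched as a row vector times a matrix on the A, B, C example. The matrix values are an assumption made only for illustration: every structurally possible transition is given probability 1/3. The helper name `propagate` is hypothetical.

```python
from itertools import product

# Pair states in the order AA, AB, AC, BA, BB, BC, CA, CB, CC.
S2 = ["".join(pair) for pair in product("ABC", repeat=2)]

# Hypothetical P: each possible transition (second symbol of the source
# equals first symbol of the destination) gets probability 1/3.
P = [[1 / 3 if s[1] == d[0] else 0.0 for d in S2] for s in S2]


def propagate(pi, P):
    """Row vector times matrix: pi(J) = pi(J-1) * P."""
    n = len(pi)
    return [sum(pi[i] * P[i][j] for i in range(n)) for j in range(len(P[0]))]


pi0 = [1.0 if s == "AB" else 0.0 for s in S2]  # all mass on state AB
pi1 = propagate(pi0, P)  # mass moves to BA, BB, BC
pi2 = propagate(pi1, P)  # mass spreads over all nine pair states
```

Repeating the call gives the distribution for any later page, one step at a time.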