Indexing of Masses of Graphical Documents

ANR Project Navidomass
Salim Jouili
Supervisor: S.A. Tabbone
QGAR – LORIA, Nancy
Navidomass meeting, Paris, 21 March 2008
• Introduction
• Graph-based representation
• Similarity measures of graphs
    - Edit distance
    - Papadopoulos and Manolopoulos measure
    - Maximal common subgraph
    - Graph probing
• Median graph
• Applications
• Conclusion


Graphs are a powerful structure-based representation.

They can be used flexibly to process a wide variety of image types (ancient documents, electrical and architectural plans, natural images, medical images...).

They preserve the topological information of the image as well as the relationships between its components.

Over the last two decades, many graph-based approaches have been developed.

They play a part in every subfield of image analysis:
• Pattern recognition
• Segmentation
• CBIR (content-based image retrieval)
• Bunke, PAMI'82 [1]:
[Figure: a line drawing represented as an attributed graph; each vertex carries its (x, y) coordinates, e.g. (30,100), and a label.]
(x, y) = vertex attributes
1, 2 and 3 = vertex labels: 1 = end point, 2 = angle, 3 = T intersection
• Karray, Master 2006 [2]: multilayer segmentation, homogeneous zones.
• Region adjacency graphs (RAG): Fauqueur, PhD 2003 [3].
[Figure: original image and a RAG representation of the segmented image.]
• Region adjacency graphs: Lladós, PAMI'01 [4]: extraction of the regions of a plane graph by the Jiang and Bunke algorithm [5].
[Figure: a plane graph G representing a line drawing (vertices V1 to V6, edges e1 to e8, regions R1 to R3) and its RAG G'.]
In the RAG G':
• Vertices represent the regions in G
• Edges represent region adjacency in G
• GCap: Graph-based Automatic Image Captioning, J. Pan, MDDE'04 [6].
Most work on graph-based representation, notably in document analysis, seeks similarity measures between the represented objects in order to:
• Classify
• Match
• Index
• ...
• Edit distance:
[Figure: G1 is transformed into G2 by one edge deletion (1 operation) and one vertex substitution (1 operation).]
$D(G_1, G_2) = 2$
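The count on the slide can be reproduced with a brute-force sketch of the edit distance. It assumes unit costs for every operation (the slide charges 1 per operation but does not give a general cost model), undirected graphs without self-loops, and labelled vertices. The function name `graph_edit_distance` and the example graphs are illustrative, not taken from the presentation. The search enumerates every vertex mapping, so it is only usable on very small graphs.

```python
from itertools import permutations


def graph_edit_distance(labels1, edges1, labels2, edges2):
    """Exact edit distance between two small undirected simple graphs.

    Assumed unit costs: 1 per vertex substitution (label change),
    vertex insertion/deletion and edge insertion/deletion.
    """
    n1, n2 = len(labels1), len(labels2)
    e1 = {frozenset(e) for e in edges1}
    e2 = {frozenset(e) for e in edges2}
    # Each vertex of G1 maps to a distinct vertex of G2, or to None (deleted).
    targets = list(range(n2)) + [None] * n1
    best = float("inf")
    for mapping in set(permutations(targets, n1)):
        cost = 0
        for v, t in enumerate(mapping):
            if t is None or labels1[v] != labels2[t]:
                cost += 1  # vertex deletion or substitution
        used = {t for t in mapping if t is not None}
        cost += n2 - len(used)  # vertices of G2 that must be inserted
        preserved = set()
        for edge in e1:
            u, v = tuple(edge)
            image = frozenset((mapping[u], mapping[v]))
            if None not in image and image in e2:
                preserved.add(image)
            else:
                cost += 1  # edge deletion
        cost += len(e2) - len(preserved)  # edges of G2 that must be inserted
        best = min(best, cost)
    return best


# Hypothetical example: a triangle loses one edge and one label changes.
g1 = (["a", "b", "c"], [(0, 1), (1, 2), (0, 2)])
g2 = (["a", "b", "d"], [(0, 1), (1, 2)])
distance = graph_edit_distance(*g1, *g2)
print(distance)
```

Here the cheapest edit path is one edge deletion plus one vertex substitution, matching the two operations illustrated above.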
• Maximal common subgraph (MCS):
[Figure: graphs G1 and G2 and their maximal common subgraph.]
$D_{mcs}(G_1, G_2) = 1 - 3/4 = 0.25$
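The value above is consistent with the usual normalisation of the MCS distance by the size of the larger graph. This is an assumption: the slide does not state the general formula, nor whether size counts vertices or edges.

```latex
D_{mcs}(G_1, G_2) = 1 - \frac{|mcs(G_1, G_2)|}{\max(|G_1|, |G_2|)}
```

With $|mcs(G_1, G_2)| = 3$ and $\max(|G_1|, |G_2|) = 4$ this gives $1 - 3/4 = 0.25$.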
• Papadopoulos and Manolopoulos measure [7]:
[Figure: the two graphs G1 and G2, each on vertices V1 to V6.]
Sorted graph histogram of G1: SH1 = {V5(3), V4(3), V1(3), V6(2), V3(2), V2(1)}
Sorted graph histogram of G2: SH2 = {V4(4), V3(4), V1(4), V6(3), V5(3), V2(2)}
$D_{Pa.\&Mano}(G_1, G_2) = L_1(SH_1, SH_2) = 6$
The primitive operations are vertex insertion, vertex deletion and vertex update.
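A minimal sketch of this measure follows. It assumes that the number in parentheses after each vertex is its degree, and that histograms of different lengths are padded with zeros. The helper names and the edge list `g1_edges` are hypothetical: the edges of the slide's graphs are not recoverable, so the list is chosen only to reproduce the degrees of SH1.

```python
def sorted_degree_histogram(num_vertices, edges):
    """Vertex degrees of an undirected graph, sorted in decreasing order."""
    degree = [0] * num_vertices
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sorted(degree, reverse=True)


def histogram_distance(sh1, sh2):
    """L1 distance between two sorted histograms (shorter one zero-padded)."""
    n = max(len(sh1), len(sh2))
    a = list(sh1) + [0] * (n - len(sh1))
    b = list(sh2) + [0] * (n - len(sh2))
    return sum(abs(x - y) for x, y in zip(a, b))


# Degree values read off the two histograms on the slide.
sh1 = [3, 3, 3, 2, 2, 1]
sh2 = [4, 4, 4, 3, 3, 2]
print(histogram_distance(sh1, sh2))

# Hypothetical edge list (vertices 0..5 stand for V1..V6) with the degrees of SH1.
g1_edges = [(0, 1), (0, 3), (0, 4), (3, 4), (3, 5), (4, 2), (5, 2)]
print(sorted_degree_histogram(6, g1_edges))
```

Each of the six entries differs by one between the two histograms, which gives the distance of 6 stated above.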
• Graph probing, Lopresti, IJDAR'2004 [8]:
"How many vertices with degree n are present in graph G = (V, E)?" The probe PR collects the answers from the graphs:
$PR(G) = (n_0, n_1, n_2, \dots)$ where $n_i = |\{v \in V \mid \deg(v) = i\}|$
$D_{probing}(G_1, G_2) = L_1(PR(G_1), PR(G_2))$
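The probe and the resulting distance can be sketched in a few lines. The function names and the two example graphs (a 3-vertex path and a triangle) are illustrative assumptions; graphs are taken to be undirected and given as a vertex count plus an edge list.

```python
from collections import Counter


def degree_probe(num_vertices, edges):
    """Return PR(G) = [n0, n1, ...], where n_i counts vertices of degree i."""
    degree = [0] * num_vertices
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    counts = Counter(degree)
    max_degree = max(degree, default=0)
    return [counts.get(i, 0) for i in range(max_degree + 1)]


def probing_distance(g1, g2):
    """L1 distance between the degree probes of two graphs."""
    p1, p2 = degree_probe(*g1), degree_probe(*g2)
    n = max(len(p1), len(p2))
    p1 += [0] * (n - len(p1))
    p2 += [0] * (n - len(p2))
    return sum(abs(a - b) for a, b in zip(p1, p2))


path = (3, [(0, 1), (1, 2)])              # degrees 1, 2, 1
triangle = (3, [(0, 1), (1, 2), (0, 2)])  # degrees 2, 2, 2
print(degree_probe(*path))
print(degree_probe(*triangle))
print(probing_distance(path, triangle))
```

Computing a probe is linear in the size of the graph, which is what makes the measure attractive as a fast filter. Two different graphs can share the same probe, so a distance of zero does not imply isomorphism.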
• The generalized median graph aims to capture the essential information of a whole set of graphs in a single prototype.
[Figure: a set of graphs and their generalized median graph.]
• $G_{GM} = \arg\min_{g \in U} \sum_{i=1}^{n} d(g, g_i)$
where $g_1, \dots, g_n$ are the graphs of the original set and $U$ is the set of all graphs that can be built from them.
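Searching all of U is expensive, so the sketch below computes the simpler set median instead: the same minimisation restricted to the graphs of the original set. This is a swapped-in simplification, not the generalized median itself. The distance used (size of the symmetric difference of the edge sets, for graphs over a shared labelled vertex set) and all names are assumptions chosen for illustration.

```python
def edge_difference(g, h):
    """Toy distance: number of edges present in exactly one of the two graphs."""
    return len(g ^ h)


def set_median(graphs, distance):
    """Graph of the set that minimises the sum of distances to all the others."""
    return min(graphs, key=lambda g: sum(distance(g, h) for h in graphs))


# Three hypothetical graphs, each given as a frozenset of edges.
g1 = frozenset({(0, 1), (1, 2)})
g2 = frozenset({(0, 1), (1, 2), (2, 3)})
g3 = frozenset({(0, 1)})

median = set_median([g1, g2, g3], edge_difference)
print(sorted(median))
```

The sums of distances are 2 for g1 and 3 for both g2 and g3, so g1 is selected. Any of the distances from the previous slides could be passed as `distance` instead.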
• Jiang proposed a genetic algorithm. GbR'99 [9]
• Hlaoui proposed a solution that decomposes the problem of minimizing the sum of distances into two parts, nodes and edges. GbR'03 [10]
• Content-based image retrieval: Berretti proposed a graph matching and indexing technique dedicated to graph models in content-based retrieval, using the M-tree indexing method. PAMI'2001 [11]
• Segmentation: Felzenszwalb proposed a complete graph-based approach for the segmentation of colour images. [12]
• ...
• Graph-based representation: flexible, universal (any document type), preserves spatial information.
• Useful in many fields of image analysis.
• Many solutions exist for measuring the similarity between graphs; the choice depends on the data stored in the graphs.
• An ambitious research field, notably for content-based image retrieval.

[1] H. Bunke. Attributed programmed graph grammars and their application to schematic diagram interpretation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 4(6), November 1982.
[2] A. Karray. Recherche de lettrines par le contenu. Master's thesis, Laboratoire L3i, Universities of La Rochelle and Sfax, France and Tunisia, 2006.
[3] J. Fauqueur. Contributions pour la Recherche d'Images par Composantes Visuelles. PhD thesis, INRIA, Université Versailles St Quentin, 2003.
[4] J. Lladós, E. Martí, and J. J. Villanueva. Symbol recognition by error-tolerant subgraph matching between region adjacency graphs. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(10), 2001.
[5] X. Y. Jiang and H. Bunke. An optimal algorithm for extracting the regions of a plane graph. Pattern Recognition Letters, 14:553–558, 1993.




[6] J. Pan, H. Yang, C. Faloutsos, and P. Duygulu. GCap: Graph-based automatic image captioning. In Proceedings of the 4th International Workshop on Multimedia Data and Document Engineering, 2004.
[7] A. N. Papadopoulos and Y. Manolopoulos. Structure-based similarity search with graph histograms. Proceedings of International Workshop on Similarity Search (DEXA IWOSS'99), pages 174–178, September 1999.
[8] D. Lopresti and G. Wilfong. A fast technique for comparing graph representations with applications to performance evaluation. IJDAR, 6:219–229, 2004.
[9] X. Jiang, A. Munger, and H. Bunke. Computing the generalized median of a set of graphs. 2nd IAPR-TC15 Workshop on Graph Based Representations, 1999.





[10] A. Hlaoui and S. Wang. A new median graph algorithm. IAPR Workshop on GbRPR, LNCS 2726, pages 225–234, 2003.
[11] S. Berretti, A. Del Bimbo, and E. Vicario. Efficient matching and indexing of graph models in content-based retrieval. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(10):1089–1105, 2001.
[12] P. F. Felzenszwalb and D. P. Huttenlocher. Efficient graph-based image segmentation. International Journal of Computer Vision, 59(2), September 2004.