
Benchmarking in Knowledge Web
Raúl García-Castro, Asunción Gómez-Pérez
<rgarcia,[email protected]>
Jérôme Euzenat
<[email protected]>
September 10th, 2004
Research Benchmarking ≠ Industrial Benchmarking

WP 1.2, Industrial Benchmarking (from T.A. page 26):
• Point of view: tool recommendation
• Criteria: utility
• Tools:
  • Ontology development tools
  • Annotation tools
  • Querying and reasoning services of ontology development tools
  • Merging and alignment tools

WP 2.1, Research Benchmarking (from T.A. page 41):
• Point of view: research progress
• Criteria: scalability, robustness, interoperability
• Tools:
  • Ontology development tools
  • Annotation tools
  • Querying and reasoning services of ontology development tools
  • Semantic Web Service technology
Index
Benchmarking activities in Knowledge Web
Benchmarking in WP 2.1
Benchmarking in WP 2.2
Benchmarking information repository
Benchmarking in Knowledge Web
Benchmarking activities in KW
Overview of the benchmarking activities:
• Progress
• What to expect from them
• What are their relationships/dependencies
• What could be shared/reused between them
Benchmarking timeline
[Timeline figure: deliverables per work package over project months 0–48; progress legend: finished / started / not started]

WP 1.2 (Roberta Cuel):
• D1.2.1: Utility of ontology development tools
• Utility of merging, alignment, annotation
• Performance of querying, reasoning

WP 1.3 (Luigi Lancieri):
• D1.3.1: Best practices and guidelines for industry
• Best practices and guidelines for business cases

WP 2.1 (Raúl García):
• D2.1.1: Benchmarking SoA
• D2.1.4: Benchmarking methodology, criteria, test suites
• D2.1.6: Benchmarking building tools
• Benchmarking querying, reasoning, annotation
• Benchmarking web service technology

WP 2.2 (Jérôme Euzenat):
• D2.2.2: Benchmarking methodology for alignment
• D2.2.4: Benchmarking alignment results
Benchmarking relationships
[Dependency figure over project months 6–24; the tasks below exchange the artifacts listed]

Tasks:
• T 1.2.1 Utility of ontology-based tools
• T 1.3.1 Best Practices and Guidelines
• T 2.1.1 SoA on the technology of the scalability WP
• T 2.1.4 Definition of a methodology, general criteria for benchmarking
• T 2.1.6 Benchmarking of ontology building tools
• T 2.2.2 Design of a benchmark suite for alignment
• T 2.2.4 Research on alignment techniques and implementations

Artifacts exchanged:
• Best practices
• Benchmarking overview
• SoA ontology tech. evaluation
• Benchmarking methodology
• Benchmark suites
• Benchmarking methodology for alignment
• Benchmark suite for alignment
Index
Benchmarking activities in Knowledge Web
Benchmarking in WP 2.1
Benchmarking in WP 2.2
Benchmarking information repository
Benchmarking in Knowledge Web
Benchmarking in WP 2.1
[Timeline figure: tasks over project months 0–48]

T 2.1.1 State of the Art:
• Overview of benchmarking, experimentation, and measurement
• SoA of ontology technology evaluation

T 2.1.4 Definition of a methodology, general criteria for ontology tools benchmarking:
• Benchmarking methodology
• Types of tools to be benchmarked: ontology building tools; annotation tools; querying and reasoning services of ontology development tools; Semantic Web Services technology
• General evaluation criteria: interoperability, scalability, robustness, ...

T 2.1.6 Benchmarking of ontology building tools:
• Specific evaluation criteria: interoperability, scalability, robustness
• Test suites for ontology building tools
• Benchmarking supporting tools

T 2.1.x Benchmarking querying, reasoning, annotation, web service:
• Test suites for each type of tool
• Benchmarking supporting tools
T 2.1.1: Benchmarking Ontology Technology
(in D 2.1.1 Survey of Scalability Techniques for Reasoning with Ontologies)

[Concept-map figure: benchmarking, evaluation, and experimentation apply measurement to ontology technology/methods, yielding desired attributes, weaknesses, comparative analyses, ...; these feed continuous improvement, best practices, and recommendations]

• Overview of benchmarking, experimentation, and measurement
• State of the Art of ontology-based technology evaluation
T 2.1.4: Benchmarking methodology, criteria, and test suites

Methodology:
• Plan: 1. Goals identification; 2. Subject identification; 3. Management involvement; 4. Participant identification; 5. Planning and resource allocation; 6. Partner selection
• Experiment: 7. Experiment definition; 8. Experiment execution; 9. Experiment results analysis
• Improve: 10. Report writing; 11. Findings communication; 12. Findings implementation; 13. Recalibration

General evaluation criteria:
• Interoperability
• Scalability
• Robustness

Benchmark suites for:
• Ontology building tools
• Annotation tools
• Querying and reasoning services
• Semantic Web Services technology

Benchmarking supporting tools:
• Workload generators
• Test generators
• Statistical packages
• ...
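The supporting tools are only named above; as a concrete illustration, the sketch below shows what a minimal test generator for the scalability suites could look like. It is a hypothetical example (Python with rdflib; the example.org namespace and file names are invented), not part of any KW deliverable.

# bench_gen.py -- hypothetical workload generator for scalability tests
from rdflib import Graph, Literal, Namespace, RDF, RDFS

EX = Namespace("http://example.org/bench#")  # invented namespace

def generate_class_hierarchy(n_classes, out_file):
    """Write a synthetic RDF(S) ontology with a linear class hierarchy."""
    g = Graph()
    g.bind("ex", EX)
    parent = None
    for i in range(n_classes):
        cls = EX["Class%d" % i]
        g.add((cls, RDF.type, RDFS.Class))
        g.add((cls, RDFS.label, Literal("Class %d" % i)))
        if parent is not None:
            g.add((cls, RDFS.subClassOf, parent))
        parent = cls
    g.serialize(destination=out_file, format="xml")  # RDF/XML

# Growing workloads: time the same tool operation on each file.
for size in (100, 1000, 10000):
    generate_class_hierarchy(size, "bench_%d.rdf" % size)

A real workload generator would also vary the shape of the hierarchy (depth vs. breadth) and the mix of properties and instances, not only the class count.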
T 2.1.6: Benchmarking of ontology building tools

Partners/Tools: UPM, ...

Benchmark suites:
• Interoperability (x tests)
• Scalability (y tests)
• Robustness (z tests)
• ...

Interoperability:
• Do the tools import/export from/to RDF(S)/OWL?
• Are the imported/exported ontologies the same?
• Is there any knowledge loss during import/export?
• ...

Benchmark suites:
• RDF(S) import capability
• OWL import capability
• RDF(S) export capability
• OWL export capability

Experiments:
• Import/export RDF(S) ontologies
• Import/export OWL ontologies
• Check for knowledge loss
• ...

Experiment results (example): test 1: NO; test 2: OK; test 3: OK; ...

Benchmarking results:
• Comparative
• Weaknesses
• (Best) practices
• Recommendations
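These import/export experiments lend themselves to automation. Below is a minimal sketch (Python with rdflib; file names hypothetical) that treats graph isomorphism as a deliberately crude proxy for "no knowledge loss" after a tool has imported and re-exported an ontology; a real suite would also diff the graphs to report exactly what was lost.

# roundtrip_check.py -- hypothetical import/export experiment
from rdflib import Graph
from rdflib.compare import isomorphic

def round_trip_is_lossless(original_file, exported_file):
    """True if the ontology survives an import/export round trip.

    `original_file` is the test ontology fed to the tool under test;
    `exported_file` is what that tool wrote back. Both are RDF/XML.
    Isomorphism is a simple (and strict) stand-in for "same ontology".
    """
    before = Graph().parse(original_file, format="xml")
    after = Graph().parse(exported_file, format="xml")
    return isomorphic(before, after)

# Emits the same OK/NO verdicts shown in the experiment results above.
print("OK" if round_trip_is_lossless("test1.rdf", "test1_exported.rdf") else "NO")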
Index
Benchmarking activities in Knowledge Web
Benchmarking in WP 2.1
Benchmarking in WP 2.2
Benchmarking information repository
Benchmarking in Knowledge Web
T 2.2.2 Design of a benchmark suite for alignment

Why evaluate?
• Comparing the possible solutions;
• Detecting the best methods;
• Finding out where we do badly.

Two goals:
• For the developer: improving the solutions;
• For the user: choosing the best tools;
• For both: testing compliance with a norm.

How to evaluate?
• Take a real-life case and set a deadline
• Take several cases, normalizing them
• Take simple cases, identifying what each highlights (benchmark suite)
• Build a challenge (MUC, TREC)

Results:
• Benchmarking methodology for alignment techniques;
• Benchmark suite for alignment;
• First evaluation campaign;
• Greater benchmarking effort.
T 2.2.2 What has been done?

Information Interpretation and Integration Conference (I3CON), to be held at the NIST Performance Metrics for Intelligent Systems (PerMIS) Workshop: focuses on "real-life" test cases and compares global algorithm performance.

Facts:
• 7 ontology pairs;
• 5 participants;
• Undisclosed target alignments (independently made);
• Alignments requested in a normalized format;
• Evaluation on the F-measure.

Results:
• Difficult to find pairs in the wild (they had to be created);
• No dominating algorithm, and no single case most difficult for all;
• 5 participants was the targeted number; we must have more next time!
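Since the I3CON evaluation is on the F-measure, the scoring can be made concrete. The sketch below assumes each alignment has been reduced to a set of (entity, entity) URI pairs, ignoring correspondence relations and confidence values; it illustrates the measure, not the actual I3CON harness.

# fmeasure.py -- precision/recall/F-measure of an alignment (illustrative)
def precision_recall_f(found, reference):
    """Score a produced alignment against the target alignment.

    Both arguments are sets of (uri_from_ontology1, uri_from_ontology2)
    pairs; relations and confidences are deliberately ignored here.
    """
    correct = len(found & reference)
    p = correct / float(len(found)) if found else 0.0
    r = correct / float(len(reference)) if reference else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f

# Hypothetical two-correspondence example:
reference = {("o1#Car", "o2#Automobile"), ("o1#Person", "o2#Human")}
found = {("o1#Car", "o2#Automobile"), ("o1#Wheel", "o2#Tire")}
print(precision_recall_f(found, reference))  # (0.5, 0.5, 0.5)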
The Ontology Alignment Contest at the 3rd Evaluation of Ontology-based Tools (EON) Workshop, to be held at the International Semantic Web Conference (ISWC): aims at defining a proper set of benchmark tests for assessing feature-related behavior.

Facts:
• 1 ontology and 20 variations (15 hand-crafted to stress particular aspects);
• Target alignment (made on purpose) published;
• A paper requested, with comments on the tests and on the achieved results (as well as the results in a normalized format).

Results:
We are currently benchmarking the tools!
See you at the EON Workshop, ISWC 2004, Hiroshima, JP, November …
T 2.2.2 What’s next?
• More consensus on what’s to be done?
• Learn more
• Take advantage of the remarks
• Make a more complete: real-world + benchmark suite + challenge?
• Provide automated procedures
Index
Benchmarking activities in Knowledge Web
Benchmarking in WP 2.1
Benchmarking in WP 2.2
Benchmarking information repository
Benchmarking in Knowledge Web
Benchmarking information repository
Web pages inside the Knowledge Web portal with:
• General benchmarking information
(methodology, criteria, test suites, references, ...)
• Information about the different benchmarking activities in Knowledge Web
• Benchmarking results and lessons learned
• ...
Objectives:
• Inform
• Coordinate
• Share/reuse
• ...
Proposal for a benchmarking working group in the SDK cluster.
Index
Benchmarking activities in Knowledge Web
Benchmarking in WP 2.1
Benchmarking in WP 2.2
Benchmarking information repository
Benchmarking in Knowledge Web
What is benchmarking in Knowledge Web?

In Knowledge Web:
• Benchmarking is performed over products/methods (not processes)
• Benchmarking is not a continuous process: it ends with findings communication; there is no findings implementation or recalibration
• Benchmarking technology involves evaluating technology
• Benchmarking technology is NOT just evaluating technology: we must extract practices and best practices
• Benchmarking results (comparative analyses, weaknesses, (best) practices) lead to recommendations and (continuous) improvement
• Benchmarking results are needed, both in industry and research
• ...
How much do we share?

Benchmarking methodology, criteria, and test suites:
• Is the view about benchmarking from industry “similar” to the view from research?
• Is it viable to have a common methodology? Will anyone use it?
• Can the test suites be reused between industry and research?
• Would a common way of presenting test suites be useful?
• ...

Benchmarking results:
• Can research benchmarking results be (re)used by industry, and vice versa?
• Would a common way of presenting results be useful?
• ...
Next steps
Provide the benchmarking methodology to industry:
• First draft after the Manchester research meeting, 1st October.
• Feedback from WP 1.2 by the end of October.
• (Almost) final version by mid-November.
Set up web pages with benchmarking information in the portal:
• Benchmarking activities
• Methodology
• Criteria
• Test suites
Discuss on a mailing list and agree on a definition of “best practice”.
Next meeting? To be decided (around November, with O2I).