AG Nuzzolese

What’s going on in my Ph.D.?
A short report of my first year
Andrea Giovanni Nuzzolese
Summary
• Research questions and objectives
• State of the Art
• Main research activities
• Other research activities
• Attended schools and earned credits
Research Questions
1. How to recognize a possible source of Knowledge Patterns?
2. How to mine meaning from data, e.g. Linked Data, to KPs?
3. What are the invariances, if any exist, for extracting KPs
from data?
4. What could be a good automatic or semi-automatic method
for extracting KPs?
5. Why is the extraction of KPs useful?
6. How can KR and Ontology Engineering benefit from KP
extraction?
Research Objectives
• Being able to recognize invariances from any possible source
of KPs
• Validate cognitive soundness of extracted KPs
– Define a measure for cognitive soundness of KPs
– Design a benchmark for cognitive soundness of KPs
• Design a method which allows an efficient extraction of KPs
from data
• Provide an implementation of the method
• Apply extracted KPs for specific tasks
– e.g. exploratory search, recommendation systems, explanation
systems, etc…
Design Patterns
• In software engineering, design patterns
represent typical and recurring schemata
of good [Gamma et al., 1994] and bad
[Brown et al., 1998] software architectures
• They come from architecture
“Each pattern describes a problem that occurs over
and over again in our environment, and then
describes the core of the solution to that problem, in
such a way that you can use this solution a million
times over, without ever doing it the same way twice”
Knowledge Patterns
• In ontology engineering KPs are general
templates denoting recurring theory
schemata and their transformation to
create specific theories [Clark et al., 2000]
– Ontology engineering is a modeling endeavor
• [Gangemi and Presutti, 2010] presents a
cognitive vision of KPs close to the notion
of frames as described by [Minsky, 1975]
and [Baker et al., 1998]
Knowledge Patterns (cont’d)
• In [Gangemi and Presutti, 2010] a KP is a
small unit of meaning
– Task based
– Well grounded
– Cognitively sound
Ontology Extraction
• The generation of ontologies from formal and semi-formal
data is frequently called Semantic Web Mining [Stumme,
2006] or Ontology Mining [d’Amato et al., 2010]
• [Völker and Niepert, 2011] presents a statistical approach to
the induction of expressive schemata from large RDF
repositories (Statistical Schema Induction)
• In the field of ontology learning from natural language text
[Cimiano et al., 2004] presents a method for inducing
taxonomies by means of hierarchical clustering of context
vectors
• [Jäschke et al., 2008] presents a method for discovering
ontologies from folksonomies
Knowledge Pattern Extraction
• During my first year I focused on finding possible
approaches to the problem of KP extraction
• So far two main directions for KP extraction have
been analyzed
– Top-down: the extraction of KPs from foundational
ontologies (e.g. DOLCE), frames (e.g. FrameNet),
thesauri and any other formal or semi-formal structure
– Bottom-up: the extraction of KPs directly from data,
e.g. RDBs, Linked Data
My articles
• Semion
– A. G. Nuzzolese, A. Gangemi, V. Presutti and P. Ciancarini. Semion: a smart triplication
tool. In O. Corcho and J. Völker, editors, Demo Poster of the 17th Conference on
Knowledge Engineering and Knowledge Management, pp. 166-167. CEUR Workshop
Proceedings, Lisbon, Portugal, 2010.
– A. G. Nuzzolese, A. Gangemi, V. Presutti and P. Ciancarini. Fine-tuning triplication with
Semion. In V. Presutti, V. Svatek, and F. Share, editors, EKAW workshop on Knowledge
Injection into and Extraction from Linked Data (KIELD2010), pp. 2-14. CEUR Workshop
Proceedings, Lisbon, Portugal, 2010.
• Top-down
– A. G. Nuzzolese, A. Gangemi, and V. Presutti. Gathering Lexical Linked Data and
Knowledge Patterns from FrameNet. In Proc. of the 6th International Conference on
Knowledge Capture (K-CAP), pp. 41-48. ACM, Banff, Alberta, Canada, 2011.
• Bottom-up
– A. G. Nuzzolese, A. Gangemi, V. Presutti, P. Ciancarini. Encyclopedic knowledge
patterns from Wikipedia links. In: Aroyo, L., Noy, N., Welty, C. (eds.) Proceedings of the
10th International Semantic Web Conference (ISWC2011). Springer, pp. 520-536, 2011.
• App
– A. Musetti, A. G. Nuzzolese, F. Draicchio, V. Presutti, E. Blomqvist, A. Gangemi, P.
Ciancarini. Aemoo: exploratory search based on knowledge patterns over the Semantic
Web. To appear in Semantic Web Challenge 2011.
Semion
• Provides a method which
– allows any data source to be reengineered into RDF
triples
– makes no assumptions about the domain semantics
other than those customized by the user
• Is based on two main steps
– a syntactic transformation of the data source
– a rule-based refactoring
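The two steps above can be illustrated with a minimal Python sketch. It is not Semion's actual implementation or API: the function names, the `src:` prefix, and the sample table and rules are all hypothetical, and the refactoring step is reduced to simple term substitution to keep the example short.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


def syntactic_transformation(table: str, rows: List[Dict[str, str]]) -> List[Triple]:
    """Step 1 (hypothetical): mirror the source structure as triples.

    Only structural terms (table, row, column) are used, so no domain
    semantics is assumed at this stage.
    """
    triples: List[Triple] = []
    for i, row in enumerate(rows):
        subject = f"src:{table}/row{i}"
        triples.append((subject, "rdf:type", f"src:{table}"))
        for column, value in row.items():
            triples.append((subject, f"src:{table}#{column}", value))
    return triples


def refactor(triples: List[Triple], rules: Dict[str, str]) -> List[Triple]:
    """Step 2 (hypothetical): rewrite terms using user-supplied rules.

    Terms without a matching rule are kept unchanged.
    """
    return [(rules.get(s, s), rules.get(p, p), rules.get(o, o)) for s, p, o in triples]


# Illustrative data source: one row of a relational table called "person".
rows = [{"name": "Alice", "city": "Bologna"}]
raw = syntactic_transformation("person", rows)

# User-defined rules carry the domain semantics (vocabulary choices are examples).
rules = {
    "src:person": "foaf:Person",
    "src:person#name": "foaf:name",
    "src:person#city": "ex:basedIn",
}
refactored = refactor(raw, rules)

for triple in refactored:
    print(triple)
```

Keeping the first step purely structural means the same transformation can be reused for any source, while all domain-specific choices stay in the user's rules.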
FrameNet LOD
• The contribution of this paper is twofold
– the production and publishing of a LOD
dataset for the FrameNet lexical database,
and
– the description of a method to produce
knowledge patterns out of FrameNet frames
• For both contributions we use Semion
EKPs
• Presents a resource of Encyclopedic KPs
that have been discovered by analyzing the
Wikipedia page links dataset
• Describes the evaluation of the extracted
EKPs with a user study
• Provides a bottom-up approach for extracting
EKPs based on the Knowledge Architecture
and the concept of Path
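As a rough illustration of a bottom-up approach over page links, the sketch below counts how often pages of one type link to pages of another type. It assumes, for illustration only, that a path is the pair (type of the linking page, type of the linked page) and that a simple relative-frequency threshold selects the frequent ones. The function names, the sample data and the threshold are hypothetical and are not the measure defined in the paper.

```python
from collections import Counter
from typing import Dict, List, Tuple

Path = Tuple[str, str]


def count_type_paths(links: List[Tuple[str, str]], types: Dict[str, str]) -> Counter:
    """Count (source type, target type) pairs over page links.

    Links whose source or target has no known type are skipped.
    """
    paths: Counter = Counter()
    for source, target in links:
        if source in types and target in types:
            paths[(types[source], types[target])] += 1
    return paths


def frequent_targets(paths: Counter, subject_type: str, threshold: float) -> List[str]:
    """Return target types whose share of the subject type's links reaches the threshold."""
    total = sum(n for (s, _), n in paths.items() if s == subject_type)
    if total == 0:
        return []
    return sorted(
        o for (s, o), n in paths.items() if s == subject_type and n / total >= threshold
    )


# Illustrative page links and page types (made-up sample, not the real dataset).
links = [
    ("Bologna", "Italy"),
    ("Milan", "Italy"),
    ("Bologna", "Umberto_Eco"),
    ("Milan", "Untyped_Page"),
]
types = {"Bologna": "City", "Milan": "City", "Italy": "Country", "Umberto_Eco": "Person"}

paths = count_type_paths(links, types)
print(frequent_targets(paths, "City", threshold=0.5))
```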
Aemoo
• Is a Web application supporting exploratory search over the
Semantic Web based on Encyclopedic KPs
• Aggregates knowledge from
– Linked Data
– Wikipedia
– Twitter
– Google News
• Provides an effective summary of knowledge about an entity,
including explanations
• Aemoo participated in the Semantic Web Challenge 2011, was
selected for the final round and ranked 4th
– http://challenge.semanticweb.org/
Relation articles-RQs
[Table: rows Semion 1-2, FrameNet LOD, EKPs, Aemoo; columns RQ1-RQ5; an X marks each research question an article addresses]
① RQ1: How to recognize a possible source of Knowledge Patterns?
② RQ2: How to mine meaning from data, e.g. Linked Data, to KPs?
③ RQ3: What are the invariances, if any exist, for extracting KPs from data?
④ RQ4: What could be a good automatic or semi-automatic method for extracting KPs?
⑤ RQ5: Why is the extraction of KPs useful?
Other Research Activities
• Interactive Knowledge Stack (IKS)
– IKS is an Integrating Project part-funded by the
European Commission
– it will provide an open source technology platform
for semantically enhanced CMS
• Apache Stanbol
– is a modular software stack and reusable set of
components for semantic content management
– currently, there are more than 200 blogs that run
WordLift, a plug-in for WordPress based on the
refactor engine derived from Semion.
Technical Reports in IKS
• A. Adamou, E. Blomqvist, C. E. Bonafede, P. Ciancarini, E. Daga,
A. Musetti, A. G. Nuzzolese, V. Presutti, S. Germesin, M. Romanelli.
Knowledge Representation and Reasoning System (KReS) - Beta
Version Report. Technical report, IKS Consortium, 2010.
• A. Adamou, E. Blomqvist, C. E. Bonafede, E. Daga, A. G. Nuzzolese
and V. Presutti. Knowledge representation and reasoning system
(KReS) - Alpha Version Report. Technical report, IKS Consortium,
2010.
• A. Adamou, E. Blomqvist, A. Gangemi, A. G. Nuzzolese, V. Presutti,
W. Behrendt, D. Violeta and A. Conconi. IKS deliverable 3.2:
Ontological requirements for industrial CMS applications. Technical
report, IKS Consortium, 2010.
• W. Kasper, J. Stefen, A. G. Nuzzolese and V. Presutti. IKS
deliverable 3.3: Requirements for semantic lifting/wrapping
components. Technical report, IKS Consortium, 2010.
Lecturer
• Invited lecturer at Jönköping University
for the master course in Information
Retrieval
• Tutor at UniBO for the master course in
Knowledge Management and Data Mining
• Tutor at UniBO for the master course in
Computer System Security
Attended courses
• Bertinoro International Spring School
– Information Integration, Prof. Maurizio Lenzerini, Università La Sapienza,
Rome (Italy)
– Computational Aspects of Game Theory, Prof. Bruno Codenotti,
Consiglio Nazionale delle Ricerche, Pisa (Italy)
– Model Checking: From Finite-state to Infinite-state Systems, Prof.
Giorgio Delzanno, Università di Genova (Italy)
– Trust in Anonymity Networks, Prof. Vladimiro Sassone, University of
Southampton (United Kingdom)
• Computational Ontologies, Dr. Valentina Presutti and Dr. Aldo
Gangemi, CNR-ISTC, Rome (Italy)
• 8th European Summer School on Ontological Engineering and the
Semantic Web, Cercedilla (Spain)
• Embedded Real Time Systems, Prof. Fabio Panzieri, Università di
Bologna (Italy) and Prof. Tullio Vardanega, Università di Padova
(Italy)
Earned credits
Course                                                         Credits
Model Checking: From Finite-state to Infinite-state Systems    1
Trust in Anonymity Networks                                    1
Computational Ontologies                                       1
Embedded Real Time Systems                                     1
SSSW                                                           1.5
Information Integration                                        1
TOTAL                                                          5.5 (6.5)
Thank you