What’s going on in my Ph.D.?
A short report of my first year
Andrea Giovanni Nuzzolese

Summary
• Research questions and objectives
• State of the Art
• Main research activities
• Other research activities
• Attended schools and earned credits

Research Questions
1. How to recognize a possible source of Knowledge Patterns (KPs)?
2. How to mine meaning from data, e.g. Linked Data, into KPs?
3. What are the invariances, if any exist, for extracting KPs from data?
4. What could be a good automatic or semi-automatic method for extracting KPs?
5. Why is the extraction of KPs useful?
6. How can KR and Ontology Engineering benefit from KP extraction?

Research Objectives
• Recognize invariances from any possible source of KPs
• Validate the cognitive soundness of extracted KPs
  – Define a measure for the cognitive soundness of KPs
  – Design a benchmark for the cognitive soundness of KPs
• Design a method that allows efficient extraction of KPs from data
• Provide an implementation of the method
• Apply extracted KPs to specific tasks
  – e.g. exploratory search, recommendation systems, explanation systems, etc.

Design Patterns
• In software engineering, design patterns represent typical and recurring schemata of good [Gamma et al., 1994] and bad [Brown et al., 1998] software architectures
• They come from architecture
“Each pattern describes a problem that occurs over and over again in our environment, and then describes the core of the solution to that problem, in such a way that you can use this solution a million times over, without ever doing it the same way twice”

Knowledge Patterns
• In ontology engineering, KPs are general templates denoting recurring theory schemata and their transformation to create specific theories [Clark et al., 2000]
  – Ontology engineering is a modeling endeavor
• [Gangemi and Presutti, 2010] presents a cognitive vision of KPs close to the notion of frames as described by [Minsky, 1975] and [Baker et al., 1998]

Knowledge Patterns (cont’d)
• In [Gangemi and Presutti, 2010] a KP is a small unit of meaning
  – Task based
  – Well grounded
  – Cognitively sound

Ontology Extraction
• The generation of ontologies from formal and semi-formal data is frequently called Semantic Web Mining [Stumme, 2006] or Ontology Mining [d’Amato et al., 2010]
• [Völker and Niepert, 2011] presents a statistical approach to the induction of expressive schemata from large RDF repositories (Statistical Schema Induction)
• In the field of ontology learning from natural language text, [Cimiano et al., 2004] presents a method for inducing taxonomies by means of hierarchical clustering of context vectors
• [Jäschke et al., 2008] presents a method for discovering ontologies from folksonomies

Knowledge Pattern Extraction
• During my first year I focused on finding possible approaches to the problem of extracting KPs
• So far, two main directions for KP extraction have been analyzed
  – Top-down: the extraction of KPs from foundational ontologies (e.g. DOLCE), frames (e.g. FrameNet), thesauri and any other formal or semi-formal structure
  – Bottom-up: the extraction of KPs directly from data, e.g. RDBs, Linked Data

My articles (slide labels: Semion, Top-down, Bottom-up, App)
• A. G. Nuzzolese, A. Gangemi, V. Presutti and P. Ciancarini. Semion: a smart triplication tool. In O. Corcho and J. Völker, editors, Demo Poster of the 17th Conference on Knowledge Engineering and Knowledge Management, pp. 166-167. CEUR Workshop Proceedings, Lisbon, Portugal, 2010.
• A. G. Nuzzolese, A. Gangemi, V. Presutti and P. Ciancarini. Fine-tuning triplication with Semion. In V. Presutti, V. Svatek and F. Scharffe, editors, EKAW workshop on Knowledge Injection into and Extraction from Linked Data (KIELD2010), pp. 2-14. CEUR Workshop Proceedings, Lisbon, Portugal, 2010.
• A. G. Nuzzolese, A. Gangemi and V. Presutti. Gathering Lexical Linked Data and Knowledge Patterns from FrameNet. In Proc. of the 6th International Conference on Knowledge Capture (K-CAP), pp. 41-48. ACM, Banff, Alberta, Canada, 2011.
• A. G. Nuzzolese, A. Gangemi, V. Presutti and P. Ciancarini. Encyclopedic knowledge patterns from Wikipedia links. In L. Aroyo, N. Noy and C. Welty, editors, Proceedings of the 10th International Semantic Web Conference (ISWC2011), pp. 520-536. Springer, 2011.
• A. Musetti, A. G. Nuzzolese, F. Draicchio, V. Presutti, E. Blomqvist, A. Gangemi and P. Ciancarini. Aemoo: exploratory search based on knowledge patterns over the Semantic Web. To appear in Semantic Web Challenge 2011.

Semion
• Provides a method which
  – allows any data source to be reengineered to RDF triples
  – makes no assumptions about the domain semantics other than those customized by the user
• Is based on two main steps
  – a syntactic transformation of the data source
  – a rule-based refactoring

FrameNet LOD
• The contribution of this paper is twofold
  – the production and publishing of a LOD dataset for the FrameNet lexical database, and
  – the description of a method to produce knowledge patterns out of FrameNet frames
• For both contributions we use Semion

EKPs
• Presents a resource of Encyclopedic KPs that have been discovered by analyzing the Wikipedia page links dataset
• Describes the evaluation of the extracted EKPs with a user study
• Provides a bottom-up approach for extracting EKPs based on the Knowledge Architecture and the concept of Path

Aemoo
• Is a Web application supporting exploratory search over the Semantic Web based on Encyclopedic KPs
• Aggregates knowledge from
  – Linked Data
  – Wikipedia
  – Twitter
  – Google News
• Provides an effective summary of knowledge about an entity, including explanations
• Aemoo participated in the Semantic Web Challenge 2011, was selected for the final round and ranked fourth
  – http://challenge.semanticweb.org/

Relation articles-RQs
[Table: rows Semion 1-2, FrameNet LOD, EKPs, Aemoo; columns RQ1 to RQ5; cells marked X where an article addresses a research question]
① How to recognize a possible source of Knowledge Patterns?
② How to mine meaning from data, e.g. Linked Data, into KPs?
③ What are the invariances, if any exist, for extracting KPs from data?
④ What could be a good automatic or semi-automatic method for extracting KPs?
⑤ Why is the extraction of KPs useful?
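
As a rough illustration of the two steps listed on the Semion slide (a syntactic transformation of the data source, followed by a rule-based refactoring), the idea can be sketched in a few lines of Python. This is not Semion's actual implementation or API: the function names, the `src:` and `ex:` prefixes, the sample row and the rule format are all assumptions made for the example, and triples are plain tuples rather than real RDF.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


def syntactic_transformation(table: str, rows: List[Dict[str, str]]) -> List[Triple]:
    """Step 1 (hypothetical): mirror the source structure as triples.

    Every row becomes a subject and every column a predicate. The
    vocabulary only reflects the shape of the source (table, row, column),
    so no domain semantics is assumed at this stage.
    """
    triples: List[Triple] = []
    for index, row in enumerate(rows):
        subject = f"src:{table}/row{index}"
        triples.append((subject, "rdf:type", f"src:{table}"))
        for column, value in row.items():
            triples.append((subject, f"src:{table}#{column}", value))
    return triples


def refactor(triples: List[Triple], rules: Dict[str, str]) -> List[Triple]:
    """Step 2 (hypothetical): rewrite terms with user-supplied rules.

    Predicates and objects found in `rules` are replaced by their target
    term; everything else is kept unchanged. A real rule language would
    distinguish literals from resources, which this sketch does not.
    """
    return [(s, rules.get(p, p), rules.get(o, o)) for s, p, o in triples]


# Made-up sample data and rules, for illustration only.
rows = [{"name": "Alice", "city": "Bologna"}]
raw = syntactic_transformation("person", rows)

rules = {
    "src:person": "foaf:Person",
    "src:person#name": "foaf:name",
    "src:person#city": "ex:livesIn",
}
refined = refactor(raw, rules)

for triple in refined:
    print(triple)
```

Keeping the two steps separate means the domain semantics lives entirely in the user-supplied rules, which matches the slide's point that no assumption about the domain is fixed in advance.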

Other Research Activities
• Interactive Knowledge Stack (IKS)
  – IKS is an Integrating Project part-funded by the European Commission
  – it will provide an open source technology platform for semantically enhanced CMS
• Apache Stanbol
  – is a modular software stack and reusable set of components for semantic content management
  – currently, more than 200 blogs run WordLift, a plug-in for WordPress based on the refactor engine derived from Semion

Technical Reports in IKS
• A. Adamou, E. Blomqvist, C. E. Bonafede, P. Ciancarini, E. Daga, A. Musetti, A. G. Nuzzolese, V. Presutti, S. Germesin and M. Romanelli. Knowledge Representation and Reasoning System (KReS) - Beta Version Report. Technical report, IKS Consortium, 2010.
• A. Adamou, E. Blomqvist, C. E. Bonafede, E. Daga, A. G. Nuzzolese and V. Presutti. Knowledge Representation and Reasoning System (KReS) - Alpha Version Report. Technical report, IKS Consortium, 2010.
• A. Adamou, E. Blomqvist, A. Gangemi, A. G. Nuzzolese, V. Presutti, W. Behrendt, D. Violeta and A. Conconi. IKS deliverable 3.2: Ontological requirements for industrial CMS applications. Technical report, IKS Consortium, 2010.
• W. Kasper, J. Steffen, A. G. Nuzzolese and V. Presutti. IKS deliverable 3.3: Requirements for semantic lifting/wrapping components. Technical report, IKS Consortium, 2010.

Lecturer
• Invited lecturer at Jönköping University for the master course in Information Retrieval
• Tutor at UniBO for the master course in Knowledge Management and Data Mining
• Tutor at UniBO for the master course in Computer System Security

Attended courses
• Bertinoro International Spring School
  – Information Integration, Prof. Maurizio Lenzerini, Università La Sapienza, Rome (Italy)
  – Computational Aspects of Game Theory, Prof. Bruno Codenotti, Consiglio Nazionale delle Ricerche, Pisa (Italy)
  – Model Checking: From Finite-state to Infinite-state Systems, Prof. Giorgio Delzanno, Università di Genova (Italy)
  – Trust in Anonymity Networks, Prof. Vladimiro Sassone, University of Southampton (United Kingdom)
• Computational Ontologies, Dr. Valentina Presutti and Dr. Aldo Gangemi, CNR-ISTC, Rome (Italy)
• 8th European Summer School on Ontological Engineering and the Semantic Web, Cercedilla (Spain)
• Embedded Real Time Systems, Prof. Fabio Panzieri, Università di Bologna (Italy) and Prof. Tullio Vardanega, Università di Padova (Italy)

Earned credits
Course                                                        Credits
Model Checking: From Finite-state to Infinite-state Systems   1
Trust in Anonymity Networks                                   1
Computational Ontologies                                      1
Embedded Real Time Systems                                    1
SSSW                                                          1.5
Information Integration                                       1
TOTAL                                                         5.5 (6.5)

Thank you