Wordnet Seminar

The National Library of Norway, Henrik Ibsens gate 110, Oslo
June 6th–June 7th 2011
Programme
Monday, June 6th
15.00
Coffee and tea
15.15–15.30
Introduction
Kristin Bakken, The National Library of Norway
15.30–16.30
Development of a Wordnet for Norwegian Bokmål and Nynorsk
Lars Nygaard, Kaldera språkteknologi AS
16.30–17.30
From WordNet, EuroWordNet to the Global Wordnet Grid
Piek Vossen, VU University Amsterdam
19.30
Dinner
Tuesday, June 7th
09.00
Coffee and tea
09.10–10.10
Compiling a Wordnet from a Conventional Dictionary
Bolette Sandford Pedersen, University of Copenhagen
10.10–11.10
Word Knowledge versus World Knowledge: Augmenting
WordNets with (Un)Common Sense for Robust Applications
Tony Veale, University College Dublin
11.10–11.30
Coffee Break
11.30–12.30
European Open Linguistic Infrastructure: an Industry Perspective
Raivis Skadiņš, Tilde, Latvia
12.30–13.00
Concluding Remarks
13.00
Lunch
Abstracts
Development of a Wordnet for Norwegian Bokmål and Nynorsk
Lars Nygaard, Kaldera språkteknologi AS
The topic for the talk will be methodology, technology and linguistic challenges in
the development of a Norwegian wordnet. We have chosen DanNet, a wordnet
for the closely related language Danish, as the basis for our database, and our
technique for semi-automatic translation will be demonstrated and evaluated,
along with a method for identifying inconsistencies in the underlying material.
There are several challenges in the lexicographic work, and we will present
suggestions for how the following issues should be resolved: 1) polysemy and
homonymy analysis; 2) coding of antonyms; 3) coding of prepositions and adverbs. In addition, we will present a web interface for the current version of our
wordnet.
From WordNet, EuroWordNet to the Global Wordnet Grid
Piek Vossen, VU University Amsterdam
In my presentation, I will briefly summarize the design and development of the
different wordnets in the past, starting from the English WordNet and moving on to the
multilingual wordnet databases developed in EuroWordNet, BalkaNet, Indian
Wordnet and Asian Wordnet. The multi- and cross-lingual design of the wordnet
databases has raised many fundamental issues with respect to the definition of
the relations, what defines a word, what defines a concept and what defines a
synset. I will explain the idea of the Global Wordnet Grid as an attempt to establish
a more formal framework for defining the semantics of the different wordnets in
the world in a uniform way, which will eventually lead to better standardization.
Compiling a Wordnet from a Conventional Dictionary
Bolette Sandford Pedersen, University of Copenhagen
My talk will address the compilation of computational lexical-semantic resources
from definitions and examples in conventional dictionaries. Seen in a historical
perspective, the lack of synergy between these two kinds of resources is
surprising. Containing an enormous amount of lexical and semantic knowledge,
dictionaries are a likely source of information for use in computational semantic
lexicons and semantic knowledge bases. While several studies have concluded
that the results of reuse experiments are disappointing, mainly due to inconsistent dictionaries, other recent experiments are much more promising. This new interest in reuse partly stems from the fact that dictionaries (be
they printed or electronic) are changing and improving rapidly together with
modern corpus and compilation facilities and are therefore becoming more
attractive as background resources for computational use.
One such recent reuse experiment is the Danish wordnet, DanNet, built on
the basis of a large, corpus-based printed dictionary of modern Danish (Den
Danske Ordbog). I will describe some of the methodological issues of compiling
this wordnet by re-using human-oriented, semantic descriptions in terms of
definitions and examples. More specifically, I will discuss the issues of readjusting inconsistent and underspecified hyponymy hierarchies taken from the conventional dictionary, sense distinctions as opposed to the synonym sets of typical
wordnets, generating semantic qualia relations on the basis of sense definitions
and examples, and finally, supplementing semantic information which is implicit
in conventional dictionaries.
Word Knowledge vs. World Knowledge: Augmenting WordNets with
(Un)Common Sense for Robust Applications
Tony Veale, University College Dublin
Picasso famously remarked that “Computers are useless. They can only give you
answers”. Though reflecting a blinkered view of computers, his aphorism skewers
a widespread tendency to prize the best answers while taking the best questions
for granted. Creative processes, in art and in science, are fundamentally
introspective and question-driven, for to find the right answers one must learn to
ask the right questions. Indeed, because questions often presuppose a shared
understanding of the world, these presuppositions can be a rich source of
knowledge even when the questions themselves go unanswered.
As repositories of word knowledge, with just a tantalizing veneer of world
knowledge, WordNets focus more on providing factual answers than inspiring
introspective questions. Yet the idea of a WordNet is not incompatible with an
open-ended, question-oriented emphasis on lexical knowledge. To understand
the creative use of words, as in novel metaphors, analogies and blends, a
speaker must introspect about what aspects of a lexico-semantic source domain
can extend to a target domain. How might we enrich our WordNets then, so that
they can specify the general form of the questions an introspective speaker will
use when producing or consuming creative language?
In this talk I will explore how WordNet-enriching knowledge of the world can be
acquired by harvesting presupposition-laden questions from the Web, and show
how these questions can in turn be used as a basis for further introspection in
creative language processing. Along the way, the talk will showcase a variety of
robust Web-based applications for creative language processing.
European Open Linguistic Infrastructure: an Industry Perspective
Raivis Skadiņš, Tilde