The Automatic Generation of Formal Annotations in a MultiMedia Indexing and Searching Environment

Thierry Declerck, Peter Wittenburg and Hamish Cunningham
DFKI GmbH, Max-Planck-Institut für Psycholinguistik and University of Sheffield
ACL/EACL2001 Workshop on Human Language Technology and Knowledge Management

The MUMIS Consortium

Partner | Location | Role
CTIT | University of Twente, Enschede, NL | NLP/IE
TSI | University of Nijmegen, Nijmegen, NL | ASR
DFKI | Saarbrücken, D | NLP/IE
MPI | Nijmegen, NL | Online SW
DCS | University of Sheffield, UK | NLP/IE
ESTEAM | Gothenburg, SE (location Athens, GR) | Translation Software
VDA | Hilversum, NL | Dissemination

Objectives

• Technology development to automatically index (with formal annotations) lengthy multimedia recordings (off-line process)
  Find and annotate relevant events, together with the involved entities and relations
• Technology development to exploit indexed multimedia archives (on-line process)
  Search for interesting scenes and play them via Internet

Test Domain: Soccer Games / UEFA Tournament 2000

Off-line Task

• Automatic Speech Recognition (Radio/TV Broadcasts)
  Automatically transforms the speech signals into texts (for 3 languages: Dutch, English and German)
• Natural Language Processing (Information Extraction)
  Analyse all available textual documents (newspapers, speech transcripts, tickers, formal texts ...), identify and extract interesting entities, relations and events
• Merging all the annotations produced so far
• Create a database with formal annotations

The Generation of Formal Annotations

• Metadata (type of game, teams, date, final score, players etc.), which can be used, among other things, for classifying and filtering videos in the MM digital archive
• Events (particular actions with time codes, involved entities and related events), as they can be extracted from the video sequences
• All Formal Annotations available in XML Standard

The Event Table

• Related to domain ontology and multilingual terminology
• Guiding the generation of formal annotations

Event | ID | Time | Subcat/Modification | Metadata
Final whistle | # | 90>t>120 | Subj=referee, score etc… | Final score
Goal kick | # | 0>t>120 | Subj=pl, loc=loc, cons=cons, … |
Dribbling | # | 0>t>120 | Subj=pl, loc=loc, … |
Substitution | # | 0>t>120 | Subj=pl, I.obj=pl, cause=c, … | Team (adding pl)
Red Card | # | 0>t>120 | Subj=ref, I.obj=pl, cause=c, … | Team (red at t)
Goal | # | 0>t>=pen | Subj=pl, I.obj=team, score=s, … | Order of goal

Off-line Task

[Diagram: newspaper texts (3 languages), audio commenting from TV and radio (3 languages) and closed-caption texts (3 languages) feed multilingual IE => event tables]

Merging of Annotations

Annotation records shown on the slide:

Event = goal; Player = Basler; Dist. = 25 m; Time = 18; Score = 1:0
Event = goal; Type = Freekick; Player = Basler; Dist. = 25 m; Time = 17; Score: leading
Event = goal; Type = Freekick; Player = Basler; Team = Germany; Time = 18; Score = 1:0; Final score = 1:0; Distance = 25 m
Event = goal; Player = Basler; Team = Germany; Time = 18; Score = 1:0; Final score = 1:0

[Diagram: events indexed in the video recording. Labels: Foul, Freekick, Goal, Pass, Defense, Dribbling; time marks 17 min, 18 min, 24 min, 28 min; score 1:0; players Neville, Basler, Matthäus, Campbell, Scholl; distances 25 m, 25 m, 60 m]

The Role of IE in MUMIS

• Information Extraction (IE) is the task of identifying, collecting and normalizing relevant information for a specific application or user.
• The relevant information is typically represented in the form of predefined “templates”, which are filled by means of Natural Language (NL) analysis (Template = Event Table in MUMIS)
• IE combines pattern matching mechanisms, (shallow) NLP and domain knowledge (terminology and ontology).

Extension of our IE system in MUMIS

• Multilingual and multisource IE. Incremental information building
• Cross-document co-reference resolution
• Combine Metadata and event extraction => better organisation and dynamic updating of information (KM)
• Multiple presentation of results: Template, Event table and Hyperlinks (Named Entities, related to KM)

Example of Processing Formal Texts

• Formal Text
• The Formal Text annotated with domain-specific information

Example of Processing Semi-Formal Texts

• Semi-Formal Text
• The Semi-Formal Text annotated with domain-specific information

Merging Component

• Acting on the generated formal annotations (Metadata and Events), but also interleaving with the process that generates them
• Checking consistency, eliminating redundancy (Template Merging), in accordance with domain ontology
• Completing the information with domain knowledge, inference machine

On-line Tasks: Searching and Displaying

• Search for interesting events with formal queries
  Give me all goals by Overmars scored with his head in the first half.
  Event=Goal; Player=Overmars; Time<=45; Previous-Event=Headball
• Indicate hits by thumbnails & let user select scene
• Play scene via the Internet & allow scrolling etc.
  Of course: slow motion, fast play, start/stop, etc.
• User Guidance (Lexica and Ontology)

On-line Tasks: Knowledge Guided User Interface & Search Engine

[Diagram: prototype demo. Labels: game list (München - Ajax 1998, München - Porto 1996, Deutschland - Brasilien 1998); event timeline (Foul, Freekick, Goal, Pass, Defense, Dribbling; 17 min, 18 min, 24 min, 28 min; 1:0; Neville, Basler, Matthäus, Campbell, Scholl; 25 m, 25 m, 60 m); Play Movie Fragment of that Game]

More about MPEG (Moving Picture Experts Group)

• MPEG-1: low-level media encoding and compression format (VHS quality, ~ 2-3 Mbps, good for streaming)
• MPEG-2: improved media encoding and compression format (S-VHS quality, ~ 5-10 Mbps, digital TV and DVD standard)
• MPEG-4: codes content as objects and enables those objects to be manipulated at the client, optimized compression

On-line SW Architecture

Client-Server structure:
• fully distributed
• JMF media presentation
• RMI-based interaction

[Architecture diagram. Labels: Client Applet, JMF, Client Objects, Ontology, Lexica, HTTP, RMI, (RTP, RTSP), WWW Server, Java Server, Query Engine Objects, Media Server Objects, JDBC, rDBMS, Media Servers (MPEG1), File Server; stored data: MPEG Movies, Keyframes, Annotations, Metadata]

Query interface:
• automatic pre-selection
• guided by domain knowledge
• interactive, visual feedback

On-line HW Architecture

[Hardware diagram. Labels: Media Servers, RAID, 1 Gbps, Gb-Switch, Router, FC Switch, Tape Library, Internet]

• efficient & reliable storage management (near-line capacity, media change, second location)
• high storage capacity (n TB, 1 h MPEG1 = 1 GB)
• powerful media servers / powerful network

UI / Annotation

• UI optimization
  • thumbnails not that informative
  • which thumbnail? (several around time mark)
  • automatic thumbnail adjustment?
  • seamless operation for user
  • lexicon/ontology guidance
  • user driven input
• Manual annotation tools
  • MediaTagger
  • EUDICO

Gain - User Group

Current Procedure:
• Manual Video Annotation
• Integration Central DB
• Query via PC
• Results on PC
• Contact Video Archive
• Get Video Tapes
• Search on Tape on VCR
• Segment & Play

MUMIS Procedure:
• Automatic Video Annotation and DB Integration
• Query via PC
• Results on PC and Select & Play

• What gets lost? Is it necessary?
• Potential: direct Internet Service, fewer dependencies

Acknowledgements

• UEFA
• DFB, FA, KNVB
• EBU, WDR, NOS, SWR

Allez les Bleus!!
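Illustration (not part of the original slides): the Merging of Annotations and Searching and Displaying slides can be made concrete with a minimal Python sketch. Everything in it is an assumption made for illustration, not the MUMIS implementation: the function names (same_event, unify, merge_annotations, matches), the matching rule (same event type and player, time codes at most one minute apart), the majority vote used to settle conflicting values, and the normalised attribute names ("Distance" for "Dist.", "Final score" for "Finalscore"). The sample records are adapted from the Basler goal on the merging slide. The query covers only the "=" and "<=" conditions of the formal query; the Previous-Event condition is left out.

```python
from collections import Counter


def same_event(a, b, tolerance=1):
    # Assumed matching rule: identical event type and player, and
    # time codes (in minutes) that differ by at most `tolerance`.
    return (
        a["Event"] == b["Event"]
        and a["Player"] == b["Player"]
        and abs(a["Time"] - b["Time"]) <= tolerance
    )


def unify(group):
    # Combine one group of matching annotations into a single record.
    # For each attribute the most frequent value wins; on a tie the
    # value seen first is kept.
    merged = {}
    for record in group:
        for key in record:
            if key not in merged:
                values = [r[key] for r in group if key in r]
                merged[key] = Counter(values).most_common(1)[0][0]
    return merged


def merge_annotations(annotations):
    # Group annotations that describe the same event, then unify each group.
    groups = []
    for record in annotations:
        for group in groups:
            if same_event(group[0], record):
                group.append(record)
                break
        else:
            groups.append([record])
    return [unify(group) for group in groups]


def matches(record, query):
    # `query` maps an attribute to (operator, value); only "=" and "<="
    # are supported in this sketch.
    for key, (op, value) in query.items():
        if key not in record:
            return False
        if op == "=" and record[key] != value:
            return False
        if op == "<=" and not record[key] <= value:
            return False
    return True


# Three annotations of the same goal, as three sources might report it.
sources = [
    {"Event": "goal", "Player": "Basler", "Distance": "25 m",
     "Time": 18, "Score": "1:0"},
    {"Event": "goal", "Type": "Freekick", "Player": "Basler",
     "Distance": "25 m", "Time": 17, "Score": "leading"},
    {"Event": "goal", "Player": "Basler", "Team": "Germany",
     "Time": 18, "Score": "1:0", "Final score": "1:0"},
]

merged = merge_annotations(sources)

# Formal query in the style of "Event=Goal; Player=...; Time<=45".
query = {"Event": ("=", "goal"), "Player": ("=", "Basler"), "Time": ("<=", 45)}
hits = [record for record in merged if matches(record, query)]

print(merged)
print(hits)
```

A majority vote is only one possible way to settle conflicts such as Time = 17 versus Time = 18. The Merging Component slide describes consistency checking against the domain ontology, which this sketch does not attempt.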