Approximating Textual Entailment with LFG and FrameNet Frames
Aljoscha Burchardt, Anette Frank
Computational Linguistics Department
Saarland University, Saarbrücken
Second Pascal Challenge Workshop
Venice, April 2006
SALSA-WS 09/05

Outline of this Talk
• Frame Semantics
• A baseline system for approximating Textual Entailment
  – LFG syntactic analyses with
  – Frame semantics
  – Statistical decision: entailed?
• Walk-through example from RTE 2006
• RTE 2006 results / brief conclusions

Frame Semantics (Fillmore 1976, Fillmore et al. 2003)
• Lexical semantic classification of predicates and their argument structure
• A frame represents a prototypical situation (e.g. Commercial_transaction, Theft, Awareness)
• A set of roles identifies the participants or propositions involved
• Frames are organized in a hierarchy
• Berkeley FrameNet Project db: 600 frames, 9,000 lexical units, 135,000 annotated sentences

Linguistic Normalizations (Frame: Commerce_buy)
Roles marked in the examples: Buyer, Seller, Goods, Money
• Voice: active / passive
  – BMW bought Rover from British Aerospace.
  – Rover was bought by BMW, which financed [...] the new Range Rover.
• Lexicalization
  – BMW, which acquired Rover in 1994, is now dismantling the company.
• POS: verb / noun
  – BMW's purchase of Rover for $1.2 billion was a good move.

Frame Semantics for RTE
Focusing on lexical semantic classes and role-based argument structure
– Built-in normalizations help to determine semantic similarity at a high level of abstraction
– Disregarding aspects of “deep” semantics: negation, modality, quantification, ...
– Open for deeper modeling on demand (e.g. our treatment of modality)

A Baseline System for Approximating Textual Entailment
• Fine-grained LFG-based syntactic analysis
  – English LFG grammar (Riezler et al. 2002)
  – Wide-coverage with high-quality probabilistic disambiguation
• Frame Semantics
  – Shallow lexical-semantic classification of predicate-argument structure
  – Extensions: WordNet senses, SUMO concepts
• Computing structural and semantic overlap of t (text) and h (hypothesis)
  – Hypothesis: large overlap ≈ entailment

A Baseline System for Approximating Textual Entailment
[Architecture diagram]
• Linguistic Analyses
  – text: LFG f-structure graph w/ frames & concepts
  – hypothesis: LFG f-structure graph w/ frames & concepts
• Computing Semantic Overlap
  – text-hypothesis match graph
  – different types of matches (aspects of similarity)
• Statistical Decision: Entailment?
  – Feature extraction: lexical, syntactic, semantic structure & overlap measures
  – Model training & classification

Linguistic Components
• XLE parsing: LFG f-structure
• Fred / Detour / Rosy: frames & roles
• WordNet-based WSD: WordNet & SUMO
• F-structure w/ semantics projection
• Rule-based: extend & refine sem. proj.
  – NEs, Locations
  – Co-reference
  – Modality, etc.
• Using XLE term rewriting system (Crouch 2005)

Example from RTE 2006
Pair 716
Text: In 1983, Aki Kaurismäki directed his first full-time feature.
Hypothesis: Aki Kaurismäki directed a film.

LFG F-Structures
[Figure: LFG f-structures]

Automatic Frame Annotation for Text (SALTO Viewer)
[Figure: annotated Collins parse]
• Fred & Rosy: frames & roles (statistical)
• Detour System: frames (via WordNet)

Automatic Frame Annotation for Hypothesis
716_h: Aki Kaurismäki directed a film.

LFG + Frames for Hypothesis (FEFViewer)
Rule-based (LFG-NER)
Aki Kaurismäki directed a film.

Hypothesis-Text-Match Graphs
Computing structural and semantic overlap
Match graph bundles overlapping partial graphs marked by match types
• Aspects of similarity
  – Syntax-based (i.e. lexical and structural): Identical predicates (attributes) trigger node (edge) matches.
  – Semantics-based: Identical frames/concepts (roles) trigger node (edge) matches.
• Degrees of similarity
  – Strict matching
  – Weak matching conditions for non-identical predicates:
    • “Structurally related”, e.g. via coreference (relative clauses, appositives, pronominals)
    • “Semantically related” via WordNet, Frame-Relations

[Figure: match graph for pair 716]
t: In 1983, Aki Kaurismäki directed his first full-time feature.
h: Aki Kaurismäki directed a film.
Match types shown: grammatically related, WordNet related

Statistical Modeling
• Feature extraction on the basis of
  – Syntactic, semantic matches (of different types)
  – Matching clusters' sizes
  – Ratio (matched vs. hypothesis)
  – (Non-)matching modality
  – RTE-task, fragmentary (parse), …
• Training/classification with WEKA tool
  – Feature selection
    1. Predicate matches
    2. Frame overlap
    3. Matching cluster size
  – Model 1: Conjunctive rule (Feat. 1,2)
  – Model 2: LogitBoost (Feat. 1,2,3)

RTE 2006 Results

           all tasks   IE     IR     QA     SUM
Model 1    59.0        49.5   59.5   54.5   72.5
Model 2    57.8        48.5   58.5   57.0   67.0

• SUM (and IR) are natural tasks for Frame Semantics; IE and QA need deeper modeling (aboutness vs. factivity)
• Error analysis
  – True positives: high semantic overlap
  – True negatives: 27% involve modality mismatches
  – False examples: poor modeling of dissimilarity
    • Many high-frequency features measuring similarity
    • Few low-frequency features measuring dissimilarity

Brief Conclusions
• Good approximation of semantic similarity
  – Deep LFG syntactic analyses integrated with
  – Shallow lexical Frame Semantics (plus other lex. resources)
  – Match graph measuring overlap
• Need better model for semantic dissimilarity
  – Too few rejections (false positives >> false negatives)
• Towards deeper modeling
  – Treatment of modal contexts
  – Integration of lexical inferences
  – Open for collaborations

LFG + Frames for Hypothesis (FEF)

stmt_type(f(0),declarative). tense(f(0),past). mood(f(0),indicative).
pred(f(0),direct). dsubj(f(0),f(7)). dobj(f(0),f(2)).
pred(f(2),film). num(f(2),sg). det_type(f(2),indef).
proper(f(7),name). pred(f(7),'Kaurismaki'). num(f(7),sg). mod(f(7),f(10)).
proper(f(10),name). pred(f(10),'Aki'). num(f(10),sg).

sslink(f(0),s(41)). sslink(f(2),s(42)). sslink(f(7),s(45)). sslink(f(10),s(59)).

frame(s(41),'Behind_the_scenes'). artist(s(41),s(45)). production(s(41),s(42)).
frame(s(42),'Behind_the_scenes').
frame(s(45),'People'). person(s(45),s(59)). person(s(45),s(45)).

ont(s(41),s(48)). ont(s(42),s(49)). ont(s(45),s(56)).
wn_syn(s(48),'direct#v#11'). sumo_sub(s(48),'Steering'). milo_sub(s(48),'Steering').
wn_syn(s(49),'film#n#1'). sumo_sub(s(49),'MotionPicture'). milo_sub(s(49),'MotionPicture').
sumo_syn(s(56),'Human'). sumo_syn(s(58),'Human').
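
The FEF facts above and the "Ratio (matched vs. hypothesis)" feature can be illustrated with a small sketch. It is not the authors' implementation: the system builds a match graph over f-structures and semantic projections, whereas this sketch only compares sets of predicates and frames. The text-side facts, the function names, and the thresholds of the conjunctive rule are all hypothetical and chosen for illustration.

```python
import re

# Subset of the hypothesis facts shown above ("Aki Kaurismäki directed a film.").
HYPOTHESIS_FACTS = """
pred(f(0),direct). dsubj(f(0),f(7)). dobj(f(0),f(2)).
pred(f(2),film). pred(f(7),'Kaurismaki'). pred(f(10),'Aki').
sslink(f(0),s(41)). sslink(f(2),s(42)). sslink(f(7),s(45)).
frame(s(41),'Behind_the_scenes'). frame(s(42),'Behind_the_scenes').
frame(s(45),'People').
"""

# HYPOTHETICAL text-side facts, invented for this example only;
# the slides do not show the analysis of the text.
TEXT_FACTS = """
pred(f(0),direct). pred(f(3),feature).
pred(f(5),'Kaurismaki'). pred(f(6),'Aki').
frame(s(1),'Behind_the_scenes'). frame(s(2),'People').
"""

# Matches binary facts such as pred(f(0),direct). or frame(s(41),'People').
FACT_RE = re.compile(r"(\w+)\(([fs]\(\d+\)),([fs]\(\d+\)|'[^']*'|\w+)\)\.")


def parse_facts(text):
    """Return (relation, node, value) triples; quotes around atoms are dropped."""
    return [(m.group(1), m.group(2), m.group(3).strip("'"))
            for m in FACT_RE.finditer(text)]


def values(facts, relation):
    """Collect the set of values asserted for one relation, e.g. all frames."""
    return {value for rel, _, value in facts if rel == relation}


def overlap_ratio(h_values, t_values):
    """Share of hypothesis items that also occur in the text."""
    return len(h_values & t_values) / len(h_values) if h_values else 0.0


def conjunctive_rule(pred_ratio, frame_ratio, pred_min=0.5, frame_min=0.5):
    """Toy stand-in for Model 1; both thresholds are assumptions."""
    return pred_ratio >= pred_min and frame_ratio >= frame_min


h_facts = parse_facts(HYPOTHESIS_FACTS)
t_facts = parse_facts(TEXT_FACTS)

pred_ratio = overlap_ratio(values(h_facts, "pred"), values(t_facts, "pred"))
frame_ratio = overlap_ratio(values(h_facts, "frame"), values(t_facts, "frame"))
entailed = conjunctive_rule(pred_ratio, frame_ratio)

print(pred_ratio, frame_ratio, entailed)
```

With these toy inputs, three of the four hypothesis predicates and both hypothesis frames are found in the text, so the rule accepts the pair. A pure overlap measure of this kind has no way to register dissimilarity, which is the weakness the error analysis points to.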