Overview Using German to improve English parsing D5 Contributions SFB 732 D5: Biased Learning for Syntactic Disambiguation Blaubeuren - November 16, 2008 SFB 732 D5: Biased Learning for Syntactic Disambiguation Overview Using German to improve English parsing D5 Contributions Research Areas Biased Learning for Syntactic Disambiguation I I Learning from monolingual text (grammatical dependencies, n-gram language model) Learning from bilingual text I I Disambiguating ambiguous German subjects and objects using the English translations in a German/English parallel text A general approach to improve English syntactic parsing using the German translations in German/English parallel text SFB 732 D5: Biased Learning for Syntactic Disambiguation Overview Using German to improve English parsing D5 Contributions Examples Approach NP NP SBAR who had gray hair NP CC DT NN a baby and NP DT NN a woman Figure: English parse with high attachment (incorrect) SFB 732 D5: Biased Learning for Syntactic Disambiguation Overview Using German to improve English parsing D5 Contributions Examples Approach NP NP CC DT NN a baby NP and NP SBAR DT NN a woman who had gray hair Figure: English parse with low attachment (correct) SFB 732 D5: Biased Learning for Syntactic Disambiguation Overview Using German to improve English parsing D5 Contributions Examples Approach CNP NP KON ART NN ein Baby NP und ART NN , eine Frau , S die graue Haare hatte Figure: German parse with low attachment SFB 732 D5: Biased Learning for Syntactic Disambiguation Overview Using German to improve English parsing D5 Contributions Examples Approach Reranking approach using rich bitext projection features I Goal: improve English parsing accuracy SFB 732 D5: Biased Learning for Syntactic Disambiguation Overview Using German to improve English parsing D5 Contributions Examples Approach Reranking approach using rich bitext projection features I Goal: improve English parsing accuracy I Working on parallel text, e.g., proceedings of European Parliament SFB 732 D5: Biased Learning for Syntactic Disambiguation Overview Using German to improve English parsing D5 Contributions Examples Approach Reranking approach using rich bitext projection features I Goal: improve English parsing accuracy I Working on parallel text, e.g., proceedings of European Parliament I Begin by parsing English sentence with Bitpar (Schmid). Select 100 most probable parses SFB 732 D5: Biased Learning for Syntactic Disambiguation Overview Using German to improve English parsing D5 Contributions Examples Approach Reranking approach using rich bitext projection features I Goal: improve English parsing accuracy I Working on parallel text, e.g., proceedings of European Parliament I Begin by parsing English sentence with Bitpar (Schmid). Select 100 most probable parses I Find most probable parse of German sentence SFB 732 D5: Biased Learning for Syntactic Disambiguation Overview Using German to improve English parsing D5 Contributions Examples Approach Reranking approach using rich bitext projection features I Goal: improve English parsing accuracy I Working on parallel text, e.g., proceedings of European Parliament I Begin by parsing English sentence with Bitpar (Schmid). Select 100 most probable parses I Find most probable parse of German sentence I Using rich bitext projection features, calculate syntactic divergence of each English parse candidate and the (projection of) the German parse SFB 732 D5: Biased Learning for Syntactic Disambiguation Overview Using German to improve English parsing D5 Contributions Examples Approach Reranking approach using rich bitext projection features I Goal: improve English parsing accuracy I Working on parallel text, e.g., proceedings of European Parliament I Begin by parsing English sentence with Bitpar (Schmid). Select 100 most probable parses I Find most probable parse of German sentence I Using rich bitext projection features, calculate syntactic divergence of each English parse candidate and the (projection of) the German parse I Choose a high probability English parse candidate with low syntactic divergence SFB 732 D5: Biased Learning for Syntactic Disambiguation Overview Using German to improve English parsing D5 Contributions Examples Approach Rich bitext projection features I Mix of probabilistic and heuristic features, combined in log-linear model, trained to maximize parsing accuracy I General features: tag correspondence, span size difference, parse depth difference I Specific features: coordination phenomena, NP structure I Documented in EACL submission I Current project: improve parses of Europarl corpus (1.4 million parallel sentences) SFB 732 D5: Biased Learning for Syntactic Disambiguation Overview Using German to improve English parsing D5 Contributions D5 Contributions D5 contributes to 3 Area D Goals, one long-term SFB goal I Types of contextual information: D5 uses contextual information derived from bilingual and monolingual syntactic analyses, at varying levels of granularity (e.g., parse tree vs. n-gram) SFB 732 D5: Biased Learning for Syntactic Disambiguation Overview Using German to improve English parsing D5 Contributions D5 Contributions D5 contributes to 3 Area D Goals, one long-term SFB goal I Types of contextual information: D5 uses contextual information derived from bilingual and monolingual syntactic analyses, at varying levels of granularity (e.g., parse tree vs. n-gram) I Learnability of contextual information: D5 uses statistical models of context learned from bilingual and monolingual data, often itself a product of syntactic analysis SFB 732 D5: Biased Learning for Syntactic Disambiguation Overview Using German to improve English parsing D5 Contributions D5 Contributions D5 contributes to 3 Area D Goals, one long-term SFB goal I Types of contextual information: D5 uses contextual information derived from bilingual and monolingual syntactic analyses, at varying levels of granularity (e.g., parse tree vs. n-gram) I Learnability of contextual information: D5 uses statistical models of context learned from bilingual and monolingual data, often itself a product of syntactic analysis I Use of contextual information: D5 uses statistical models of context for improving syntactic analysis SFB 732 D5: Biased Learning for Syntactic Disambiguation Overview Using German to improve English parsing D5 Contributions D5 Contributions D5 contributes to 3 Area D Goals, one long-term SFB goal I Types of contextual information: D5 uses contextual information derived from bilingual and monolingual syntactic analyses, at varying levels of granularity (e.g., parse tree vs. n-gram) I Learnability of contextual information: D5 uses statistical models of context learned from bilingual and monolingual data, often itself a product of syntactic analysis I Use of contextual information: D5 uses statistical models of context for improving syntactic analysis I Incorporating linguistic insights into statistical models: D5 uses insights into complementarity of English and German ambiguity to improve statistical syntactic disambiguation SFB 732 D5: Biased Learning for Syntactic Disambiguation
© Copyright 2026 Paperzz