4th workshop on translation quality, Lille, 07.02.14

Using corpora in the classroom to improve translation quality: report on an experiment
Rudy Loock

[Diagram: the translator's resources: grammar books/translation textbooks, on-line dictionaries & glossaries, traditional dictionaries, smartphone apps, documentary research, corpora, machine translation, CAT tools]

Electronic corpus:
- a collection of machine-readable authentic texts (including transcripts of spoken data) which is sampled to be representative of a particular language or language variety
- "official" corpora: British National Corpus, Corpus of Contemporary American English, Translational English Corpus, Frantext, Dutch Parallel Corpus, English-Norwegian Parallel Corpus...
- self-collected/DIY corpora (Varantola 2003)
- exploited with concordancers

Corpora can be used by translators for 3 main reasons:
1. To collect information on systemic differences between the SL and the TL (inter-language/cross-linguistic comparisons; contrastive linguistics)
2. To investigate differences between original and translated texts in a given language (intra-language comparisons)
3. To find out and analyze what translators do
Link with TQA

[Diagram: original English, translated French, translated English, original French; inter-language & intra-language comparisons; comparable and parallel/translation corpora. Adapted from Johansson & Oksefjell (1998:8) for the ENPC]

Case study on OBITUARIES IN ENGLISH AND FRENCH
Step 1: Thanks to a comparable corpus, uncover systemic differences between original E and original F concerning the use of referring expressions to the deceased (Loock & Lefebvre-Scodeller 2013)
Step 2: Use these differences to define usage constraints in order to improve naturalness, idiomaticity of translations (experiment with students)

Case study on OBITUARIES IN ENGLISH AND FRENCH
Aims:
- Show that a bridge exists between contrastive linguistics and translation studies (systemic differences as usage constraints) (e.g.
Ebeling 1998, Johansson 1999, Ramon 2002)
- Show that idiomaticity, naturalness (Salkie 2007) can then be improved
- Show that corpora have an important role to play in translator education (e.g. W. Chen, L. Bowker, N. Kübler, A. Boulton, C. Frérot, R. Nita)

Methodology (step 1):
- Compilation of a specific, DIY corpus (Necrocorpus, 100 obituaries, ca. 100,000 words), analyzed with AntConc
- 4 celebrities who had recently passed away: Amy Winehouse (1983-2011), Michael Jackson (1958-2009), Larry Hagman (1931-2012), Whitney Houston (1963-2012)
- Press articles in original English and French: announcement of death + biography

Necrocorpus
              English                               French
Subcorpus     Articles  Words/tokens  Per article   Articles  Words/tokens  Per article
#Jackson      11        18,208        1,655         11        14,357        1,305
#Hagman       11        10,262        933           13        5,031         387
#Houston      13        16,373        1,259         12        8,113         676
#Winehouse    15        17,534        1,169         14        6,627         473
Total         50        62,377        1,248         50        34,128        683

= COMPARABLE CORPUS (adding up of "corpora in two or more languages with the same or similar composition" (Teubert 1996: 245))

Referring expressions to the deceased (same possibilities for the 2 grammatical systems):
- First Name + Family Name
- First Name
- Family Name
- Lexical description (+ First Name) + Family Name
- Title (Mr/Mrs/Miss; M./Mme/Mlle)
- Nickname
- Lexical descriptions
- Pronominal forms
Examples (English): Michael Jackson, Jackson, Michael, Mr. Jackson, The King of Pop, The star, Bambi, He
Examples (French): Michael Jackson, Jackson, Michael, M. Jackson, Le roi de la Pop, La star, Bambi, Il

Results:
[Figure: pie charts of the distribution of referring expressions in original English and original French; categories: First Name + Family Name, Family Name, Lexical description + Name, Title + Name, First name, Nickname, Pronominal form, Lexical description]

Discussion:
There are both similarities and significant differences between original E and original F:
Absence of differences:
- Pronominal vs.
lexical forms (NS): F = 67%-33%; E = 70%-30%
- First Name (NS)
- Nickname (IN)
Differences:
- Lexical description + Name (F>E)*
- Family Name (E>F)*
- First Name + Family Name (F>E)*
- Lexical Description (E>F)*
- Title + Name (E>F) (IN)
* = statistically significant (p<0.0002), NS = not statistically significant, IN = test inapplicable (insufficient number of tokens)
There are significant differences in usage between English and French, which should be taken into account by translators (naturalness).

Methodology (step 2):
- TSM M2 students (16) translated an obituary from E to F before classes on corpora started (Sept 2013)
- They compiled and analyzed their own corpus of obituaries with the same methodology (Oct-Dec 2013)
- They decided whether or not to revise their initial translation
= research project for M2 comparative grammar class
Cf. concept of the invisible translator: "The utopian goal is to make it virtually impossible to tell the translation from an original text in that language" (Teubert 1996: 241)

Initial translation E→F (Cory Monteith obituary):
[Figure: pie charts comparing referring expressions in original English, the TSM students' Translation 1, and original French; categories: First Name + Family Name, Family Name, First name, Title + Name, Lexical description + Name, Nickname, Lexical description, Pronominal form]

Corpus analysis:
5 groups of students compiled their own corpus:
- Steve Jobs
- Elizabeth Taylor
- Heath Ledger
- Margaret Thatcher
- John Paul II
Compilation of results with the same methodology as in Loock & Lefebvre-Scodeller (2013)
Analysis and possible revision of their initial translations

Corpus analysis:
              English                               French
Subcorpus     Articles  Words/tokens  Per article   Articles  Words/tokens  Per article
#Thatcher     10        93,871        9,387         10        55,619        5,562
#Ledger       10        14,940        1,494         10        6,805         680.5
#JPII         10        21,336        2,134         10        10,934        1,093
#Jobs         10        21,311        2,131         10        9,005         901
#Taylor       10        13,259        1,326         10        7,106         711
Total         50        164,717       3,294         50        89,469        1,789

Results for the 5 subcorpora:
[Figure: pie charts for original French and original English; categories: First Name + Family Name, Family Name, Title, Lex. Description + Name, Nickname, First Name, Lex. Description, Pronominal form, Other]
NB: results still need thorough checking

Results for the 5 subcorpora:
- Results quite similar to those for Necrocorpus
- But more complicated results for Pope John Paul II and Margaret Thatcher:
  Pope John Paul II, Pope John Paul, The Pope, Karol (Jozef) Wojtyla, Karol Wojtyla
  Lady Thatcher, Baroness Thatcher, Ex-Prime Minister Baroness Thatcher, La « Dame de Fer », Margaret Thatcher

Influence on revision:
- All students modified their translation (well, that was part of the assignment...)
- But not in similar proportions
- And not necessarily in the same way:
  - Deletion of Monteith
  - Reduction of Cory Monteith
  - Replacement with lexical descriptions, pronominal forms, or Cory Monteith
  - Insertion of Title + Name (M(r) Monteith)

Influence on revision:
Some students also modified other aspects of their translations in light of what they found in their corpus:
- Tenses (passé composé instead of passé simple)
- Nominalizations: e.g. Lorsque Cory a 19 ans > l'âge de 19 ans
- Lexicon: tristesse > chagrin; Cory Monteith décède > Cory Monteith retrouvé mort (title)

[Figure: pie charts comparing Translation 1, Translation 2, and original French (Necrocorpus); categories: Family Name, First Name + Family Name, Title, First Name, Lex. Description + Name, Nickname, Lex. Description, Pronominal form]

Reactions/Feedback:
- 'good way for a translation not to sound like translation'
- 'helped me understand why a translated text can sound translated'
- 'once you study such differences then you are aware of them and could translate other texts'
- 'made me focus on usage and not only on the correctness'
- 'it improved my translation in the sense that translated French was brought closer to original French'
- 'the corpus-based methodology helped me in a certain way'
- 'using corpora helps detach oneself from the original text'

Reactions/Feedback:
- complicated, time-consuming: 'a lot of time, maybe more than a translator can really afford'
- 'a bit difficult to apply when actually working on a translation project'
- 'you can't predict what can cause your translation to sound like a translated text'
- useful only 'if you have to translate several documents of the same nature for the same person/theme'
- 'more about terminology'
- 'results which can be more easily found with other CAT tools'

Conclusions:
- Corpora can help uncover differences in grammatical usage between 2 languages
- Corpora can be a good way to sensitize students to the question of grammatical correctness vs. naturalness
- Still a lot of work to do:
  - to improve the tools (corpora, concordancers)
  - to convince students and professional translators that corpora can be a CAT tool alongside other CAT tools and MT

Criticism of the approach:
- Who said there should be linguistic homogenisation between original and translated language anyway?!
- Goes hand in hand with the invisibility of the translator, on which there is no consensus
- No hard and fast rules! (risk of stereotyping/levelling out)
- Creativity
- Representativeness of the corpus

Thank you for your attention!
Contact: rudy.loock@univ-lille3.fr
CorTEx project website: http://stl.recherche.univ-lille3.fr/CorTEx/CorTEx_index
Acknowledgments: I would like to thank C.
Lefebvre-Scodeller and the M2 students of the 2013-14 TSM Master's Programme, LEA department, Lille 3 University.
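
Supplementary sketch: the totals and words-per-article figures reported for Necrocorpus can be recomputed from the article and token counts in the table. The Python below is only an illustration of that arithmetic. The names `NECROCORPUS`, `tokens_per_article` and `summarize` are made up for this sketch, and rounding half up to a whole number is an assumption about how the per-article figures were rounded.

```python
# Figures copied from the Necrocorpus table: (articles, tokens) per subcorpus.
NECROCORPUS = {
    "English": {
        "Jackson": (11, 18208),
        "Hagman": (11, 10262),
        "Houston": (13, 16373),
        "Winehouse": (15, 17534),
    },
    "French": {
        "Jackson": (11, 14357),
        "Hagman": (13, 5031),
        "Houston": (12, 8113),
        "Winehouse": (14, 6627),
    },
}


def tokens_per_article(articles, tokens):
    """Mean number of tokens per article, rounded half up (assumed rounding)."""
    return int(tokens / articles + 0.5)


def summarize(subcorpora):
    """Return (total articles, total tokens, mean tokens per article)."""
    articles = sum(a for a, _ in subcorpora.values())
    tokens = sum(t for _, t in subcorpora.values())
    return articles, tokens, tokens_per_article(articles, tokens)


if __name__ == "__main__":
    for language, subcorpora in NECROCORPUS.items():
        for name, (articles, tokens) in subcorpora.items():
            print(language, name, tokens_per_article(articles, tokens))
        print(language, "Total", summarize(subcorpora))
```

Under these assumptions the recomputed totals match the table (50 articles per language; 62,377 English and 34,128 French tokens). The same check could be run on the students' five subcorpora.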