Presentation

4th workshop on translation quality, Lille, 07.02.14

Using corpora in the classroom to improve translation quality: report on an experiment
Rudy Loock

[Diagram of the translator's resources: grammar books/translation textbooks, on-line dictionaries & glossaries, traditional dictionaries, smartphone apps, documentary research, corpora, machine translation, CAT tools]

• Electronic corpus:
  → a collection of machine-readable authentic texts (including transcripts of spoken data) which is sampled to be representative of a particular language or language variety
  - “official”: British National Corpus, Corpus of Contemporary American English, Translational English Corpus, Frantext, Dutch Parallel Corpus, English-Norwegian Parallel Corpus...
  - self-collected/DIY corpora (Varantola 2003) exploited with concordancers

• Corpora can be used by translators for 3 main reasons:
  1. To collect information on systemic differences between the SL and the TL (inter-language/cross-linguistic comparisons; contrastive linguistics)
  2. To investigate differences between original and translated texts in a given language (intra-language comparisons)
  3. To find out and analyze what translators do
  → Link with TQA

[Diagram adapted from Johansson & Oksefjell (1998: 8) for the ENPC: ORIGINAL ENGLISH, TRANSLATED FRENCH, TRANSLATED ENGLISH, ORIGINAL FRENCH]
  → inter-language & intra-language comparisons
  → comparable and parallel/translation corpora

• Case study on OBITUARIES IN ENGLISH AND FRENCH
  Step 1: Using a comparable corpus, uncover systemic differences between original E and original F in the use of expressions referring to the deceased (Loock & Lefebvre-Scodeller 2013)
  Step 2: Use these differences to define usage constraints in order to improve the naturalness and idiomaticity of translations (experiment with students)
  Aims:
  - Show that a bridge exists between contrastive linguistics and translation studies (systemic differences as usage constraints) (e.g. Ebeling 1998, Johansson 1999, Ramon 2002)
  - Show that idiomaticity and naturalness (Salkie 2007) can then be improved
  - Show that corpora have an important role to play in translator education (e.g. W. Chen, L. Bowker, N. Kübler, A. Boulton, C. Frérot, R. Nita)

• Methodology (step 1):
  Compilation of a specific, DIY corpus (Necrocorpus, 100 obituaries, ca.
100,000 words), analyzed with AntConc
  - 4 celebrities recently passed away:
    - Amy Winehouse (1983-2011)
    - Michael Jackson (1958-2009)
    - Larry Hagman (1931-2012)
    - Whitney Houston (1963-2012)
  - Press articles in original English and French:
    - announcement of death + biography

  Necrocorpus
                 English                                   French
  Subcorpus      Articles  Words/tokens  Per article       Articles  Words/tokens  Per article
  #Jackson       11        18,208        1,655             11        14,357        1,305
  #Hagman        11        10,262        933               13        5,031         387
  #Houston       13        16,373        1,259             12        8,113         676
  #Winehouse     15        17,534        1,169             14        6,627         473
  Total          50        62,377        1,248             50        34,128        683

  = COMPARABLE CORPUS (adding up of “corpora in two or more languages with the same or similar composition” (Teubert 1996: 245))

• Referring expressions to the deceased:
  - First Name + Family Name
  - First Name
  - Family Name
  - Lexical description (+ First Name) + Family Name
  - Title (Mr/Mrs/Miss; M./Mme/Mlle)
  - Nickname
  - Lexical descriptions
  - Pronominal forms
  Examples (English): Michael Jackson, Jackson, Michael, Mr. Jackson, The King of Pop, The star, Bambi, He
  Examples (French): Michael Jackson, Jackson, Michael, M. Jackson, Le roi de la Pop, La star, Bambi, Il
  → Same possibilities for the 2 grammatical systems

• Results:
  [Pie charts: distribution of referring expressions in ORIGINAL ENGLISH and in ORIGINAL FRENCH over the categories First Name + Family Name, Family Name, Lexical description + Name, Title + Name, First Name, Nickname, Lexical description, Pronominal form]

• Discussion:
  There are both similarities and significant differences between original E and original F:
  - Absence of differences:
    - Pronominal vs. lexical forms (NS): F = 67%-33%; E = 70%-30%
    - First Name (NS)
    - Nickname (IN)
  - Differences:
    - Lexical description + Name (F>E)*
    - Family Name (E>F)*
    - First Name + Family Name (F>E)*
    - Lexical description (E>F)*
    - Title + Name (E>F) (IN)
  * = statistically significant (p<0.0002), NS = not statistically significant, IN = test inapplicable (insufficient number of tokens)
  → There are significant differences in usage between English and French
  → Should be taken into account by translators (naturalness)

• Methodology (step 2):
  - TSM M2 students (16) translated an obituary from E to F before classes on corpora started (Sept 2013)
  - They compiled and analyzed their own corpus of obituaries with the same methodology (Oct-Dec 2013)
  - They decided whether or not to revise their initial translation
  Cf. the concept of the invisible translator: “The utopian goal is to make it virtually impossible to tell the translation from an original text in that language” (Teubert 1996: 241)
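The tallying behind both steps (counting each referring expression by category, then comparing the distributions) can be illustrated with a short script. This is only a minimal sketch, not the procedure used in the study, where the corpus was analyzed with AntConc: the profile entries, the sample text and the function names `build_pattern`, `tally` and `shares` are all hypothetical, and a real analysis would still need manual checking (a pronoun such as "he" may refer to someone other than the deceased).

```python
import re
from collections import Counter

# Hypothetical profile for one deceased person; every entry is illustrative.
PROFILE = {
    "first": "Michael",
    "family": "Jackson",
    "titles": ["Mr.", "Mr"],
    "nicknames": ["the King of Pop", "Bambi"],
    "descriptions": ["the singer", "the star"],
    "pronouns": ["he", "him", "his"],
}

# Invented sample text, not taken from the Necrocorpus.
SAMPLE = (
    "Michael Jackson died on Thursday. The singer Michael Jackson was 50. "
    "Jackson had planned a tour. He was known as the King of Pop. "
    "Mr. Jackson leaves three children."
)


def build_pattern(profile):
    """Compile one regex with a named group per referring-expression category."""
    def alt(items):
        # Longest alternatives first, so that "Mr." is tried before "Mr".
        ordered = sorted(items, key=len, reverse=True)
        return "|".join(re.escape(item) for item in ordered)

    first = re.escape(profile["first"])
    family = re.escape(profile["family"])
    desc, titles = alt(profile["descriptions"]), alt(profile["titles"])
    nick, pron = alt(profile["nicknames"]), alt(profile["pronouns"])
    # Ordered from most to least specific: at a given position,
    # the first alternative that matches wins.
    groups = [
        ("desc_name", rf"(?:{desc})\s+(?:{first}\s+)?{family}"),
        ("title_name", rf"(?:{titles})\s+{family}"),
        ("full_name", rf"{first}\s+{family}"),
        ("nickname", rf"(?:{nick})"),
        ("description", rf"(?:{desc})"),
        ("family_name", family),
        ("first_name", first),
        ("pronoun", rf"(?:{pron})"),
    ]
    body = "|".join(rf"(?P<{name}>\b{pat}\b)" for name, pat in groups)
    return re.compile(body, re.IGNORECASE)


def tally(text, pattern):
    """Count referring expressions per category (name of the matching group)."""
    return Counter(match.lastgroup for match in pattern.finditer(text))


def shares(counts):
    """Convert raw counts into percentages of all referring expressions."""
    total = sum(counts.values())
    return {cat: round(100 * n / total, 1) for cat, n in counts.items()} if total else {}


if __name__ == "__main__":
    counts = tally(SAMPLE, build_pattern(PROFILE))
    for category, share in sorted(shares(counts).items()):
        print(f"{category:12s} {counts[category]:3d} {share:5.1f}%")
```

Running the same tally over an original-English and an original-French subcorpus (with a French profile: M., il, le chanteur, etc.) would yield the two distributions that the pie charts compare. A single alternation ordered from most to least specific is used so that "Mr. Jackson" is counted once as Title + Name rather than again as Family Name.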
  → The experiment served as the research project for the M2 comparative grammar class

• Initial translation E→F (Cory Monteith obituary):
  [Pie charts: distribution of referring expressions in ORIGINAL ENGLISH, in the TSM students' Translation 1, and in ORIGINAL FRENCH over the categories First Name + Family Name, Family Name, First Name, Title + Name, Nickname, Lexical description, Lexical description + Name, Pronominal form]

• Corpus analysis:
  - 5 groups of students compiled their own corpus:
    - Steve Jobs
    - Elizabeth Taylor
    - Heath Ledger
    - Margaret Thatcher
    - John Paul II
  - Compilation of results with the same methodology as in Loock & Lefebvre-Scodeller (2013)
  - Analysis and possible revision of their initial translations

                 English                                   French
  Subcorpus      Articles  Words/tokens  Per article       Articles  Words/tokens  Per article
  #Thatcher      10        93,871        9,387             10        55,619        5,562
  #Ledger        10        14,940        1,494             10        6,805         680.5
  #JPII          10        21,336        2,134             10        10,934        1,093
  #Jobs          10        21,311        2,131             10        9,005         901
  #Taylor        10        13,259        1,326             10        7,106         711
  Total          50        164,717       3,294             50        89,469        1,789

• Results for the 5 subcorpora:
  [Pie charts: distribution of referring expressions in ORIGINAL FRENCH and in ORIGINAL ENGLISH over the categories First Name + Family Name, Family Name, First Name, Title, Nickname, Lexical description, Lexical description + Name, Pronominal form, Other]
  NB: results still need thorough checking
  - Results quite similar to those for Necrocorpus
  - But more complicated results for Pope John Paul II and Margaret Thatcher:
    - Pope John Paul II, Pope John Paul, The Pope, Karol (Jozef) Wojtyla, Karol Wojtyla
    - Lady Thatcher, Baroness Thatcher, Ex-Prime Minister Baroness Thatcher, La « Dame de Fer », Margaret Thatcher

• Influence on revision:
  - All students modified their translation (well, that was part of the assignment...)
  - But not in similar proportions
  - And not necessarily in the same way:
    - Deletion of Monteith
    - Reduction of Cory Monteith
    - Replacement with lexical descriptions, pronominal forms, or Cory Monteith
    - Insertion of Title + Name (M(r) Monteith)
  Some students also modified other aspects of their translations in light of what they found in their corpus:
  - Tenses (passé composé instead of passé simple)
  - Nominalizations: e.g. Lorsque Cory a 19 ans > à l'âge de 19 ans
  - Lexicon: tristesse > chagrin; Cory Monteith décède > Cory Monteith retrouvé mort (title)
  [Pie charts: distribution of referring expressions in Translation 1, in Original French (Necrocorpus), and in Translation 2 over the categories Family Name, First Name + Family Name, Title, First Name, Lexical description + Name, Nickname, Lexical description, Pronominal form]

• Reactions/Feedback:
  - ‘good way for a translation not to sound like translation’
  - ‘helped me understand why a translated text can sound translated’
  - ‘once you study such differences then you are aware of them and could translate other texts’
  - ‘made me focus on usage and not only on the correctness’
  - ‘it improved my translation in the sense that translated French was brought closer to original French’
  - ‘the corpus-based methodology helped me in a certain way’
  - ‘using corpora helps detach oneself from the original text’
• Reactions/Feedback:
  - complicated, time-consuming: ‘a lot of time maybe more than a translator can really afford’
  - ‘a bit difficult to apply when actually working on a translation project’
  - ‘you can’t predict what can cause your translation to sound like a translated text’
  - useful only ‘if you have to translate several documents of the same nature for the same person/theme’
  - ‘more about terminology’
  - ‘results which can be more easily found with other CAT tools’
• Conclusions:
  - Corpora can help uncover differences in grammatical usage between 2 languages
  - Corpora can be a good way to sensitize students to the question of grammatical correctness vs. naturalness
  - Still a lot of work to do:
    - to improve the tools (corpora, concordancers)
    - to convince students and professional translators that corpora can be a CAT tool alongside other CAT tools and MT

• Criticism of the approach:
  - Who said there should be linguistic homogenization between original and translated language anyway?!
  - Goes hand in hand with the invisibility of the translator, on which there is no consensus
  - No hard and fast rules! (risk of stereotyping/levelling out)
  - Creativity
  - Representativeness of the corpus

Thank you for your attention!

Contact: rudy.loock@univ-lille3.fr
CorTEx project website: http://stl.recherche.univ-lille3.fr/CorTEx/CorTEx_index
Acknowledgments: I would like to thank C. Lefebvre-Scodeller and the M2 students of the 2013-14 TSM Master's Programme, LEA department, Lille 3 University.