
Post-editing: A Research Perspective
Michel Simard
Interactive Language Technologies
AMTA-2012 Conference
WPTP: Workshop on Post-Editing Technology and Practice
• Organizers:
– Sharon O’Brien (CNGL),
– Lucia Specia (Sheffield),
– Michel Simard (NRC)
• 46+ participants:
– Research: 25 (CNGL, CMU, CBS, Columbia, CNRC, UQO, Stanford, Edinburgh, Helsinki, SRI, DFKI, …)
– Industry: 18 (Yandex, Adobe, IBM, Microsoft, Autodesk, Intel, …)
– Others: 3 (Mitre, PAHO, NMEC)
• Countries:
– US=21, Canada=2, Brazil=1, Venezuela=1
– Europe=19 (Ireland=6!)
– Asia/Pacific=2 (Japan, New Zealand)
WPTP: Program
• Invited Speaker: Dr. Salim Roukos (IBM)
• 8 Oral Presentations
• 5 Posters
• 5 Demos
https://sites.google.com/site/wptp2012/accepted-papers
What is post-editing?
• “…correction of machine translation output by human
linguists/editors” [Veale and Way 1997]
• “…the process of improving a machine-generated translation
with a minimum of manual labor”. [TAUS report 2010]
• A process of modification rather than revision.
[Loffler-Laurian 1985]
• “Repairing Texts” [Krings, 2001]
… but MT is increasingly used in “normal” translation contexts.
Why a workshop on post-editing?
Growing Interest in PE
• Machine Translation Quality is improving
• Availability of MT systems:
– Traditional commercial systems: SYSTRAN, ProMT, Language Weaver, etc.
– Open source: Moses
– “Cloud-based”: Google Translator Toolkit, MemSource, Applied Language Services (CAPITA), Microsoft Translator Hub
Scientific Research
Human Translation Process Research:
What goes on in the post-editor’s head:
• [Koponen et al.]: measuring cognitive effort based on editing time
• [Lacruz et al.]: measuring cognitive effort based on editing pauses (see the sketch below)
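To make the pause idea concrete, here is a minimal Python sketch: a simple pause count over inter-keystroke gaps. The keystroke log and the 1-second threshold are hypothetical, and this is not the authors' actual metric, only an illustration of the kind of signal involved.

    def count_pauses(timestamps, threshold=1.0):
        """Count inter-keystroke gaps of at least `threshold` seconds."""
        gaps = (b - a for a, b in zip(timestamps, timestamps[1:]))
        return sum(1 for g in gaps if g >= threshold)

    # Hypothetical keystroke log (seconds from session start):
    log = [0.0, 0.3, 0.5, 2.8, 3.0, 3.1, 6.5, 6.7]
    print(count_pauses(log))  # -> 2 (long pauses before 2.8s and 6.5s)

Longer or more frequent pauses are taken as a proxy for higher cognitive load during post-editing.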
Translation Quality:
• [Melby et al.]: Post-editing quality evaluation framework
Resources:
• [Carl, Green et al.]: Post-editing corpora (user activity data)
Scientific Research
Not just blue-sky research:
• Statistical MT systems are explicitly optimized with specific performance measures (typically BLEU)
• MT quality is not absolute: it depends on the intended application
• For post-editing, we want MT that maximizes post-editor productivity → minimizes time and/or effort
• We want to find performance measures that correlate with those criteria (see the sketch below).
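As an illustration, a minimal sketch (with hypothetical per-sentence numbers) of checking whether an automatic metric correlates with post-editing time; a metric that is useful for this purpose should show a strong negative correlation:

    from statistics import mean

    def pearson(xs, ys):
        """Pearson correlation coefficient between two equal-length series."""
        mx, my = mean(xs), mean(ys)
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        varx = sum((x - mx) ** 2 for x in xs)
        vary = sum((y - my) ** 2 for y in ys)
        return cov / (varx * vary) ** 0.5

    # Hypothetical per-sentence data:
    metric_scores = [0.62, 0.35, 0.48, 0.20, 0.71]  # e.g., sentence-level BLEU
    pe_seconds    = [14.0, 41.0, 22.0, 55.0, 9.0]   # measured post-editing time
    print(pearson(metric_scores, pe_seconds))       # ≈ -0.99: higher score, less effort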
Research Tools
• [Federmann]: APPRAISE
• [Aziz & Specia]: PET
• [Denkowski & Lavie]: TransCenter
• [Elming & Bonk]: CASMACAT
• [Beregovaya & Moran]: iOmegaT
• [Doherty & O’Brien]: eye-tracking
Experiments
• [Zhechev]: Autodesk
– UI, technical manuals, marketing
– 12 languages
– Productivity gains: 37% (Polish) to 92% (French)
– “Language difficulty” was more influential than the amount of training data
• [Roukos]: IBM
– UI, technical manuals, marketing
– Productivity gains on the order of 30–40%
In general, *improved* translation quality
Experiments
• [Poulis & Kolovratnik]: European Parliament (ongoing…)
• [Tatsumi et al.]: Toyohashi University of Technology
– Crowd Post-editing
– 9 languages
– Quality evaluation: 50–70% of sentences are of “acceptable” quality
New Technologies
• [Valotkaite & Asadullah]: Error Detection
– Marking potential MT errors improves PE productivity (see the sketch below)
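For illustration, a minimal sketch of error marking; the vocabulary and the heuristic are hypothetical, not the authors' detector. The idea is simply to flag MT output tokens that some resource, here a small target-language vocabulary, does not recognize, so the post-editor's attention goes there first:

    def mark_suspect_tokens(mt_sentence, vocabulary):
        """Wrap tokens missing from the vocabulary in [[...]] markers."""
        marked = []
        for token in mt_sentence.split():
            if token.lower() not in vocabulary:
                marked.append("[[" + token + "]]")  # draw the post-editor's eye
            else:
                marked.append(token)
        return " ".join(marked)

    vocab = {"the", "report", "was", "translated", "yesterday"}
    print(mark_suspect_tokens("the reprot was translated yesterady", vocab))
    # -> the [[reprot]] was translated [[yesterady]]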
• [Mundt et al.]: Automatic Post-editing
– APE: two-stage MT (see the sketch below)
– An interesting approach when there is little or no control over the MT system
– Can handle “dropped words”
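A minimal sketch of how two-stage APE training data is typically assembled, assuming hypothetical lists of MT outputs and their human post-edits: the second stage learns to “translate” raw MT output into its post-edited version, treating the pairs as a parallel corpus, with no access needed to the first-stage system:

    def write_ape_corpus(mt_outputs, post_edits, src_path, tgt_path):
        """Write aligned (MT output, post-edit) pairs as a parallel corpus."""
        assert len(mt_outputs) == len(post_edits)
        with open(src_path, "w") as src, open(tgt_path, "w") as tgt:
            for hyp, ref in zip(mt_outputs, post_edits):
                src.write(hyp.strip() + "\n")  # stage-2 "source": raw MT output
                tgt.write(ref.strip() + "\n")  # stage-2 "target": human post-edit

    # Hypothetical session data:
    mt  = ["the contract was sign yesterday"]
    pes = ["the contract was signed yesterday"]
    write_ape_corpus(mt, pes, "ape.src", "ape.tgt")

Any standard SMT toolkit can then be trained on these two files as if they were two languages; recurrent corrections, such as restoring dropped words, are learned from the pairs.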
What Next?
Things we did not see at WPTP:
• Intelligent MT+TM combinations
• Automatic transfer of markup in MT
• UI
• Translator Trust
• Negative results