Is Machine Translation Getting Better over Time?


Yvette Graham
Department of Computing and Information Systems, The University of Melbourne, Victoria 3010, Australia.

Tim Baldwin
Department of Computing and Information Systems, The University of Melbourne, Victoria 3010, Australia.

Alistair Moffat
Department of Computing and Information Systems, The University of Melbourne, Victoria 3010, Australia.

Justin Zobel
Department of Computing and Information Systems, The University of Melbourne, Victoria 3010, Australia.


Status

Proc. Conf. European Assoc. Computational Linguistics, Gothenburg, April 2014, pages 443-451.

Abstract

Recent human evaluation of machine translation has focused on relative preference judgments of translation quality, making it difficult to track longitudinal improvements over time. We carry out a large-scale crowd-sourcing experiment to estimate the degree to which state-of-the-art performance in machine translation has increased over the past five years. To facilitate longitudinal evaluation, we move away from relative preference judgments and instead ask human judges to provide direct estimates of the quality of individual translations in isolation from alternate outputs. For seven European language pairs, our evaluation estimates an average 10-point improvement to state-of-the-art machine translation between 2007 and 2012, with Czech-to-English translation standing out as the language pair achieving most substantial gains. Our method of human evaluation offers an economically feasible and robust means of performing ongoing longitudinal evaluation of machine translation.

Full text

http://aclweb.org/anthology/E14/E14-1047.pdf