Is Machine Translation Getting Better over Time?
Yvette Graham, Tim Baldwin, Alistair Moffat, Justin Zobel
Department of Computing and Information Systems,
The University of Melbourne,
Victoria 3010, Australia.
Status
Proc. Conf. European Assoc. Computational Linguistics,
Gothenburg, April 2014, pages 443-451.
Abstract
Recent human evaluation of machine translation has focused on
relative preference judgments of translation quality, making it
difficult to track longitudinal improvements over time.
We carry out a large-scale crowd-sourcing experiment to estimate the
degree to which state-of-the-art performance in machine translation
has increased over the past five years.
To facilitate longitudinal evaluation, we move away from relative
preference judgments and instead ask human judges to provide direct
estimates of the quality of individual translations in isolation from
alternate outputs.
For seven European language pairs, our evaluation estimates an
average 10-point improvement to state-of-the-art machine translation
between 2007 and 2012, with Czech-to-English translation standing out
as the language pair achieving the most substantial gains.
Our method of human evaluation offers an economically feasible and
robust means of performing ongoing longitudinal evaluation of machine
translation.
Full text
http://aclweb.org/anthology/E14/E14-1047.pdf