Creating the best development corpus for Statistical Machine Translation systems
Please use this identifier to cite or link to this item:
http://hdl.handle.net/10045/76033
Title: | Creating the best development corpus for Statistical Machine Translation systems |
---|---|
Author(s): | Chinea-Rios, Mara | Sanchis-Trilles, Germán | Casacuberta, Francisco |
Keywords: | Machine Translation |
Knowledge area(s): | Lenguajes y Sistemas Informáticos (Computer Languages and Systems) |
Publication date: | 2018 |
Publisher: | European Association for Machine Translation |
Citation: | Chinea-Rios, Mara; Sanchis-Trilles, Germán; Casacuberta, Francisco. “Creating the best development corpus for Statistical Machine Translation systems”. In: Pérez-Ortiz, Juan Antonio, et al. (Eds.). Proceedings of the 21st Annual Conference of the European Association for Machine Translation: 28-30 May 2018, Universitat d'Alacant, Alacant, Spain, pp. 99-108 |
Abstract: | We propose and study three different novel approaches for tackling the problem of development set selection in Statistical Machine Translation. We focus on a scenario where a machine translation system is leveraged for translating a specific test set, without further data from the domain at hand. Such test set stems from a real application of machine translation, where the texts of a specific e-commerce were to be translated. For developing our development-set selection techniques, we first conducted experiments in a controlled scenario, where labelled data from different domains was available, and evaluated the techniques both with classification and translation quality metrics. Then, the best-performing techniques were evaluated on the e-commerce data at hand, yielding consistent improvements across two language directions. |
Sponsor(s): | The research leading to these results was partially supported by projects CoMUN-HaT-TIN2015-70924-C2-1-R (MINECO/FEDER) and PROMETEO/2018/004. |
URI: | http://hdl.handle.net/10045/76033 |
ISBN: | 978-84-09-01901-4 |
Language: | eng |
Type: | info:eu-repo/semantics/conferenceObject |
Rights: | © 2018 The authors. This article is licensed under a Creative Commons 3.0 licence, no derivative works, attribution, CC-BY-ND. |
Peer reviewed: | yes |
Publisher's version: | http://eamt2018.dlsi.ua.es/proceedings-eamt2018.pdf |
Appears in collections: | EAMT2018 - Proceedings |
Files in this item:
File | Description | Size | Format | |
---|---|---|---|---|
EAMT2018-Proceedings_12.pdf | | 1.64 MB | Adobe PDF | |
This item is licensed under a Creative Commons License