Choosing the correct paradigm for unknown words in rule-based machine translation systems
Please use this identifier to cite or link to this item:
http://hdl.handle.net/10045/27584
Title: | Choosing the correct paradigm for unknown words in rule-based machine translation systems |
---|---|
Author(s): | Sánchez-Cartagena, Víctor M. | Esplà-Gomis, Miquel | Sánchez-Martínez, Felipe | Pérez-Ortiz, Juan Antonio |
Research group(s) or GITE: | Transducens |
Center, Department or Service: | Universidad de Alicante. Departamento de Lenguajes y Sistemas Informáticos |
Keywords: | Machine translation | Rule-based | Unknown words |
Knowledge area(s): | Lenguajes y Sistemas Informáticos |
Publication date: | Jun-2012 |
Publisher: | CSLI Publications |
Bibliographic citation: | SÁNCHEZ-CARTAGENA, Víctor M., et al. "Choosing the correct paradigm for unknown words in rule-based machine translation systems". In: Proceedings of the Third International Workshop on Free/Open-Source Rule-Based Machine Translation : June 13-15, 2012, Gothenburg, Sweden. Stanford, CA : CSLI Publications, 2012 |
Abstract: | Previous work on an interactive system aimed at helping non-expert users to enlarge the monolingual dictionaries of rule-based machine translation (MT) systems worked by discarding those inflection paradigms that cannot generate a set of inflected word forms validated by the user. This method, however, cannot deal with the common case where a set of different paradigms generate exactly the same set of inflected word forms, although with different inflection information attached. In this paper, we propose the use of an n-gram-based model of lexical categories and inflection information to select a single paradigm in cases where more than one paradigm generates the same set of word forms. Results obtained with a Spanish monolingual dictionary show that the correct paradigm is chosen for around 75% of the unknown words, thus making the resulting system (available under an open-source license) of valuable help to enlarge the monolingual dictionaries used in MT involving non-expert users without technical linguistic knowledge. |
Sponsor(s): | This work has been partially funded by the Spanish Ministerio de Ciencia e Innovación through project TIN2009-14009-C02-01, by Generalitat Valenciana through grant ACIF/2010/174 from the VALi+d programme, and by Universitat d’Alacant through project GRE11-20. |
URI: | http://hdl.handle.net/10045/27584 |
Language: | eng |
Type: | info:eu-repo/semantics/conferenceObject |
Peer reviewed: | yes |
Appears in collections: | INV - TRANSDUCENS - Comunicaciones a Congresos, Conferencias, etc. |
Files in this item:
File | Description | Size | Format | |
---|---|---|---|---|
sanchez-cartagena12b.pdf | | 178.54 kB | Adobe PDF | |
All documents in RUA are protected by copyright. Some rights reserved.