“Here Are the Rules: Ignore All Rules”: Automatic Contradiction Detection in Spanish
Please use this identifier to cite or link to this item:
http://hdl.handle.net/10045/113915
Title: | “Here Are the Rules: Ignore All Rules”: Automatic Contradiction Detection in Spanish |
---|---|
Author(s): | Sepúlveda-Torres, Robiert | Bonet-Jover, Alba | Saquete Boró, Estela |
Research group(s) or GITE: | Procesamiento del Lenguaje y Sistemas de Información (GPLSI) |
Center, Department, or Service: | Universidad de Alicante. Departamento de Lenguajes y Sistemas Informáticos | Universidad de Alicante. Instituto Universitario de Investigación Informática |
Keywords: | Contradiction detection | Natural language processing | Deep learning | Human language technologies |
Knowledge area(s): | Lenguajes y Sistemas Informáticos |
Publication date: | 30-Mar-2021 |
Publisher: | MDPI |
Bibliographic citation: | Sepúlveda-Torres R, Bonet-Jover A, Saquete E. “Here Are the Rules: Ignore All Rules”: Automatic Contradiction Detection in Spanish. Applied Sciences. 2021; 11(7):3060. https://doi.org/10.3390/app11073060 |
Abstract: | This paper tackles the automatic detection of contradictions in Spanish within the news domain. Two pieces of information are classified as compatible, contradictory, or unrelated. To address the task, the ES-Contradiction dataset was created. This dataset contains a balanced number of examples of each of the three classes. The novelty of the research is the fine-grained annotation of the different types of contradictions in the dataset. Presently, the contradiction examples cover four types of contradiction: negation, antonyms, numerical, and structural. Future work will extend the dataset with all possible types of contradictions. To validate the effectiveness of the dataset, a pretrained model (BETO) is used, and after performing different experiments, the system is able to detect contradictions with an F1m of 92.47%. Regarding the type of contradiction, the best results are obtained with negation contradictions (F1m = 98%), whereas structural contradictions obtain the lowest results (F1m = 69%) because there are fewer structural examples, owing to the complexity of generating them. When evaluated on a more general dataset such as XNLI, the system trained on our dataset fails to detect most of the contradictions properly, because the two datasets differ greatly in size and ours covers only four types of contradiction. However, the classification of the contradictions leads us to conclude that some highly complex contradictions will require external knowledge to be detected properly, which would avoid the need for the system to have been exposed to them beforehand. |
Sponsor(s): | This research work has been partially funded by Generalitat Valenciana through project “SIIA: Tecnologias del lenguaje humano para una sociedad inclusiva, igualitaria, y accesible” with grant reference PROMETEU/2018/089, and by the Spanish Government through project RTI2018-094653-B-C22: “Modelang: Modeling the behavior of digital entities by Human Language Technologies”. It has also been partially supported by a grant from the Fondo Europeo de Desarrollo Regional (FEDER) and by the LIVING-LANG project (RTI2018-094653-B-C21) from the Spanish Government. |
URI: | http://hdl.handle.net/10045/113915 |
ISSN: | 2076-3417 |
DOI: | 10.3390/app11073060 |
Language: | eng |
Type: | info:eu-repo/semantics/article |
Rights: | © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). |
Peer reviewed: | yes |
Publisher's version: | https://doi.org/10.3390/app11073060 |
Appears in collections: | INV - GPLSI - Artículos de Revistas |
Files in this item:
File | Description | Size | Format |
---|---|---|---|
Sepulveda-Torres_etal_2021_ApplSci.pdf | | 275.15 kB | Adobe PDF |
This item is licensed under a Creative Commons License