“Here Are the Rules: Ignore All Rules”: Automatic Contradiction Detection in Spanish
Please use this identifier to cite or link to this item:
http://hdl.handle.net/10045/113915
Title: | “Here Are the Rules: Ignore All Rules”: Automatic Contradiction Detection in Spanish |
---|---|
Authors: | Sepúlveda-Torres, Robiert | Bonet-Jover, Alba | Saquete Boró, Estela |
Research Group/s: | Procesamiento del Lenguaje y Sistemas de Información (GPLSI) |
Center, Department or Service: | Universidad de Alicante. Departamento de Lenguajes y Sistemas Informáticos | Universidad de Alicante. Instituto Universitario de Investigación Informática |
Keywords: | Contradiction detection | Natural language processing | Deep learning | Human language technologies |
Knowledge Area: | Lenguajes y Sistemas Informáticos |
Issue Date: | 30-Mar-2021 |
Publisher: | MDPI |
Citation: | Sepúlveda-Torres R, Bonet-Jover A, Saquete E. “Here Are the Rules: Ignore All Rules”: Automatic Contradiction Detection in Spanish. Applied Sciences. 2021; 11(7):3060. https://doi.org/10.3390/app11073060 |
Abstract: | This paper tackles the automatic detection of contradictions in Spanish within the news domain. Two pieces of information are classified as compatible, contradictory, or unrelated. To deal with the task, the ES-Contradiction dataset was created. This dataset contains a balanced number of examples of each of the three classes. The novelty of the research is the fine-grained annotation of the different types of contradictions in the dataset. Presently, four types of contradictions are covered in the contradiction examples: negation, antonyms, numerical, and structural. Future work will extend the dataset with all possible types of contradictions. To validate the effectiveness of the dataset, a pretrained model (BETO) is used, and after performing different experiments, the system is able to detect contradictions with an F1m of 92.47%. Regarding the type of contradiction, the best results are obtained with negation contradictions (F1m = 98%), whereas structural contradictions obtain the lowest results (F1m = 69%) because there are fewer structural examples, owing to the complexity of generating them. When dealing with a more general dataset such as XNLI, a system trained on our dataset fails to detect most of the contradictions properly, as the sizes of the two datasets are very different and our dataset only covers four types of contradiction. However, using the classification of the contradictions leads us to conclude that there are highly complex contradictions that will need external knowledge in order to be properly detected, which would avoid the need for the system to have been exposed to them beforehand. |
Sponsor: | This research work has been partially funded by Generalitat Valenciana through project “SIIA: Tecnologias del lenguaje humano para una sociedad inclusiva, igualitaria, y accesible” with grant reference PROMETEU/2018/089, by the Spanish Government through project RTI2018-094653-B-C22: “Modelang: Modeling the behavior of digital entities by Human Language Technologies”, as well as being partially supported by a grant from the Fondo Europeo de Desarrollo Regional (FEDER) and the LIVING-LANG project (RTI2018-094653-B-C21) from the Spanish Government. |
URI: | http://hdl.handle.net/10045/113915 |
ISSN: | 2076-3417 |
DOI: | 10.3390/app11073060 |
Language: | eng |
Type: | info:eu-repo/semantics/article |
Rights: | © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). |
Peer Review: | yes |
Publisher version: | https://doi.org/10.3390/app11073060 |
Appears in Collections: | INV - GPLSI - Artículos de Revistas |
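The abstract reports results as F1m over three classes (compatible, contradictory, unrelated). As a minimal sketch, assuming F1m denotes the macro-averaged F1 (the unweighted mean of the per-class F1 scores), the metric could be computed as follows. The label strings, the function name `macro_f1`, and the toy predictions are hypothetical illustrations and are not taken from the paper or the ES-Contradiction dataset.

```python
from collections import Counter

# Hypothetical label names mirroring the three classes named in the abstract.
LABELS = ("compatible", "contradiction", "unrelated")


def macro_f1(gold, predicted, labels=LABELS):
    """Return the unweighted mean of the per-class F1 scores."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1   # correct prediction for class g
        else:
            fp[p] += 1   # class p was predicted wrongly
            fn[g] += 1   # class g was missed

    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example (invented data): one contradiction is mislabeled as unrelated.
gold = ["compatible", "contradiction", "unrelated", "contradiction"]
pred = ["compatible", "contradiction", "unrelated", "unrelated"]
score = macro_f1(gold, pred)
```

Because every class contributes equally to the mean regardless of how many examples it has, a weak class (such as the structural contradictions mentioned in the abstract, when scored per type) pulls the macro average down more than it would under a micro average.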
This item is licensed under a Creative Commons License