On the Poor Robustness of Transformer Models in Cross-Language Humor Recognition

Labadie Tamayo, Roberto; Ortega, Reynier; Rosso, Paolo; Rodriguez Cisneros, Mariano

Please use this identifier to cite or link to this item: http://hdl.handle.net/10045/133258

Item information
Title:	On the Poor Robustness of Transformer Models in Cross-Language Humor Recognition
Other Titles:	Sobre la Poca Robustez de los Modelos Transformers en el Reconocimiento Translingüístico del Humor
Authors:	Labadie Tamayo, Roberto | Ortega, Reynier | Rosso, Paolo | Rodriguez Cisneros, Mariano
Keywords:	Humor recognition | Humor translation | Cross-language humor | Multilingual models
Issue Date:	Mar-2023
Publisher:	Sociedad Española para el Procesamiento del Lenguaje Natural
Citation:	Procesamiento del Lenguaje Natural. 2023, 70: 73-83. https://doi.org/10.26342/2023-70-6
Abstract:	Humor is a pervasive communicative device; nevertheless, carrying it from one language to another remains challenging for machines and even for humans. In this work, we investigate humor recognition from a cross-language and cross-domain perspective, focusing on English and Spanish. To this end, we rely on two strategies: the first uses multilingual transformer models to exploit the cross-language knowledge they distill, and the second introduces machine translation so that models learn and make predictions in a single language. Experiments showed that models struggle with the complexity of humor once it is translated, revealing a degradation in humor perception as messages pass from one language to another. However, when multilingual models face a cross-language scenario in which the fine-tuning and evaluation data are in different languages, humor translation helps to align the knowledge learned in the fine-tuning phase. Accordingly, a mean increase of 11% in F1 score was observed when classifying English-written texts with models fine-tuned on a Spanish dataset. These results are encouraging and constitute a first step towards a computational cross-language analysis of humor.
Sponsor:	This work has been partially developed with the support of valgrAI - Valencian Graduate School and Research Network of Artificial Intelligence and the Generalitat Valenciana, and co-funded by the European Union. The work of Ortega Bueno and Rosso was carried out in the framework of the FairTransNLP research project (PID2021-124361OB-C31), funded by MCIN/AEI/10.13039/501100011033 and by ERDF, EU "A way of making Europe".
URI:	http://hdl.handle.net/10045/133258
ISSN:	1135-5948
DOI:	10.26342/2023-70-6
Language:	eng
Type:	info:eu-repo/semantics/article
Rights:	© Sociedad Española para el Procesamiento del Lenguaje Natural. Distributed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License
Peer Review:	yes
Publisher version:	https://doi.org/10.26342/2023-70-6
Appears in Collections:	Procesamiento del Lenguaje Natural - Nº 70 (2023)

Files in This Item:
File	Description	Size	Format
PLN_70_06.pdf		1.01 MB	Adobe PDF

This item is licensed under a Creative Commons License