A Deep Learning-Based Multimodal Architecture to predict Signs of Dementia

Please use this identifier to cite or link to this item: http://hdl.handle.net/10045/134891
Item information
Title: A Deep Learning-Based Multimodal Architecture to predict Signs of Dementia
Author(s): Ortiz Pérez, David | Ruiz Ponce, Pablo | Tomás, David | Garcia-Rodriguez, Jose | Vizcaya-Moreno, M. Flores | Leo, Marco
Research group(s) or GITE: Arquitecturas Inteligentes Aplicadas (AIA) | Procesamiento del Lenguaje y Sistemas de Información (GPLSI) | Enfermería Clínica (EC)
Center, Department or Service: Universidad de Alicante. Departamento de Tecnología Informática y Computación | Universidad de Alicante. Departamento de Lenguajes y Sistemas Informáticos | Universidad de Alicante. Departamento de Enfermería
Keywords: Multimodal | Deep Learning | Transformers | Dementia prediction
Publication date: 5-Jun-2023
Publisher: Elsevier
Bibliographic citation: Neurocomputing. 2023, 548: 126413. https://doi.org/10.1016/j.neucom.2023.126413
Abstract: This paper proposes a multimodal deep learning architecture that combines text and audio information to predict dementia, a disease that affects around 55 million people worldwide and, in some cases, leaves them dependent on others. The system was evaluated on the DementiaBank Pitt Corpus dataset, which includes audio recordings and their transcriptions for both healthy people and people with dementia. Several models were tested, including Convolutional Neural Networks (CNNs) for audio classification, Transformers for text classification, and a combination of both in a multimodal ensemble. Evaluated on a test set, the best results were obtained with the text modality alone, reaching 90.36% accuracy on the dementia detection task. Additionally, an analysis of the corpus was conducted for the sake of explainability, aiming to better understand how the models generate their predictions and to identify patterns in the data.
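The abstract describes the architecture only at a high level. As a rough illustration of the late-fusion idea it mentions (CNN over audio, Transformer over text, combined in an ensemble), below is a minimal PyTorch sketch; the module sizes, pooling choices, and fusion-by-concatenation strategy are illustrative assumptions, not the configuration reported in the paper.

import torch
import torch.nn as nn

class AudioCNN(nn.Module):
    """Small 1-D CNN over audio features (e.g. log-mel frames). Assumed layout."""
    def __init__(self, n_mels=64, hidden=128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(n_mels, hidden, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),   # pool over the time axis
        )

    def forward(self, x):               # x: (batch, n_mels, time)
        return self.conv(x).squeeze(-1)  # (batch, hidden)

class TextEncoder(nn.Module):
    """Toy Transformer encoder; a stand-in for a pretrained model."""
    def __init__(self, vocab=30522, dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, ids):             # ids: (batch, seq_len)
        h = self.encoder(self.embed(ids))
        return h.mean(dim=1)            # (batch, dim), mean-pooled

class MultimodalClassifier(nn.Module):
    """Late fusion: concatenate both encodings, then classify."""
    def __init__(self, n_classes=2):
        super().__init__()
        self.audio = AudioCNN()
        self.text = TextEncoder()
        self.head = nn.Linear(128 + 128, n_classes)

    def forward(self, audio_feats, token_ids):
        fused = torch.cat([self.audio(audio_feats), self.text(token_ids)], dim=-1)
        return self.head(fused)         # logits: healthy vs. dementia

# Smoke test with random inputs.
model = MultimodalClassifier()
logits = model(torch.randn(4, 64, 200), torch.randint(0, 30522, (4, 50)))
print(logits.shape)  # torch.Size([4, 2])

Concatenating pooled per-modality embeddings before a single linear head is one common fusion scheme; the paper's reported best result, however, came from the text modality on its own.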
Sponsor(s): We would like to thank the “A way of making Europe” European Regional Development Fund (ERDF) and MCIN/AEI/10.13039/501100011033 for supporting this work under the MoDeaAS project (grant PID2019-104818RB-I00) and the AICARE project (grant SPID202200X139779IV0). Furthermore, we would like to thank Nvidia for their generous hardware donation that made these experiments possible.
URI: http://hdl.handle.net/10045/134891
ISSN: 0925-2312 (Print) | 1872-8286 (Online)
DOI: 10.1016/j.neucom.2023.126413
Language: eng
Type: info:eu-repo/semantics/article
Rights: © 2023 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Peer reviewed: yes
Publisher's version: https://doi.org/10.1016/j.neucom.2023.126413
Appears in collections: INV - GPLSI - Artículos de Revistas
INV - Enfermería Clínica - Artículos de Revistas
INV - AIA - Artículos de Revistas

Files in this item:
File: Ortiz-Perez_etal_2023_Neurocomputing.pdf | Size: 1,56 MB | Format: Adobe PDF


This item is licensed under a Creative Commons License.