Integrating advanced vision-language models for context recognition in risks assessment

Please use this identifier to cite or link to this item: http://hdl.handle.net/10045/150006
Item information
Title: Integrating advanced vision-language models for context recognition in risks assessment
Authors: Rodriguez-Juan, Javier | Ortiz Pérez, David | Garcia-Rodriguez, Jose | Tomás, David | Nalepa, Grzegorz J.
Research Group/s: Arquitecturas Inteligentes Aplicadas (AIA) | Procesamiento del Lenguaje y Sistemas de Información (GPLSI)
Center, Department or Service: Universidad de Alicante. Departamento de Tecnología Informática y Computación | Universidad de Alicante. Departamento de Lenguajes y Sistemas Informáticos | Universidad de Alicante. Instituto Universitario de Investigación Informática
Keywords: Deep learning | Risks assessment | Object detection | Action recognition | Transformers
Issue Date: 7-Dec-2024
Publisher: Elsevier
Citation: Neurocomputing. 2025, 618: 129131. https://doi.org/10.1016/j.neucom.2024.129131
Abstract: This study proposes an open-environment, multi-label human risk classification framework, capable of identifying possible risks to which individuals appearing on input video data are exposed. The framework consists of an ensemble of models covering object detection, action recognition, context understanding and text classification tasks. Each model is evaluated separately in the context of home environments, with the overall framework performing well in each evaluation after fine tuning. The models were evaluated using a combination of several datasets, including Charades, ETRI-Activity3D, and custom video question answering and risk datasets. This study exploits the ability of large language models to interpret semantic visual features combined with textual input in order to understand the context in which the person is placed. The framework’s ability to output multiple risks and its cross-domain capabilities make it a powerful tool that can enhance current risk management systems in a variety of scenarios, such as homes, construction sites and industry.
Sponsor: We would like to thank the CIAICO/2022/132 Consolidated group project “AI4Health”, funded by the Valencian government, and the International Center for Aging Research (ICAR) funded project “IASISTEM”. This work has also been supported by a Valencian government grant for PhD studies, CIACIF/2022/175, and a research initiation grant from the University of Alicante, AII23-12. Finally, we would like to acknowledge the support of the University Institute for Computer Research at the University of Alicante.
URI: http://hdl.handle.net/10045/150006
ISSN: 0925-2312 (Print) | 1872-8286 (Online)
DOI: 10.1016/j.neucom.2024.129131
Language: eng
Type: info:eu-repo/semantics/article
Rights: © 2024 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Peer Review: yes
Publisher version: https://doi.org/10.1016/j.neucom.2024.129131
Appears in Collections: INV - AIA - Artículos de Revistas | INV - GPLSI - Artículos de Revistas

Files in This Item:
File: Rodriguez-Juan_etal_2025_Neurocomputing.pdf | Size: 1.28 MB | Format: Adobe PDF


Items in RUA are protected by copyright, with all rights reserved, unless otherwise indicated.