Indoor Risks Assessment Using Video Captioning

Always use this identifier to cite or link to this item: http://hdl.handle.net/10045/135404
Item information
Title: Indoor Risks Assessment Using Video Captioning
Authors: Rodríguez Juan, Javier
Research director: Garcia-Rodriguez, Jose | Tomás, David
Center, Department or Service: Universidad de Alicante. Departamento de Tecnología Informática y Computación
Keywords: Video Captioning | Deep Learning | Indoor Environment | Research Engineering
Publication date: 22 June 2023
Defense date: 14 June 2023
Abstract: Progress in automatic scene analysis for homes and in ambient assisted living systems is vital for helping people who require special care in their daily lives, such as the elderly or visually impaired individuals. In this bachelor's thesis we study the most promising techniques in Video Captioning and scene analysis and, using the knowledge acquired, propose a Deep Learning pipeline that performs Risks Assessment on input videos. This pipeline could potentially be applied to build systems that help the aforementioned people. We also propose different evaluation architectures to test each stage of the Risks Assessment pipeline in order to observe its effectiveness and limitations. For the analysis of home scenes we introduce SwinBERT, a powerful and recent Video Captioning model, complemented with YOLOv7, a model aimed at the Object Recognition task. Moreover, we use various lexical transformations and linguistic models to maximize the semantic similarity of the generated descriptions and detected objects, aligning them with the annotations provided by the datasets used. This approach allows us to achieve matches that are more accurate from a human perspective. In the experiments we highlight the use of the large-scale dataset Charades, which was designed for visual analysis while preserving the naturalness and spontaneity of household and daily activities.
URI: http://hdl.handle.net/10045/135404
Language: eng
Type: info:eu-repo/semantics/bachelorThesis
Rights: Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License
Appears in collection: Grado en Ingeniería Informática - Trabajos Fin de Grado

All documents deposited in RUA are protected by copyright. Some rights reserved.