Marginalised Stacked Denoising Autoencoders for Robust Representation of Real-Time Multi-View Action Recognition

Please use this identifier to cite or link to this item: http://hdl.handle.net/10045/48462
Item information
Title: Marginalised Stacked Denoising Autoencoders for Robust Representation of Real-Time Multi-View Action Recognition
Author(s): Gu, Feng | Flórez-Revuelta, Francisco | Monekosso, Dorothy | Remagnino, Paolo
Research group(s) or GITE: Informática Industrial y Redes de Computadores | Domótica y Ambientes Inteligentes
Centre, Department or Service: Universidad de Alicante. Departamento de Tecnología Informática y Computación
Keywords: Deep learning | Marginalised stacked denoising autoencoders | Bag of words | Multiple kernel learning | Multi-view action recognition
Knowledge area(s): Computer Architecture and Technology
Publication date: 16-Jul-2015
Publisher: MDPI
Bibliographic citation: Gu F, Flórez-Revuelta F, Monekosso D, Remagnino P. Marginalised Stacked Denoising Autoencoders for Robust Representation of Real-Time Multi-View Action Recognition. Sensors. 2015; 15(7):17209-17231. doi:10.3390/s150717209
Abstract: Multi-view action recognition has gained great interest in video surveillance, human-computer interaction, and multimedia retrieval, where multiple cameras of different types are deployed to provide complementary fields of view. Fusion of multiple camera views evidently leads to more robust decisions on both tracking multiple targets and analysing complex human activities, especially where there are occlusions. In this paper, we incorporate the marginalised stacked denoising autoencoders (mSDA) algorithm to further improve the bag of words (BoWs) representation in terms of robustness and usefulness for multi-view action recognition. The resulting representations are fed into three simple fusion strategies as well as a multiple kernel learning algorithm at the classification stage. Based on the internal evaluation, the codebook size of BoWs and the number of layers of mSDA may not significantly affect recognition performance. According to results on three multi-view benchmark datasets, the proposed framework improves recognition performance across all three datasets and outputs record recognition performance, beating the state-of-the-art algorithms in the literature. It is also capable of performing real-time action recognition at a frame rate ranging from 33 to 45, which could be further improved by using more powerful machines in future applications.
Sponsor(s): This work has been supported by the Ambient Assisted Living Joint Programme and Innovate UK under project “BREATHE—Platform for self-assessment and efficient management for informal caregivers” (AAL-JP-2012-5-045).
URI: http://hdl.handle.net/10045/48462
ISSN: 1424-8220
DOI: 10.3390/s150717209
Language: eng
Type: info:eu-repo/semantics/article
Rights: © 2015 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/4.0/)
Peer reviewed: yes
Publisher's version: http://dx.doi.org/10.3390/s150717209
Appears in collections: INV - DAI - Artículos de Revistas
INV - AmI4AHA - Artículos de Revistas

Files in this item:
File: 2015_Gu_etal_Sensors.pdf
Size: 713.63 kB
Format: Adobe PDF


This item is licensed under a Creative Commons License