An Approach to Automatically Detect and Visualize Bias in Data Analytics

Please use this identifier to cite or link to this item: http://hdl.handle.net/10045/104029
Item information
Title: An Approach to Automatically Detect and Visualize Bias in Data Analytics
Author(s): Lavalle, Ana | Maté, Alejandro | Trujillo, Juan
Research group(s): Lucentia
Center, Department or Service: Universidad de Alicante. Departamento de Lenguajes y Sistemas Informáticos
Keywords: Data Analytics | Artificial Intelligence | Automatically Detect | Visualize Bias
Knowledge area(s): Lenguajes y Sistemas Informáticos
Publication date: 2020
Publisher: CEUR
Bibliographic citation: DOLAP 2020, Proceedings of the 22nd International Workshop on Design, Optimization, Languages and Analytical Processing of Big Data co-located with EDBT/ICDT 2020 Joint Conference (EDBT/ICDT 2020), Copenhagen, Denmark, March 30, 2020. CEUR Workshop Proceedings, Vol-2572
Abstract: Data Analytics and Artificial Intelligence (AI) are increasingly driving key business decisions and business processes. Any flaws in the interpretation of analytic results or AI outputs can lead to significant economic losses and reputation damage. Among existing flaws, one of the most often overlooked is the use of biased data and imbalanced datasets. When it goes unnoticed, data bias warps the meaning of data and has a devastating effect on AI results. Existing approaches deal with data bias by constraining the data model, altering its composition until the data is no longer biased. Unfortunately, studies have shown that crucial information about the nature of data may be lost during this process. Therefore, in this paper we propose an alternative process, one that detects data biases and presents biased data in a visual way so that users can comprehend how the data is structured and decide whether or not constraining approaches are applicable in their context. Our approach detects the existence of biases in datasets through our proposed algorithm and generates a series of visualizations in a way that is understandable for users, including non-expert ones. In this way, users become aware not only of the existence of biases in the data, but also of how they may impact their analytics and AI algorithms, thus avoiding undesired results.
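The paper's own detection algorithm is not reproduced in this record. As a minimal illustrative sketch only (all function names, the threshold heuristic, and the text-based visualization are assumptions, not the authors' method), detecting class imbalance in a label column and surfacing it visually could look like:

```python
from collections import Counter


def detect_imbalance(labels, threshold=0.2):
    """Flag classes whose share of the samples deviates from the
    uniform share (1/k for k classes) by more than `threshold`.
    Illustrative heuristic only -- not the paper's algorithm."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    uniform = 1.0 / k
    shares = {c: cnt / n for c, cnt in counts.items()}
    biased = {c: s for c, s in shares.items() if abs(s - uniform) > threshold}
    return shares, biased


def ascii_bars(shares, width=40):
    """Render class shares as text bars so even a non-expert user
    can see the skew at a glance."""
    return "\n".join(f"{c:>10} | {'#' * round(s * width)} {s:.0%}"
                     for c, s in sorted(shares.items()))


# Hypothetical loan-decision dataset: 90% approved, 10% denied.
labels = ["approved"] * 90 + ["denied"] * 10
shares, biased = detect_imbalance(labels)
print(ascii_bars(shares))
if biased:
    print("Warning: imbalanced classes:", sorted(biased))
```

A real pipeline in the spirit of the paper would replace the ASCII bars with proper charts and keep the warning, so the analyst decides whether rebalancing (the "constraining" approach the abstract criticizes) is appropriate for their context.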
Sponsor(s): This work has been co-funded by the ECLIPSE-UA (RTI2018-094283-B-C32) project, funded by the Spanish Ministry of Science, Innovation, and Universities. Ana Lavalle holds an Industrial PhD Grant (I-PI 03-18) co-funded by the University of Alicante and the Lucentia Lab Spin-off Company.
URI: http://hdl.handle.net/10045/104029
ISSN: 1613-0073
Language: eng
Type: info:eu-repo/semantics/conferenceObject
Rights: © Copyright 2020 for this paper held by its author(s). Published in the proceedings of DOLAP 2020 (March 30, 2020, Copenhagen, Denmark, co-located with EDBT/ICDT 2020) on CEUR-WS.org. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
Peer reviewed: yes
Publisher's version: http://ceur-ws.org/Vol-2572/short11.pdf
Appears in collections: INV - LUCENTIA - Comunicaciones a Congresos, Conferencias, etc.

Files in this item:
File | Description | Size | Format
2020_Lavalle_etal_DOLAP.pdf | | 930.99 kB | Adobe PDF


This item is licensed under a Creative Commons License.