EvoSplit: An Evolutionary Approach to Split a Multi-Label Data Set into Disjoint Subsets
Please use this identifier to cite or link to this item:
http://hdl.handle.net/10045/113868
Title: | EvoSplit: An Evolutionary Approach to Split a Multi-Label Data Set into Disjoint Subsets |
---|---|
Author(s): | Flórez-Revuelta, Francisco |
Research group(s) or GITE: | Informática Industrial y Redes de Computadores |
Centre, Department or Service: | Universidad de Alicante. Departamento de Tecnología Informática y Computación |
Keywords: | Multi-label data sets | Supervised learning | Machine learning | Evolutionary computation | Big data applications |
Knowledge area(s): | Computer Architecture and Technology |
Publication date: | 22-Mar-2021 |
Publisher: | MDPI |
Bibliographic citation: | Florez-Revuelta F. EvoSplit: An Evolutionary Approach to Split a Multi-Label Data Set into Disjoint Subsets. Applied Sciences. 2021; 11(6):2823. https://doi.org/10.3390/app11062823 |
Abstract: | This paper presents a new evolutionary approach, EvoSplit, for the distribution of multi-label data sets into disjoint subsets for supervised machine learning. Currently, data set providers either divide a data set randomly or use iterative stratification, a method that aims to maintain the label (or label pair) distribution of the original data set in the different subsets. Following the same aim, this paper first introduces a single-objective evolutionary approach that tries to obtain a split that maximizes the similarity between those distributions independently. Second, a new multi-objective evolutionary algorithm is presented to maximize the similarity while considering both distributions (labels and label pairs) simultaneously. Both approaches are validated using well-known multi-label data sets as well as large image data sets currently used in computer vision and machine learning applications. Compared with iterative stratification, EvoSplit improves the splitting of a data set according to several measures: Label Distribution, Label Pair Distribution, Examples Distribution, and the number of folds and fold-label pairs with zero positive examples. |
URI: | http://hdl.handle.net/10045/113868 |
ISSN: | 2076-3417 |
DOI: | 10.3390/app11062823 |
Language: | eng |
Type: | info:eu-repo/semantics/article |
Rights: | © 2021 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). |
Peer reviewed: | yes |
Publisher's version: | https://doi.org/10.3390/app11062823 |
Appears in collections: | INV - I2RC - Artículos de Revistas; INV - AmI4AHA - Artículos de Revistas |
Files in this item:
File | Description | Size | Format | |
---|---|---|---|---|
Florez-Revuelta_2021_ApplSci.pdf | | 959.84 kB | Adobe PDF | |
This item is licensed under a Creative Commons License
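
The abstract names Label Distribution as one of the evaluation measures but does not define it. The following is a minimal sketch, assuming the definition commonly used in the iterative stratification literature: the absolute difference between each subset's positive-to-negative ratio and that of the whole data set, averaged over labels and subsets. The article itself should be consulted for the exact form. The names `odds`, `label_distribution`, and `toy_search` are hypothetical, and `toy_search` is a simple (1+1)-style swap search included only for illustration; it is not the EvoSplit algorithm. The sketch also assumes that every label has at least one negative example in the data set.

```python
import random


def odds(pos, total):
    """Positive-to-negative ratio; infinite when there are no negatives."""
    neg = total - pos
    return pos / neg if neg > 0 else float("inf")


def label_distribution(Y, folds):
    """Average, over labels and folds, of |fold odds - global odds|.

    Y is a list of binary label vectors; folds is a list of index lists.
    Lower is better; 0 means every fold mirrors the global label odds.
    (Assumed definition, see the paragraph above.)
    """
    n, q = len(Y), len(Y[0])
    global_pos = [sum(row[i] for row in Y) for i in range(q)]
    total = 0.0
    for i in range(q):
        target = odds(global_pos[i], n)
        for fold in folds:
            fold_pos = sum(Y[r][i] for r in fold)
            total += abs(odds(fold_pos, len(fold)) - target)
    return total / (q * len(folds))


def toy_search(Y, folds, steps=200, seed=0):
    """Illustrative (1+1)-style search, not EvoSplit itself.

    Each step swaps one example between two random folds and keeps the
    swap if the Label Distribution score does not get worse. Fold sizes
    are preserved and the input folds are not modified.
    """
    rng = random.Random(seed)
    best = [list(f) for f in folds]
    best_score = label_distribution(Y, best)
    for _ in range(steps):
        cand = [list(f) for f in best]
        a, b = rng.sample(range(len(cand)), 2)
        i, j = rng.randrange(len(cand[a])), rng.randrange(len(cand[b]))
        cand[a][i], cand[b][j] = cand[b][j], cand[a][i]
        score = label_distribution(Y, cand)
        if score <= best_score:
            best, best_score = cand, score
    return best, best_score


# Tiny made-up data set: 8 examples, 2 labels, deliberately uneven start.
Y = [[1, 0], [1, 0], [1, 1], [1, 0], [0, 1], [0, 0], [0, 1], [0, 0]]
start = [[0, 1, 2, 5], [3, 4, 6, 7]]
split, score = toy_search(Y, start)
```

Because only non-worsening swaps are accepted, the returned score can never exceed the score of the starting split; a real evolutionary method would typically maintain a population and, as the abstract describes, may optimize label and label-pair distributions together.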