EvoSplit: An Evolutionary Approach to Split a Multi-Label Data Set into Disjoint Subsets
Always use this identifier to cite or link to this item
http://hdl.handle.net/10045/113868
Title: | EvoSplit: An Evolutionary Approach to Split a Multi-Label Data Set into Disjoint Subsets |
---|---|
Authors: | Flórez-Revuelta, Francisco |
Research groups or GITE: | Informática Industrial y Redes de Computadores |
Centre, Department or Service: | Universidad de Alicante. Departamento de Tecnología Informática y Computación |
Keywords: | Multi-label data sets | Supervised learning | Machine learning | Evolutionary computation | Big data applications |
Knowledge areas: | Arquitectura y Tecnología de Computadores |
Publication date: | 22 March 2021 |
Publisher: | MDPI |
Bibliographic citation: | Florez-Revuelta F. EvoSplit: An Evolutionary Approach to Split a Multi-Label Data Set into Disjoint Subsets. Applied Sciences. 2021; 11(6):2823. https://doi.org/10.3390/app11062823 |
Abstract: | This paper presents a new evolutionary approach, EvoSplit, for the distribution of multi-label data sets into disjoint subsets for supervised machine learning. Currently, data set providers either divide a data set randomly or using iterative stratification, a method that aims to maintain the label (or label pair) distribution of the original data set into the different subsets. Following the same aim, this paper first introduces a single-objective evolutionary approach that tries to obtain a split that maximizes the similarity between those distributions independently. Second, a new multi-objective evolutionary algorithm is presented to maximize the similarity considering simultaneously both distributions (labels and label pairs). Both approaches are validated using well-known multi-label data sets as well as large image data sets currently used in computer vision and machine learning applications. EvoSplit improves the splitting of a data set in comparison to the iterative stratification following different measures: Label Distribution, Label Pair Distribution, Examples Distribution, folds and fold-label pairs with zero positive examples. |
URI: | http://hdl.handle.net/10045/113868 |
ISSN: | 2076-3417 |
DOI: | 10.3390/app11062823 |
Language: | eng |
Type: | info:eu-repo/semantics/article |
Rights: | © 2021 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). |
Peer reviewed: | yes |
Publisher's version: | https://doi.org/10.3390/app11062823 |
Appears in collections: | INV - I2RC - Artículos de Revistas; INV - AmI4AHA - Artículos de Revistas |
Files in this item:
File | Description | Size | Format | |
---|---|---|---|---|
Florez-Revuelta_2021_ApplSci.pdf | | 959.84 kB | Adobe PDF | |
This item is licensed under a Creative Commons license.