Oversampling imbalanced data in the string space
Please use this identifier to cite or link to this item:
http://hdl.handle.net/10045/72581
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor | Reconocimiento de Formas e Inteligencia Artificial | es_ES |
dc.contributor.author | Castellanos, Francisco J. | - |
dc.contributor.author | Valero-Mas, Jose J. | - |
dc.contributor.author | Calvo-Zaragoza, Jorge | - |
dc.contributor.author | Rico-Juan, Juan Ramón | - |
dc.contributor.other | Universidad de Alicante. Departamento de Lenguajes y Sistemas Informáticos | es_ES |
dc.date.accessioned | 2018-01-17T11:38:03Z | - |
dc.date.available | 2018-01-17T11:38:03Z | - |
dc.date.issued | 2018-02-01 | - |
dc.identifier.citation | Pattern Recognition Letters. 2018, 103: 32-38. doi:10.1016/j.patrec.2018.01.003 | es_ES |
dc.identifier.issn | 0167-8655 (Print) | - |
dc.identifier.issn | 1872-7344 (Online) | - |
dc.identifier.uri | http://hdl.handle.net/10045/72581 | - |
dc.description.abstract | Imbalanced data is a typical problem in the supervised classification field, which occurs when the different classes are not equally represented. This fact typically results in the classifier biasing its performance towards the class representing the majority of the elements. Many methods have been proposed to alleviate this scenario, yet all of them assume that data is represented as feature vectors. In this paper we propose a strategy to balance a dataset whose samples are encoded as strings. Our approach is based on adapting the well-known Synthetic Minority Over-sampling Technique (SMOTE) algorithm to the string space. More precisely, data generation is achieved with an iterative approach to create artificial strings within the segment between two given samples of the training set. Results with several datasets and imbalance ratios show that the proposed strategy properly deals with the problem in all cases considered. | es_ES |
dc.description.sponsorship | This work was partially supported by the Spanish Ministerio de Economía y Competitividad through Project TIMuL (No. TIN2013-48152-C2-1-R supported by EU FEDER funds), the Universidad de Alicante through the FPU program (UAFPU2014–5883) and grant GRE-16-04. | es_ES |
dc.language | eng | es_ES |
dc.publisher | Elsevier | es_ES |
dc.rights | © 2018 Elsevier B.V. | es_ES |
dc.subject | Class imbalance problem | es_ES |
dc.subject | Oversampling | es_ES |
dc.subject | String space | es_ES |
dc.subject | SMOTE | es_ES |
dc.subject.other | Lenguajes y Sistemas Informáticos | es_ES |
dc.title | Oversampling imbalanced data in the string space | es_ES |
dc.type | info:eu-repo/semantics/article | es_ES |
dc.peerreviewed | yes | es_ES |
dc.identifier.doi | 10.1016/j.patrec.2018.01.003 | - |
dc.relation.publisherversion | http://dx.doi.org/10.1016/j.patrec.2018.01.003 | es_ES |
dc.rights.accessRights | info:eu-repo/semantics/openAccess | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/MINECO//TIN2013-48152-C2-1-R | - |
Appears in collections: | INV - GRFIA - Artículos de Revistas |
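The abstract describes generating artificial strings that lie within the segment between two training samples. As a rough illustration of that idea only, and not the authors' published algorithm, the sketch below interpolates between two strings by applying a random subset of the operations in a Levenshtein edit script, then wraps this in a SMOTE-style loop over minority-class neighbours. The function names (`edit_script`, `interpolate`, `oversample`), the use of unit-cost Levenshtein distance, and the parameters `k`, `alpha` and `seed` are all assumptions made for this example.

```python
import random

def edit_script(a, b):
    """Levenshtein distance from a to b plus one optimal edit script.

    Each op is (kind, pos, ch), where pos indexes the ORIGINAL string a.
    Ops are listed right to left, so they can be applied in order safely.
    """
    n, m = len(a), len(b)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i
    for j in range(m + 1):
        d[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)
    ops, i, j = [], n, m
    while i > 0 or j > 0:  # backtrace from the bottom-right corner
        if i > 0 and j > 0 and d[i][j] == d[i - 1][j - 1] + (a[i - 1] != b[j - 1]):
            if a[i - 1] != b[j - 1]:
                ops.append(("sub", i - 1, b[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and d[i][j] == d[i - 1][j] + 1:
            ops.append(("del", i - 1, ""))
            i -= 1
        else:
            ops.append(("ins", i, b[j - 1]))
            j -= 1
    return d[n][m], ops

def interpolate(a, b, alpha, rng):
    """Apply about alpha * distance of the edit operations that turn a into b."""
    dist, ops = edit_script(a, b)
    k = round(alpha * dist)
    # Keep the chosen ops in right-to-left order so earlier positions stay valid.
    chosen = sorted(rng.sample(range(dist), k))
    s = list(a)
    for idx in chosen:
        kind, pos, ch = ops[idx]
        if kind == "sub":
            s[pos] = ch
        elif kind == "del":
            del s[pos]
        else:
            s.insert(pos, ch)
    return "".join(s)

def oversample(minority, n_new, k=3, seed=0):
    """SMOTE-style loop (assumed design): pair a sample with a near neighbour."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base_idx = rng.randrange(len(minority))
        base = minority[base_idx]
        others = [s for i, s in enumerate(minority) if i != base_idx]
        neighbours = sorted(others, key=lambda s: edit_script(base, s)[0])[:k]
        mate = rng.choice(neighbours)
        synthetic.append(interpolate(base, mate, rng.random(), rng))
    return synthetic
```

Calling `oversample(minority_strings, n_new)` with a list of at least two minority-class strings returns `n_new` synthetic strings. Because each synthetic string is obtained by applying part of an optimal edit script, its edit distances to the two source strings add up to the distance between them, which is one way to read "within the segment" in a string space.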
Files in this item:
File | Description | Size | Format | |
---|---|---|---|---|
2018_Castellanos_etal_PatternRecognLett_final.pdf | Final version (restricted access) | 716.98 kB | Adobe PDF | Open / Request a copy |
2018_Castellanos_etal_PatternRecognLett_accepted.pdf | Accepted Manuscript (open access) | 313.42 kB | Adobe PDF | Open / Preview |
All documents in RUA are protected by copyright. Some rights reserved.