Binary classifiers versus AdaBoost for labeling of digital documents

Montejo Ráez, Arturo; Ureña López, Luis Alfonso

Please use this identifier to cite or link to this item: http://hdl.handle.net/10045/3341

Full metadata record

DC Field	Value	Language
dc.contributor.author	Montejo Ráez, Arturo	-
dc.contributor.author	Ureña López, Luis Alfonso	-
dc.date.accessioned	2007-11-28T16:35:28Z	-
dc.date.available	2007-11-28T16:35:28Z	-
dc.date.issued	2006-09	-
dc.identifier.citation	MONTEJO RÁEZ, Arturo; UREÑA LÓPEZ, Luis Alfonso. "Binary classifiers versus AdaBoost for labeling of digital documents". Procesamiento del lenguaje natural. N. 37 (sept. 2006). ISSN 1135-5948, pp. 319-326	en
dc.identifier.issn	1135-5948	-
dc.identifier.uri	http://hdl.handle.net/10045/3341	-
dc.description.abstract	The assignment of terms from a controlled vocabulary (usually a thesaurus) to documents in digital format is opening the door to new applications. This article compares two advanced algorithms for document classification: adaptive selection of binary base classifiers and the AdaBoost algorithm. Although both showed similar response times, the former gave the best results on the hep-ex partition of the HEP corpus, supporting this method as a robust solution to multi-labeling for large collections.	en
dc.description.abstract	The assignment of labels from a controlled set of terms (usually a thesaurus) to digital versions of documents is opening up a wide range of new applications, which are becoming powerful tools for digital libraries. In this paper we compare two advanced approaches to multi-label text categorization: the adaptive selection of binary base classifiers and the AdaBoost algorithm. Although both showed similar response times when producing final labels, adaptive selection of binary classifiers performed better than AdaBoost on the hep-ex partition of the HEP corpus, confirming this method as a robust solution for multi-labeling large collections.	en
dc.description.sponsorship	This work has been partially supported by the Spanish Government under project R2D2-RIM (TIC2003-07158-C04-04).	en
dc.language	eng	en
dc.publisher	Sociedad Española para el Procesamiento del Lenguaje Natural	en
dc.relation.ispartof	Procesamiento del lenguaje natural, nº 37 (sept. 2006), pp. 319-326	en
dc.subject	Clasificación automática de documentos	en
dc.subject	Comparación de algoritmos	en
dc.subject	Clasificación binaria	en
dc.subject	Benchmark	en
dc.subject	Automatic text categorization	en
dc.subject	Algorithms comparison	en
dc.subject	Binary classification	en
dc.title	Binary classifiers versus AdaBoost for labeling of digital documents	en
dc.type	info:eu-repo/semantics/article	en
dc.rights.accessRights	info:eu-repo/semantics/openAccess	-
Appears in collections:	Procesamiento del Lenguaje Natural - No. 37 (September 2006)

Files in this item:

File	Description	Size	Format
PLN_37_39.pdf		159.01 kB	Adobe PDF
