Automatic selection of molecular descriptors using random forest: Application to drug discovery

Cano, Gaspar; Garcia-Rodriguez, Jose; Garcia-Garcia, Alberto; Pérez Sánchez, Horacio; Benediktsson, Jón Atli; Thapa, Anil; Barr, Alastair

Please use this identifier to cite or link to this item: http://hdl.handle.net/10045/61744

Item information
Title:	Automatic selection of molecular descriptors using random forest: Application to drug discovery
Author(s):	Cano, Gaspar | Garcia-Rodriguez, Jose | Garcia-Garcia, Alberto | Pérez Sánchez, Horacio | Benediktsson, Jón Atli | Thapa, Anil | Barr, Alastair
Research group(s) or GITE:	Informática Industrial y Redes de Computadores
Center, Department or Service:	Universidad de Alicante. Departamento de Tecnología Informática y Computación
Keywords:	Random forest | Drug discovery | Molecular descriptors | Computational chemistry
Knowledge area(s):	Computer Architecture and Technology
Publication date:	15-Apr-2017
Publisher:	Elsevier
Citation:	Expert Systems with Applications. 2017, 72: 151-159. doi:10.1016/j.eswa.2016.12.008
Abstract:	The optimal selection of chemical features (molecular descriptors) is an essential pre-processing step for the efficient application of computational intelligence techniques in virtual screening for identification of bioactive molecules in drug discovery. The selection of molecular descriptors has a key influence on the accuracy of affinity prediction. In order to improve this prediction, we examined a Random Forest (RF)-based approach to automatically select molecular descriptors of training data for ligands of kinases, nuclear hormone receptors, and other enzymes. Reducing the number of features used during prediction dramatically reduces the computing time compared with existing approaches and consequently permits the exploration of much larger sets of experimental data. To test the validity of the method, we compared the results of our approach with those obtained using manual feature selection in our previous study (Perez-Sanchez, Cano, and Garcia-Rodriguez, 2014). The main novelty of this work in the field of drug discovery is the use of RF in two different ways: feature ranking and dimensionality reduction, and classification using the automatically selected feature subset. Our RF-based method outperforms classification results provided by Support Vector Machine (SVM) and Neural Networks (NN) approaches.
Sponsor(s):	This work was partially supported by the Fundación Séneca del Centro de Coordinación de la Investigación de la Región de Murcia under Project 18946/JLI/13. This work has been funded by the Nils Coordinated Mobility under grant 012-ABEL-CM-2014A, in part financed by the European Regional Development Fund (ERDF).
URI:	http://hdl.handle.net/10045/61744
ISSN:	0957-4174 (Print) | 1873-6793 (Online)
DOI:	10.1016/j.eswa.2016.12.008
Language:	eng
Type:	info:eu-repo/semantics/article
Rights:	© 2016 Elsevier Ltd.
Peer reviewed:	yes
Publisher's version:	http://dx.doi.org/10.1016/j.eswa.2016.12.008
Appears in collections:	INV - I2RC - Artículos de Revistas | INV - AIA - Artículos de Revistas

Files in this item:

File	Description	Size	Format
2017_Cano_etal_ESWA_final.pdf	Final version (restricted access)	2.6 MB	Adobe PDF
2017_Cano_etal_ESWA_accepted.pdf	Accepted Manuscript (open access)	7.88 MB	Adobe PDF
