Protein-Protein Interactions: Gene Acronym Redundancies and Current Limitations Precluding Automated Data Integration

Please use this identifier to cite or link to this item: http://hdl.handle.net/10045/69174
Item information
Title: Protein-Protein Interactions: Gene Acronym Redundancies and Current Limitations Precluding Automated Data Integration
Authors: Casado Vela, Juan | Matthiesen, Rune | Sellés-Marchart, Susana | Naranjo, José Ramón
Research Group/s: Proteómica y Genómica Funcional de Plantas
Center, Department or Service: Universidad de Alicante. Departamento de Agroquímica y Bioquímica
Keywords: Bioinformatics | Calsenilin | Choline kinase | Data integration | DREAM | Gene acronym | Gene redundancy | HGNC | HUGO | Human interactome | KChIP3 | Protein accession | Protein interactions | Protein-protein prediction | Uromodulin
Knowledge Area: Biochemistry and Molecular Biology
Issue Date: 31-May-2013
Publisher: MDPI
Citation: Casado-Vela J, Matthiesen R, Sellés S, Naranjo JR. Protein-Protein Interactions: Gene Acronym Redundancies and Current Limitations Precluding Automated Data Integration. Proteomes. 2013; 1(1):3-24. doi:10.3390/proteomes1010003
Abstract: Understanding protein interaction networks and their dynamic changes is a major challenge in modern biology. Currently, several experimental and in silico approaches allow large-scale screening of protein interactors. As a result, the bulk of information on protein interactions deposited in databases and in the peer-reviewed literature is constantly growing. Multiple databases interfaced through user-friendly web tools have recently emerged to facilitate the retrieval and integration of protein interaction data. Nevertheless, as we show in this report, despite current efforts towards data integration, the information on protein interactions retrieved by in silico approaches is frequently incomplete and may even include false interactions. Here we point to some obstacles precluding confident data integration, with special emphasis on protein interactions, which include gene acronym redundancies and protein synonyms. Three human proteins (choline kinase, PPIase and uromodulin) and three different web-based data search engines focused on protein interaction data retrieval (PSICQUIC, DASMI and BIPS) were used to explain the potential occurrence of undesired errors that should be considered by researchers in the field. We demonstrate that, despite the recent initiatives towards data standardization, manual curation of protein interaction networks based on literature searches is still required to remove potential false positives. A three-step workflow consisting of (i) data retrieval from multiple databases, (ii) peer-reviewed literature searches, and (iii) data curation and integration is proposed as the best strategy to gather updated information on protein interactions. Finally, this strategy was applied to compile bona fide information on the human DREAM protein interactome, which constitutes a reliable training dataset that can be used to improve computational predictions.
Sponsor: J. Casado-Vela is a JAE-DOC (CSIC) holder supported by Ministerio de Economía y Competitividad, Spain, co-funded by the European Social Fund. RM is supported by PTDC/EIA-EIA/099458/2008 Fundação para a Ciência e a Tecnologia (FCT), program CIENCIA 2007.
URI: http://hdl.handle.net/10045/69174
ISSN: 2227-7382
DOI: 10.3390/proteomes1010003
Language: eng
Type: info:eu-repo/semantics/article
Rights: © 2013 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).
Peer Review: yes
Publisher version: http://dx.doi.org/10.3390/proteomes1010003
Appears in Collections: INV - Proteómica y Genómica Funcional de Plantas - Artículos de Revistas

Files in This Item:
File | Description | Size | Format
2013_Casado-Vela_etal_Proteomes.pdf | | 732.93 kB | Adobe PDF


This item is licensed under a Creative Commons License.