Boosting Perturbation-Based Iterative Algorithms to Compute the Median String

Por favor, use este identificador para citar o enlazar este ítem: http://hdl.handle.net/10045/141854
Información del item - Informació de l'item - Item information
Título: Boosting Perturbation-Based Iterative Algorithms to Compute the Median String
Autor/es: Mirabal, Pedro | Abreu Salas, José Ignacio | Seco, Diego | Pedreira, Óscar | Chávez, Edgar
Grupo/s de investigación o GITE: Procesamiento del Lenguaje y Sistemas de Información (GPLSI)
Centro, Departamento o Servicio: Universidad de Alicante. Departamento de Lenguajes y Sistemas Informáticos | Universidad de Alicante. Instituto Universitario de Investigación Informática
Palabras clave: Approximate median string | Algorithm initialization | Half space proximal neighbors
Fecha de publicación: 23-dic-2021
Editor: IEEE
Cita bibliográfica: IEEE Access. 2021, 9: 169299-169308. https://doi.org/10.1109/ACCESS.2021.3137767
Resumen: The most competitive heuristics for calculating the median string are those that use perturbation-based iterative algorithms. Given the complexity of this problem, which under many formulations is NP-hard, the computational cost involved in the exact solution is not affordable. In this work, the heuristic algorithms that solve this problem are addressed, emphasizing its initialization and the policy to order possible editing operations. Both factors have a significant weight in the solution of this problem. Initial string selection influences the algorithm’s speed of convergence, as does the criterion chosen to select the modification to be made in each iteration of the algorithm. To obtain the initial string, we use the median of a subset of the original dataset; to obtain this subset, we employ the Half Space Proximal (HSP) test to the median of the dataset. This test provides sufficient diversity within the members of the subset while at the same time fulfilling the centrality criterion. Similarly, we provide an analysis of the stop condition of the algorithm, improving its performance without substantially damaging the quality of the solution. To analyze the results of our experiments, we computed the execution time of each proposed modification of the algorithms, the number of computed editing distances, and the quality of the solution obtained. With these experiments, we empirically validated our proposal.
Patrocinador/es: This work was supported in part by the Comisión Nacional de Investigación Científica y Tecnológica - Programa de Formación de Capital Humano Avanzado (CONICYT-PCHA)/Doctorado Nacional/2014-63140074 through the Ph.D. Scholarship, in part by the European Union's Horizon 2020 under the Marie Sklodowska-Curie under Grant 690941, in part by the Millennium Institute for Foundational Research on Data (IMFD), and in part by the FONDECYT-CONICYT under Grant 1170497. The work of ÓSCAR PEDREIRA was supported in part by the Xunta de Galicia/FEDER-UE refs under Grant CSI ED431G/01 and Grant GRC: ED431C 2017/58, in part by the Office of the Vice President for Research and Postgraduate Studies of the Universidad Católica de Temuco, VIPUCT Project 2020EM-PS-08, and in part by the FEQUIP 2019-INRN-03 of the Universidad Católica de Temuco.
URI: http://hdl.handle.net/10045/141854
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2021.3137767
Idioma: eng
Tipo: info:eu-repo/semantics/article
Derechos: This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
Revisión científica: si
Versión del editor: https://doi.org/10.1109/ACCESS.2021.3137767
Aparece en las colecciones:INV - GPLSI - Artículos de Revistas
Investigaciones financiadas por la UE

Archivos en este ítem:
Archivos en este ítem:
Archivo Descripción TamañoFormato 
ThumbnailMirabal_etal_2021_IEEEAccess.pdf1,02 MBAdobe PDFAbrir Vista previa


Todos los documentos en RUA están protegidos por derechos de autor. Algunos derechos reservados.