SpeCH: A scalable framework for data placement of data-intensive services in geo-distributed clouds

Please use this identifier to cite or link to this item: http://hdl.handle.net/10045/93243
Item information
Title: SpeCH: A scalable framework for data placement of data-intensive services in geo-distributed clouds
Authors: Atrey, Ankita | Van Seghbroeck, Gregory | Mora, Higinio | De Turck, Filip | Volckaert, Bruno
Research group(s) or GITE: Informática Industrial y Redes de Computadores
Center, Department or Service: Universidad de Alicante. Departamento de Tecnología Informática y Computación
Keywords: Data placement | Geo-distributed clouds | Location-based services | Online social networks | Scalability | Spectral clustering | Hypergraphs | Approximation | Distribution | Apache Spark
Knowledge area(s): Computer Architecture and Technology
Publication date: 15-Sep-2019
Publisher: Elsevier
Citation: Journal of Network and Computer Applications. 2019, 142: 1-14. doi:10.1016/j.jnca.2019.05.012
Abstract: The advent of big data analytics and cloud computing technologies has resulted in widespread research on the data placement problem. Since data-intensive services require access to multiple datasets within each transaction, traditional schemes of uniformly partitioning the data across distributed nodes, as employed by many popular data stores like HDFS or Cassandra, may cause network congestion, thereby affecting system throughput. In this article, we propose a scalable and unified framework for data-intensive service data placement into geographically distributed clouds. The proposed framework introduces a new paradigm for partitioning a set of data-items into geo-distributed clouds using Spectral Clustering on Hypergraphs, and is therefore called SpeCH. Scaling spectral methods to large workloads is challenging, since computing the spectra of the hypergraph Laplacian is a computationally intensive task. SpeCH provides two solutions to tackle this problem: (1) an algorithm, called SpectralApprox, that leverages randomized techniques for obtaining low-rank approximations of the hypergraph matrix with bounded guarantees, thereby significantly improving the efficiency of spectral clustering while also providing high quality solutions in practice; (2) an algorithm, called SpectralDist, that exploits the highly parallel nature of the spectral clustering algorithm and uses Apache Spark to speed up the process while retaining the same quality guarantees as the exact algorithm. Additionally, being distributed in nature, SpectralDist enables SpeCH to perform data placement on workloads that require resources beyond the capacity of a single machine. Experiments on a real-world trace-based online social network dataset show that SpeCH is effective, efficient, and scalable. Empirically, SpectralApprox is comparable in efficacy on the evaluated metrics, while being up to 10 times faster in execution time when compared to state-of-the-art techniques. On the other hand, though SpectralApprox is 7–8 times faster when compared to SpectralDist, in terms of efficacy on the evaluated metrics the latter is up to 50% better.
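The abstract does not spell out how the hypergraph is built or which Laplacian SpeCH uses, so the following is only a minimal illustrative sketch of spectral partitioning on a hypergraph, not the authors' algorithm. It assumes that data items are vertices, that each hyperedge is a set of items accessed together by one request, and that the normalized hypergraph Laplacian of Zhou et al. (I - Dv^(-1/2) H W De^(-1) H^T Dv^(-1/2)) is an acceptable stand-in. The function names, the toy incidence matrix, and the weights are all hypothetical.

```python
import numpy as np

def hypergraph_laplacian(H, w):
    """Normalized hypergraph Laplacian (Zhou et al. form, assumed here).

    H: (n_items, n_hyperedges) 0/1 incidence matrix.
    w: positive weight per hyperedge (e.g. request frequency).
    """
    H = np.asarray(H, dtype=float)
    w = np.asarray(w, dtype=float)
    d_v = H @ w                # weighted degree of each item
    d_e = H.sum(axis=0)        # number of items in each hyperedge
    dv_inv_sqrt = np.diag(1.0 / np.sqrt(d_v))
    theta = dv_inv_sqrt @ H @ np.diag(w / d_e) @ H.T @ dv_inv_sqrt
    return np.eye(H.shape[0]) - theta

def spectral_bipartition(H, w):
    """Split items into two groups by the sign of the eigenvector that
    belongs to the second-smallest eigenvalue of the Laplacian."""
    L = hypergraph_laplacian(H, w)
    _, eigvecs = np.linalg.eigh(L)   # eigenvalues in ascending order
    fiedler = eigvecs[:, 1]
    return (fiedler > 0).astype(int)

# Hypothetical workload: 6 data items, 7 access patterns (hyperedges).
# Items 0-2 are often requested together, as are items 3-5; only one
# infrequent pattern ({2, 3}) touches both groups.
hyperedges = [{0, 1, 2}, {0, 1}, {1, 2}, {3, 4, 5}, {3, 4}, {4, 5}, {2, 3}]
w = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.2]

H = np.zeros((6, len(hyperedges)))
for j, edge in enumerate(hyperedges):
    for i in edge:
        H[i, j] = 1.0

labels = spectral_bipartition(H, w)
print(labels)  # items 0-2 share one label, items 3-5 the other
```

Which group receives label 0 depends on the arbitrary sign of the eigenvector, so only the grouping is meaningful. A k-way placement would typically run k-means on several leading eigenvectors instead of a sign test; the dense eigendecomposition used here is the expensive step that, according to the abstract, SpectralApprox and SpectralDist are designed to avoid or distribute.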
Sponsor(s): This research is partly funded by VLAIO, under grant number 140055 (SBO Decomads).
URI: http://hdl.handle.net/10045/93243
ISSN: 1084-8045 (Print) | 1095-8592 (Online)
DOI: 10.1016/j.jnca.2019.05.012
Language: eng
Type: info:eu-repo/semantics/article
Rights: © 2019 Elsevier Ltd.
Peer reviewed: yes
Publisher's version: https://doi.org/10.1016/j.jnca.2019.05.012
Appears in collections: INV - I2RC - Artículos de Revistas
INV - AIA - Artículos de Revistas

Files in this item:
File | Description | Size | Format
2019_Atrey_etal_JNetworkCompAppl_final.pdf | Final version (restricted access) | 1.91 MB | Adobe PDF
2019_Atrey_etal_JNetworkCompAppl_accepted.pdf | Accepted Manuscript (open access) | 1.17 MB | Adobe PDF


All documents in RUA are protected by copyright. Some rights reserved.