SpeCH: A scalable framework for data placement of data-intensive services in geo-distributed clouds
Please use this identifier to cite or link to this item:
http://hdl.handle.net/10045/93243
Title: | SpeCH: A scalable framework for data placement of data-intensive services in geo-distributed clouds |
---|---|
Author(s): | Atrey, Ankita | Van Seghbroeck, Gregory | Mora, Higinio | De Turck, Filip | Volckaert, Bruno |
Research group(s) or GITE: | Informática Industrial y Redes de Computadores |
Center, Department or Service: | Universidad de Alicante. Departamento de Tecnología Informática y Computación |
Keywords: | Data placement | Geo-distributed clouds | Location-based services | Online social networks | Scalability | Spectral clustering | Hypergraphs | Approximation | Distribution | Apache Spark |
Knowledge area(s): | Arquitectura y Tecnología de Computadores |
Publication date: | 15-Sep-2019 |
Publisher: | Elsevier |
Bibliographic citation: | Journal of Network and Computer Applications. 2019, 142: 1-14. doi:10.1016/j.jnca.2019.05.012 |
Abstract: | The advent of big data analytics and cloud computing technologies has resulted in widespread research on the data placement problem. Since data-intensive services require access to multiple datasets within each transaction, traditional schemes of uniformly partitioning the data into distributed nodes, as employed by many popular data stores like HDFS or Cassandra, may cause network congestion, thereby affecting system throughput. In this article, we propose a scalable and unified framework for data-intensive service data placement into geographically distributed clouds. The proposed framework introduces a new paradigm for partitioning a set of data-items into geo-distributed clouds using Spectral Clustering on Hypergraphs, and is therefore called SpeCH. Scaling spectral methods to large workloads is challenging, since computing the spectra of the hypergraph Laplacian is a computationally intensive task. SpeCH provides two solutions to tackle this problem: (1) an algorithm, called SpectralApprox, that leverages randomized techniques for obtaining low-rank approximations of the hypergraph matrix with bounded guarantees, thereby significantly improving the efficiency of spectral clustering while also providing high quality solutions in practice; (2) an algorithm, called SpectralDist, that exploits the highly parallel nature of the spectral clustering algorithm and uses Apache Spark to speed up the process while retaining the same quality guarantees as the exact algorithm. Additionally, being distributed in nature, SpectralDist enables SpeCH to perform data placement on workloads that require resources beyond the capacity of a single machine. Experiments on a real-world trace-based online social network dataset show that SpeCH is effective, efficient, and scalable. Empirically, SpectralApprox is comparable in efficacy on the evaluated metrics, while being up to 10 times faster in execution time when compared to state-of-the-art techniques. On the other hand, though SpectralApprox is 7–8 times faster when compared to SpectralDist, in terms of efficacy on the evaluated metrics the latter is up to 50% better. |
Sponsor(s): | This research is partly funded by VLAIO, under grant number 140055 (SBO Decomads). |
URI: | http://hdl.handle.net/10045/93243 |
ISSN: | 1084-8045 (Print) | 1095-8592 (Online) |
DOI: | 10.1016/j.jnca.2019.05.012 |
Language: | eng |
Type: | info:eu-repo/semantics/article |
Rights: | © 2019 Elsevier Ltd. |
Peer reviewed: | yes |
Publisher's version: | https://doi.org/10.1016/j.jnca.2019.05.012 |
Appears in collections: | INV - I2RC - Artículos de Revistas | INV - AIA - Artículos de Revistas |
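The abstract describes partitioning data-items by spectral clustering on a hypergraph and speeding this up with randomized low-rank approximation. The following is a minimal illustrative sketch of that general idea, not the authors' SpectralApprox or SpectralDist implementation. It assumes the widely used normalized hypergraph Laplacian $I - D_v^{-1/2} H W D_e^{-1} H^T D_v^{-1/2}$, treats data-items as vertices and requests as hyperedges, and uses a randomized subspace iteration as a stand-in approximation. All function names and the toy workload are hypothetical.

```python
import numpy as np

def hypergraph_operator(H, w):
    """Theta = Dv^-1/2 H W De^-1 H^T Dv^-1/2; the normalized Laplacian is I - Theta."""
    H = np.asarray(H, dtype=float)
    w = np.asarray(w, dtype=float)
    d_e = H.sum(axis=0)              # hyperedge degree: items touched per request
    d_v = H @ w                      # vertex degree: weighted requests per item
    S = H * np.sqrt(w / d_e)         # column scaling, so S @ S.T = H W De^-1 H^T
    S = S / np.sqrt(d_v)[:, None]    # row scaling supplies Dv^-1/2 on both sides
    return S @ S.T

def top_eigvecs_exact(theta, k):
    vals, vecs = np.linalg.eigh(theta)           # ascending eigenvalues
    return vecs[:, ::-1][:, :k]                  # k largest first

def top_eigvecs_randomized(theta, k, oversample=5, n_iter=2, seed=0):
    """Randomized subspace iteration (assumed stand-in for a low-rank step)."""
    rng = np.random.default_rng(seed)
    n = theta.shape[0]
    Q = rng.standard_normal((n, min(n, k + oversample)))
    for _ in range(n_iter + 1):
        Q, _ = np.linalg.qr(theta @ Q)           # re-orthonormalize the sketch
    B = Q.T @ theta @ Q                          # small projected problem
    vals, vecs = np.linalg.eigh((B + B.T) / 2)
    order = np.argsort(vals)[::-1][:k]
    return Q @ vecs[:, order]

def bipartition(H, w, randomized=False):
    """Split items into two groups by the sign of the second eigenvector."""
    theta = hypergraph_operator(H, w)
    U = top_eigvecs_randomized(theta, 2) if randomized else top_eigvecs_exact(theta, 2)
    labels = (U[:, 1] > 0).astype(int)
    return labels if labels[0] == 0 else 1 - labels  # fix sign: item 0 in group 0

# Hypothetical workload: 6 data-items, 3 request types (hyperedges).
H = np.array([[1, 0, 0],
              [1, 0, 0],
              [1, 0, 1],
              [0, 1, 1],
              [0, 1, 0],
              [0, 1, 0]])
w = np.array([3.0, 3.0, 1.0])        # request frequencies (made up)
print(bipartition(H, w), bipartition(H, w, randomized=True))
```

In this toy case, items that are frequently requested together end up in the same group, which could then be mapped to one data center each. For more than two groups, a k-means step on the leading eigenvectors would typically replace the sign split.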
Files in this item:
File | Description | Size | Format | |
---|---|---|---|---|
2019_Atrey_etal_JNetworkCompAppl_final.pdf | Final version (restricted access) | 1.91 MB | Adobe PDF | |
2019_Atrey_etal_JNetworkCompAppl_accepted.pdf | Accepted Manuscript (open access) | 1.17 MB | Adobe PDF | |
All documents in RUA are protected by copyright. Some rights reserved.