KD SENSO-MERGER: An architecture for semantic integration of heterogeneous data

Please use this identifier to cite or link to this item: http://hdl.handle.net/10045/140267
Item information
Title: KD SENSO-MERGER: An architecture for semantic integration of heterogeneous data
Authors: Gutiérrez, Yoan | Abreu Salas, José Ignacio | Montoyo, Andres | Muñoz, Rafael | Estévez-Velarde, Suilan
Research Group/s: Procesamiento del Lenguaje y Sistemas de Información (GPLSI)
Center, Department or Service: Universidad de Alicante. Departamento de Lenguajes y Sistemas Informáticos
Keywords: Heterogeneous data | Knowledge discovery | NERC | Natural language processing | Ontology and knowledge representation | Semantic data integration
Issue Date: 19-Jan-2024
Publisher: Elsevier
Citation: Engineering Applications of Artificial Intelligence. 2024, 132: 107854. https://doi.org/10.1016/j.engappai.2024.107854
Abstract: This paper presents KD SENSO-MERGER, a novel Knowledge Discovery (KD) architecture that is capable of semantically integrating heterogeneous data from various structured and unstructured sources (i.e. geolocations, demographic and socio-economic data, user reviews, and comments). This goal drives the main design approach of the architecture. It works by building internal representations that adapt and merge knowledge across multiple domains, ensuring that the knowledge base is continuously updated. To deal with the challenges of integrating heterogeneous data, this proposal puts forward the corresponding solutions: (i) knowledge extraction, addressed via a plugin-based architecture of knowledge sensors; (ii) data integrity, tackled by an architecture designed to deal with uncertain or noisy information; (iii) scalability, also supported by the plugin-based architecture, since only the knowledge relevant to the scenario is integrated by switching off non-relevant sensors. We also minimize the expert knowledge required, which may pose a bottleneck when integrating a fast-paced stream of new sources. As a proof of concept, we developed a case study that deploys the architecture to integrate population census and economic data, municipal cartography, and Google Reviews to analyze the socio-economic contexts of educational institutions. The knowledge discovered enables us to answer questions that cannot be answered from individual sources alone. Thus, companies or public entities can discover patterns of behavior or relationships that would otherwise not be visible, which would allow them to extract valuable information for the decision-making process.
Sponsor: This research is supported by the University of Alicante, Spain, the Spanish Ministry of Science and Innovation, the Generalitat Valenciana, Spain, and the European Regional Development Fund (ERDF) through the following funding: At the national level, the following projects were granted: TRIVIAL (PID2021-122263OB-C22); and CORTEX (PID2021-123956OB-I00), funded by MCIN/AEI/10.13039/501100011033 and, as appropriate, by “ERDF A way of making Europe”, by the “European Union” or by the “European Union NextGenerationEU/PRTR”. At regional level, the Generalitat Valenciana (Conselleria d’Educacio, Investigacio, Cultura i Esport), Spain, granted funding for NL4DISMIS (CIPROM/2021/21).
URI: http://hdl.handle.net/10045/140267
ISSN: 0952-1976 (Print) | 1873-6769 (Online)
DOI: 10.1016/j.engappai.2024.107854
Language: eng
Type: info:eu-repo/semantics/article
Rights: © 2024 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Peer Review: yes
Publisher version: https://doi.org/10.1016/j.engappai.2024.107854
Appears in Collections: INV - GPLSI - Artículos de Revistas

Files in This Item:
File: Gutierrez_etal_2024_EngApplArtificIntellig_final.pdf | Size: 3.59 MB | Format: Adobe PDF


Items in RUA are protected by copyright, with all rights reserved, unless otherwise indicated.