KD SENSO-MERGER: An architecture for semantic integration of heterogeneous data
Please use this identifier to cite or link to this item:
http://hdl.handle.net/10045/140267
Title: | KD SENSO-MERGER: An architecture for semantic integration of heterogeneous data |
---|---|
Authors: | Gutiérrez, Yoan | Abreu Salas, José Ignacio | Montoyo, Andres | Muñoz, Rafael | Estévez-Velarde, Suilan |
Research Group/s: | Procesamiento del Lenguaje y Sistemas de Información (GPLSI) |
Center, Department or Service: | Universidad de Alicante. Departamento de Lenguajes y Sistemas Informáticos |
Keywords: | Heterogeneous data | Knowledge discovery | NERC | Natural language processing | Ontology and knowledge representation | Semantic data integration |
Issue Date: | 19-Jan-2024 |
Publisher: | Elsevier |
Citation: | Engineering Applications of Artificial Intelligence. 2024, 132: 107854. https://doi.org/10.1016/j.engappai.2024.107854 |
Abstract: | This paper presents KD SENSO-MERGER, a novel Knowledge Discovery (KD) architecture that semantically integrates heterogeneous data from various structured and unstructured sources (i.e. geolocations, demographic and socio-economic data, user reviews, and comments). This goal drives the main design approach of the architecture. It works by building internal representations that adapt and merge knowledge across multiple domains, ensuring that the knowledge base is continuously updated. To address the challenges of integrating heterogeneous data, the proposal puts forward the following solutions: (i) knowledge extraction, addressed via a plugin-based architecture of knowledge sensors; (ii) data integrity, tackled by an architecture designed to deal with uncertain or noisy information; (iii) scalability, also supported by the plugin-based architecture, since switching off non-relevant sensors means that only knowledge relevant to the scenario is integrated. We also minimize the expert knowledge required, which may otherwise become a bottleneck when integrating a fast-paced stream of new sources. As a proof of concept, we developed a case study that deploys the architecture to integrate population census and economic data, municipal cartography, and Google Reviews to analyze the socio-economic contexts of educational institutions. The knowledge discovered enables us to answer questions that cannot be answered from individual sources. Companies or public entities can thus discover patterns of behavior or relationships that would otherwise not be visible, and extract valuable information for the decision-making process. |
Sponsor: | This research is supported by the University of Alicante, Spain, the Spanish Ministry of Science and Innovation, the Generalitat Valenciana, Spain, and the European Regional Development Fund (ERDF) through the following funding. At the national level, the following projects were granted: TRIVIAL (PID2021-122263OB-C22) and CORTEX (PID2021-123956OB-I00), funded by MCIN/AEI/10.13039/501100011033 and, as appropriate, by “ERDF A way of making Europe”, by the “European Union” or by the “European Union NextGenerationEU/PRTR”. At the regional level, the Generalitat Valenciana (Conselleria d’Educacio, Investigacio, Cultura i Esport), Spain, granted funding for NL4DISMIS (CIPROM/2021/21). |
URI: | http://hdl.handle.net/10045/140267 |
ISSN: | 0952-1976 (Print) | 1873-6769 (Online) |
DOI: | 10.1016/j.engappai.2024.107854 |
Language: | eng |
Type: | info:eu-repo/semantics/article |
Rights: | © 2024 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). |
Peer Review: | yes |
Publisher version: | https://doi.org/10.1016/j.engappai.2024.107854 |
Appears in Collections: | INV - GPLSI - Artículos de Revistas |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
 | | 3.59 MB | Adobe PDF |
Items in RUA are protected by copyright, with all rights reserved, unless otherwise indicated.