Parsing Large XES Files for Discovering Process Models: A Big Data Problem

Please use this identifier to cite or link to this item: http://hdl.handle.net/10045/49370
Item information
Title: Parsing Large XES Files for Discovering Process Models: A Big Data Problem
Authors: Aponte Báez, Yosvanys | Sánchez, Alexander | Marco Such, Manuel
Research Group/s: Transducens
Center, Department or Service: Universidad de Alicante. Departamento de Lenguajes y Sistemas Informáticos
Keywords: Indexing | Big data | XES | Hadoop | Map Reduce
Knowledge Area: Lenguajes y Sistemas Informáticos
Issue Date: Jul-2015
Publisher: IJARCSSE
Citation: International Journal of Advanced Research in Computer Science and Software Engineering. 2015, 5(7): 144-149
Abstract: Process mining is a group of techniques for retrieving de facto models from system traces. Discovery algorithms can obtain mathematical models by exploiting the information contained in lists of activity events. Completeness of the traces is relevant to the accuracy of the final results. Noiseless traces are an ideal scenario. The performance of the algorithms is significantly reduced if the log files are not processed efficiently. XES is a logical model for process logs stored in data-centric XML files. In real processes the sizes of the logs increase exponentially. Parsing XES files is presented as a big data problem in real scenarios with dense traces. Lazy parsers and DOM models are not adequate in scenarios with large volumes of data. We discuss this problem and how to use indexing techniques to retrieve useful information for process mining. An XES compression schema is also discussed for reducing the index construction time.
URI: http://hdl.handle.net/10045/49370
ISSN: 2277-6451 (Print) | 2277-128X (Online)
Language: eng
Type: info:eu-repo/semantics/article
Rights: CC Attribution-NonCommercial-NoDerivs 4.0
Peer Review: yes
Publisher version: http://www.ijarcsse.com/index.php
Appears in Collections: INV - TRANSDUCENS - Artículos de Revistas

Files in This Item:
File | Description | Size | Format
V5I7-01777.pdf | | 602.73 kB | Adobe PDF


This item is licensed under a Creative Commons License.