- Title
A Study of Quality and Accuracy Trade-offs in Process Mining.
- Authors
Huang, Zan; Kumar, Akhil
- Abstract
In recent years, many algorithms have been proposed to extract process models from process execution logs. The process models describe the ordering relationships between tasks in a process in terms of standard constructs like sequence, parallel, choice, and loop. Most algorithms assume that each trace in a log represents a correct execution sequence based on a model. In practice, logs are often noisy, and algorithms designed for correct logs are not able to handle noisy logs. In this paper we share our key insights from a study of noise in process logs both real and synthetic. We found that all process logs can be explained by a block-structured model with two special self-loop and optional structures, making it trivial to build a fully accurate process model for any given log, even one with inaccurate data or noise present in it. However, such a model suffers from low quality. By controlling the use of self-loop and optional structures of tasks and blocks of tasks, we can balance the quality and accuracy trade-off to derive high-quality process models that explain a given percentage of traces in the log. Finally, new quality metrics and a novel quality-based algorithm for model extraction from noisy logs are described. The results of the experiments with the algorithm on real and synthetic data are reported and analyzed at length.
- Subjects
DATA mining; CONJOINT analysis; ALGORITHMS; MATHEMATICAL models; DATA extraction; ACCURACY
- Publication
INFORMS Journal on Computing, 2012, Vol 24, Issue 2, p311
- ISSN
1091-9856
- Publication type
Article
- DOI
10.1287/ijoc.1100.0444