Abstract
During the last years a new generation of processaware information systems has emerged, which enables process model configurations at buildtime as well as process instance changes during runtime. Respective model adaptations result in large collections of process model variants that are derived from same process model, but slightly differ in structure. Generally, such model variants are expensive to maintain and configure. In this thesis, we present challenges, scenarios and algorithms for representing, comparing and mining such process model variants.
We first introduce the notion of process distance, which corresponds to the minimal number of highlevel change operations needed for transforming one process model into another. In general, we presume that the shorter the average distance between a reference process model and related process variants is, the less changes are required for adapting the variants and the less efforts are needed for (future) process configuration. In this context, we present a method based on boolean algebra to compute the distance between two process models.
Starting with a collection of related process model variants, the major goal of this thesis is to discover a reference process model out of which these variants can be easily configured; i.e., a reference process model with minimal average distance to the variants. To achieve this goal we present two advanced algorithms which have their pros \& cons, and that are applicable in different scenarios. Our clustering algorithm does not presume any knowledge about the original reference process model out of which the process model variants were configured. By only looking at the process model variants, this algorithm can quickly discover a reference process model in polynomial time, which allows us to scale up when solving realworld problems. The clustering algorithm further provides information on how well each part of the discovered reference model fits to the variants. Our heuristic algorithm, in turn, can take the original reference model into account as well. In particular, the user can control to what degree the discovered model differs from the original one. This way we can avoid spaghettilike process models and additionally control how many changes we want to perform on the original reference model.
We systematically evaluate and compare the two algorithms based on simulations that comprise more than 7000 process models. Simulation results indicate good performance and make the differences between the two algorithms explicit. For example, the simulation results indicate that our clustering algorithm runs significantly faster than our heuristic algorithm. However, our heuristic algorithm can identify important changes at the beginning of the search and can discover better results than the clustering algorithm.
We successfully applied the two algorithms to cases from the automotive and the healthcare domain. During these case studies, the practical relevance and benefit of our work has become evident once more.
Overall, this Ph.D thesis will contribute to more intelligent information systems by learning from past adaptations and to an improved management of the variants by continuously evolving related reference process model.
Original language  Undefined 

Awarding Institution 

Supervisors/Advisors 

Thesis sponsors  
Award date  11 Nov 2010 
Place of Publication  Enschede, the Netherlands 
Publisher  
Print ISBNs  9789036530842 
DOIs  
Publication status  Published  11 Nov 2010 
Keywords
 EWI19774
 IR74378
 METIS276772
Cite this
Li, C. (2010). Mining Process Model Variants: Challenges, Techniques, Examples. Enschede, the Netherlands: University of Twente. https://doi.org/10.3990/1.9789036530842