Scalable process discovery and conformance checking
Abstract
Considerable amounts of data, including process events, are collected and stored by organisations nowadays. Discovering a process model from such event data and verifying the quality of discovered models are important steps in process mining. Many discovery techniques have been proposed, but none of them combines scalability with strong quality guarantees. We would like such techniques to handle billions of events or thousands of activities, to produce sound models (without deadlocks and other anomalies), and to guarantee that the underlying process can be rediscovered when sufficient information is available. In this paper, we introduce a framework for process discovery that ensures these properties while passing over the log only once and introduce three algorithms using the framework. To measure the quality of discovered models for such large logs, we introduce a model–model and model–log comparison framework that applies a divide-and-conquer strategy to measure recall, fitness, and precision. We experimentally show that these discovery and measuring techniques sacrifice little compared to other algorithms, while gaining the ability to cope with event logs of 100,000,000 traces and processes of 10,000 activities on a standard computer.
Keywords
Big data · Scalable process mining · Block-structured process discovery · Directly-follows graphs · Algorithm evaluation · Rediscoverability · Conformance checking
1 Introduction
Considerable amounts of data are collected and stored by organisations nowadays. For instance, ERP systems log business transaction events, high-tech systems such as X-ray machines record software and hardware events, and web servers log page visits. Typically, each action of a user executed with the system, e.g. a customer filling in a form or a machine being switched on, can be recorded by the system as an event; all events related to the same process execution, e.g. a customer order or an X-ray diagnosis, are grouped in a trace (ordered by their time); an event log contains all recorded traces of the system. Process mining aims to extract information from such event logs, for instance social networks, business process models, compliance to rules and regulations, and performance information (e.g. bottlenecks) [46].
In this paper, we focus on two process mining challenges: process discovery and conformance checking. Figure 1 shows the context of these two challenges: a real-life business process (a system) is running, and the executed process steps are recorded in an event log. In process discovery, one assumes that the inner workings of the system are unknown to the analyst and cannot be obtained otherwise. Therefore, process discovery aims to learn a process model from an event log, which describes the system as it actually happened (in contrast to what is assumed to have happened) [50]. Two main challenges exist in process discovery: first, one would like to learn an easy-to-understand model that captures the actual behaviour. Second, the model should have a proper formal interpretation, i.e. have well-defined behavioural semantics and be free of deadlocks and other anomalies (be sound) [28]. In Sect. 2, we explore these challenges in more detail and explore how they are realised in existing algorithms and settings. Few existing algorithms solve both challenges together.
Event logs can be ‘big’ in two dimensions: many events and many activities (i.e. the different process steps). From our experiments (Sect. 6), we identified relevant gradations for these dimensions: for the number of activities we identified complex logs, i.e. containing hundreds of activities, and more complex logs, i.e. containing thousands of activities. For the number of events we identified medium logs, i.e. containing tens of thousands of events, large logs, i.e. containing millions of events, and larger logs, i.e. containing billions of events.
In our experiments we observed that existing process discovery algorithms with strong quality guarantees, e.g. soundness, can handle medium logs (see IM in Fig. 2; the algorithms will be introduced later). Algorithms not providing such guarantees (e.g. \(\alpha \), HM) can handle large logs, but fail on larger logs. In the dimension of the number of different activities, experiments showed that most algorithms could not handle complex processes. Current conformance checking techniques in our experiments and [38] seem to be unable to handle medium or complex event logs.
Such numbers of events and activities might seem large for a complaint-handling process in an airline; however, processes of much larger complexity exist. For instance, even simple software tools contain hundreds or thousands of different methods. We obtained a large and complex log (SL) that will be used in the evaluation. To study or reverse engineer such software, studies [33] have recorded method calls in event logs (at various levels of granularity), and process mining and software mining techniques have been used on small examples to perform the analyses [33, 40]. Complex logs can, for instance, be found in hospitals: the BPI Challenge log of 2011 (BPIC11) [56] was recorded in the emergency department of a Dutch hospital and contains over 600 activities. Even though this log is just complex and medium, current discovery techniques have difficulties with it (we will use it in the evaluation). Other areas in which more complex logs appear are click-stream data from websites, such as the website of a dot-com start-up, which produced an event log containing 3300 activities (CL) [26]. Even more difficult logs could be extracted from large machines, such as the Large Hadron Collider, in which over 25,000 distributed communicating components form just a part of the control systems [25], resulting in complicated behaviour that could be analysed using scalable process mining techniques. In the future, we aim to extract such logs and apply our techniques to them; currently, we could discover a model from such a log but could not process the discovered model further (no conformance checking and no visualisation on that scale). Nevertheless, we will show in our evaluation that our discovery techniques are able to handle such logs.
Problem definition and contribution In this paper, we address two problems: applying process discovery to larger and more complex logs, and conformance checking to medium and complex logs. We introduce two scalable frameworks: one for process discovery, the Inductive Miner—directly-follows framework (IMd framework), and one for conformance checking: the Projected Conformance Checking framework (pcc framework). We instantiate these frameworks to obtain several algorithms, each with their specific purposes. For discovery, we show how to adapt an existing family of algorithms that offers several quality guarantees (the Inductive Miner framework (IM framework) [29]), such that it scales better and works on larger and more complex logs. We show that the recursion on event logs used by the IM framework can be replaced by recursion on an abstraction (i.e. the so-called directly-follows graph [29]), which can be computed in a single pass over the event log. We show that this principle can also be applied to incomplete event logs (when a discovery technique has to infer missing information) and logs with infrequent behaviour or noise (when a discovery algorithm has to identify and filter the events that would degrade model quality); we present corresponding algorithms. Incompleteness and infrequency/noise pose opposing challenges to discovery algorithms, i.e. not enough and too much information. For these purposes, we introduce different algorithms. For conformance checking, we introduce the configurable divide-and-conquer pcc framework to compare logs to models and models to models. Instead of comparing the complete behaviour over all activities, we decompose the problem into comparing behaviour for subsets of activities. For each such subset, a recall, fitness, or precision measure is computed. The averages over these subsets provide the final measures, while the subsets with low values give information about the location in the model/log/system model where deviations occur.
Results We conducted a series of experiments to test how well algorithms handle large logs and complex processes. We found that the IMd framework provides the scalability to handle all kinds of logs up to larger and more complex logs (see Fig. 2). In a second series of experiments we investigated the ability of several discovery algorithms to rediscover the original system model: we experimented to analyse the influence of log sizes, i.e. completeness of the logs, to analyse the influence of noise, i.e. randomly appearing or missing events in process executions, and to assess the influence of infrequent behaviour, i.e. structural deviations from the system model during execution. We found that the new discovery algorithms perform comparably to existing algorithms in terms of model quality, while providing much better scalability. In a third experiment, we explored how the new discovery algorithms handle large and complex real-life logs.
In all of these experiments, the new pcc framework was applied to assess the quality of the discovered models with respect to the log and where applicable the system, as existing conformance checking techniques could not handle medium or complex logs and systems. We compared the new conformance checking techniques to existing techniques, and the results suggest that the new techniques might be able to replace existing less scalable techniques. In particular, model quality with respect to a log can now be assessed in situations where existing techniques fail: up to large and complex logs.
Relation to earlier papers This paper extends the work presented in [32]. We present the new pcc framework, which is able to cope with medium and complex logs and models. This allows us to analyse and compare the quality of the IMd framework to other algorithms on such event logs in detail.
Outline First, process mining is discussed in more detail in Sect. 2. Second, process trees, directly-follows graphs, and cuts are introduced in Sect. 3. In Sect. 4, the IMd framework and three algorithms using it are introduced. We introduce the pcc framework in Sect. 5. The algorithms are evaluated in Sect. 6 using this pcc framework. Section 7 concludes the paper.
2 Process mining
In this section, we discuss conformance checking and process discovery in more detail.
2.1 Conformance checking
The aim of conformance checking is to verify a process model against reality. As shown in Fig. 1, two types of conformance checking exist: log–model conformance checking and model–model conformance checking. In log–model conformance checking, reality is assumed to be represented by an event log, while in model–model conformance checking, a representation of the system is assumed to be present and to represent reality. Such a system is usually given as another process model, to which we refer as the system model.
2.1.1 Log–model conformance checking
To compare a process model to an event log, several quality measures have been proposed [2]. For instance, fitness expresses the part of the event log that is represented by the model, log-precision expresses the behaviour in the model that is present in the event log, generalisation expresses the likelihood that future behaviour will be representable by the model, and simplicity expresses the absence of complexity in a model [2] to represent its behaviour.
Several techniques and measures have been proposed to measure fitness and precision, such as token-based replay [43], alignments [1, 2], and many more: for an overview, see [60]. Some techniques that were proposed earlier, such as token-based replay [43], cannot handle non-determinism well, i.e. silent activities (\(\tau \)) and duplicate activities. Later techniques, such as alignments [2], can handle non-determinism by exhaustively searching for the model trace that has the least deviations from a given log trace (according to some cost function). However, even optimised implementations of these techniques cannot deal with medium or complex event logs and models [38]. To alleviate this problem, decomposition techniques have been proposed, using the general principles in [52], for instance using passages [47] or single-entry–single-exit decompositions [38]. The pcc framework uses the insights of [52] by checking conformance on small subsets of activities.
2.1.2 Model–model conformance checking
For a more elaborate overview of this field, we refer to [19] and [7].
Typically, the quality of a process discovery algorithm is measured using log conformance checking, i.e. a discovered model is compared to an event log. Alternatively, discovered model and system model could be compared directly. Ideally, both would be compared on branching bisimilarity or even stronger notions of equivalence [58], thereby taking the moments of choice into account. However, as an event log describes a language and does not contain information about choices, the discovered model will lack this information as well and we consider comparison based on languages (trace equivalence, a language is the set of traces of a log, or the set of traces that a model can produce).
One such technique is [5]. This approach translates the models into partially ordered runs annotated with exclusive relationships (event structures), which can be generated from process models as well [5]. However, event structures have difficulties supporting loops by their acyclic nature and constructing them requires a full statespace exploration.
As noted in [19], many model–model comparison techniques suffer from exponential complexity due to concurrency and loops in the models. To overcome this problem, several techniques apply an abstraction, for instance using causal footprints [50, 55], weak order relations [68], or behavioural profiles [27, 62]. Another technique to reduce the state space is to decompose the model into pieces and perform the computations on these pieces individually [27]. Our approach (pcc framework, which applies to both log–model and model–model conformance checking) applies an abstraction using a different angle: we project on subsets of activities, thereby generalising over many of these abstractions, i.e. many relations between the projected activities are captured. Moreover, our approach handles any formalism of which the executable semantics can be described by deterministic finite automata (DFAs), which includes BPMN, UML-ADs, and labelled Petri nets, and allows for models with duplicate activities, silent steps, and anomalies such as potential deadlocks.
2.2 Process discovery
Process discovery aims at discovering a process model from an event log (see Fig. 1). We first sketch some challenges that discovery algorithms face, after which we discuss existing discovery approaches.
Challenges Several factors challenge process discovery algorithms. One such challenge is that the resulting process model should have well-defined behavioural semantics and be sound [50]. Even though an unsound process model or a model without a language, i.e. without a definition of traces the model expresses, might be useful for manual analysis, conformance checking and other automated techniques can obviously not provide accurate measures on such models [28, 59]. The IMd framework uses its representational bias to provide a language and to guarantee soundness: it discovers an abstract hierarchical view on workflow nets [50], process trees, that is guaranteed to be sound [11].
Another challenge of process discovery is that for many event logs the different measures, e.g. fitness, log-precision, generalisation and simplicity, are competing, i.e. there might not exist a model that scores well on all criteria [12]. Thus, discovery algorithms have to balance these measures, and this balance might depend on the use case at hand, e.g. auditing questions are best answered using a model with high fitness, optimisations are best performed on a model with high log-precision, implementations might require a model with high generalisation, and human interpretation is eased by a simple model [12].
A desirable property of discovery algorithms is having the ability to rediscover the language of the system (rediscoverability); we assume the system and the system model to have the same behaviour for rediscoverability. Rediscoverability is usually proven using assumptions on both system and event log: the system typically must be of a certain class, and the event log must contain enough correct information to describe the system well [29]. Therefore, three more challenges of process discovery algorithms are to handle (1) noise in the event log, i.e. random absence or presence of events [13]; (2) infrequent behaviour, i.e. behaviour that occurs less frequently than ‘normal’ behaviour (the exceptional cases); and (3) incompleteness, i.e. the event log does not contain ‘enough’ information. As an example of infrequent behaviour, most complaints sent to an airline are handled according to a model, but a few complaints are so complicated that they require ad hoc solutions. Whether this behaviour is of interest depends on the goal of the analysis [50]. The notion of what ‘enough’ means depends on the discovery algorithm [6, 29]. Even though rediscoverability is desirable, it is a formal property, and it is not easy to compare algorithms using it. However, the pcc framework makes it possible to perform experiments that quantify how rediscoverability is influenced by noise, infrequent behaviour, and incompleteness.
A last challenge arises from the main focus of this paper, i.e. highly scalable environments. Ideally, a discovery technique should linearly pass over the event log once, which removes the need to keep the event log in memory. In the remainder of this section, we discuss related process discovery techniques and their application in scalable environments.
Sound process discovery algorithms Process discovery techniques such as the Evolutionary Tree Miner (ETM) [11], the Constructs Competition Miner (CCM) [41], Maximal Pattern Mining (MPM) [34], and Inductive Miner (IM) [29] provide several quality guarantees, in particular soundness, and some offer rediscoverability, but they do not manage to discover a model in a single pass. ETM applies a genetic strategy, i.e. it generates an initial population, applies random crossover steps, selects the ‘best’ individuals from the population, and repeats. While ETM is very flexible towards the desired log measures with respect to which the model should be ‘best’ and guarantees soundness, it requires multiple passes over the event log and does not provide rediscoverability.
CCM and IM use a divide-and-conquer strategy on event logs. In the Inductive Miner framework (IM framework), first an appropriate cut of the process activities is selected; second, that cut is used to split the event log into sublogs; third, these sublogs are recursed on, until a base case is encountered. If no appropriate cut can be found, a fall-through (‘anything can happen’) is returned. CCM works similarly by having several process constructs compete with one another. While both CCM and the IM framework guarantee soundness and IM guarantees rediscoverability (for the class of models described in Appendix 1), both require multiple passes through the event log (the event log is being split and recursed on).
MPM first constructs a prefix tree of the event log. Second, it folds leaves to obtain a process model, thereby applying local generalisations to detect concurrency. The MPM technique guarantees soundness and fitness, allows for noise filtering and can reach high precision, but it does so at the cost of simplicity: typically, lots of activities are duplicated. Inherently, the MPM technique requires random access to the event log and a single pass does not suffice.
Some of these guarantee soundness, but do not support explicit concurrency (FD, PM1) [31]. The ILP miner guarantees fitness and can guarantee that the model is empty after completion, but only for the traces seen in the event log, i.e. the models produced by ILP are usually not sound. However, most of these algorithms (\(\alpha \), HM, ILP, PM2) neither guarantee soundness nor even provide a final marking, which makes it difficult to determine their language (see Appendix 5), and thus, their models are difficult to analyse automatically (though, such unsound models can still be useful for manual analysis).
Several techniques (e.g. \(\alpha \), HM) satisfy the single-pass requirement. These algorithms first obtain an abstraction from the log, which denotes what activities directly follow one another; in HM, this abstraction is filtered. Second, from this abstraction a process model is constructed. Both \(\alpha \) and HM have been demonstrated to be applicable in highly scalable environments: event logs of 5 million traces have been processed using map-reduce techniques [21]. Moreover, \(\alpha \) guarantees rediscoverability, but neither \(\alpha \) nor HM guarantees soundness. We show that our approach offers the same scalability as HM and \(\alpha \), but provides both soundness and rediscoverability.
Some commercial tools such as FD and PM1 offer high scalability, but do not support explicit concurrency [31]. Other discovery techniques such as the language-based region miner [9, 10] or the state-based region miner [17] guarantee fitness, but they guarantee neither soundness nor rediscoverability, nor do they work in a single pass.
Software mining In the field of software mining, similar techniques have been used to discover formal specifications of software. For instance, in [40] and [4], execution sequences of software runs (i.e. traces) are recorded in an event log, from which techniques extract, e.g., valid execution sequences on the methods of an API. Such valid execution sequences can then be used to generate documentation. Process discovery differs from software mining in focus and challenges: process discovery aims to find process models with soundness and concurrency and is challenged, e.g., by deviations from the model (noise, infrequent behaviour) and readability requirements of the discovered models, while for software mining techniques, the system is fixed and challenges arise from, e.g., nesting levels [40], programmed exceptions [67], and collaborating components [22].
Streams Another set of approaches that aims to handle even bigger logs assumes that the event log is an unbounded stream of events. Some approaches such as [18, 24] work on click-stream data, i.e. the sequence of web pages users visit, to extract, for instance, clusters of similar users or web pages. However, we aim to extract end-to-end process models, in particular containing parallelism. HM, \(\alpha \), and CCM have been shown to be applicable in streaming environments [14, 42], and any single-pass discovery algorithm (thus the IMd framework as well) can be converted into a streaming algorithm, but will have to deal with the same discovery challenges as described before.
3 Preliminaries
To overcome the limitations of process discovery on large event logs, we will combine the single-pass property of directly-follows graphs with a divide-and-conquer strategy. This section recalls these existing concepts. The new algorithms are introduced in Sect. 4.
3.1 Basic notions
Event logs An event log is a multiset of traces that denote process executions. For instance, the event log \([\langle a, b, c \rangle , \langle b, d \rangle ^2]\) denotes the event log in which the trace consisting of the activity a followed by the activity b followed by the activity c was executed once, and the trace consisting of b followed by d was executed twice.
Directly-follows graphs A directly-follows graph can be derived from a log and describes what activities follow one another directly, and with which activities a trace starts or ends. In a directly-follows graph, there is an edge from an activity a to an activity b if a is followed directly by b. The weight of an edge denotes how often that happened. For instance, the directly-follows graph of our example log \([\langle a, b, c \rangle , \langle b, d \rangle ^2]\) is shown in Fig. 4. Note that the multiset of start activities is \([a, b^2]\) and the multiset of end activities is \([c, d^2]\). A directly-follows graph can be obtained in a single pass over the event log with minimal memory requirements [21].
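The single-pass construction can be sketched as follows. This is a minimal illustration rather than the implementation used in the paper: the function name `directly_follows_graph` and the representation of a log as an iterable of tuples are assumptions made for the example, as is the choice to skip empty traces.

```python
from collections import Counter


def directly_follows_graph(traces):
    """Build edge weights and start/end multisets in one pass over the traces.

    `traces` is any iterable of sequences of activity names (hypothetical
    representation); each trace is read once and never stored.
    """
    edges, starts, ends = Counter(), Counter(), Counter()
    for trace in traces:
        if not trace:
            continue  # assumption: empty traces contribute nothing to the graph
        starts[trace[0]] += 1
        ends[trace[-1]] += 1
        # Count every pair of directly consecutive activities.
        for a, b in zip(trace, trace[1:]):
            edges[(a, b)] += 1
    return edges, starts, ends


# The example log [<a, b, c>, <b, d>^2] from the text.
log = [("a", "b", "c"), ("b", "d"), ("b", "d")]
edges, starts, ends = directly_follows_graph(log)
print(dict(edges))   # {('a', 'b'): 1, ('b', 'c'): 1, ('b', 'd'): 2}
print(dict(starts))  # {'a': 1, 'b': 2}
print(dict(ends))    # {'c': 1, 'd': 2}
```

Memory use depends on the number of distinct activities and edges, not on the number of traces, which is what makes the abstraction attractive for large logs.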
Cuts, characteristics, and the Inductive Miner framework. A partition is a non-overlapping division of the activities of a directly-follows graph. For instance, \((\{a, b\}, \{c, d\})\) is a binary partition of the directly-follows graph in Fig. 4. A cut is a partition combined with a process tree operator, for instance \((\rightarrow , \{a, b\}, \{c, d\})\). In the IM framework, finding a cut is an essential step: its operator becomes the root of the process tree, and its partition determines how the log is split.
The IM framework [29] discovers the main cut and projects the given log onto the activity partition. In case of loops, each iteration becomes a new trace in the projected sublog. Subsequently, for each sublog its main cut is detected and recursion continues until reaching partitions with singleton elements; these become the leaves of the process tree. If no cut can be found, a generalising fall-through is returned that allows for any behaviour (a ‘flower model’). By the use of process trees, the IM framework guarantees sound models and makes it easy to guarantee fitness. The IM framework is formalised in Appendix 1.
Suppose that the log is produced by a process which can be represented by a process tree T. Then, the root of T leaves certain characteristics in the log and in the directly-follows graph. The most basic algorithm that uses the IM framework, i.e. IM [29], searches for a cut that matches these characteristics perfectly. Other algorithms using the IM framework are the infrequent-behaviour-filtering Inductive Miner—infrequent (IMf) [28] and the incompleteness-handling Inductive Miner—incompleteness (IMc) [30].
3.2 Cut detection
Cut definitions are given in Appendix 1. Here we describe how the cut detection works. Each of the four process tree operators \(\times \), \(\rightarrow \), \(\wedge \), and \(\circlearrowleft \) leaves a different characteristic footprint in the directly-follows graph. Figure 5 visualises these characteristics: for exclusive choice, the activities of one subtree will never occur in the same trace as activities of another subtree. Hence, activities of the different subtrees form clusters that are not connected by edges in the directly-follows graph. Thus, the \(\times \) cut is computed by taking the connected components of the directly-follows graph.
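The connected-components step for the exclusive choice cut can be sketched as below. The function name `xor_cut` and the example edges are illustrative assumptions; the formal cut definition is the one in Appendix 1.

```python
def xor_cut(activities, edges):
    """Partition activities into connected components, ignoring edge direction.

    A non-trivial exclusive choice cut exists only if more than one
    component is returned.
    """
    neighbours = {a: set() for a in activities}
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    components, seen = [], set()
    for start in sorted(activities):  # sorted only to make the result order stable
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:  # depth-first exploration of one component
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        components.append(component)
    return components


# Illustrative graph: b and c follow each other, d and e follow each other,
# and no edge links the two pairs.
parts = xor_cut({"b", "c", "d", "e"},
                {("b", "c"), ("c", "b"), ("d", "e"), ("e", "d")})
print(len(parts))  # 2, i.e. the components {b, c} and {d, e}
```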
If two subtrees are sequentially ordered, all activities of the first subtree strictly precede all activities of the second subtree; in the directly-follows graph we expect to see a chain of clusters without edges going back. The procedure to discover a sequence cut is as follows: each activity starts as a singleton set. First, the strongly connected components of the directly-follows graph are computed and merged. By definition, two activities are in a strongly connected component if they are pairwise reachable, and therefore they cannot be sequential. Second, pairwise unreachable sets are merged: if neither of two nodes can be reached from the other, they cannot be sequential. Finally, the remaining sets are sorted based on reachability.
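The merge-and-sort procedure can be sketched as follows. This is a simplified sketch under stated assumptions: instead of an explicit strongly-connected-components pass it computes reachability for every activity and merges two activities whenever they are mutually reachable or mutually unreachable, which combines the two merge steps. The names `sequence_cut` and `reachable_from` are hypothetical, and the quadratic pairwise loop favours brevity over efficiency.

```python
def reachable_from(start, successors):
    """All activities reachable from `start` via one or more edges."""
    seen, stack = set(), list(successors[start])
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(successors[node])
    return seen


def sequence_cut(activities, edges):
    successors = {a: set() for a in activities}
    for a, b in edges:
        successors[a].add(b)
    reach = {a: reachable_from(a, successors) for a in activities}

    parent = {a: a for a in activities}  # union-find forest

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    ordered = sorted(activities)
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            # Merge if mutually reachable or mutually unreachable.
            if (b in reach[a]) == (a in reach[b]):
                parent[find(a)] = find(b)

    groups = {}
    for a in activities:
        groups.setdefault(find(a), set()).add(a)
    # Earlier groups reach more activities outside themselves.
    return sorted(groups.values(),
                  key=lambda g: -len(reach[next(iter(g))] - g))


# Edges of the example log L of Sect. 4.1, one string per trace.
log = ["abcfghi", "abcghfi", "abchfgi", "acbfghi", "acbghfi",
       "acbhfgi", "adfghi", "adedghfi", "adededhfgi"]
edges = {(x, y) for t in log for x, y in zip(t, t[1:])}
cut = sequence_cut(set("".join(log)), edges)
print([sorted(g) for g in cut])
# [['a'], ['b', 'c', 'd', 'e'], ['f', 'g', 'h'], ['i']]
```

A sequence cut is only non-trivial if more than one group remains after merging.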
The activities of two parallel subtrees can occur in any intertwined order; we expect all possible connections to be present between the child clusters in the directly-follows graph. To detect parallelism, the graph is negated: the negated graph gets no edge between two activities if both directly-follows edges between these activities are present. If either edge is missing, the negated graph will contain an edge between these two activities. In this negated graph, the partition of the parallel cut is the set of connected components.
4 Process discovery using a directly-follows graph
Algorithms using the IM framework guarantee soundness, and some even rediscoverability, but do not satisfy the single-pass property, as the log is traversed and even copied during each recursive step. Therefore, we introduce an adapted framework: Inductive Miner—directly-follows (IMd framework) that recurses on the directly-follows graph instead of the event log. In this section, we first introduce the IMd framework and a basic algorithm using it. Second, we introduce two more algorithms: one to handle infrequent behaviour and another one that handles incompleteness.
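The negation step can be sketched as below. The sketch only computes the candidate partition described in the paragraph above; any further conditions imposed by the formal cut definition in Appendix 1 are not checked here, and the name `parallel_cut_candidate` is an assumption made for this example.

```python
def parallel_cut_candidate(activities, edges):
    """Connected components of the negated directly-follows graph."""
    edge_set = set(edges)
    ordered = sorted(activities)

    # Negated graph: link a and b unless both a->b and b->a are present.
    negated = {a: set() for a in activities}
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            if not ((a, b) in edge_set and (b, a) in edge_set):
                negated[a].add(b)
                negated[b].add(a)

    components, seen = [], set()
    for start in ordered:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(negated[node] - component)
        seen |= component
        components.append(component)
    return components


# b and c follow each other in both directions: two components remain.
print(len(parallel_cut_candidate({"b", "c"}, {("b", "c"), ("c", "b")})))  # 2
# f -> g -> h -> f has no pair connected in both directions: one component.
print(len(parallel_cut_candidate({"f", "g", "h"},
                                 {("f", "g"), ("g", "h"), ("h", "f")})))  # 1
```

As with the other cuts, a single component means that no parallel cut is found.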
4.1 Inductive Miner: directly-follows
As a first algorithm that uses the framework, we introduce Inductive Miner—directly-follows (IMd). We explain the stages of IMd in more detail by means of an example: Let L be \([\langle a, b, c, f, g, h, i \rangle \), \(\langle a, b, c, g, h, f, i \rangle \), \(\langle a, b, c, h, f, g, i \rangle \), \(\langle a, c, b, f, g, h, i \rangle \), \(\langle a, c, b, g, h, f, i \rangle \), \(\langle a, c, b, h, f, g, i \rangle \), \(\langle a, d, f, g, h, i \rangle \), \(\langle a, d, e, d, g, h, f, i \rangle \), \(\langle a, d, e, d, e, d, h, f, g, i \rangle ]\). The directly-follows graph \(D_1\) of L is shown in Fig. 6.
Cut detection IMd searches for a cut that perfectly matches the characteristics mentioned in Sect. 3. As explained, cut detection has been implemented using standard graph algorithms (connected components, strongly connected components), which run in polynomial time, given the number of activities (O(n)) and directly-follows edges (\(O(n^2)\)) in the graph.
In our example, the cut \((\rightarrow , \{a\}, \{b, c, d, e\}, \{f, g, h\}, \{i\})\) is selected: as shown in Fig. 5, every edge crosses the cut lines from left to right. Therefore, it perfectly matches the sequence cut characteristic. Using this cut, the sequence is recorded and the directly-follows graph can be split.
Directly-follows graph splitting Given a cut, the IMd framework splits the directly-follows graph into disjoint subgraphs. The idea is to keep the internal structure of each of the clusters of the cut by simply projecting a graph on the cluster. Figure 7 shows an example of how \(D_1\) (Fig. 6) is split using the sequence cut that was discovered in our example. If the operator of the cut is \(\rightarrow \) or \(\circlearrowleft \), the start and end activities of a child might be different from the start and end activities of its parent. Therefore, every edge that enters a cluster is counted as a start activity, and an edge leaving a cluster is counted as an end activity. In our example, the start activities of cluster \(\{f, g, h\}\) are those having an incoming edge not starting in \(\{f, g, h\}\), and correspondingly for end activities. The result is shown in Fig. 7a. In case of \(\times \), no edges leave any cluster and hence the start and end activities remain unchanged. In case of \(\wedge \), removed edges express the arbitrary interleaving of activities in parallel clusters; removing this interleaving information does not change with which activities a cluster may start or end; thus, start and end activities remain unchanged.
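For the \(\rightarrow \) case, the projection can be sketched as follows. The function name `split_sequence` and the Counter-based graph representation are assumptions for illustration; the sketch also assumes that start and end activities of the parent that lie inside the cluster are carried over, which the text does not state explicitly.

```python
from collections import Counter


def split_sequence(edges, starts, ends, cluster):
    """Project a weighted directly-follows graph on one cluster of a sequence cut."""
    sub_edges = Counter()
    # Assumption: parent start/end activities inside the cluster are kept.
    sub_starts = Counter({a: n for a, n in starts.items() if a in cluster})
    sub_ends = Counter({a: n for a, n in ends.items() if a in cluster})
    for (a, b), weight in edges.items():
        if a in cluster and b in cluster:
            sub_edges[(a, b)] += weight   # internal edge: keep as is
        elif b in cluster:
            sub_starts[b] += weight       # edge entering the cluster
        elif a in cluster:
            sub_ends[a] += weight         # edge leaving the cluster
    return sub_edges, sub_starts, sub_ends


# D_1 of the example log L, one string per trace.
log = ["abcfghi", "abcghfi", "abchfgi", "acbfghi", "acbghfi",
       "acbhfgi", "adfghi", "adedghfi", "adededhfgi"]
edges = Counter((x, y) for t in log for x, y in zip(t, t[1:]))
starts = Counter(t[0] for t in log)
ends = Counter(t[-1] for t in log)

sub_edges, sub_starts, sub_ends = split_sequence(edges, starts, ends, set("fgh"))
print(dict(sub_starts))  # {'f': 3, 'g': 3, 'h': 3}
print(dict(sub_ends))    # {'h': 3, 'f': 3, 'g': 3}
```

Only the graph is touched during splitting; the event log itself is not needed again, which is what preserves the single-pass property.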
Recursion Next, IMd recurses on each of the new directly-follows graphs (find cut, split, ...) until a base case (see below) is reached or no perfectly matching cut can be found. Each of these recursions returns a process tree, which in turn can be inserted as a child of an operator identified in an earlier recursion step.
Base case Directly-follows graphs \(D_2\) (Fig. 7a) and \(D_5\) (Fig. 7d) contain base cases: in both graphs, only a single activity is left. The algorithm turns these into leaves of the process tree and inserts them at the respective spot of the parent operator. In our example, detecting the base cases of \(D_2\) and \(D_5\) yields the intermediate tree \(\rightarrow (a, (D_3), (D_4), i)\), in which \(D_3\) and \(D_4\) indicate directly-follows graphs that are not base cases and will be recursed on later.
Fallthrough Consider \(D_4\) as shown in Fig. 7c. \(D_4\) does not contain unconnected parts, so it does not contain an exclusive choice cut. There is no sequence cut possible, as f, g, and h form a strongly connected component. There is no parallel cut as there are no dually connected parts and no loop cut as all activities are start and end activities. Thus, IMd selects a fallthrough, being a process tree that allows for any behaviour consisting of f, g, and h (a flower model \(\circlearrowleft (\tau , f, g, h)\), having the language \((f|g|h)^*\)). The intermediate tree of our example up till now becomes \(\rightarrow (a, (D_3), \circlearrowleft (\tau , f, g, h), i)\) (remember that \(\tau \) denotes the activity of which the execution is invisible).
Example continued In \(D_3\), as shown in Fig. 7b, a cut is present: \((\times , \{b, c\}, \{d, e\})\): no edge in \(D_3\) crosses this cut. The directlyfollows graphs \(D_6\) and \(D_7\), as shown in Fig. 8a, b, result after splitting \(D_3\). The tree of our example up till now becomes \(\rightarrow (a, \times ((D_6), (D_7)), \circlearrowleft (\tau , f, g, h), i)\).
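An exclusive choice cut in which no edge crosses between clusters corresponds to the connected components of the graph when edge directions are ignored. The following sketch illustrates that idea; the function name and the edge set are hypothetical, since the exact edges of \(D_3\) are only given in Fig. 7b.

```python
def exclusive_choice_cut(activities, edges):
    """Return the connected components of the graph, ignoring edge direction."""
    neighbours = {a: set() for a in activities}
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    components, seen = [], set()
    for activity in sorted(activities):
        if activity in seen:
            continue
        component, stack = set(), [activity]
        while stack:
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(neighbours[current] - component)
        seen |= component
        components.append(component)
    return components


# Hypothetical edges: b and c are connected, d and e are connected.
clean = exclusive_choice_cut({"b", "c", "d", "e"}, {("b", "c"), ("d", "e")})
# A single extra edge from c to d connects both parts.
noisy = exclusive_choice_cut({"b", "c", "d", "e"}, {("b", "c"), ("d", "e"), ("c", "d")})
```

A cut exists only if more than one component is returned. In the second call, the single additional edge merges everything into one component, so no perfectly matching exclusive choice cut remains.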
To summarise: IMd selects a cut, splits the directlyfollows graph, and recurses until a base case is encountered or a fallthrough is necessary. As each recursion removes at least one activity from the graph and cut detection is \(O(n^2)\), IMd runs in \(O(n^3)\), in which n is the number of activities in the directlyfollows graph.
By the nature of process trees, the returned model is sound. By reasoning similar to IM [29], IMd guarantees rediscoverability on the same class of models (see Appendix 1), i.e. assuming that the model is representable by a process tree without using duplicate activities, and it is not possible to start loops with an activity they can also end with [29]. This makes IMd the first singlepass algorithm to offer these guarantees.
4.2 Handling infrequency and incompleteness
The basic algorithm IMd guarantees rediscoverability, but, as will be shown in this section, is sensitive to both infrequent and incomplete behaviour. To solve this, we introduce two more algorithms using the IMd framework.
Infrequent behaviour Infrequent behaviour in an event log is behaviour that occurs less frequently than ‘normal’ behaviour, i.e. the exceptional cases. For instance, most complaints sent to an airline are handled according to a model, but a few complaints are so complicated that they require ad hoc solutions. This behaviour could be of interest or not, which depends on the goal of the analysis.
Consider again directlyfollows graph \(D_3\), as shown in Fig. 7b, and suppose that there is a single directlyfollows edge added, from c to d. Then, \((\times , \{b, c\}, \{d, e\})\) is not a perfectly matching cut, as with the addition of this edge the two parts \(\{b, c\}\) and \(\{d, e\}\) became connected. Nevertheless, as 9 traces showed exclusive choice behaviour and only one did not, this single trace is probably an outlier and in most cases, a model ignoring this trace would be preferable.
Incompleteness A log in a ‘bigdata setting’ can be assumed to contain lots of behaviour. However, we only see example behaviour and we cannot assume to have seen all possible traces, even if we use the rather weak notion of directlyfollows completeness [30] as we do here. Moreover, sometimes smaller subsets of the log are considered, for instance when performing slicing and dicing in the context of process cubes [48]. For instance, an airline might be interested in comparing the complainthandling process for several groups of customers, to gain insight in how the process relates to age, city, and frequentflyer level of the customer. Then, there might be combinations of age, city, and frequentflyer level that rarely occur and the log for these customers might contain too little information.
If the log contains little information, edges might be missing from the directlyfollows graph and the underlying real process might not be rediscovered. Figure 9 shows an example: the cut \((\{a, b\}, \{c, d\})\) is not a parallel cut as the edge (c, b) is missing. As the event log only provides example behaviour, it could be that this edge is possible in the process, but has not been seen yet. Given this directlyfollows graph, IMd can only give up and return a fallthrough flower model, which yields a very imprecise model. However, choosing the parallel cut \((\{a, b\}, \{c, d\})\) would obviously be a better choice here, providing a better precision.
To handle incompleteness, we introduce Inductive Miner—incompleteness—directlyfollows (IMcD), which adopts ideas of IMc [30] into the IMd framework. IMcD first applies the cut detection of IMd and searches for a cut that perfectly matches a characteristic. If that fails, instead of a perfectly matching cut, IMcD searches for the most probable cut of the directlyfollows graph at hand.
IMcD does so by first estimating the most probable behavioural relation between any two activities in the directly-follows graph. In Fig. 9, the activities a and b are most likely in a sequential relation as there is an edge from a to b. Activities a and c are most likely in parallel as there are edges in both directions. Loops and choices have similar local characteristics. For each pair of activities x and y, the probability \(P_R(x, y)\) that x and y are in relation R is determined. The best cut is then a partition into sets of activities X and Y such that the average probability that \(x \in X\) and \(y \in Y\) are in relation R is maximal. For a formal definition, please refer to [30].
In our example, the probability of cut \((\wedge , \{a, b\}, \{c, d\})\) is the average probability that (a, c), (a, d), (b, c) and (b, d) are parallel. IMcD chooses the cut with highest probability, using optimisation techniques. This approach gives IMcD a runtime exponential in the number of activities, but still requires a single pass over the event log.
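As an illustration of the averaging step only, the score of a candidate cut can be computed as below. The pairwise probabilities are made-up numbers, not the estimates of [30], and the function name is an assumption of this sketch; the optimisation over all partitions is not shown.

```python
from itertools import product


def cut_probability(pair_probability, left, right):
    """Average probability that x in left and y in right are in the relation."""
    pairs = list(product(sorted(left), sorted(right)))
    return sum(pair_probability[pair] for pair in pairs) / len(pairs)


# Hypothetical estimates that each pair is in a parallel relation.
parallel = {("a", "c"): 1.0, ("a", "d"): 0.75, ("b", "c"): 0.25, ("b", "d"): 1.0}
score = cut_probability(parallel, {"a", "b"}, {"c", "d"})
```

With these assumed values, the pair (b, c) scores low because one direction of the edge is missing, yet the average over all four pairs can still be high enough for the parallel cut to be the most probable one.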
4.3 Limitations
The IMd framework imposes some limitations on process discovery. We discuss limiting factors on the challenges identified in Sect. 2: rediscoverability, handling incompleteness, handling noise and handling infrequent behaviour, and balancing fitness, precision, generalisation and simplicity.
Limitations on rediscoverability of the IMd framework are similar to the IM framework: the system must be a process tree and adhere to some restrictions, and the log must be directly-follows complete (as discussed before). If the system does not adhere to the restrictions, then the IMd framework will not give up but rather try to discover as much process-tree-like behaviour as possible. For instance, if a part of the process is sequential, then the IMd framework might be able to discover this, even though the other parts of the process are not block structured. Therefore, in practice, such models can be useful [8, 15]. To formally investigate what happens on non-block-structured models would be an interesting subject of further study. For the remaining non-block-structured parts, the flexibility of the IMd framework easily allows for future customisations, e.g. [36].
If the log contains noise and/or infrequent behaviour, then the IMd framework might choose a wrong cut at some point (as discussed in Sect. 4.2), possibly preventing rediscovery of the system. The noise handling abilities of logbased and directlyfollowsbased algorithms differ in detail; in both, noise and infrequent behaviour manifest as superfluous edges in a directlyfollows graph. On one hand, in IM, such wrong edges might pop up during recursion by reasoning similar to the incompleteness case (which could be harmful), while using the same reasoning, logbased algorithms might have more information available to filter such edges again (which could be beneficial). In the evaluation, we will investigate this difference further.
Given an event log, both types of algorithms have to balance fitness, precision, generalisation and simplicity. For directlyfollowsbased algorithms, this balance might be different.
For instance, a desirable property of discovery algorithms is the ability to preserve fitness, i.e. to discover a model that is guaranteed to include all behaviour seen in the event log. For directly-follows-based algorithms, this is challenging. For instance, Fig. 10c shows a complete directly-follows graph of the process tree \(P_2 = \wedge (\rightarrow (a, b), c)\). However, it is also the directly-follows graph of the event log \(L_2 = \{\langle a, c, b, c, a, b\rangle , \langle c \rangle \}\). Hence, if a fitness-preserving directly-follows-based discovery algorithm were applied to the directly-follows graph in Fig. 10c, it could never return \(P_2\): it would have to seriously underfit/generalise to preserve fitness, since the behaviour of both \(P_2\) and \(L_2\) needs to be included. Therefore, we chose not to guarantee fitness in the IMd framework, while the IM framework by its log splitting indirectly takes such concurrency dependencies into account. Please note that this holds for any pure directly-follows-based process discovery algorithm (see the limitations of the \(\alpha \)-algorithm). Generalisation, i.e. the likelihood that future behaviour will be representable by the model, is similarly influenced.
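The coincidence of the two graphs can be checked mechanically. The sketch below builds a directly-follows graph in a single pass over a list of traces and compares the graph of the language of \(P_2\) with that of \(L_2\). The function names are assumptions of this sketch, and the language of \(P_2\) is written out by hand as the three interleavings of c with a followed by b.

```python
from collections import Counter


def directly_follows_graph(traces):
    """Count edges, start activities and end activities in one pass."""
    edges, starts, ends = Counter(), Counter(), Counter()
    for trace in traces:
        if not trace:
            continue
        starts[trace[0]] += 1
        ends[trace[-1]] += 1
        for source, target in zip(trace, trace[1:]):
            edges[(source, target)] += 1
    return edges, starts, ends


def structure(dfg):
    """Drop the frequencies and keep only which edges and activities occur."""
    return tuple(set(counter) for counter in dfg)


language_p2 = [("a", "b", "c"), ("a", "c", "b"), ("c", "a", "b")]
log_l2 = [("a", "c", "b", "c", "a", "b"), ("c",)]

same_graph = structure(directly_follows_graph(language_p2)) == structure(
    directly_follows_graph(log_l2)
)
```

Both inputs yield the same five edges, the same start activities (a and c) and the same end activities (b and c), although neither trace of \(L_2\) belongs to the language of \(P_2\).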
Algorithms of the IM framework can achieve a high log-precision if they can avoid fallthroughs such as the flower model [28]. Thus, the IM framework achieves the highest log-precision if it can find a cut. The same holds for the IMd framework, and therefore we expect log-precision to largely depend on the cut selection. In the evaluation, we will investigate log-precision further.
The influence of directly-follows-based algorithms on simplicity highly depends on the chosen simplicity measure: both the IM framework and the IMd framework return models in which each activity appears once.
5 Comparing models to logs and models

First, we want to compare algorithms based on the size of event logs they can handle, as well as the quality of the produced models. In particular, both recall/fitness and precision (with respect to the given system model or log) need to be compared, as trivial models exist that achieve either perfect recall or perfect precision, but not both.

Second, we want to assess under which conditions the new algorithms achieve rediscoverability, i.e. under which conditions the partial or incorrect information in the event log allows us to obtain a model that has exactly the same behaviour as the original system that produced the event log. More formally, under which conditions (and up to which sizes of systems) does the discovered model have the same language as the original system?
We first introduce this technique for measuring recall and precision of two models—with the aim of analysing rediscoverability of process mining algorithms (Sect. 5.1). Second, we adopt this technique to also compare a discovered model to a (possibly very large) event log (Sect. 5.2). We use the techniques in our evaluation in Sect. 6.
5.1 Model–model comparison
The recall of a model S and a model M describes the part of the behaviour of S that is captured by M (compare to the conventional fitness notion in process mining), while precision captures the part of the behaviour of M that is also possible in S.
Framework Figure 11 shows an overview of the model–model evaluation framework; formally, the framework takes as input a model S, a model M, and an integer k. S and M must be process models, but can be represented using any formalism with executable semantics.
To measure recall and precision of S and M, we introduce a parameterised technique in which k defines the size of the subsets of activities for which recall and precision shall be computed. Take a subset of activities \(A = \{a_1, \ldots , a_k\}\), such that \(A \subseteq \varSigma (M) \cup \varSigma (S)\), and \(|A| = k\). Then, S and M are projected onto A, yielding \(S_A\) and \(M_A\) (we will show how the projection is performed below). From these projected \(S_A\) and \(M_A\), deterministic finite automata (DFAs) are generated, which are compared to quantify recall and precision. These steps are repeated for all such subsets A, and the average recall and precision over all subsets are reported.
As we aim to apply this method to test rediscoverability, a desirable property is that precision and recall should be 1 if and only if \(\mathcal {L}(S) = \mathcal {L}(M)\). Theorem 1, given later, states that this is the case for the class of process trees used in this paper.
In the remainder of this section, we describe the steps of the framework in more detail after which we give an example and prove Theorem 1.
Projection Many process formalisms allow for projection on subsets of activities; we give a definition for process trees here and sketch projection for Petri nets in Appendix 5.
In principle, any further languagepreserving statespace reduction rules can be applied; we will not explore further options in this paper.
Process model to deterministic finite automaton An automaton describes a language based on an alphabet \(\varSigma \). The automaton starts in its initial state; from each state, transitions labelled with activities from \(\varSigma \) denote the possible steps that can be taken from that state. A state can be an accepting state, which denotes that a trace whose execution ends in that state is accepted by the automaton. An automaton with a finite set of states is a nondeterministic finite automaton (NFA). In case that the automaton does not contain a state from which two transitions with the same activity leave, the automaton is a deterministic finite automaton (DFA). Each NFA can be translated into a DFA, and a language for which a DFA exists is a regular language; for each DFA, there exists a reduced unique minimal version [35].
First, process trees are defined using regular expressions in Appendix 1, which can be transformed straightforwardly into an NFA (we used the implementation [37], which provides a shuffle operator). Second, a simple procedure transforms the NFA into a DFA [35].
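One standard procedure for the NFA-to-DFA step is the subset (powerset) construction. Whether the implementation used here follows exactly this procedure is not stated, so the sketch below is only an illustration. It assumes an NFA without silent transitions, given as a dictionary from (state, activity) to a set of successor states; all names and the toy automaton are hypothetical.

```python
def nfa_to_dfa(initial, accepting, transitions, alphabet):
    """Subset construction for an NFA without epsilon transitions.

    transitions maps (state, symbol) to a set of successor states.
    Each DFA state is a frozenset of NFA states.
    """
    start = frozenset([initial])
    dfa_transitions, accepting_sets = {}, set()
    todo, seen = [start], {start}
    while todo:
        current = todo.pop()
        if current & accepting:
            accepting_sets.add(current)
        for symbol in alphabet:
            target = frozenset(
                s for state in current for s in transitions.get((state, symbol), ())
            )
            if not target:
                continue  # no move on this symbol: transition stays undefined
            dfa_transitions[(current, symbol)] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    return start, accepting_sets, dfa_transitions


def accepts(dfa, trace):
    """Replay a trace on the DFA and report whether it ends in an accepting state."""
    state, accepting, transitions = dfa
    for symbol in trace:
        state = transitions.get((state, symbol))
        if state is None:
            return False
    return state in accepting


# Toy NFA: after "a" the automaton is in state 1 or 2; it accepts "ab" and "aa".
toy_nfa = {(0, "a"): {1, 2}, (1, "b"): {3}, (2, "a"): {3}}
toy_dfa = nfa_to_dfa(0, {3}, toy_nfa, "ab")
```

By construction, every DFA state has at most one outgoing transition per activity, which is the determinism property described above.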
The translation of our example \(S_{\{a, b\}}\) and \(M_{\{a, b\}}\) to DFAs results in the automata shown in Fig. 14a, b.
Outgoing edge counting of our running example
Recall for activity subset \(\{a, b\}\)  

State in \(\text {DFA}(S_{\{a, b\}})\)  Outgoing edges  State in \(\text {DFAc}(S, M, \{a, b\})\)  Outgoing edges 
\(s_1\)  3  \(s_1m_1\)  1 
\(s_2\)  1  \(s_2m_2\)  1 
\(s_3\)  1  \(s_3m_3\)  1 
\(s_4\)  1  –  0 
Precision for activity subset \(\{a, b\}\)  

State in \(\text {DFA}(M_{\{a, b\}})\)  Outgoing edges  State in \(\text {DFAc}(S, M, \{a, b\})\)  Outgoing edges 
\(m_1\)  2  \(s_1m_1\)  1 
\(m_2\)  1  \(s_2m_2\)  1 
\(m_3\)  1  \(s_3m_3\)  1 
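Assuming that the score for one activity subset is the ratio between the summed outgoing-edge counts of the conjunction automaton and those of the reference automaton (an assumption of this sketch; the defining formula is not reproduced in this section), the counts in the table can be turned into numbers as follows. The function name is hypothetical.

```python
def edge_ratio(conjunction_counts, reference_counts):
    """Share of the reference automaton's outgoing edges kept in the conjunction."""
    return sum(conjunction_counts) / sum(reference_counts)


# Counts copied from the table above; a missing conjunction state counts as 0.
recall_ab = edge_ratio([1, 1, 1, 0], [3, 1, 1, 1])
precision_ab = edge_ratio([1, 1, 1], [2, 1, 1])
```

Under that assumption, the subset \(\{a, b\}\) would contribute a recall of 3/6 and a precision of 3/4 to the averages over all subsets.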
5.1.1 Over all activities
Framework guarantees Using these definitions, we prove that the framework is able to detect language equivalence between process trees of the class that can be rediscovered by IM and IMd. This theorem will be useful in later evaluations, where from recall and precision being 1, we can conclude that the system was rediscovered.
Theorem 1
Let S and M be process trees without duplicate activities and without \(\tau \)s. Then, \({\textit{recall}}(S, M, 2) = 1 \wedge {\textit{precision}}(S, M, 2) = 1 \Leftrightarrow \mathcal {L}(S) = \mathcal {L}(M)\).
The proof strategy is to prove the two directions of the implication separately, using that for such trees, there exists a languageunique normal form [29, Corollary 15]. For a detailed proof, see Appendix 3. As for sound freechoice unlabelled workflow nets without short loops the directlyfollows graph defines a unique language [61], Theorem 1 applies to these nets as well.
Corollary 2
Let S and M be sound freechoice unlabelled workflow nets without short loops. Then, \({\textit{recall}}(S, M, 2) = 1 \wedge {\textit{precision}}(S, M, 2) = 1 \Leftrightarrow \mathcal {L}(S) = \mathcal {L}(M)\).
Unfortunately, this theorem does not hold for general process trees. For instance, take \(S = \times (a, b, c, \tau )\) and \(M = \times (a, b, c)\). For \(k = 2\), the framework will consider the subtrees \(\times (a, b, \tau )\), \(\times (a, c, \tau )\) and \(\times (b, c, \tau )\) for both S and M, thus will not spot any difference: \({\textit{recall}} = 1\) and \({\textit{precision}} = 1\), even though the languages of S and M are clearly different. Only for \(k = 3\), the framework will detect the difference.
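The counterexample can be replayed on the finite languages directly. The sketch below uses language equality of the projections as a stand-in for recall and precision both being 1, which is a simplification; the function names are hypothetical, and the approach only works for finite languages written out as sets of traces.

```python
from itertools import combinations


def project(language, subset):
    """Project each trace onto the subset; dropped activities become silent."""
    return {tuple(a for a in trace if a in subset) for trace in language}


def first_distinguishing_k(lang_s, lang_m, alphabet):
    """Smallest subset size k at which some projection differs, else None."""
    for k in range(1, len(alphabet) + 1):
        for subset in combinations(sorted(alphabet), k):
            if project(lang_s, set(subset)) != project(lang_m, set(subset)):
                return k
    return None


lang_s = {("a",), ("b",), ("c",), ()}  # language of xor(a, b, c, tau)
lang_m = {("a",), ("b",), ("c",)}      # language of xor(a, b, c)
k_needed = first_distinguishing_k(lang_s, lang_m, {"a", "b", "c"})
```

For every subset of size 1 or 2, the activity that is projected away in M produces the empty trace, so both projections coincide; only the full alphabet exposes the missing empty trace.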
In Sect. 6, we use the algorithm framework to test incompleteness, noise, and infrequent behaviour on large models. Before that, we first show that the ideas of the framework can also be used to compare models to event logs.
5.2 Log–model comparison
Note that L is a multiset: if a trace appears multiple times in L, it contributes multiple times as well. This is repeated for all subsets of activities A of a certain length k, similarly to the model–model comparison. Note that besides a fitness/precision number, the subsets A also provide clues where deviations in the log and model occur.
6 Evaluation
 RQ1
What is the largest event log (number of events/traces or number of activities) that process discovery algorithms can handle?
 RQ2
Are singlepass algorithms such as the algorithms of the IMd framework able to rediscover the system? How large do event logs have to be in order to enable this rediscovery? How do these algorithms compare to classical algorithms?
 RQ3
Can the system also be rediscovered if an event log contains unstructured noise or structured infrequent behaviour? How does model quality of the newly introduced algorithms suffer compared to other algorithms?
 RQ4
Can pcc framework handle logs that existing measures cannot handle? How do both sets of measures compare on smaller logs?
To answer RQ4, we conducted another experiment: we use reallife logs, apply discovery algorithms, and measure fitness and logprecision, using both the pcc framework and existing measures (Sect. 6.5). All algorithms of the IMd framework and pcc framework are implemented as plugins of the ProM framework,^{1} taking as input a directlyfollows graph. Directlyfollows graphs were generated using an external Python script. For more details on the setup, please refer to Appendix 4.
6.1 Scalability of IMd versus other discovery algorithms
First, we compare the IMd algorithms with several other discovery algorithms in their ability to handle big event logs and complex systems using limited main memory.
Setup All algorithms were tested on the same set of XES event logs, which have been created randomly from three process trees, of (A) 40 activities, (B) 1000 activities, and (C) 10,000 activities. The three trees have been generated randomly.
For each tree, we first generate a random log of t traces, starting t at 1. Second, we test whether an algorithm returns a model for that log when allocated 2GB of main memory, i.e. the algorithm terminates with a result and does not crash. If successful, we multiply t by 10 and repeat the procedure. The maximum t is recorded for each algorithm and process tree A, B, and C.
\(\alpha \)Algorithm  (\(\alpha \))  [49]  ProM 6.5.1a 
Heuristics Miner  (HM)  [63]  ProM 6.5.1a 
Integer Linear Programming  (ILP)  [54]  ProM 6.5.1a 
immediately_follows_cnet_from_log  (PIF)  [16]  PMLAB 
pn_from_ts  (PPT)  [16]  PMLAB 
Inductive Miner  (IM)  [29]  ProM 6.5.1a 
Inductive Miner—infrequent  (IMf)  [28]  ProM 6.5.1a 
Inductive Miner—incompleteness  (IMc)  [30]  ProM 6.5.1a 
IM—directlyfollows  (IMd)  This paper  ProM 6.5.1a 
IM—infrequent—directlyfollows  (IMfD)  This paper  ProM 6.5.1a 
IM—incompleteness—directlyfollows  (IMcD)  This paper  ProM 6.5.1a 
The soundnessguaranteeing algorithms ETM, CCM, and MPM were not included, as ETM is a nondeterministic algorithm and requires long run times to discover reasonable models, and as for CCM and MPM, there is no implementation publicly available. It would be interesting to test these as well.
Event logs The complexities of the event logs are shown in Table 2; they were generated randomly from trees A, B, or C. From this table, we can deduce that the average trace length in (A) is 37 events, in (B) 109, and in (C) 764; Appendix 6 shows additional statistics. Thus, the average trace length increases with the number of activities.
The largest log we could generate for A was 217 GB (\(10^8\) traces), limited by disk space. For the trees B and C, the largest logs we could generate were \(10^6\) and \(10^5\) traces, but now limited by RAM. For the bigger logs, the traces were directly transformed into a directlyfollows graph and the log itself was not stored. In Table 2, these logs are marked with *.
Log complexity (\(^{*}\) denotes that a directly-follows graph was generated). Trees: A, 40 activities; B, 1000 activities (complex); C, 10,000 activities (more complex)
Traces  A events  A activities  B events  B activities  C events  C activities
1  21  21  190  52  81  66
10  309  40  922  359  8796  1932
\(10^2\)  3567  40  9577  802  77,664  7195
\(10^3\)  37,415  40  112,821  973  780,535  9589
\(10^4\)  370,687  40  1,106,495  999  7,641,398  9991
\(10^5\)  3,697,424  40  10,908,461  1000  76,663,981  10,000
\(10^6\)  36,970,718  40  109,147,057  1000  764,585,193  10,000*
\(10^7\)  369,999,523  40  1,090,802,965  1000*  7,644,466,866  10,000*
\(10^8\)  3,700,046,394  40  10,908,051,834  1000*  76,477,175,661  10,000*
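The average trace lengths quoted in the text follow from the last row of the table above by dividing the number of events by the number of traces. A minimal check, using integer division (an assumption about how the quoted figures were rounded):

```python
# Event counts of the 10^8-trace logs, copied from the table above.
events_at_1e8 = {"A": 3_700_046_394, "B": 10_908_051_834, "C": 76_477_175_661}
traces = 10**8

# Integer (floor) division gives whole events per trace.
average_length = {tree: events // traces for tree, events in events_at_1e8.items()}
```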
Scalability: maximum number of traces an algorithm could handle
Algorithm  A: 40 activities  B: 1000 activities  C: 10,000 activities
\(\alpha \)  10,000  100  1
HM  1,000,000  1,000,000  1
ILP  1000  100  1
PIF  10,000  0*  1*
PPT  10,000  0*  1*
IM  100,000  100,000  1000
IMd  100,000,000  100,000,000  100,000,000
IMf  100,000  100,000  1000
IMfD  100,000,000  100,000,000  100,000,000
IMc  100,000\(\dagger \)  1  10*
IMcD  100,000,000\(\dagger \)  1  10*
This experiment clearly shows the scalability of the IMd framework, which handles larger and more complex logs easily (IMd and IMfD handle \(10^8\) traces, \(7\times 10^{10}\) events, and \(10^4\) activities). Moreover, it shows the inability of existing approaches to handle larger and complex logs: on tree C, the most scalable other algorithms were IM and IMf, which both handled only 1000 traces. Furthermore, it shows the limited use sampling would have on such logs: the logs manageable for other algorithms (i.e. 1000 traces for tree C) do not contain all activities yet. We discuss the results in detail in Appendix 7.
Time Timewise, it took a day to obtain a directly-follows graph from the log of \(10^8\) traces of tree A (using the preprocessing Python script); after that, discovering a process model was a matter of seconds for IMd and IMfD. For the largest logs they could handle, PIF, \(\alpha \), HM, IM, IMf, and IMc took a few minutes; PPT took days, ILP a few hours. In comparison, on the logs that ILP could handle, creating a directly-follows graph took a few seconds, just as applying IMd.
6.2 The influence of incompleteness on rediscoverability
Setup For each model generated in the scalability experiment, we measure recall and precision with respect to tree A, B, or C using the pcc framework. Given the results of the scalability experiment, we include the algorithms IM, IMf, IMc, HM, IMd, IMfD, IMcD, and a baseline model allowing for any behaviour (a flower model).
As HM does not guarantee to return a sound model, nor provides a final marking, we obtain a final marking using the method described in Appendix 5. However, even with this method we were unable to determine the languages of the models returned by HM, and thus, these were excluded.
Results Figure 17 shows that for model A (40 activities) both IM and IMd rediscover the language of A (a model that has 1.0 model-precision and recall with respect to A) on a log of \(10^4\) traces. As can be seen in Fig. 18, IMd could rediscover the language of B at \(10^8\) traces; IM did not succeed, as the largest log it could handle (\(10^5\) traces) did not contain enough information to rediscover the language of B. The largest log we generated for tree C, i.e. containing \(10^8\) traces, did not contain enough information to rediscover the language of C: IMd discovered a model with a recall of 1.0 and a model-precision of 0.97. Corresponding results have been obtained for IMf/IMfD and IMc/IMcD; see Appendix 8 for all details. The flower model provided the baseline for precision: it achieved recall 1.0 at \(10^1\) (A) and \(10^2\) (B) traces and achieved a model-precision of 0.8.
6.3 The influence of noise on rediscoverability
To answer RQ3, we tested how noise in the event log influences rediscovery. We took the event log of \(10^4\) traces of tree B, as that log was well handled by several algorithms but did not reach perfect recall and precision in the incompleteness experiment, indicating that discovery is possible but challenging. To these \(10^4\) traces, we add n noisy traces, for tenfold increasing n from 1 to \(10^5\), i.e. the logs have 10,001–110,000 traces. A noisy trace is obtained from a normal trace by adding or removing a random event (both with 0.5 probability). By the representational bias of the process trees used in the generation, such a trace is guaranteed to not fit the original model. To each of these logs, we applied IM, IMf, IMd, and IMfD and measured recall and precision with respect to the unchanged system B. No artificial RAM limit was enforced.
Notice that when \(10^5\) noisy traces are added, only 9 % of the traces remain noise-free. The directly-follows graph of this noisy log contains 118,262 directly-follows edges, while the graph of the model would just have 32,012. Moreover, almost all activities are observed as start activities (946) and end activities (929) in this log (vs 244/231 in the model). It is clear that without serious noise filtering, no algorithm could make any sense of this log.
Results Figure 19 shows the comparison of the two noise-filtering algorithms IMf and IMfD on the logs of B with various noise levels. Surprisingly, IMfD performs better than IMf: IMfD achieves consistently higher precision at only a slight drop in recall compared to IMf, whose precision drops to 0.8, which is close to the flower model (i.e. no actual restriction of behaviour). The perfect recall obtained by IM on large logs can be explained by the fallthroughs of IMd and IM: if no cut can be found, a flower model is selected. For IM and IMd, we consistently observed lower precision scores for all models compared to both IMf and IMfD but a consistent fitness of 1.0 (which is easily explained by their lack of noise handling capabilities); exact numbers and more details are available in Appendix 8.
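The noise injection can be sketched as follows. The function name is hypothetical, and two details are assumptions of this sketch because they are not specified here: the added event is drawn uniformly from the alphabet, and an empty trace always receives an added event.

```python
import random


def make_noisy(trace, alphabet, rng):
    """Return a copy of the trace with one random event added or removed."""
    noisy = list(trace)
    if noisy and rng.random() < 0.5:
        # Remove one event at a random position.
        del noisy[rng.randrange(len(noisy))]
    else:
        # Insert one random activity at a random position (assumed uniform).
        position = rng.randrange(len(noisy) + 1)
        noisy.insert(position, rng.choice(sorted(alphabet)))
    return noisy


rng = random.Random(42)  # fixed seed so the sketch is reproducible
example = make_noisy(["a", "b", "c"], {"a", "b", "c", "d"}, rng)
```

Each noisy trace differs in length from its original by exactly one event; whether it then fails to fit the model depends on the model, as argued above for the process trees used here.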
A manual inspection of the models returned shows that all models still give information on the overall structure of the system, while for larger parts of the model no structure could be discovered and a flower submodel was discovered. In this limited experiment, IMfD is the clear winner: it keeps precision highest in return for a small drop in recall. We suspect that this is due to IMfD using less information than IMf and therefore the introduced noise has a larger impact (see Sect. 4.3). More experiments are needed to evaluate this hypothesis.
6.4 The influence of infrequent behaviour on rediscoverability
Similar to the noise experiment, the log with \(10^5\) added infrequent traces has a lot of wrong behaviour: without infrequent behaviour, its directly-follows graph would contain 32,012 edges, 244 start activities, and 231 end activities, but the deviating log contained 118,262 edges, 946 start activities, and 929 end activities. That means that if one would randomly pick an edge from the log, there would be only a 27 % chance that the chosen edge is in accordance with the model. Exact numbers and more details are available in Appendix 8.
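The two percentages quoted for the largest deviating logs can be recomputed from the figures given in the text: the share of unchanged traces when \(10^5\) deviating traces are added to \(10^4\) original ones, and the share of observed edges that also occur in the model's graph.

```python
# Share of traces that are unchanged after adding 10^5 deviating traces.
unchanged_share = 10**4 / (10**4 + 10**5)

# Share of observed directly-follows edges that the model also has.
model_edge_share = 32_012 / 118_262

percentages = (round(unchanged_share * 100), round(model_edge_share * 100))
```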
Results Figure 20 shows the results of IMf and IMfD on logs of process tree B with various levels of added infrequent behaviour. Similar to the noise experiments, IMfD compared to IMf trades recall (0.95 vs 0.99) for logprecision (0.95 vs 0.90). Of the nonfiltering versions IM and IMd, both got a recall of 0.99, IM a modelprecision of around 0.90, and IMd 0.92, and thus, IMd performs a bit better in this experiment.
We suspect that two of the inserted types of infrequent behaviour positively influenced the results: skipping a child of a \(\rightarrow \) or \(\wedge \) has no troublesome impact on the directly-follows graph for the IMd framework, but log splitting will introduce a (false) empty trace for the IM framework. The IM framework algorithms must decide to ignore this empty trace in later recursions, while IMd framework algorithms simply do not see it. Altogether, IMfD performs remarkably well given event logs containing structured deviations from the system.
6.5 Reallife model–log evaluation
To test reallife performance of the new algorithms and to answer RQ4, i.e. whether the newly introduced fitness and precision measures can handle larger logs and how they compare to existing measures, we performed a fourth experiment.
Experimental setup In this experiment, we take four reallife event logs. To these event logs, we apply the algorithms \(\alpha \), HM, IM, IMf, IMd, and IMfD and analyse the resulting models manually. The algorithms CCM and MPM are not publicly available and were excluded.
Second, in order to evaluate the pcc framework, we apply the pcc framework and existing fitness [51] and logprecision [1] measures to the discovered models. The models by HM and \(\alpha \) were unsound and had to be excluded (we introduced some heuristics for unsound models, but they did not help in this case. For more details, see Appendix 5). Furthermore, IM and IMd do not apply noise filtering and therefore their models are often flower models, so these were excluded as well.
Results for process discovery The first step in this experiment was to apply several process discovery algorithms.
Log measures compared on real-life logs (existing techniques versus the pcc framework of this paper; – marks a value that is unavailable because the existing measure ran out of memory)
Log  Algorithm  Fitness [51]  Log-precision [1] (measured)  Log-precision [1] (scaled)  Time (existing)  Fitness (pcc)  Log-precision (pcc, measured)  Log-precision (pcc, scaled)  Time (pcc)
BPIC11  IMf  Out of memory  –  –  –  0.627  0.764  0.472  25 s
BPIC11  IMfD  Out of memory  –  –  –  0.997  0.766  0.477  1 m
BPIC11  Flower  1.000  0.002  0.000  5 h  1.000  0.553  0.000  25 s
BPIC12\(_A\)  IMf  0.995  0.606  0.940  \(\le \)1 s  0.999  0.967  0.931  \(\le \)1 s
BPIC12\(_A\)  IMfD  0.816  1.000  1.000  \(\le \)1 s  0.700  1.000  1.000  \(\le \)1 s
BPIC12\(_A\)  Flower  1.000  0.227  0.000  \(\le \)1 s  1.000  0.520  0.000  \(\le \)1 s
BPIC12\(_O\)  IMf  0.991  0.508  0.351  \(\le \)1 s  0.981  0.809  0.407  \(\le \)1 s
BPIC12\(_O\)  IMfD  0.861  0.384  0.187  \(\le \)1 s  0.862  0.794  0.360  \(\le \)1 s
BPIC12\(_O\)  Flower  1.000  0.242  0.000  \(\le \)1 s  1.000  0.678  0.000  \(\le \)1 s
BPIC12\(_W\)  IMf  0.876  0.690  0.553  \(\le \)1 s  0.875  0.836  0.611  \(\le \)1 s
BPIC12\(_W\)  IMfD  0.914  0.300  \(-\)0.010  \(\le \)1 s  0.923  0.823  0.581  \(\le \)1 s
BPIC12\(_W\)  Flower  1.000  0.307  0.000  \(\le \)1 s  1.000  0.578  0.000  \(\le \)1 s
BPIC12  IMf  0.967  0.364  0.290  20 m  0.978  0.668  0.092  \(\le \)1 s
BPIC12  IMfD  1.000  0.189  0.095  25 m  1.000  0.693  0.161  \(\le \)1 s
BPIC12  Flower  1.000  0.104  0.000  30 m  1.000  0.634  0.000  \(\le \)1 s
SL  IMf  Out of memory  –  –  –  0.584  0.246  \(-\)0.158  30 m
SL  IMfD  Out of memory  –  –  –  0.924  0.385  0.055  30 m
SL  Flower  Out of memory  –  –  –  1.000  0.349  0.000  35 m
CS  IMf  Out of memory  –  –  –  0.999  0.580  0.023  1 h
CS  IMfD  Out of memory  –  –  –  0.999  0.585  0.036  6.5 h
CS  Flower  Out of memory  –  –  –  1.000  0.570  0.000  55 m
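The scaled log-precision values in the table above appear to rescale the measured value so that the flower model of the same log maps to 0 and a perfectly precise model maps to 1. This relation is inferred from the table entries and is not stated in this section, so the sketch below should be read as an assumption; the function name is hypothetical.

```python
def scaled_precision(measured, flower):
    """Rescale so that the flower model scores 0 and perfect precision scores 1."""
    return (measured - flower) / (1.0 - flower)


# (measured, flower baseline, scaled value reported in the table)
checks = [
    (0.300, 0.307, -0.010),  # BPIC12_W, IMfD, existing measure [1]
    (0.508, 0.242, 0.351),   # BPIC12_O, IMf, existing measure [1]
    (0.967, 0.520, 0.931),   # BPIC12_A, IMf, pcc framework
    (0.246, 0.349, -0.158),  # SL, IMf, pcc framework
]
deviations = [abs(scaled_precision(m, f) - s) for m, f, s in checks]
```

Under this reading, a negative scaled value means that the model was measured as less precise than the flower model of the same log.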
On the SL log, HM, IM, IMf, IMd, and IMfD produced a model, and \(\alpha \) could not proceed beyond passing over the event log. The models discovered by HM, IMf, and IMfD are shown in Fig. 23 (we uploaded these models to http://www.processmining.org/blogs/pub2015/scalable_process_discovery_and_evaluation). The model discovered by HM has 2 unconnected parts, of which one part cannot be executed. Hence, it is not a workflow model, thus not sound and, as discussed before, difficult to analyse automatically. In the models discovered by IMf and IMfD, the five RapidProM operators are easily recognisable. However, the models are too complex to be analysed in detail by hand.^{2} Therefore, in further analysis steps, problematic parts of the models by IMf and IMfD could be identified, the log filtered for them, and the analysis repeated.
On the CS log, IM, IMf, IMd, and IMfD produced a model. IMd and IMfD returned a model in less than 30 s using less than 1 GB of RAM, while IM and IMf took more than an hour and used 30 GB of RAM. As CS has five times more activities than SL, we could not visualise it. This illustrates that scalable process discovery is a first step in scalable process mining: the models we obtained are suitable for automatic processing, but human analysis without further visualisation techniques is very challenging.
Results for log conformance checking. Table 4 shows the results, extended with the approximate running times of the techniques.
Fitness scores according to the pcc framework differ from the fitness scores by van der Aalst et al. [51] by at most 0.05 (except for BPIC12\(_A\) IMfD). Thus, this experiment suggests that the new fitness measurement could replace the alignment-based fitness metric [51], while being generally faster on both smaller and larger logs, though additional experiments may be required to verify this hypothesis. More importantly, the pcc framework could handle logs (BPIC11, SL, CS) that the existing measure could not handle.
Comparing the scaled precision measures, the pcc framework and the existing approach agree on the relative order of IMf and IMfD for BPIC12\(_A\) and BPIC12\(_O\), disagree on BPIC12, and are incomparable on BPIC11, SL, and CS due to failure of the existing measure. For BPIC12\(_W\), IMfD performed worse than the flower model according to Adriansyah et al. [1] but better according to our measure. This model, as shown in Fig. 24, is certainly more restrictive than a flower model, which is correctly reflected by our new precision measure. Therefore, the approach of Adriansyah et al. [1] likely encounters an inaccuracy when computing the precision score. For BPIC12, precision [1] ranks IMf higher than IMfD, whereas our precision ranks IMfD higher than IMf. Inspecting the models, we found that IMf misses one activity from the log while IMfD has all activities. Apparently, our new measure penalises a missing activity more heavily, while the existing alignment-based measure penalises a missing structure more heavily.
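The scaled precision values discussed here can be recomputed from the raw precision columns. The sketch below assumes that scaling is a linear rescaling in which the flower model maps to 0 and a perfectly precise model to 1; the function name `scaled_precision` is ours and not part of any tool. Under this assumption, a negative value means that a model is less precise than the flower model.

```python
def scaled_precision(precision: float, flower_precision: float) -> float:
    """Rescale precision so that the flower model scores 0 and a perfect model 1.

    Assumed definition: (p - p_flower) / (1 - p_flower).
    """
    if flower_precision >= 1.0:
        raise ValueError("flower model precision must be below 1")
    return (precision - flower_precision) / (1.0 - flower_precision)


# pcc framework, SL log: IMf precision 0.246, flower model precision 0.349.
sl_imf = scaled_precision(0.246, 0.349)
# Existing measure, BPIC12_W log: IMfD precision 0.300, flower model precision 0.307.
w_imfd = scaled_precision(0.300, 0.307)

print(round(sl_imf, 3), round(w_imfd, 3))
```

With the raw values from Table 4, this gives roughly -0.158 for IMf on SL (pcc framework) and roughly -0.010 for IMfD on BPIC12\(_W\) (existing measure), i.e. both models score below the flower model on the respective measure.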
A similar effect is visible for SL: IMf achieves a lower precision than the flower model. Further analysis revealed that several activities were missing from the model by IMf. The following example illustrates the effect: let \(L = \{\langle a, b \rangle \}\) be a projected log and \(M = a\) a projected model. Then, technically, their conjunction is empty and hence both precision and recall are 0. This matches intuition, as they have no trace in common. This sensitivity to missing activities is inherent to language-based measuring techniques. From the model discovered by IMf, 45 activities are missing, which means that of the 36,585 pairs of activities that are considered for precision and recall, 11,160 involve a missing activity.
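Both observations in this paragraph can be checked with a few lines of Python. This is a minimal sketch under two assumptions of ours: finite sets of traces stand in for the automata that the framework actually compares, and the total of 271 activities is inferred from the pair count (it is the only \(n\) for which \(n(n-1)/2 = 36{,}585\)). All function names are hypothetical.

```python
from math import comb
from typing import Set, Tuple

Trace = Tuple[str, ...]


def language_precision_recall(log: Set[Trace], model: Set[Trace]) -> Tuple[float, float]:
    """Precision and recall of two finite languages via their intersection."""
    common = log & model
    precision = len(common) / len(model) if model else 0.0
    recall = len(common) / len(log) if log else 0.0
    return precision, recall


def pairs_with_missing(n_activities: int, n_missing: int) -> Tuple[int, int]:
    """Return (all activity pairs, pairs containing at least one missing activity)."""
    total = comb(n_activities, 2)
    unaffected = comb(n_activities - n_missing, 2)
    return total, total - unaffected


# Projected log L = {<a, b>} and projected model M = a share no trace.
toy = language_precision_recall({("a", "b")}, {("a",)})

# 271 activities is an inferred value, 45 missing activities is from the text.
total_pairs, affected_pairs = pairs_with_missing(271, 45)

print(toy, total_pairs, affected_pairs)
```

The toy case yields precision and recall of 0, and the pair count reproduces the 36,585 and 11,160 reported above: almost a third of all projections are affected by the missing activities.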
This experiment does not suggest that our new measure can directly replace the existing measures, but precision seems to be able to provide a categorisation, such as good/mediocre/bad precision, compared to the flower model.
Altogether, we showed that our new fitness and precision metrics are useful to quickly assess the quality of a discovered model and decide whether to continue analyses with it or not, in particular on event logs that are too large for current techniques. In addition to simply providing an aggregated fitness and precision value, both the existing techniques and our new technique allow for more fine-grained diagnostics of where in the model and event log fitness and precision are lost. For instance, by looking at the subsets \(a_1 \ldots a_k\) of activities with a low fitness or precision score, one can identify the activities that are not accurately represented by the model and then refine the analysis of the event log accordingly.
For most event logs, IMfD seems to perform comparably to IMf. However, note that by the nature of fitness and log-precision, for each event log there exists a trivial model that scores perfectly on both, i.e. the model consisting of a choice between all traces. As such a model provides neither any new information nor insight, generalisation and simplicity have to be taken into account as well. As future work, we would like to adapt generalisation metrics to be applicable to large event logs and complex processes as well.
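One way to turn per-subset scores into such a diagnosis is to count, for every activity, how often it occurs in a low-scoring subset. The sketch below is only an illustration: the scores, the threshold of 0.8, and the function name `rank_activities` are hypothetical and do not come from the experiments.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, List, Tuple


def rank_activities(
    subset_scores: Dict[FrozenSet[str], float], threshold: float = 0.8
) -> List[Tuple[str, float]]:
    """Rank activities by the share of their subsets that score below threshold."""
    low = defaultdict(int)   # number of low-scoring subsets per activity
    seen = defaultdict(int)  # number of subsets per activity
    for subset, score in subset_scores.items():
        for activity in subset:
            seen[activity] += 1
            if score < threshold:
                low[activity] += 1
    ratios = {activity: low[activity] / seen[activity] for activity in seen}
    # Worst activities first; ties are broken alphabetically.
    return sorted(ratios.items(), key=lambda item: (-item[1], item[0]))


# Made-up precision scores for all subsets of size k = 2 over {a, b, c}.
scores = {
    frozenset({"a", "b"}): 1.0,
    frozenset({"a", "c"}): 0.4,
    frozenset({"b", "c"}): 0.5,
}
ranking = rank_activities(scores)
print(ranking)
```

In this made-up example, activity c appears only in low-scoring subsets and is therefore ranked first, which would make it the natural starting point for filtering the log and repeating the analysis.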
7 Conclusion
Process discovery aims to obtain process models from event logs, while conformance checking aims to obtain information from the differences between a model and either an event log or a system model. Currently, there is no process discovery technique that works on larger and more complex logs, i.e. containing billions of events or thousands of activities, and that guarantees both soundness and rediscoverability. Moreover, current log conformance checking techniques cannot handle medium and complex logs, and as current process discovery evaluation techniques are based on log conformance checking, algorithms cannot be evaluated for medium and complex logs. In this paper, we pushed the boundary on what can be done with larger and more complex logs.
For process discovery, we introduced the Inductive Miner - directly-follows (IMd) framework and three algorithms using it. The input of the framework is a directly-follows graph, which can be obtained from any event log in linear time, for instance using highly scalable techniques such as map-reduce. The IMd framework uses a divide-and-conquer strategy that recursively builds a process model by splitting the directly-follows graph and recursing on the subgraphs until it encounters a base case.
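To make the single-pass property concrete, the following sketch builds a directly-follows graph from an iterable of traces. It is a minimal illustration rather than the implementation used in the experiments; the function name and the representation (edge, start, and end counters) are our assumptions, and activities are assumed to be strings.

```python
from collections import Counter
from typing import Iterable, Sequence, Tuple


def directly_follows_graph(
    traces: Iterable[Sequence[str]],
) -> Tuple[Counter, Counter, Counter]:
    """Count directly-follows edges, start and end activities in one pass."""
    edges: Counter = Counter()
    starts: Counter = Counter()
    ends: Counter = Counter()
    for trace in traces:  # each trace is read once and can then be discarded
        previous = None
        for activity in trace:
            if previous is None:
                starts[activity] += 1
            else:
                edges[(previous, activity)] += 1
            previous = activity
        if previous is not None:  # skip empty traces
            ends[previous] += 1
    return edges, starts, ends


log = [("a", "b", "c"), ("a", "c", "b"), ("a", "b", "c")]
edges, starts, ends = directly_follows_graph(log)
print(dict(edges), dict(starts), dict(ends))
```

Because only the counters are kept, memory grows with the number of distinct activity pairs and not with the number of traces, and `traces` may be a generator that streams a log from disk.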
We showed that the memory usage of algorithms of the IMd framework is independent of the number of traces in the event log considered. In our experiments, the scalability was only limited by the logs we could generate. The IMd framework managed to handle over 70 billion events, while using only 2 GB of RAM; some other techniques required the event log to be in main memory and therefore could handle at most 1–10 million events. Besides scalability, we also investigated how the new algorithms compare qualitatively to existing techniques that use more knowledge, but also have higher memory requirements. The new algorithms handled systems of 10,000 activities in polynomial time and were robust to incompleteness, noise, and infrequent behaviour. Moreover, they always returned sound models and suffered little loss in quality compared to multi-pass algorithms; in some cases we even observed quality improvements.
For conformance checking, we introduced the projected conformance checking framework (pcc framework), which is applicable to both log–model and model–model conformance checking. The pcc framework measures recall/fitness and precision by projecting both the system and the system model or log onto subsets of activities and comparing the projections. Using this framework, one can measure recall/fitness and precision of arbitrary models with a bounded state space of (almost) arbitrary size.
The pcc framework’s model–model capabilities enable a novel way to evaluate discovery techniques that scales well and provides new insights. We applied this to test robustness of various algorithms to incompleteness, noise, and infrequent behaviour. Moreover, we showed that the log–model version of the pcc framework makes it possible to measure fitness and precision of a model with respect to an event log, even in cases where classical techniques fail, and can give detailed insights into the location of deviations in both log and model.
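The projection step can be sketched as follows: every subset of \(k\) activities is enumerated, and each trace is reduced to the events of that subset. This is an illustrative sketch on the log side only; the function names are hypothetical, and the subsequent comparison of projected log and projected model is not shown.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Sequence, Set, Tuple

Trace = Tuple[str, ...]


def project(trace: Sequence[str], subset: FrozenSet[str]) -> Trace:
    """Keep only the events of the trace whose activity is in the subset."""
    return tuple(activity for activity in trace if activity in subset)


def projected_logs(traces: List[Trace], k: int) -> Dict[Tuple[str, ...], Set[Trace]]:
    """Project the log onto every subset of k activities."""
    alphabet = sorted({activity for trace in traces for activity in trace})
    result = {}
    for subset in combinations(alphabet, k):
        chosen = frozenset(subset)
        result[subset] = {project(trace, chosen) for trace in traces}
    return result


log = [("a", "b", "c"), ("a", "c")]
projections = projected_logs(log, 2)
for subset, language in sorted(projections.items()):
    print(subset, sorted(language))
```

Each projection involves only \(k\) activities, so its state space stays small regardless of the size of the full model; the price is that the number of projections grows with the number of subsets of size \(k\).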
Altogether, we have presented the first steps of process mining workflows on very large data sets: discovering a model and assessing its quality. However, as we encountered in our evaluation, further steps are needed in the processing and visualisation of large models, such as using natural language-based techniques [5]. To ease the analyses in contexts of big data, our algorithm evaluation framework could be combined with the approach in [45], by having our framework detect the problematic sets of activities, and the approach in [45] focus on these submodels. For instance, it would be interesting to approximate performance measures on the model without computing alignments.
Furthermore, it would be interesting to study the influence of k on the pcc framework, both practically and theoretically. As shown in Sect. 5.1, there exist cases for which the language equivalence can only be guaranteed if k is at least the number of nodes minus one. However, besides the classes for which Theorem 1 or Corollary 2 holds, there might be other classes of models for which a smaller k suffices.
Footnotes
 1. Available for download at http://promtools.org.
 2. Anecdotally: the vector images of these models were too large to be displayed by Adobe Illustrator or Adobe Acrobat.
References
 1. Adriansyah, A., Munoz-Gama, J., Carmona, J., van Dongen, B.F., van der Aalst, W.M.P.: Alignment based precision checking. In: Business Process Management Workshops 2012, pp. 137–149 (2012). doi: 10.1007/978-3-642-36285-9_15
 2. Adriansyah, A., van Dongen, B.F., van der Aalst, W.M.P.: Conformance checking using cost-based fitness analysis. In: IEEE EDOC 2011, pp. 55–64 (2011). doi: 10.1109/EDOC.2011.12
 3. Adriansyah, A.: Aligning Observed and Modeled Behavior. Ph.D. thesis, Eindhoven University of Technology, Eindhoven (2014)
 4. Ammons, G., Bodík, R., Larus, J.R.: Mining specifications. In: POPL SIGPLAN-SIGACT 2002, pp. 4–16 (2002). doi: 10.1145/503272.503275. http://dblp.uni-trier.de/rec/bibtex/conf/popl/AmmonsBL02
 5. Armas-Cervantes, A., Baldan, P., Dumas, M., García-Bañuelos, L.: Behavioral comparison of process models based on canonically reduced event structures. In: BPM 2014, pp. 267–282 (2014). doi: 10.1007/978-3-319-10172-9_17
 6. Badouel, E.: On the \(\alpha \)-reconstructibility of workflow nets. In: Proceedings of the 33rd International Conference on Application and Theory of Petri Nets, Hamburg, Germany, June 25–29, vol. 7347, pp. 128–147. Springer, Berlin (2012). doi: 10.1007/978-3-642-31131-4_8. http://dblp.uni-trier.de/rec/bibtex/conf/apn/Badouell2
 7. Becker, M., Laue, R.: A comparative survey of business process similarity measures. Comput. Ind. 63(2), 148–167 (2012). doi: 10.1016/j.compind.2011.11.003
 8. Benner-Wickner, M., Brückmann, T., Gruhn, V., Book, M.: Process mining for knowledge-intensive business processes. In: I-KNOW 2015, pp. 4:1–4:8 (2015). doi: 10.1145/2809563.2809580
 9. Bergenthum, R., Desel, J., Lorenz, R., Mauser, S.: Process Mining Based on Regions of Languages. Business Process Management, Hoboken (2007)
 10. Bergenthum, R., Desel, J., Mauser, S., Lorenz, R.: Synthesis of Petri nets from term based representations of infinite partial languages. Fundam. Inform. 95(1), 187–217 (2009)
 11. Buijs, J., van Dongen, B., van der Aalst, W.: A genetic algorithm for discovering process trees. In: IEEE Congress on Evolutionary Computation, pp. 1–8 (2012)
 12. Buijs, J.C.A.M., van Dongen, B.F., van der Aalst, W.M.P.: On the role of fitness, precision, generalization and simplicity in process discovery. In: OTM. LNCS, vol. 7565, pp. 305–322 (2012). doi: 10.1007/978-3-642-33606-5_19
 13. Buijs, J.C.A.M.: Flexible Evolutionary Algorithms for Mining Structured Process Models. Ph.D. thesis, Eindhoven University of Technology, Eindhoven (2014)
 14. Burattin, A., Sperduti, A., van der Aalst, W.M.P.: Control-flow discovery from event streams. In: IEEE Congress on Evolutionary Computation, pp. 2420–2427 (2014). doi: 10.1109/CEC.2014.6900341
 15. Burattin, A.: PLG2: multiperspective processes randomization and simulation for online and offline settings. CoRR (2015). arXiv:1506.08415
 16. Carmona, J., Solé, M.: PMLAB: an scripting environment for process mining. In: BPM Demos. CEUR-WP, vol. 1295 (2014)
 17. Cortadella, J., Kishinevsky, M., Lavagno, L., Yakovlev, A.: Deriving Petri nets from finite transition systems. IEEE Trans. Comput. 47(8), 859–882 (1998)
 18. Datta, S., Bhaduri, K., Giannella, C., Wolff, R., Kargupta, H.: Distributed data mining in peer-to-peer networks. IEEE Internet Comput. 10(4), 18–26 (2006). doi: 10.1109/MIC.2006.74
 19. Dijkman, R.M., van Dongen, B.F., Dumas, M., García-Bañuelos, L., Kunze, M., Leopold, H., Mendling, J., Uba, R., Weidlich, M., Weske, M., Yan, Z.: A short survey on process model similarity. In: Seminal Contributions to Information Systems Engineering, 25 Years of CAiSE, pp. 421–427 (2013). doi: 10.1007/978-3-642-36926-1_34
 20. Esparza, J., Nielsen, M.: Decidability issues for Petri nets - a survey. Bull. EATCS 52, 244–262 (1994). http://dblp.uni-trier.de/rec/bibtex/journals/eatcs/EsparzaN94
 21. Evermann, J.: Scalable process discovery using map-reduce. In: IEEE Transactions on Services Computing (2014, to appear)
 22. Gabel, M., Su, Z.: Javert: fully automatic mining of general temporal properties from dynamic traces. In: ACM SIGSOFT 2008, pp. 339–349 (2008). doi: 10.1145/1453101.1453150
 23. Günther, C., Rozinat, A.: Disco: discover your processes. In: BPM (Demos), pp. 40–44 (2012)
 24. Hay, B., Wets, G., Vanhoof, K.: Mining navigation patterns using a sequence alignment method. Knowl. Inf. Syst. 6(2), 150–163 (2004)
 25. Hwong, Y., Keiren, J.J.A., Kusters, V.J.J., Leemans, S.J.J., Willemse, T.A.C.: Formalising and analysing the control software of the compact muon solenoid experiment at the large hadron collider. Sci. Comput. Program. 78(12), 2435–2452 (2013). doi: 10.1016/j.scico.2012.11.009
 26. Kohavi, R., Brodley, C.E., Frasca, B., Mason, L., Zheng, Z.: KDD-Cup 2000 organizers’ report: peeling the onion. SIGKDD Explor. 2(2), 86–98 (2000). doi: 10.1145/380995.381033
 27. Kunze, M., Weidlich, M., Weske, M.: Querying process models by behavior inclusion. Softw. Syst. Model. 14(3), 1105–1125 (2015). doi: 10.1007/s10270-013-0389-6
 28. Leemans, S., Fahland, D., van der Aalst, W.: Discovering block-structured process models from event logs containing infrequent behaviour. In: Business Process Management Workshops, pp. 66–78 (2013)
 29. Leemans, S.J.J., Fahland, D., van der Aalst, W.M.P.: Discovering block-structured process models from event logs - a constructive approach. In: Petri Nets 2013, pp. 311–329 (2013). doi: 10.1007/978-3-642-38697-8_17
 30. Leemans, S., Fahland, D., van der Aalst, W.: Discovering block-structured process models from incomplete event logs. In: Petri Nets 2014, vol. 8489, pp. 91–110 (2014). doi: 10.1007/978-3-319-07734-5_6
 31. Leemans, S., Fahland, D., van der Aalst, W.: Exploring processes and deviations. In: Business Process Management Workshops (2014, to appear)
 32. Leemans, S.J.J., Fahland, D., van der Aalst, W.M.P.: Scalable process discovery with guarantees. In: BPMDS 2015, pp. 85–101 (2015). doi: 10.1007/978-3-319-19237-6_6
 33. Leemans, M., van der Aalst, W.: Process mining in software systems: discovering real-life business transactions and process models from distributed systems. In: Lethbridge, T., Cabot, J., Egyed, A. (eds.) ACM/IEEE International Conference on Model Driven Engineering Languages and Systems, pp. 44–53 (2015). http://dblp.uni-trier.de/rec/bibtex/conf/models/LeemansA15
 34. Liesaputra, V., Yongchareon, S., Chaisiri, S.: Efficient process model discovery using maximal pattern mining. In: BPM 2015, pp. 441–456 (2015). doi: 10.1007/978-3-319-23063-4_29
 35. Linz, P.: An introduction to formal languages and automata. Jones & Bartlett Learning, Burlington (2011)
 36. Lu, X., Fahland, D., van den Biggelaar, F.J., van der Aalst, W.M.: Label refinement for handling duplicated tasks in process discovery. In: BPM (2016, submitted)
 37. Møller, A.: dk.brics.automaton - finite-state automata and regular expressions for Java (2010). http://www.brics.dk/automaton/
 38. Munoz-Gama, J., Carmona, J., van der Aalst, W.M.P.: Single-entry single-exit decomposed conformance checking. Inf. Syst. 46, 102–122 (2014). doi: 10.1016/j.is.2014.04.003
 39. Murata, T.: Petri nets: properties, analysis and applications. Proc. IEEE 77(4), 541–580 (1989)
 40. Pradel, M., Gross, T.R.: Automatic generation of object usage specifications from large method traces. In: ASE 2009, pp. 371–382. IEEE Computer Society (2009). doi: 10.1109/ASE.2009.60
 41. Redlich, D., Molka, T., Gilani, W., Blair, G.S., Rashid, A.: Constructs competition miner: process control-flow discovery of BP-domain constructs. In: Proceedings of the 12th International Conference on Business Process Management (BPM), Haifa, Israel, September 7–11, 2014, vol. 8659, pp. 134–150 (2014). doi: 10.1007/978-3-319-10172-9_9
 42. Redlich, D., Molka, T., Gilani, W., Blair, G.S., Rashid, A.: Scalable dynamic business process discovery with the constructs competition miner. In: SIMPDA 2014. CEUR-WP, vol. 1293, pp. 91–107 (2014)
 43. Rozinat, A., van der Aalst, W.M.P.: Conformance checking of processes based on monitoring real behavior. Inf. Syst. 33(1), 64–95 (2008). doi: 10.1016/j.is.2007.07.001
 44. Tapia-Flores, T., López-Mellado, E., Estrada-Vargas, A.P., Lesage, J.: Petri net discovery of discrete event processes by computing t-invariants. In: Proceedings of the 2014 IEEE Emerging Technology and Factory Automation, ETFA 2014, Barcelona, Spain, Sept. 16–19, pp. 1–8 (2014). doi: 10.1109/ETFA.2014.7005080
 45. van Beest, N.R.T.P., Dumas, M., García-Bañuelos, L., Rosa, M.L.: Log delta analysis: Interpretable differencing of business process event logs. In: BPM 2015, pp. 386–405 (2015). doi: 10.1007/978-3-319-23063-4_26
 46. van der Aalst, W.M.P., et al.: Process mining manifesto. In: Business Process Management Workshops (BPM) 2011 International Workshops, Clermont-Ferrand, France, August 29, 2011, Revised Selected Papers, Part I, 2011, pp. 169–194 (2011). doi: 10.1007/978-3-642-28108-2_19
 47. van der Aalst, W.M.P.: Decomposing process mining problems using passages. In: Petri Nets 2012, pp. 72–91 (2012). doi: 10.1007/978-3-642-31131-4_5
 48. van der Aalst, W.M.P.: Process cubes: slicing, dicing, rolling up and drilling down event data for process mining. AP-BPM 2013, 1–22 (2013). doi: 10.1007/978-3-319-02922-1_1
 49. van der Aalst, W., Weijters, A., Maruster, L.: Workflow mining: discovering process models from event logs. IEEE Trans. Knowl. Data Eng. 16(9), 1128–1142 (2004). doi: 10.1109/TKDE.2004.47
 50. van der Aalst, W.M.P.: Process Mining: Discovery, Conformance and Enhancement of Business Processes. Springer, Berlin (2011). doi: 10.1007/978-3-642-19345-3
 51. van der Aalst, W., Adriansyah, A., van Dongen, B.: Replaying history on process models for conformance checking and performance analysis. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 2(2), 182–192 (2012)
 52. van der Aalst, W.M.P.: Decomposing Petri nets for process mining: a generic approach. Distrib. Parallel Databases 31(4), 471–507 (2013). doi: 10.1007/s10619-013-7127-5
 53. van der Aalst, W., Stahl, C.: Modeling Business Processes: A Petri Net-Oriented Approach. MIT Press, Cambridge (2011)
 54. van der Werf, J., van Dongen, B., Hurkens, C., Serebrenik, A.: Process discovery using integer linear programming. Fundam. Inform. 94(3–4), 387–412 (2009)
 55. van Dongen, B.F., Dijkman, R.M., Mendling, J.: Measuring similarity between business process models. In: Seminal Contributions to Information Systems Engineering, 25 Years of CAiSE, pp. 405–419 (2013). doi: 10.1007/978-3-642-36926-1_33
 56. van Dongen, B.: BPI Challenge 2011 Dataset (2011). doi: 10.4121/uuid:d9769f3d-0ab0-4fb8-803b-0d1120ffcf54
 57. van Dongen, B.: BPI Challenge 2012 Dataset (2012). doi: 10.4121/uuid:3926db30-f712-4394-aebc-75976070e91f
 58. van Glabbeek, R.J., Weijland, W.P.: Branching time and abstraction in bisimulation semantics. J. ACM 43(3), 555–600 (1996). doi: 10.1145/233551.233556
 59. Vanhatalo, J., Völzer, H., Leymann, F.: Faster and more focused control-flow analysis for business process models through SESE decomposition. In: ICSOC 2007, pp. 43–55 (2007). doi: 10.1007/978-3-540-74974-5_4
 60. Weerdt, J.D., Backer, M.D., Vanthienen, J., Baesens, B.: A multi-dimensional quality assessment of state-of-the-art process discovery algorithms using real-life event logs. Inf. Syst. 37(7), 654–676 (2012)
 61. Weidlich, M., van der Werf, J.: On profiles and footprints - relational semantics for Petri nets. In: Petri Nets, pp. 148–167 (2012)
 62. Weidlich, M., Polyvyanyy, A., Mendling, J., Weske, M.: Causal behavioural profiles - efficient computation, applications, and evaluation. Fundam. Inform. 113(3–4), 399–435 (2011)
 63. Weijters, A., Ribeiro, J.: Flexible heuristics miner. In: CIDM, pp. 310–317 (2011)
 64. Weijters, A., van der Aalst, W., de Medeiros, A.: Process Mining with the Heuristics Miner-Algorithm. BETA working paper series 166, Eindhoven University of Technology, Eindhoven (2006)
 65. Wen, L., Wang, J., Sun, J.: Mining invisible tasks from event logs. In: Advances in Data and Web Management, pp. 358–365 (2007)
 66. Wen, L., van der Aalst, W., Wang, J., Sun, J.: Mining process models with non-free-choice constructs. Data Min. Knowl. Discov. 15(2), 145–180 (2007)
 67. Yang, J., Evans, D., Bhardwaj, D., Bhat, T., Das, M.: Perracotta: mining temporal API rules from imperfect traces. In: ICSE 2006, pp. 282–291 (2006). doi: 10.1145/1134325
 68. Zha, H., Wang, J., Wen, L., Wang, C., Sun, J.: A workflow net similarity measure based on transition adjacency relations. Comput. Ind. 61(5), 463–471 (2010). doi: 10.1016/j.compind.2010.01.001
Copyright information
Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.