1 Introduction

Finding a suitable balance between flexibility and control is a long-standing problem in the management of work processes [83]. Among the different approaches striving to achieve this balance, flexibility by design suggests infusing flexibility into the process modeling language at hand. Declarative process modeling languages take this to the extreme: they support the specification of the relevant constraints on the temporal evolution of the process, without explicitly indicating how process instances should be routed to satisfy such constraints. In comparison with imperative approaches that produce “closed” representations (i.e., only those process executions explicitly foreseen in the model are allowed), declarative approaches yield “open” representations (i.e., every process execution is implicitly allowed, as long as it does not incur the violation of some constraint).

Figure 1 depicts an intuitive representation of the difference between classical imperative process models and declarative process specifications, considering execution traces that are forbidden by the real process, allowed by the real process, and captured by the designed process specification. Imperative models (such as those based on Petri nets and related formalisms) are suited to explicitly capture control-flow patterns like sequences, choices, concurrent sections, and loops. Those patterns, in turn, lend themselves to characterize a subset of the allowed traces, but struggle to cover the whole space of execution paths in the case of loosely structured, flexible processes. In other words, they favor control over flexibility. Contrariwise, declarative specifications strive to balance flexibility and control by attempting to characterize constraints that cleanly separate the allowed behaviors from the forbidden ones. In other words, declarative process specifications allow us to capture not only what is expected to occur, but also what should not happen. This helps in better approximating the boundaries of the real process, containing (and extending) those captured via imperative process models.

Fig. 1.

Intuitive representation of the difference between imperative process models and declarative process specifications in the space of all execution traces. Diagram (a) represents a real process, which isolates the allowed (green, solid fill) behaviors from the forbidden (red, dotted fill) ones. Diagram (b) shows an imperative process model that stays within the boundaries of the process, but misses many allowed behaviors. Diagram (c) shows a declarative process specification that well approximates the boundaries of the process: it accepts only traces that are allowed by the process, and includes all the traces accepted by the imperative model in (b). (Color figure online)

The idea of adopting a constraint-based, declarative approach to regulate dynamic systems was originally brought forward in different communities: in data management, to express cascaded transactional updates [26]; in multiagent systems, to regulate agent interaction protocols [88]; and in business process management, to capture subprocesses that foresee loosely-coupled control-flow conditions on their activities [85]. This idea was further developed within BPM in subsequent years, leading to a series of declarative, constraint-based process modeling languages, with two prominent exponents: Declare [76] and Dynamic Condition-Response Graphs [49]. Common to all such approaches is the usage of linear temporal/dynamic logics (i.e., temporal/dynamic logics for sequences of events) to formally describe specifications, and the exploitation of corresponding reasoning mechanisms to tackle a variety of concrete tasks along the entire process lifecycle, from design and model analysis to runtime execution and data analysis.

In this chapter, we focus on declarative process mining, that is, process mining where the input or output models are specified using declarative, constraint-based languages. Concretely, we employ the Declare language, but all the presented ideas seamlessly apply to any language that can be formalized using logics over finite traces [30], which are indeed at the core of Declare. Focusing on finite traces reflects the intuition that every process instance is expected to complete in a finite number of steps. This aspect has a significant impact on the corresponding operational techniques, as these logics admit an automata-theoretic characterization that is based on standard finite-state automata [27, 30], instead of automata on infinite structures, which are needed when such logics are interpreted over infinite traces.

Leveraging automata-based techniques paired with suitable measures relating traces, events and constraints, we review three interconnected fundamental declarative process mining tasks:

  • Reasoning – to uncover relationships among different constraints, and check key properties of Declare specifications;

  • Discovery – to extract a Declare specification that suitably characterizes the traces contained in an event log;

  • Monitoring – to provide operational decision support  [63] by checking at runtime whether a running process execution satisfies a Declare specification, promptly detecting and reporting violations.

All the presented techniques are integrated in the MINERful process discovery technique [40] and the RuM toolkit [4].

The chapter is organized as follows. Section 2 introduces the declarative process specification language Declare alongside a running example to which we will refer throughout the remainder of the chapter. Section 3 provides the fundamental notions upon which the core techniques for reasoning, discovery and monitoring on declarative specifications are based. We define the formal semantics of Declare and discuss the core reasoning tasks for declarative specifications in Sect. 4. Section 5 explains the core notions of declarative process discovery and monitoring. Section 6 discusses the latest advances in the field of declarative process specification mining. Finally, Sect. 7 concludes this chapter with final remarks and a summary of the core concepts illustrated herein.

Table 1. A set of Declare constraints among those that are typically used for process mining, with their textual description, graphical notation, and examples fulfilling or violating them.

2 Declare: A Gentle Introduction

Declare is a language and graphical notation providing an extendible repertoire of templates to formulate constraints. The origin of the approach traces back to the PhD work by Pesic [75], and the parallel and subsequent study in the PhD work by Montali [67]. Notably, Declare actually stems from three initial lines of research, respectively focused on the declarative specification of business processes (cf. the ConDec language [78]), service choreographies (cf. the DecSerFlow language [70, 94]), and clinical guidelines (cf. the CigDec language [72]). These lines were then unified into a single research thread. The term Declare was used for the first time in [76].

Table 1 shows a set of Declare constraints we use throughout this chapter. The whole, core set of Declare templates has been inspired by a catalogue of temporal logic patterns used in model checking for a variety of dynamic systems from different application domains [41].

Formally, we define a declarative process specification as follows.

Definition 1

(Declarative process specification). A declarative process specification is a tuple \(\textsc {DS}=(\textsc {Rep},\mathrm {Act},K)\) where

  • \(\textsc {Rep}\) is a finite non-empty set of templates, where each template is a predicate \(\textsc {k}(x_1, \ldots , x_m) \in \textsc {Rep}\) on variables \(x_1, \ldots , x_m\) (with \(m \in \mathbb {N}\) the arity of \(\textsc {k}\)),

  • \(\mathrm {Act}\) is a finite non-empty set of activities,

  • \(K\) is a finite set of constraints, namely pairs \((\textsc {k}(x_1, \ldots , x_m),\kappa )\) where \(\textsc {k}(x_1, \ldots , x_m)\) is a template from \(\textsc {Rep}\), and \(\kappa \) is a mapping that, for every \(i \in \{1,\ldots ,m\}\) assigns variable \(x_i\) with an activity \(\kappa (x_i) = a_i \in \mathrm {Act}\); we compactly denote such a constraint with \(\textsc {k}(a_1, \ldots , a_m)\).    \(\triangleleft \)
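Definition 1 maps directly onto a small data structure. The following minimal Python sketch is our own illustration (class and field names are hypothetical, not part of the chapter's formal apparatus):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Template:
    name: str            # the predicate symbol k
    arity: int           # m, the number of variables x_1, ..., x_m

@dataclass(frozen=True)
class Constraint:
    template: Template   # a template from Rep
    activities: tuple    # the mapping kappa, given as (a_1, ..., a_m)

    def __post_init__(self):
        # kappa must assign exactly one activity per variable
        assert len(self.activities) == self.template.arity

@dataclass(frozen=True)
class DeclarativeSpecification:
    templates: frozenset    # Rep: finite, non-empty set of templates
    activities: frozenset   # Act: finite, non-empty set of activities
    constraints: frozenset  # K:   finite set of constraints

# Response(a, b) compactly denotes (Response(x1, x2), {x1 -> a, x2 -> b}).
RESPONSE = Template("Response", 2)
spec = DeclarativeSpecification(
    templates=frozenset({RESPONSE}),
    activities=frozenset({"a", "b"}),
    constraints=frozenset({Constraint(RESPONSE, ("a", "b"))}),
)
```

The frozen dataclasses make templates and constraints hashable, so the sets \(\textsc {Rep}\) and \(K\) can be represented faithfully as Python sets.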

Example 1

(A Declare process specification). Figure 2 portrays an example of declarative specification for the admission process of an international Bachelor’s program. This example considers the Declare repertoire of templates. The process begins with the creation of an account in the university portal (henceforth, \( \textsf {c}\)). To specify that \( \textsf {c}\) is the initial task, we write \(\textsc {Init}( \textsf {c})\), graphically depicted with the \(\textsc {Init}\) label in the tag on top of the activity box. \(\textsc {Init}\) is a unary template and \(\textsc {Init}( \textsf {c})\) assigns its variable with activity \( \textsf {c}\). Unary templates in Declare are also known as existence templates. We indicate that not more than one account can be created per process run with \(\textsc {AtMostOne}( \textsf {c})\). In the diagram, it is indicated with the 0..1 label in the tag.

To register for a selection round (\( \textsf {r}\)), an account must have been created before (\({\textsc {Precedence}}( \textsf {c}, \textsf {r})\)). \(\textsc {Precedence}\) is a binary template and \({\textsc {Precedence}}( \textsf {c}, \textsf {r})\), graphically depicted as , assigns \( \textsf {c}\) and \( \textsf {r}\) to its first and second variable, respectively. Binary templates in Declare are commonly named relation templates.

Every registration to a selection round (\( \textsf {r}\)) gives access to a uniquely corresponding evaluation phase (\( \textsf {v}\)). After \( \textsf {r}\), \( \textsf {v}\) eventually follows and no other registrations are allowed until \( \textsf {v}\) completes. We write \(\textsc {AlternateResponse}( \textsf {r}, \textsf {v})\), graphically depicted as . The evaluation requires \( \textsf {r}\) to be completed before and \( \textsf {v}\) will not recur unless a new registration is issued: \(\textsc {AlternatePrecedence}( \textsf {r}, \textsf {v})\), . Typically, if both \(\textsc {AlternateResponse}( \textsf {r}, \textsf {v})\) and \(\textsc {AlternatePrecedence}( \textsf {r}, \textsf {v})\) hold true, we compactly represent them jointly with the mutual relation constraint \(\textsc {AlternateSuccession}( \textsf {r}, \textsf {v})\). An admission test score has to be uploaded to the platform to access the evaluation phase: \({\textsc {Precedence}}( \textsf {t}, \textsf {v})\). Evaluation phases are necessary for the committee to return rejections (\( \textsf {n}\)) and notifications of admission (\( \textsf {y}\)), thus \(\textsc {AlternatePrecedence}( \textsf {v}, \textsf {y})\) and \(\textsc {AlternatePrecedence}( \textsf {v}, \textsf {n})\) hold.

After the admission has been notified, the candidate will not receive a rejection any longer – \(\textsc {NotResponse}( \textsf {y}, \textsf {n})\), drawn in Fig. 2 as . \(\textsc {NotResponse}( \textsf {y}, \textsf {n})\) falls under the category of the negative relation constraints, as the occurrence of \( \textsf {y}\) disables \( \textsf {n}\) in the remainder of the process execution.

Only if candidates receive a notification of admission, they will be entitled to pre-enrol in the program (\({\textsc {Precedence}}( \textsf {y}, \textsf {p})\)). The candidates are considered as pre-enrolled immediately after they pay the subscription fee (\(\textsc {ChainResponse}( \textsf {\$}, \textsf {p})\), shown as follows in the diagram: ). Also, candidates cannot be considered as pre-enrolled if they have not paid the subscription fee: \({\textsc {Precedence}}( \textsf {\$}, \textsf {p})\). Not more than one pre-enrolment is allowed per candidate: \(\textsc {AtMostOne}( \textsf {p})\). To enrol in the program (\( \textsf {e}\)), the candidate must have pre-enrolled – \({\textsc {Precedence}}( \textsf {p}, \textsf {e})\) – and uploaded the necessary school and language certificates – \({\textsc {Precedence}}( \textsf {u}, \textsf {e})\).
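The informal reading of the example constraints can be checked programmatically. Below is a minimal sketch (helper names are our own) of checkers for three of the templates used in Example 1, evaluated on a hypothetical run of the admission process:

```python
def init(trace, a):
    """Init(a): the (non-empty) trace starts with a."""
    return len(trace) > 0 and trace[0] == a

def at_most_one(trace, a):
    """AtMostOne(a): a occurs at most once in the trace."""
    return trace.count(a) <= 1

def precedence(trace, a, b):
    """Precedence(a, b): b may occur only if a occurred before it."""
    return a in trace[:trace.index(b)] if b in trace else True

# A candidate run of the admission process (activity letters as in Example 1).
t = ["c", "r", "t", "v", "y", "$", "p", "u", "e"]
constraints = [init(t, "c"), at_most_one(t, "c"), precedence(t, "c", "r"),
               precedence(t, "t", "v"), precedence(t, "y", "p"),
               precedence(t, "$", "p"), at_most_one(t, "p"),
               precedence(t, "p", "e"), precedence(t, "u", "e")]
assert all(constraints)
```

Since every listed constraint is satisfied, the sample trace complies with this fragment of the specification; swapping, e.g., \( \textsf {c}\) and \( \textsf {r}\) would violate \({\textsc {Precedence}}( \textsf {c}, \textsf {r})\).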

Fig. 2.

The Declare map of the admission process at a university.

So far, we have been attaching an informal semantics to Declare and its templates. In the next section, we provide a more systematic and formal characterization.

3 Formal Background

Considering that Declare templates have been originally defined starting from a catalogue of Linear Temporal Logic (LTL) patterns [41], it is not surprising that temporal logics have been used to characterize the semantics of Declare since the very beginning. However, the fact that Declare specifications are interpreted over finite-length executions calls for the use of Linear Temporal Logic on Finite Traces (\(\textsc {LTL}_f\)) [30]. This indeed leads to a setting that is radically different, both semantically and algorithmically, from the traditional one where formulae are interpreted using \(\textsc {LTL}\) over infinite, recurring behaviors [29].

A complete formalization of Declare templates, also including an alternative formalization using a logic programming-based approach, can be found in [68]. It was later refined in [29]. In his PhD thesis, Di Ciccio was the first to provide a semantics based on regular expressions [36]. These two themes were later unified in [28], leading to a richer framework that is able to declaratively capture constraints and metaconstraints, that is, constraints predicating over the possible/certain satisfaction and violation of other constraints.

In this section, we provide some necessary background on \(\textsc {LTL}_f\) and its extension with past-tense temporal operators, as well as on the automata-theoretic characterization for this logic. We then use this framework to formalize Declare and reason automatically on Declare specifications. Thereupon, we reflect on the most recent research advances in attempting to capture not only the formal semantics of constraints, but also how they pragmatically interact with relevant events.

3.1 Linear Temporal Logic on Finite Traces

\(\textsc {LTL}_f\) has the same syntax as \(\textsc {LTL}\) [80], but is interpreted on finite traces. In this chapter, in particular, we consider the \(\textsc {LTL}\) dialect including past modalities [56] for declarative process specifications as in [18].

From now on, we fix a finite set \(\varSigma \) representing an alphabet of propositional symbols describing (names of) activities available in the domain under study. A (finite) trace \(t= \langle a_1,\ldots ,a_n \rangle \in \varSigma ^*\) of length \(|t|=n\) is a finite sequence of activities, where the presence of activity \(a_i\) at instant i of the trace represents an event that witnesses the occurrence of \(a_i\) at instant i – which we also write \(t(i) = a_i\). Notice that at each instant we assume that one and only one activity occurs. Using standard notation from regular expressions, the set \(\varSigma ^*\) denotes the overall set of traces whose constitutive events refer to activities in \(\varSigma \).

Definition 2

(Syntax of \(\mathbf {LTL}_{\boldsymbol{f}}\)). Well-formed formulae are built from \(\varSigma \), the unary temporal operators \(\mathop \bigcirc \) (next) and \(\mathop \ominus \) (yesterday), and the binary temporal operators \(\;\mathop {\mathrm {\mathbf {U}}}\;\) (until) and \(\;\mathop {\mathrm {\mathbf {S}}}\;\) (since) as follows:

$$ \varphi :\,\!:= \textsf {a}\mid (\lnot \varphi ) \mid (\varphi _1 \wedge \varphi _2) \mid ( \mathop \bigcirc \varphi ) \mid (\varphi _1 \;\mathop {\mathrm {\mathbf {U}}}\;\varphi _2) \mid (\mathop \ominus \varphi ) \mid (\varphi _1 \;\mathop {\mathrm {\mathbf {S}}}\;\varphi _2) $$

where \( \textsf {a}\in \varSigma \).    \(\triangleleft \)

Definition 3

(Semantics of \(\mathbf {LTL}_{\boldsymbol{f}}\), satisfaction, validity, entailment). An \(\textsc {LTL}_f\) formula \(\varphi \) is inductively satisfied in some instant \(i\) (\( 1 \le i\le n\)) of a trace \(t\) of length \(n\in \mathbb {N}\), written \(t, i\vDash \varphi \), if the following holds:

  • \( t, i\vDash \textsf {a}\) iff \( t(i) = \textsf {a}\);

  • \( t, i\vDash \lnot \varphi \) iff \( t, i\nvDash \varphi \);

  • \( t, i\vDash \varphi _1\wedge \varphi _2 \) iff \( t, i\vDash \varphi _1 \) and \( t, i\vDash \varphi _2 \);

  • \( t, i\vDash \mathop \bigcirc \varphi \) iff \( i < n\) and \( t, i+1 \vDash \varphi \);

  • \( t, i\vDash \mathop \ominus \varphi \) iff \( i>1 \) and \( t, i-1 \vDash \varphi \);

  • \( t, i\vDash \varphi _1\;\mathop {\mathrm {\mathbf {U}}}\;\varphi _2 \) iff \( t,j \vDash \varphi _2 \) for some \(j\) with \( i\le j\le n\), and \( t, k \vDash \varphi _1 \) for all k s.t. \( {i\le k<j} \);

  • \( t, i\vDash \varphi _1\;\mathop {\mathrm {\mathbf {S}}}\;\varphi _2 \) iff \( t, j \vDash \varphi _2 \) for some \(j\) with \( 1 \le j \le i\), and \( t, k \vDash \varphi _1 \) for all k s.t. \( {j < k \le i} \).

A formula \(\varphi \) is satisfied by a trace \(t\) (equivalently, \(t\) satisfies \(\varphi \)), written \(t\vDash {\varphi }\), iff \(t, 1 \vDash {\varphi }\). A formula \(\varphi \) is: (i) satisfiable if it has a satisfying trace from \(\varSigma ^*\); (ii) valid if every trace in \(\varSigma ^*\) satisfies it. A formula \(\varphi _1\) entails formula \(\varphi _2\), written \(\varphi _1 \models \varphi _2\), if, for every trace \( t \) of length \(n \in \mathbb {N}\) and every i s.t. \(1 \le i \le n\), if \(t,i \models \varphi _1\) then \(t,i \models \varphi _2\).    \(\triangleleft \)

Since \(\textsc {LTL}_f\) is closed under negation, it is easy to see that a formula \(\varphi \) is valid if and only if \(\lnot \varphi \) is unsatisfiable.

It is worth noting that, in \(\textsc {LTL}_f\), the next operator is interpreted as the so-called strong next: \(\mathop \bigcirc \varphi \) requires that the next instant exists within the trace, and that at such next instant \(\varphi \) holds. This has an important consequence: differently from \(\textsc {LTL}\), in \(\textsc {LTL}_f\) formula \(\lnot \mathop \bigcirc \varphi \) is not equivalent to \( \mathop \bigcirc \lnot \varphi \). This is because \(\lnot \mathop \bigcirc \varphi \) is true in an instant of a finite trace either when that instant has no successor, or the next instant exists and in such a next instant \(\varphi \) does not hold. More on this can be found in [29].

From the basic operators above, the following can be derived:

  • Classical boolean abbreviations \( \mathbf {true}, \mathbf {false}, \vee , \rightarrow \);

  • Constant \(\mathbf {end}\equiv \lnot \mathop \bigcirc \mathbf {true}\), denoting the last instant of a trace;

  • Constant \(\mathbf {start}\equiv \lnot \mathop \ominus \mathbf {true}\), denoting the first instant of a trace;

  • \( \mathop \Diamond \varphi \equiv \mathbf {true}\;\mathop {\mathrm {\mathbf {U}}}\;\varphi \) indicating that \( \varphi \) eventually holds true in the trace (hence, before or at \(\mathbf {end}\));

  • \( \varphi _1 \;\mathop {\mathrm {\mathbf {W}}}\;\varphi _2 \equiv (\varphi _1 \;\mathop {\mathrm {\mathbf {U}}}\;\varphi _2) \vee \mathop \Box \varphi _1\), which relaxes \(\;\mathop {\mathrm {\mathbf {U}}}\;\) as \(\varphi _2\) may never hold true;

  • \( \blacklozenge \varphi \equiv \mathbf {true}\;\mathop {\mathrm {\mathbf {S}}}\;\varphi \) indicating that \( \varphi \) holds true at the current instant or some instant before it (i.e., at or after \(\mathbf {start}\) in the trace);

  • \( \mathop \Box \varphi \equiv \lnot \mathop \Diamond \lnot \varphi \) indicating that \( \varphi \) holds true from the current instant till \(\mathbf {end}\);

  • \( \blacksquare \varphi \equiv \lnot (\mathbf {true}\;\mathop {\mathrm {\mathbf {S}}}\;\lnot \varphi ) \) indicating that \( \varphi \) holds true from \(\mathbf {start}\) to the current instant.

Example 2

Let \(t= \langle a, b, b, c, d, e \rangle \) be a trace and \(\varphi _1\), \(\varphi _2\) and \(\varphi _3\) three \(\textsc {LTL}_f\) formulae defined as follows: \(\varphi _1 \doteq d\); \(\varphi _2 \doteq \mathop \Diamond b\); \(\varphi _3 \doteq \mathop \Box ( b \rightarrow \mathop \Diamond d ) \). We have that \(t, 1 \nvDash \varphi _1\) whereas \(t, 5 \vDash \varphi _1\); \(t, 1 \vDash \varphi _2\) whereas \(t, 5 \nvDash \varphi _2\); \(t, 1 \vDash \varphi _3\) and \(t, 5 \vDash \varphi _3\) (in fact, \(t, i\vDash \varphi _3\) for any instant \(1 \le i\le n\)).    \(\triangleleft \)
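The inductive semantics of Definition 3 translates directly into a recursive evaluator over finite traces. The following minimal sketch uses a nested-tuple formula encoding of our own (not the chapter's), reproduces the checks of Example 2, and also exposes the strong-next behaviour of \(\mathop \bigcirc \) discussed above:

```python
# Formulae as nested tuples: ("atom", a), ("not", f), ("and", f, g),
# ("next", f), ("until", f, g), ("yesterday", f), ("since", f, g).
def sat(t, i, f):
    """t, i |= f for a finite trace t and 1-based instant i (Definition 3)."""
    op = f[0]
    if op == "atom":
        return t[i - 1] == f[1]
    if op == "not":
        return not sat(t, i, f[1])
    if op == "and":
        return sat(t, i, f[1]) and sat(t, i, f[2])
    if op == "next":        # strong next: a successor instant must exist
        return i < len(t) and sat(t, i + 1, f[1])
    if op == "yesterday":   # a predecessor instant must exist
        return i > 1 and sat(t, i - 1, f[1])
    if op == "until":       # some j in [i, n] satisfies g; f holds in [i, j)
        return any(sat(t, j, f[2]) and all(sat(t, k, f[1]) for k in range(i, j))
                   for j in range(i, len(t) + 1))
    if op == "since":       # some j in [1, i] satisfies g; f holds in (j, i]
        return any(sat(t, j, f[2]) and all(sat(t, k, f[1]) for k in range(j + 1, i + 1))
                   for j in range(1, i + 1))
    raise ValueError(f"unknown operator {op}")

TRUE = ("not", ("and", ("atom", "a"), ("not", ("atom", "a"))))  # a tautology

def eventually(f):   # ◇f ≡ true U f
    return ("until", TRUE, f)

def always(f):       # □f ≡ ¬◇¬f
    return ("not", eventually(("not", f)))

def implies(f, g):   # f → g ≡ ¬(f ∧ ¬g)
    return ("not", ("and", f, ("not", g)))

t = ("a", "b", "b", "c", "d", "e")  # the trace of Example 2
phi1 = ("atom", "d")
phi2 = eventually(("atom", "b"))
phi3 = always(implies(("atom", "b"), eventually(("atom", "d"))))
assert not sat(t, 1, phi1) and sat(t, 5, phi1)   # as in Example 2
assert sat(t, 1, phi2) and not sat(t, 5, phi2)
assert sat(t, 1, phi3) and sat(t, 5, phi3)
```

At the last instant of \(t\), the evaluator confirms that \(\lnot \mathop \bigcirc \varphi \) holds while \(\mathop \bigcirc \lnot \varphi \) does not, matching the discussion of the strong next operator.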

3.2 Finite-State Automata

One of the central features of \(\textsc {LTL}_f\) is that a finite state automaton (FSA) [22] \(\mathscr {A}\!\left( \varphi \right) \) can be computed such that for every trace \(t\) we have that \(t\vDash \varphi \) iff \(t\) is in the language recognized by \(\mathscr {A}\!\left( \varphi \right) \), as illustrated in [18, 28, 30, 38]. We include the main notions next, recalling that focusing on deterministic FSAs is without loss of generality, as over finite traces every non-deterministic FSA can be determinized [50].

Fig. 3.

Examples of constraint FSAs.

Definition 4

(Finite state automaton (FSA)). A (deterministic) finite state automaton (FSA) is a tuple \(A= {(\varSigma ,S,\delta ,s_0,S_\text {F})}\), where:

  • \(\varSigma \) is a finite set of symbols;

  • \(S\) is a finite non-empty set of states;

  • \(\delta : S\times \varSigma \rightarrow S\) is the transition function, i.e., a partial function that, given a starting state and a (labeled) transition, returns the target state;

  • \(s_0\) is the initial state;

  • \(S_\text {F}\subseteq S\) is the set of final (accepting) states.

   \(\triangleleft \)

In the remainder of the chapter, we assume that \(\delta \) is left-total and surjective on \(S\setminus \{s_0\}\), that is, the transition function is defined for every state and symbol, and every state is on a path from the initial one – with the possible exception of the initial state itself. An FSA that is left-total is called untrimmed. Notice that these two requirements are without loss of generality: every FSA can be converted into an equivalent FSA that is left-total and surjective. In particular, to make an FSA untrimmed, it is sufficient to: (i) introduce a non-final trap state \(s_\bot \); (ii) for every state s and symbol \(a'\) such that \(\delta (s,a')\) is not defined, enforce \(\delta (s,a') = s_\bot \); (iii) connect \(s_\bot \) to itself for every symbol, setting \(\delta (s_\bot ,a) = s_\bot \) for every \(a \in \varSigma \).
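The untrimming construction described in steps (i)–(iii) is mechanical. A minimal Python sketch, with a representation of our own choosing (transition functions as dictionaries):

```python
def untrim(sigma, delta, states, s0, finals, trap="s_bot"):
    """Make a (possibly partial) transition function delta left-total by
    routing every undefined (state, symbol) pair to a fresh, non-final
    trap state that loops on every symbol."""
    delta = dict(delta)                     # copy: {(state, symbol): state}
    states = set(states) | {trap}           # step (i): add the trap state
    for s in states:
        for a in sigma:
            # steps (ii)-(iii): fill undefined pairs, incl. trap self-loops
            delta.setdefault((s, a), trap)
    return sigma, delta, states, s0, set(finals)

# A partial FSA over {x, y} in which only delta(s0, x) = s1 is defined.
sigma, delta, states, s0, finals = untrim(
    {"x", "y"}, {("s0", "x"): "s1"}, {"s0", "s1"}, "s0", {"s1"})
```

After the call, \(\delta \) is total on the three states (including the trap), so every computation can always be extended, while the language is unchanged because the trap state is non-final.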

Example 3

Figure 3 depicts four FSAs. States are represented as circles and transitions as arrows. Accepting states are decorated with a double line. The initial state is indicated with a single, unlabeled incoming arc. For instance, Fig. 3(a) is such that \(\varSigma \supseteq \{ \sigma _1, \sigma _2 \}\), \(S = \{ s_0, s_1, s_2 \}\), \(S_\text {F} = \{ s_0\}\), \(\delta ( s_0, \sigma _1 ) = s_1\) and \(\delta ( s_1, \sigma _1 ) = s_2\).    \(\triangleleft \)

Definition 5

(Runs and traces of an FSA). Let \(A= {(\varSigma ,S,\delta ,s_0,S_\text {F})}\) be an FSA as per Definition 4. A computation \(\pi \) of \(A\) is a finite sequence alternating states and activities \(s_0\xrightarrow {\sigma _0} \ldots \xrightarrow {\sigma _{n-1}} s_n\) that starts from the initial state \(s_0\) and is such that for every \(0 \le i < n\), we have \(\delta (s_i,\sigma _i) = s_{i+1}\). If \(\pi \) terminates in a final state, that is, \(s_n \in S_\text {F}\), then it is a run, and induces a corresponding trace \(\langle \sigma _0,\ldots ,\sigma _{n-1} \rangle \in \varSigma ^*\) obtained from \(\pi \) by only keeping the symbols that label the transitions.    \(\triangleleft \)

Example 4

In Fig. 3(a), \( \pi _1 = s_0\xrightarrow {\sigma _1} s_1 \), \( \pi _2 = s_0\xrightarrow {\sigma _2} s_0\xrightarrow {\sigma _1} s_1\xrightarrow {\sigma _1} s_2 \), and \( \pi _3 = s_0\xrightarrow {\sigma _1} s_1\xrightarrow {\sigma _2} s_2\xrightarrow {\sigma _1} s_0 \) are three examples of computations. However, only \(\pi _3\) is a run because \(s_0\in S_\text {F}\) whereas \(s_1, s_2 \notin S_\text {F}\). Notice that, in Fig. 3, we additionally highlight with a grey background colour those states that cannot appear in a run – that is, from which accepting states cannot be reached (e.g., \(s_2\) in Fig. 3(a)).    \(\triangleleft \)

Definition 6

(Accepted trace, language of an FSA). A trace \(t\in \varSigma ^*\) is accepted by FSA \(A= {(\varSigma ,S,\delta ,s_0,S_\text {F})}\) if there is a run of \(A\) inducing \(t\). The language \(\mathscr {L}\!\left( A\right) \) of \(A\) is the set of traces accepted by \(A\).    \(\triangleleft \)

Example 5

For the FSA in Fig. 3(a), the language contains the trace \(t_1 = \langle \sigma _1,\sigma _2,\sigma _1 \rangle \), since a run exists over this sequence of labels (i.e., \(\pi _3\) above), whereas \(t_2= \langle \sigma _2,\sigma _1 \rangle \) is not part of the language.    \(\triangleleft \)
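Definitions 5–6 reduce acceptance to replaying the trace through \(\delta \). The sketch below is our own; the transition function is a partial reconstruction of Fig. 3(a) from the fragments given in Examples 3–5, so it may not match the figure exactly:

```python
def accepts(delta, s0, finals, trace):
    """Replay a trace through a deterministic transition function; the trace
    is accepted iff the induced computation is a run (ends in a final state)."""
    state = s0
    for symbol in trace:
        if (state, symbol) not in delta:
            return False      # no computation exists for this trace
        state = delta[(state, symbol)]
    return state in finals

# Transitions of Fig. 3(a), as far as they can be read off Examples 3-5.
delta = {("s0", "sigma1"): "s1", ("s0", "sigma2"): "s0",
         ("s1", "sigma1"): "s2", ("s1", "sigma2"): "s2",
         ("s2", "sigma1"): "s0"}
finals = {"s0"}
```

Replaying \(t_1 = \langle \sigma _1,\sigma _2,\sigma _1 \rangle \) visits \(s_0, s_1, s_2, s_0\) and ends in an accepting state, whereas \(t_2 = \langle \sigma _2,\sigma _1 \rangle \) ends in the non-accepting \(s_1\), in line with Example 5.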

Automata Product. FSAs are closed under the (synchronous) product operation \(\times \) [81]. The (cross-)product \(A\times A'\) of two FSAs \(A\) and \(A'\) is an FSA that accepts the intersection of languages (sets of accepted traces) of each operand: \(\mathscr {L}\!\left( A\times A'\right) = \mathscr {L}\!\left( A\right) \bigcap \mathscr {L}\!\left( A'\right) \). It is defined as follows.

Definition 7

(Automata product). The product FSA of two FSAs \(A= (\varSigma ,S,\delta ,s_0,S_\text {F})\) and \(A' = (\varSigma ,S',\delta ',s_0',S_\text {F}')\) over the same alphabet \(\varSigma \) is the FSA \(A\times A' = (\varSigma ,S^\times ,\delta ^\times ,s_0^\times ,S_\text {F}^\times )\), where the set \(S^\times \subseteq S \times S'\) of states (obtained from the cartesian product of the states in \(A\) and \(A'\)), its initial state \(s_0^\times \), its final states \(S_\text {F}^\times \), and the transition function \(\delta ^\times \), are defined by simultaneous induction as follows:

  • \(s_0^\times = \langle s_0,s_0'\rangle \in S^\times \);

  • For every state \(\langle s_1,s_1'\rangle \in S^\times \), state \(s_2 \in S\), state \(s_2' \in S'\), and label \(\ell \in \varSigma \), if \(\delta (s_1,\ell ) = s_2\) and \(\delta '(s_1',\ell ) = s_2'\) then: (i) \(\langle s_2,s_2'\rangle \in S^\times \), (ii) \(\delta ^\times (\langle s_1,s_1'\rangle ,\ell ) = \langle s_2,s_2'\rangle \), (iii) if \(s_2 \in S_\text {F}\) and \(s_2' \in S_\text {F}'\), then \(\langle s_2,s_2'\rangle \in S_\text {F}^\times \).

  • Nothing else is in \(S_\text {F}^\times \), \(S^\times \), and \(\delta ^\times \).

   \(\triangleleft \)

Notice that the FSA constructed with Definition 7 can be manipulated using language-preserving automata operations, such as in particular minimization [50].

The product operation \(\times \) is commutative and associative. The identity element for \(\times \) over alphabet \(\varSigma \) is \({A^\mathrm {I} = \left( \varSigma , \{s_0\}, \{s_0\} \times \varSigma \times \{s_0\}, s_0, \{s_0\} \right) }\) – depicted in Fig. 4(a). It accepts all traces over \(\varSigma \): \(\mathscr {L}\!\left( A^\mathrm {I}\right) = \varSigma ^{*}\), as any sequence of transitions labeled by symbols in \(\varSigma \) corresponds to a run for \(A^\mathrm {I}\). The absorbing element is \({A^{\emptyset } = \left( \varSigma , \{s_0\}, \{s_0\} \times \varSigma \times \{s_0\}, s_0, \emptyset \right) }\), illustrated in Fig. 4(b). It does not accept any trace at all: \(\mathscr {L}\!\left( A^{\emptyset }\right) = \emptyset \), as any sequence of transitions labeled by symbols in \(\varSigma \) corresponds to a computation ending in a non-accepting state.
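Definition 7 can be implemented as a reachability computation starting from the initial pair of states. A minimal sketch (representation and names are our own; each deterministic FSA is a triple of transition dictionary, initial state, and final states):

```python
def product(sigma, A, B):
    """Synchronous product of two deterministic FSAs over the same alphabet
    (Definition 7): only state pairs reachable from the initial pair are kept."""
    (dA, sA, fA), (dB, sB, fB) = A, B
    start = (sA, sB)
    delta, seen, frontier = {}, {start}, [start]
    while frontier:
        p, q = frontier.pop()
        for a in sigma:
            if (p, a) in dA and (q, a) in dB:   # both operands can move on a
                nxt = (dA[(p, a)], dB[(q, a)])
                delta[((p, q), a)] = nxt
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append(nxt)
    # a pair is final iff both components are final in their own FSA
    finals = {s for s in seen if s[0] in fA and s[1] in fB}
    return delta, start, finals

# A accepts traces containing at least one "a"; B, at least one "b".
SIGMA = {"a", "b"}
A = ({("p0", "a"): "p1", ("p0", "b"): "p0",
      ("p1", "a"): "p1", ("p1", "b"): "p1"}, "p0", {"p1"})
B = ({("q0", "b"): "q1", ("q0", "a"): "q0",
      ("q1", "a"): "q1", ("q1", "b"): "q1"}, "q0", {"q1"})
delta_x, s0_x, finals_x = product(SIGMA, A, B)

def accepted(trace):  # replay helper for the product automaton
    s = s0_x
    for a in trace:
        s = delta_x[(s, a)]
    return s in finals_x
```

As expected, the product accepts exactly the traces in the intersection of the two languages: those containing at least one "a" and at least one "b".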

Fig. 4.

Finite state automata acting as identity element and absorbing element for the automata cross-product operation.

4 Reasoning

Equipped with the notions acquired thus far, we can now discuss the core reasoning tasks that are associated to declarative process specifications. To this end, we begin this section by describing the semantics of Declare in detail.

4.1 Semantics of Declare

The semantics of a Declare template \(\textsc {k}(x_1, \ldots , x_m)\) is given as an \(\textsc {LTL}_f\) formula \(\varphi _{\textsc {k}(x_1, \ldots , x_m)}\) defined over variables \(x_1, \ldots , x_m\) instead of activities. For example, given the free variables \(x\) and \(y\), \(\textsc {Response}(x,y)\) corresponds to \(\mathop \Box ( x\rightarrow \mathop \Diamond y)\), witnessing that whenever \(x\) occurs, then \(y\) is expected to occur at some later instant. Table 2 shows the \(\textsc {LTL}_f\) formulae of some templates of the Declare repertoire. The formalization of a constraint is then obtained by grounding the \(\textsc {LTL}_f\) formula of its template.

Definition 8

(Constraint formula, satisfying trace). The formula of constraint \(\textsc {k}(a_1, \ldots , a_m)\), written \(\varphi _{\textsc {k}(a_1, \ldots , a_m)}\), is the \(\textsc {LTL}_f\) formula obtained from \(\varphi _{\textsc {k}(x_1, \ldots , x_m)}\) by replacing \(x_i\) with \(a_i\) for each \(1 \le i \le m\). A trace \(t\) satisfies \(\textsc {k}(a_1, \ldots , a_m)\) if \(t\models \varphi _{\textsc {k}(a_1, \ldots , a_m)}\); otherwise, we say that \(t\) violates \(\textsc {k}(a_1, \ldots , a_m)\).    \(\triangleleft \)

Table 2. Semantics of some Declare constraints.

Example 6

Considering Table 2, we have \(\varphi _{\textsc {Response}( \textsf {a}, \textsf {b})} = \mathop \Box ( \textsf {a}\rightarrow \mathop \Diamond \textsf {b})\), and \(\varphi _{\textsc {Response}( \textsf {b}, \textsf {c})} = \mathop \Box ( \textsf {b}\rightarrow \mathop \Diamond \textsf {c})\). Traces \(\langle \textsf {b}\rangle \) and \(\langle \textsf {a}, \textsf {b}, \textsf {a}, \textsf {a}, \textsf {c}, \textsf {b}\rangle \) satisfy \(\textsc {Response}( \textsf {a}, \textsf {b})\), while \(\langle \textsf {a}\rangle \) and \(\langle \textsf {a}, \textsf {b}, \textsf {a}, \textsf {a}, \textsf {c}\rangle \) do not.    \(\triangleleft \)
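Under Definition 8, checking a constraint against a trace amounts to evaluating its formula on that trace. For \(\textsc {Response}( \textsf {a}, \textsf {b})\), i.e. \(\mathop \Box ( \textsf {a}\rightarrow \mathop \Diamond \textsf {b})\), this collapses into a one-line check; the helper name below is ours:

```python
def satisfies_response(trace, a, b):
    """Response(a, b), i.e. box(a -> diamond b): from every instant where `a`
    occurs, `b` must eventually occur (at that instant or later)."""
    return all(b in trace[i:] for i, e in enumerate(trace) if e == a)

# The traces of Example 6:
assert satisfies_response(["b"], "a", "b")
assert satisfies_response(["a", "b", "a", "a", "c", "b"], "a", "b")
assert not satisfies_response(["a"], "a", "b")
assert not satisfies_response(["a", "b", "a", "a", "c"], "a", "b")
```

The four checks reproduce exactly the satisfaction and violation verdicts stated in Example 6.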

A Declare specification is then formalized by conjoining all its constraint formulae, thus obtaining a direct, declarative notion of model trace, that is, a trace that is accepted by the specification.

Definition 9

(Specification formula, model trace). The formula of Declare specification \(\textsc {DS}=(\textsc {Rep},\mathrm {Act},K)\), written \(\varphi _{\textsc {DS}}\), is the \(\textsc {LTL}_f\) formula \(\bigwedge _{\textsc {k}\in K} \varphi _{\textsc {k}}\). A trace \(t\in \mathrm {Act}^*\) is a model trace of \(\textsc {DS}\) if \(t\models \varphi _{\textsc {DS}}\); in this case, we say that \(t\) is accepted by \(\textsc {DS}\), otherwise that \(t\) is rejected by \(\textsc {DS}\).    \(\triangleleft \)

Constructing constraint and specification formulae is, however, not enough. When one reads \(\mathop \Box ( \textsf {a}\rightarrow \mathop \Diamond \textsf {b})\) following the textual description given above, the formula is interpreted as “whenever \( \textsf {a}\) occurs, then \( \textsf {b}\) is expected to occur at some later instant”. This formulation intuitively hints at the fact that the occurrence of \( \textsf {a}\) activates the \(\textsc {Response}( \textsf {a}, \textsf {b})\) constraint, requiring the target \( \textsf {b}\) to occur. In turn, we get that a trace not containing any occurrence of \( \textsf {a}\) is less interesting than a trace containing occurrences of \( \textsf {a}\), each followed by one or more occurrences of \( \textsf {b}\): even though both traces satisfy \(\textsc {Response}( \textsf {a}, \textsf {b})\), the first trace never “interacts” with \(\textsc {Response}( \textsf {a}, \textsf {b})\), while the second does. This relates to the notion of vacuous satisfaction in \(\textsc {LTL}\) [51] and that of interestingness of satisfaction in \(\textsc {LTL}_f\) [39].

The point is that none of these considerations are captured by the formula \(\mathop \Box ( \textsf {a}\rightarrow \mathop \Diamond \textsf {b})\) itself: they pertain to the pragmatic interpretation of how the formula relates to traces. To see this, consider that we can equivalently express the formula above as \(\mathop \Box \lnot \textsf {a}\vee \mathop \Diamond ( \textsf {b}\wedge \mathop \Box \lnot \textsf {a})\), which now reads as follows: “Either \( \textsf {a}\) never happens at all, or there is some occurrence of \( \textsf {b}\) after which \( \textsf {a}\) never happens”. This equivalent reformulation does not put the activation or the target into evidence.

This problem can be tackled in two ways. One option is to attempt an automated approach where activation, target, and interesting satisfaction are semantically, implicitly characterized once and for all at the logical level; this is the route followed in [39]. The main drawback of this approach is that the user cannot intervene at all in deciding how to fine-tune the activation and target conditions. An alternative possibility is to ask the user to explicitly indicate, together with the \(\textsc {LTL}_f\) formula \(\varphi \) of the template, two related \(\textsc {LTL}_f\) formulae expressing activation and target conditions for \(\varphi \). This latter approach, implicitly adopted in [69] and then explicitly formalized in [18], gives the user more control over how to pragmatically interpret constraints. We follow this latter approach.

Intuitively, the activation of a constraint is a triggering condition that, once made true, expects that the target condition is satisfied by the process execution. Contrariwise, if the constraint is not activated, the satisfaction of the target is not enforced. All in all, to properly constitute an activation-target pair for an \(\textsc {LTL}_f\) formula \(\varphi \), we need them to satisfy the condition that whenever the current instant is such that the activation is satisfied, \(\varphi \) must behave equivalently to the target (thus requiring its satisfaction). This is formally captured as follows.

Definition 10

(Activation and target of a constraint). The activation and target of a constraint \(\textsc {k}\) over activities \(\mathrm {Act}\) are two \(\textsc {LTL}_f\) formulae \(\varphi ^{\mathrm {act}}_{\textsc {k}}\) and \(\varphi ^{\mathrm {trg}}_{\textsc {k}}\) such that for every trace \(t\in \mathrm {Act}^*\) we have that \(t \models \varphi _{\textsc {k}}\) if and only if \(t \models \mathop \Box \left( \varphi ^{\mathrm {act}}_{\textsc {k}} \rightarrow \varphi ^{\mathrm {trg}}_{\textsc {k}}\right) \).    \(\triangleleft \)

Table 2 shows activations and targets for each constraint, inspired by the work of Cecconi et al. [18]. In the next example, we explain the rationale behind some of the constraint formulations in the table.

Example 7

Consider \(\textsc {ChainResponse}( \textsf {\$}, \textsf {p})\), dictating that whenever \( \textsf {\$}\) occurs, then \( \textsf {p}\) is the activity occurring next. We have \(\varphi _{\textsc {ChainResponse}( \textsf {\$}, \textsf {p})} = \mathop \Box ( \textsf {\$} \rightarrow \mathop \bigcirc \textsf {p})\). Then, by Definition 10, we can directly fix the activation to \( \textsf {\$}\) and the target to \(\mathop \bigcirc \textsf {p}\), respectively witnessing that every occurrence of \( \textsf {\$}\) triggers the constraint, with a target requiring the consequent execution of \( \textsf {p}\) in the next instant. Similarly, for \({\textsc {Precedence}}( \textsf {\$}, \textsf {p})\), the activation is an occurrence of \( \textsf {p}\), with a target requiring that \( \textsf {\$}\) has already occurred in some earlier instant. The case of \(\textsc {AtMostOne}( \textsf {p})\) is also similar. In this case, \(\varphi _{\textsc {AtMostOne}( \textsf {p})}\) formalizes that \( \textsf {p}\) cannot occur twice, which in \(\textsc {LTL}_f\) can be directly captured by \(\lnot \mathop \Diamond ( \textsf {p}\wedge \mathop \bigcirc \mathop \Diamond \textsf {p})\). This is logically equivalent to \(\mathop \Box ( \textsf {p}\rightarrow \lnot \mathop \bigcirc \mathop \Diamond \textsf {p})\), which directly yields \( \textsf {p}\) as activation and \(\lnot \mathop \bigcirc \mathop \Diamond \textsf {p}\) as target.

A quite different situation holds for the other existence constraints. Take, for example, \(\textsc {AtLeastOne}( \textsf {a})\), requiring that \( \textsf {a}\) occurs at least once in the execution. This can be directly encoded in \(\textsc {LTL}_f\) as \(\mathop \Diamond \textsf {a}\). This formulation, however, does not help identify the activation and target of the constraint. Intuitively, since the constraint requires the presence of \( \textsf {a}\) from the very beginning of the execution, it is activated at the start of the trace, i.e., when \(\mathbf {start}\) holds, imposing the satisfaction of the target \(\mathop \Diamond \textsf {a}\). This intuition is backed up by Definition 10, using the semantics of \(\mathbf {start}\) and noticing the following logical equivalences:

$$ \mathop \Diamond \textsf {a}= \mathbf {start}\rightarrow \mathop \Diamond \textsf {a}= \mathop \Box \left( \mathbf {start}\rightarrow \mathop \Diamond \textsf {a}\right) $$

This explains why the latter formulation is employed in Table 2.    \(\triangleleft \)
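The chain of equivalences above can also be checked mechanically. The sketch below (an illustration, not the chapter's tooling) evaluates both formulations of \(\textsc {AtLeastOne}( \textsf {a})\) on all non-empty traces over a two-letter alphabet up to a small bound, treating \(\mathbf {start}\) as true only at the first instant:

```python
from itertools import product

# Evaluate "eventually a" and "always (start -> eventually a)" on finite
# traces, where "start" holds only at the first instant of the trace.

def eventually_a(trace):
    return "a" in trace

def boxed_start_implies_eventually_a(trace):
    # At every instant i, if "start" holds (i.e., i == 0), then "a" must
    # occur at instant i or later.
    return all("a" in trace[i:] for i in range(len(trace)) if i == 0)

# The two formulations coincide on every non-empty trace up to length 5.
for n in range(1, 6):
    for trace in product("ab", repeat=n):
        assert eventually_a(trace) == boxed_start_implies_eventually_a(trace)
```

Restricting to non-empty traces matches the fact that \(\mathbf {start}\) holds at the first instant of an execution.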

Fig. 5.

Example FSAs of Declare constraints.

Declarative Constraints as FSAs. Crucial for our techniques is that every \(\textsc {LTL}_f\) formula \(\varphi \) can be encoded into a corresponding FSA (in the sense of Definition 4) \(A_\varphi \) that recognizes all and only those traces that satisfy the formula. This can be done through different algorithmic techniques. A direct approach that transforms an input formula into a non-deterministic FSA is presented in [28, 29]; notice that the so-obtained FSA can then be determinized and minimized using standard techniques [50, 99]. Building on this, given a Declare specification \(\textsc {DS}=(\textsc {Rep},\mathrm {Act},K)\), we proceed as follows:

  • We pair each constraint \(\textsc {k}\in K\) with a corresponding, so-called local automaton \(A_\textsc {k}\). This automaton is the FSA \(A_{\varphi _{\textsc {k}}}\) of the constraint formula \(\varphi _{\textsc {k}}\), and is used to characterize all and only those traces that satisfy \(\textsc {k}\);

  • We pair the whole specification with a so-called global automaton \(A_\textsc {DS}\), that is, the FSA \(A_{\varphi _{\textsc {DS}}}\) of the specification formula \(\varphi _{\textsc {DS}}\). It thus recognizes all and only the model traces of \(\textsc {DS}\). Recall that, as introduced in Definition 9, \(\varphi _{\textsc {DS}}\) is the conjunction of the formulae of the constraints in \(K\), and thus the language \(\mathscr {L}\!\left( A_\textsc {DS}\right) \) corresponds to \(\bigcap _{\textsc {k}\in K} \mathscr {L}\!\left( A_\textsc {k}\right) \). By definition of automata product, this means that \(A_\textsc {DS}\) can be obtained by computing the product of the local automata of the constraints in \(K\).
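To make the construction concrete, here is a minimal sketch (illustrative, not the chapter's implementation) with hand-coded local DFAs for \(\textsc {Response}( \textsf {a}, \textsf {b})\) and \(\textsc {AtMostOne}( \textsf {c})\) over a three-activity alphabet, combined via the standard product construction:

```python
ALPHABET = {"a", "b", "c"}

# Local automaton for Response(a, b): state 0 = no pending obligation
# (accepting), state 1 = an 'a' is waiting for a later 'b'.
response_ab = {
    "init": 0,
    "accepting": {0},
    "delta": {(0, "a"): 1, (0, "b"): 0, (0, "c"): 0,
              (1, "a"): 1, (1, "b"): 0, (1, "c"): 1},
}

# Local automaton for AtMostOne(c): 0 = no 'c' yet, 1 = one 'c', 2 = trap.
atmostone_c = {
    "init": 0,
    "accepting": {0, 1},
    "delta": {(0, "c"): 1, (0, "a"): 0, (0, "b"): 0,
              (1, "c"): 2, (1, "a"): 1, (1, "b"): 1,
              (2, "c"): 2, (2, "a"): 2, (2, "b"): 2},
}

def product_dfa(d1, d2):
    # Standard product: pairs of states, component-wise transitions,
    # accepting iff both components accept.
    states1 = {q for (q, _) in d1["delta"]}
    states2 = {q for (q, _) in d2["delta"]}
    delta = {((q1, q2), s): (d1["delta"][(q1, s)], d2["delta"][(q2, s)])
             for q1 in states1 for q2 in states2 for s in ALPHABET}
    accepting = {(q1, q2) for q1 in d1["accepting"] for q2 in d2["accepting"]}
    return {"init": (d1["init"], d2["init"]),
            "accepting": accepting, "delta": delta}

def accepts(dfa, trace):
    q = dfa["init"]
    for s in trace:
        q = dfa["delta"][(q, s)]
    return q in dfa["accepting"]

glob = product_dfa(response_ab, atmostone_c)
assert accepts(glob, ["a", "b", "c"])      # both constraints satisfied
assert not accepts(glob, ["a", "c"])       # Response(a, b) violated
assert not accepts(glob, ["c", "b", "c"])  # AtMostOne(c) violated
```

The product automaton accepts exactly the traces accepted by both local automata, mirroring \(\mathscr {L}\!\left( A_\textsc {DS}\right) = \bigcap _{\textsc {k}\in K} \mathscr {L}\!\left( A_\textsc {k}\right) \).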

Figure 5 shows four local automata for constraints taken from our running example: \(\textsc {AlternateResponse}( \textsf {r}, \textsf {v})\), \(\textsc {ChainResponse}( \textsf {\$}, \textsf {p})\), \({\textsc {Precedence}}( \textsf {u}, \textsf {e})\) and \(\textsc {AtMostOne}( \textsf {p})\). Examples of global automata are instead given in Fig. 6.

In the remainder of this chapter, we will extensively use local and global automata for reasoning, discovery, and monitoring. Though out of scope for this chapter, it is also worth mentioning that the automata-based approach has also been used for simulation of Declare models and thereby the production of event logs from declarative specifications [37], and also to define enactment engines for Declare specifications [76, 97].

Fig. 6.

Global automata for the interplay of Declare constraints.

4.2 Reasoning on Declare Specifications

Reasoning on a Declare specification is necessary to understand which model traces are supported and, in turn, to ascertain its correctness. Reasoning is also key to unveil how constraints interact with each other, and check whether activations and targets are properly defined. As we will see, this is instrumental not only to analyze specifications, but it is also an integral part of declarative process mining.

In general, reasoning on declarative specifications is of particular importance: while they enjoy flexibility, they typically do not explicitly indicate how execution has to be controlled. We have seen how this phenomenon concretely manifests itself in the context of Declare: traces conforming to the specification (that is, model traces) are only implicitly described as those that satisfy all the given constraints. Constraints, in turn, may be quite different from one another (e.g., indicating what is expected to occur, but also what should not happen) and, even more importantly, may affect each other in subtle, difficult-to-detect ways. This phenomenon is known, in the literature studying the cognitive impact of languages and notations, under the name of hidden dependencies [47]. Hidden dependencies in Declare have been studied in [32, 70], and their impact on the understandability and interpretability of declarative process models has spawned a dedicated line of research, started in [48].

We detail next key reasoning tasks in the context of Declare, substantiating how hidden dependencies enter into the picture. We show that all such reasoning tasks can be homogeneously tackled by a single check on the global automaton of the specification under study.

Specification Consistency. This is the most fundamental task, defined as follows.

Definition 11

(Consistent specification). A Declare specification \(\textsc {DS}\) is consistent if there exists at least one model trace for \(\textsc {DS}\).    \(\triangleleft \)

Fig. 7.

Examples of incorrect Declare specifications.

Example 8

Consider the Declare specification in Fig. 7(a). The specification is inconsistent. This is not due to conflicting constraints insisting on the same activity, but due to hidden dependencies arising from the interplay of multiple constraints. To see why the specification is inconsistent, we can try to construct a trace that satisfies some of the constraints in the model, until we reach a contradiction (i.e., the “trace pattern” constructed so far violates a constraint of the specification). This is graphically shown next:

figure g

The picture clearly depicts that \(\textsc {AtLeastOne}( \textsf {a})\) triggers:

  • on the one hand \({\textsc {Precedence}}( \textsf {d}, \textsf {a})\), calling for a preceding occurrence of \( \textsf {d}\);

  • on the other hand, in cascade, \(\textsc {Response}( \textsf {a}, \textsf {b})\), \(\textsc {Response}( \textsf {b}, \textsf {c})\), and \(\textsc {Response}( \textsf {c}, \textsf {d})\), calling for a later occurrence of \( \textsf {d}\).

Considering the interplay of the involved constraints, \( \textsf {d}\) is required to occur in different instants, hence twice, in turn violating \(\textsc {AtMostOne}( \textsf {d})\).    \(\triangleleft \)
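The hidden dependency can also be observed mechanically. The sketch below is an illustration under the assumption that the constraint set is the one read off the example; note that a bounded search cannot prove inconsistency, it can only fail to find a model trace up to the bound:

```python
from itertools import product

def response(t, x, y):  # every x must be followed by a later y
    return all(y in t[i + 1:] for i, s in enumerate(t) if s == x)

def precedence(t, x, y):  # every y must be preceded by an earlier x
    return all(x in t[:i] for i, s in enumerate(t) if s == y)

def is_model_trace(t):
    return ("a" in t                      # AtLeastOne(a)
            and precedence(t, "d", "a")   # Precedence(d, a)
            and response(t, "a", "b")     # the Response cascade...
            and response(t, "b", "c")
            and response(t, "c", "d")     # ...ending in a second 'd'
            and t.count("d") <= 1)        # AtMostOne(d)

found = [t for n in range(8) for t in product("abcd", repeat=n)
         if is_model_trace(t)]
assert found == []  # no model trace up to length 7
```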

By definition of model trace, it is immediate to see that \(\textsc {DS}\) is consistent if and only if the \(\textsc {LTL}_f\) specification formula \(\varphi _{\textsc {DS}}\) is satisfiable. This, in turn, can be algorithmically verified by first constructing the global automaton \(A_\textsc {DS}\), and then checking whether such an automaton is empty (i.e., it does not recognize any trace). Specifically, \(\varphi _{\textsc {DS}}\) is satisfiable if and only if \(A_\textsc {DS}\) is non-empty.
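Emptiness of a deterministic FSA reduces to reachability. As a sketch (with a hand-coded, deliberately inconsistent product automaton chosen for illustration: \(\textsc {AtLeastOne}( \textsf {a})\) combined with a constraint forbidding \( \textsf {a}\)), one can check whether any accepting state is reachable from the initial state:

```python
from collections import deque

def is_empty(init, accepting, delta, alphabet):
    # BFS over reachable states: the recognized language is empty iff no
    # accepting state is reachable from the initial state.
    seen, queue = {init}, deque([init])
    while queue:
        q = queue.popleft()
        if q in accepting:
            return False
        for s in alphabet:
            nxt = delta[(q, s)]
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return True

# Product of AtLeastOne(a) (second component "ok" only while 'a' is absent)
# with "a never occurs": the only way to reach the a-seen state (1) is to
# fall into the "trap" of the other constraint, so nothing is accepted.
delta = {((0, "ok"), "a"): (1, "trap"), ((0, "ok"), "b"): (0, "ok"),
         ((1, "trap"), "a"): (1, "trap"), ((1, "trap"), "b"): (1, "trap")}
assert is_empty((0, "ok"), {(1, "ok")}, delta, {"a", "b"})
```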

Detection of Dead Activities. This task amounts to checking whether a Declare specification is over-constrained, in the sense that it contains an activity that can never be executed (such an activity is called dead).

Definition 12

(Dead activity). Let \(\textsc {DS}=(\textsc {Rep},\mathrm {Act},K)\) be a Declare specification. An activity \( \textsf {a}\in \mathrm {Act}\) is dead in \(\textsc {DS}\) if there is no model trace of \(\textsc {DS}\) where \( \textsf {a}\) occurs.    \(\triangleleft \)

Example 9

Consider the Declare specification in Fig. 7(b). The specification is consistent; as an example, trace \(\langle \textsf {c}, \textsf {d}\rangle \) is a model trace. However, no model trace foresees the execution of \( \textsf {b}\). This can be seen by trying to construct a trace containing an occurrence of \( \textsf {b}\). The result is the following:

figure h

It is apparent that the presence of \( \textsf {b}\) requires a previous occurrence of \( \textsf {a}\) and, indirectly, a future occurrence of \( \textsf {d}\), violating \(\textsc {NotResponse}( \textsf {a}, \textsf {d})\). This shows that \( \textsf {b}\) is a dead activity.

Consider now the specification in Fig. 7(c). The situation here is trickier. The specification is consistent, as it accepts the empty trace (where no activity is executed, and hence neither of the two response constraints in the specification gets activated). However, neither of the two activities \( \textsf {a}\) and \( \textsf {b}\) appearing therein can occur. As soon as one of them does, the combination of the two response constraints cannot be finitely satisfied: an occurrence of \( \textsf {a}\) requires a later occurrence of \( \textsf {b}\), which in turn requires a later occurrence of \( \textsf {a}\), and so on, indefinitely. In other words, at every instant, one of \(\textsc {Response}( \textsf {a}, \textsf {b})\) and \(\textsc {Response}( \textsf {b}, \textsf {a})\) is active and waiting for a later occurrence of its target. Since every such instant must have a next instant, it is not possible to construct a satisfying (finite) trace.    \(\triangleleft \)

Dead activity detection can be directly reduced to (in)consistency of a specification. Specifically, activity \( \textsf {a}\) is dead in a Declare specification \(\textsc {DS}=(\textsc {Rep},\mathrm {Act},K)\) if and only if the specification \((\textsc {Rep},\mathrm {Act},K\cup \{\textsc {AtLeastOne}( \textsf {a})\})\), obtained from \(\textsc {DS}\) by forcing the existence of \( \textsf {a}\), is inconsistent (i.e., its specification formula is not satisfiable).
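Following this reduction, the Fig. 7(c) specification (which, per Example 9, consists of \(\textsc {Response}( \textsf {a}, \textsf {b})\) and \(\textsc {Response}( \textsf {b}, \textsf {a})\)) can be probed with a bounded search: adding \(\textsc {AtLeastOne}( \textsf {a})\) leaves no model trace up to the bound, consistently with \( \textsf {a}\) being dead. This is a sketch, not a proof, as a bounded search cannot establish unsatisfiability:

```python
from itertools import product

def response(t, x, y):  # every x must be followed by a later y
    return all(y in t[i + 1:] for i, s in enumerate(t) if s == x)

def fig7c(t):  # the two mutually recursive response constraints
    return response(t, "a", "b") and response(t, "b", "a")

traces = [t for n in range(9) for t in product("ab", repeat=n)]
# Consistent: the empty trace is a model trace.
assert any(fig7c(t) for t in traces)
# But adding AtLeastOne(a) rules out every trace up to length 8.
assert not any(fig7c(t) and "a" in t for t in traces)
```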

Valid Activation and Target. To ensure that a Declare constraint \(\textsc {k}\) comes with a valid activation and target for its formula \(\varphi _{\textsc {k}}\), we can directly apply Definition 10 and check whether the \(\textsc {LTL}_f\) formula expressing its defining condition (the equivalence between \(\varphi _{\textsc {k}}\) and the activation-to-target requirement) is valid, that is, whether its negation is not satisfiable.

Checking Relations Between Constraints/Specifications. We establish two key relations between constraints/specifications. The first is that of subsumption between templates, lifting the entailment relation between \(\textsc {LTL}_f\) formulae to constraints. We formally define it as follows.

Definition 13

(Subsumption). Let \(\textsc {k}(x_1, \ldots , x_m), \textsc {k}'(x_1, \ldots , x_m) \in \textsc {Rep}\) be two templates. \(\textsc {k}(x_1, \ldots , x_m)\) subsumes \(\textsc {k}'(x_1, \ldots , x_m)\) (in symbols, \(\textsc {k}(x_1, \ldots , x_m) \sqsubseteq \textsc {k}'(x_1, \ldots , x_m)\)) if, given any mapping \(\kappa \) assigning activities \(a_1, \ldots , a_m \in \mathrm {Act}\) to \(x_1, \ldots , x_m\), we have \( \varphi _{\textsc {k}(a_1, \ldots , a_m)} \models \varphi _{\textsc {k}'(a_1, \ldots , a_m)}\).    \(\triangleleft \)

This relation can be checked by verifying that \(\varphi _{\textsc {k}(a_1, \ldots , a_m)} \rightarrow \varphi _{\textsc {k}'(a_1, \ldots , a_m)}\) is valid, that is, the negated formula \(\varphi _{\textsc {k}(a_1, \ldots , a_m)} \wedge \lnot \varphi _{\textsc {k}'(a_1, \ldots , a_m)}\) is not satisfiable for any \(a_1, \ldots , a_m \in \mathrm {Act}\). For example, \(\textsc {Alt.Prec.}(x,y) \sqsubseteq {\textsc {Precedence}}(x,y)\), as the former requires that \(y\) can occur only if preceded by \(x\) (just as the latter) and \(y\) does not recur in between. Therefore, every trace that satisfies the former must satisfy the latter too. In the following, we shall lift this notion to constraints too (e.g., we say that \(\textsc {AlternatePrecedence}( \textsf {y}, \textsf {p})\) subsumes \({\textsc {Precedence}}( \textsf {y}, \textsf {p})\)).
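A brute-force variant of this check (a sketch over bounded traces, not a decision procedure) confirms the subsumption for \(\textsc {AlternatePrecedence}(x,y)\) and \({\textsc {Precedence}}(x,y)\):

```python
from itertools import product

def precedence(t, x, y):
    # every y must be preceded by an earlier x
    return all(x in t[:i] for i, s in enumerate(t) if s == y)

def alternate_precedence(t, x, y):
    # every y must be preceded by an x with no other y in between
    for i, s in enumerate(t):
        if s == y:
            if not any(t[j] == x and y not in t[j + 1:i] for j in range(i)):
                return False
    return True

# Every trace (up to length 6, over three activities) satisfying the
# subsuming template also satisfies the subsumed one.
for n in range(7):
    for t in product("xyz", repeat=n):
        if alternate_precedence(t, "x", "y"):
            assert precedence(t, "x", "y")
```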

By Definition 8 and Definition 9, since both Declare constraints and specifications correspond to \(\textsc {LTL}_f\) formulae, we can use subsumption for a twofold purpose:

  • Consider two candidate constraints \(\textsc {k}_1\) and \(\textsc {k}_2\). If \(\textsc {k}_1 \sqsubseteq \textsc {k}_2\), then we know that adding \(\textsc {k}_1\) to a Declare specification makes the addition of \(\textsc {k}_2\) irrelevant, and that choosing between adding \(\textsc {k}_1\) or \(\textsc {k}_2\) determines whether the specification becomes more or less constraining.

  • Consider a candidate constraint \(\textsc {k}\) and a target specification \(\textsc {DS}\). If the latter logically entails the former, that is, \(\varphi _{\textsc {DS}} \models \varphi _{\textsc {k}}\), then \(\textsc {k}\) is redundant in \(\textsc {DS}\), and it makes no sense to include it in \(\textsc {DS}\).

The second relation characterizes constraints that are the negated version of each other. Let \(\textsc {k}_1\) and \(\textsc {k}_2\) be two Declare constraints, coming with activation formulae \(\varphi ^{\mathrm {act}}_1\) and \(\varphi ^{\mathrm {act}}_2\) and target formulae \(\varphi ^{\mathrm {trg}}_1\) and \(\varphi ^{\mathrm {trg}}_2\), respectively. We say that \(\textsc {k}_1\) and \(\textsc {k}_2\) are the negated versions of one another if their activations are logically equivalent, that is, \(\varphi ^{\mathrm {act}}_1 \equiv \varphi ^{\mathrm {act}}_2\), and their targets are incompatible, that is, \(\varphi ^{\mathrm {trg}}_1 \wedge \varphi ^{\mathrm {trg}}_2\) is unsatisfiable. An example is that of \(\textsc {Response}\) vs \(\textsc {NotResponse}\).

Consider now the situation where a decision must be taken concerning which of two candidate constraints \(\textsc {k}_1\) and \(\textsc {k}_2\) can be added to a Declare specification. Knowing that \(\textsc {k}_1\) and \(\textsc {k}_2\) are the negated versions of one another indicates that they should not both be added to the specification, as including them both would make the specification inconsistent as soon as the two constraints are activated.
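For instance, \(\textsc {Response}( \textsf {a}, \textsf {b})\) and \(\textsc {NotResponse}( \textsf {a}, \textsf {b})\) are activated by the same event (an occurrence of \( \textsf {a}\)) and have incompatible targets; the bounded check below (illustrative, not a proof) confirms that no activated trace satisfies both:

```python
from itertools import product

def response(t, x, y):
    # every x must be followed by a later y
    return all(y in t[i + 1:] for i, s in enumerate(t) if s == x)

def not_response(t, x, y):
    # no x may be followed by a later y
    return all(y not in t[i + 1:] for i, s in enumerate(t) if s == x)

for n in range(7):
    for t in product("ab", repeat=n):
        if "a" in t:  # both constraints are activated
            assert not (response(t, "a", "b") and not_response(t, "a", "b"))
```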

As we will see in the next section, these notions become key when dealing with declarative process mining, and in particular the discovery of Declare specifications from event logs. Figure 8 graphically depicts how the main Declare constraint templates relate to each other in terms of subsumption and negated versions.

Fig. 8.

The subsumption map of Declare templates. Templates are indicated with solid boxes. The subsumption relation is depicted as a line starting from the subsumed template and ending in the subsuming one, with an empty triangular arrow recalling the UML IS-A graphical notation. The negative templates are graphically linked to the corresponding relation templates by means of wavy grey arcs.

5 Declarative Process Mining

Declarative process constraints depict the interplay of every activity in the process with the rest of the activities. As a consequence, the behavioural relationships that hold among activities can be analysed with a local focus on each one [9], as a projection of the whole process behaviour on a single element thereof. The constraints pertaining to a single activity can thus be seen as its footprint in the global behaviour of the process. We shall interchangeably interpret Declare constraints as (i) behavioural relations between activities in a process specification or (ii) rules exerted on the occurrence of events in traces. Notice that the former is the interpretation typically used for process modelling, as originally conceived in the seminal work of Pesic et al. [77], whereas the latter is the basis for declarative process mining. In the following, we describe how process specifications can be discovered and monitored.

figure i

5.1 Declarative Process Discovery

Declarative process discovery refers to the inference of those constraints that significantly rule the behaviour of a process, based upon an input event log. The problem can be framed in two distinct ways:

  • A discriminative discovery problem, reminiscent of a classification task. This requires splitting the input event log into two partitions, one containing “positive” examples and the other containing “negative” examples. Discovery then amounts to finding a suitable Declare specification that correctly reconstructs the classification, that is, accepts all positive examples and rejects all negative ones.

  • A standard discovery problem – also known as specification mining in the software engineering literature [53]. This calls for identifying which Declare constraints best describe the traces in the log, considering all of them as “positive” examples.

The first discovery algorithm for Declare treated discovery as a discriminative problem, exploiting inductive logic programming to tackle it [20, 52]. In parallel, Goedertier et al. [46] brought forward techniques to generate negative examples from positive ones. Interestingly, this line of investigation has recently regained the attention of the community [19, 89].

Declarative process discovery framed as a standard discovery problem finds its two main exponents in Declare Miner [58] and MINERful [40], which have since been extended with an arsenal of techniques to improve the quality and correctness of the discovered specifications. We follow the second thread, summarizing the main ideas exploited therein, though reshaping the core concepts in an attempt to encompass the wider range of declarative process discovery techniques and the advancements they brought [8, 18, 59].

Process discovery in a declarative setting typically consists of the following phases:

  1.

    The initial setup, i.e., the selection of (i) the templates to be sought for, (ii) the activities to be considered for the candidate constraints instantiating those templates, and (iii) the minimum thresholds for constraint interestingness measures to retain a candidate constraint;

  2.

    The computation of interestingness measures for all the constraints that instantiate the given templates;

  3.

    The simplification of the returned specification, through (i) the removal of constraints whose measures do not reach the user-specified thresholds, (ii) the pruning of the redundant constraints from the set, and (iii) the removal of one constraint for every pair of constraints that are the negated version of one another.
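The three phases can be sketched in a few lines of code. The log, the confidence threshold, and the restriction to the \(\textsc {Response}\) template below are toy assumptions made for illustration; the measures follow the intuition of support and confidence introduced next, and the pruning steps (ii) and (iii) of phase 3 are omitted for brevity:

```python
from itertools import permutations

# Toy log as a multiset (trace -> multiplicity); we seek only Response(x, y)
# candidates over three activities, with a minimum confidence threshold.
log = {("a", "b", "c"): 3, ("a", "c", "b"): 2, ("b", "c"): 1}
activities = {"a", "b", "c"}
min_conf = 0.9

def satisfies_response(t, x, y):
    # every x must be followed by a later y
    return all(y in t[i + 1:] for i, s in enumerate(t) if s == x)

def measures(x, y):
    total = sum(log.values())
    activated = sum(m for t, m in log.items() if x in t)
    fulfilled = sum(m for t, m in log.items()
                    if x in t and satisfies_response(t, x, y))
    return fulfilled / total, fulfilled / max(activated, 1)  # supp, conf

# Phase 1: candidate constraints; phase 2: measures; phase 3: filtering.
discovered = {(x, y): measures(x, y)
              for x, y in permutations(sorted(activities), 2)
              if measures(x, y)[1] >= min_conf}
assert set(discovered) == {("a", "b"), ("a", "c")}
```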

Algorithm 1 gives a bird's-eye view of the approach in pseudocode. As we can observe, interestingness measures are crucial to determine the degree to which constraints are satisfied in the log. Such measures indicate the reliability and relevance of constraints discovered from event logs; originally devised in the field of association rule mining [3], they were later adapted to the declarative process discovery context [17, 65]. Among them, we recall support and confidence. Intuitively, support is a normalized measure quantifying how often the constraint is satisfied in the event log. Confidence relates the number of satisfactions to the occurrences of the activations. We define them formally as follows.

Definition 14

(Trace-based measures). Let \(L\) be a non-empty simplified event log with at least one non-empty trace, and \(\textsc {k}\) a declarative constraint as per Definition 1. We define the trace-based support \(\mathrm {supp}_\mathrm {t}\) and the trace-based confidence \(\mathrm {conf}_\mathrm {t}\) as follows:

figure j

   \(\triangleleft \)

We remark that the condition at the numerator, requiring that a trace satisfies not only the constraint \(\textsc {k}\) but also, eventually, its activation, serves the purpose of not counting the “vacuous satisfactions” discussed in Sect. 4.1. For example, while trace \(\langle \textsf {b}, \textsf {c}\rangle \) satisfies \(\textsc {ChainResponse}( \textsf {a}, \textsf {b})\), it does so vacuously, in the sense that it never activates the constraint. This intuitively means that \(\textsc {ChainResponse}( \textsf {a}, \textsf {b})\), albeit satisfied, cannot be interestingly used to describe the behaviour encoded in the trace. We recall that \(L( t )\) denotes the multiplicity of occurrences of \( t \) in the log \(L\) (see [1], Sect. 3.1). The \(\max {}\) term at the denominator of the formulation of confidence serves the purpose of avoiding a division by zero in case no trace satisfies the activation condition.
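The effect of discarding vacuous satisfactions can be seen on a toy log. The following sketch is consistent with the intuition above (for the exact formulation, refer to Definition 14): \(\langle \textsf {b}, \textsf {c}\rangle \) satisfies \(\textsc {ChainResponse}( \textsf {a}, \textsf {b})\) but never activates it, so it counts towards neither the numerator of support nor the denominator of confidence.

```python
def satisfies_chain_response(t, x, y):
    # every x must be immediately followed by y
    return all(i + 1 < len(t) and t[i + 1] == y
               for i, s in enumerate(t) if s == x)

def activates(t, x):
    return x in t

# Toy log as a multiset (trace -> multiplicity).
log = {("b", "c"): 1, ("a", "b"): 1, ("a", "c"): 2}

total = sum(log.values())
activated = sum(m for t, m in log.items() if activates(t, "a"))
fulfilled = sum(m for t, m in log.items()
                if activates(t, "a") and satisfies_chain_response(t, "a", "b"))

supp_t = fulfilled / total               # <b, c> does not count as a satisfaction
conf_t = fulfilled / max(activated, 1)   # relative to activated traces only
assert supp_t == 0.25
assert abs(conf_t - 1 / 3) < 1e-9
```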

Declare Miner first introduced the trace-based measures to discover specifications from logs, counting traces that (non-vacuously) satisfy constraints as a whole. MINERful, instead, advocated also the adoption of measures that lie at the level of granularity of events. The similarities and differences between the two measuring schemes, and the role of explicit activations and targets to tackle vacuity, were later systematized in [18]. The motivation behind the use of event-based measures is the ability to weigh differently those traces that violate a constraint in more than one instant: with trace-based measures, e.g., both traces \(\langle \textsf {a}, \textsf {b}, \textsf {c}, \textsf {a}, \textsf {b}, \textsf {c}, \textsf {c}, \textsf {a}, \textsf {b}, \textsf {a}, \textsf {b}, \textsf {a}, \textsf {b}, \textsf {a}, \textsf {b}, \textsf {c}, \textsf {a}, \textsf {b}, \textsf {c}, \textsf {a}, \textsf {b}, \textsf {a}, \textsf {b}, \textsf {a}, \textsf {c}\rangle \) and \(\langle \textsf {b}, \textsf {a}, \textsf {c}, \textsf {a}, \textsf {c}, \textsf {a}, \textsf {a}, \textsf {a}, \textsf {a}, \textsf {a}, \textsf {a}, \textsf {c}\rangle \) would count as single violations of \(\textsc {ChainResponse}( \textsf {a}, \textsf {b})\). However, only the last of the ten occurrences of \( \textsf {a}\) leads to a violation in the first trace, whereas all eight occurrences of \( \textsf {a}\) lead to a violation in the second trace. Next, we formally capture the notion of event-based measures.

Definition 15

(Event-based measures). Let \(L\) be a non-empty simplified event log with at least one non-empty trace, and \(\textsc {k}\) a declarative constraint as per Definition 1. We define the event-based support \(\mathrm {supp}_\mathrm {e}\) and the event-based confidence \(\mathrm {conf}_\mathrm {e}\) as follows:

figure k

   \(\triangleleft \)

Again, the condition at the numerator that events satisfy both activation and target of the constraint is intended to avoid including vacuous satisfactions in the sum. The \(\max {}\) term at the denominator of confidence is intended to avoid a division by zero in case no event satisfies the activation condition.
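Returning to the two traces discussed before Definition 15, a direct count (illustrative) confirms that the first trace violates \(\textsc {ChainResponse}( \textsf {a}, \textsf {b})\) on one activation out of ten, while the second violates it on all eight:

```python
# The two traces from the discussion preceding Definition 15.
t1 = list("abcabccababababcabcababac")
t2 = list("bacacaaaaaac")

def activation_outcomes(t, x, y):
    # For ChainResponse(x, y): each occurrence of x is an activation; it is
    # violated when the next event is missing or different from y.
    activations = [i for i, s in enumerate(t) if s == x]
    violations = [i for i in activations if i + 1 >= len(t) or t[i + 1] != y]
    return len(activations), len(violations)

assert activation_outcomes(t1, "a", "b") == (10, 1)
assert activation_outcomes(t2, "a", "b") == (8, 8)
```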

Table 3. Measures computed for the relation constraints of Example 1 from the event log of Example 10.

For the sake of readability, we shall denote with \(\mathrm {allm}\!\left( \textsc {k},L\right) \) the tuple containing all computed measures for a constraint \(\textsc {k}\) on the event log \(L\): \(\mathrm {allm}\!\left( \textsc {k},L\right) = \left( \mathrm {supp}_\mathrm {t}\!\left( \textsc {k},L\right) , \mathrm {conf}_\mathrm {t}\!\left( \textsc {k},L\right) , \mathrm {supp}_\mathrm {e}\!\left( \textsc {k},L\right) , \mathrm {conf}_\mathrm {e}\!\left( \textsc {k},L\right) \right) \). Given two constraints \(\textsc {k}_1\) and \(\textsc {k}_2\), we write \( \mathrm {allm}\!\left( \textsc {k}_1,L\right) \le \mathrm {allm}\!\left( \textsc {k}_2,L\right) \) if \( \mathrm {supp}_\mathrm {t}\!\left( \textsc {k}_1,L\right) \le \mathrm {supp}_\mathrm {t}\!\left( \textsc {k}_2,L\right) \), \( \mathrm {conf}_\mathrm {t}\!\left( \textsc {k}_1,L\right) \le \mathrm {conf}_\mathrm {t}\!\left( \textsc {k}_2,L\right) \), \( \mathrm {supp}_\mathrm {e}\!\left( \textsc {k}_1,L\right) \le \mathrm {supp}_\mathrm {e}\!\left( \textsc {k}_2,L\right) \), and \( \mathrm {conf}_\mathrm {e}\!\left( \textsc {k}_1,L\right) \le \mathrm {conf}_\mathrm {e}\!\left( \textsc {k}_2,L\right) \). We write \( \mathrm {allm}\!\left( \textsc {k}_1,L\right) = \mathrm {allm}\!\left( \textsc {k}_2,L\right) \) if both \( \mathrm {allm}\!\left( \textsc {k}_1,L\right) \le \mathrm {allm}\!\left( \textsc {k}_2,L\right) \) and \( \mathrm {allm}\!\left( \textsc {k}_2,L\right) \le \mathrm {allm}\!\left( \textsc {k}_1,L\right) \).

Example 10

(An event log for the specification in Example 1). Let \(\mathcal{U}_{ act }\doteq \{ \textsf {c}, \textsf {r}, \textsf {v}, \textsf {t}, \textsf {n}, \textsf {y}, \textsf {\$}, \textsf {p}, \textsf {e}, \textsf {u}\} \cup \{ \textsf {@}\}\) be an alphabet of activities. We interpret \( \textsf {@}\) as an email exchange, which can occur at any stage during the process. The other activities in \(\mathcal{U}_{ act }\) are those that were considered in the process specification in Example 1. Let the following event log be built on \(\mathcal{U}_{ act }\): \(L= [ t_1^{200}, t_2^{100}, t_3^{100}, t_4^{80}, t_5^{80}, t_6^{4}, t_7^{2}, t_8^{2} ]\) where

$$\begin{aligned} t_{1}&= \langle \textsf {c}, \textsf {t}, \textsf {r}, \textsf {v}, \textsf {y}, \textsf {\$}, \textsf {p}, \textsf {u}, \textsf {e}\rangle&t_{2}&= \langle \textsf {c}, \textsf {t}, \textsf {t}, \textsf {r}, \textsf {v}, \textsf {n}, \textsf {t}, \textsf {r}, \textsf {v}, \textsf {y}, \textsf {\$}, \textsf {p}, \textsf {u}, \textsf {e}\rangle \\ t_{3}&= \langle \textsf {c}, \textsf {t}, \textsf {r}, \textsf {t}, \textsf {v}, \textsf {y}, \textsf {u}, \textsf {\$}, \textsf {p}, \textsf {e}\rangle&t_{4}&= \langle \textsf {c}, \textsf {t}, \textsf {@}, \textsf {t}, \textsf {r}, \textsf {v}, \textsf {n}, \textsf {@}, \textsf {r}, \textsf {v}, \textsf {n}\rangle \\ t_{5}&= \langle \textsf {c}, \textsf {r}, \textsf {t}, \textsf {t}, \textsf {v}, \textsf {n}, \textsf {y}, \textsf {@}\rangle&t_{6}&= \langle \textsf {c}, \textsf {t}, \textsf {r}, \textsf {t}, \textsf {v}, \textsf {@}, \textsf {@}, \textsf {y}, \textsf {\$}, \textsf {p}, \textsf {@}, \textsf {e}\rangle \\ t_{7}&= \langle \textsf {c}, \textsf {@}, \textsf {r}, \textsf {v}, \textsf {y}, \textsf {\$}, \textsf {p}, \textsf {@}, \textsf {e}\rangle&t_{8}&= \langle \textsf {c}, \textsf {t}, \textsf {r}, \textsf {r}, \textsf {v}, \textsf {@}, \textsf {n}\rangle \\ \end{aligned}$$

We observe that the log above does not fully comply with the specification. Indeed, (i) trace \(t_{8}\) violates \(\textsc {AlternateResponse}( \textsf {r}, \textsf {v})\), as the candidate managed to register twice before evaluation (notice the occurrence of two consecutive \( \textsf {r}\)’s before \( \textsf {v}\)); (ii) \(t_{7}\) violates \({\textsc {Precedence}}( \textsf {t}, \textsf {v})\) and \({\textsc {Precedence}}( \textsf {u}, \textsf {e})\), as the candidate must have sent the admission test score and the necessary enrolment documents via email rather than via the system (see the occurrence of \( \textsf {@}\) in place of \( \textsf {t}\) in the second instant and in place of \( \textsf {u}\) later in the trace); finally, (iii) trace \(t_{6}\) violates \({\textsc {Precedence}}( \textsf {u}, \textsf {e})\), as the candidate must have submitted the enrolment documents via email in that case too (notice the absence of task \( \textsf {u}\) and the presence of \( \textsf {@}\) in its stead).    \(\triangleleft \)

Example 11

With the example above, we have that both the trace-based support and trace-based confidence of \({\textsc {Precedence}}( \textsf {c}, \textsf {r})\), e.g., equal 1.0: \( \mathrm {supp}_\mathrm {t}\!\left( {\textsc {Precedence}}( \textsf {c}, \textsf {r}),L\right) = \mathrm {conf}_\mathrm {t}\!\left( {\textsc {Precedence}}( \textsf {c}, \textsf {r}),L\right) = 1.0 \). This is because the activation (i.e., \( \textsf {r}\)) occurs in all traces and the constraint is violated in none. Instead, \( \mathrm {supp}_\mathrm {t}\!\left( \textsc {Alt.Prec.}( \textsf {v}, \textsf {n}),L\right) = \frac{100+80+80+2}{568} \approx 0.461 \) and \( \mathrm {conf}_\mathrm {t}\!\left( \textsc {Alt.Prec.}( \textsf {v}, \textsf {n}),L\right) = 1.0 \). The trace-based support is lower than the trace-based confidence because the activation (\( \textsf {n}\)) occurs in 262 traces out of 568 (i.e., in the 100 instances of \( t _2\), the 80 instances of \( t _4\), the 80 instances of \( t _5\), and the 2 instances of \( t _8\)). Similarly, \( \mathrm {conf}_\mathrm {e}\!\left( {\textsc {Precedence}}( \textsf {c}, \textsf {r}),L\right) = 1.0 \) and \( \mathrm {conf}_\mathrm {e}\!\left( \textsc {Alt.Prec.}( \textsf {v}, \textsf {n}),L\right) = 1.0 \). The event-based confidence coincides with the trace-based one because every activation of the two constraints above leads to a satisfaction. In contrast, \( \mathrm {supp}_\mathrm {e}\!\left( {\textsc {Precedence}}( \textsf {c}, \textsf {r}),L\right) = \frac{ 1 \times 200 + 2 \times 100 + 1 \times 100 + 2 \times 80 + 1 \times 80 + 1 \times 4 + 1 \times 2 + 2 \times 2}{ 9 \times 200 + 14 \times 100 + 10 \times 100 + 11 \times 80 + 8 \times 80 + 12 \times 4 + 9 \times 2 + 7 \times 2} = \frac{750}{5800} \approx 0.129 \).    \(\triangleleft \)
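The figures above can be reproduced programmatically. The sketch below (an illustration, under the measure formulations described in the text) recomputes \( \mathrm {supp}_\mathrm {t}\!\left( \textsc {Alt.Prec.}( \textsf {v}, \textsf {n}),L\right) = \frac{262}{568}\) and \( \mathrm {supp}_\mathrm {e}\!\left( {\textsc {Precedence}}( \textsf {c}, \textsf {r}),L\right) = \frac{750}{5800}\) from the log of Example 10:

```python
from fractions import Fraction

# The event log of Example 10 as a multiset (trace -> multiplicity).
log = {
    tuple("ctrvy$pue"): 200,
    tuple("cttrvntrvy$pue"): 100,
    tuple("ctrtvyu$pe"): 100,
    tuple("ct@trvn@rvn"): 80,
    tuple("crttvny@"): 80,
    tuple("ctrtv@@y$p@e"): 4,
    tuple("c@rvy$p@e"): 2,
    tuple("ctrrv@n"): 2,
}

def alt_precedence(t, x, y):
    # every y must be preceded by an x with no other y in between
    return all(any(t[j] == x and y not in t[j + 1:i] for j in range(i))
               for i, s in enumerate(t) if s == y)

total_traces = sum(log.values())
supp_t = Fraction(sum(m for t, m in log.items()
                      if "n" in t and alt_precedence(t, "v", "n")),
                  total_traces)

total_events = sum(len(t) * m for t, m in log.items())
# Precedence(c, r) is activated by each occurrence of r; its target ("c has
# occurred before") is met by every such occurrence, as all traces start with c.
fulfilled = sum(sum(1 for i, s in enumerate(t) if s == "r" and "c" in t[:i]) * m
                for t, m in log.items())
supp_e = Fraction(fulfilled, total_events)

assert supp_t == Fraction(262, 568)
assert supp_e == Fraction(750, 5800)
```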

It is worth noting that discovery approaches such as Declare Miner [58] and Janus [18] adopt (variations of) local constraint automata to count the satisfactions of constraints. MINERful [40] and DisCoveR [8] resort instead to occurrence statistics of activities gathered from the event log, more closely resembling the procedural discovery algorithms discussed in [2].

By definition of confidence and support (trace- or event-based), and as exemplified above, we observe that trace-based confidence is an upper bound for trace-based support and event-based confidence is an upper bound for event-based support. Next, we illustrate how the discovery algorithm operates with our running example.

Example 12

Table 3 shows the event-based and trace-based measures computed on the basis of our running example for every constraint in the original specification – phase (2) of the discovery procedure described above. They belong to the output of the discovery algorithm run on the event log of Example 10, set up at phase (1) to seek (i) all templates from the Declare repertoire in Table 2 (ii) over activities \(\{ \textsf {c}, \textsf {r}, \textsf {v}, \textsf {t}, \textsf {n}, \textsf {y}, \textsf {\$}, \textsf {p}, \textsf {e}, \textsf {u}\}\), with (iii) a minimum event-based confidence of 0.95. We remark that \(\textsc {AlternatePrecedence}( \textsf {y}, \textsf {p})\), \(\textsc {ChainPrecedence}( \textsf {\$}, \textsf {p})\), \(\textsc {AlternatePrecedence}( \textsf {p}, \textsf {e})\), \(\textsc {AlternatePrecedence}( \textsf {c}, \textsf {p})\), \(\textsc {NotChainPrecedence}( \textsf {y}, \textsf {p})\) and \(\textsc {NotChainResponse}( \textsf {y}, \textsf {p})\), among others, also fulfil those criteria and are thus part of the returned set.    \(\triangleleft \)
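Phases (1)–(2) of the discovery loop can be sketched as follows: enumerate each template over every ordered pair of activities, measure the candidate on the log, and keep it if it clears the threshold. This is a simplification of our own that uses trace-based confidence for brevity; the actual algorithms, template repertoire, and event-based measures are richer.

```python
from itertools import permutations

# Precedence(a, b): b (the activator) may occur only after some a.
def prec_activated(a, b, t):
    return b in t

def prec_satisfied(a, b, t):
    return all(a in t[:i] for i, e in enumerate(t) if e == b)

# each template: (name, activation predicate, satisfaction predicate)
TEMPLATES = [("Precedence", prec_activated, prec_satisfied)]

def discover(log, activities, templates=TEMPLATES, min_conf=0.95):
    """Keep every (template, activity pair) whose confidence on the log
    (traces with multiplicities) reaches the given threshold."""
    kept = []
    for name, activated, satisfied in templates:
        for a, b in permutations(sorted(activities), 2):
            act = sum(n for t, n in log if activated(a, b, t))
            sat = sum(n for t, n in log if activated(a, b, t) and satisfied(a, b, t))
            if act and sat / act >= min_conf:
                kept.append((name, a, b, sat / act))
    return kept
```

On a log where every occurrence of `r` is preceded by `c` but not vice versa, only `Precedence(c, r)` survives the threshold.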

Fig. 9.

The subsumption map of the relation Declare constraints in a discovery context. The graphical notation follows Fig. 8. Gray boxes denote constraints whose measures lie below the minimum thresholds. Light-gray boxes indicate constraints that are subsumed by others with equivalent measures.

To increase the information conveyed by a discovered model, we do not only prune the constraints whose measures lie below the given threshold values. We also take into account the subsumption hierarchy illustrated in Fig. 8. In addition, we retain only one constraint from each pair in which one is the negated version of the other: if we kept both, the model would turn their common activation into a dead activity (see Sect. 4.2).
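The subsumption-based part of this pruning can be sketched as follows; the constraint names and the toy hierarchy in the usage below are placeholders standing in for the actual hierarchy of Fig. 8.

```python
def prune_subsumed(found, subsumes, measures):
    """Discard a weaker constraint when a stronger constraint that subsumes
    it was discovered with the very same measures: the stronger one carries
    strictly more information about the process behaviour.

    found:    set of discovered constraint identifiers
    subsumes: list of (stronger, weaker) pairs from the subsumption hierarchy
    measures: identifier -> tuple of measures (e.g., confidence, support)
    """
    kept = set(found)
    for stronger, weaker in subsumes:
        if stronger in kept and weaker in kept \
                and measures[stronger] == measures[weaker]:
            kept.discard(weaker)
    return kept
```

This mirrors the choice made in Example 13: when \(\textsc{AlternatePrecedence}(\textsf{y}, \textsf{p})\) and \({\textsc{Precedence}}(\textsf{y}, \textsf{p})\) score identically, only the former is retained.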

Example 13

Figure 9 illustrates the result of the pruning phase (3) based on subsumption and on the choice among constraints that are the negated version of one another, based on the event log of Example 10. We observe that \(\textsc {AlternatePrecedence}( \textsf {y}, \textsf {p})\) has the same measures as \({\textsc {Precedence}}( \textsf {y}, \textsf {p})\), and we know that \({\textsc {Precedence}}( \textsf {y}, \textsf {p})\) is subsumed by \(\textsc {AlternatePrecedence}( \textsf {y}, \textsf {p})\) (see Sect. 4.2); as we are interested in the more restrictive constraints, which narrow the space of possible process runs and thus define the process behaviour more closely, we retain the former and discard the latter. Keeping both would introduce a redundancy, while retaining only the latter would omit information: not only must \( \textsf {p}\) be preceded by \( \textsf {y}\), but \( \textsf {p}\) also cannot recur unless \( \textsf {y}\) occurs again. By the same line of reasoning, we prefer retaining \(\textsc {Init}( \textsf {c})\) over \(\textsc {AtMostOne}( \textsf {c})\) in the resulting specification. The same considerations apply to \(\textsc {ChainPrecedence}( \textsf {\$}, \textsf {p})\), preferred over \({\textsc {Precedence}}( \textsf {\$}, \textsf {p})\), and to \(\textsc {AlternatePrecedence}( \textsf {p}, \textsf {e})\), preferred over \({\textsc {Precedence}}( \textsf {p}, \textsf {e})\), among others. Notice that \({\textsc {Precedence}}( \textsf {y}, \textsf {p})\), \({\textsc {Precedence}}( \textsf {\$}, \textsf {p})\) and \({\textsc {Precedence}}( \textsf {p}, \textsf {e})\) were in the given specification of our running example but, we conclude, are not the most restrictive constraints that could be used, as the discovery algorithm evidences.    \(\triangleleft \)

To conclude, we remark that not all redundancies can be found with subsumption-hierarchy-based pruning alone. The subsumption hierarchy, indeed, only compares constraints exerted on the same activities – e.g., \(\textsc {AlternatePrecedence}( \textsf {y}, \textsf {p})\) and \({\textsc {Precedence}}( \textsf {y}, \textsf {p})\). Therefore, we need a more powerful redundancy-checking mechanism, seeking constraints that are entailed by the remainder of the specification’s constraint set (see Sect. 4.2).

Example 14

The confidence of \(\textsc {AlternatePrecedence}( \textsf {v}, \textsf {p})\) is 1.0 in the event log of our running example. Yet, it does not add information to the discovered specification as it is redundant, logically entailed by the other constraints – in particular, \(\textsc {AlternatePrecedence}( \textsf {r}, \textsf {v})\), \(\textsc {AlternatePrecedence}( \textsf {v}, \textsf {y})\), \({\textsc {Precedence}}( \textsf {y}, \textsf {p})\) and \(\textsc {AtMostOne}( \textsf {p})\).    \(\triangleleft \)

To verify this, we can resort to language inclusion via automata product as in [38]: the language accepted by the product of the four constraint automata above is contained in the language accepted by the constraint automaton of \(\textsc {AlternatePrecedence}( \textsf {v}, \textsf {p})\), so adding the latter constraint does not restrict the specification any further. Here, we do not enter into the details of the algorithms that detect redundancies at this deeper level, but provide an example of their rationale. The interested reader can find further details in [24, 38].
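The inclusion check itself follows a standard automata-product rationale: \(L(A) \subseteq L(B)\) holds iff no reachable state of the product of \(A\) with the complement of \(B\) is accepting in \(A\) and rejecting in \(B\). The sketch below applies it to two tiny hand-encoded automata of our own (for \(\textsc{Init}(\textsf{u})\) and \({\textsc{Precedence}}(\textsf{u}, \textsf{e})\), with `o` standing for any other activity); states, names, and encodings are illustrative assumptions.

```python
from collections import deque

def included(a, b, alphabet):
    """L(a) ⊆ L(b) for total DFAs given as
    {"start": ..., "accepting": set, "delta": {(state, symbol): state}}."""
    seen, todo = set(), deque([(a["start"], b["start"])])
    while todo:
        sa, sb = todo.popleft()
        if (sa, sb) in seen:
            continue
        seen.add((sa, sb))
        if sa in a["accepting"] and sb not in b["accepting"]:
            return False  # witness word accepted by a but rejected by b
        for x in alphabet:
            todo.append((a["delta"][(sa, x)], b["delta"][(sb, x)]))
    return True

ALPHABET = {"u", "e", "o"}

# Precedence(u, e): an e may occur only after some u has occurred.
prec_ue = {"start": "s0", "accepting": {"s0", "s1"},
           "delta": {("s0", "u"): "s1", ("s0", "e"): "bad", ("s0", "o"): "s0",
                     ("s1", "u"): "s1", ("s1", "e"): "s1", ("s1", "o"): "s1",
                     ("bad", "u"): "bad", ("bad", "e"): "bad", ("bad", "o"): "bad"}}

# Init(u): the first event of a (non-empty) trace must be u.
init_u = {"start": "q0", "accepting": {"ok"},
          "delta": {("q0", "u"): "ok", ("q0", "e"): "no", ("q0", "o"): "no",
                    ("ok", "u"): "ok", ("ok", "e"): "ok", ("ok", "o"): "ok",
                    ("no", "u"): "no", ("no", "e"): "no", ("no", "o"): "no"}}
```

Every trace starting with `u` trivially precedes each `e` with a `u`, so \(L(\textsc{Init}(\textsf{u})) \subseteq L({\textsc{Precedence}}(\textsf{u}, \textsf{e}))\), while the converse inclusion fails.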

Fig. 10.

Example FSAs adapted for the monitoring of constraints. Non-final states indicating current violation (\({\textsc {c}}\!\bot \)) are dashed and filled in orange; non-final states indicating permanent violation (\({\textsc {p}}\!\bot \)) are dotted and filled in red; final states indicating current satisfaction (\({\textsc {c}}\!\top \)) are thin-solid and filled in blue; final states indicating permanent satisfaction (\({\textsc {p}}\!\top \)) are thick-solid and filled in green. (Color figure online)

5.2 Declarative Process Monitoring

(Compliance) process monitoring aims at tracking running process executions to check their conformance to a reference process model, with the purpose of detecting and reporting deviations as soon as possible [57]. It constitutes one of the main tasks of operational decision support [92, Ch. 10], which characterizes process mining applied at runtime to running process executions.

Declarative process monitoring employs a declarative specification (in our case, expressed in Declare) as the reference model for monitoring. The central fact of monitoring – that process instances are running, i.e., their generated traces evolve over time – calls for a finer-grained understanding of the state of each constraint and of the whole specification. We illustrate this intuitively in the next example.

Example 15

Consider the excerpt in Fig. 11 of our admission process running example, and an evolving trace that, once completed, corresponds to the following sequence: \(\langle \textsf {\$}, \textsf {p}, \textsf {u}, \textsf {\$}, \textsf {p}\rangle \). Let us replay the trace from the beginning.

  1.

    At the beginning, all constraints are satisfied, but only currently so, as events may still occur that violate them. For example, a registration without a consequent evaluation would lead to violating \(\textsc {AlternateResponse}( \textsf {r}, \textsf {v})\), whereas an enrolment without a prior upload of certificates would lead to a violation of \({\textsc {Precedence}}( \textsf {u}, \textsf {e})\).

  2.

    Upon the occurrence of \( \textsf {\$}\), constraint \(\textsc {ChainResponse}( \textsf {\$}, \textsf {p})\) becomes pending or, more precisely, currently violated, as paying demands that a pre-enrolment occur immediately afterwards.

  3.

    The execution of \( \textsf {p}\) brings \(\textsc {ChainResponse}( \textsf {\$}, \textsf {p})\) back to currently satisfied, as it does not require the occurrence of further events, but may do so in the future in case of another payment.

  4.

    Upon the occurrence of \( \textsf {u}\), constraint \({\textsc {Precedence}}( \textsf {u}, \textsf {e})\) becomes permanently satisfied, as enrolment is now enabled, and there is no way to continue the execution leading to a violation of the constraint.

  5.

    This is indeed what happens with the next occurrence of \( \textsf {\$}\), which makes \(\textsc {ChainResponse}( \textsf {\$}, \textsf {p})\) currently violated.

  6.

    The second pre-enrolment brings \(\textsc {ChainResponse}( \textsf {\$}, \textsf {p})\) once again back to currently satisfied. However, it also has the effect of permanently violating \(\textsc {AtMostOne}( \textsf {p})\): the number of occurrences of \( \textsf {p}\) now exceeds the upper bound allowed by the constraint, and there is no way of fixing the violation.

   \(\triangleleft \)

Fig. 11.

Excerpt of the Declare specification in Fig. 2.

As witnessed by the example, the state of each constraint can be described in a fine-grained way by considering, on the one hand, the trace accumulated so far (i.e., the prefix of the whole, still unknown, execution) and by pondering, on the other hand, the possible future continuations. To do so in a formal way, we appeal to the literature on runtime verification for linear temporal logics, and in particular to the \(\textsc {RV}\hbox {-}\textsc {LTL}\) semantics, originally introduced in [11] over infinite traces. This semantics was adopted for the first time in the context of \(\textsc {LTL}_f\) over finite traces in [64, 66], in order to define an operational technique for Declare monitoring. This led to deeper investigations on the usage of \(\textsc {RV}\hbox {-}\textsc {LTL}\) to characterize the relevance of a trace to a declarative specification [39], and finally to a formally grounded, comprehensive framework for monitoring [27, 28].

We now define the \(\textsc {RV}\hbox {-}\textsc {LTL}\) semantics for \(\textsc {LTL}_f\). In the definition, we denote the concatenation of trace \( t _1\) with \( t _2\) as \( t _1 \cdot t _2\).

Definition 16

(RV-LTL states). Consider an \(\textsc {LTL}_f\) formula \(\varphi \) over \(\varSigma \), and a trace \( t \) over \(\varSigma ^*\). We say that \(\varphi \) is in (RV-LTL) state v after \( t \), written \([ t \models \varphi ]_{\textsc {RV}} = v\), if:

  • (Permanent satisfaction) (i) \(v = {\textsc {p}}\!\top \), (ii) the current trace satisfies \(\varphi \) (\( t \models \varphi \)), and (iii) every possible suffix keeps \(\varphi \) satisfied (for every trace \( t ' \in \varSigma ^*\), we have \( t \cdot t ' \models \varphi \)).

  • (Permanent violation) (i) \(v = {\textsc {p}}\!\bot \), (ii) the current trace violates \(\varphi \) (\( t \not \models \varphi \)), and (iii) every possible suffix keeps \(\varphi \) violated (for every trace \( t ' \in \varSigma ^*\), we have \( t \cdot t ' \not \models \varphi \)).

  • (Current satisfaction) (i) \(v = {\textsc {c}}\!\top \), (ii) the current trace satisfies \(\varphi \) (\( t \models \varphi \)), and (iii) there exists a suffix that leads to violate \(\varphi \) (for some trace \( t ' \in \varSigma ^*\), we have \( t \cdot t ' \not \models \varphi \)).

  • (Current violation) (i) \(v = {\textsc {c}}\!\bot \), (ii) the current trace violates \(\varphi \) (\( t \not \models \varphi \)), and (iii) there exists a suffix that leads to satisfy \(\varphi \) (for some trace \( t ' \in \varSigma ^*\), we have \( t \cdot t ' \models \varphi \)).

We also say that \( t \) conforms to \(\varphi \) if \([ t \models \varphi ]_{\textsc {RV}} = {{\textsc {p}}\!\top }\) or \([ t \models \varphi ]_{\textsc {RV}} = {{\textsc {c}}\!\top }\) (i.e., stopping the execution in \( t \) satisfies the formula).    \(\triangleleft \)

By inspecting the definition, we can directly see that monitoring is at least as hard as \(\textsc {LTL}_f\) satisfiability/validity checking. To see this, consider what happens at the beginning of an execution, where the current trace is empty. By applying Definition 16 to this special case, and by recalling the notion of satisfiability/validity of an \(\textsc {LTL}_f\) formula, we in fact get that an \(\textsc {LTL}_f\) formula \(\varphi \) is:

  • permanently satisfied if \(\varphi \) is valid;

  • permanently violated if \(\varphi \) is unsatisfiable;

  • currently satisfied if the two formulae \(\varphi \wedge \mathbf {end}\) and \(\lnot \varphi \) are both satisfiable;

  • currently violated if the two formulae \(\lnot \varphi \wedge \mathbf {end}\) and \(\varphi \) are both satisfiable.

To perform monitoring according to the RV-LTL states from Definition 16, we can once again exploit the automata-theoretic characterization of \(\textsc {LTL}_f\). In particular, given an \(\textsc {LTL}_f\) formula \(\varphi \), we construct its FSA \(A_\varphi \) and color the automaton states according to the \(\textsc {RV}\hbox {-}\textsc {LTL}\) semantics. As introduced in [64] and then formally verified in [28], this can be done as follows. Consider a state s of \(A_\varphi \). We label it by:

  • \({\textsc {p}}\!\top \), if s is final and all the states reachable from s in \(A_\varphi \) are final as well; if \(A_\varphi \) is minimized, this means that s only reaches itself.

  • \({\textsc {p}}\!\bot \), if s is non-final and all the states reachable from s in \(A_\varphi \) are non-final as well; if \(A_\varphi \) is minimized, this means that s only reaches itself.

  • \({\textsc {c}}\!\top \), if s is final and can reach a non-final state in \(A_\varphi \).

  • \({\textsc {c}}\!\bot \), if s is non-final and can reach a final state in \(A_\varphi \).
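The four-way labeling above amounts to a simple reachability analysis on the FSA, which can be sketched as follows. The \(\textsc{AtMostOne}(\textsf{p})\) automaton in the usage below is hand-encoded by us for illustration (`o` standing for any activity other than \(\textsf{p}\)).

```python
def color_states(states, accepting, delta, alphabet):
    """Label every state of a total DFA with its RV-LTL truth value:
    PS/PV = permanent satisfaction/violation, CS/CV = current ones."""
    def reachable(s):
        seen, todo = {s}, [s]
        while todo:
            q = todo.pop()
            for x in alphabet:
                r = delta[(q, x)]
                if r not in seen:
                    seen.add(r)
                    todo.append(r)
        return seen
    colors = {}
    for s in states:
        reach = reachable(s)  # includes s itself
        if s in accepting:
            colors[s] = "PS" if reach <= accepting else "CS"
        else:
            colors[s] = "PV" if not (reach & accepting) else "CV"
    return colors

# AtMostOne(p): states count the occurrences of p (0, 1, or too many).
STATES, ACC = {"s0", "s1", "s2"}, {"s0", "s1"}
DELTA = {("s0", "p"): "s1", ("s0", "o"): "s0",
         ("s1", "p"): "s2", ("s1", "o"): "s1",
         ("s2", "p"): "s2", ("s2", "o"): "s2"}
```

Monitoring a single constraint then amounts to replaying the evolving trace on its automaton and emitting the color of each state entered.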

Figure 10 shows some examples of colored constraint automata, obtained by considering the constraint formulae of some Declare constraints from our running example. To monitor the state evolution of a constraint, one simply has to dynamically play the evolving trace on its colored local automaton, returning the updated \(\textsc {RV}\hbox {-}\textsc {LTL}\) label as soon as a new event is processed. Doing so on the local automata in Fig. 10 for trace \(\langle \textsf {\$}, \textsf {p}, \textsf {u}, \textsf {\$}, \textsf {p}\rangle \) formally reconstructs what was discussed in Example 15.

However, this is not enough to promptly detect violations as soon as they manifest in the traces. This has been extensively discussed in [28, 66], and is at the very core of the power of temporal logic-based techniques for monitoring. We use again Example 15 to illustrate the problem.

Example 16

Consider Example 15 and the following question: is step 6 the earliest at which a violation can be detected? Clearly, if we focus on each constraint in isolation, the answer is affirmative. To see this formally, we play trace \(\langle \textsf {\$}, \textsf {p}, \textsf {u}, \textsf {\$}, \textsf {p}\rangle \) on the four colored local automata of Fig. 10, obtaining the following runs:

  • For \(\textsc {AlternateResponse}( \textsf {r}, \textsf {v})\), we have \(s_0\xrightarrow { \textsf {\$}} s_0\xrightarrow { \textsf {p}} s_0\xrightarrow { \textsf {u}} s_0\xrightarrow { \textsf {\$}} s_0\xrightarrow { \textsf {p}} s_0 \); no violation is encountered.

  • For \(\textsc {ChainResponse}( \textsf {\$}, \textsf {p})\), we have \(s_0\xrightarrow { \textsf {\$}} s_1\xrightarrow { \textsf {p}} s_0\xrightarrow { \textsf {u}} s_0\xrightarrow { \textsf {\$}} s_1\xrightarrow { \textsf {p}} s_0 \); no violation is encountered.

  • For \({\textsc {Precedence}}( \textsf {u}, \textsf {e})\), we have \(s_0\xrightarrow { \textsf {\$}} s_0\xrightarrow { \textsf {p}} s_0\xrightarrow { \textsf {u}} s_1\xrightarrow { \textsf {\$}} s_1\xrightarrow { \textsf {p}} s_1 \); no violation is encountered.

  • For \(\textsc {AtMostOne}( \textsf {p})\), we have \(s_0\xrightarrow { \textsf {\$}} s_0\xrightarrow { \textsf {p}} s_1\xrightarrow { \textsf {u}} s_1\xrightarrow { \textsf {\$}} s_1\xrightarrow { \textsf {p}} s_2 \); a violation is encountered in the last reached state.

The answer changes if we consider the whole Declare specification containing all such constraints at once. In fact, by taking into account the interplay of constraints, we can detect a violation already at step 5, i.e., after the second occurrence of payment. This is because, after that step, the two constraints \(\textsc {ChainResponse}( \textsf {\$}, \textsf {p})\) and \(\textsc {AtMostOne}( \textsf {p})\) come into conflict: no continuation of the current trace can satisfy them both. In fact, after trace \(\langle \textsf {\$}, \textsf {p}, \textsf {u}, \textsf {\$}\rangle \), constraint \(\textsc {ChainResponse}( \textsf {\$}, \textsf {p})\) is currently violated, waiting for a consequent occurrence of \( \textsf {p}\); however, constraint \(\textsc {AtMostOne}( \textsf {p})\), which is currently satisfied, becomes permanently violated upon a further occurrence of \( \textsf {p}\).    \(\triangleleft \)

As we have seen, the early detection of violations cannot always be caught by considering the colored local automata of constraints in isolation. However, it can be systematically detected by taking into account the colored global automaton of the whole specification.
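The early detection of Example 16 can be reproduced by stepping the synchronous product of the local automata and testing, after each event, whether a jointly accepting state is still reachable. The two automata below (\(\textsc{ChainResponse}(\textsf{\$}, \textsf{p})\) and \(\textsc{AtMostOne}(\textsf{p})\), with `u` standing for any other activity) are hand-encoded assumptions of ours for illustration.

```python
def first_unrecoverable(trace, dfas, alphabet):
    """1-based index of the first event after which the product of the
    total DFAs can no longer reach a jointly accepting state (else None)."""
    def recoverable(joint):
        seen, todo = {joint}, [joint]
        while todo:
            q = todo.pop()
            if all(s in d["accepting"] for s, d in zip(q, dfas)):
                return True
            for x in alphabet:
                r = tuple(d["delta"][(s, x)] for s, d in zip(q, dfas))
                if r not in seen:
                    seen.add(r)
                    todo.append(r)
        return False
    state = tuple(d["start"] for d in dfas)
    for i, ev in enumerate(trace, 1):
        state = tuple(d["delta"][(s, ev)] for s, d in zip(state, dfas))
        if not recoverable(state):
            return i
    return None

A = {"$", "p", "u"}
# ChainResponse($, p): right after every $, the next event must be p.
chain = {"start": "c0", "accepting": {"c0"},
         "delta": {("c0", "$"): "c1", ("c0", "p"): "c0", ("c0", "u"): "c0",
                   ("c1", "$"): "cx", ("c1", "p"): "c0", ("c1", "u"): "cx",
                   ("cx", "$"): "cx", ("cx", "p"): "cx", ("cx", "u"): "cx"}}
# AtMostOne(p): p may occur at most once.
amo = {"start": "a0", "accepting": {"a0", "a1"},
       "delta": {("a0", "p"): "a1", ("a0", "$"): "a0", ("a0", "u"): "a0",
                 ("a1", "p"): "ax", ("a1", "$"): "a1", ("a1", "u"): "a1",
                 ("ax", "p"): "ax", ("ax", "$"): "ax", ("ax", "u"): "ax"}}
```

On the trace of Example 16, the product becomes unrecoverable after the fourth event (the second payment, i.e., the example's step 5), even though each local automaton in isolation still admits a satisfying continuation at that point.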

Fig. 12.

The colored global automaton obtained as the (colored) cross-product of the constraint automata in Fig. 10, as shown in Fig. 6(c); its states are decorated with the four \(\textsc {RV}\hbox {-}\textsc {LTL}\) truth values.

Example 17

Figure 12 shows the colored global automaton of the Declare specification in Fig. 11. By playing the trace \(\langle \textsf {\$}, \textsf {p}, \textsf {u}, \textsf {\$}, \textsf {p}\rangle \) therein, we obtain the following run: \(s_0\xrightarrow { \textsf {\$}} s_1\xrightarrow { \textsf {p}} s_4\xrightarrow { \textsf {u}} s_8\xrightarrow { \textsf {\$}} s_{12}\xrightarrow { \textsf {p}} s_{12} \). Clearly, the violation state \(s_{12}\) is already reached in step 5, i.e., just after the second payment.    \(\triangleleft \)

All in all, we can then monitor an evolving trace against a Declare specification as follows:

  • Each constraint is encoded into the corresponding colored local automaton, used to track the state evolution of the constraint itself.

  • The whole specification is encoded into the corresponding colored global automaton, used to track the evolution of the whole specification and, in particular, to detect violations early.

  • At runtime, every new event occurrence is delivered in parallel to all the automata, updating each of them by executing the corresponding transition and entering into the next state, at the same time returning the associated \(\textsc {RV}\hbox {-}\textsc {LTL}\) label.

Figure 13 shows the result of applying this technique to our running example.

An alternative approach, exploited in [64], is to compute, as done before, the global automaton as the cross-product of the local automata, remembering in each global state the RV-LTL labels of all the local states from which that global state has been produced. In addition, no minimization step is applied to the resulting automaton. Once colored, this non-minimized global automaton combines in a single device the contribution of all local monitors and that of the global monitor.

5.3 A Note on Conformance Checking

In this section, we have focused on monitoring evolving traces against Declare specifications. This can be seen as a form of online conformance checking, aiming at detecting deviations at execution time. The technique can be seamlessly lifted to the standard conformance checking task, where conformance is evaluated over an event log containing full traces of already completed process executions (cf. [16]). In this setting, the global automaton is no longer needed: a posteriori, it is not relevant to compute the earliest moment of a violation, but only to properly detect it at the trace level. Using the local automata, one per constraint, is enough, and also has the advantage of producing informative feedback that indicates, trace by trace, how many (and which) constraints are satisfied or violated. Finer-grained feedback, such as that based on the computation of trace alignments, has been extensively applied to procedural models (cf. [16]) and can be recast in the declarative setting as well, aligning the log traces with the (closest) model traces accepted by the global automaton of the Declare specification of interest. This is an active line of research, which started from the seminal approach in [31].

Fig. 13.

Monitoring with local and global colored automata, showing a case where the global automaton detects a violation before it actually manifests on a single constraint.

6 Recent Advances and Outlook

We close this chapter by reporting about the most recent advances in the field of declarative process mining revolving around Declare, describing the current frontier of research, and highlighting open challenges.

6.1 Beyond Declare Patterns

As we have seen in Sect. 3, a Declare specification consists of a repertoire of constraint templates grounded on specific activities. At the same time, such templates come with a logic-based semantics given in terms of \(\textsc {LTL}_f\). A natural question is then: can the techniques described in this chapter be used for the entire \(\textsc {LTL}_f\) logic? This means, more precisely, considering the situation where each constraint corresponds to an arbitrary \(\textsc {LTL}_f\) formula while, as usual, the specification formula is constructed by putting in conjunction the \(\textsc {LTL}_f\) formulae of all its constituting constraints.

To answer this question, one has to separate the logical and pragmatic aspects involved in the different tasks we have been introducing. We do so focusing on reasoning, discovery, and monitoring.

Reasoning. As discussed in Sect. 4.2, all the reasoning tasks we have considered in this chapter can be lifted to the whole \(\textsc {LTL}_f\) logic. Indeed, they are reduced to \(\textsc {LTL}_f\) satisfiability/validity checking, which in turn can be tackled by checking (non-)emptiness of FSAs. The situation may change if one wants to provide more advanced debugging or diagnosis functionalities – for example, to return the most relevant conflicting set(s) of constraints that are causing inconsistencies or dead activities. While these types of problem can also be attacked at the level of the entire logic [25, 79], focusing only on pre-defined patterns becomes necessary if one wants to involve humans in the loop or define preferences over constraints in the case where multiple explanations exist [25]. Considering specific patterns is also relevant when studying the computational complexity of reasoning on pattern combinations [44, 45, 91], or the scalability and effectiveness of reasoning tools [44, 45, 71, 97].

Discovery. As pointed out in Sect. 5.1, two distinct process discovery problems are typically tackled in a declarative setting: discriminative discovery and specification mining.

The case of discriminative discovery is tightly related to classification and machine learning, allowing one to rely on general learning algorithms for declarative process mining. Such algorithms tackle general logical frameworks, such as Horn clauses in inductive logic programming or full temporal logics in model learning, and can thus go far beyond a pre-defined set of templates, either targeting full \(\textsc {LTL}_f\) [15, 82] or enriching the discoverable Declare templates with further key dimensions, such as metric temporal constraints, event attributes, and data conditions [21, 23].

As shown in Sect. 5.1, standard discovery stands as a radically different problem, since the input event log provides a uniform set of (positive) examples, while no negative example is given. This calls for suitable metrics to measure how well a set of constraints characterizes the behaviour contained in the log. In the approach described in this chapter, such metrics are defined starting from the notions of constraint activation and target, which are template-specific. Attempts have been conducted to lift some of these notions (in particular that of activation and “relevant” satisfaction [39]) to full \(\textsc {LTL}_f\), but further research is needed to target the discovery of arbitrary \(\textsc {LTL}_f\) formulae from event logs. Notice that while full \(\textsc {LTL}_f\) discovery would enrich the expressiveness of the discovered specifications, it would on the other hand pose the issue of understandability: end users may struggle when confronted with arbitrary temporal formulae, while they are facilitated when pre-defined templates are used.

Monitoring. As we have discussed in Sect. 5.2, Declare monitoring is tackled using automata, and consequently seamlessly works for arbitrary \(\textsc {LTL}_f\) formulae. As for advanced debugging techniques, the same considerations made for reasoning also hold for monitoring. For example, the detection of minimal conflicting sets of constraints upon early detection of violations caused by the interplay of multiple constraints can be tamed at the level of the full logic [66], but requires focusing on patterns if one wants to formulate preferences or incorporate human feedback [25].

Remarkably, working with FSAs allows us to define monitors for temporal formulae that go even beyond \(\textsc {LTL}_f\). In fact, \(\textsc {LTL}_f\) is as expressive as star-free regular expressions, while automata are able to capture full regular expressions and, in turn, finite-trace temporal logics incorporating in a single formalism \(\textsc {LTL}_f\) and regular expressions, such as Linear Dynamic Logic over finite traces (\(\textsc {LDL}_f\)) [30]. Working with \(\textsc {LDL}_f\) in our setting has the specific advantage that we can express and monitor metaconstraints, that is, constraints that predicate on the \(\textsc {RV}\hbox {-}\textsc {LTL}\) truth values of other constraints [27, 28].

6.2 Dealing with Uncertainty

In the conventional definition of a Declare specification, constraints are interpreted as being certain: every model trace is expected to satisfy all constraints contained in the specification. Such an interpretation is too restrictive in scenarios where the specification should accommodate:

  • constraints describing common behaviours, expected to hold in the majority, but not all cases;

  • constraints describing exceptional, outlier behaviours that rarely occur but should not be judged as violating the specification.

To deal with this form of uncertainty, Declare has been recently extended with probabilistic constraints [62]. In this framework, every probabilistic constraint comes with:

  • a constraint formula \(\varphi \) (specified, as in the standard case, using \(\textsc {LTL}_f\));

  • a comparison operator \(\odot \in \{=,\ne ,<,\le ,>,\ge \}\);

  • a number \(p \in [0,1]\).

The interpretation of this constraint is that \(\varphi \) holds in a random trace generated by the process with a probability that is \(\odot p\). In frequentist terms, this can be in turn interpreted as follows: given a log of the process, the ratio of traces satisfying \(\varphi \) must be \(\odot p\).
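Under the frequentist reading, checking a probabilistic constraint against a log reduces to comparing a ratio. The helper below is our own sketch (not the machinery of [62]); it treats the constraint formula as a boolean predicate over traces.

```python
import operator

OPS = {"=": operator.eq, "!=": operator.ne, "<": operator.lt,
       "<=": operator.le, ">": operator.gt, ">=": operator.ge}

def prob_constraint_holds(log, satisfies, op, p):
    """log: list of (trace, multiplicity); satisfies: trace -> bool.
    The probabilistic constraint holds iff the ratio of traces satisfying
    the constraint formula stands in relation `op` to p."""
    total = sum(n for _, n in log)
    ratio = sum(n for t, n in log if satisfies(t)) / total
    return OPS[op](ratio, p)
```

For instance, on a log where 8 traces out of 10 satisfy the formula, the constraint holds with \(\ge 0.8\) and \(= 0.8\), but not with \(> 0.8\).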

Since a Declare specification contains multiple constraints, one has to consider how different probabilistic constraints interact with each other. In particular, n probabilistic constraints yield up to \(2^n\) possible so-called scenarios, each indicating which probabilistic constraints hold and which are violated. Reasoning over such scenarios has to be conducted by suitably mixing their temporal and probabilistic dimensions: the former determines which combinations of constraints and their violations (i.e., which scenarios) are consistent, while the latter lifts the probability conditions attached to single constraints to discrete probability distributions over the possible scenarios.
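To make this interplay concrete, consider two probabilistic constraints with exact probabilities \(p_1\) and \(p_2\), and suppose temporal reasoning shows the scenario violating both to be inconsistent (hence of probability 0). The remaining linear system then has a unique solution. The computation below is a worked illustration of our own under that assumption, not the general procedure of [61].

```python
from fractions import Fraction

def scenario_distribution(p1, p2):
    """Scenarios over constraints {c1, c2}: both hold, only c1, only c2, none.
    Linear system: x_both + x_only1 = p1;  x_both + x_only2 = p2;
    the four unknowns sum to 1; x_none = 0 (assumed temporally inconsistent).
    Solving yields the unique distribution below."""
    dist = {
        "both": p1 + p2 - 1,
        "only c1": 1 - p2,
        "only c2": 1 - p1,
        "none": Fraction(0),
    }
    if any(v < 0 for v in dist.values()):
        raise ValueError("no probability distribution fits these constraints")
    return dist
```

With \(p_1 = 4/5\) and \(p_2 = 7/10\), the consistent scenarios receive probabilities 1/2, 3/10, and 1/5. In general, n constraints yield up to \(2^n\) unknowns, one linear equation per constraint plus normalization, with logically inconsistent scenarios fixed to probability 0 before solving.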

To carry out this form of combined reasoning, probabilistic constraints are formalized in a well-behaved fragment of the logic introduced in [61]. As it turns out, logical and probabilistic reasoning are loosely coupled in this fragment, and can be carried out resorting to standard finite-state automata and systems of linear inequalities. This approach has been used as the basis for defining a new family of probabilistic declarative process mining techniques [6].

6.3 Mixed-Paradigm Models

In Fig. 1, we have intuitively contrasted declarative specifications and imperative models. The distinction between these two approaches is, in reality, not so crisp. In fact, a single process may contain parts that are more suitably captured using imperative languages, and parts that are better described as declarative specifications. Take, for instance, a clinical guideline mixing administrative and therapeutic subprocesses [73].

To capture such hybrid processes, one needs a multi-paradigm approach that can combine imperative and declarative constructs in a single process model. One of the first proposals doing so is [85], where an imperative process can contain activities that are internally structured using so-called pockets of flexibility specified using declarative temporal constraints over a given set of tasks.

This layered approach has been further developed in [90], which brings forward a hierarchical model where each sub-process can be specified either as an imperative or declarative component. Discovery of hierarchical hybrid process models has been subsequently tackled in [87].

Multi-paradigm approaches providing a tighter integration between imperative and declarative components have also been studied. In [33], process models combining Petri nets and Declare constraints at the same modelling level are introduced and studied, singling out methodologies and techniques to handle the intertwined state space emerging from their interaction. Conformance checking for these mixed-paradigm models is extensively assessed in [95]. A different approach is brought forward in [5], where a Declare specification is used to express global constraints that “glue together” multiple imperative processes concurrently executed over the same instances. Automata-based techniques extending those illustrated in Sect. 5.2 are introduced to provide integrated monitoring functionalities dealing at once with the local processes and the global constraints.

At the current stage, further research is needed along the illustrated lines towards a solid theory and corresponding algorithmic techniques for hybrid, mixed-paradigm process mining.

6.4 Multi-perspective Declare Specifications

Throughout the chapter, we have considered pure control-flow specifications, where a process is captured solely in terms of its constitutive activities and of behavioural constraints separating legal from undesired executions. While the control-flow provides the main process backbone, other equally important perspectives should also be taken into account as suggested already in [1]:

  • The resource perspective deals with the actors that are responsible for executing tasks within the process.

  • The time perspective focusses on quantitative temporal conditions on when tasks can/must be scheduled and executed, and on their expected durations.

  • The data perspective captures how data objects and their attributes influence and are manipulated during the process execution.

Several works have investigated the extension of Declare with additional perspectives. From the formal point of view, this requires extending the logic-based formalization of Declare with features that can capture resources, metric time, data, and conditions thereof, in turn resorting to variants of metric and/or first-order formalisms over finite traces [10, 14, 69, 74]. It is important to stress that the boundaries between such features may be blurred, considering that data support (if equipped with suitable datatypes and conditions) may be used to predicate over resources and time as well.

Such multi-perspective features have been extensively embedded into Declare or related approaches (see, for example, [13, 69, 98] for constraints with metric time and [42] for constraints with metric time and resources). Next, we focus in more detail on the data dimension.

When it comes to data, two main lines of research can be identified. The first one deals with standard “case-centric” processes extended with event and case data. The second one focuses instead on “multi-case” processes, wherein constraints are expressed over multiple objects and their mutual relations. We briefly discuss each line separately.

Declarative Process Specifications with Event/Case Data. Within a process, activities may be equipped with data attributes that, at execution time, are grounded to actual data values by the involved resources. This means that events witnessing the occurrence of task instances come with a data payload. In addition, each process instance may evolve its own case data in response to the execution of activities. Such case data may be stored in different ways, e.g., as key-value pairs or a full-fledged relational database. In this setting, it becomes crucial to extend Declare with so-called data-aware constraints, that is, constraints enriched with data-aware conditions over activities. The simple but illustrative example described next motivates why this is needed.

Example 18

We focus on a process where payments are issued by customers through a \( \textsf {pay}\) activity, which comes with an attribute indicating the paid amount, in Euros. Two subsequent activities \( \textsf {check}\) and \( \textsf {emit}\) are executed to respectively inspect a payment and emit a receipt.

Let a log for this process contain multiple repetitions of the following traces:

$$\begin{aligned} t_{1}&= \langle \textsf {pay(amount=50)}, \textsf {emit} \rangle&t_{2}&= \langle \textsf {pay(amount=300)}, \textsf {check}, \textsf {emit} \rangle \\ t_{3}&= \langle \textsf {pay(amount=20)} \rangle&t_{4}&= \langle \textsf {pay(amount=100)}, \textsf {emit}, \textsf {check} \rangle \\ t_{5}&= \langle \textsf {pay(amount=90)}, \textsf {emit} \rangle&t_{6}&= \langle \textsf {pay(amount=800)}, \textsf {check}\rangle \\ \end{aligned}$$

One may wonder whether \(\textsc {Response}( \textsf {pay}, \textsf {check})\) is a suitable constraint to explain (part of) the behaviour contained in the log. If considered unrestrictedly, this is not the case, as there are many traces where payment is not followed by any inspection. The situation changes completely if one restricts the scope of the constraint activation only to those payments that involve an amount of 100 or more.    \(\triangleleft \)
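To make the role of the activation condition concrete, the following minimal sketch checks a data-aware variant of \(\textsc {Response}( \textsf {pay}, \textsf {check})\) over the log of Example 18. The encoding of traces as (activity, payload) pairs and the function name are our own illustrative assumptions, not a fragment of an actual data-aware Declare implementation.

```python
def holds_data_aware_response(trace, threshold=100):
    """True iff every pay event with amount >= threshold (an activation)
    is eventually followed by a check event in the same trace."""
    for i, (activity, payload) in enumerate(trace):
        if activity == "pay" and payload.get("amount", 0) >= threshold:
            # The activated constraint demands a later check event.
            if not any(act == "check" for act, _ in trace[i + 1:]):
                return False
    return True

# The log of Example 18: traces t1..t6 as lists of (activity, payload) pairs.
log = [
    [("pay", {"amount": 50}), ("emit", {})],                   # t1
    [("pay", {"amount": 300}), ("check", {}), ("emit", {})],   # t2
    [("pay", {"amount": 20})],                                 # t3
    [("pay", {"amount": 100}), ("emit", {}), ("check", {})],   # t4
    [("pay", {"amount": 90}), ("emit", {})],                   # t5
    [("pay", {"amount": 800}), ("check", {})],                 # t6
]

# Restricted to payments of 100 or more, the constraint holds on every trace.
print(all(holds_data_aware_response(t) for t in log))               # True
# Considered unrestrictedly (threshold 0), it is violated, e.g., on t1.
print(holds_data_aware_response(log[0], threshold=0))               # False
```

The sketch mirrors the observation of the example: the unrestricted constraint is violated by t1, t3, and t5, whereas the data-aware activation condition singles out exactly the behaviour present in the log.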

A number of works have brought forward combined techniques to discover Declare constraints equipped with various forms of data conditions [54, 60, 86], to check conformance for data-aware constraints [12, 13], and to handle their monitoring [5, 69]. This extension has to be carried out with extreme care, as combining event data and time quickly leads to undecidability of reasoning [14, 34, 35]. Therefore, such techniques have to operate in a limited fashion, or suitably control the expressiveness of data conditions and the way they interact with time.

Object-Centric Declarative Process Specifications. So far, we have discussed the extension of Declare with event or case data. In a more general setting, data may refer to more complex networks of objects and their mutual relations, simultaneously co-evolved by one or multiple processes. In this type of process, known under the umbrella term of object-centric processes, there is no single, pre-defined notion of case; process executions consequently cannot be represented as flat traces, but call for richer representations (cf. [43]). The following example illustrates why Declare, in its conventional version, cannot be used to capture object-centric processes.

Fig. 14. Comparison of conventional vs object-centric Declare.

Example 19

Consider the fragment of an order-to-cash process, containing three activities: \( \textsf {sign}\) (indicating the signature of a GDPR form by the customer), \( \textsf {open}\) (the opening of an order), and \( \textsf {close}\) (the closing of an order). Two constraints apply to \( \textsf {close}\), defining under which conditions it becomes executable:

  • An order can be closed only if that order has been opened before.

  • An order can be closed only if its owner has signed the consent before.

Figure 14(a) shows how these two constraints can be captured in conventional Declare. This specification is satisfactory only in the case where each trace refers to a single customer and a single order by that customer. For example, consider the following two traces, respectively referring to an order \(o_1\) by Anne, and an order \(o_2\) by Bob:

$$\begin{aligned} t_{1}&= \langle \textsf {sign}, \textsf {open}, \textsf {close} \rangle&t_{2}&= \langle \textsf {open}, \textsf {close}, \textsf {sign} \rangle \\ \end{aligned}$$

Clearly, \(t_{1}\) is a model trace, while \(t_{2}\) is not, as the latter violates \({\textsc {Precedence}}( \textsf {sign}, \textsf {close})\).

However, one may need to consider multiple orders owned by the same or distinct customers, in the common situation where distinct orders may be later bundled together to handle their shipment. In our example, assuming that \(o_1\) and \(o_2\) are later bundled together in a shipment, this would require combining \(t_{1}\) and \(t_{2}\) into a single object-centric trace, suitably extending each event with a reference to the object(s) it operates on. Suppose this would result in:

$$ t= \left\langle \begin{array}{l} \textsf {sign(customer=Anne)}, \textsf {open(order=o2)}, \textsf {open(order=o1)}, \\ \textsf {close(order=o1)}, \textsf {close(order=o2)}, \textsf {sign(customer=Bob)} \end{array} \right\rangle $$

The Declare specification of Fig. 14(a) now becomes inadequate. In fact, it cannot distinguish which events actually co-refer to one another and which do not, so it cannot identify that the signature by Anne refers to the first occurrence of \( \textsf {close}\), but not to the second one. Hence, it wrongly uses the first occurrence of \( \textsf {sign}\) to satisfy \({\textsc {Precedence}}( \textsf {sign}, \textsf {close})\) for both orders.    \(\triangleleft \)

Fixing the issue described in Example 19 requires explicitly extending Declare with the ability to express how events relate to objects and how objects relate to each other, and in turn to scope the application of constraints, expressing that they must be enforced over events that suitably co-refer to each other, either because they operate on the same object or because they operate on related objects. In our running example, this would call for the following actions:

  • introduce the classes of Order and Customer;

  • capture that there is a many-to-one owned by association linking orders to customers;

  • indicate that \( \textsf {sign}\) refers to a customer, and that \( \textsf {open}\) and \( \textsf {close}\) refer to an order;

  • scope \({\textsc {Precedence}}( \textsf {open}, \textsf {close})\) by enforcing that the two involved activities must co-refer to the same order (i.e., that an event of activity \( \textsf {close}\) for order o can only occur if an event of activity \( \textsf {open}\) has previously occurred for the same order);

  • scope \({\textsc {Precedence}}( \textsf {sign}, \textsf {close})\) by enforcing that the two involved activities must respectively operate with a customer and an order that co-refer through the owned by association (i.e., that an event of activity \( \textsf {close}\) for order o can only occur if an event of activity \( \textsf {sign}\) has previously occurred for the customer who owns o).
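The scoped constraints above can be sketched programmatically as follows. This is an illustrative approximation of co-reference scoping, not the OCBC semantics: the owned_by dictionary encodes the owned by association, and the encoding of events as (activity, references) pairs is assumed for the sake of the example.

```python
def check_scoped_precedences(trace, owned_by):
    """Evaluate the two scoped Precedence constraints on an object-centric
    trace: (open-before-close per order, owner-signed-before-close)."""
    opened, signed = set(), set()
    open_ok, sign_ok = True, True
    for activity, refs in trace:
        if activity == "open":
            opened.add(refs["order"])
        elif activity == "sign":
            signed.add(refs["customer"])
        elif activity == "close":
            # Precedence(open, close) scoped on the same order.
            open_ok = open_ok and refs["order"] in opened
            # Precedence(sign, close) scoped via the owned-by association.
            sign_ok = sign_ok and owned_by[refs["order"]] in signed
    return open_ok, sign_ok

owned_by = {"o1": "Anne", "o2": "Bob"}  # hypothetical owned-by association

# The object-centric trace t of Example 19.
t = [("sign", {"customer": "Anne"}), ("open", {"order": "o2"}),
     ("open", {"order": "o1"}), ("close", {"order": "o1"}),
     ("close", {"order": "o2"}), ("sign", {"customer": "Bob"})]

print(check_scoped_precedences(t, owned_by))  # (True, False)
```

Once scoped, the trace correctly satisfies \({\textsc {Precedence}}( \textsf {open}, \textsf {close})\) for both orders, while \({\textsc {Precedence}}( \textsf {sign}, \textsf {close})\) is violated: \(o_2\) is closed before its owner Bob has signed, which the unscoped specification of Fig. 14(a) fails to detect.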

Object-centric behavioral constraints (OCBC) [93] have been brought forward to handle this type of scoping through the integration of Declare specifications and UML class diagrams. Figure 14(b) shows the OCBC specification correctly capturing the constraints of Example 19. The approach is still in its infancy: first seminal works have been conducted to handle the discovery of OCBC specifications from object-centric event logs recording full database transactions [55], and to formalize and reason upon OCBC specifications through temporal description logics [7]. Further research is being carried out to improve the performance of discovery and frame it in the context of object-centric event logs of the form of [1], and to tackle conformance checking and monitoring. This is particularly challenging, as integrating temporal constraints with data models quickly leads to undecidability [7].

7 Conclusion

Throughout this chapter, we have thoroughly reviewed the declarative approach to process specification and mining. The declarative approach aims at limiting the process behavior by defining the boundaries within which its executions can unfold, yet leaving process executors free to explore at runtime which specific executions are generated. This is in contrast with the imperative approach, where process models compactly depict all and only those traces that are admissible. In fact, notice that different (imperative) process models can comply with the same declarative specification, just like different dynamic systems can satisfy (\(\models \)) the same set of temporal rules. In the chapter, we have grounded our discussion on the Declare language, but the introduced concepts are broad enough to be seamlessly applicable to other related approaches.

Specifically, we have first discussed how declarative process specifications can be formalized using Linear Temporal Logic on Finite Traces (\(\textsc {LTL}_f\)), and in turn operationally characterized in terms of finite state automata (FSAs) capturing their execution semantics. On this solid formal ground, we have examined the core reasoning tasks that relate to declarative specifications and then delved deeper into the discovery and monitoring of processes according to the declarative paradigm. Interestingly, we have observed that the reasoning tasks are pervasive in all stages of declarative process mining, such as within discovery to avoid producing redundant or inconsistent outputs, and within monitoring to speculatively consider the possible future continuations of the monitored execution. In the last part of the chapter, we have provided a summary of the most recent advances in declarative process mining, focusing in particular on: (i) the applicability of declarative process mining techniques and concepts to full temporal logics, going beyond pre-defined patterns; (ii) the incorporation of uncertainty within constraints; (iii) the analysis of hybrid models integrating imperative and declarative fragments; (iv) multi-perspective constraints incorporating additional dimensions beyond the control-flow, and supporting the declarative specification of object-centric (multi-case) processes. This bird's-eye view provides a fair account of the open research challenges in declarative process mining.