1 Introduction

Knowledge workers are responsible for making decisions and analyzing information in the knowledge-intensive processes they perform [7, 16]. Decisions for cases (process instances) are made in the context of business entities such as loan requests, price quotes or maintenance orders. Progress in such decision-intensive processes relies on the available information. Knowledge workers prepare decisions by performing tasks in which information is gathered. The abundance of data, combined with the rise of data analytics techniques, has increased the amount of information that is potentially available.

A knowledge worker typically has the discretionary power to perform or skip an information-gathering task, which requires a decision on their side. While performing an information-gathering task typically improves the quality of the final decision, it also increases the throughput time of the process and the costs incurred to reach the decision. For instance, a mortgage expert can gain more in-depth information about the client by assessing his or her risk level, but doing so may take a day and involve costs for obtaining a report from an outside organization. Therefore, knowledge workers continually need to make trade-offs between gathering more information and deciding that the available information suffices for a reliable decision (cf. Fig. 1). Advanced support for such decision-intensive processes is currently lacking in the scientific literature [12].

Fig. 1. Decision-intensive processes

We introduce decision support for knowledge workers based on artifact-centric process models, a technique that combines data and process aspects in a holistic way in order to model decision-intensive processes [17, 24]. In particular, we focus on declarative business artifact-centric process models, which offer additional flexibility in performing processes by specifying constraints as rules rather than sequence relations and by allowing external events to influence the process. We model the artifact-centric process using the Guard-Stage-Milestone (GSM) formalism [5, 13]. GSM is well defined and is one of the base models that were used to develop the OMG Case Management Model and Notation (CMMN) [3]. We slightly extend GSM by introducing discretionary decision events to support the approach.

In GSM schemas for decision-intensive processes [24], information is gathered and computed, i.e., data attributes are written. Initially, only a few attributes are known, such as the name of the client; other attributes such as salary and age are initially unknown, but estimates can be derived from previous mortgage requests. The main issue for knowledge workers is to decide whether they have collected sufficient information to make the final decision, or whether they need to retrieve more information, i.e., replace initial estimates with real values at a certain cost, to improve the quality of the final decision. Additionally, if more information is needed, the knowledge worker needs to decide which data attribute(s) to retrieve next.

To guide the knowledge worker in this decision, we introduce the novel concept of an information structure for decision-intensive processes. An information structure estimates the retrieved level of information, denoted as the quantity of interest, based on retrieved sources of information and estimates for non-retrieved information. For instance, an information structure for mortgages could estimate the creditworthiness of the clients. This estimate is updated as more information is retrieved about the client. The information structure is used to specify the recommended outcome of the final decision, but the actual outcome is decided by the knowledge worker.

To support decision making in declarative artifact-centric processes, we use a probabilistic optimization model: the Markov Decision Process (MDP) [20]. This model recommends the best decision based on the information that is currently available, in an environment where everything else is uncertain and only probabilities are known. The real outcome of a decision is uncertain, but we can make predictions based on the probability distributions of the possible future decisions. We contribute by defining an approach for mapping a GSM schema to an MDP. The MDP incorporates the structure of the artifact process, and the information structure is used to define conditions on when the MDP should terminate. By modelling the uncertainty explicitly with probabilities, we improve the reliability of the decision support and offer user guidance compared to other techniques such as simulation [21] and fuzzy modeling [8], as we can derive expected costs from these probabilities.

The remainder of this paper is structured as follows. Section 2 gives preliminaries on GSM schemas and MDPs. Section 3 introduces the notion of the information structure. Section 4 discusses the mapping from GSM schemas to MDPs. Section 5 discusses related work. Section 6 discusses the approach, concludes the paper and gives future research topics. Due to space limitations, the formal translation from GSM schemas to MDPs is not provided here, but in an online appendix [26].

2 Preliminaries

2.1 Guard-Stage-Milestone Schemas

In this section, we formally define the GSM schemas used in this paper. We consider a lightweight variant compared to the classical GSM schemas [5, 13]. In particular, the GSM schemas in this paper are without hierarchy and have monotonic executions [9]. Using such a lightweight GSM schema variant allows us to better highlight the key aspects of the translation. In future work, we can relax these assumptions as, for example, adding hierarchy is orthogonal to the developed decision support.

We informally introduce GSM schemas by means of an example in Fig. 2. Rounded rectangles represent stages, in which work is performed. Open circles denote milestones, which are business objectives typically achieved by performing work in stages. Diamonds indicate sentries, which are rules that specify when a stage is opened. Milestones also have sentries that specify when milestones are achieved, but these are not visualized. Table 1 contains the sentries (guards) for the stages and the sentries of the milestones of \(\varGamma \).

Departing from original GSM notation [13], we explicitly visualize external events using the bull’s eye symbol. Named external events (prefix E:) are generated by the environment and not under the control of the knowledge worker performing this process, while decision events (prefix D:) are controlled and generated by the knowledge worker. Rectangles denote data attributes, written in the stages to which they are connected.

The GSM schema models a mortgage process. The input is a request for issuing a mortgage of a certain value to a customer. A mortgage expert decides whether or not a mortgage of a certain amount is issued. To prepare the decision, the expert can gather information by retrieving a set of criteria that predict the customer’s creditworthiness. In this case, we consider a set of four criteria: salary, outstanding debts, employment contract and age. The mortgage expert decides which criteria to check in which order and also decides on the final outcome of the process. Once the mortgage request arrives, the salary is immediately checked based on a process constraint. After this process step has completed, the other three stages can be opened at the discretion of the mortgage expert. The ‘Make Decision’ stage can be opened at any time, once the mortgage expert has collected sufficient information and can make a decision to accept or reject the mortgage request.

Fig. 2. Mortgage decision process

Table 1. Sentries for the Mortgage decision process

To keep track of all business-relevant information about an artifact instance as it moves through its lifecycle, GSM schemas use data attributes and status attributes. Data attributes store information on an artifact instance and can be of any data type, for instance, integer or string. In our decision making process we assume that a task in an atomic stage yields new information that is stored in one or more data attribute(s). The status attributes are boolean attributes that keep track of the GSM lifecycle of the artifact instance; a stage (milestone) is true when it is open (has been achieved).

Each sentry is triggered by an event and has a condition that references attributes of the GSM schema. We assume a condition language \(\mathcal{C}\) that includes predicates over scalars and boolean connectives between all attributes in the model. In the event part, we distinguish between external events (prefix E:), decision events (prefix D:) and completion events (prefix C:). External events come from the environment of the process, typically a customer, but not the knowledge worker. Decision events are generated by knowledge workers to start a discretionary activity; knowledge workers have the freedom and authority to perform or skip such an activity. Decision events are not used in classical GSM schemas [13], but are introduced here to model discretionary activities. Finally, completion events signal the completion of atomic stages.

Definition 1 (GSM schema)

A GSM schema is a tuple \(\varGamma = (\mathcal{A}= \mathcal{D}\cup \mathcal{S}\cup \mathcal{M}, wt, time, \mathcal{E}=\mathcal{E}_{cmp}\cup \mathcal{E}_{ext}\cup \mathcal{E}_{dec}, \mathcal{R})\), where

  • \(\mathcal{A}\) is the set of attributes containing the data attributes \(\mathcal{D}\), the stage attributes \(\mathcal{S}\), and milestone attributes \(\mathcal{M}\);

  • \(wt: \mathcal{S}\rightarrow \mathcal {P}(\mathcal{D})\) is a function that specifies for each stage the set of data attributes written in that stage. We require that distinct stages write disjoint sets of data attributes, i.e., for \(s, s'\in \mathcal{S}\), if \(s\ne s'\) then \( wt(s) \cap wt(s')=\emptyset \);

  • \(time: \mathcal{S}\rightarrow \mathbb {N}\) is a function that assigns to each stage the time needed to complete the stage;

  • \(\mathcal{E}\) is a finite set of external events, consisting of named external events \(\mathcal{E}_{ext}\) (prefixed E:), completion events \(\mathcal{E}_{cmp}=\{\text{ C: }s~|~s \in \mathcal{S}\}\), and decision or discretionary events \(\mathcal{E}_{dec}{}\) (prefixed D:);

  • \(\mathcal{R}\) is a function from \(\mathcal{S}\cup \mathcal{M}\) to a set of sentries (see Definition 2) ranging over all attributes \(\mathcal{A}\), defined in the condition language \(\mathcal{C}\).

Definition 2 (Sentry)

A sentry has the form \(\tau \wedge \gamma \), where \(\tau \) is the event-part and \(\gamma \) the condition-part. The event-part \(\tau \) is either empty (trivially true), an external event \(e \in \mathcal{E}\) or an internal event \(+d\) (d becomes true) or \(-d\) (d becomes false), where \(d \in \mathcal{S}\cup \mathcal{M}\) is a stage or milestone attribute. The condition \(\gamma \) is a Boolean formula in the condition language \(\mathcal{C}\) that refers to \(\mathcal{A}\), so data attributes in \(\mathcal{D}\) and status attributes in \(\mathcal{S}\cup \mathcal{M}\). The condition-part can be omitted if it is equivalent to true.

We also introduce the auxiliary function

$$ tr: \mathcal{E}_{dec}\rightarrow \mathcal {P}({\mathcal{S}}) $$

where \(tr(\text{ D: }n)=\{~s \in \mathcal{S}~|~ \text{ D: }n \text { is trigger event of a sentry of } s ~\}\).
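To make Definitions 1 and 2 concrete, the following sketch encodes a GSM schema and the auxiliary function tr in Python; the class and field names are our own illustration, not part of the formal model.

```python
from dataclasses import dataclass

# A minimal, illustrative encoding of Definition 1 and the auxiliary
# function tr. All names are our own; they are not part of the formal model.
@dataclass
class GSMSchema:
    data_attrs: set    # D: data attributes
    stages: set        # S: stage attributes
    milestones: set    # M: milestone attributes
    wt: dict           # stage -> set of data attributes written in that stage
    time: dict         # stage -> time needed to complete the stage
    sentries: dict     # stage/milestone -> list of (event, condition) sentries

    def tr(self, decision_event):
        """Stages that have the given decision event as trigger of a sentry."""
        return {s for s in self.stages
                if any(ev == decision_event for ev, _ in self.sentries.get(s, []))}
```

For instance, a schema with a single stage triggered by the decision event D:Salary would return that stage from `tr("D:Salary")`.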

At any given point of time, the whole GSM schema is in a specific state, called a snapshot, that is defined by the values of the status attributes and data attributes.

Definition 3 (Snapshot)

For a GSM schema \(\varGamma = (\mathcal{A}, wt, time, \mathcal{E}, \mathcal{R})\) a snapshot is a mapping \(\sigma \) from the attributes in \(\mathcal{A}\) into appropriate attribute values. Initially, all data attributes have value \(\perp \) (unknown) and all stage and milestone attributes have value False.

In response to the occurrence of an event in \(\mathcal{E}\), a snapshot changes into another snapshot by performing a Business step [5, 9]. The event can result in sentries that evaluate to true, which in turn may lead to stages being opened or closed and milestones being achieved. In particular, a stage completion event signals that a task has been completed; the payload of the event carries new values for the data attributes written in the stage. These values are incorporated into the new snapshot.
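A snapshot and a (simplified) business step for a completion event can be sketched as follows; the attribute, stage and milestone names are invented for illustration, and None plays the role of \(\perp \).

```python
# An illustrative snapshot (Definition 3) for the mortgage example: data
# attributes start as None (unknown), status attributes as False.
initial = {
    "salary": None, "debt": None, "contract": None, "age": None, "decision": None,
    "CheckSalary": False,      # stage attribute
    "SalaryChecked": False,    # milestone attribute
}

def complete_stage(snapshot, stage, milestone, payload):
    """Business step for a completion event: incorporate the payload into the
    data attributes, close the stage and achieve its milestone (monotonic
    execution, no hierarchy)."""
    new = dict(snapshot)       # build the successor snapshot
    new.update(payload)
    new[stage] = False
    new[milestone] = True
    return new
```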

An important assumption in this variant of GSM schemas is that completion events happen after a known time period and are not used to open new stages, i.e., they do not occur in guards. This means that we know beforehand how long a certain action will take, so the time of completion cannot affect the decisions. This assumption is made to keep the MDP translation introduced later understandable, but will be relaxed in future work.

2.2 Markov Decision Process (MDP)

In this section, we introduce the concepts used to define an MDP model and give some intuition on how to solve such models. MDPs are used to model dynamic processes in which repeated decisions (also called actions) are taken under uncertainty. Decisions are made in each decision epoch \(t\in \mathbf N \) based on the state st(t) of the system. The state contains the information available in the epoch; S denotes the set of all possible states. The goal is to decide which action to take from the set of allowed actions A(st) in state st, in order to minimize a given cost function. The action determines the direct costs incurred and influences the next state, but does not completely specify it:

Definition 4 (Cost function)

\(C_{a}(st)\) is the cost when starting from state st, and doing action a. In our model this cost does not depend on the next state.

Definition 5 (Transition Probability)

\(P_{a}(st,st')\) is the probability that, after taking action a in state st, we end up in state \(st'\).

The complete action space is denoted by \(\mathbb {A}=\cup _{st\in S} A(st)\). An MDP is then a 4-tuple:

Definition 6 (Markov Decision Process)

A Markov Decision Process is a tuple \(Z = (S, \{A(st)\}_{st \in S}, \{P_a(st,st')\}_{st,st' \in S, a\in A(st)}, \{C_a(st)\}_{st \in S, a\in A(st)})\).

For the MDPs considered in this paper, states that correspond to situations where the final decision has been made are so-called terminating states. After reaching such a state, no further costs are incurred. So effectively, the goal is to minimize the total costs incurred until a terminating state is reached; our MDP has an infinite number of decision epochs only for mathematical convenience.

To minimize total costs, we aim to reach one of the terminating states as cheaply as possible. It may be possible to terminate after performing only a subset of all possible actions, and whether this is possible depends crucially on the outcomes of those actions. Our approach estimates the action that yields the cheapest route to a terminating state. In particular, starting in epoch T, the approach chooses a(t), the action at epoch t, for all \(t \ge T\), to minimize \(V = \mathbb {E}[ \sum _{t=T}^{\infty } C_{a(t)}(st(t))]\). This can be achieved by solving the following set of recursive equations, for all states \(st\in S\), starting from the terminating states [20]:

$$ V^t(st) = \min _{a \in A(st)} C_{a}(st) + \sum _{st' \in S} P_{a}(st,st')(V^{t+1}(st'))$$

A range of algorithms exists to solve the above problem and return the optimal action for the current state [20]. These algorithms should suffice for modestly sized schemas. This paper focuses on translating GSM models into MDPs and leaves a detailed investigation of solving these MDPs (i.e., selecting optimal actions) for future research.
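As an illustration of such an algorithm, the following sketch implements plain value iteration for the recursive equations above; the dictionary layout for P and C is our own choice, not a prescribed interface.

```python
# A sketch of value iteration for the cost-minimisation MDP of Definition 6.
# Terminating states incur no further cost. P is keyed by
# (state, action, next_state), C by (state, action).
def value_iteration(states, actions, P, C, terminating, eps=1e-6):
    V = {st: 0.0 for st in states}
    while True:
        V_new = {}
        for st in states:
            if st in terminating:
                V_new[st] = 0.0   # no costs after termination
            else:
                V_new[st] = min(
                    C[(st, a)]
                    + sum(P.get((st, a, st2), 0.0) * V[st2] for st2 in states)
                    for a in actions[st])
        if max(abs(V_new[st] - V[st]) for st in states) < eps:
            return V_new
        V = V_new
```

For instance, with a single state offering a sure action costing 5 and a cheaper action costing 3 that terminates only half the time, the iteration converges to the optimal expected cost of 5.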

3 Information Structure

The need for an information structure mainly arises because the GSM schema has no single variable that can summarize the current snapshot.

To arrive at such a variable, we note that attributes retrieved in knowledge-intensive processes by definition help to obtain information about a specific property associated with each case. We will refer to this property as the quantity of interest. For example, in a mortgage request the quantity of interest may be the creditworthiness of the client.

3.1 Goal

We need a mathematical model to create this variable and to define the GSM snapshots in which enough information is known to make a reliable final decision.

Definition 7 (Information structure)

Given a set of data attributes \(\mathcal{D}{} = \{X_1,X_2,...X_k\}\), a subset of \(\mathcal{D}{}\) with the retrieved values at snapshot \(\sigma \): \(\sigma (\mathcal{D}{}) = \{X_1 = x_1, X_4 = x_4, ...\}\) and probability distributions \(F_{X_j|X_j \in \mathcal{D}{} \setminus \sigma (\mathcal{D}{})}\) of the non-retrieved values, the information structure is a function \(\hat{y} = f(\sigma (\mathcal{D}{}), F_{X_j})\), which summarizes the known information about the quantity of interest y.

We emphasize that the outcome of the information structure is always an estimate of the real value of y. Therefore, the estimated y is denoted by \(\hat{y}\).

A wide range of methods exist that can yield information structures from past cases, e.g. (non)linear regressions, neural networks, etc. [11, 19].

3.2 Linear Regression

To estimate the correct weighing of the different variables and create an information structure, linear regression has been shown to be an effective method and is therefore used as an example method for the information structure. It uses previous cases to estimate the weight of each individual variable by minimizing the sum of squared errors. This set of previous cases consists of two parts. First, the \(y_i\) variables are the dependent variables for historical case i; in our example this is the eventual creditworthiness. Second, the \(x_{i,j}\) variables are the \(j^{th}\) independent variables in case i. Formally, linear regression is then defined as:

Definition 8 (Linear regression)

Given a set of data variables \(\{y_i,x_{i,1}, x_{i,2},...,x_{i,k}\}_{i=1}^n\), we define the linear regression as \(y_i = \beta _0 + \beta _{1}x_{i,1} + \beta _{2}x_{i,2} + ... + \beta _{k}x_{i,k} + \epsilon _i\), where \(\epsilon _i\) denotes the regression error. By minimizing the sum of the squared errors over all n estimates of y, we obtain the best \(\beta \) values: \(\hat{\beta } = \arg \min _{\beta } \sum _{i=1}^{n} \epsilon _i^2\).

We remark that this method comes with a set of assumptions on the data variables that need to be checked before the estimates are reliable [23]; an example is the need for no or little multicollinearity in the data. The regression error \(\epsilon _i\) captures factors that might not have been recognized as relevant sources of information. Based on the \(\beta \) values or weights derived in the linear regression, we are now able to build a function that can be used as information structure. Given a set of variables or attributes that were retrieved, \(\{x_{1},x_{2},...,x_{k}\}\), we can define:

$$\hat{y} = \beta _0 + \beta _{1}x_{1} + \beta _{2}x_{2} + ... + \beta _{k}x_{k}$$
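As a sketch, the \(\beta \) weights can be fitted with ordinary least squares; the historical cases below are invented toy data, and numpy is assumed to be available.

```python
import numpy as np

# Sketch: fitting the beta weights of Definition 8 with ordinary least
# squares. The historical cases below are invented for illustration.
X = np.array([[20000.0, 0.0, 1.0, 26.0],     # salary, debt, contract, age
              [30000.0, 5000.0, -1.0, 40.0],
              [25000.0, 1000.0, 1.0, 30.0],
              [15000.0, 2000.0, -1.0, 22.0]])
y = np.array([3.8, 2.1, 3.5, 1.9])           # historical creditworthiness

X1 = np.hstack([np.ones((len(X), 1)), X])    # prepend intercept column
beta, *_ = np.linalg.lstsq(X1, y, rcond=None)  # minimises sum of squared errors

def y_hat(salary, debt, contract, age):
    """Information structure: linear estimate of the quantity of interest."""
    return float(beta @ [1.0, salary, debt, contract, age])
```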

Returning to our example, suppose we have a set of attribute values for a customer:

$$\sigma (\mathcal{D}{}) = \sigma (\textsf {salary},\textsf {debt},\textsf {contract},\textsf {age},\textsf {decision}) = \{20000, 0, 1, 26, \perp \}$$

where the last attribute is used to store the final decision and thus depends on the creditworthiness y. Also, \(\textsf {contract} = {\left\{ \begin{array}{ll} 1 &{} \textsf {if}\, \textsf {fixed}\\ -1 &{} \textsf {if}\, \textsf {temporary} \end{array}\right. }\).

Using all previous cases of mortgage issuing to estimate the betas, we get the information structure: \(\hat{y} = \beta _0 + \beta _{salary}\textsf {salary} + \beta _{debt}\textsf {debt} +... \).

Table 2. Beta values

Suppose Table 2 gives the \(\beta \) values that were determined using linear regression. Based on this, the creditworthiness of this customer is estimated at a level of

$$ \hat{y} = \beta _0 + \beta _{salary}\textsf {salary} + \beta _{debt}\textsf {debt} + \beta _{contract}\textsf {contract} + \beta _{age}\textsf {age}$$
$$ \hat{y} = 2 + 0.5 + 0 + 1 + 0.325 = 3.825$$


3.3 Estimating y with Uncertain Variables

In a GSM schema where decisions need to be made during the process, the problem is that not all attribute values are known yet. One then has to work with estimates based on the probability distributions of the attributes, which can, for example, be obtained from previous cases. We want to estimate \(\hat{y}\) using the probability distributions of the non-retrieved variables. The decision rule could, for instance, be defined as: accept if \(P(\hat{y} \ge \theta ) \ge \phi \). Although the values of some attributes might be unknown, the regression estimates of the betas remain valid. Therefore, using the linear function for the quantity of interest, we can still estimate the outcome.

Returning to our example, suppose that the salary, debts and age have already been retrieved, but the type of employment contract is unknown. Furthermore, historically \(60\%\) of employees have a fixed contract, the threshold for the mortgage request is set at \(\theta = \textsf {mortgage}\,\textsf {amount}/50000\), and \(\phi = 0.5\). Using the \(\beta \) estimates from Table 2, we can now estimate the creditworthiness of a customer with a 150000 mortgage request (\(\theta =3\)).

$$\begin{aligned}&P(\hat{y} \ge \theta | \textsf {salary} = 20000, \textsf {debt} = 0, \textsf {age} = 26) \\&= P(\beta _0+ \beta _{salary}20000+ \beta _{debt}0+ \beta _{age}26 + \textsf {contract} \ge 3)\\&= P(\textsf {contract} \ge 0.175) \end{aligned}$$

Because contract can only take values 1 and \(-1\) and the probability of a fixed contract was said to be 0.6, this results in \(P(\hat{y} \ge 3|st) = 0.6\). Therefore, since the rule is adopted that \(P(\hat{y} \ge \theta )>0.5\) suffices to accept the request, the knowledge worker could now be advised to abort the retrieval of contract information and to accept the request. (Never mind that banks are in reality typically not so forthcoming, and would likely use higher values of \(\phi \).)
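The computation in this example can be sketched as follows, using the \(\beta \) values implied by Table 2 (in particular \(\beta _{contract}=1\)); the function name and its interface are our own illustration.

```python
# Reproducing the worked example with the beta values implied by the text
# (beta_0 = 2, beta_salary = 0.5/20000, beta_contract = 1, beta_age = 0.325/26;
# the debt term is 0 for this customer). P(fixed contract) = 0.6.
BETA0, B_SAL, B_CON, B_AGE = 2.0, 0.5 / 20000, 1.0, 0.325 / 26

def p_accept(salary, age, theta, p_fixed=0.6):
    """P(y_hat >= theta) when only the contract attribute is still unknown."""
    known = BETA0 + B_SAL * salary + B_AGE * age
    need = (theta - known) / B_CON   # contract value required to reach theta
    if need <= -1:
        return 1.0       # even a temporary contract (-1) suffices
    elif need <= 1:
        return p_fixed   # only a fixed contract (+1) suffices
    return 0.0           # no contract value reaches theta
```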

4 Decision Support for GSM Schemas with Uncertainty

In this section we discuss the implementation of MDP support in a GSM environment. First, we introduce a framework that links the two different concepts and gives an idea of how the new process would include decision support. Second, we introduce a more formal explanation of translating the GSM notation to the MDP model. The translation is formally introduced in an online appendix [26]. Finally, based on the introduced problem of deciding on a mortgage request, we give an example of how the process could run.

4.1 Framework

Figure 3 shows the framework of our solution approach. The figure only shows the GSM components that affect the MDP model and are used when creating a decision event. The GSM part of the figure works according to the rules explained earlier: an instance of the schema is in a current snapshot, and by performing business steps stages are opened and closed, leading to new snapshots. Suppose we are in a snapshot where one of the possible actions is a decision event. The knowledge worker now has to decide whether to initiate this event and asks for support. Using the information structure, the MDP model estimates the effects of the different available options. The information structure translates the snapshot of a GSM schema into a variable that indicates the progress of the overall process. Based on this and the MDP state, the set of possible actions is determined. Using an MDP optimization algorithm, we can then derive advice on what to do next. The knowledge worker subsequently has the freedom to decide, supported by this advice. Based on the decision, a new event occurs, causing a new business step, which results in a new snapshot, and the process repeats.

Fig. 3. GSM to MDP mapping

4.2 Translation GSM to MDP

To translate the GSM lifecycle into an MDP state set, we recall the data snapshot \(\sigma (\mathcal{D}{})\), which consists only of the data attributes. A state in the MDP corresponds to a data snapshot in the GSM schema containing all retrieved information. A state change in the MDP model corresponds to revealing new data attributes through a business step in the GSM schema, in which the stage that writes the data attributes completes. The new values of these attributes change \(\sigma (\mathcal{D}{})\) and thus also the state of the MDP. We assume that the current status of stages and milestones does not influence the decision process of a knowledge worker; we plan to relax this assumption in future work.

As the basic goal of an MDP is to decide what action to perform based on the minimization of costs, we need to recognize possible actions in the GSM schema. To move from one snapshot to another in GSM, we take a business step triggered by an arriving event. More specifically, the set of decision events, \(\mathcal{E}_{dec}{}\), defines the possible actions to consider, as the knowledge worker is only allowed to generate these events: \(a \in \mathcal{E}_{dec}{}\). In GSM notation, \(tr(a) \subseteq \mathcal{S}{}\) is the set of stages opened by action a and \(wt(tr(a)) \subseteq \mathcal{D}{}\) is the set of attributes retrieved by action a.

The result of an action is unknown in the MDP model. Similarly, in the GSM schema, the decision to collect information on a customer yields an unknown answer. In the MDP model, we capture this uncertainty using the transition probability to a specific state. Therefore, for each unknown attribute that can be determined in the GSM schema (except the final decision attribute), we need a probability distribution \(F_{X_j}\). Recalling the set of historical information, \(\{y_i,x_{i,1}, x_{i,2},...,x_{i,k}\}_{i=1}^n\), used in Sect. 3, we can use it to define the empirical probabilities of ending up in a certain state.

$$P(\sigma (X_j) = x_j) = \frac{\sum _{i=1}^n I(x_{i,j}=x_j)}{n}, \text { where } I(x_{i,j}=x_j) = {\left\{ \begin{array}{ll} 1 &{} \textsf {if}\, x_{i,j}=x_j\\ 0 &{} \textsf {otherwise} \end{array}\right. }$$

By assuming independence between attributes, we can take the product of individual attribute probabilities to find the transition probabilities for the retrieval of multiple attributes in one action. Future research could relax this assumption.
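These empirical probabilities and the independence-based product can be sketched as follows, with a toy history chosen to match the percentages used in the running example.

```python
from collections import Counter

# Sketch of the empirical transition probabilities: per-attribute value
# frequencies from historical cases, combined under the independence
# assumption stated in the text. The history below is invented toy data.
history = {
    "debt":     [0, 0, 5000, 1000, 2000],   # 40% of cases had debt 0
    "contract": [1, 1, 1, -1, -1],          # 60% had a fixed contract
}

def p_value(attr, value):
    """Empirical probability that the attribute takes the given value."""
    return Counter(history[attr])[value] / len(history[attr])

def p_joint(assignment):
    """Probability of retrieving exactly these values, assuming independence."""
    p = 1.0
    for attr, value in assignment.items():
        p *= p_value(attr, value)
    return p
```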

Considering a decision-making process where information retrieval delays the process, we take the time needed to determine the value of an attribute as the cost. This time can be accounted for in two different ways. First, we try to minimize the total amount of time spent by workers on the complete process, T. Second, we try to minimize the time until a decision is made, to keep the customer satisfied, \(\tau \). Therefore, we define the cost function as a combination of these two goals. The cost of a is \(C_{a} = \delta T(a) + (1-\delta ) \tau (a)\), where \(\delta \) is a parameter indicating the relative importance of the total time versus the customer waiting time. The total cost of the parallel stages in one action is \(T(a) = \sum _{s \in tr(a)} T_{s}\), and the waiting time of the action is \(\tau (a) = \max _{s \in tr(a)}T_{s}\), where \(T_{s} = time(s)\) is the time needed for stage \(s \in \mathcal{S}{}\). By opening stages in parallel, retrieval takes less elapsed time, which can be an advantage. The exception to this cost function is the cost of making the final decision.
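The cost of an action that opens several stages in parallel can then be computed as in the following sketch; the durations in the comments follow the mortgage example.

```python
# Sketch of the cost function C_a = delta*T(a) + (1-delta)*tau(a) for an
# action that opens several stages in parallel.
def action_cost(stage_times, delta=0.5):
    T = sum(stage_times)     # total worker time over all opened stages
    tau = max(stage_times)   # customer waiting time: stages run in parallel
    return delta * T + (1 - delta) * tau

# With time(Debt) = 200 and time(Contract) = 150, retrieving both in one
# action costs 0.5*(200+150) + 0.5*200 = 275.
```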

Using the resulting variable \(\hat{y}\) from the information structure, we can define a stopping rule for the MDP. For our example this could be: accept if \(P(\hat{y} \ge \theta |\sigma (\mathcal{D}{})) \ge \phi \). Here, we test whether \(\hat{y}\) exceeds \(\theta \) with probability at least \(\phi \), given that we are in state \(\sigma (\mathcal{D}{})\). If this rule holds based on the information in the current state \(\sigma \), a knowledge worker can decide to abort any further information retrieval, because the cost \(C_{D:Accept}\) to terminate is 0. Using the information structure, the stopping rule and the probability distributions, we can calculate for each state in the MDP whether a stopping rule has been satisfied.

4.3 Example

We return to the example that was introduced in Sect. 2.1, and illustrate the mapping by naming a few examples. Suppose at some point only the attribute salary is retrieved, and that its value is 20000. Then the state can be represented as:

$$\sigma (\mathcal{D}{}) = \sigma (\textsf {salary, debt, contract, age, decision})= \{20000,\perp ,\perp ,\perp ,\perp \}$$

The action space for state \(\sigma (\mathcal{D}{}) = \{20000,\perp ,\perp ,\perp ,\perp \}\) is:

$$A_{\varGamma }(\sigma (\mathcal{D}{}))= \{\{\text {D:Age}\},\{\text {D:Debt,Contract}\},\{\text {D:Debt}\}, \{\text {D:Contract}\},$$
$$\{\text {D:Accept}\}, \{\text {D:Decline}\}\}$$

Now suppose we choose to retrieve the debt attribute, and in \(40\%\) of previous cases the debt amount was found to be 0. Then we can find the following transition probability:

$$P_{D:\text {Debt}}(\{20000, \perp , \perp , \perp , \perp \},\{20000, 0, \perp , \perp ,\perp \}) = 0.4$$

Furthermore, if \(60\%\) of previous cases have a fixed contract, then

$$P_{D:\text {Contract}}(\{20000, \perp , \perp , \perp , \perp \},\{20000, \perp , 1, \perp , \perp \}) = 0.6$$
$$P_{D:\text {Debt,Contract}}(\{20000, \perp , \perp , \perp ,\perp \},\{20000, 0, 1, \perp ,\perp \}) = P_{D:\text {Debt}}\cdot P_{D:\text {Contract}} = 0.24$$

As for the cost function, suppose \(\delta = 0.5\), \(time(\textsf {Debt})=200\), \(time(\textsf {Contract})=150\). Then:

$$C_{D:\text {Debt,Contract}}(\{20000, \perp , \perp , \perp , \perp \}) = 0.5T(a) + 0.5\tau (a) = $$
$$0.5(time(\textsf {Debt})+time(\textsf {Contract})) +0.5\max (time(\textsf {Debt}), time(\textsf {Contract}))$$
$$= 0.5(200+150)+0.5\cdot 200 = 275$$

Also, \(C_{D:\text {Debt}}(\{20000, \perp , \perp , \perp , \perp \}) = 200\) and \(C_{D:\text {Contract}}(\{20000, \perp , \perp , \perp , \perp \}) = 150\). The MDP weighs the relative merits of the various possible actions. E.g., retrieving only debt may yield enough information to decline the request, but if the retrieved debt value does not allow a final decision, then contract has to be retrieved after all. If the latter scenario is likely, then retrieving both simultaneously is advisable, since this is less costly (faster) than one-by-one retrieval. Conceptually, the MDP approach weighs the eventual expected outcome of all alternative actions, and recommends the action with the most favorable expected outcome. Based on this, suppose we take the action \(D:\text {Debt,Contract}\); then we can end up in state \(\{20000, 0, 1, \perp ,\perp \}\).

Now, in general, the decline action moves us directly to the state where the final decision is decline:

$$P_{D:\text {Decline}}(\{20000, 0, 1, \perp , \perp \},\{20000, 0, 1, \perp , \text {decline}\}) = 1$$

The cost of making the final decision to decline the request is:

$$C_{D:\text {Decline}}(\{20000, 0, 1, \perp , \perp \})= {\left\{ \begin{array}{ll} \infty &{} \textsf {if }\, P(\hat{y} \le \theta _{dec}|\textsf {salary, debt, contract}) < \phi _{dec} \\ 0 &{} \textsf {if }\, P(\hat{y} \le \theta _{dec}|\textsf {salary, debt, contract}) \ge \phi _{dec} \end{array}\right. }$$

Suppose that \(\hat{y}\) is the linear regression according to the example in Sect. 3, and that \(\theta _{dec} = 4\) and \(\phi _{dec} = 0.6\). Then

$$P(\hat{y} \le 4|\textsf {salary} = 20000, \textsf {debt}=0, \textsf {contract} = 1) = P(\beta _{age}\textsf {age} \le 0.5) = P(\textsf {age} \le 40)$$

Assuming that \(65\%\) of all requesters are younger than 40, \(P(\textsf {age} \le 40) = 0.65 \ge \phi _{dec}\), so \(C_{D:\text {Decline}}(\{20000, 0, 1, \perp , \perp \})= 0\). The cheapest action is therefore to decline, and thus the recommendation is to decline. In all cases where the stopping rule is not fulfilled, \(C_{D:\text {Decline}} = \infty \), making actions that retrieve more information always cheaper. Suppose, for example, that instead only \(20\%\) of requesters are younger than 40. Then rejecting the case without retrieving the age would be too risky; instead, the advice would be to retrieve the age of the requester, which would allow the final decision to be made.
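The stopping rule for the decline action can be sketched as follows. Here `p_below` stands for \(P(\hat{y} \le \theta _{dec} \mid \text {available information})\), and the confidence level 0.6 is the example value of \(\phi _{dec}\); this is a sketch under those assumptions, not the paper's implementation.

```python
import math

PHI_DEC = 0.6  # example confidence threshold phi_dec from the text

def decline_cost(p_below, phi_dec=PHI_DEC):
    """Cost of declining: 0 when the stopping rule is fulfilled,
    infinite otherwise, which forces the MDP to prefer actions
    that retrieve more information."""
    return 0.0 if p_below >= phi_dec else math.inf

# 65% of requesters younger than 40: declining is safe.
print(decline_cost(0.65))  # 0.0
# Only 20% younger than 40: declining without more information is too risky.
print(decline_cost(0.20))  # inf
```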

5 Related Work

In the area of business process modeling, there has been some related work on using Markov Decision Processes to provide decision support. Vanderfeesten et al. [25] introduced Markov Decision Processes for optimizing the execution of Product Data Models, which specify data elements and operations that transform data elements into other data elements. Uncertainty in their approach concerns the success or failure of operations, whereas in our approach uncertainty concerns the quantity of interest that needs to be decided upon.

Petrusel [18] extends the work of Vanderfeesten et al. [25]. Using an extended version of the Product Data Model, the Decision Data Model (DDM), he uses MDP models to make optimal decisions in this DDM. Compared to these papers [18, 25], our contribution is to take into account the actual data values retrieved during the process, whereas the previous papers considered only data elements without values. MDPs with structural similarities to those we derive from GSM schemas have been studied before by Lim, Bearden and Smith [15], but not in the context of business processes. They focus on searching attributes to discover the value of an option, without any process restrictions.

Schonenberg et al. [22] focus on giving recommendations based on comparisons with similar traces in historical cases of the process. We use the historical cases to estimate the probability distributions. Lakshmanan et al. [14] use an instance-specific Probabilistic Process Model (PPM) to define the transition probabilities of a Markov chain. However, they only give likelihood estimates of future behavior, whereas we provide a recommendation based on minimizing future cost.

Ghattas et al. [10] focus not only on control-flow decisions, but also on decisions embedded in actions. They use a learning algorithm that compares historical cases to the current case. Our model is more detailed, as it introduces an information structure that helps decide when to end the entire process and also gives a recommendation on what final decision should be made.

Mertens et al. [16] introduce a new declarative process language, DeciClare, an alternative language for modeling decision-intensive processes. Its data perspective is based on the Decision Model and Notation [6], an industry standard for modeling the requirements and logic of business decisions. However, DMN, and therefore DeciClare, does not consider uncertainty regarding the quantity of interest to be decided upon, nor does it provide any recommendation support to guide knowledge workers.

Eshuis and Firat [8] use fuzzy modeling to express uncertainty in a more qualitative way. Especially in highly repetitive processes, however, it is helpful to exploit the information available in historical cases; therefore, we take a quantitative approach based on probabilities. A different way to model future uncertainty is simulation [21]. However, the flexibility to have unknown decisions and to allow the decision maker to make counter-intuitive decisions is hard to capture in simulation, as this requires human interaction. Barba et al. [1] use a constraint-based process approach, in which mainly control flow and resources are considered and decisions are made to divide work optimally over the resources. None of these approaches makes use of quantitative uncertainty by means of probabilities, as we introduced in this paper.

Conforti et al. [4] use a method to predict the risk of taking certain decisions; based on this risk, the knowledge worker is recommended the next task. Lakshmanan et al. [14] provide likelihood estimates of the future states of the process, based on the estimated Markov process. Furthermore, Schonenberg et al. [22] do not give direct recommendations on what action to perform; they only give do-or-don't advice per option based on process mining logs. Batoulis et al. [2] use a Bayesian network to capture dependencies and define an influence diagram, from which a decision model is defined using DMN. In our model we give more specific recommendations to increase their relevance to the knowledge worker. Moreover, we allow the knowledge worker to make stubborn decisions and include the result of such a decision to recalibrate our recommendation.

6 Discussion and Conclusion

In this paper we have introduced a new approach to providing decision support for declarative artifact-centric processes. We do this by translating GSM schemas into a Markov Decision Process, using the novel notion of an information structure that estimates the quantity of interest. The introduced solution can support many kinds of decision-intensive processes, in which the knowledge worker can decide to retrieve different sources of information to gain more knowledge about a case and finally makes a decision using this information.

In order to define a simple translation, we considered GSM schemas without hierarchy. Including hierarchy would complicate the translation, as the decision to open a non-atomic stage would have only indirect consequences for the retrieval of data attributes. This would introduce dependencies between stages, whereas all stages are currently assumed to be independent of each other. Similar reasoning currently prevents us from supporting non-monotonic executions. Once attribute dependencies are introduced in the information structure and the MDP, both assumptions can be relaxed. A second assumption is that every stage is opened by an explicit decision: external events or completion events do not trigger the opening of new stages. This allows us to estimate the time needed to perform tasks, as assumed in Sect. 4. We plan to allow for events other than decisions in follow-up work.

Also, the MDP resulting from this translation currently has some drawbacks. Firstly, because all possible values of all variables are included in the state space, the state space explodes even for very small models. An exploding state space makes it hard, or sometimes impossible, to find globally optimal solutions. As mentioned in Sect. 2.2, this problem is left for future research. Secondly, the transition probabilities are currently modelled as the product of independent attribute probabilities, but many variables in these processes are correlated and cannot be assumed to be independent. We plan to resolve this issue in future research. Finally, as mentioned in Sect. 3, the information structure can be realized with much more sophisticated models than those discussed in this paper.

This paper introduces a new approach to giving decision support to knowledge workers performing decision-intensive processes. There are several directions for future research to further validate, refine and improve the approach. One direction is to consider more general GSM schemas and techniques to solve the MDP models efficiently. Another, more practical direction is to implement the translation in a tool and apply it in several real-world case studies. In developing such a tool, we also plan to explore different ways to deal with the state-space explosion problem.