Analyzing Inter-Connected Processes: Using Object-Centric Process Mining to Analyze Procurement Processes



Introduction
Process mining is a branch of data science that aims to exploit the event data recorded during the execution of an organizational process to gain insights into that process. In particular, process discovery (the automatic discovery of a process model from the event data), conformance checking (the comparison between the behavior recorded in the event log and the process model), model enhancement (enriching the process model with frequency/performance information), and predictive analytics (predicting the remaining time or the next activity of an incomplete case) techniques have been proposed. Organizations successfully applying process mining (e.g., Siemens, BMW, Lufthansa, Uber, and Zalando) have reached a scale that allows them to save millions of Euros [1].
To apply process mining successfully, the discipline provides several project methodologies that support the application of process mining in organizational contexts. For instance, the L* life-cycle model [2] and the Process Mining Project Methodology (PM²) [3] give practitioners clear guidance on how to implement process mining projects aimed at improving process performance and compliance with rules and regulations.
A potentially deleterious assumption of process mining is the association between events and cases/process instances. In particular, it is assumed that an event is associated with a single case. For example, in a ticketing management system, the resolution of a ticket is associated with the corresponding ticket case. However, this is unrealistic in many real-life scenarios. Considering a Purchase-to-Pay (P2P) process, a purchase order can be associated with several invoices, each of which can be associated with different payments. Vice versa, an invoice can be associated with different purchase orders. Requiring events to be associated with a single case may lead to deficiency (events not associated with any case), convergence (an event needs to be replicated for different cases), and divergence (several instances of the same activity are contained in the same case) issues, as explained in [4].
Object-centric event logs relax the assumption that an event is related to a single case. Instead, an event can be associated with different objects of different object types (e.g., an order and two invoices). This helps to resolve the deficiency/convergence/divergence issues since different objects can be related to an event, and we do not need to "coerce" events inside a case notion (for example, an event of invoice creation does not need to be repeated for all the purchase orders). The discipline of object-centric process mining, i.e., exploiting the information contained in object-centric event logs to obtain useful insights, is in active development.
In this paper, we propose a case study of object-centric process mining on a Purchase-to-Pay (P2P) process (Fig. 2 summarizes the main stages of the P2P process), along with the description of the novel ad-hoc techniques needed for the analysis. This is motivated by the difficulty of getting reliable insights out of traditional P2P event logs due to convergence/divergence issues. To apply object-centric process mining in an organizational setting, we extend PM² to guide organizations that seek to apply object-centric process mining to ERP system data. The renewed methodology consists of six stages and is summarized in Fig. 1.
1. Planning: this stage sets up the project and determines the research questions that need to be answered at the end of the project in a way that improves the process performance.
2. Extraction: this stage extracts the event data from the information system and obtains the object-centric event log.
3. Data Preprocessing: this stage prepares the event data so that the subsequent mining and analysis techniques can produce optimal results.
4. Mining and Analysis: this stage applies object-centric process mining techniques to the preprocessed event data and obtains insights into the processes that answer the research questions.
5. Evaluation: in this stage, the previous stage's findings are validated by the domain experts and interpreted to identify possible action points.
6. Improvement: this stage transforms actionable insights into actual management actions that help the process improve performance and compliance.
In particular, the extraction, preprocessing, and analysis steps required adaptation, since an object-centric event log is extracted (different design choices are possible for the correlation between events and objects of the log) and traditional preprocessing/analysis techniques cannot be applied to an object-centric event log.
The rest of the paper is organized as follows. In Section 2, we present the context in which the analysis was performed. In Section 3, we plan the case study by describing the process and providing the research questions. In Section 4, the extraction of the object-centric event log for the given process is described. In Section 5, the preprocessing strategies are presented. In Section 6, graph-based and statistical techniques are presented to respond to the initial questions. Section 7 presents the OCPM tool used throughout the case study. In Section 8, the results of the analysis are validated and evaluated. In Section 9, some adopted and planned improvement strategies for the organizational P2P process are discussed. Section 10 presents the related scientific work. Finally, Section 11 concludes the paper.

Fig. 2: Main stages of the Purchase-to-Pay (P2P) process. The purchasing part includes the management of the purchase requisition, the purchase order, and the receipt of the goods. The payment part includes the verification of the invoice and the subsequent payment.

Context
In this section, the context of the company behind the proposed case study is described, along with the initial results obtained from the application of process mining techniques/tools. Moreover, the description of the process is included.

Company
The ECE Group was founded in 1965 and has since grown to become a leading player in the European shopping center industry. ECE's current construction and planning activities amount to €3.2 billion and it manages a portfolio of 200 shopping centers. With assets under management of €31 billion and a workforce of 3,300 employees, the company has a strong presence in 13 countries.
In early 2020, ECE Group Services started a process mining initiative aimed at improving its complex and data-intensive processes. The pilot project showed the viability of using process mining tools and methods for process enhancement but also uncovered the need for a structured governance framework and standard operating procedures to conduct effective process analysis. Based on the preliminary results, a dedicated team within ECE, Process Insights, began developing new projects to support various business users in optimizing their processes.
In the early stages of the process insights project, the team utilized a method based on the standard BPM framework, which included process mining tools and techniques, with the aim of enhancing business processes. The BPM lifecycle model, widely recognized as the standard, encompasses six essential steps: process identification, discovery, analysis, redesign, implementation, and monitoring and control, all aimed at optimizing and streamlining processes [5]. The team's approach in the ECE context involved defining clear goals and objectives, gathering and preparing data, analyzing processes, identifying improvement opportunities, implementing changes, monitoring progress, and continuously re-evaluating processes to ensure desired results are achieved and new optimization prospects are explored.
However, process analysis projects faced several challenges, including a large number of process variants and standards for different countries and stakeholders. With different requirements, the number of KPIs and analyses has been increasing, which made the close monitoring of processes more difficult and time-consuming. Given these difficulties, it is important to obtain reliable and explainable results from the process mining analyses and to identify areas with significant improvement potential, in order to justify the improvement initiatives.

Initial Results and Tools
Given the complexity of the aforementioned issues, we initially chose to concentrate on a single process and its interactions with other processes. We began with the Accounts Payable process, and, after consultations with top management, the research objective was established as reducing the number of overdue payments. This focus allowed us to effectively identify the root causes of the problem and implement targeted solutions while also laying the foundation for future process improvement initiatives. By focusing on a specific process, we aimed to demonstrate the practical application of process mining techniques and the benefits they can bring to the organization. The objective of reducing late payments was selected because it had a direct impact on the financial performance of the organization and could be easily measured. By successfully addressing this issue, we hoped to demonstrate the value of process mining and encourage wider adoption within the organization.

In the process improvement initiative at ECE, Celonis has been used as the primary tool for analyzing and visualizing the behavior of various processes. The Celonis Action Engine, a web-based application, was employed to turn process analysis insights into actionable recommendations and to provide operational support during process execution [6]. Also, process mining dashboards were adopted. Fig. 3 shows a simplified and anonymized representation of a typical process mining dashboard for the Accounts Payable process. Similar dashboards were used by both process participants and management to optimize the process. This highlights how addressing the different needs and requirements of different stakeholders can surface insights that would otherwise be difficult to uncover or put into action.
However, the effectiveness of the tool was highly dependent on the process analyst and how they designed the signals extracted from the data model. It failed to address several challenges faced by the ECE team, such as identifying complex patterns in interconnected processes and providing root cause insights based on feature importance calculations. To overcome these limitations, we decided to form an informal network of experts from various fields and work together to find the best solutions for specific challenges.
Currently, the focus is extended to the Purchase-to-Pay (P2P) process, which involves both a procurement/purchasing part and a payment part (see Fig. 2), and is therefore interconnected with the Accounts Payable (AP) process. The considered P2P process involves the following steps:
1. Purchase Requisition: a purchase requisition is created by a user in the SAP system (transaction ME51N) to request the purchase of goods or services. This requisition is then reviewed and approved by a designated approver.
2. Purchase Order: once a vendor has been selected, a purchase order (PO) is created in the SAP system to place the order with the vendor (transaction ME21N). The PO includes the details of the goods or services being purchased, the price, and the delivery terms.
3. Goods Receipt: once the goods or services have been delivered, a goods receipt is recorded in the SAP system to acknowledge receipt of the items (transaction MIGO).
4. Invoice Verification: the vendor sends an invoice to the purchaser for the goods or services provided. The invoice is then reviewed and verified in the SAP system (transaction MIRO) to ensure that the goods or services have been received and that the charges are correct.
5. Payment: once the invoice has been verified, the payment is processed in the SAP system (transaction F-53). This may involve creating a payment request, issuing a check, or making an electronic payment to the vendor.
The execution of the process is summarized in Fig. 4. The process has been partly automated using the Invoice xSuite solution (https://www.xsuite.com/software/invoice/) for the digital acquisition of the documents (purchase requisitions, purchase orders, goods receipts, invoices). The metadata acquired from the software needs to be checked by an operator.
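The step-to-transaction correspondence above can be kept as a simple lookup table when labeling extracted events, as in the following minimal Python sketch (the dictionary and function names are illustrative, not part of the actual extraction pipeline):

```python
# Mapping of the standard SAP transaction codes mentioned above to the
# corresponding P2P activity names (illustrative helper for labeling events).
P2P_TRANSACTIONS = {
    "ME51N": "Create Purchase Requisition",
    "ME21N": "Create Purchase Order",
    "MIGO": "Enter Goods Receipt",
    "MIRO": "Invoice Verification",
    "F-53": "Post Payment",
}

def activity_for_tcode(tcode: str) -> str:
    """Return the P2P activity name for an SAP transaction code."""
    return P2P_TRANSACTIONS.get(tcode, "Unknown")
```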

Process Description
In our research, we analyzed issues pertaining not only to the Purchase-to-Pay cycle but also to interconnected processes, most notably the Accounts Payable (AP) process. The AP process is responsible for managing the payments to vendors and suppliers for goods and services that were procured through the P2P process. The steps in the P2P process that are connected to the AP process include:
• Purchase Order: this step is connected to the AP process as it involves creating the purchase order, including the details of the goods or services being purchased, the price, and the delivery terms. This information is used in the invoice verification and payment processes to ensure that the right vendor is paid the right amount for the right goods and services.
• Invoice Verification: this step is closely connected to the AP process as it involves reviewing and verifying the vendor invoices to ensure that the charges are correct and that the goods or services have been received.
• Payment: this step is directly connected to the AP process as it involves processing the payments to vendors and suppliers through the SAP system. The AP process is responsible for creating the payment request, issuing the check, or making the electronic payment.
Overall, the P2P process in SAP helps to ensure that the right goods and services are procured from the right vendors at the right price, while the AP process in SAP helps to ensure that the vendors are paid correctly and on time for the goods and services that were procured. In light of these points, an optimal implementation of the P2P process requires not only a good execution of the procurement and accounts payable steps but also an efficient collaboration between the two stages.

Planning
In this section, we plan the case study. In Subsection 3.1, we propose object-centric event logs as the target data format. In Subsection 3.2, the research questions are proposed.

Target Data Format
Object-centric process mining techniques require object-centric event logs (OCELs, [7]) 1 . An example OCEL in tabular form is shown in Table 1. For example, the first row contains the event with identifier e1, activity Create Purchase Requisition, and timestamp 2021-03-20 10:30. This event is related to a single object PR1 of type Purch.Req. We adopted the JSON-OCEL implementation, which is based on the JSON format.
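To make the JSON-OCEL structure concrete, the following Python sketch builds a minimal document holding the first event of Table 1 (the key names follow the OCEL 1.0 standard; the concrete identifiers and types mirror the table and are illustrative):

```python
import json

# Minimal JSON-OCEL document containing the first event of Table 1.
log = {
    "ocel:global-log": {"ocel:version": "1.0",
                        "ocel:object-types": ["Purch.Req."],
                        "ocel:attribute-names": []},
    "ocel:events": {
        "e1": {"ocel:activity": "Create Purchase Requisition",
               "ocel:timestamp": "2021-03-20T10:30:00",
               "ocel:omap": ["PR1"],   # objects related to the event
               "ocel:vmap": {}}        # event-level attributes (none here)
    },
    "ocel:objects": {
        "PR1": {"ocel:type": "Purch.Req.", "ocel:ovmap": {}}
    },
}

serialized = json.dumps(log, indent=2)  # what gets written to a .jsonocel file
```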
Generating an object-centric event log from a database requires:
• The extraction of the set of business objects, along with their attributes.
• The extraction of the set of events.
• The correlation between events and objects.
In particular, the last step involves significant design choices. For example, in a P2P process, an order can be related to different invoices. The correlation between orders and invoices can be recorded in the event with the activity "Create Purchase Order" (which reports the order and all the related invoices) or in the events with the activity "Create Invoice" (each reporting the order and the invoice related to the event).
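The two correlation design choices can be sketched as follows (the `relations` table and the helper names are illustrative, not taken from the actual extraction):

```python
# Hypothetical relation table: each invoice mapped to its related orders.
relations = {"INV1": ["PO1"], "INV2": ["PO1", "PO2"]}

def correlate_on_invoice_creation(relations):
    """Each 'Create Invoice' event carries the invoice plus its related orders."""
    return [{"activity": "Create Invoice", "omap": [inv, *orders]}
            for inv, orders in relations.items()]

def correlate_on_order_creation(relations):
    """Each 'Create Purchase Order' event carries the order plus all its invoices."""
    by_order = {}
    for inv, orders in relations.items():
        for po in orders:
            by_order.setdefault(po, []).append(inv)
    return [{"activity": "Create Purchase Order", "omap": [po, *invs]}
            for po, invs in by_order.items()]
```

Both variants encode the same many-to-many relationship; they differ in which event the correlation is attached to, and therefore in which activity "carries" the link in downstream analyses.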

Research Questions
We categorize different questions to identify the problems that can be addressed by object-centric process mining. Note that answering these questions requires dealing with interdependencies between different processes, where conventional process mining approaches do not provide convincing answers. We categorize the identified questions into three main categories: Process Performance (PP), Process Compliance (PC), and Process Quality (PQ).
The questions related to Process Performance (PP) are as follows:
PP1 What is the current processing capacity of the accounts payable department?
PP2 How much time does the accounts payable department need to process/verify a single invoice document?
PP3 How many procurement orders were inserted correctly during creation, so that no change is needed (no-touch orders)?
PP4 Considering only the completed orders, what is the average throughput time from the placement of the purchase order to a given stage of the process (either the receipt of the goods or the payment)?
PP5 What is the end-to-end performance of the P2P process (considering the entire chain of documents related to the order, which may or may not include purchase requisitions, invoices, and payments)?
PP6 Which activities are correlated with a high processing time?
PP7 Does a high workload (in terms of open documents) in the purchasing or accounts payable departments lead to higher processing times?
The following questions aim to analyze Process Compliance (PC), where the goal is to make sure that actual executions follow the designed process model:
PC1 Does maverick buying (the order is placed without proper approval and created in the system only after the invoice is received) happen in the process?
PC2 How many purchase requisitions were changed after the placement of the corresponding purchase order, just to match the information?
The last category of questions is related to Process Quality (PQ), where the goal is to evaluate the quality of individual tasks in a process and to identify unintended behavior in the control flow:
PQ1 How many orders contain more than one invoice, leading to additional work in the accounts payable department?
PQ2 What are the unexpected patterns in the execution of the targeted P2P process?
In Section 6, we propose some analyses on the object-centric event log that allow answering the aforementioned questions. The results are presented in Section 8.

Extraction
In this section, we explain how to extract object-centric event logs of P2P processes from SAP ERP systems. For the sake of simplicity, we limit the scope of our analysis using the following criteria:
• A specific organization/company code.
• The control-flow perspective (activity and timestamp of the events) without any data attribute.
• The entire procurement stage of the process, and the invoicing-payment part of the accounting.
• A time interval.
We report the following system-specific challenges when extracting event data from SAP ERP:
• The relational schema of SAP ERP counts hundreds of thousands of tables. Even data related to popular processes (O2C, P2P) is scattered across many different tables.
• The relational schema of SAP ERP is weak, and foreign keys are usually implicitly defined and maintained at the application level.
• SAP ERP is a multi-tenant system, and therefore events of different clients on the application side can be contained in the same tables. They can be distinguished by the client field (MANDT).
• The identifier of some documents is not globally unique but only unique within a given fiscal year. Therefore, only the concatenation of the identifier and the fiscal year provides a unique reference to the document.
• The granularity of the timestamps: while some information is recorded with second granularity, other information has only day granularity.
To extract an object-centric event log for the considered P2P process, the following steps have been conducted:
1. Identification of the different documents/stages of the process: we identified the procurement stage (including the management of purchase requisition and purchase order documents, and the receipt of the goods) and the accounts payable stage (receipt and management of invoices and payments). The cardinalities of the relationships between these documents are summarized as follows:
• Many-to-many relationships between purchase requisitions and purchase orders.
• Many-to-many relationships between purchase orders and goods receipts.
• Many-to-many relationships between purchase orders and invoices.
• Many-to-many relationships between invoices and payments.
The xSuite workflow system is responsible for the digital acquisition of documents related to purchase orders, goods receipts, and invoices. Since the correctness of the data inserted for the procurement stage is essential for the correct execution of the accounts payable process, the workflow system plays a big role in ensuring that data is inserted correctly in the system.
2. Identification of the objects related to the given documents: we consider the following correspondence:
• A purchase requisition document is associated with a "purchase requisition" object and some "purchase requisition item" objects.
• A purchase order document is associated with a "purchase order" object and some "purchase order item" objects. Moreover, if purchase requisitions are associated with the order, their "purchase requisition" and "purchase requisition item" objects are related to the order.
• A goods receipt document is associated with the corresponding "purchase order" and "purchase order item" objects (of all the received items).
• An invoice document is associated with an "invoice" object and some "invoice item" objects. Given the purchase orders associated with the invoice, their "purchase order" and "purchase order item" objects are associated with the invoice.
• A payment document is associated with a "payment" object and some "payment item" objects. Given the invoices associated with the payment, their "invoice" and "invoice item" objects are associated with the payment.
3. Extraction of the objects: after identifying the different object types, we identified the tables inside SAP ERP that can be used to extract the objects. The correspondence between the object types defined in the log and the tables/objects is described in Table 2.
4. Extraction of the relationships between the objects: for this step, some tables of SAP ERP are used to link the different objects (see Table 3), as visualized in Fig. 5. All links are trivial except the invoice-payment connection 2 .
5. Extraction of the basic events: for this step, we extract some basic events directly from the entries of the considered tables, as described in Table 4. We note that from the same table, events with different activities and timestamps can be extracted. For the analysis, we do not extract additional attributes aside from the activity and the timestamp. Since the required information is stored with day granularity (e.g., the purchase requisition, purchase order, and invoice could happen on the same day), we add a time delta to the dates to ensure the correct logical order between the events.
6. Extraction of the change events: the rows of the table CDPOS become change events of our object-centric event log, reporting as related object identifiers the ones cited in the table. The activity associated with these events depends on the fields that are created/updated/removed by the change.
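The time-delta trick from step 5 can be sketched in a few lines of Python: day-granularity dates would make same-day events appear simultaneous, so a small, activity-dependent offset restores the logical order (the activity ranking below is illustrative):

```python
from datetime import datetime, timedelta

# Illustrative logical ordering of activities that can share the same date.
ACTIVITY_RANK = {"Create Purchase Requisition": 0,
                 "Create Purchase Order": 1,
                 "Invoice Receipt": 2}

def adjust_timestamp(date_str: str, activity: str) -> datetime:
    """Turn a day-granularity date into a timestamp that preserves the
    logical order between same-day events."""
    base = datetime.strptime(date_str, "%Y-%m-%d")
    return base + timedelta(minutes=ACTIVITY_RANK.get(activity, 0))
```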

Data Preprocessing
We obtained an object-centric event log for the P2P process as described in Section 4. However, this object-centric event log might contain incomplete executions of the process (open orders still being processed in the system, or orders starting outside the considered time interval). Therefore, in order to get reliable insights for the analyses proposed in Section 6, we need to filter out incomplete executions of the process from the object-centric event log. To do so, we adopt a simple data preprocessing strategy, which allows us to sample and filter the object-centric event log.

Fig. 6: Graph representing the interactions between the objects, starting from the object-centric event log contained in Table 1.
The strategy consists of building a graph (called the object interaction graph) in which the objects of the object-centric event log are the nodes, and the interactions between the objects (two objects belonging to the set of related objects of any event in the log) are the edges of this graph.
Starting from the object-centric event log contained in Table 1, we build the graph represented in Fig. 6. In this case, the different colors at the node level highlight different connected components of the given graph. The object-centric event log can be sampled keeping only the events related to the objects of a given connected component, or some properties of the connected components can be considered for filtering.
This allows us to easily filter out incomplete behavior (given that we applied a filter at the timestamp level). In our case, we would like to analyze the end-to-end process going from the purchase requisition to the payment. Since not all orders are inserted with a purchase requisition, we consider all the connected components that contain at least one purchase order object and one payment object. In the example object-centric event log proposed in Table 1, the components that satisfy these criteria are highlighted.
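The interaction-graph construction and the component-based filter can be sketched in plain Python (the events, identifiers, and object types below are a toy example in the spirit of Table 1, not the actual log):

```python
from itertools import combinations

# Toy log: (activity, [related object ids]); object types are given per object.
events = [("Create Purchase Requisition", ["PR1"]),
          ("Create Purchase Order", ["PR1", "PO1"]),
          ("Invoice Receipt", ["PO1", "INV1"]),
          ("Clearance Posted (Payment)", ["INV1", "PAY1"]),
          ("Create Purchase Order", ["PO2"])]          # open order, no payment
obj_type = {"PR1": "Purch.Req.", "PO1": "Purch.Ord.", "INV1": "Invoice",
            "PAY1": "Payment", "PO2": "Purch.Ord."}

# Two objects interact if they co-occur in the related objects of some event.
adj = {o: set() for o in obj_type}
for _, objs in events:
    for a, b in combinations(objs, 2):
        adj[a].add(b); adj[b].add(a)

def components(adj):
    """Connected components via depth-first search."""
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(adj[node] - comp)
        seen |= comp
        comps.append(comp)
    return comps

# Keep components containing at least one purchase order and one payment,
# then keep only events whose objects all live in a kept component.
kept = [c for c in components(adj)
        if any(obj_type[o] == "Purch.Ord." for o in c)
        and any(obj_type[o] == "Payment" for o in c)]
kept_objects = set().union(*kept) if kept else set()
filtered = [e for e in events if set(e[1]) <= kept_objects]
```

On this toy log, the open order PO2 forms its own component without a payment, so its event is filtered out.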

Mining and Analysis
In this section, we describe the analyses which were adopted in the recent past to analyze the P2P process in the company (Subsection 6.1) and propose new object-centric analyses to address the research questions (Subsection 6.2).

Previously Adopted Process Discovery Techniques
Existing techniques have been applied to different event log extractions, and we divide them between traditional/case-centric techniques (considering traditional event logs and the interaction between them) and object-centric process mining techniques (which require object-centric event logs as input).
Traditional/Case-Centric Process Mining: when working with traditional process mining techniques, a case notion needs to be chosen. The order or the invoice are popular choices of case notion for the P2P process. This results in event logs containing, for each case, all the events related to the processing of the order/invoice documents, plus all the events related to the processing of interconnected documents. For example, choosing the order as case notion would lead to including in the same case the events related to the purchase requisitions, goods receipts, invoices, and payments related to the order, in addition to the processing of the given order. This leads to "spaghetti" process models with unreliable annotations due to the convergence and divergence issues [4]. For example, Fig. 7 contains a "spaghetti" directly-follows graph obtained from our P2P process. Simplification techniques can be applied to obtain simpler graphs; however, the annotations are then even more unreliable [8].

Fig. 7: "Spaghetti" directly-follows graph of the P2P process obtained using traditional process mining techniques.

Commercial vendors are aware of the problems of traditional process mining techniques and have tried to propose solutions. For example, Celonis supports the usage of "Multi-Event Logs" (MEL), in which interactions are shown between different process maps computed on different event logs of interconnected processes (the interacting cases are assumed to be known). These interactions are computed using temporal (interleaved & non-interleaved miner) and attribute-based criteria (match miner). Fig. 8 contains an example multi-event log, in which the interleaved miner has been used to mine temporal interactions between the two process maps (the right one related to the "procurement" log, the left one related to the "accounts payable" log).
We could see different interactions, including:
• An interaction between the Create Purchase Requisition, Create Purchase Order, and Vendor Creates Invoice activities, considering the order in which they are created. If goods are ordered prior to the issuing of the purchase requisition and purchase order, then maverick buying 3 happens. Maverick buying can lead to a number of issues, such as overspending or a lack of proper approvals. Tracking the temporal interactions between different process maps using the interleaved miner can further improve visibility and understanding of the overall purchase process.

• An examination of the interactions between the Enter Goods Receipt and Receive Invoice activities is crucial to understand how the business users process goods receipts in practice. In theory, the goods receipt processing transaction (MIGO, in SAP) should be used to confirm the quantity and quality of the received materials. However, if only a minimal time gap is observed between the receipt of an invoice and the entering of the goods receipt, the users might be entering both the invoice information and the goods receipt information simultaneously. This interaction can be used to identify categories of goods for which the goods receipt activity is unnecessary and only pro forma. This can help organizations optimize their processes, eliminate unnecessary steps, and reduce costs. Additionally, it can help to identify potential errors or discrepancies in the invoicing and goods receipt process and improve the accuracy of financial and inventory records. However, while the meaning of some interactions discovered by the method was clear, others were quite difficult to interpret.
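The "minimal time gap" check described above can be sketched as a small Python function (the data layout, the threshold, and the function name are illustrative assumptions, not the production implementation):

```python
from datetime import datetime

def suspicious_pairs(gr_times, inv_times, threshold_minutes=5):
    """Flag purchase orders where the goods receipt was entered within a short
    window of the invoice receipt, suggesting both were entered together.

    gr_times / inv_times: purchase-order id -> datetime of the respective event.
    """
    flagged = []
    for po, gr in gr_times.items():
        inv = inv_times.get(po)
        if inv is not None and abs((gr - inv).total_seconds()) <= threshold_minutes * 60:
            flagged.append(po)
    return flagged
```

Orders flagged this way are candidates for goods categories where the goods receipt is only pro forma.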
Object-Centric Process Mining: we considered mainly open-source prototypal software, because commercial vendors (Celonis, MPM, IBM) are only starting to offer support for object-centric process mining. For example, Celonis recently introduced the Process Sphere feature, which allows "to analyze and visualize the complex relationships between events and objects across interconnected processes" 4 and can ingest object-centric event logs in the OCEL standard. However, this feature was not yet available in our Celonis instance at the time the analysis was performed. Prototypal tools supporting the application of object-centric process mining techniques on top of OCELs are OCPM [9] and OCπ [10]. Fig. 9 shows an object-centric directly-follows graph computed (using the OCPM tool) on an extract of the object-centric event log of the P2P process. It contains different object types and activities (detailed in Tab. 2 and Tab. 4). According to the provided abstraction, Create Purchase Order is preceded by Create Purchase Requisition and followed by Invoice Created by Supplier. Then the goods and invoices are received by the company and recorded in the system (Goods Receipt and Invoice Receipt). The invoice is eventually posted for verification (Invoice Posted) and then the payment occurs, clearing the different items of the invoices (Clearance Posted (Payment)).
An object-centric directly-follows graph (such as the one in Fig. 9) helps to identify the paths between the activities in a process. These paths can be annotated with frequency (as in the figure) or performance metrics. Therefore, we can get reliable statistics about the number of occurrences of an activity, or the number of objects related to events of the given activity. We could also identify the bottlenecks in the process by considering the average times between two directly following activities.
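The path statistics described above can be approximated per object with a small Python sketch (a simplified stand-in for what a tool like OCPM computes; the event tuples are illustrative):

```python
from collections import defaultdict
from datetime import datetime

def dfg_performance(events):
    """Average lag (in seconds) between directly-following activities.

    events: iterable of (timestamp, activity, [object ids]); each object's
    events, sorted by time, form its lifecycle, and consecutive activities
    within a lifecycle contribute to the corresponding directly-follows edge.
    """
    per_object = defaultdict(list)
    for ts, act, objs in events:
        for o in objs:
            per_object[o].append((ts, act))
    lags = defaultdict(list)
    for trace in per_object.values():
        trace.sort()
        for (t1, a1), (t2, a2) in zip(trace, trace[1:]):
            lags[(a1, a2)].append((t2 - t1).total_seconds())
    return {edge: sum(v) / len(v) for edge, v in lags.items()}
```

High average lags on an edge point at the bottlenecks mentioned above.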
Limitations of Existing Object-Centric Techniques: in the context of a P2P process, the object-centric directly-follows graph allows us to visualize the performance and the compliance of the process:
• Maverick buying can be quantified by looking at the sum of the frequencies of the input arcs of type Ord.Doc. of the Create Purchase Order activity.
• Post-mortem changes to purchase requisitions can be identified by looking at the output arcs of type Req.Doc. of the Create Purchase Order activity.
However, some analyses cannot be carried out on object-centric directly-follows graphs. Regarding process performance, we are interested in the throughput time of a given step of the process (for example, the invoice processing time, which can be computed as the average difference between the completion and starting times of the objects of type invoice), or in the end-to-end performance (from the purchase requisition to the completion of the payment), rather than the times of an individual path. Also, correlations between the execution of an activity and the total throughput time, or workload metrics (such as the number of concurrently worked items), cannot be visualized on object-centric directly-follows graphs.

Novel Analysis Techniques
Mainstream object-centric process discovery techniques are not able to satisfy some of our goals; therefore, we focus on two novel analyses which allow us to retrieve the answers:
• Graph-based analyses: we used the implementation provided in the tool OCPM (https://www.ocpm.info/). Alternative implementations have been evaluated using PM4Py (https://pm4py.fit.fraunhofer.de/) and the Neo4J graph database [11].
• Statistical analyses: in this case, we used the tool OCPM to build a feature table on the object-centric event log and applied the available techniques of the tool to get insights.

Graph-Based Analyses
To answer some of the research questions, we can build an alternative directed graph (in comparison to the one introduced in Section 5) showing the short- and long-term dependencies between the objects in the object-centric event log. This is done by considering the objects of the log as nodes, and using the following criteria to connect two objects with a directed arc:
• Both objects are related to a given event.
• The source object's lifecycle does not start with the given event.
• The target object's lifecycle starts with the given event.
We call this the object creation graph. An example object creation graph, computed starting from the object-centric event log contained in Table 1, is shown in Fig. 10.
Fig. 10: Example object creation graph computed starting from the object-centric event log contained in Table 1
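The three criteria above can be operationalized directly. A minimal sketch (on hypothetical toy events, not the log from Table 1) connects two objects with a directed arc whenever a new object first appears in an event that also involves an already-existing object:

```python
# Hypothetical events: (timestamp, activity, [object_ids]).
events = [
    (1, "Create Purchase Requisition", ["R1"]),
    (2, "Create Purchase Order",       ["R1", "O1"]),
    (3, "Invoice Created by Supplier", ["O1", "I1"]),
]

def object_creation_graph(events):
    # Timestamp of the first event of each object's lifecycle.
    first_event = {}
    for ts, act, objs in sorted(events):
        for obj in objs:
            first_event.setdefault(obj, ts)
    arcs = set()
    for ts, act, objs in sorted(events):
        for src in objs:
            for tgt in objs:
                # Source: lifecycle did NOT start with this event;
                # target: lifecycle starts with this event.
                if first_event[src] < ts and first_event[tgt] == ts:
                    arcs.add((src, tgt))
    return arcs

print(object_creation_graph(events))  # two arcs: R1 -> O1 and O1 -> I1
```

On the toy log, the requisition R1 points to the order O1 it spawned, and O1 points to the invoice I1, reproducing the normal document flow of the process.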
The object creation graph is different from the object interaction graph defined in Section 5, which was useful for discovering the sets of connected objects and allowed for advanced filtering. The object creation graph, instead, helps to identify the logical flow of the objects in the object-centric event log and to identify long-term dependencies.
Starting from the object creation graph, we can answer some of the research questions:
• For PP4 and PP5, we can consider the long-term dependencies between the different objects represented in the object creation graph.
• For PC1, we consider the invoices that continue into an order as the ones for which maverick buying happens (in the normal situation, an order continues into an invoice).
• For PC2, we consider the pairs of purchase requisitions and orders directly connected in the object creation graph and compare the timestamps of the first/last events of the lifecycles of the given purchase requisitions and orders.
• For PQ1, we consider orders that continue into more than one invoice.
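Assuming each arc of the creation graph is labeled with the object types of its endpoints, the PC1 and PQ1 checks reduce to simple graph queries. The following is a hedged sketch over hypothetical arcs (the arc representation is an illustrative assumption, not OCPM's internal format):

```python
from collections import Counter

# Hypothetical creation-graph arcs: (source_id, source_type, target_id, target_type).
arcs = [
    ("R1", "Req.Doc.", "O1", "Ord.Doc."),  # requisition -> order (normal flow)
    ("O1", "Ord.Doc.", "I1", "Inv.Doc."),  # order -> invoice (normal flow)
    ("I2", "Inv.Doc.", "O2", "Ord.Doc."),  # invoice -> order: maverick buying (PC1)
    ("O1", "Ord.Doc.", "I3", "Inv.Doc."),  # second invoice for O1 (PQ1 candidate)
]

# PC1: maverick buying -- orders whose creation was triggered by an invoice.
maverick_orders = {t for s, st, t, tt in arcs
                   if st == "Inv.Doc." and tt == "Ord.Doc."}

# PQ1: orders continued into more than one invoice.
invoices_per_order = Counter(s for s, st, t, tt in arcs
                             if st == "Ord.Doc." and tt == "Inv.Doc.")
multi_invoice_orders = {o for o, n in invoices_per_order.items() if n > 1}

print(maverick_orders)       # {'O2'}
print(multi_invoice_orders)  # {'O1'}
```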

Statistical Analyses
In this section, we want to perform specific analyses (which could not be performed by interpreting the object-centric directly-follows graph) using statistical techniques:
• Which activities are correlated with a high processing time? (PP6)
• Does a high workload (in terms of open documents) lead to a higher processing time? (PP7)
• Which are the "outlier" patterns in the execution of the business process? (PQ2)
The answers to these questions can be obtained using, respectively, the Correlation Statistics page (first two analyses) and the Conformance page of the machine learning component of OCPM.
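The essence of such a correlation analysis can be sketched in a few lines: given a feature table with one row per object, one computes the Pearson correlation between a candidate feature (e.g., the number of change activities) and the throughput time. The numbers below are hypothetical, and this is not the OCPM implementation:

```python
from statistics import mean

# Hypothetical per-invoice feature table: (number of change activities, throughput time in days).
rows = [(0, 3.0), (1, 6.5), (2, 9.0), (0, 2.5), (3, 14.0), (1, 7.0)]

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long samples."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

changes, times = zip(*rows)
r = pearson(changes, times)
print(f"correlation(changes, throughput) = {r:.2f}")  # close to 1: strong positive correlation
```

A coefficient near 1 on the real feature table would support the hypothesis that change activities prolong processing, which is the kind of signal PP6 and PP7 look for.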

The OCPM Tool
In this section, we present the main features of the OCPM tool that was used to perform the analysis. The JavaScript-based tool is publicly available at https://www.ocpm.info/, and a demo on a simulated log is available at https://www.ocpm.info/ocel demo.html. The OCPM tool provides a rich set of object-centric process mining features:
• Ingestion/export of object-centric event logs in the OCEL standard format (JSON-OCEL and XML-OCEL).
• Flattening of object-centric event logs into traditional event logs given a choice of case notion.
• Advanced preprocessing features (filtering, sampling).
• Discovery of object-centric process models: object-centric directly-follows graphs [9] and object-centric Petri nets [12].
• Conformance checking on object-centric event logs based on declarative and temporal constraints (log skeleton, temporal profile).
• Exploration of the events/objects of the object-centric event log.
• Machine learning (anomaly detection, correlation analytics, advanced conformance checking).
Fig. 11 shows the process model component of the proposed tool on top of the "demo" dataset. The positions of the different logical components are highlighted. In particular, 1 allows for the selection of the object-centric process model to discover, 2 allows the user to see and remove the filters applied on the event log, 3 allows the user to apply a filter on an activity of the model, and 4 allows the user to apply a filter on an edge of the model. Fig. 12 shows the machine learning component of the proposed tool on top of the "demo" dataset. The landing page computes the features of the object-centric event log and proposes an SQL explorer to navigate/query the values for the different objects. In i, an anomaly detection algorithm is applied to the features in order to identify anomalous objects (clicking an object allows the user to explore its lifecycle). In ii, correlation statistics are computed between a feature and the other features of the object-centric event log. In iii, a dimensionality reduction technique (FASTMAP) is used to show groups of objects with similar features. In iv, the correlation between a feature and the other features is explored by means of decision trees.
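The anomaly-detection idea behind component i can be illustrated with a simple feature-based sketch: flag objects whose feature value deviates strongly from the population. This is a z-score heuristic on hypothetical per-order features; the tool's actual algorithm may differ:

```python
from statistics import mean, stdev

# Hypothetical per-order feature: number of change activities executed on the order.
changes = {"O1": 2, "O2": 1, "O3": 3, "O4": 2, "O5": 160, "O6": 1}

mu = mean(changes.values())
sigma = stdev(changes.values())

# Flag objects whose feature deviates strongly from the mean (|z| > 2).
anomalies = {o for o, v in changes.items() if abs(v - mu) / sigma > 2}
print(anomalies)  # the order with 160 changes stands out
```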

Evaluation
In this section, we evaluate the correctness and the quality of the insights obtained by our analysis. These are the steps of our evaluation:
• Since we developed a custom extractor of object-centric event logs, we checked whether the information extracted from SAP (number of objects, timestamps, the flow of documents) matches the results of some reporting transactions in SAP (in particular, ME53N (Display Purchase Requisition), ME23N (Display Purchase Order), and FB03 (Display Finance Document)).
• We compared the frequency/performance annotations with the ones obtained by applying the traditional techniques offered by the Celonis tool. To make the comparisons feasible, we focused on smaller samples of the original object-centric event logs (obtained by applying the technique described in Section 5).
The analytical comparison shows some advantages of object-centric process mining over traditional process mining techniques:
• We could obtain the correct number of documents for every stage of the process (PP1). This applies in particular to the number of distinct purchase requisitions: some purchase requisitions involved items that were then procured by different purchase orders. Therefore, the number of purchase requisitions obtained in the Celonis software was significantly bigger than the true number of purchase requisitions (due to the convergence problem).
• We found that the end-to-end performance (PP5) of the Purchase-to-Pay process was skewed by the reversal of some payments, which was not supported by the traditional extractor (because it would require additional computationally expensive table joins), but which we could support in the object-centric setting.
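The convergence effect described above can be reproduced on a toy example: flattening an object-centric log on the purchase-order case notion replicates a requisition-level event once per related order, inflating the requisition count. The data and the flattening helper below are hypothetical illustrations:

```python
from collections import defaultdict

# Hypothetical object-centric events: one requisition (R1) procured by two orders.
# Each event lists its related objects per type (as a traditional extractor would join them).
events = [
    ("e1", "Create Purchase Requisition", {"Req.Doc.": ["R1"], "Ord.Doc.": ["O1", "O2"]}),
    ("e2", "Create Purchase Order",       {"Req.Doc.": ["R1"], "Ord.Doc.": ["O1"]}),
    ("e3", "Create Purchase Order",       {"Req.Doc.": ["R1"], "Ord.Doc.": ["O2"]}),
]

def flatten(events, case_type):
    """Flatten on a case notion: replicate each event once per related object of that type."""
    cases = defaultdict(list)
    for eid, act, omap in events:
        for case_id in omap.get(case_type, []):
            cases[case_id].append(act)
    return cases

flat = flatten(events, "Ord.Doc.")
flat_count = sum(acts.count("Create Purchase Requisition") for acts in flat.values())
true_count = len({o for _, act, omap in events
                  if act == "Create Purchase Requisition" for o in omap["Req.Doc."]})
print(flat_count, true_count)  # 2 replicated requisition events vs. 1 true requisition
```

Counting requisition-creation events in the flattened log yields 2, while only one distinct requisition exists, which is exactly the inflation observed in the traditional Celonis extraction.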
Moreover, we found some interesting compliance/quality patterns in the considered timespan:
• Maverick Buying (PC1) is a deleterious behavior where the traditional approval steps of a purchase order are skipped and the order is placed directly with the supplier. An invoice is then received from the supplier, which leads to the creation of the purchase order. We could detect a non-negligible amount of maverick buying in our process.
• Post-Mortem Changes to Purchase Requisitions (PC2): another deleterious behavior observed in the process is that purchase requisitions are changed after their approval to match the amounts/quantities of the purchase order.
• Orders with Duplicated Invoices (PQ1): we could detect some orders with duplicated invoices in our system. Also, some orders with dozens of different invoices exist. After investigation, we found that these are maintenance contracts. An example of a maintenance contract is a cleaning contract, repeated weekly at the same conditions and invoiced monthly.
In addition, the statistical analyses provided us with interesting insights:
• We could detect a high correlation (PP6) between the presence of change activities (for both orders and invoices) and the processing time. In particular, changes in the amount to pay for the invoices, or in the amounts expected for the orders, are particularly deleterious to the processing time (see Tab. 5).
• In our process and in the considered timespan, the workload is highly correlated (correlation coefficient 0.89) with the processing time (PP7). Therefore, a high workload in the process (number of open documents) leads to a higher processing time (see Fig. 13).
• Considering PQ2, we could identify some purchase orders with a significant number of change activities (>= 150) executed on the order. Moreover, we could identify invoices related to more than 20 orders.

Improvements
The results of the analysis are currently being used to improve the execution of the business process. In particular, the following improvements have already been adopted in the company:
• Our study delved into the issues related to maverick buying (PC1) and post-mortem changes to purchase requisitions (PC2), and through this investigation, we were able to identify steps to address these issues. These included updating internal documents with additional instructions and incorporating adjustments to internal training programs to improve the purchasing process.
• By identifying activities that were correlated with prolonged processing times (PP6), we were able to conduct a series of workshops with business users and apply the Pareto principle to identify the main underlying causes. This allowed us to target the most significant contributors to the processing time and work towards implementing effective solutions.
• Our analysis identified instances of duplicated invoices (PQ1), which provided insight into issues with master data management within the company. Further investigation revealed that these duplicates were often the result of workarounds implemented by individual users to speed up the invoice entry process. By addressing these root causes, the company was able to improve its master data management and reduce the occurrence of duplicated invoices.
• By examining the correlation between high workload and prolonged processing times (PP7), we were able to identify "peak times" when the workload was highest. With this information, we were able to find temporary resource solutions to stabilize the workload and ensure a more efficient process.
Other considerations are currently being evaluated but have not yet been enacted:
• By analyzing the results of PP1 and PP2, we were able to discern distinct groups of invoices that exhibited particularly long or short lifecycle times.
While some of these cases may be justifiable due to factors such as long-term contracts or emergency maintenance orders, we also identified outliers that cannot be explained by these criteria. From the perspective of business users, these outliers may be considered excessive. To address this issue, potential interventions may include updating relevant master data and conducting interdepartmental workshops to address underlying issues.
• By considering the results of PP4 and PP5, we were able to determine whether invoices with prolonged lifecycle times were affected by issues related to purchase orders, such as long processing times or bottlenecks. This information was used to inform the design of improved communication strategies between the departments responsible for invoice payments and ordering goods, to address the identified issues and anticipate them in the future.
• The identification and examination of outliers in interconnected processes (PQ2) helped to initiate a discussion about the desired behavior in these processes and what constitutes non-compliant behavior. This ultimately led to a deeper understanding of the processes and how to improve them.

Related Work
This paper described process mining analyses on top of a Purchase-to-Pay process supported by SAP ERP. Therefore, we consider scientific results related to the extraction of traditional/object-centric event data and the results of previous case studies. Moreover, given the interconnected nature of objects in SAP, we also consider graph-based techniques.
Extraction of Traditional Event Logs from SAP ERP: SAP is an interesting system for process mining due to its widespread usage by companies and the unstructuredness of the supported processes. Hence, several process mining publications have targeted the extraction of data from SAP ERP. In [13], a method is proposed for the extraction and transformation of event logs from SAP ERP, which involves the manual specification of a meta-model defining how events, resources, and their relationships are stored. The method has been applied to SAP systems provided by the Norwegian Agricultural and Marketing Cooperative and Nidar. Some limitations exist: although all the information (transactional, master, and ontological data) needed to extract meaningful process models is available, the transactions are not mapped directly to the tasks. Moreover, it was not possible to map the extracted transaction flow to the processes in the SAP reference model. In [14], a meta-model is described that can ingest the contents of a relational database and provides the possibility to easily specify queries producing an event log. The meta-model can be used on a database supporting SAP ERP. However, this leads to the generation of traditional event logs suffering from convergence/divergence issues [4].
Extraction of Object-Centric Event Data from SAP ERP: some approaches have been proposed to avoid the drawbacks of using traditional event logs. In [15], the construction of artifact-centric models on top of SAP ERP is proposed, along with an implementation in the popular ProM 6.x framework. An artifact-centric model considers both the lifecycle of an artifact (purchase order document, invoice document, payment document) and the interaction between different artifacts. Some limitations exist: the approach requires some non-trivial manual steps, and the discovery phase is limited to two artifacts. In [16], a method to extract object-centric event logs from SAP ERP is proposed, which is the foundation of the current paper. The proposed prototypal software is limited by its in-memory approach and its customization options. Moreover, some fundamental details (the construction of the GoR and the extraction of the relationships between the table entries) have been omitted for space reasons.
Process Mining Case Studies on top of SAP ERP: SAP ERP stores interesting but company-critical data. Therefore, few case studies applying process mining on top of SAP ERP data have been proposed. In [17], an application of process mining to the Order-to-Cash and Procure-to-Pay processes of a manufacturing company is proposed. An application of process mining to the Procure-to-Pay and Accounts Payable processes is proposed in [18,19]. The warehouse management process is considered in [20]. Also, [21] discusses the implementation of a decision support system, supported by process mining, for the standardization of ERP systems.
Graph-Based Analyses of SAP ERP: the graph-based nature of event data is exploited in [22]. Traditional (and object-centric) event logs can be encoded in a graph database. This allows for queries that are infeasible on top of relational databases, since edges are first-class entities in graph databases. An application to ERP systems (BPI Challenge 2019 log) is proposed. The contribution in [23] further exploits the graph- and object-based nature of event data to build event knowledge graphs. This data structure allows us to naturally model behavior over multiple entities as a network of events.

Conclusion
In this paper, we presented a case study concerning the application of object-centric process mining techniques to a real-life P2P process, along with novel ad-hoc techniques that were needed to process the event data. We adapted the PM2 process mining project methodology to the object-centric setting, in particular in the extraction (retrieval of the events, objects, and event-to-object relationships), preprocessing (proposing a simple filtering/sampling technique in the object-centric setting), and analysis (with the usage of object-centric process mining techniques) phases. During the analysis, we considered both traditional and novel techniques. In particular, we proposed process-tailored graph-based and statistical paradigms. The results that we obtained were interesting with regard to the performance and compliance of the process (Section 8). Currently, we have enacted some of the insights, and some other optimizations are planned (Section 9). Overall, the usage of object-centric process mining, in contrast with traditional process mining, allows the retrieval of the correct number of documents for every stage of the process and allows for more reliable end-to-end performance measurement.
We found that the discipline of object-centric process mining is still in an early stage concerning project support. In particular, we needed to develop extraction/preprocessing techniques. Therefore, further work is needed to increase the adoption of object-centric process mining in real-life contexts.

Declarations
Ethical Approval: Not Applicable.