
A graph-theoretic method for the inductive development of reference process models


Business process management is one of the most widely discussed topics in information systems research. As process models advance in both complexity and maturity, reference models, serving as reusable blueprints for the development of individual models, gain more and more importance. Only a few business domains have access to commonly accepted reference models, so there is a widespread need for the development of new ones. This article describes a new inductive approach for the development of reference models, based on existing individual models from the respective domain. It employs a graph-based paradigm, exploiting the underlying graph structures of process models by identifying frequent common subgraphs of the individual models, analyzing their order relations, and merging them into a new model. This newly developed approach is outlined and evaluated in this contribution. It is applied in three different case studies and compared to other approaches to the inductive development of reference models in order to highlight its characteristics as well as assets and drawbacks.



  1. Cf. attribute “construction method” in the reference model catalog at http://rmk.iwi.uni-sb.

  2. Technically, the number of input models is not restricted. Factually, a very large number of input models will not result in a meaningful reference model. This is a general limitation of inductive approaches, which is also addressed in Sect. 9.

  3. This label-based mapping is a loophole, such that the algorithm may be used even without a mapping. However, such a label-based mapping may significantly reduce the quality of the resulting reference model, as not all node analogies may be captured.

  4.

  5. Technically, a singleton subgraph is simply a node. However, we deliberately use the term “singleton subgraph” to stress the fact that it is contained in several models simultaneously. Nodes are associated with one specific model only, whereas singleton subgraphs are related to a set of models.

  6. Cf. our project Web site (partially in German).

  7. The acronym MOR was chosen by Vogelaar et al. [48] in their respective study. Although it is not clear what MOR stands for, we use the name for comparability and simplicity.

  8. We acknowledge that runtimes can only realistically be compared when the programs are executed on comparable IT infrastructures. However, none of the other publications specify the infrastructure used for their experiments. As we used a fairly standard PC for our computations (as specified in Sect. 7), we still assume the runtimes to be at least roughly comparable.

  9. Technically, there are four models for this domain. Since two of them are identical, only one is included in our computation.


  1. Ahlemann, F., Gastl, H.: Process model for an empirically grounded reference model construction. In: Fettke, P., Loos, P. (eds.) Reference Modeling for Business Systems Analysis. IGI Global, Hershey (2007)
  2. Aier, S., Fichter, M., Fischer, C.: Referenzprozesse empirisch bestimmen Von Vielfalt zu Standards. Wirtsch. Manag. 3(3), 14–22 (2011). (in German)
  3. Ardalani, P., Houy, C., Fettke, P., Loos, P.: Towards a minimal cost of change approach for inductive reference model development. In: Proceedings of the 21st European Conference on Information Systems. European Conference on Information Systems, AIS (2013)
  4. Becker, J., Delfmann, P., Knackstedt, R., Kuropka, D.: Konfigurative Referenzmodellierung. In: Becker, J., Knackstedt, R. (eds.) Wissensmanagement mit Referenzmodellen: Konzepte für die Anwendungssystem- und Organisationsgestaltung, pp. 25–144. Springer, Berlin (2002)
  5. Becker, J., Meise, V.: Strategy and organizational frame. In: Becker, J., Kugeler, M., Rosemann, M. (eds.) Process Management. A Guide for the Design of Business Processes, pp. 91–132. Springer, Berlin (2011)
  6. Becker, J., Schütte, R.: Reference information systems for retail: definition, use and recommendations for design and company-specific adaption of reference models. In: Wirtschaftsinformatik, pp. 427–448. Springer (1997) (in German)
  7. Becker, J., Schütte, R.: A reference model for retail enterprises. In: Fettke, P., Loos, P. (eds.) Reference Modeling for Business Systems Analysis. IGI Global, Hershey (2007)
  8. Becker, M., Laue, R.: A comparative survey of business process similarity measures. Comput. Ind. 63(2), 148–167 (2012)
  9. Castano, S., De Antonellis, V., Fugini, M.G., Pernici, B.: Conceptual schema analysis: techniques and applications. ACM Trans. Database Syst. 23(3), 286–333 (1998)
  10. Cayoglu, U., Dijkman, R., Dumas, M., Fettke, P., García-Bañuelos, L., Hake, P., Klinkmüller, C., Leopold, H., Ludwig, A., Loos, P., Mendling, J., Oberweis, A., Schoknecht, A., Sheetrit, E., Thaler, T., Ullrich, M., Weber, I., Weidlich, M.: The process model matching contest 2013. In: 4th International Workshop on Process Model Collections: Management and Reuse. Springer (2013)
  11. Cordella, L., Foggia, P., Sansone, C., Vento, M.: A (sub)graph isomorphism algorithm for matching large graphs. IEEE Trans. Pattern Anal. Mach. Intell. 26(10), 1367–1372 (2004)
  12. Dijkman, R., Dumas, M., García-Bañuelos, L.: Graph matching algorithms for process model similarity search. In: Business Process Management, pp. 48–63. Springer (2009)
  13. Dijkman, R., La Rosa, M., Reijers, H.: Managing large collections of business process models-current techniques and challenges. Comput. Ind. 63(2), 91–97 (2012)
  14. Dorn, F.: Planar subgraph isomorphism revisited. In: 27th International Symposium on Theoretical Aspects of Computer Science-STACS 2010, pp. 263–274 (2010)
  15. Eppstein, D.: Subgraph isomorphism in planar graphs and related problems. In: Proceedings of the sixth annual ACM–SIAM symposium on Discrete algorithms, pp. 632–640. Society for Industrial and Applied Mathematics (1995)
  16. Fettke, P., Loos, P.: Perspectives on reference modeling. In: Fettke, P., Loos, P. (eds.) Reference Modeling for Business Systems Analysis, pp. 1–20. Idea Group Publishing, Hershey (2007)
  17. Fill, H.G.: On the conceptualization of a modeling language for semantic model annotations. In: Salinesi, C., Pastor, O. (eds.) Advanced Information Systems Engineering Workshops, vol. 83, pp. 134–148. Springer, Berlin (2011)
  18. Foggia, P., Sansone, C., Vento, M.: A performance comparison of five algorithms for graph isomorphism. In: Proceedings of the 3rd IAPR TC-15 Workshop on Graph-based Representations in Pattern Recognition, pp. 188–199 (2001)
  19. Garey, M.R., Johnson, D.S.: Computers and Intractability; A Guide to the Theory of NP-Completeness. W. H. Freeman & Co., San Francisco (1990)
  20. Gottschalk, F., Van Der Aalst, W., Jansen-Vullers, M.: Mining reference process models and their configurations. In: Meersman, R., Tari, Z., Herrero, P. (eds.) On the Move to Meaningful Internet Systems: OTM 2008 Workshops. Lecture Notes in Computer Science, vol. 5333, pp. 263–272. Springer, Berlin (2008)
  21. Gregor, S., Hevner, A.R.: Positioning and presenting design science research for maximum impact. MIS Q. 37(2), 337–356 (2013)
  22. Gröger, S., Decker, J., Schumann, M.: Do universities get the hang of working efficiently?—A survey of the influencing factors on the adoption of electronic document and workflow management in german-speaking countries. In: Proceedings of the Twentieth Americas Conference on Information Systems, Savannah (2014)
  23. Gröger, S., Schumann, M.: It-unterstützung zur verbesserung der dritt-und sondermittelbewirtschaftung an hochschulen-state of the art. Tech. rep., University of Göttingen (2013)
  24. Hevner, A., March, S., Park, J., Ram, S.: Design science in information systems research. MIS Q. 28(1), 75–105 (2004)
  25. Houy, C., Fettke, P., Loos, P.: On the theoretical foundations of research into the understandability of business process models. In: Proceedings of the 22nd European Conference on Information Systems (ECIS-14), Tel Aviv, Israel, AIS, 9–11 June (2014)
  26. Karow, M., Pfeiffer, D., Räckers, M.: Empirical-based construction of reference models in public administrations. In: Multikonferenz Wirtschaftsinformatik, pp. 1613–1624 (2008)
  27. Klinkmüller, C., Weber, I., Mendling, J., Leopold, H., Ludwig, A.: Increasing recall of process model matching by improved activity label matching. In: Business Process Management, pp. 211–218. Springer (2013)
  28. Kuramochi, M., Karypis, G.: Finding frequent patterns in a large sparse graph*. Data Min. Knowl. Discov. 11(3), 243–271 (2005)
  29. La Rosa, M., Dumas, M., Uba, R., Dijkman, R.: Business process model merging: an approach to business process consolidation. ACM TOSEM 22(2), 11 (2013)
  30. Levenshtein, V.: Binary codes capable of correcting deletions, insertions, and reversals. Sov. Phys. Dokl. 10, 707–710 (1966)
  31. Li, C., Reichert, M., Wombacher, A.: On measuring process model similarity based on high-level change operations. In: Proceedings of the 27th International Conference on Conceptual Modeling, pp. 248–264. Springer (2008)
  32. Li, C., Reichert, M., Wombacher, A.: Discovering reference models by mining process variants using a heuristic approach. In: BPM’09, no. 5701 in LNCS, pp. 344–362. Springer (2009)
  33. Li, C., Reichert, M., Wombacher, A.: The minadept clustering approach for discovering reference process models out of process variants. Int. J. Coop. Inf. Syst. 19(3–4), 159–203 (2010)
  34. Lu, R., Sadiq, S.: Managing process variants as an information resource. In: Business Process Management, pp. 426–431 (2006)
  35. Martens, A., Fettke, P., Loos, P.: A genetic algorithm for the inductive derivation of reference models using minimal graph-edit distance applied to real-world business process data. In: Kundisch, D., Suhl, L., Beckmann, L. (eds.) Tagungsband Multikonferenz Wirtschaftsinformatik. Universität Paderborn, Paderborn (2014)
  36. Melcher, J.: Process Measurement in Business Process Management: Theoretical Framework and Analysis of Several Aspects. KIT Scientific Publishing, Karlsruhe (2014)
  37. Peffers, K., Tuunanen, T., Rothenberger, M., Chatterjee, S.: A design science research methodology for information systems research. J. Manag. Inf. Syst. 24(3), 45–77 (2007)
  38. Polyvyanyy, A., Smirnov, S., Weske, M.: Business process model abstraction. Handb. Bus. Process Manag. 1, 147–165 (2015)
  39. Rehse, J.R., Fettke, P., Loos, P.: Eine Untersuchung der Potentiale automatisierter Abstraktionsansätze für Geschäftsprozessmodelle im Hinblick auf die induktive Entwicklung von Referenzprozessmodellen. In: Alt, R., Franczyk, B. (eds.) Proceedings of the 11th International Conference on Wirtschaftsinformatik, Leipzig, Germany (2013) (in German)
  40. Skiena, S.: Graph problems: hard problems. In: The Algorithm Design Manual, pp. 523–561. Springer, London (2008)
  41. Song, W., Liu, S., Liu, Q.: Business process mining based on simulated annealing. In: The 9th International Conference for Young Computer Scientists, IEEE, pp. 725–730 (2008)
  42. Thaler, T., Hake, P., Fettke, P., Loos, P.: Evaluating the evaluation of process matching techniques. In: Suhl, L., Kundisch, D. (eds.) Tagungsband der Multikonferenz Wirtschaftsinformatik. Universität Paderborn, Paderborn (2014)
  43. Thaler, T., Walter, J., Ardalani, P., Fettke, P., Loos, P.: Development and usage of a process model corpus. In: Proceeding of the 24th International Conference on Information Modelling and Knowledge Bases EJC (2014)
  44. Uba, R., Dumas, M., García-Bañuelos, L., La Rosa, M.: Clone detection in repositories of business process models. In: Business Process Management, pp. 248–264. Springer (2011)
  45. Van Der Aalst, W.: Process Mining: Discovery, Conformance and Enhancement of Business Processes. Springer, Berlin (2011)
  46. van der Aalst, W.: Business process management: a comprehensive survey. ISRN Software Engineering (2013)
  47. Van der Aalst, W., van Dongen, B., Herbst, J., Maruster, L., Schimm, G., Weijters, A.: Workflow mining: a survey of issues and approaches. Data Knowl. Eng. 47(2), 237–267 (2003)
  48. Vogelaar, J., Verbeek, H., Luka, B., van der Aalst, W.M.: Comparing business processes to determine the feasibility of configurable models: a case study. In: Business Process Management Workshops, pp. 50–61. Springer (2012)
  49. Walter, J., Fettke, P., Loos, P.: How to identify and design successful business process models: an inductive method. In: Becker, J., Matzner, M. (eds.) Promoting Business Process Management Excellence in Russia. Proceedings and Report of the PropelleR 2012 Workshop. Innovation Forum PropelleR, vol. 15, pp. 89–96. European Research Center for Information Systems, Münster (2013)
  50. Weidlich, M., Dijkman, R., Mendling, J.: The icop framework: identification of correspondences between process models. In: Advanced Information Systems Engineering, pp. 483–498. Springer (2010)
  51. Weidlich, M., Sheetrit, E., Branco, M.C., Gal, A.: Matching business process models using positional passage-based language models. In: Conceptual Modeling, pp. 130–137. Springer (2013)
  52. Weske, M.: Business Process Management: Concepts, Languages, Architectures. Springer, Berlin (2007)
  53. Yahya, B.N., Bae, H., Bae, J., Kim, D.: Generating valid reference business process model using genetic algorithm. Int. J. Innov. Comput. Inf. Control 8(2), 1463–1477 (2012)
  54. Yan, X., Han, J.: Closegraph: mining closed frequent graph patterns. In: Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 286–295. ACM (2003)



The research described in this paper was partly supported by a grant from the German Research Foundation (DFG), project name: Konzeptionelle, methodische und technische Grundlagen zur induktiven Erstellung von Referenzmodellen (Reference Model Mining), support code GZ LO 752/5-1. The authors would like to thank the three anonymous reviewers and the editor-in-chief for their valuable comments and insights, which helped to improve this contribution.

Author information



Corresponding author

Correspondence to Jana-Rebecca Rehse.

Additional information

Communicated by Prof. Ulrich Frank.

Example computation of RMM-1



The main objective of the running example in Sect. 5 is to explain the basic functionality of the RMM-1 approach. The models are small and rather synthetic in nature. They are chosen such that no difficulties arise in computing the subgraphs and aggregating the relations. This is necessary to illustrate the formally defined algorithm behind the RMM-1 approach. However, this running example does not provide a real-world application scenario of our approach. As the increased applicability and flexibility of RMM-1 are stressed in this contribution, we intend to demonstrate them by means of a complete and realistic running example. Hence, we use this chapter for a thorough application of RMM-1 to a set of real-world models. This way, we are able to illustrate the difficulties in inductive reference model computation and also the capabilities of our approach. For an appropriate set of input data, we need models from a real-world context that are big enough to be representative, but small enough to be fully depicted here. The models should not fulfill all the restrictions imposed by RMM-1, as this would not allow us to discuss its practical applicability. Dealing with the properties of realistic models is one of the challenges that our approach has to overcome.

A contribution by Gröger et al. provided us with a convenient set of data [22]. In order to analyze the processes of public institutions, the authors conducted an extensive study to document the processes concerning third-party funding across several German universities. The identified processes are published in an accompanying technical report [23]. For our specific application, these models have several advantages. First, the data represent a realistic use case of process model collection and reference model development. The depicted models illustrate processes that are regularly executed in university administrations. Hence, they provide us with a real-world application scenario that is well suited to discuss the applicability of RMM-1. Second, the data were collected at four different German universities, so they include both individual features and commonalities. In addition, a reference model is provided for each of the examined domains. As these reference models were generated manually based on the assessment of a modeling expert, they provide an excellent indicator for the quality of our own reference model. Finally, all the models are publicly available, so readers can verify both the quality of the input models and the validity of our results.

For our running example, we choose the domain of applying for third-party funding. The chosen set of models contains three process models for three different universities (cf. footnote 9).


Since the models are originally depicted in BPMN, we converted them into EPCs by applying the following rules.

  • A BPMN event is converted into an EPC event.

  • A BPMN activity is converted into an EPC function.

  • BPMN lanes are converted into organizational units. All BPMN activities in a lane are assigned the corresponding organizational unit when converted into a function.

  • Data objects are not depicted in the EPC.

  • An exclusive gateway is converted into an XOR-connector. The label of such a gateway is depicted as a function preceding the connector. The labels of the outgoing edges are converted into events that succeed the connector.

  • An inclusive gateway is converted into an OR-connector.

  • In BPMN models, an event may follow after a yes/no-decision or another event. In this case, an additional function is inserted into the EPC to represent the action leading to this event (e.g., “Stop project attempt”).

  • If several control flows point toward a BPMN activity, an XOR-connector is inserted preceding the corresponding EPC function.

  • A BPMN subprocess is converted into an EPC function.

  • Additional text annotations are not represented in the EPC.

  • BPMN time events are converted into a function, if they represent an action that requires a certain time period (e.g., “wait until”). If they represent a specific point in time, they are converted into an event.

  • If an activity activates several control flows, an additional AND-connector is inserted into the EPC.
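The element-wise rules above can be read as a lookup from BPMN element types to EPC node types. The following Python sketch covers only these context-free rules. The type names (such as `"exclusive_gateway"`) and the function `convert_element` are hypothetical and chosen for illustration; they are not taken from the authors' tooling. The structural rules, which insert additional functions or connectors depending on the surrounding control flow, are not covered.

```python
from typing import Optional

# Hypothetical BPMN type names mapped to EPC node types (context-free rules only).
BPMN_TO_EPC = {
    "event": "event",
    "activity": "function",
    "subprocess": "function",
    "lane": "organizational_unit",
    "exclusive_gateway": "xor_connector",
    "inclusive_gateway": "or_connector",
}

# Element types that are not depicted in the EPC at all.
DROPPED = {"data_object", "text_annotation"}


def convert_element(bpmn_type: str, is_duration: bool = False) -> Optional[str]:
    """Return the EPC node type for a BPMN element, or None if it is dropped."""
    if bpmn_type in DROPPED:
        return None
    if bpmn_type == "time_event":
        # A waiting period becomes a function, a point in time becomes an event.
        return "function" if is_duration else "event"
    return BPMN_TO_EPC[bpmn_type]
```

For instance, `convert_element("time_event", is_duration=True)` corresponds to the “wait until” case in the rules and yields a function.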

The models already contained harmonized labels, which were manually translated from the original German into English. In addition, we applied the following measures to account for the restrictions of the RMM-1 approach.

  • In order to eliminate duplicate labels (Stop project attempt, Project attempt is stopped), the duplicate nodes are removed.

  • Also, to eliminate duplicate labels, events titled “yes” and “no” are renamed to be more precise.

  • As RMM-1 is not capable of handling loops, they are eliminated from the models.

  • Organizational units are removed from the models, since they are not considered by the algorithm.

These measures, especially deleting the nodes and loops, undoubtedly change the character of the input models, but they are necessary to apply the RMM-1 approach. This stresses the relevance of the (semi-)manual post-processing step. Altering the computed reference model according to the user’s requirements is necessary for process modelers to adapt the model to its intended purpose. Loops and duplicate nodes can be re-inserted if they are necessary or useful in the reference model.

The final models we use for our computation are shown in Fig. 6. Readers should note that the resulting models are technically not syntactically correct EPCs. One event (“Notification is received”) is followed by an XOR-connector, which splits into three events. These events represent the possible outcomes of the preceding function (“Wait on notification by sponsor”). If we insisted on syntactically correct EPCs, this event could be eliminated without substantially changing the process. However, this provides us with a good chance to illustrate the capabilities of the RMM-1 approach.

Fig. 6 Set of university processes used as input data

Because the models are harmonized in terms of labels, we do not require an external mapping. Instead, an implicit mapping is defined by the equality of node labels.
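A label-based implicit mapping can be sketched as a grouping of nodes by their label. The Python snippet below is a minimal illustration under the assumption that each model is given as a plain list of node labels; the function name `implicit_mapping` and the two toy models are hypothetical. The labels are taken from the running example, but the assignment of labels to models "A" and "B" is made up.

```python
from collections import defaultdict


def implicit_mapping(models: dict[str, list[str]]) -> dict[str, set[str]]:
    """Map each node label to the set of models that contain a node with that label."""
    mapping: dict[str, set[str]] = defaultdict(set)
    for model_name, labels in models.items():
        for label in labels:
            mapping[label].add(model_name)
    return dict(mapping)


# Toy data: which label occurs in which model is illustrative only.
models = {
    "A": ["Preparation of proposal", "Grant project"],
    "B": ["Preparation of proposal", "Reject project"],
}
mapping = implicit_mapping(models)
```

Nodes that share a label end up in the same entry, which is why duplicate labels within one model have to be eliminated beforehand.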

Identification of frequent subgraphs

The RMM-1 approach is applied to these models as specified in Sect. 5. After the pre-processing is completed in the first stage, the frequent common subgraphs are identified. With the given model set, we can experiment with the parameter configuration. As our input models are fairly similar to each other, we are able to compute an integrated reference model, i.e., a model that contains all options represented in the input models, if the abstraction parameter does not exceed \(66\,\%\). A higher abstraction parameter will yield a model that contains only those building blocks that are contained in every one of the input models. Such a reference model can be considered a common practice reference model, since it contains the most frequent activities. However, since the integrated model contains all possible (reference) model variance, we begin by computing it, including all the intermediate results. Hence, the abstraction parameter \(\alpha \) is set to \(66\,\%\), such that the reference model contains all nodes that are present in at least two input models. This applies to all the nodes, as illustrated in Fig. 6. As we will see later, it is possible to use exact aggregation in this use case, so we do not have to define a confidence level. The XOR-replacement parameter is set to true.

Assuming an abstraction parameter of \(\alpha = 66\,\%\), we can identify 22 different subgraphs with the following labels:

  1. Proposal is to be submitted,
  2. Application for initial funding (proposal),
  3. Preparation of proposal,
  4. Submission to sponsor (proposal),
  5. Proposal submitted?,
  6. Proposal is submitted,
  7. Proposal is not submitted,
  8. Preparation of funding notification,
  9. Wait on notification by sponsor,
  10. Notification is received,
  11. Positive assessment,
  12. Negative assessment,
  13. Revision is required,
  14. Reject project,
  15. Project is rejected,
  16. Grant project,
  17. Project is granted,
  18. Perform revision?,
  19. Revision is performed,
  20. Revision is not performed,
  21. Stop project attempt,
  22. Project attempt is stopped.
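The effect of the abstraction parameter on this list can be checked with a small support count. The Python sketch below assumes that a subgraph is kept if it occurs in at least ⌈α · n⌉ of the n input models; this threshold rule is an assumption made for illustration, not the formal definition from Sect. 5. The presence pattern is the one reported for the three input models in the discussion of the order matrices: subgraphs 2, 5, 6, and 7 are missing from the first model, subgraph 8 is missing from the second, and the third contains all of them.

```python
import math

ALL = set(range(1, 23))  # the 22 subgraphs, numbered as in the list above

# Presence per input model: 2, 5, 6, 7 absent from the first model,
# 8 absent from the second, nothing absent from the third.
models = [ALL - {2, 5, 6, 7}, ALL - {8}, set(ALL)]


def frequent(models: list[set[int]], alpha: float) -> set[int]:
    """Return the subgraphs contained in at least ceil(alpha * n) models (assumed rule)."""
    min_support = math.ceil(alpha * len(models))
    return {s for s in ALL if sum(s in m for m in models) >= min_support}


integrated = frequent(models, 0.66)  # threshold: 2 of 3 models
common = frequent(models, 1.0)       # threshold: all 3 models
```

Under this assumption, `integrated` contains all 22 subgraphs, whereas `common` drops subgraphs 2, 5, 6, 7, and 8 and keeps the remaining 17.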

Computation and analysis of order matrices

In the next stage, we compute the order matrices for each of the three input models with regard to the 22 subgraphs listed above. To avoid duplicate information, we only display the aggregated order matrix, which combines the three individual order matrices. The relation entries in the arrays are ordered according to Fig. 6; i.e., the first relation entry represents the model on the far left. For the sake of readability, the nodes are referred to by the numbers assigned above. It is apparent that subgraphs 2, 5, 6, and 7 are not present in the first model and subgraph 8 is not present in the second model. The third model contains all of the computed subgraphs, so it may be considered a supermodel for the chosen domain.

To present the intermediate results in a comprehensible way, we only show the aggregated order matrix computed for the 22 subgraphs. As it contains the same information as the individual order matrices, no information is omitted. The aggregated matrix is shown in Table 11. Although the input models are relatively small, the intermediate results for the 22 identified subgraphs are already almost too large to be displayed in a form that a human reader can follow. If the models get any larger, users should focus on analyzing the final model instead of the intermediate results. The model contains the same information as the order matrix, but in a much denser and more comprehensible form.

Table 11 Aggregated order matrix

When analyzing the order matrix, several things can be observed. As the models only contain XOR-connectors, the \(\times \)-relation, along with the sequential relations, is the main component of the matrix. While the first part of the models is rather sequential, the second consists of several nested XOR-blocks. Due to the fairly high degree of similarity between the models, the relations are fairly similar as well, so we can rely on exact aggregation instead of having to use the more flexible heuristic aggregation. Obviously, the latter would yield the same results. Integrating the relation arrays from Table 11 is fairly straightforward, so we do not include the integrated order matrix separately. Instead, we focus on merging the identified subgraphs into a new reference model in the next section.
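To make the idea of exact aggregation more concrete, the following Python sketch integrates a single relation array, i.e., the relations that one pair of subgraphs has in each input model. It rests on simplifying assumptions: relations are encoded as plain strings (`"->"` for a sequential relation and `"x"` for the \(\times \)-relation are hypothetical encodings), `None` marks a model in which one of the two subgraphs is missing, and exact aggregation succeeds only if all remaining entries agree. The formal definition in Sect. 5 may differ in detail, and the example arrays are illustrative rather than taken from Table 11.

```python
from typing import Optional, Sequence


def aggregate_exact(relations: Sequence[Optional[str]]) -> Optional[str]:
    """Integrate one relation array; None marks a model lacking one of the subgraphs.

    Returns the shared relation if all models that contain both subgraphs agree,
    otherwise None (the case in which a heuristic aggregation would be needed).
    """
    present = {r for r in relations if r is not None}
    return present.pop() if len(present) == 1 else None


agreeing = aggregate_exact(["->", "->", "->"])   # all three models agree
partial = aggregate_exact([None, "x", "x"])      # subgraph missing in the first model
conflict = aggregate_exact(["->", "x", "x"])     # models disagree
```

In the use case described here, every array falls into one of the first two situations, which is why no confidence level has to be defined.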

Integration of model parts

For integrating the identified building blocks into a new model, RMM-1 uses an iterative approach based on row and column equality in the integrated order matrix. In the first step, all sequential subgraphs are aggregated. If there are no more connectable sequences, all operator blocks are formed by identifying the respective branches and including opening and closing operators. This process is repeated until the model is either connected or no more building blocks can be merged.

For our running example, the following integration steps are executed:

  1. Sequential Integration

    • Subgraphs 1–5 are merged into a sequence.

    • Subgraphs 6 and 8–10 are merged into a sequence.

    • Subgraphs 11, 16, and 17 are merged into a sequence.

    • Subgraphs 12, 14, and 15 are merged into a sequence.

    • Subgraphs 13 and 18 are merged into a sequence.

    • Subgraphs 20–22 are merged into a sequence.

  2. Operator Block Integration

    • Subgraphs 19 and 20–22 are merged into an operator block connected by an XOR.

    • Subgraphs 11/16/17 and 12/14/15 are merged into an operator block connected by an XOR.

  3. Sequential Integration: Subgraphs 13/18 and 19–22 are merged into a sequence.

  4. Operator Block Integration: Subgraphs 11–12/14–17 and 13/18–22 are merged into an operator block connected by an XOR.

  5. Sequential Integration: Subgraphs 6/8–10 and 11–22 are merged into a sequence.

  6. Operator Block Integration: Subgraphs 6/8–22 and 7 are merged into an operator block connected by an XOR.

  7. Sequential Integration: Subgraphs 1–5 and 6–22 are merged into a sequence.
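The seven steps can be replayed mechanically to confirm that they end in a single connected building block. The Python sketch below only tracks which subgraph numbers belong to the same block; it ignores the order within a sequence and the operator type, and it does not derive the merges from the order matrix as RMM-1 does. The helper names `r`, `singles`, and `merge` are hypothetical.

```python
def r(a: int, b: int) -> set[int]:
    """Inclusive range of subgraph numbers."""
    return set(range(a, b + 1))


def singles(*nums: int) -> list[set[int]]:
    """Singleton blocks for the given subgraph numbers."""
    return [{n} for n in nums]


def merge(blocks: set[frozenset[int]], parts: list[set[int]]) -> None:
    """Replace the given existing blocks by their union (in place)."""
    frozen = [frozenset(p) for p in parts]
    assert all(p in blocks for p in frozen), "each part must be a current block"
    blocks.difference_update(frozen)
    blocks.add(frozenset().union(*frozen))


blocks = {frozenset({n}) for n in range(1, 23)}  # start: 22 singleton subgraphs

merges = [
    # Step 1: sequential integration
    singles(1, 2, 3, 4, 5), singles(6, 8, 9, 10), singles(11, 16, 17),
    singles(12, 14, 15), singles(13, 18), singles(20, 21, 22),
    # Step 2: XOR operator blocks
    [{19}, {20, 21, 22}], [{11, 16, 17}, {12, 14, 15}],
    # Step 3: sequence
    [{13, 18}, r(19, 22)],
    # Step 4: XOR operator block
    [{11, 12, 14, 15, 16, 17}, {13} | r(18, 22)],
    # Step 5: sequence
    [{6, 8, 9, 10}, r(11, 22)],
    # Step 6: XOR operator block
    [{6} | r(8, 22), {7}],
    # Step 7: sequence
    [r(1, 5), r(6, 22)],
]
for parts in merges:
    merge(blocks, parts)
```

The assertion inside `merge` checks that every step only combines blocks produced by earlier steps, so the replay fails loudly if a step is listed out of order.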

After seven integration steps, the model is fully connected. However, it contains a few redundant operators that can be removed automatically. In addition, it can be post-processed manually to better represent the original models.


The main purpose of the post-processing step is to account for drawbacks and restrictions of the automated computation of the RMM-1 approach that took place in the preceding stages. As we point out above, there are a number of algorithmic restrictions that have to be considered when applying RMM-1. However, post-processing the preliminary reference model, which is computed in a fully automated way, allows us to transform it into a meaningful reference model candidate. Post-processing can be performed both automatically and manually. RMM-1 itself offers several measures to correct syntactic mistakes or irregularities of the preliminary model. Also, the RefMod-Miner as a surrounding implementation offers a variety of analysis tools, which may highlight additional improvement potentials. Finally, a user is enabled to further process the model according to the application-specific requirements. Manual post-processing steps may include re-naming, moving, or deleting activities, inserting loops or duplicate nodes that were removed in the pre-processing steps, or inserting additional deductively developed model parts.

Figure 7 shows three reference models for the domain of our application scenario (applying for third-party funding of research activities at universities). The model on the very left is the preliminary reference model that is returned by the last automated stage of the RMM-1 approach. Readers may recreate this model by following the integration steps listed in the previous section. It is evident that this model should not directly be used as a reference model, as it offers improvement potential on both the syntactical and the semantic level. In the center of Fig. 7, we see the preliminary reference model after the automatic post-processing steps. It should be noted that while this model differs significantly from the first model in terms of syntax, their semantics are identical in the sense that both models allow for executing the same set of process traces. However, the second model is both smaller and easier to understand due to the reduction in edges and connectors. It hence fulfills the main purpose of the automatic post-processing, which is to improve the appearance and comprehensibility of the preliminary reference model.
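One simple kind of trace-preserving cleanup is to remove connectors that neither split nor join the control flow. The Python sketch below illustrates this under stated assumptions: the model is an acyclic directed graph given as a set of edges, and a connector with exactly one incoming and one outgoing edge is considered redundant. Whether RMM-1 applies exactly this rule is not specified here; the function name and the toy graph are hypothetical.

```python
def remove_trivial_connectors(
    edges: set[tuple[str, str]], connectors: set[str]
) -> set[tuple[str, str]]:
    """Bypass every connector that has exactly one incoming and one outgoing edge."""
    edges = set(edges)  # work on a copy
    changed = True
    while changed:
        changed = False
        for c in sorted(connectors):
            incoming = [e for e in edges if e[1] == c]
            outgoing = [e for e in edges if e[0] == c]
            # Skip self-loops; the input is assumed to be acyclic anyway.
            if len(incoming) == 1 and len(outgoing) == 1 and incoming[0][0] != c:
                edges -= {incoming[0], outgoing[0]}
                edges.add((incoming[0][0], outgoing[0][1]))
                changed = True
    return edges


# Toy graph: x1 is a pass-through connector, x2 is a real split.
example = {("a", "x1"), ("x1", "b"), ("b", "x2"), ("x2", "c"), ("x2", "d")}
cleaned = remove_trivial_connectors(example, {"x1", "x2"})
```

In the toy graph, `x1` is bypassed by a direct edge from `a` to `b`, while the split at `x2` is left untouched, so the set of possible paths through the graph stays the same.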

Fig. 7 Preliminary, automatically and manually post-processed reference models

The result of a manual post-processing step is shown on the very right of Fig. 7. The displayed model contains a loop for the revision of the proposal as well as an additional connector, which allows a proper closing of the project attempt in case no revision is performed. Both of these measures account for changes that had to be made during pre-processing in order to fulfill the algorithmic requirements of RMM-1. As manual post-processing depends on the choice of the respective user and the intended use of the reference model, this can only serve as a suggestion or an example of possible manual post-processing measures. In this use case, we intend to show that the restrictions of RMM-1 do not rule out its application in a real-world scenario.


About this article


Cite this article

Rehse, JR., Fettke, P. & Loos, P. A graph-theoretic method for the inductive development of reference process models. Softw Syst Model 16, 833–873 (2017).



  • Reference modeling
  • Frequent subgraphs
  • Order matrices
  • Inductive development of reference models