A Cloud-Based Prediction Framework for Analyzing Business Process Performances

  • Eugenio Cesario
  • Francesco Folino (email author)
  • Massimo Guarascio
  • Luigi Pontieri
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9817)


This paper presents a framework for analyzing and predicting the performance of a business process, based on historical data gathered during its past enactments. The framework hinges on an inductive-learning technique for discovering a special kind of predictive process model, which supports the run-time prediction of a performance measure (e.g., the remaining processing time or a risk indicator) for an ongoing process instance. The model relies on a modular representation of the process, in which the major performance-relevant variants are each equipped with a distinct regression model and are discriminated through context variables. The technique is an original combination of different data mining methods (namely, non-parametric regression methods and a probabilistic trace clustering scheme) with ad hoc data transformation mechanisms, meant to bring the log traces to a suitable level of abstraction. To overcome the severe scalability limitations of current solutions in the literature and make the approach suitable for large logs, the computation of both the trace clusters and the clusters' predictors is implemented in a parallel and distributed manner, on top of a cloud-based service-oriented infrastructure. Tests on a real-life log confirmed the validity of the proposed approach, in terms of both effectiveness and scalability.
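The cluster-wise prediction idea summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the bag-of-activities abstraction, the k-nearest-neighbour regressor (standing in for an unspecified non-parametric regression method), the function and class names, and the toy data are all assumptions made for illustration. The cluster membership probabilities are assumed to come from some model over context variables and are simply given here.

```python
from collections import Counter


def abstract_trace(prefix):
    """Assumed abstraction: a bag of activities (counts kept, ordering dropped)."""
    return Counter(prefix)


def distance(a, b):
    """L1 distance between two bags of activities."""
    return sum(abs(a[k] - b[k]) for k in set(a) | set(b))


class KnnRegressor:
    """Hypothetical per-cluster predictor: mean target of the k nearest examples."""

    def __init__(self, k=1):
        self.k = k
        self.examples = []

    def fit(self, examples):
        # examples: iterable of (abstracted prefix, remaining time) pairs
        self.examples = list(examples)
        return self

    def predict(self, state):
        nearest = sorted(self.examples, key=lambda ex: distance(ex[0], state))[: self.k]
        return sum(target for _, target in nearest) / len(nearest)


def predict_remaining_time(prefix, membership, regressors):
    """Combine the per-cluster predictions, weighted by cluster membership."""
    state = abstract_trace(prefix)
    total = sum(membership.values())
    return sum(p / total * regressors[c].predict(state) for c, p in membership.items())


# Toy data (invented): two process variants with different timing behaviour.
regressors = {
    "fast": KnnRegressor(k=1).fit([
        (abstract_trace(["a"]), 10.0),
        (abstract_trace(["a", "b"]), 5.0),
    ]),
    "slow": KnnRegressor(k=1).fit([
        (abstract_trace(["a"]), 40.0),
        (abstract_trace(["a", "b"]), 20.0),
    ]),
}

# Membership probabilities assumed to be derived from the context variables.
estimate = predict_remaining_time(["a", "b"], {"fast": 0.75, "slow": 0.25}, regressors)
```

Each cluster's regressor is fitted independently of the others in this sketch, so the per-cluster training step is a natural unit to distribute across workers, which matches the parallelization the abstract describes.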


Data mining, Prediction, BPM, Cloud/grid computing



Copyright information

© IFIP International Federation for Information Processing 2016

Authors and Affiliations

  • Eugenio Cesario (1)
  • Francesco Folino (1, email author)
  • Massimo Guarascio (1)
  • Luigi Pontieri (1)

  1. ICAR-CNR, National Research Council of Italy, Rende, Italy
