Event correlation for process discovery from web service interaction logs

  • Regular Paper
The VLDB Journal

Abstract

Understanding, analyzing, and ultimately improving business processes is a goal of enterprises today. These tasks are challenging because business processes in modern enterprises are implemented over several applications and Web services, and the information about process execution is scattered across several data sources. Understanding modern business processes entails identifying the correlation between events in these data sources in the context of business processes, where event correlation is the process of finding relationships between events that belong to the same process execution instance. In this paper, we investigate the problem of event correlation for business processes that are realized through the interactions of a set of Web services. We identify various ways in which process-related events could be correlated, and we investigate the problem of discovering event correlation (semi-)automatically from service interaction logs. We introduce the concept of a process view to represent the process resulting from a certain way of correlating events, and that of a process space to refer to the set of possible process views over process events. Event correlation is a challenging problem because there are various ways in which process events could be correlated, and in many cases the choice is subjective. Exploring all possible correlations is computationally expensive, and only some of the correlated event sets result in process views that are interesting. We propose efficient algorithms and heuristics to identify correlated event sets that potentially lead to interesting process views. To account for the subjectivity of event correlation, we have designed the discovery process to be interactive, so that users can guide it toward process views of interest to them. The discovered process views are organized into a process map that allows users to navigate the process space effectively and identify the views of interest. We report on experiments performed on both synthetic and real-world datasets that show the viability and efficiency of the approach.
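
To make the notions of event correlation and process view concrete, the following is a minimal sketch and not the algorithm proposed in the paper. It assumes the simplest kind of correlation condition, equality on a single attribute, and uses a hypothetical log whose attribute names (`order_id`, `customer_id`) and operation names are invented for illustration. Correlating the same events in two different ways partitions them into two different sets of process instances, that is, two points in the process space.

```python
from collections import defaultdict

# Hypothetical service interaction log; all names and values are illustrative.
LOG = [
    {"id": 1, "op": "submitOrder", "order_id": "o1", "customer_id": "c1"},
    {"id": 2, "op": "submitOrder", "order_id": "o2", "customer_id": "c1"},
    {"id": 3, "op": "payInvoice", "order_id": "o1", "customer_id": "c1"},
    {"id": 4, "op": "submitOrder", "order_id": "o3", "customer_id": "c2"},
    {"id": 5, "op": "shipOrder", "order_id": "o2", "customer_id": "c1"},
]


def correlate(log, attribute):
    """Group event ids into instances: two events are treated as correlated
    when they carry the same value for `attribute` (an assumed, simplified
    correlation condition)."""
    instances = defaultdict(list)
    for event in log:
        instances[event[attribute]].append(event["id"])
    return dict(instances)


# Two candidate correlations over the same events.
by_order = correlate(LOG, "order_id")        # one instance per order
by_customer = correlate(LOG, "customer_id")  # one instance per customer

print(by_order)
print(by_customer)
```

Here correlating on `order_id` yields three instances and correlating on `customer_id` yields two. Which of the two is the more useful view depends on what the analyst wants to study, which illustrates why the abstract describes the choice as subjective.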



Author information

Corresponding author

Correspondence to Hamid Reza Motahari-Nezhad.

About this article

Cite this article

Motahari-Nezhad, H.R., Saint-Paul, R., Casati, F. et al. Event correlation for process discovery from web service interaction logs. The VLDB Journal 20, 417–444 (2011). https://doi.org/10.1007/s00778-010-0203-9

  • DOI: https://doi.org/10.1007/s00778-010-0203-9
