The VLDB Journal, Volume 20, Issue 3, pp 417–444

Event correlation for process discovery from web service interaction logs

  • Hamid Reza Motahari-Nezhad
  • Regis Saint-Paul
  • Fabio Casati
  • Boualem Benatallah
Regular Paper

Abstract

Understanding, analyzing, and ultimately improving business processes is a goal of enterprises today. These tasks are challenging because business processes in modern enterprises are implemented over several applications and Web services, and the information about process execution is scattered across several data sources. Understanding modern business processes entails identifying the correlation between events in data sources in the context of business processes (event correlation is the process of finding relationships between events that belong to the same process execution instance). In this paper, we investigate the problem of event correlation for business processes that are realized through the interactions of a set of Web services. We identify various ways in which process-related events could be correlated, and we investigate the problem of discovering event correlation (semi-)automatically from service interaction logs. We introduce the concept of a process view to represent the process resulting from a certain way of correlating events, and that of a process space to refer to the set of possible process views over process events. Event correlation is a challenging problem because there are various ways in which process events could be correlated, and in many cases the choice is subjective. Exploring all possible correlations is computationally expensive, and only some of the correlated event sets result in process views that are interesting. We propose efficient algorithms and heuristics to identify correlated event sets that potentially lead to interesting process views. To account for this subjectivity, we have designed the event correlation discovery process to be interactive, enabling users to guide it toward process views of interest. We also organize the discovered process views into a process map that allows users to navigate the process space effectively and identify the views of interest. We report on experiments performed on both synthetic and real-world datasets that show the viability and efficiency of the approach.
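
As a rough illustration of the terms used in the abstract, the following minimal sketch treats one simple kind of correlation, in which two events are correlated when they share the same value of a chosen attribute. It is not the paper's algorithm: the log contents, the field names (`order_id`, `customer_id`, `op`), and the function names are hypothetical and chosen only for this example. Each choice of attribute partitions the log into process instances (one possible process view), and the collection of such partitions stands in for a small process space.

```python
from collections import defaultdict

# Hypothetical service interaction log; every field name is an assumption.
LOG = [
    {"id": 1, "op": "submitOrder", "order_id": "o1", "customer_id": "c1"},
    {"id": 2, "op": "submitOrder", "order_id": "o2", "customer_id": "c1"},
    {"id": 3, "op": "payInvoice", "order_id": "o1", "customer_id": "c1"},
    {"id": 4, "op": "shipOrder", "order_id": "o2", "customer_id": "c1"},
    {"id": 5, "op": "submitOrder", "order_id": "o3", "customer_id": "c2"},
]


def correlate_by_key(log, attribute):
    """Group event ids into instances: events sharing a value of
    `attribute` are considered part of the same process instance."""
    instances = defaultdict(list)
    for event in log:
        value = event.get(attribute)
        if value is not None:  # events lacking the attribute stay uncorrelated
            instances[value].append(event["id"])
    return dict(instances)


def process_space(log, attributes):
    """Build one candidate view per attribute (a toy stand-in for a process space)."""
    return {attribute: correlate_by_key(log, attribute) for attribute in attributes}


views = process_space(LOG, ["order_id", "customer_id"])
for attribute, instances in views.items():
    print(attribute, instances)
```

Correlating on the order identifier yields three instances, while correlating on the customer identifier yields two coarser ones. Both are valid groupings of the same events, which is one way to see why the choice of correlation can be subjective.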

Keywords

  • Business processes
  • Event correlation
  • Process views
  • Process spaces

Copyright information

© Springer-Verlag 2010

Authors and Affiliations

  • Hamid Reza Motahari-Nezhad (1, 2)
  • Regis Saint-Paul (3)
  • Fabio Casati (4)
  • Boualem Benatallah (2)

  1. HP Labs, Palo Alto, USA
  2. University of New South Wales, Sydney, Australia
  3. CREATE-NET, Trento, Italy
  4. University of Trento, Trento, Italy