Skip to main content
Log in

Discovery of Frequent Episodes in Event Sequences

  • Published:
Data Mining and Knowledge Discovery Aims and scope Submit manuscript

Abstract

Sequences of events describing the behavior and actions of users or systems can be collected in several domains. An episode is a collection of events that occur relatively close to each other in a given partial order. We consider the problem of discovering frequently occurring episodes in a sequence. Once such episodes are known, one can produce rules for describing or predicting the behavior of the sequence. We give efficient algorithms for the discovery of all frequent episodes from a given class of episodes, and present detailed experimental results. The methods are in use in telecommunication alarm management.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Similar content being viewed by others

References

  • Agrawal, R., Imielinski, T., and Swami, A. 1993. Mining association rules between sets of items in large databases. In Proceedings of ACM SIGMOD Conference on Management of Data (SIGMOD '93). Washington, D.C., pp. 207–216.

  • Agrawal, R. and Srikant, R. 1995. Mining sequential patterns. In Proceedings of the Eleventh International Conference on Data Engineering (ICDE '95). Taipei, Taiwan, pp. 3–14.

  • Agrawal, R., Mannila, H., Srikant, R., Toivonen, H., and Verkamo, A.I. 1996. Fast discovery of association rules. In Advances in Knowledge Discovery and Data Mining. U.M. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy (Eds.), Menlo Park, CA: AAAI Press, pp. 307–328.

    Google Scholar 

  • Bairoch, A., Bucher, P., and Hofmann, K. 1995. The PROSITE database, its status in 1995. Nucleic Acids Research, 24:189–196.

    Google Scholar 

  • Bettini, C., Wang, X.S., and Jajodia, S. 1996. Testing complex temporal relationships involving multiple granularities and its application to data mining. In Proceedings of the Fifteenth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS '96). Montréal, Canada, pp. 68–78.

  • Das, G., Fleischer, R., Gasieniec, L., Gunopulos, D., and Kärkkäinen, J. 1997. Episode matching. In Proceedings of the 8th Symposium on Combinatorial Pattern Matching (CPM '97). Aarhus, Denmark, pp. 12–27.

  • Denning, P.J. 1968. The working set model of program behavior. Communications of the ACM, 11(5):323–333.

    Google Scholar 

  • Dousson, C., Gaborit, P., and Ghallab, M. 1993. Situation recognition: Representation and algorithms. In Proceedings of the Thirteenth International Joint Conference on Artificial Intelligence (IJCAI-93). Chambery, France, pp. 166–172.

  • ExPASy Molecular Biology Server. Geneva University Hospital and University of Geneva, Switzerland. http://expasy.hcuge.ch/.

  • Forgy, C.L. 1982. Rete: A fast algorithm for the many pattern/many object pattern match problem. Artificial Intelligence, 19:17–37.

    Google Scholar 

  • Gehani, N., Jagadish, H., and Shmueli, O. 1992. Composite event specification in active databases. In Proceedings of the Eighteenth International Conference on Very Large Data Bases (VLDB '92). San Diego, CA, pp. 327–338.

  • Goodman, R.M., Ambrose, B.E., Latin, H.W., and Ulmer, C.T. 1995. Noaa-An expert system managing the telephone network. In Integrated Network Management IV, A.S. Sethi, Y. Raynaud, and F. Faure-Vincent (Eds.), London, UK: Chapman and Hall, pp. 316–327.

    Google Scholar 

  • Grossi, R. and Luccio, F. 1989. Simple and efficient string matching with k mismatches. Information Processing Letters, 33:113–120.

    Google Scholar 

  • Han, J. and Fu, Y. 1995. Discovery of multiple-level association rules from large databases. In Proceedings of the 21st International Conference on Very Large Data Bases (VLDB '95). Zurich, Swizerland, pp. 420–431.

  • Hätönen, K., Klemettinen, M., Mannila, H., Ronkainen, P., and Toivonen, H. 1996a. Knowledge discovery from telecommunication network alarm databases. In 12th International Conference on Data Engineering (ICDE '96). New Orleans, LA, pp. 115–122.

  • Hätönen, K., Klemettinen, M., Mannila, H., Ronkainen, P., and Toivonen, H. 1996b. TASA: Telecommunication alarm sequence analyzer, or how to enjoy faults in your network. In 1996 IEEE Network Operations and Management Symposium (NOMS '96). Kyoto, Japan, pp. 520–529.

  • Holsheimer, M., Kersten, M., Mannila, H., and Toivonen, H. 1995. A perspective on databases and data mining. In Proceedings of the First International Conference on Knowledge Discovery and Data Mining (KDD '95). Montréal, Canada, pp. 150–155.

  • Howe, A.E. 1995. Finding dependencies in event streams using local search. In Preliminary Papers of the Fifth International Workshop on Artificial Intelligence and Statistics. Ft. Lauderdale, FL, pp. 271–277.

  • Jakobson, G. and Weissman, M.D. 1993. Alarm correlation. IEEE Network, 7(6):52–59.

    Google Scholar 

  • Jonassen, I., Collins, J.F., and Higgins, D.G. 1995. Finding flexible patterns in unaligned protein sequences. Protein Science, 4(8):1587–1595.

    Google Scholar 

  • Kalbfleisch, J.D. and Prentice, R.L. 1980. The Statistical Analysis of Failure Time Data. New York, NY: John Wiley Inc.

    Google Scholar 

  • Laird, P. 1993. Identifying and using patterns in sequential data. In Algorithmic Learning Theory, 4th International Workshop (ALT 93). (Lecture Notes in Artificial Intelligence 744, Berlin: Springer-Verlag). Chofu, Japan, pp. 1–18.

  • Mannila, H., Toivonen, H., and Verkamo, A.I. 1995. Discovering frequent episodes in sequences. In Proceedings of the First International Conference on Knowledge Discovery and Data Mining (KDD '95). Montréal, Canada, pp. 210–215.

  • Mannila, H. and Toivonen, H. 1996. Discovering generalized episodes using minimal occurrences. In Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD '96). Portland, OR, pp. 146–151.

  • Mannila, H. and Toivonen, H. 1997. Levelwise search and borders of theories in knowledge discovery. Data Mining and Knowledge Discovery, 1(3):241–258.

    Google Scholar 

  • Milne, R., Nicol, C., Ghallab, M., Trave-Massuyes, L., Bousson, K., Dousson, C., Quevedo, J., Aguilar, J., and Guasch, A. 1994. TIGER: Real-time situation assessment of dynamic systems. Intelligent Systems Engineering, pp. 103–124.

  • Morris, R.A., Shoaff, W.D., and Khatib, L. 1994. An algebraic formulation of temporal knowledge for reasoning about recurring events. In Proceedings of the InternationalWorkshop onTemporal Representation and Reasoning (TIME-94). Pensacola, FL, pp. 29–34.

  • Muggleton, S. 1992. Inductive Logic Programming. London, UK: Academic Press.

    Google Scholar 

  • Oates, T. and Cohen, P.R. 1996. Searching for structure in multiple streams of data. In Proceedings of the Thirteenth International Conference on Machine Learning (ICML '96). Bari, Italy, pp. 346–354.

  • Padmanabhan, B. and Tuzhilin, A. 1996. Pattern discovery in temporal databases: A temporal logic approach. In Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD '96). Portland, OR, pp. 351–354.

  • Seshadri, P., Livny, M., and Ramakrishnan, R. 1996. SEQ: Design and implementation of a sequence database system. In Proceedings of the 22nd International Conference on Very Large Data Bases (VLDB '96). Mumbay, India, pp. 99–110.

  • Srikant, R. and Agrawal, R. 1995. Mining generalized association rules. In Proceedings of the 21st International Conference on Very Large Data Bases (VLDB '95). Zurich, Swizerland, pp. 407–419.

  • Srikant, R. and Agrawal, R. 1996. Mining sequential patterns: Generalizations and performance improvements. In Advances in Database Technology-5th International Conference on Extending Database Technology (EDBT '96). Avignon, France, pp. 3–17.

  • Wang, J.T.-L., Chirn, G.-W., Marr, T.G., Shapiro, B., Shasha, D., and Zhang, K. 1994. Combinatorial pattern discovery for scientific data: Some preliminary results. In Proceedings of ACM SIGMOD Conference on Management of Data (SIGMOD '94). Minneapolis, MI, pp. 115–125.

Download references

Author information

Authors and Affiliations

Authors

Rights and permissions

Reprints and permissions

About this article

Cite this article

Mannila, H., Toivonen, H. & Inkeri Verkamo, A. Discovery of Frequent Episodes in Event Sequences. Data Mining and Knowledge Discovery 1, 259–289 (1997). https://doi.org/10.1023/A:1009748302351

Download citation

  • Issue Date:

  • DOI: https://doi.org/10.1023/A:1009748302351

Navigation