Behavioral Constraint Template-Based Sequence Classification

  • Johannes De Smedt
  • Galina Deeva
  • Jochen De Weerdt
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10535)

Abstract

In this paper we present the interesting Behavioral Constraint Miner (iBCM), a new approach to sequence classification. The prevalence of sequential data, i.e., collections of ordered items such as text, website navigation patterns, and traffic management data, has led to a surge of research interest in sequence classification. Existing approaches mainly focus on retrieving sequences of itemsets and checking their presence in labeled data streams to obtain a classifier. Rather than focusing on plain sequences, iBCM is template-based and draws its inspiration from behavioral patterns used for software verification. These patterns cover a broad range of characteristics and go beyond the typical sequence mining representation, allowing sequential information in a database to be captured more precisely and concisely. Furthermore, it is also possible to mine for negative information, i.e., sequences that do not occur. The technique is benchmarked against other state-of-the-art approaches and shows strong potential for sequence classification. Code related to this chapter is available at: http://feb.kuleuven.be/public/u0092789/.
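
To make the template-based idea concrete, the following is a minimal sketch of how behavioral constraint templates could be turned into binary features for a sequence classifier. It is not the iBCM implementation: the three templates (`response`, `precedence`, `not_succession`), their vacuous-truth semantics, and the `featurize` helper are assumptions chosen for illustration, since the abstract does not list the templates iBCM uses. The `not_succession` template illustrates negative information, i.e., behavior that does not occur.

```python
from typing import Callable, Dict, Hashable, Sequence, Tuple

Item = Hashable


def response(seq: Sequence[Item], a: Item, b: Item) -> bool:
    """Every occurrence of a is eventually followed by b (true if a is absent)."""
    waiting = False
    for item in seq:
        if item == a:
            waiting = True
        elif item == b:
            waiting = False
    return not waiting


def precedence(seq: Sequence[Item], a: Item, b: Item) -> bool:
    """Every occurrence of b is preceded by some a (true if b is absent)."""
    seen_a = False
    for item in seq:
        if item == a:
            seen_a = True
        elif item == b and not seen_a:
            return False
    return True


def not_succession(seq: Sequence[Item], a: Item, b: Item) -> bool:
    """Negative constraint: b never occurs after an a."""
    seen_a = False
    for item in seq:
        if item == a:
            seen_a = True
        elif item == b and seen_a:
            return False
    return True


# Hypothetical template registry; the names are illustrative assumptions.
TEMPLATES: Dict[str, Callable[[Sequence[Item], Item, Item], bool]] = {
    "response": response,
    "precedence": precedence,
    "not_succession": not_succession,
}


def featurize(
    seq: Sequence[Item], alphabet: Sequence[Item]
) -> Dict[Tuple[str, Item, Item], bool]:
    """Evaluate every template on every ordered pair of distinct items."""
    features = {}
    for a in alphabet:
        for b in alphabet:
            if a == b:
                continue
            for name, check in TEMPLATES.items():
                features[(name, a, b)] = check(seq, a, b)
    return features


if __name__ == "__main__":
    example = ["a", "b", "c", "a"]
    for key, holds in featurize(example, ["a", "b"]).items():
        print(key, holds)
```

Under these assumptions, each labeled sequence yields a fixed-length binary vector, which could then be passed to any standard classifier. A practical version would likely also require the constrained items to occur in the sequence, so that vacuously satisfied constraints do not dominate the feature set.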

Keywords

  • Sequence mining
  • Sequence classification
  • Constraint-based mining

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Johannes De Smedt (1)
  • Galina Deeva (2)
  • Jochen De Weerdt (2)
  1. Management Science and Business Economics Group, Business School, University of Edinburgh, Edinburgh, UK
  2. Department of Decision Sciences and Information Management, Faculty of Economics and Business, KU Leuven, Leuven, Belgium
