Free-Rider Episode Screening via Dual Partition Model

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10827)

Abstract

One drawback of frequent episode mining is that an overwhelming share of the discovered patterns are redundant. A free-rider episode, a typical example, consists of a real pattern doped with additional noise events. Because the noise events themselves may have high support, such free-rider episodes can attain abnormally high support and therefore cannot be filtered out by a frequency-based framework. An effective technique for filtering free-rider episodes is to use a partition model that divides an episode into two consecutive subepisodes and compares the observed support of the episode with its expected support under the assumption that the two subepisodes occur independently. In this paper, we take more complex subepisodes into consideration and develop a novel partition model, named EDP, for filtering free-rider episodes from a given set of episodes. It combines (1) a dual partition strategy, which divides an episode into an underlying real pattern and potential noise, and (2) a novel definition of the expected support of a free-rider episode based on the proposed partition strategy. We deem an episode interesting if its observed support is substantially higher than the expected support estimated by our model. Experiments on synthetic and real-world datasets demonstrate that EDP filters free-rider episodes effectively compared with existing state-of-the-art methods.
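
To make the observed-versus-expected comparison concrete, the following is a minimal sketch of the baseline idea only (splitting a serial episode into two consecutive subepisodes), not of EDP itself, whose definitions are not given in this abstract. It assumes a window-based notion of support (the fraction of sliding windows that contain the episode as a subsequence) and uses the product of the two subepisode supports as a crude independence estimate. The names `occurs`, `support`, and `lift_over_best_split` are hypothetical and chosen for illustration.

```python
from typing import Sequence


def occurs(window: Sequence[str], episode: Sequence[str]) -> bool:
    """Return True if `episode` appears in `window` as an ordered subsequence."""
    it = iter(window)
    # `e in it` consumes the iterator, so the order of events is respected.
    return all(e in it for e in episode)


def support(seq: Sequence[str], episode: Sequence[str], width: int) -> float:
    """Assumed support: fraction of sliding windows of length `width` containing `episode`."""
    n_windows = len(seq) - width + 1
    if n_windows <= 0:
        return 0.0
    hits = sum(occurs(seq[i:i + width], episode) for i in range(n_windows))
    return hits / n_windows


def lift_over_best_split(seq: Sequence[str], episode: Sequence[str], width: int) -> float:
    """Observed support divided by the largest expected support over all
    splits into two consecutive subepisodes (crude independence estimate)."""
    if len(episode) < 2:
        raise ValueError("episode must contain at least two events")
    observed = support(seq, episode, width)
    expected = max(
        support(seq, episode[:k], width) * support(seq, episode[k:], width)
        for k in range(1, len(episode))
    )
    # If every split has zero expected support, the episode never occurs either.
    return observed / expected if expected > 0 else 0.0
```

Under these assumptions, a ratio well above 1 suggests that the two parts co-occur more often than independence would predict, while a ratio near or below 1 is what one would expect from a real pattern padded with frequent but unrelated events.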

Keywords

Episode dual partition; Interesting pattern discovery; Episode mining; Sequence mining

Notes

Acknowledgement

This research is supported by the National Natural Science Foundation of China (No. 61602438, 91546122, 61573335), the National Key R&D Program of China (No. 2017YFB1002104), and the Guangdong Provincial Science and Technology Plan Projects (No. 2015B010109005).

Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  • Xiang Ao (1, 2)
  • Yang Liu (1, 2)
  • Zhen Huang (3)
  • Luo Zuo (1, 2)
  • Qing He (1, 2)

  1. Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing, China
  2. University of Chinese Academy of Sciences, Beijing, China
  3. Tsinghua University, Beijing, China
