Mining contrast sequential pattern based on subsequence time distribution variation with discreteness constraints

Applied Intelligence

Abstract

A contrast sequential pattern is a pattern that occurs frequently in one sequence dataset but not in others. Contrast sequential pattern mining has been widely used in many fields, such as customer behavior analysis and medical diagnosis. Existing algorithms first require users to set a distinguishing location and then use this fixed location to identify differences in how subsequences are distributed, i.e., they look for subsequence patterns that appear before the given distinguishing location in one sequence dataset and after the same location in another. However, it is difficult for users to set an appropriate location without sufficient prior knowledge. Because the distinguishing location differs from one subsequence to another, a fixed location may miss many meaningful patterns. In addition, previous studies have rarely considered the time distribution variation of subsequences or the discreteness of patterns. To address these problems, we propose a novel method for mining contrast sequential patterns based on subsequence time distribution variation with discreteness constraints. We design a suffix-tree-based search algorithm, which transforms the dataset to be processed into a tree representation, to mine these patterns. Experiments on real-world time-series datasets show that our method is more effective and more efficient than other state-of-the-art methods.
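The basic notion of a contrast sequential pattern, and of the position at which a subsequence occurs, can be illustrated with a minimal Python sketch. This is not the suffix-tree algorithm proposed in the paper: it is a brute-force check under assumed definitions, where support is the fraction of sequences containing the pattern as a (possibly non-contiguous) subsequence. The function names and the thresholds `min_pos` and `max_neg` are hypothetical and chosen only for illustration.

```python
from typing import Optional, Sequence


def first_match_end(sequence: Sequence, pattern: Sequence) -> Optional[int]:
    """Index where the earliest match of `pattern` in `sequence` completes.

    The pattern is matched as a subsequence (gaps allowed). Returns None
    if the pattern is empty or does not occur.
    """
    if not pattern:
        return None
    pos = 0
    for i, item in enumerate(sequence):
        if item == pattern[pos]:
            pos += 1
            if pos == len(pattern):
                return i
    return None


def support(dataset: Sequence[Sequence], pattern: Sequence) -> float:
    """Fraction of sequences in `dataset` that contain `pattern`."""
    if not dataset:
        return 0.0
    hits = sum(first_match_end(seq, pattern) is not None for seq in dataset)
    return hits / len(dataset)


def is_contrast_pattern(pattern, positive, negative,
                        min_pos: float = 0.6, max_neg: float = 0.3) -> bool:
    """True if `pattern` is frequent in `positive` and infrequent in `negative`.

    Both thresholds are illustrative assumptions, not values from the paper.
    """
    return (support(positive, pattern) >= min_pos
            and support(negative, pattern) <= max_neg)


# Toy data: each string is one sequence of single-character events.
positive = ["abcd", "acbd", "abdc"]
negative = ["dcba", "bdca", "cadb", "dbca"]

print(support(positive, "ab"))                        # share of positive sequences containing "ab"
print(support(negative, "ab"))                        # share of negative sequences containing "ab"
print(is_contrast_pattern("ab", positive, negative))
print([first_match_end(s, "ab") for s in positive])   # where each match completes
```

The match positions returned by `first_match_end` hint at why a single fixed distinguishing location is restrictive: the position at which a pattern completes varies from sequence to sequence and from pattern to pattern.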

Author information

Correspondence to Xiangtao Chen.

Cite this article

Wu, R., Li, Q. & Chen, X. Mining contrast sequential pattern based on subsequence time distribution variation with discreteness constraints. Appl Intell 49, 4348–4360 (2019). https://doi.org/10.1007/s10489-019-01492-7

  • DOI: https://doi.org/10.1007/s10489-019-01492-7
