Knowledge and Information Systems, Volume 14, Issue 1, pp 81–100

Mining follow-up correlation patterns from time-related databases

  • Shichao Zhang
  • Zifang Huang
  • Jilian Zhang
  • Xiaofeng Zhu
Regular Paper

Abstract

Research on traditional association rules has gained great attention during the past decade. Generally, an association rule A → B is used to predict that B is likely to occur when A occurs. This is a kind of strong correlation, and indicates that the two events will probably happen simultaneously. However, in real-world applications such as bioinformatics and medical research, there are many follow-up correlations between itemsets A and B; for example, B is likely to occur n times after A has occurred m times. That is, the correlated itemsets do not belong to the same transaction. We refer to this relation as a follow-up correlation pattern (FCP). Mining FCPs is harder to process efficiently than conventional pattern discovery, because the number of potentially interesting patterns becomes extremely large once the length limit of transactions no longer applies. In this paper, we develop an efficient algorithm to identify FCPs in time-related databases. We also experimentally evaluate our approach and provide extensive results on mining this new kind of pattern.
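
To make the idea of a follow-up correlation concrete, the following is a minimal sketch and not the algorithm developed in the paper. It assumes a time-ordered list of transactions and measures how often an itemset A is followed, within the next few transactions, by an itemset B. The function name `follow_up_confidence`, the `window` parameter, and the toy database `db` are all hypothetical, and the sketch simplifies the "n times after m times" relation to a single follow-up occurrence.

```python
from typing import FrozenSet, Sequence

Transaction = FrozenSet[str]


def follow_up_confidence(
    transactions: Sequence[Transaction],
    a: Transaction,
    b: Transaction,
    window: int,
) -> float:
    """Fraction of transactions containing `a` that are followed, within the
    next `window` transactions, by at least one transaction containing `b`.

    Illustrative measure only (an assumption, not the paper's definition).
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    # Positions in the time-ordered database where itemset `a` occurs.
    a_positions = [i for i, t in enumerate(transactions) if a <= t]
    if not a_positions:
        return 0.0
    # Count occurrences of `a` that are followed by `b` in a LATER transaction,
    # so the two itemsets never have to share a transaction.
    followed = sum(
        1
        for i in a_positions
        if any(b <= t for t in transactions[i + 1 : i + 1 + window])
    )
    return followed / len(a_positions)


# Hypothetical toy database: one transaction per time step.
db = [
    frozenset({"A"}),
    frozenset({"B"}),
    frozenset({"C"}),
    frozenset({"A", "C"}),
    frozenset({"C"}),
    frozenset({"B"}),
    frozenset({"A"}),
]

if __name__ == "__main__":
    for w in (1, 2):
        print(w, follow_up_confidence(db, frozenset({"A"}), frozenset({"B"}), w))
```

Because B is looked up only in transactions after the one containing A, a conventional within-transaction association rule would miss every match this measure counts. Widening `window` also enlarges the set of candidate itemset pairs, which hints at why the search space grows so quickly.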

Keywords

  • Data mining
  • Time-related database
  • Correlation mining
  • Follow-up correlation pattern

Copyright information

© Springer-Verlag London Limited 2007

Authors and Affiliations

  • Shichao Zhang (1)
  • Zifang Huang (2)
  • Jilian Zhang (1)
  • Xiaofeng Zhu (1)

  1. Faculty of Computer Science and Information Technology, Guangxi Normal University, Guilin, China
  2. Department of Automatic Control, Beihang University, Beijing, China
