Knowledge and Information Systems

Volume 28, Issue 2, pp 311–331

Discovering frequent behaviors: time is an essential element of the context

  • Bashar Saleh
  • Florent Masseglia
Regular Paper

Abstract

One of the most popular problems in usage mining is the discovery of frequent behaviors, which relies on the extraction of frequent itemsets from usage databases. However, those databases are usually considered as a whole, so itemsets are extracted over the entire set of records. Our claim is that subsets containing relevant itemsets may be hidden within the structure of the data. These subsets, as well as the itemsets they contain, depend on the context, and time is an essential element of that context: users' intents differ from one period to another. Behaviors over Christmas, for example, will differ from those extracted during the summer. Unfortunately, these periods might be lost because of arbitrary divisions of the data. The goal of our work is to find itemsets that are frequent over a specific period but would not be extracted by traditional methods because their support is very low over the whole dataset. We introduce the definition of solid itemsets, which represent coherent and compact behaviors over specific periods, and we propose Sim, an algorithm for their extraction.
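
To make the motivation concrete, the following minimal sketch compares the support of one itemset over a whole toy log with its support over a single period. It is only an illustration of the gap the abstract describes: it is not the Sim algorithm and does not implement the paper's definition of solid itemsets. The data, the 0.5 threshold, and the names `transactions`, `support`, and `in_period` are all assumptions made up for this example.

```python
from datetime import date

# Hypothetical toy usage log: (day, set of items seen in that record).
transactions = [
    (date(2009, 3, 2), {"news", "weather"}),
    (date(2009, 5, 14), {"news", "sports"}),
    (date(2009, 7, 3), {"beach", "weather"}),
    (date(2009, 7, 20), {"beach", "sunscreen"}),
    (date(2009, 9, 9), {"news", "sports"}),
    (date(2009, 11, 11), {"news", "weather"}),
    (date(2009, 12, 21), {"gift", "tree", "news"}),
    (date(2009, 12, 22), {"gift", "tree"}),
    (date(2009, 12, 23), {"news"}),
    (date(2009, 12, 24), {"gift", "tree", "weather"}),
]


def support(itemset, records):
    """Fraction of records that contain every item of `itemset`."""
    if not records:
        return 0.0
    hits = sum(1 for _, items in records if itemset <= items)
    return hits / len(records)


def in_period(records, start, end):
    """Records whose day falls in the closed interval [start, end]."""
    return [(day, items) for day, items in records if start <= day <= end]


MIN_SUPPORT = 0.5  # assumed threshold, chosen only for this illustration
itemset = {"gift", "tree"}

global_support = support(itemset, transactions)
period_records = in_period(transactions, date(2009, 12, 20), date(2009, 12, 26))
period_support = support(itemset, period_records)

# Below the threshold over the whole log, above it within the period.
frequent_globally = global_support >= MIN_SUPPORT
frequent_in_period = period_support >= MIN_SUPPORT
```

In this toy log the itemset appears in 3 of 10 records overall but in 3 of the 4 records dated 20 to 26 December, so a miner run over the entire log with this threshold would discard it even though it dominates that week. Here the period is supplied by hand; the abstract's point is that the relevant periods are not known in advance and should not be fixed by arbitrary divisions of the data.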

Keywords

Itemsets, Periods, Time-aware

Copyright information

© Springer-Verlag London Limited 2010

Authors and Affiliations

  1. INRIA, Sophia Antipolis, Nice, France