Abstract
One of the most popular problems in usage mining is the discovery of frequent behaviors, which relies on extracting frequent itemsets from usage databases. However, those databases are usually considered as a whole, so itemsets are extracted over the entire set of records. Our claim is that subsets of the data, hidden within its structure and containing relevant itemsets, may exist. These subsets, as well as the itemsets they contain, depend on the context, and time is an essential element of that context: users’ intents differ from one period to another. For instance, behaviors over Christmas will be different from those extracted during the summer. Unfortunately, such periods might be lost because of arbitrary divisions of the data. The goal of our work is to find itemsets that are frequent over a specific period but would not be extracted by traditional methods, since their support is very low over the whole dataset. We introduce the definition of solid itemsets, which represent coherent and compact behaviors over specific periods, and we propose Sim, an algorithm for their extraction.
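The gap between support over the whole dataset and support over a specific period can be illustrated with a minimal sketch. It is not the Sim algorithm and does not implement the definition of solid itemsets; the toy usage log, the `support` helper, the chosen period, and the 0.5 threshold are all assumptions made purely for illustration.

```python
from datetime import date

# Hypothetical toy usage log: (day, items seen in one record). Invented data.
records = [
    (date(2009, 1, 15), {"news", "sports"}),
    (date(2009, 3, 2), {"news"}),
    (date(2009, 5, 20), {"sports", "weather"}),
    (date(2009, 7, 4), {"travel", "weather"}),
    (date(2009, 8, 11), {"travel"}),
    (date(2009, 10, 30), {"news", "weather"}),
    (date(2009, 12, 20), {"gifts", "toys"}),
    (date(2009, 12, 22), {"gifts", "toys", "news"}),
    (date(2009, 12, 24), {"gifts", "toys"}),
    (date(2009, 12, 25), {"gifts", "recipes"}),
]


def support(itemset, records, start=None, end=None):
    """Fraction of records dated within [start, end] that contain `itemset`.

    With no bounds given, the support is computed over all records.
    """
    in_period = [
        items
        for day, items in records
        if (start is None or day >= start) and (end is None or day <= end)
    ]
    if not in_period:
        return 0.0
    hits = sum(1 for items in in_period if itemset <= items)
    return hits / len(in_period)


MIN_SUPPORT = 0.5  # assumed threshold, for illustration only
itemset = {"gifts", "toys"}

# Support over the whole log versus over an assumed Christmas period.
global_support = support(itemset, records)
period_support = support(itemset, records, date(2009, 12, 20), date(2009, 12, 25))

print(f"whole dataset: {global_support:.2f}, frequent: {global_support >= MIN_SUPPORT}")
print(f"Dec 20 to 25:  {period_support:.2f}, frequent: {period_support >= MIN_SUPPORT}")
```

In this toy log the itemset falls below the threshold over the whole year but exceeds it within the December window, which is the kind of behavior a whole-dataset extraction would miss. Note that the period here is fixed by hand, whereas the abstract's concern is precisely that such periods should not depend on arbitrary divisions of the data.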
Cite this article
Saleh, B., Masseglia, F. Discovering frequent behaviors: time is an essential element of the context. Knowl Inf Syst 28, 311–331 (2011). https://doi.org/10.1007/s10115-010-0361-5
DOI: https://doi.org/10.1007/s10115-010-0361-5