Mining Emerging High Utility Itemsets over Streaming Database
High utility itemset mining (HUIM) is a classical data mining problem that has attracted considerable attention in the research community and has a wide range of applications. The goal of HUIM is to identify all itemsets whose utility satisfies a user-defined threshold. In this paper, we address a new direction in high utility itemset mining: mining temporal emerging high utility itemsets from data streams. Temporal emerging high utility itemsets are those that are not high utility in the current time window of the data stream but have high potential to become high utility in subsequent time windows. Discovering them is an important step in mining interesting itemsets that yield high profits from streaming databases, with applications such as proactive decision making by domain experts, building powerful classifiers, market basket analysis, and catalogue design. We propose a novel method, named EFTemHUI (Efficient Framework for Temporal Emerging HUI mining), to identify emerging high utility itemsets more effectively. To improve the efficiency of the mining process, we devise a new mechanism for evaluating the high utility itemsets that will emerge, which captures and stores information about potential high utility itemsets. Extensive experiments on three datasets show that the proposed method yields excellent accuracy and low errors in predicting emerging patterns for the next window.
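To make the terms above concrete, the following is a minimal Python sketch of the standard utility computation (purchased quantity times unit profit, summed over the transactions that contain the whole itemset) applied to two consecutive windows. It is not the EFTemHUI method: the function names, the brute-force enumeration, the sample data, and in particular the `naive_emerging` rule with its `ratio` parameter are assumptions made here purely for illustration.

```python
from itertools import combinations


def itemset_utility(itemset, window, profit):
    """Utility of an itemset in one window: sum of quantity * unit profit
    over every transaction that contains all items of the itemset."""
    items = set(itemset)
    total = 0
    for tx in window:  # tx maps item -> purchased quantity
        if items.issubset(tx):
            total += sum(tx[i] * profit[i] for i in items)
    return total


def all_utilities(window, profit, max_size=2):
    """Brute-force utilities of every itemset (up to max_size items)
    that occurs in at least one transaction of the window."""
    result = {}
    for tx in window:
        for k in range(1, max_size + 1):
            for combo in combinations(sorted(tx), k):
                if combo not in result:
                    result[combo] = itemset_utility(combo, window, profit)
    return result


def naive_emerging(prev_window, cur_window, profit, min_util, ratio=0.5):
    """Toy stand-in for 'emerging' (an assumption, not the paper's definition):
    the itemset is below min_util in the current window, has reached at least
    ratio * min_util, and its utility grew since the previous window."""
    prev = all_utilities(prev_window, profit)
    cur = all_utilities(cur_window, profit)
    return {
        x: u
        for x, u in cur.items()
        if ratio * min_util <= u < min_util and u > prev.get(x, 0)
    }


# Hypothetical unit profits and two consecutive windows of transactions.
profit = {"a": 5, "b": 2, "c": 1}
w1 = [{"a": 1, "b": 2}, {"b": 1, "c": 3}]
w2 = [{"a": 2, "b": 1}, {"a": 1, "c": 2}, {"b": 3, "c": 1}]

min_util = 14
high_now = {x for x, u in all_utilities(w2, profit).items() if u >= min_util}
emerging = naive_emerging(w1, w2, profit, min_util)
print("high utility in current window:", high_now)
print("naive emerging candidates:", emerging)
```

The brute-force enumeration is exponential in the itemset size and is only meant to show the definitions; a practical miner would prune the search space and maintain per-window summaries incrementally.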
Keywords: High utility itemset; Utility pattern mining; Emerging patterns; Data stream; Data mining
This research was partially supported by the Ministry of Science and Technology, Taiwan, under grant no. 108-2218-E-009-051.