HDSHUI-miner: a novel algorithm for discovering spatial high-utility itemsets in high-dimensional spatiotemporal databases

Uday Kiran, Rage; Veena, Pamalla; Ravikumar, Penugonda; Venus Vikranth Raj, Bathala; Dao, Minh-Son; Zettsu, Koji; Bommisetti, Sai Chithra

doi:10.1007/s10489-022-04436-w

Published: 11 March 2023

Volume 53, pages 8536–8561 (2023)

Applied Intelligence

Rage Uday Kiran¹,² (ORCID: orcid.org/0000-0002-5417-0289),
Pamalla Veena³,
Penugonda Ravikumar^1,4,
Bathala Venus Vikranth Raj⁴,
Minh-Son Dao²,
Koji Zettsu² &
…
Sai Chithra Bommisetti⁴

Abstract

Spatial high-utility itemset (SHUI) mining is a significant big data analysis technique. It aims to locate all geographically interesting itemsets with high utility in a spatiotemporal database. An SHUI-Miner algorithm was presented in the literature to find the desired itemsets. Unfortunately, this algorithm suffered from performance issues when dealing with high-dimensional spatiotemporal databases. Based on this finding, this paper extends the state-of-the-art method by proposing a novel algorithm known as the high-dimensional SHUI-miner (HDSHUI-Miner). Our algorithm explores several novel pruning strategies to decrease the search space and computational cost required to find the desired itemsets. Experimental results obtained on seven real-world databases demonstrate that HDSHUI-Miner outperforms SHUI-Miner with respect to memory consumption, runtime, and scalability. Finally, we present two real-world case studies to illustrate the usefulness of the proposed algorithm.
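
As a rough illustration of the problem stated above, the following Python sketch enumerates spatial high-utility itemsets by brute force. It is neither HDSHUI-Miner nor SHUI-Miner. The abstract does not spell out the formal definitions, so the sketch assumes a formulation common in this line of work: each transaction maps items to utility values, an itemset is "spatial" when every pair of its items is listed as neighbors, and it is "high-utility" when its total utility over the transactions containing it reaches a user threshold. The names `utility`, `is_spatial`, `brute_force_shui`, `min_util`, and the toy sensor data are all hypothetical.

```python
from itertools import combinations


def utility(itemset, database):
    """Total utility of `itemset` over the transactions that contain all its items."""
    return sum(
        sum(transaction[item] for item in itemset)
        for transaction in database
        if all(item in transaction for item in itemset)
    )


def is_spatial(itemset, neighbors):
    """True when every pair of items is neighboring (assumes a symmetric relation)."""
    return all(b in neighbors.get(a, set()) for a, b in combinations(itemset, 2))


def brute_force_shui(database, neighbors, min_util):
    """Enumerate every itemset and keep the spatial ones with utility >= min_util."""
    items = sorted({item for transaction in database for item in transaction})
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            if is_spatial(candidate, neighbors):
                u = utility(candidate, database)
                if u >= min_util:
                    result[candidate] = u
    return result


# Toy data (assumed): three sensors; s1 and s3 are not neighbors.
database = [
    {"s1": 5, "s2": 3},
    {"s2": 4, "s3": 6},
    {"s1": 2, "s2": 1, "s3": 7},
]
neighbors = {"s1": {"s2"}, "s2": {"s1", "s3"}, "s3": {"s2"}}

shuis = brute_force_shui(database, neighbors, min_util=10)
print(shuis)
```

Under these assumptions, the itemset (s1, s2) qualifies even though neither s1 nor s2 reaches the threshold on its own, so utility cannot be pruned with the downward closure property directly. Exhaustive enumeration of this kind grows exponentially with the number of items, which is why dedicated pruning strategies matter in high-dimensional databases.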

Data Availability

The databases generated and/or analysed during the current study are available in the dataset repository of the well-known open-source data mining library SPMF [44]. We also used one more real-world database, named Drought, which is available at [45].

Code Availability

To ensure the repeatability of our experiments, we made the complete evaluation results, as well as the databases and algorithms, available on GitHub [47].

Notes

¹ The downward closure property says that all nonempty subsets of an interesting itemset are also interesting itemsets [2]. This property is widely employed in itemset mining algorithms to reduce their search space and computational costs. The apriori property and anti-monotonic property are other names for this property.
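
The property in this note can be checked on a small example. The sketch below is a minimal level-wise (Apriori-style) frequent itemset search on assumed toy data, where "interesting" means "support at least `min_sup`". The function names and the transactions are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def support(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    wanted = set(itemset)
    return sum(1 for t in transactions if wanted <= t)


def nonempty_subsets(itemset):
    """Yield every nonempty subset of `itemset` as a frozenset."""
    items = sorted(itemset)
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def frequent_itemsets(transactions, min_sup):
    """Level-wise search: only frequent k-itemsets are extended to (k+1)-itemsets."""
    items = sorted(set().union(*transactions))
    level = [frozenset([i]) for i in items if support([i], transactions) >= min_sup]
    result = list(level)
    while level:
        # Join frequent itemsets that differ by exactly one item.
        candidates = {a | b for a in level for b in level if len(a | b) == len(a) + 1}
        level = [c for c in candidates if support(c, transactions) >= min_sup]
        result.extend(level)
    return set(result)


transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
found = frequent_itemsets(transactions, min_sup=3)

# Downward closure: every nonempty subset of a frequent itemset is also frequent.
closure_holds = all(sub in found for s in found for sub in nonempty_subsets(s))
print(len(found), closure_holds)
```

Because an infrequent itemset can never have a frequent superset, the search never extends {a, b, c} once its support falls below the threshold, which is how the property reduces the search space.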

References

[1] Agrawal R, Imieliński T, Swami A (1993) Mining association rules between sets of items in large databases. In: ACM SIGMOD record, vol 22, pp 207–216
[2] Agrawal R, Srikant R (1994) Fast algorithms for mining association rules. In: Proceedings of the 20th international conference on very large data bases, VLDB, vol 1215, pp 487–499
[3] Luna JM, Fournier-Viger P, Ventura S (2019) Frequent itemset mining: a 25 years review. Wiley Interdiscip Rev Data Min Knowl Discov 9(6)
[4] Yao H, Hamilton HJ, Butz CJ (2004) A foundational approach to mining itemset utilities from databases. In: SIAM, pp 482–486
[5] Ahmed CF, Tanbeer SK, Jeong B-S (2010) Mining high utility web access sequences in dynamic web log data. In: International conference on software engineering, artificial intelligence, networking and parallel/distributed computing. SNPD ’10, pp 76–81
[6] Tseng VS, Shie B-E, Wu C-W, Yu PS (2013) Efficient algorithms for mining high utility itemsets from transactional databases. IEEE Trans Knowl Data Eng 25(8):1772–1786
[7] Liu Y-C, Cheng C-P, Tseng VS (2013) Mining differential top-k co-expression patterns from time course comparative gene expression datasets. BMC Bioinforma 14(1):230
[8] Gan W, Lin JC-W, Fournier-Viger P, Chao H-C, Hong T-P, Fujita H (2018) A survey of incremental high-utility itemset mining. Wiley Interdiscip Rev Data Min Knowl Discov 8(2)
[9] Uday Kiran R, Yashwanth Reddy T, Fournier-Viger P, Toyoda M, Krishna Reddy P, Kitsuregawa M (2019) Efficiently finding high utility-frequent itemsets using cutoff and suffix utility. In: PAKDD, pp 191–203
[10] Lin JC, Djenouri Y, Srivastava G, Li Y, Yu PS (2022) Scalable mining of high-utility sequential patterns with three-tier MapReduce model. ACM Trans Knowl Discov Data 16(3):60:1–60:26. https://doi.org/10.1145/3487046
[11] Lin JC, Djenouri Y, Srivastava G, Yun U, Fournier-Viger P (2021) A predictive GA-based model for closed high-utility itemset mining. Appl Soft Comput 108:107422. https://doi.org/10.1016/j.asoc.2021.107422
[12] Lin JC, Li Y, Fournier-Viger P, Djenouri Y, Zhang J (2020) Efficient chain structure for high-utility sequential pattern mining. IEEE Access 8:40714–40722. https://doi.org/10.1109/ACCESS.2020.2976662
[13] Lin JC, Gan W, Fournier-Viger P, Hong T, Tseng VS (2016) Fast algorithms for mining high-utility itemsets with various discount strategies. Adv Eng Inform 30(2):109–126. https://doi.org/10.1016/j.aei.2016.02.003
[14] Wu JM, Srivastava G, Wei M, Yun U, Lin JC (2021) Fuzzy high-utility pattern mining in parallel and distributed Hadoop framework. Inf Sci 553:31–48. https://doi.org/10.1016/j.ins.2020.12.004
[15] Fournier-Viger P, Zhang Y, Lin JC, Dinh D, Le HB (2020) Mining correlated high-utility itemsets using various measures. Log J IGPL 28(1):19–32. https://doi.org/10.1093/jigpal/jzz068
[16] Yin J, Zheng Z, Cao L (2012) USpan: an efficient algorithm for mining high utility sequential patterns. In: Proceedings of the 18th ACM SIGKDD international conference on knowledge discovery and data mining. KDD ’12, pp 660–668
[17] Nouioua M, Fournier-Viger P, Wu C-W, Lin C-W, Gan W (2021) FHUQI-Miner: fast high utility quantitative itemset mining. Appl Intell 51:1–25. https://doi.org/10.1007/s10489-021-02204-w
[18] Verma A, Dawar S, Kumar R, Navathe S, Goyal V (2021) High-utility and diverse itemset mining. Appl Intell 51(7):4649–4663. https://doi.org/10.1007/s10489-020-02063-x
[19] Wu JM-T, Li Z, Srivastava G, Yun U, Lin JC-W (2022) Analytics of high average-utility patterns in the industrial internet of things. Appl Intell 52(6):6450–6463. https://doi.org/10.1007/s10489-021-02751-2
[20] Lin JC, Djenouri Y, Srivastava G (2021) Efficient closed high-utility pattern fusion model in large-scale databases. Inf Fusion 76:122–132. https://doi.org/10.1016/j.inffus.2021.05.011
[21] Lin JC, Zhang J, Fournier-Viger P, Hong T, Zhang J (2017) A two-phase approach to mine short-period high-utility itemsets in transactional databases. Adv Eng Inform 33:29–43. https://doi.org/10.1016/j.aei.2017.04.007
[22] Fournier-Viger P, Lin JC, Duong Q, Dam T (2016) PHM: mining periodic high-utility itemsets. In: Industrial conference on data mining, pp 64–79
[23] Kiran RU, Zettsu K, Toyoda M, Fournier-Viger P, Reddy PK, Kitsuregawa M (2019) Discovering spatial high utility itemsets in spatiotemporal databases. In: Proceedings of the 31st international conference on scientific and statistical database management. SSDBM ’19. Association for Computing Machinery, New York, pp 49–60. https://doi.org/10.1145/3335783.3335789
[24] Kiran RU, Ito S, Dao M-S, Zettsu K, Wu C-W, Watanobe Y, Paik I, Thang TC (2020) Distributed mining of spatial high utility itemsets in very large spatiotemporal databases using Spark in-memory computing architecture. In: 2020 IEEE international conference on big data (Big Data), pp 4724–4733. https://doi.org/10.1109/BigData50022.2020.9377946
[25] Bommisetty SC, Penugonda R, Rage UK, Dao MS, Zettsu K (2021) Discovering spatial high utility itemsets in high-dimensional spatiotemporal databases. In: Fujita H, Selamat A, Lin JC-W, Ali M (eds) Advances and trends in artificial intelligence. Artificial intelligence practices. Springer, Cham, pp 53–65
[26] Han J, Pei J, Yin Y, Mao R (2004) Mining frequent patterns without candidate generation: a frequent-pattern tree approach. Data Min Knowl Discov 8(1):53–87
[27] Han J, Cheng H, Xin D, Yan X (2007) Frequent pattern mining: current status and future directions. Data Min Knowl Discov 14(1)
[28] Aggarwal CC (2014) Applications of frequent pattern mining. In: Aggarwal CC, Han J (eds) Frequent pattern mining. Springer, Cham, pp 443–467. https://doi.org/10.1007/978-3-319-07821-2_18
[29] Fournier-Viger P, Lin JC-W, Kiran RU, Koh YS (2017) A survey of sequential pattern mining. Data Sci Pattern Recog 1(1):54–77
[30] Kiran RU, Shrivastava S, Fournier-Viger P, Zettsu K, Toyoda M, Kitsuregawa M (2020) Discovering frequent spatial patterns in very large spatiotemporal databases. In: Proceedings of the 28th international conference on advances in geographic information systems. SIGSPATIAL ’20. Association for Computing Machinery, New York, pp 445–448. https://doi.org/10.1145/3397536.3422206
[31] Aggarwal A, Toshniwal D (2019) Frequent pattern mining on time and location aware air quality data. IEEE Access 7:98921–98933. https://doi.org/10.1109/ACCESS.2019.2930004
[32] Ding W, Eick CF, Wang J, Yuan X (2006) A framework for regional association rule mining in spatial datasets. In: 6th international conference on data mining (ICDM’06), pp 851–856. https://doi.org/10.1109/ICDM.2006.5
[33] Mohan P, Shekhar S, Shine JA, Rogers JP, Jiang Z, Wayant N (2011) A neighborhood graph based approach to regional co-location pattern discovery: a summary of results. In: Proceedings of the 19th ACM SIGSPATIAL international conference on advances in geographic information systems. GIS ’11. Association for Computing Machinery, New York, pp 122–132. https://doi.org/10.1145/2093973.2093991
[34] Sengstock C, Gertz M (2013) Spatial itemset mining: a framework to explore itemsets in geographic space. In: Catania B, Guerrini G, Pokorný J (eds) Advances in databases and information systems. Springer, Berlin, pp 148–161
[35] Tran-The H, Zettsu K (2017) Discovering co-occurrence patterns of heterogeneous events from unevenly-distributed spatiotemporal data. In: 2017 IEEE international conference on big data (Big Data), pp 1006–1011. https://doi.org/10.1109/BigData.2017.8258023
[36] Chan R, Yang Q, Shen Y-D (2003) Mining high utility itemsets. In: 3rd IEEE international conference on data mining, pp 19–26. https://doi.org/10.1109/ICDM.2003.1250893
[37] Liu M, Qu J (2012) Mining high utility itemsets without candidate generation. In: Proceedings of the 21st ACM international conference on information and knowledge management. ACM, pp 55–64
[38] Fournier-Viger P, Wu C-W, Zida S, Tseng V (2014) FHM: faster high-utility itemset mining using estimated utility co-occurrence pruning. https://doi.org/10.1007/978-3-319-08326-1_9
[39] Lin JC-W, Zhang J, Fournier-Viger P, Hong T-P, Zhang J (2017) A two-phase approach to mine short-period high-utility itemsets in transactional databases. Adv Eng Inform 33:29–43. https://doi.org/10.1016/j.aei.2017.04.007
[40] Zida S, Fournier-Viger P, Lin JC-W, Wu C-W, Tseng VS (2017) EFIM: a fast and memory efficient algorithm for high-utility itemset mining. Knowl Inf Syst 51(2):595–625
[41] Liu M, Qu J (2012) Mining high utility itemsets without candidate generation. In: Proceedings of the 21st ACM international conference on information and knowledge management. CIKM ’12. Association for Computing Machinery, New York, pp 55–64. https://doi.org/10.1145/2396761.2396773
[42] Tung NT, Nguyen LTT, Nguyen TDD, Vo B (2022) An efficient method for mining multi-level high utility itemsets. Appl Intell 52(5):5475–5496. https://doi.org/10.1007/s10489-021-02681-z
[43] Krishnamoorthy S (2017) HMiner: efficiently mining high utility itemsets. Expert Syst Appl 90:168–183
[44] Fournier-Viger P (2020) SPMF: a Java open-source data mining library. http://www.philippe-fournier-viger.com/spmf/index.php?link=datasets.php. Accessed 4 June 2020
[45] National Center for Atmospheric Research, University Corporation for Atmospheric Research: Standardized precipitation index (SPI) for global land surface (1949-2012) (2013) Research data archive at the national center for atmospheric research, computational and information systems laboratory, Boulder CO
[46] Atmospheric Environmental Regional Observation System: AEROS. http://soramame.taiki.go.jp/
[47] Kiran RU (2022) PAMI: Pattern mining. https://github.com/udayRage/PAMI/tree/main/PAMI/highUtilitySpatialPattern/basic. Accessed 10 Sept 2022

Acknowledgements

We acknowledge that some of the databases, namely Congestion and Pollution, which were used in the SHUI-Miner [23] and distributed SHUI-Miner [24] papers, have been reused for experimental evaluation with appropriate citations of those databases.

Funding

This research was funded by JSPS Kakenhi 21K12034.

Author information

Authors and Affiliations

¹ Department of Computer and Information Systems, The University of Aizu, Itsukimachi Ōaza Tsuruga, Aizuwakamatsu, 965-8580, Fukushima, Japan
Rage Uday Kiran & Penugonda Ravikumar
² NICT, National Institute of Information and Communications Technology, Nukui-Kitamachi, Tokyo, 184-8795, Tokyo, Japan
Rage Uday Kiran, Minh-Son Dao & Koji Zettsu
³ Department of Computer Science, Sri Balaji PG College, JNTU, Ananthapuram, 515001, Andhra Pradesh, India
Pamalla Veena
⁴ Department of Computer Science and Engineering, IIIT-RK Valley, RGUKT-AP, Idupulapaya, Vempally, 516330, Andhra Pradesh, India
Penugonda Ravikumar, Bathala Venus Vikranth Raj & Sai Chithra Bommisetti

Corresponding author

Correspondence to Rage Uday Kiran.

Ethics declarations

Conflict of Interests

The authors have no competing interests to declare that are relevant to the content of this article.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article belongs to the Topical Collection: Emerging Topics in Artificial Intelligence Selected from IEA/AIE2021. Guest Editors: Ali Selamat and Jerry Chun-Wei Lin.

Uday Kiran Rage, Veena Pamalla and Ravikumar Penugonda contributed equally to this work.

Uday proposed the idea. Veena introduced optimizations to reduce the search space. Venus and Sai implemented the code and conducted the experiments. Ravi verified the experiments. Dao and Zettsu shared the real-world datasets.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Uday Kiran, R., Veena, P., Ravikumar, P. et al. HDSHUI-miner: a novel algorithm for discovering spatial high-utility itemsets in high-dimensional spatiotemporal databases. Appl Intell 53, 8536–8561 (2023). https://doi.org/10.1007/s10489-022-04436-w

Accepted: 27 December 2022
Published: 11 March 2023
Issue Date: April 2023
DOI: https://doi.org/10.1007/s10489-022-04436-w
