Fast privacy-preserving utility mining algorithm based on utility-list dictionary

Yin, Chunyong; Li, Ying

doi:10.1007/s10489-023-04791-2

Fast privacy-preserving utility mining algorithm based on utility-list dictionary

Published: 27 October 2023

Volume 53, pages 29363–29377, (2023)
Cite this article

Applied Intelligence Aims and scope Submit manuscript

91 Accesses
Explore all metrics

Abstract

Privacy preserving utility mining (PPUM) aims to solve the problem of sensitive information leakage in utility pattern mining. In recent years, researchers have proposed algorithms to solve the privacy-preserving problem. However, these algorithms have high side effects, long sanitization time, and computational complexity. Although the FPUTT algorithm reduces the number of database scans, tree construction and traversal still take much time. The paper proposes a fast utility-list dictionary algorithm (FULD). The utility-list dictionary consists of all sensitive items. Through dictionary lookup, sensitive items can be found and modified. In addition, the novel concepts of SINS and tns are proposed to reduce the side effects of the algorithm. In this paper, the experiments show that the FULD algorithm has good performance, such as running time and side effects. The running time of the FULD is 15–20 times shorter than the FPUTT algorithm. It performs well both on sparse and dense datasets.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Privacy Preserving Data Utility Mining Using Perturbation

A residual utility-based concept for high-utility itemset mining

Article 23 August 2023

An efficient utility-list based high-utility itemset mining algorithm

Article 13 July 2022

References

Chen M-S, Han J, Philip SY (1996) Data mining: an overview from a database perspective. IEEE Trans Knowl Data Eng 8(6):866–883
Article Google Scholar
Gan W, Lin JC-W, Fournier-Viger P, Chao H-C, Yu PS (2019) A survey of parallel sequential pattern mining. ACM Transac Knowl Disc Data (TKDD) 13(3):1–34
Article Google Scholar
Mannila H, Toivonen H, Verkamo IA (1997) Discovery of frequent episodes in event sequences. Data Min Knowl Disc 1(3):259–289
Article Google Scholar
Agrawal R, Imieliński T, Swami A (1993) Mining association rules between sets of items in large databases. In: Proceedings of the 1993 ACM SIGMOD international conference on Management of data, p. 207–216
Han J, Pei J, Yin Y, Mao R (2004) Mining frequent patterns without candidate generation: A frequent-pattern tree approach. Data Min Knowl Disc 8(1):53–87
Article MathSciNet Google Scholar
Fournier-Viger P, Lin JC-W, Kiran RU, Koh YS, Thomas R (2017) A survey of sequential pattern mining. Data Sci Patt Recog 1(1):54–77
Google Scholar
Koh YS, Ravana SD (2016) Unsupervised rare pattern mining: a survey. ACM Transac Knowl Disc Data (TKDD) 10(4):1–29
Article Google Scholar
Fournier-Viger P, Lin JC-W, Vo B, Chi TT, Zhang J, Le HB (2017) A survey of itemset mining. Wiley Interdisciplinary Reviews. Data Min Knowl Disc 7(4):e1207
Article Google Scholar
Yao H, Hamilton HJ, Geng L (2006) A unified framework for utility-based measures for mining itemsets. In: Proc. of ACM SIGKDD 2nd Workshop on Utility-Based Data Mining, pages 28–37. Citeseer
Geng L, Hamilton HJ (2006) Interestingness measures for data mining: A survey. ACM Comput Surv (CSUR) 38(3):9–es
Article Google Scholar
Tan P-N, Kumar V, Srivastava J (2004) Selecting the right objective measure for association analysis. Inf Syst 29(4):293–313
Article Google Scholar
McGarry K (2005) A survey of interestingness measures for knowledge discovery. Knowl Eng Rev 20(1):39–61
Article Google Scholar
Hilderman RJ, Hamilton HJ (2003) Measuring the interestingness of discovered knowledge: A principled approach. Int Data Analy 7(4):347–382
Article MATH Google Scholar
Silberschatz A, Tuzhilin A (1995) On subjective measures of interestingness in knowledge discovery. In: KDD, volume 95, pp. 275–281
Dwork C (2006) Differential privacy. In: International Colloquium on Automata, Languages, and Programming, pp. 1–12, Springer
Gentry C (2009) Fully homomorphic encryption using ideal lattices. In: Proceedings of the forty-first annual ACM symposium on Theory of computing, pp. 169–178
Weng J, Weng J, Zhang J, Li M, Zhang Y, Luo W (2019) Deepchain: Auditable and privacy-preserving deep learning with blockchain-based incentive. IEEE Transactions on Dependable and Secure Computing
Yeh J-S, Hsu P-C (2010) Hhuif and msicf: Novel algorithms for privacy preserving utility mining. Expert Syst Appl 37(7):4779–4786
Article Google Scholar
Lin JC-W, Gan W, Fournier-Viger P, Hong T-P, Tseng VS (2016) Fast algorithms for mining high-utility itemsets with various discount strategies. Adv Eng Inform 30(2):109–126
Article Google Scholar
Yun U, Kim J (2015) A fast perturbation algorithm using tree structure for privacy preserving utility mining. Expert Syst Appl 42(3):1149–1165
Article Google Scholar
Li S, Nankun M, Le J, Liao X (2019) A novel algorithm for privacy preserving utility mining based on integer linear programming. Eng Appl Artif Intell 81:300–312
Article Google Scholar
Lin JC-W, Djenouri Y, Srivastava G, Fourier-Viger P (2022) Efficient evolutionary computation model of closed high-utility itemset mining. Appl Intell, p. 1–13
Gan W, Lin JC-W, Fournier-Viger P, Chao H-C, Tseng VS, Philip SY (2021) A survey of utility-oriented pattern mining. IEEE Trans Knowl Data Eng 33(4):1306–1327
Article Google Scholar
Lin JC-W, Djenouri Y, Srivastava G (2021) Efficient closed high-utility pattern fusion model in large-scale databases. Inform Fusion 76:122–132
Article Google Scholar
Lin JC-W, Djenouri Y, Srivastava G, Yun U, Fournier-Viger P (2021) A predictive ga-based model for closed high-utility itemset mining. Appl Soft Comput, 108:107422
Kim H, Ryu T, Lee C, Kim H, Yoon E, Vo B, Lin JC-W, Yun U (2022) Ehmin: Efficient approach of list based high-utility pattern mining with negative unit profits. Expert Syst Appl, 209:118214
Lee C, Baek Y, Ryu T, Kim H, Kim H, Lin JC-W, Vo B, Yun U (2022) An efficient approach for mining maximized erasable utility patterns. Inf Sci 609:1288–1308
Article Google Scholar
Ryu T, Yun U, Lee C, Lin JC-W, Pedrycz W (2022) Occupancy-based utility pattern mining in dynamic environments of intelligent systems. Int J Intell Syst 37(9):5477–5507
Article Google Scholar
Jianying H, Mojsilovic A (2007) High-utility pattern mining: A method for discovery of high-utility item sets. Pattern Recogn 40(11):3317–3324
Article MATH Google Scholar
Lin C-W, Hong T-P, Wen-Hsiang L (2011) An effective tree structure for mining high utility itemsets. Expert Syst Appl 38(6):7419–7424
Article Google Scholar
Krishnamoorthy S (2015) Pruning strategies for mining high utility itemsets. Expert Syst Appl 42(5):2371–2381
Article Google Scholar
Zida S, Fournier-Viger P, Lin JC-W, Cheng-Wei W, Tseng VS (2015) Efim: a highly efficient algorithm for high-utility itemset mining. In: Mexican international conference on artificial intelligence, pp. 530–546. Springer
Liu J, Wang K, Fung BCM (2012) Direct discovery of high utility itemsets without candidate generation. In: 2012 IEEE 12th international conference on data mining, pages 984–989. IEEE
Kim H, Yun U, Baek Y, Kim H, Nam H, Lin JC-W, Fournier-Viger P (2021) Damped sliding based utility oriented pattern mining over stream data. Knowl-Based Syst, 213, p. 106653
Baek Y, Yun U, Kim H, Kim J, Vo B, Truong T, Deng Z-H (2021) Approximate high utility itemset mining in noisy environments. Knowl-Based Syst, 212:106596
Hong T-P, Lin C-W, Yang K-T, Wang S-L (2013) Using tf-idf to hide sensitive itemsets. Appl Intell, 38:502–510
Jangra S, Toshniwal D (2022) Efficient algorithms for victim item selection in privacy-preserving utility mining. Futur Gener Comput Syst, 128, pp. 219–234
Lin JC-W, Fournier-Viger P, Wu L, Gan W, Djenouri Y, Zhang J (2018) Ppsf: An open-source privacy-preserving and security mining framework. In: 2018 IEEE International Conference on Data Mining Workshops (ICDMW), pp. 1459–1463
Jimmy Ming-Tai W, Srivastava G, Jolfaei A, Pirouz M, Lin JC-W (2021) Security and privacy in shared hitlcps using a ga-based multiple-threshold sanitization model. IEEE Transactions on Emerging Topics in Comput Intell
Lin JC-W, Srivastava G, Zhang Y, Djenouri Y, Aloqaily M (2020) Privacy-preserving multiobjective sanitization model in 6g iot environments. IEEE Internet Things J 8(7):5340–5349
Article Google Scholar
Dinh T, Quang MN, Le B (2015) A novel approach for hiding high utility sequential patterns. In: Proceedings of the Sixth International Symposium on Information and Communication Technology, pp. 121–128
Quang MN, Huynh U, Dinh T, Le NH, Le B (2016) An approach to decrease execution time and difference for hiding high utility sequential patterns. In: International Symposium on Integrated Uncertainty in Knowledge Modelling and Decision Making, pp. 435–446. Springer
Lin JC-W, Liu Q, Fournier-Viger P, Hong T-P, Voznak M, Zhan J (2016) A sanitization approach for hiding sensitive itemsets based on particle swarm optimization. Eng Appl Artif Intell, 53:1–18
Duong Q-H, Fournier-Viger P, Ramampiaro H, Nørvåg K, Dam T-L (2018) Efficient high utility itemset mining using buffered utility-lists. Appl Intell 48(7):1859–1877
Article Google Scholar
Ge Z, Song Z, Ding SX, Huang B (2017) Data mining and analytics in the process industry: The role of machine learning. Ieee Access, 5, pp. 20590–20616
Tassa T (2013) Secure mining of association rules in horizontally distributed databases. IEEE Trans Knowl Data Eng 26(4):970–983
Article Google Scholar

Download references

Acknowledgments

The authors are sincerely grateful to the editor and anonymous reviewers for their insightful and constructive comments, which helped us improve this work greatly.

Funding

This research was funded by the National Natural Science Foundation of China (61772282). It was also supported by the Postgraduate Research & Practice Innovation Program of Jiangsu Province (KYCX21 1008).

Author information

Authors and Affiliations

School of Computer and Software, Nanjing University of Information Science & Technology, Ningliu Road, Nanjing, 210044, State, China
Chunyong Yin & Ying Li

Authors

Chunyong Yin
View author publications
You can also search for this author in PubMed Google Scholar
Ying Li
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by Chunyong Yin and Ying Li. The first draft of the manuscript was written by Ying Li and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Chunyong Yin.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interest or personal relationship that could have appeared to influence the work reported in this paper.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Cite this article

Yin, C., Li, Y. Fast privacy-preserving utility mining algorithm based on utility-list dictionary. Appl Intell 53, 29363–29377 (2023). https://doi.org/10.1007/s10489-023-04791-2

Download citation

Accepted: 12 June 2023
Published: 27 October 2023
Issue Date: December 2023
DOI: https://doi.org/10.1007/s10489-023-04791-2

Keywords

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Fast privacy-preserving utility mining algorithm based on utility-list dictionary

Abstract

Access this article

Similar content being viewed by others

Privacy Preserving Data Utility Mining Using Perturbation

A residual utility-based concept for high-utility itemset mining

An efficient utility-list based high-utility itemset mining algorithm

References

Acknowledgments

Funding

Author information

Authors and Affiliations

Contributions

Corresponding author

Ethics declarations

Conflict of interest

Additional information

Publisher’s note

Rights and permissions

About this article

Cite this article

Keywords

Navigation

Fast privacy-preserving utility mining algorithm based on utility-list dictionary

Abstract

Access this article

Similar content being viewed by others

Privacy Preserving Data Utility Mining Using Perturbation

A residual utility-based concept for high-utility itemset mining

An efficient utility-list based high-utility itemset mining algorithm

References

Acknowledgments

Funding

Author information

Authors and Affiliations

Contributions

Corresponding author

Ethics declarations

Conflict of interest

Additional information

Publisher’s note

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation