A new algorithm using integer programming relaxation for privacy-preserving in utility mining

Nguyen, Duc; Tran, Minh-Thai; Le, Bac

doi:10.1007/s10489-023-04913-w

Published: 03 August 2023

Volume 53, pages 25106–25118, (2023)

Applied Intelligence

Abstract

High-utility itemset mining (HUIM) is an effective technique for discovering significant information in data. However, mining data that contains sensitive or private information raises privacy concerns. Therefore, privacy-preserving utility mining (PPUM) has recently become a critical research area. PPUM is the process of transforming a quantitative transactional database into a sanitised one, thus ensuring that utility mining algorithms cannot discover sensitive information. The sanitisation process can have several side effects, including the loss of non-sensitive information and the introduction of redundant information. Additionally, the running times of heuristic algorithms for sanitising data are high. To minimise negative effects and lower the execution time of the hiding process, we propose the G-ILP algorithm with a GPU parallel programming method for preprocessing and a new efficient constraint satisfaction problem for hiding data. The experimental evaluations of G-ILP show the algorithm’s efficiency in terms of running time and its ability to minimise side effects in large datasets.
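To make the hiding goal concrete, the following is a minimal sketch of the standard itemset-utility definition and of the check that a sensitive itemset falls below the minimum utility threshold after sanitisation. It is not the G-ILP algorithm: the toy database, the profit table, the threshold and the function names `itemset_utility` and `is_hidden` are all illustrative assumptions, and the sanitised database is edited by hand rather than produced by a solver.

```python
from typing import Dict, FrozenSet, List

# A transaction maps each purchased item to its quantity (illustrative layout).
Transaction = Dict[str, int]


def itemset_utility(db: List[Transaction], profit: Dict[str, int],
                    itemset: FrozenSet[str]) -> int:
    """Sum quantity * unit profit over every transaction containing the whole itemset."""
    total = 0
    for transaction in db:
        if all(item in transaction for item in itemset):
            total += sum(transaction[item] * profit[item] for item in itemset)
    return total


def is_hidden(db: List[Transaction], profit: Dict[str, int],
              itemset: FrozenSet[str], min_util: int) -> bool:
    """An itemset is hidden when its utility is strictly below the threshold."""
    return itemset_utility(db, profit, itemset) < min_util


# Hypothetical toy data, chosen only for this example.
profit = {"a": 5, "b": 2, "c": 1}
original = [
    {"a": 2, "b": 3},
    {"a": 1, "b": 1, "c": 4},
    {"b": 2, "c": 5},
]
sensitive = frozenset({"a", "b"})
non_sensitive = frozenset({"b", "c"})
min_util = 20

# Hand-made sanitisation: lower the quantity of "a" in the first transaction.
sanitised = [dict(t) for t in original]
sanitised[0]["a"] = 1

print(itemset_utility(original, profit, sensitive))   # utility before sanitisation
print(itemset_utility(sanitised, profit, sensitive))  # utility after sanitisation
print(is_hidden(sanitised, profit, sensitive, min_util))
```

In this toy case the edit touches only item `a`, so the utility of the non-sensitive itemset `{b, c}` is unchanged. Its utility is below the threshold both before and after, so it is neither lost nor artificially introduced as a high-utility itemset, which are the two side effects the abstract describes.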

Acknowledgements

This research is funded by the University of Science, VNU-HCM, under grant number CNTT 2021-05.

Author information

Authors and Affiliations

Faculty of Information Technology, University of Science, Ho Chi Minh City, Vietnam
Duc Nguyen & Bac Le
Vietnam National University, Ho Chi Minh City, Vietnam
Duc Nguyen & Bac Le
Faculty of Information Technology, University of Foreign Languages - Information Technology, Ho Chi Minh City, Vietnam
Minh-Thai Tran

Corresponding author

Correspondence to Bac Le.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Nguyen, D., Tran, MT. & Le, B. A new algorithm using integer programming relaxation for privacy-preserving in utility mining. Appl Intell 53, 25106–25118 (2023). https://doi.org/10.1007/s10489-023-04913-w

Accepted: 22 July 2023
Published: 03 August 2023
Issue Date: November 2023
DOI: https://doi.org/10.1007/s10489-023-04913-w
