Abstract
Data mining plays a vital role in decision making and analysis in education, health care, business, and other domains. Data should be protected before mining so that it is safe from security threats while the mining process still produces correct and useful results. Privacy-preserving data mining (PPDM) secures data in this way and thereby maintains data privacy. In this paper, we use perturbation-based methods to transform the data and make it secure before the data mining process is applied. We propose extended non-negative matrix factorization (NMF), which applies the NMF method followed by the double-reflecting data perturbation (DRDP) method to distort the data. The work is implemented in the R language. We evaluate and compare various privacy measures to show that extended NMF (NMF followed by DRDP) provides a higher level of protection for non-negative numeric data than NMF alone.
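The two-stage pipeline named in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the paper's work was done in R, whereas this sketch uses Python with NumPy. It assumes that NMF is computed with the Lee and Seung multiplicative updates, and that DRDP reflects each attribute about the midpoint of its range, so that a value v becomes max + min - v. The function names `nmf`, `drdp`, and `extended_nmf`, the `rank` parameter, and the iteration count are all hypothetical choices made for this example.

```python
import numpy as np


def nmf(V, rank, n_iter=500, seed=0, eps=1e-9):
    """Approximate a non-negative matrix V as W @ H.

    Assumption: Lee-Seung multiplicative updates for the Frobenius-norm
    objective; the paper may use a different NMF variant.
    """
    rng = np.random.default_rng(seed)
    n_rows, n_cols = V.shape
    # Strictly positive random start so the multiplicative updates can move.
    W = rng.random((n_rows, rank)) + eps
    H = rng.random((rank, n_cols)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def drdp(X):
    """Reflect every column about the midpoint of its own range.

    Assumption: DRDP maps v to 2 * rho - v with rho = (max + min) / 2,
    which simplifies to max + min - v for each attribute (column).
    """
    col_min = X.min(axis=0)
    col_max = X.max(axis=0)
    return col_max + col_min - X


def extended_nmf(V, rank):
    """Distort V with a low-rank NMF reconstruction, then apply DRDP."""
    W, H = nmf(V, rank)
    return drdp(W @ H)
```

Under these assumptions the reflection keeps each attribute inside its original range, so the distorted data stays non-negative, and applying `drdp` twice returns the input. Calling `extended_nmf(V, rank=2)` on a non-negative array `V` returns a distorted array of the same shape.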
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Bhandari, N., Pahwa, P. (2022). Achieving Data Privacy Using Extended NMF. In: Skala, V., Singh, T.P., Choudhury, T., Tomar, R., Abul Bashar, M. (eds) Machine Intelligence and Data Science Applications. Lecture Notes on Data Engineering and Communications Technologies, vol 132. Springer, Singapore. https://doi.org/10.1007/978-981-19-2347-0_17
DOI: https://doi.org/10.1007/978-981-19-2347-0_17
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-2346-3
Online ISBN: 978-981-19-2347-0
eBook Packages: Intelligent Technologies and Robotics (R0)