Abstract
Educational data is now available in abundance; it can be leveraged to improve students’ performance based on their academic records and to predict their future performance. Sharing such data without intruding on the privacy of individuals is a major concern. The present work proposes an improved privacy-preserving, cluster-based k-anonymization algorithm for a multi-relational educational dataset. To overcome the limitations of k-anonymization, the anonymized data is l-diversified to protect sensitive attributes from attacks. Text steganography is then applied to the l-diversified data to guard against similarity attacks, providing a second layer of privacy. Because the utility of the data matters as much as its privacy, utility must be preserved so that analysis still yields useful information. A Loss Metric is used to measure the distortion of the k-anonymized data and to evaluate the balance between privacy and utility. The Earth mover’s distance is calculated for the l-diversified data with and without steganography to validate the results. For the experiments, an educational dataset is used, and the results are compared with existing approaches in the literature. Statistical analysis is also performed to support the results.
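As a rough illustration of the privacy notions named in the abstract, the following minimal sketch checks k-anonymity and distinct l-diversity on a toy table and computes an Earth mover's distance between each equivalence class and the whole table. Everything in it is an assumption made for illustration: the records, the attribute names (`age`, `dept`, `grade`), the helper functions, and the choice of the ordered-distance form of the Earth mover's distance from the t-closeness literature. It is not the authors' clustering algorithm, Loss Metric, or steganography step.

```python
from collections import Counter, defaultdict

# Hypothetical toy records (not from the paper). "age" and "dept" act as
# quasi-identifiers; "grade" is the sensitive attribute.
QUASI_IDENTIFIERS = ("age", "dept")
SENSITIVE = "grade"
GRADE_ORDER = ["A", "B", "C"]  # assumed ordering of the sensitive domain

records = [
    {"age": "18-20", "dept": "CS", "grade": "A"},
    {"age": "18-20", "dept": "CS", "grade": "B"},
    {"age": "18-20", "dept": "CS", "grade": "C"},
    {"age": "21-23", "dept": "Math", "grade": "B"},
    {"age": "21-23", "dept": "Math", "grade": "B"},
    {"age": "21-23", "dept": "Math", "grade": "A"},
]


def equivalence_classes(rows):
    """Group rows that share the same quasi-identifier values."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[q] for q in QUASI_IDENTIFIERS)].append(row)
    return dict(groups)


def is_k_anonymous(rows, k):
    """True if every equivalence class contains at least k rows."""
    return all(len(g) >= k for g in equivalence_classes(rows).values())


def is_l_diverse(rows, l):
    """Distinct l-diversity: every class has at least l distinct sensitive values."""
    return all(
        len({r[SENSITIVE] for r in g}) >= l
        for g in equivalence_classes(rows).values()
    )


def distribution(rows):
    """Relative frequency of each sensitive value, in GRADE_ORDER."""
    counts = Counter(r[SENSITIVE] for r in rows)
    return [counts[v] / len(rows) for v in GRADE_ORDER]


def emd_ordered(p, q):
    """Earth mover's distance for an ordered domain of m values:
    (1 / (m - 1)) * sum over i of |cumulative sum of (p_j - q_j) up to i|."""
    running, total = 0.0, 0.0
    for pi, qi in zip(p, q):
        running += pi - qi
        total += abs(running)
    return total / (len(p) - 1)


if __name__ == "__main__":
    overall = distribution(records)
    print("3-anonymous:", is_k_anonymous(records, 3))
    print("2-diverse:", is_l_diverse(records, 2))
    for key, group in equivalence_classes(records).items():
        print(key, "EMD to whole table:", emd_ordered(distribution(group), overall))
```

A smaller distance means the sensitive values inside a class look more like the table as a whole, which is the intuition behind comparing the distance with and without an additional protection layer.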
Data availability
Not applicable.
Funding
The authors declare that no funds, grants, or other support were received during the preparation of this manuscript. The authors have no relevant financial or non-financial interests to disclose.
Author information
Contributions
All authors contributed to the study’s conception and design. All authors read and approved the final manuscript.
Ethics declarations
Conflict of interest
Author Dr. Sunil Kumar Muttoo declares that he has no conflict of interest. Author Ms. Nisha declares that she has no conflict of interest. Author Dr. Archana Singhal declares that she has no conflict of interest.
Research involving human or animal participants
This article does not contain any studies with human participants or animals performed by any of the authors.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Muttoo, S.K., Nisha & Singhal, A. A novel privacy-preserving technique using steganography and L-diversity for multi-relational educational dataset. Int. j. inf. tecnol. 15, 3307–3325 (2023). https://doi.org/10.1007/s41870-023-01305-8
DOI: https://doi.org/10.1007/s41870-023-01305-8