Standardization of Big Data and Its Policies

Privacy and Security Issues in Big Data

Part of the book series: Services and Business Process Reengineering (SBPR)

Abstract

Big data refers to massive data sets that are more varied, more complex, and more dynamic in structure than conventional data, which makes them challenging to collect, store, analyze, and visualize. The term big data analytics describes the process of examining such large volumes of complex data to discover hidden trends and associations. There is, nevertheless, a strong tension between privacy, security, and the widespread use of big data. This chapter addresses privacy by adapting established techniques, such as k-anonymity, HybrEx, t-closeness, and l-diversity, and applying them in trade and commerce. A variety of privacy-preservation frameworks aim to preserve privacy at different stages of the big data life cycle (such as the production, storage, and processing of data). The chapter aims to provide a detailed summary of privacy-protection frameworks and to discuss some limitations of current ones. It also covers different policies related to big data standards. Finally, it briefly reviews the Indian Personal Data Protection Bill.
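
As an illustration of two of the techniques named in the abstract, the following is a minimal sketch of how k-anonymity and l-diversity can be checked on an already generalized table. It is not taken from the chapter: the function names, the column names (`age`, `zip`, `disease`), and the toy records are hypothetical, and the l-diversity check uses the simplest "distinct values" variant of the definition.

```python
from collections import defaultdict


def group_by_quasi_identifiers(rows, quasi_ids):
    """Group records that share the same quasi-identifier values."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[q] for q in quasi_ids)
        groups[key].append(row)
    return groups


def is_k_anonymous(rows, quasi_ids, k):
    """True if every quasi-identifier group contains at least k records."""
    groups = group_by_quasi_identifiers(rows, quasi_ids)
    return all(len(group) >= k for group in groups.values())


def is_l_diverse(rows, quasi_ids, sensitive, l):
    """True if every group holds at least l distinct sensitive values
    (the 'distinct l-diversity' variant, assumed here for simplicity)."""
    groups = group_by_quasi_identifiers(rows, quasi_ids)
    return all(
        len({row[sensitive] for row in group}) >= l
        for group in groups.values()
    )


# Hypothetical, already generalized records (illustration only).
records = [
    {"age": "20-29", "zip": "751**", "disease": "flu"},
    {"age": "20-29", "zip": "751**", "disease": "flu"},
    {"age": "30-39", "zip": "752**", "disease": "asthma"},
    {"age": "30-39", "zip": "752**", "disease": "diabetes"},
]

two_anonymous = is_k_anonymous(records, ["age", "zip"], k=2)
two_diverse = is_l_diverse(records, ["age", "zip"], "disease", l=2)
```

In this toy table every quasi-identifier group has two records, so the 2-anonymity check passes, but the first group contains a single disease value, so the 2-diversity check fails. This is the kind of homogeneity gap that l-diversity and t-closeness are meant to close.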




Copyright information

© 2021 Springer Nature Singapore Pte Ltd.


Cite this chapter

Nayak, S., Dash, A., Swain, S. (2021). Standardization of Big Data and Its Policies. In: Das, P.K., Tripathy, H.K., Mohd Yusof, S.A. (eds) Privacy and Security Issues in Big Data. Services and Business Process Reengineering. Springer, Singapore. https://doi.org/10.1007/978-981-16-1007-3_6
