Securing the Data Deduplication to Improve the Performance of Systems in the Cloud Infrastructure

Performance Management of Integrated Systems and its Applications in Software Engineering

Part of the book series: Asset Analytics ((ASAN))

Abstract

Data duplication is a data quality problem that arises when the same record is stored multiple times in the same database system or across different ones. It can lead to data redundancy, wasted cost, lost income, lower response rates and ROI, damage to brand reputation, poor customer service, inefficiency and lost productivity, decreased user adoption, inaccurate reporting, less informed decisions, and poor business processes. Data deduplication, often termed intelligent compression or single-instance storage, counters this problem. It eliminates duplicate copies of information, which reduces storage overhead and improves various performance parameters. Recent studies on data deduplication have shown that data redundancy exists in primary storage in cloud infrastructure. This redundancy can be reduced in the primary storage systems of a cloud architecture using data deduplication. This research work highlights the identified and established methods of data deduplication based on capacity and performance parameters. The authors propose a performance-oriented data (POD) deduplication scheme that improves the performance of primary storage systems in the cloud. In addition, a security analysis using an encryption technique is performed and demonstrated, to protect sensitive data after the deduplication process completes.
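As a rough illustration of the single-instance storage idea described in the abstract, the following Python sketch splits data into chunks, fingerprints each chunk with SHA-256, and keeps only one copy of each unique chunk. It is not the authors' POD scheme: the fixed 4 KiB chunk size, the `DedupStore` class, and its method names are assumptions made for this example, and the encryption step mentioned in the abstract is omitted.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed fixed chunk size in bytes; purely illustrative


class DedupStore:
    """Toy single-instance store (hypothetical, not the chapter's scheme).

    Each unique chunk is kept once, keyed by its SHA-256 digest.
    """

    def __init__(self):
        self.chunks = {}  # fingerprint -> chunk bytes
        self.files = {}   # file name -> ordered list of fingerprints

    def write(self, name, data):
        """Store data under name, reusing chunks already present."""
        recipe = []
        for offset in range(0, len(data), CHUNK_SIZE):
            chunk = data[offset:offset + CHUNK_SIZE]
            fingerprint = hashlib.sha256(chunk).hexdigest()
            # Keep the chunk only if this content has not been seen before.
            self.chunks.setdefault(fingerprint, chunk)
            recipe.append(fingerprint)
        self.files[name] = recipe

    def read(self, name):
        """Rebuild the original bytes from the stored chunk recipe."""
        return b"".join(self.chunks[fp] for fp in self.files[name])

    def stored_bytes(self):
        """Physical bytes held after deduplication."""
        return sum(len(chunk) for chunk in self.chunks.values())
```

Writing the same payload under two names stores its chunks only once, so `stored_bytes()` stays at the size of a single copy while both names still read back in full.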

Author information

Corresponding author

Correspondence to Nishant N. Pachpor.

Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter

Pachpor, N. N., & Prasad, P. S. (2020). Securing the Data Deduplication to Improve the Performance of Systems in the Cloud Infrastructure. In: Pant, M., Sharma, T., Basterrech, S., Banerjee, C. (eds) Performance Management of Integrated Systems and its Applications in Software Engineering. Asset Analytics. Springer, Singapore. https://doi.org/10.1007/978-981-13-8253-6_5
