Data Deduplication System for Supporting Multi-mode

Jung, Ho Min; Park, Won Vien; Lee, Wan Yeon; Lee, Jeong Gun; Ko, Young Woong

doi:10.1007/978-3-642-20039-7_8

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6591)

Included in the following conference series:

Asian Conference on Intelligent Information and Database Systems

Abstract

Data deduplication systems can be implemented in several modes, including the source-based approach (SBA), the in-line approach (ILA), and the post-process approach (PPA). Currently, most commercial systems are implemented and operated using the ILA and PPA, while some researchers have focused on the SBA. As data deduplication systems become widely used, choosing a mode that suits the operating environment becomes increasingly important. Because the overhead and resource usage of each mode have not been fully studied, a poorly chosen deduplication mode can lead to inefficiency and poor performance in some operating environments. In this study, we propose a data deduplication system that supports multiple modes. The proposed system runs in whichever mode the user specifies during system operation, so it can be adjusted dynamically to the characteristics of the system. In this paper, we operate the proposed system in the SBA, ILA, and PPA modes in turn and present measurement results with a comparative analysis of mode-specific performance and overhead.
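
To make the difference between the three modes concrete, the following is a minimal toy sketch in Python. It is not the authors' implementation: the class and function names (`Mode`, `Server`, `backup`), the fixed 4-byte chunk size, and the use of SHA-1 fingerprints are all assumptions chosen for illustration. It only shows where deduplication happens in each mode and how a mode could be switched at run time.

```python
import hashlib
from enum import Enum

CHUNK_SIZE = 4  # hypothetical, tiny on purpose so the demo is easy to follow


class Mode(Enum):
    SBA = "source-based"   # client deduplicates before sending
    ILA = "in-line"        # server deduplicates as data arrives
    PPA = "post-process"   # server stores raw data, deduplicates later


def chunks(data: bytes, size: int = CHUNK_SIZE):
    """Split data into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def fingerprint(chunk: bytes) -> str:
    return hashlib.sha1(chunk).hexdigest()


class Server:
    """Hypothetical backup server whose mode can be changed while running."""

    def __init__(self, mode: Mode = Mode.ILA):
        self.mode = mode
        self.store = {}    # fingerprint -> chunk (deduplicated storage)
        self.staging = []  # raw data awaiting post-processing (PPA only)

    def missing(self, fps):
        """Return the fingerprints the server does not hold yet."""
        return {fp for fp in fps if fp not in self.store}

    def put_chunk(self, chunk: bytes):
        self.store.setdefault(fingerprint(chunk), chunk)

    def receive_raw(self, data: bytes):
        if self.mode is Mode.PPA:
            self.staging.append(data)   # write first, deduplicate later
        else:
            for c in chunks(data):      # ILA: deduplicate on the write path
                self.put_chunk(c)

    def post_process(self):
        """Background pass that deduplicates staged data (PPA)."""
        while self.staging:
            for c in chunks(self.staging.pop()):
                self.put_chunk(c)


def backup(server: Server, data: bytes) -> int:
    """Back up data; return payload bytes sent (fingerprint traffic ignored)."""
    if server.mode is Mode.SBA:
        parts = chunks(data)
        need = server.missing(fingerprint(c) for c in parts)
        sent = 0
        for c in parts:
            fp = fingerprint(c)
            if fp in need:              # send each missing chunk only once
                server.put_chunk(c)
                need.discard(fp)
                sent += len(c)
        return sent
    server.receive_raw(data)            # ILA and PPA send everything
    return len(data)


if __name__ == "__main__":
    data = b"AAAABBBBAAAACCCC"          # 4 chunks, 3 of them unique
    for mode in Mode:
        server = Server(mode)
        print(mode.name, backup(server, data))  # SBA 12, ILA 16, PPA 16
```

In this toy model, SBA trades client-side hashing for reduced network traffic, ILA keeps the client simple but places hashing on the server's write path, and PPA defers that work to `post_process` at the cost of temporary staging space. Because `mode` is an ordinary attribute, it can be reassigned between backups, which loosely mirrors the user-selectable mode described in the abstract.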

Author information

Authors and Affiliations

Dept. of Computer Engineering, Hallym University, Okcheon-dong, Chunchon-si, Gangwon-do, Korea
Ho Min Jung, Won Vien Park, Wan Yeon Lee, Jeong Gun Lee & Young Woong Ko

Editor information

Editors and Affiliations

Wroclaw University of Technology, 50-370, Wroclaw, Poland
Ngoc Thanh Nguyen
Department of Computer Engineering, Yeungnam University, 712-749, Dae-Dong, Gyeungsan, Korea
Chong-Gun Kim
Institute of Informatics, Automation and Robotics, Wroclaw University of Technology, 50-370, Wrocław, Poland
Adam Janiak

About this paper

Cite this paper

Jung, H.M., Park, W.V., Lee, W.Y., Lee, J.G., Ko, Y.W. (2011). Data Deduplication System for Supporting Multi-mode. In: Nguyen, N.T., Kim, CG., Janiak, A. (eds) Intelligent Information and Database Systems. ACIIDS 2011. Lecture Notes in Computer Science, vol 6591. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20039-7_8

DOI: https://doi.org/10.1007/978-3-642-20039-7_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-20038-0
Online ISBN: 978-3-642-20039-7
eBook Packages: Computer Science, Computer Science (R0)
