
Research on data pre-deployment in information service flow of digital ocean cloud computing

Acta Oceanologica Sinica

Abstract

Data pre-deployment in the Hadoop Distributed File System (HDFS) is more complicated than in traditional file systems. Several key issues need to be addressed, such as determining the target location of prefetched data, the amount of data to prefetch, and the balance between data prefetching services and normal data accesses. To solve these problems, we exploit the characteristics of digital ocean information service flows and propose a deployment scheme that combines input data prefetching with output-data-oriented storage strategies. The method runs data preparation in parallel with data processing, thereby greatly reducing the I/O time cost of digital ocean cloud computing platforms when processing multi-source information synergistic tasks. Experimental results show that the scheme achieves a higher degree of parallelism than traditional Hadoop mechanisms, shortens the waiting time of running service nodes, and significantly reduces data access conflicts.
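The idea of overlapping data preparation with data processing can be illustrated with a minimal Python sketch. It is not the scheme proposed in the article: the function names (`run_with_prefetch`, `fake_fetch`, `fake_process`) are hypothetical, the fetch step is a stand-in for a remote read rather than a real HDFS call, and placement and prefetch-volume decisions are left out entirely. It only shows how the next input block might be fetched in the background while the current one is processed.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, TypeVar

B = TypeVar("B")  # block identifier
D = TypeVar("D")  # fetched data
R = TypeVar("R")  # processing result


def run_with_prefetch(
    block_ids: Iterable[B],
    fetch: Callable[[B], D],
    process: Callable[[D], R],
) -> List[R]:
    """Process blocks in order while the next block is fetched in the background."""
    ids = list(block_ids)
    results: List[R] = []
    if not ids:
        return results
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(fetch, ids[0])       # start fetching the first block
        for next_id in ids[1:]:
            data = pending.result()                # wait for the current block
            pending = pool.submit(fetch, next_id)  # prefetch the next block
            results.append(process(data))          # process while the prefetch runs
        results.append(process(pending.result()))  # last block has no successor
    return results


def fake_fetch(block_id: int) -> list:
    # Hypothetical stand-in for a remote read; a real system would read from storage here.
    return [block_id] * 3


def fake_process(data: list) -> int:
    # Hypothetical stand-in for a service node's computation.
    return sum(data)


print(run_with_prefetch([1, 2, 3], fake_fetch, fake_process))  # prints [3, 6, 9]
```

With a single background worker, at most one block is prefetched ahead of the one being processed, which keeps the prefetch from competing heavily with normal accesses in this toy setting.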




Author information

Corresponding author

Correspondence to Lingyu Xu.

Additional information

Foundation item: The Ocean Public Welfare Scientific Research Project of State Oceanic Administration of China under contract No. 20110533.

About this article


Cite this article

Shi, S., Xu, L., Dong, H. et al. Research on data pre-deployment in information service flow of digital ocean cloud computing. Acta Oceanol. Sin. 33, 82–92 (2014). https://doi.org/10.1007/s13131-014-0520-8


  • DOI: https://doi.org/10.1007/s13131-014-0520-8
