
‘MaaS’: fast retrieval of E-file in cloud using metadata as a service

Journal of Intelligent Manufacturing

Abstract

In the cloud era, the volume of stored data is enormous, so retrieving data efficiently and with low latency is essential. Because of the size of the stored data and the lack of locality information among the stored files, metadata is a suitable means of keeping track of cloud storage. This paper describes a novel framework that uses metadata to retrieve data from cloud data servers in less time. Query performance, which depends on the availability of files for query processing, can be greatly improved by the efficient use and analysis of metadata. The paper therefore proposes a generic approach to using metadata in the cloud, named ‘MaaS’ (Metadata as a Service). The approach draws on several techniques to reduce latency during data retrieval. The paper investigates the creation, management, and analysis of metadata in a cloud environment for fast data retrieval. A cloud bloom filter, a probabilistic data structure used for efficient retrieval of metadata, is stored across geographically dispersed metadata servers. We implemented the model in a cloud environment, and the experimental results show that the method increases throughput and handles a large number of queries efficiently with reduced latency. The efficacy of the approach was tested experimentally on the KDD Cup 2003 dataset, where the proposed ‘MaaS’ outperformed the other existing methods.
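The abstract does not specify how the cloud bloom filter is built, so the following is only a minimal sketch of the general idea, assuming a standard Bloom filter per metadata server. The names `BloomFilter`, `MetadataServer`, and `locate`, the filter size, and the hash scheme are all hypothetical and are not taken from the paper. Each server summarizes the file names it holds in a filter, and a lookup probes only the servers whose filter reports a possible match.

```python
import hashlib


class BloomFilter:
    """Standard Bloom filter: no false negatives, some false positives."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 4) -> None:
        # Sizes are illustrative assumptions, not values from the paper.
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key: str):
        # Double hashing: derive all bit positions from one SHA-256 digest.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd step
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key: str) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )


class MetadataServer:
    """Hypothetical metadata server holding records plus a summary filter."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.records = {}
        self.filter = BloomFilter()

    def put(self, file_name: str, metadata: dict) -> None:
        self.records[file_name] = metadata
        self.filter.add(file_name)


def locate(servers, file_name):
    """Return (server name, metadata), or None if no server holds the file."""
    for server in servers:
        if file_name in server.filter:  # cheap membership test first
            metadata = server.records.get(file_name)  # None on a false positive
            if metadata is not None:
                return server.name, metadata
    return None
```

In this sketch a filter can report a file that a server does not hold, which is why `locate` confirms each hit against the server's records before returning it. A filter never misses a file that was added, so skipping servers whose filter says "absent" is safe.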




Corresponding author

Correspondence to R. Anitha.


Cite this article

Anitha, R., Mukherjee, S. ‘MaaS’: fast retrieval of E-file in cloud using metadata as a service. J Intell Manuf 28, 1871–1891 (2017). https://doi.org/10.1007/s10845-015-1076-y


  • DOI: https://doi.org/10.1007/s10845-015-1076-y
