Towards Automated Configuration of Cloud Storage Gateways: A Data Driven Approach

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11513)


Cloud storage gateways (CSGs) are essential for enterprises that want to take advantage of the scale and flexibility of cloud object stores. A CSG gives clients the impression of a large, locally configured block-based storage device, which must be mapped to remote cloud storage that is invariably object based. Configuring a CSG properly is extremely challenging because of the numerous parameters involved and the interactions among them. In this paper, we study this problem for a commercial CSG product that is typical of offerings in the market. We explore how machine learning techniques can be exploited for both the forward problem (i.e., predicting performance from the configuration parameters) and the backward problem (i.e., predicting configuration parameter values from the target performance). Based on extensive testing with real-world customer workloads, we show that it is possible to achieve excellent prediction accuracy while ensuring that the model is not overfitted to the data.
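To make the forward and backward problems concrete, the following is a minimal sketch on synthetic data, not the method or the data used in the paper. The parameter names (`cache_gb`, `threads`), the `synthetic_throughput` function, and the choice of a plain nearest-neighbour predictor are all assumptions made purely for illustration.

```python
import random


def synthetic_throughput(cache_gb, threads):
    """Hypothetical stand-in for a measured performance metric."""
    return 20.0 * cache_gb ** 0.5 + 5.0 * threads


def knn_predict(samples, query, k=3):
    """Average the targets of the k samples whose inputs are closest to query.

    samples is a list of (inputs, targets) pairs, each a tuple of numbers.
    """
    def squared_distance(sample):
        inputs, _ = sample
        return sum((a - b) ** 2 for a, b in zip(inputs, query))

    nearest = sorted(samples, key=squared_distance)[:k]
    n_targets = len(nearest[0][1])
    return tuple(
        sum(s[1][i] for s in nearest) / len(nearest) for i in range(n_targets)
    )


# Synthetic "measurements": (cache size in GB, thread count) -> throughput.
rng = random.Random(0)
configs = [(rng.uniform(1, 64), rng.randint(1, 16)) for _ in range(500)]
perf = [(synthetic_throughput(c, t),) for c, t in configs]

# Forward problem: configuration parameters -> predicted performance.
forward_data = list(zip(configs, perf))
predicted_perf = knn_predict(forward_data, (16.0, 8))[0]

# Backward problem: target performance -> suggested configuration.
# With k=1 this returns an observed configuration whose measured
# performance is closest to the target.
backward_data = list(zip(perf, configs))
suggested_config = knn_predict(backward_data, (150.0,), k=1)

print("forward :", round(predicted_perf, 1))
print("backward:", suggested_config)
```

The backward mapping is generally one-to-many, since several configurations can yield the same performance. The sketch therefore returns a single observed configuration rather than averaging several, which could produce a configuration that does not achieve the target.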


Cloud storage gateway, Object store, Performance, Configuration management, Machine learning



This research was supported by NSF grant IIP-330295. Discussions with Dr. S. Vucetic of Temple University were highly valuable in devising the extended validation techniques presented in the paper.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Temple University, Philadelphia, USA
