
How good are machine learning clouds? Benchmarking two snapshots over 5 years

  • Regular Paper
  • Published in The VLDB Journal

Abstract

We conduct an empirical study of the machine learning functionalities provided by major cloud service providers, which we call machine learning clouds. Machine learning clouds hold the promise of hiding all the sophistication of running large-scale machine learning: instead of specifying how to run a machine learning task, users only specify what task to run, and the cloud figures out the rest. Raising the level of abstraction, however, rarely comes for free: a performance penalty is possible. How good, then, are current machine learning clouds on real-world machine learning workloads? We study this question by benchmarking the mainstream machine learning clouds. Since these platforms continue to innovate, our benchmark aims to reflect their evolution. Concretely, this paper consists of two sub-benchmarks, mlbench and automlbench. When we started this work in 2016, only two cloud platforms provided machine learning services, and both limited themselves to model training and simple hyper-parameter tuning. We therefore focused on binary classification problems and present mlbench, a novel benchmark constructed by harvesting datasets from Kaggle competitions. We compare the performance of the top winning code available on Kaggle with that of the machine learning clouds from Azure and Amazon on mlbench. In recent years, more cloud providers have begun to support machine learning and have added automatic machine learning (AutoML) techniques to their machine learning clouds. These AutoML services can ease manual tuning of the whole machine learning pipeline, including but not limited to data preprocessing, feature selection, model selection, hyper-parameter tuning, and model ensembling. To reflect these advancements, we design automlbench to assess the AutoML performance of four machine learning clouds on different kinds of workloads. Our comparative study reveals the strengths and weaknesses of existing machine learning clouds and points out potential directions for future improvement.
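To make the comparison concrete, the sketch below illustrates the kind of evaluation protocol an mlbench-style study relies on: train a candidate model on a Kaggle-style training split and compare its test AUC against the AUC achieved by the top winning Kaggle code. This is a minimal sketch, not the benchmark's actual code; the scikit-learn gradient-boosting model is only a stand-in for a cloud or AutoML service, and the file names, the load_kaggle_split helper, the label column, and the reference AUC value are hypothetical placeholders.

# Illustrative sketch of an mlbench-style evaluation (hypothetical inputs):
# compare a candidate model's test AUC against the AUC of the top winning
# Kaggle solution on the same split.
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score

def load_kaggle_split(train_csv, test_csv, label):
    """Hypothetical helper: load a Kaggle-style train/test split."""
    train, test = pd.read_csv(train_csv), pd.read_csv(test_csv)
    X_train, y_train = train.drop(columns=[label]), train[label]
    X_test, y_test = test.drop(columns=[label]), test[label]
    return X_train, y_train, X_test, y_test

# Hypothetical file names and label column for one harvested competition.
X_train, y_train, X_test, y_test = load_kaggle_split(
    "train.csv", "test.csv", label="target"
)

# Stand-in for a machine learning cloud / AutoML service: any model that
# produces probability scores for the positive class on a binary task.
model = GradientBoostingClassifier(random_state=0)
model.fit(X_train, y_train)
cloud_auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])

# AUC achieved by the top winning Kaggle code on the same test split
# (hypothetical value; in mlbench it is obtained from the competition).
top_solution_auc = 0.92

# Relative quality gap: how far the candidate lags behind the human experts.
gap = (top_solution_auc - cloud_auc) / top_solution_auc
print(f"cloud AUC = {cloud_auc:.4f}, gap to top Kaggle solution = {gap:.2%}")

The same protocol extends naturally to automlbench: the stand-in model is replaced by the prediction endpoint of each cloud AutoML service, and the metric is chosen to match the workload (e.g., AUC or accuracy for classification).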



Notes

  1. Refer to https://dl.acm.org/doi/pdf/10.14778/3231751.3231770.

  2. https://drive.google.com/file/d/1lXC47nBjDyfqrNyUIC9xv-SEIuks44Gg/view.

  3. That is, when users can only be satisfied by winning the Kaggle competition or being ranked among the top 1%.

  4. https://www.kaggle.com/c/kdd-cup-2014-predicting-excitement-at-donors-choose.

  5. https://archive.ics.uci.edu/ml/datasets/EEG+Eye+State.

  6. http://www.tpc.org/information/benchmarks.asp.


Acknowledgements

This work was sponsored by National Science and Technology Major Project (No. 2022ZD0116315) and Key R&D Program of Hubei Province (No. 2023BAB077). CZ and the DS3Lab gratefully acknowledge the support from the Swiss National Science Foundation (Project Number 200021_184628), Innosuisse/SNF BRIDGE Discovery (Project Number 40B2-0_187132), European Union Horizon 2020 Research and Innovation Programme (DAPHNE, 957407), Botnar Research Centre for Child Health, Swiss Data Science Center, Alibaba, Cisco, eBay, Google Focused Research Awards, Microsoft Swiss Joint Research Center, Oracle Labs, Swisscom, Zurich Insurance, Chinese Scholarship Council and the Department of Computer Science at ETH Zurich.

Author information


Corresponding author

Correspondence to Jiawei Jiang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Jiang, J., Wei, Y., Liu, Y. et al. How good are machine learning clouds? Benchmarking two snapshots over 5 years. The VLDB Journal 33, 833–857 (2024). https://doi.org/10.1007/s00778-024-00842-3

