Abstract
Current machine learning techniques face well-known complaints: they require huge amounts of training data and proficient training skills, struggle with continual learning, risk catastrophic forgetting, and may leak data privacy or proprietary information. Most research efforts address one of these issues in isolation, paying less attention to the fact that they are entangled in practice. The prevailing big model paradigm, which has achieved impressive results in natural language processing and computer vision, has not yet resolved these issues, while itself becoming a serious source of carbon emissions. This article offers an overview of the learnware paradigm, which aims to spare users from building machine learning models from scratch by reusing small models, possibly for purposes beyond their original design. The key ingredient is the specification, which enables a trained model to be adequately identified and reused according to the requirements of future users who know nothing about the model in advance.
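To make the specification idea concrete, the following is a minimal, illustrative sketch, not the paper's implementation, of how a market of learnwares might identify a helpful model for a new user: each learnware carries a small sample set summarizing its training distribution (in the spirit of kernel-mean-embedding specifications), and the market returns the learnware whose summary is closest in maximum mean discrepancy to the user's own data. All names here (Learnware, identify_learnware, gamma) are hypothetical.

```python
# Hypothetical sketch of specification-based learnware identification.
import numpy as np

def gaussian_kernel(X, Y, gamma=1.0):
    """Gaussian (RBF) kernel matrix between rows of X and rows of Y."""
    sq = np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :] - 2 * X @ Y.T
    return np.exp(-gamma * sq)

def mmd2(X, Y, gamma=1.0):
    """Squared MMD between the empirical kernel mean embeddings of X and Y."""
    return (gaussian_kernel(X, X, gamma).mean()
            + gaussian_kernel(Y, Y, gamma).mean()
            - 2 * gaussian_kernel(X, Y, gamma).mean())

class Learnware:
    """A trained model paired with a small specification sample set."""
    def __init__(self, name, model, spec_samples):
        self.name = name
        self.model = model        # any object exposing .predict(X)
        self.spec = spec_samples  # small array summarizing the training data

def identify_learnware(market, user_data, gamma=1.0):
    """Return the learnware whose specification best matches the user's data."""
    return min(market, key=lambda lw: mmd2(user_data, lw.spec, gamma))
```

In this simplified view, the user never sees the learnwares' original training data; matching is done purely against the compact specifications, which is the point of letting users reuse models they know nothing about in advance.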
Acknowledgements This work was supported by the National Natural Science Foundation of China (Grant No. 62250069).