
Feature extractor stacking for cross-domain few-shot learning


Abstract

Cross-domain few-shot learning (CDFSL) addresses learning problems where knowledge needs to be transferred from one or more source domains into an instance-scarce target domain with an explicitly different distribution. Recently published CDFSL methods generally construct a universal model that combines knowledge of multiple source domains into one feature extractor. This enables efficient inference but necessitates re-computation of the extractor whenever a new source domain is added. Some of these methods are also incompatible with heterogeneous source domain extractor architectures. We propose feature extractor stacking (FES), a new CDFSL method for combining information from a collection of extractors, which can utilise heterogeneous pretrained extractors out of the box and does not maintain a universal model that needs to be re-computed when its extractor collection is updated. We present the basic FES algorithm, which is inspired by the classic stacked generalisation approach, and also introduce two variants: convolutional FES (ConFES) and regularised FES (ReFES). Given a target-domain task, these algorithms fine-tune each extractor independently, use cross-validation to extract training data for stacked generalisation from the support set, and learn a simple linear stacking classifier from this data. We evaluate our FES methods on the well-known Meta-Dataset benchmark, targeting image classification with convolutional neural networks, and show that they can achieve state-of-the-art performance.
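To make the procedure concrete, the following is a minimal sketch of the stacked-generalisation step described in the abstract, written in PyTorch since that is the paper's stated implementation language. It is not the authors' code: it assumes frozen extractors and substitutes a nearest-centroid head for the paper's per-extractor fine-tuning, and all function names (`centroid_scores`, `stacking_meta_features`, `fit_stacker`) are hypothetical.

```python
# Hypothetical sketch of FES-style stacking, assuming each "extractor" is a
# frozen torch module mapping a batch of images to feature vectors.
import numpy as np
import torch
import torch.nn as nn
from sklearn.model_selection import StratifiedKFold

def centroid_scores(f_train, y_train, f_val, n_classes):
    # Nearest-centroid stand-in for the per-extractor classifier:
    # negative distances to class means serve as class scores.
    centroids = torch.stack([f_train[y_train == c].mean(0) for c in range(n_classes)])
    return -torch.cdist(f_val, centroids)

def stacking_meta_features(extractors, x, y, n_classes, k=2):
    # Cross-validation over the support set yields out-of-fold class scores
    # from every extractor; concatenated, they form the meta-level training
    # data. k must not exceed the smallest per-class count in the support set.
    meta = torch.zeros(len(y), len(extractors) * n_classes)
    for tr, va in StratifiedKFold(n_splits=k).split(np.zeros(len(y)), y.numpy()):
        for e, f in enumerate(extractors):
            with torch.no_grad():
                s = centroid_scores(f(x[tr]), y[tr], f(x[va]), n_classes)
            meta[va, e * n_classes:(e + 1) * n_classes] = s
    return meta

def fit_stacker(meta, y, n_extractors, n_classes, epochs=200, lr=0.05):
    # The stacking classifier is a single linear layer over the concatenated
    # scores, trained with cross-entropy on the support labels (y: long tensor).
    stacker = nn.Linear(n_extractors * n_classes, n_classes)
    opt = torch.optim.Adam(stacker.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        nn.functional.cross_entropy(stacker(meta), y).backward()
        opt.step()
    return stacker
```

At query time, the same per-extractor scoring is applied to the query images and the learned stacker combines the scores. Going by the abstract, ConFES and ReFES differ in how the stacking weights are structured and regularised rather than in this overall pipeline.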


Availability of data and materials

All data used can be acquired publicly via https://github.com/google-research/meta-dataset for the official Meta-Dataset, https://github.com/cambridge-mlg/cnaps for three additional target domains, https://github.com/IBM/cdfsl-benchmark for four additional target domains, and https://data.vision.ee.ethz.ch/cvl/datasets_extra/food-101/ for one additional target domain.

Code availability

The implementation and the computational experiments were carried out in the Python programming language using the PyTorch deep learning library (Paszke et al., 2019). The code and data files are available on GitHub at https://github.com/HongyuJerryWang/FeatureExtractorStacking.

Notes

  1. Note that, strictly speaking, this also creates distinct domains because the joint probability distributions will differ. However, they will be strongly related.

References

  • Bateni, P., Barber, J., van de Meent, J., Wood, F. (2022). Enhancing few-shot image classification with unlabelled examples. IEEE/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA (pp. 1597-1606). IEEE.

  • Bateni, P., Goyal, R., Masrani, V., Wood, F., Sigal, L. (2020). Improved few-shot visual classification. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA (pp. 14481-14490). Computer Vision Foundation / IEEE.

  • Bossard, L., Guillaumin, M., Gool, L.V. (2014). Food-101 - mining discriminative components with random forests. Computer Vision - ECCV 2014 - 13th European Conference, Zurich, Switzerland (Vol. 8694, pp. 446-461). Springer.

  • Chen, W., Liu, Y., Kira, Z., Wang, Y.F., Huang, J. (2019). A closer look at few-shot classification. 7th International Conference on Learning Representations, New Orleans, LA, USA. OpenReview.net.

  • Chen, Y., Liu, Z., Xu, H., Darrell, T., Wang, X. (2021). Meta-baseline: Exploring simple meta-learning for few-shot learning. 2021 IEEE/CVF international conference on computer vision, Montreal, QC, Canada (pp. 9042-9051). IEEE.

  • Demšar, J. (2006). Statistical comparisons of classifiers over multiple data sets. Journal of Machine Learning Research, 7, 1–30.

  • Deng, J., Dong, W., Socher, R., Li, L., Li, K., Fei-Fei, L. (2009). Imagenet: A large-scale hierarchical image database. IEEE computer society conference on computer vision and pattern recognition, Miami, Florida, USA (pp. 248-255). IEEE Computer Society.

  • Dvornik, N., Schmid, C., Mairal, J. (2020). Selecting relevant features from a multi-domain representation for few-shot classification. Computer Vision - ECCV 2020 - 16th European Conference, Glasgow, UK (Vol. 12355, pp. 769-786). Springer.

  • Galeano, P., Joseph, E., & Lillo, R. E. (2015). The Mahalanobis distance for functional data with applications to classification. Technometrics, 57(2), 281–291.

  • Guo, Y., Codella, N., Karlinsky, L., Codella, J.V., Smith, J.R., Saenko, K., . . . Feris, R. (2020). A broader study of cross-domain few-shot learning. Computer Vision - ECCV 2020 - 16th European Conference, Glasgow, UK (Vol. 12372, pp. 124-141). Springer.

  • He, K., Zhang, X., Ren, S., Sun, J. (2016). Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA (pp. 770-778). IEEE Computer Society.

  • Ioffe, S., & Szegedy, C. (2015). Batch normalization: Accelerating deep network training by reducing internal covariate shift. Proceedings of the 32nd international conference on machine learning, Lille, France (Vol. 37, pp. 448-456). JMLR.org.

  • Li, W., Liu, X., Bilen, H. (2021). Universal representation learning from multiple domains for few-shot classification. 2021 IEEE/CVF international conference on computer vision, Montreal, QC, Canada (pp.9506-9515). IEEE.

  • Li, W., Liu, X., Bilen, H. (2022). Cross-domain few-shot learning with task-specific adapters. IEEE/CVF conference on computer vision and pattern recognition, New Orleans, LA, USA (pp. 7151-7160). IEEE.

  • Li, W., Liu, X., Bilen, H. (2022). Universal representation learning and task-specific adaptation for few-shot learning. https://github.com/VICO-UoE/URL. (Accessed: 2022-09-29)

  • Liu, L., Hamilton, W.L., Long, G., Jiang, J., Larochelle, H. (2021). A universal representation transformer layer for few-shot image classification. 9th international conference on learning representations, Virtual Event, Austria. OpenReview.net.

  • Liu, Y., Lee, J., Zhu, L., Chen, L., Shi, H., Yang, Y. (2021). A multi-mode modulator for multi-domain few-shot classification. 2021 IEEE/CVF international conference on computer vision, Montreal, QC, Canada (pp. 8433-8442). IEEE.

  • Mensink, T., Verbeek, J., Perronnin, F., & Csurka, G. (2013). Distance-based image classification: Generalizing to new classes at near-zero cost. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(11), 2624–2637.

  • Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Chintala, S. (2019). Pytorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32, Vancouver, BC, Canada (pp. 8024-8035).

  • Perez, E., Strub, F., de Vries, H., Dumoulin, V., Courville, A.C. (2018). FiLM: Visual reasoning with a general conditioning layer. Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, the 30th innovative Applications of Artificial Intelligence, and the 8th AAAI Symposium on Educational Advances in Artificial Intelligence, New Orleans, Louisiana, USA (pp. 3942-3951). AAAI Press.

  • Requeima, J., Gordon, J., Bronskill, J., Nowozin, S., Turner, R.E. (2019). Fast and flexible multi-task classification using conditional neural adaptive processes. Advances in Neural Information Processing Systems 32, Vancouver, BC, Canada (pp. 7957-7968).

  • Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., & Fei-Fei, L. (2015). Imagenet large scale visual recognition challenge. International Journal of Computer Vision, 115(3), 211–252.

  • Snell, J., Swersky, K., Zemel, R.S. (2017). Prototypical networks for few-shot learning. Advances in Neural Information Processing Systems 30, Long Beach, CA, USA (pp. 4077-4087).

  • Tan, M., & Le, Q.V. (2021). Efficientnetv2: Smaller models and faster training. Proceedings of the 38th International Conference on Machine Learning, Virtual Event (Vol. 139, pp. 10096-10106). PMLR.

  • Tibshirani, R., Saunders, M., Rosset, S., Zhu, J., & Knight, K. (2005). Sparsity and smoothness via the fused lasso. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 67(1), 91–108.

  • Triantafillou, E., Larochelle, H., Zemel, R.S., Dumoulin, V. (2021). Learning a universal template for few-shot dataset generalization. Proceedings of the 38th international conference on machine learning, Virtual Event (Vol. 139, pp. 10424-10433). PMLR.

  • Triantafillou, E., Zhu, T., Dumoulin, V., Lamblin, P., Evci, U., Xu, K., . . . Larochelle, H. (2020). Meta-dataset: A dataset of datasets for learning to learn from few examples. 8th International conference on learning representations, Addis Ababa, Ethiopia. OpenReview.net.

  • Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., . . . Polosukhin, I. (2017). Attention is all you need. Advances in Neural Information Processing Systems 30, Long Beach, CA, USA (pp. 5998-6008).

  • Vinyals, O., Blundell, C., Lillicrap, T., Kavukcuoglu, K., Wierstra, D. (2016). Matching networks for one shot learning. Advances in Neural Information Processing Systems 29, Barcelona, Spain (pp. 3630-3638).

  • Wolpert, D. H. (1992). Stacked generalization. Neural Networks, 5(2), 241–259.

  • Zaheer, M., Kottur, S., Ravanbakhsh, S., Póczos, B., Salakhutdinov, R., Smola, A.J. (2017). Deep sets. Advances in Neural Information Processing Systems 30, Long Beach, CA, USA (pp. 3391-3401).

Funding

This research was funded by the Ministry of Business, Innovation and Employment of New Zealand as part of a Smart Ideas project entitled "User-friendly Deep Learning". For details of the fund, see https://www.mbie.govt.nz/science-and-technology/science-and-innovation/funding-information-and-opportunities/investment-funds/endeavour-fund/.

Author information

Contributions

All authors contributed to the study conception and design. Material preparation, data collection, and analysis were performed by Hongyu Wang, who also wrote the first draft of the manuscript; all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Hongyu Wang.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

Editor: Steven Schockaert.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A: Additional heatmaps

Additional heatmaps visualising kernel weights on target domains with TSA fine-tuning are shown in Figs. 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39 and 40.
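For orientation, a heatmap in this style can be rendered with a few lines of matplotlib, as in the sketch below. The weight matrix is a random placeholder, and the row labels assume the extractors come from the eight Meta-Dataset training domains; neither is taken from the paper's plotting code.

```python
# Hypothetical plotting sketch for a stacking-kernel heatmap (cf. Figs. 14-40).
# In practice the weights would be the learned kernel mapping each extractor's
# class scores to the final prediction, not random numbers.
import numpy as np
import matplotlib.pyplot as plt

extractors = ["ilsvrc_2012", "omniglot", "aircraft", "cu_birds",
              "dtd", "quickdraw", "fungi", "vgg_flower"]  # Meta-Dataset training domains
n_classes = 5
weights = np.random.default_rng(0).random((len(extractors), n_classes))

fig, ax = plt.subplots(figsize=(6, 4))
im = ax.imshow(weights, cmap="viridis", aspect="auto")
ax.set_yticks(range(len(extractors)))
ax.set_yticklabels(extractors)
ax.set_xticks(range(n_classes))
ax.set_xlabel("target-task class")
ax.set_ylabel("source extractor")
fig.colorbar(im, ax=ax, label="stacking weight")
fig.tight_layout()
fig.savefig("fes_kernel_heatmap.png", dpi=150)
```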

Fig. 14: FES kernel for mscoco
Fig. 15: ConFES kernel for mscoco
Fig. 16: ReFES kernel for mscoco
Fig. 17: FES kernel for mnist
Fig. 18: ConFES kernel for mnist
Fig. 19: ReFES kernel for mnist
Fig. 20: FES kernel for cifar10
Fig. 21: ConFES kernel for cifar10
Fig. 22: ReFES kernel for cifar10
Fig. 23: FES kernel for cifar100
Fig. 24: ConFES kernel for cifar100
Fig. 25: ReFES kernel for cifar100
Fig. 26: FES kernel for CropDisease
Fig. 27: ConFES kernel for CropDisease
Fig. 28: ReFES kernel for CropDisease
Fig. 29: FES kernel for EuroSAT
Fig. 30: ConFES kernel for EuroSAT
Fig. 31: ReFES kernel for EuroSAT
Fig. 32: FES kernel for ISIC
Fig. 33: ConFES kernel for ISIC
Fig. 34: ReFES kernel for ISIC
Fig. 35: FES kernel for ChestX
Fig. 36: ConFES kernel for ChestX
Fig. 37: ReFES kernel for ChestX
Fig. 38: FES kernel for Food101
Fig. 39: ConFES kernel for Food101
Fig. 40: ReFES kernel for Food101

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wang, H., Frank, E., Pfahringer, B. et al. Feature extractor stacking for cross-domain few-shot learning. Mach Learn 113, 121–158 (2024). https://doi.org/10.1007/s10994-023-06483-x
