
Feature extractor stacking for cross-domain few-shot learning


Abstract

Cross-domain few-shot learning (CDFSL) addresses learning problems where knowledge needs to be transferred from one or more source domains into an instance-scarce target domain with an explicitly different distribution. Recently published CDFSL methods generally construct a universal model that combines knowledge of multiple source domains into one feature extractor. This enables efficient inference but necessitates re-computation of the extractor whenever a new source domain is added. Some of these methods are also incompatible with heterogeneous source domain extractor architectures. We propose feature extractor stacking (FES), a new CDFSL method for combining information from a collection of extractors, which can utilise heterogeneous pretrained extractors out of the box and does not maintain a universal model that needs to be re-computed when its extractor collection is updated. We present the basic FES algorithm, which is inspired by the classic stacked generalisation approach, and also introduce two variants: convolutional FES (ConFES) and regularised FES (ReFES). Given a target-domain task, these algorithms fine-tune each extractor independently, use cross-validation to extract training data for stacked generalisation from the support set, and learn a simple linear stacking classifier from this data. We evaluate our FES methods on the well-known Meta-Dataset benchmark, targeting image classification with convolutional neural networks, and show that they can achieve state-of-the-art performance.
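To make the procedure concrete, the following is a minimal sketch of the stacked-generalisation step described in the abstract, written in PyTorch since that is the paper's stated implementation language. It is not the authors' code: it assumes frozen extractors and substitutes a nearest-centroid head for the paper's per-extractor fine-tuning, and all function names (`centroid_scores`, `stacking_meta_features`, `fit_stacker`) are hypothetical.

```python
# Hypothetical sketch of FES-style stacking, assuming each "extractor" is a
# frozen torch module mapping a batch of images to feature vectors.
import numpy as np
import torch
import torch.nn as nn
from sklearn.model_selection import StratifiedKFold

def centroid_scores(f_train, y_train, f_val, n_classes):
    # Nearest-centroid stand-in for the per-extractor classifier:
    # negative distances to class means serve as class scores.
    centroids = torch.stack([f_train[y_train == c].mean(0) for c in range(n_classes)])
    return -torch.cdist(f_val, centroids)

def stacking_meta_features(extractors, x, y, n_classes, k=2):
    # Cross-validation over the support set yields out-of-fold class scores
    # from every extractor; concatenated, they form the meta-level training
    # data. k must not exceed the smallest per-class count in the support set.
    meta = torch.zeros(len(y), len(extractors) * n_classes)
    for tr, va in StratifiedKFold(n_splits=k).split(np.zeros(len(y)), y.numpy()):
        for e, f in enumerate(extractors):
            with torch.no_grad():
                s = centroid_scores(f(x[tr]), y[tr], f(x[va]), n_classes)
            meta[va, e * n_classes:(e + 1) * n_classes] = s
    return meta

def fit_stacker(meta, y, n_extractors, n_classes, epochs=200, lr=0.05):
    # The stacking classifier is a single linear layer over the concatenated
    # scores, trained with cross-entropy on the support labels (y: long tensor).
    stacker = nn.Linear(n_extractors * n_classes, n_classes)
    opt = torch.optim.Adam(stacker.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        nn.functional.cross_entropy(stacker(meta), y).backward()
        opt.step()
    return stacker
```

At query time, the same per-extractor scoring is applied to the query images and the learned stacker combines the scores. Going by the abstract, ConFES and ReFES differ in how the stacking weights are structured and regularised rather than in this overall pipeline.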


Availability of data and materials

All data used can be acquired publicly via https://github.com/google-research/meta-dataset for the official Meta-Dataset, https://github.com/cambridge-mlg/cnaps for three additional target domains, https://github.com/IBM/cdfsl-benchmark for four additional target domains, and https://data.vision.ee.ethz.ch/cvl/datasets_extra/food-101/ for one additional target domain.

Code availability

The implementation and the computational experiments were carried out in the Python programming language using the PyTorch deep learning library (Paszke et al., 2019). The code and data files are available on GitHub at https://github.com/HongyuJerryWang/FeatureExtractorStacking.

Notes

  1. Note that, strictly speaking, this also creates distinct domains because the joint probability distributions will differ. However, they will be strongly related.

References

  • Bateni, P., Barber, J., van de Meent, J., Wood, F. (2022). Enhancing few-shot image classification with unlabelled examples. IEEE/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA (pp. 1597-1606). IEEE.

  • Bateni, P., Goyal, R., Masrani, V., Wood, F., Sigal, L. (2020). Improved few-shot visual classification. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA (pp. 14481-14490). Computer Vision Foundation / IEEE.

  • Bossard, L., Guillaumin, M., Gool, L.V. (2014). Food-101 - mining discriminative components with random forests. Computer Vision - ECCV 2014 - 13th European Conference, Zurich, Switzerland (Vol. 8694, pp. 446-461). Springer.

  • Chen, W., Liu, Y., Kira, Z., Wang, Y.F., Huang, J. (2019). A closer look at few-shot classification. 7th International Conference on Learning Representations, New Orleans, LA, USA. OpenReview.net.

  • Chen, Y., Liu, Z., Xu, H., Darrell, T., Wang, X. (2021). Meta-baseline: Exploring simple meta-learning for few-shot learning. 2021 IEEE/CVF international conference on computer vision, Montreal, QC, Canada (pp. 9042-9051). IEEE.

  • Demšar, J. (2006). Statistical comparisons of classifiers over multiple data sets. Journal of Machine Learning Research, 7, 1–30.

  • Deng, J., Dong, W., Socher, R., Li, L., Li, K., Fei-Fei, L. (2009). Imagenet: A large-scale hierarchical image database. IEEE computer society conference on computer vision and pattern recognition, Miami, Florida, USA (pp. 248-255). IEEE Computer Society.

  • Dvornik, N., Schmid, C., Mairal, J. (2020). Selecting relevant features from a multi-domain representation for few-shot classification. Computer Vision - ECCV 2020 - 16th European Conference, Glasgow, UK (Vol. 12355, pp. 769-786). Springer.

  • Galeano, P., Joseph, E., & Lillo, R. E. (2015). The Mahalanobis distance for functional data with applications to classification. Technometrics, 57(2), 281–291.

  • Guo, Y., Codella, N., Karlinsky, L., Codella, J.V., Smith, J.R., Saenko, K., . . . Feris, R. (2020). A broader study of cross-domain few-shot learning. Computer Vision - ECCV 2020 - 16th European Conference, Glasgow, UK (Vol. 12372, pp. 124-141). Springer.

  • He, K., Zhang, X., Ren, S., Sun, J. (2016). Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA (pp. 770-778). IEEE Computer Society.

  • Ioffe, S., & Szegedy, C. (2015). Batch normalization: Accelerating deep network training by reducing internal covariate shift. Proceedings of the 32nd international conference on machine learning, Lille, France (Vol. 37, pp. 448-456). JMLR.org.

  • Li, W., Liu, X., Bilen, H. (2021). Universal representation learning from multiple domains for few-shot classification. 2021 IEEE/CVF international conference on computer vision, Montreal, QC, Canada (pp.9506-9515). IEEE.

  • Li, W., Liu, X., Bilen, H. (2022). Cross-domain few-shot learning with task-specific adapters. IEEE/CVF conference on computer vision and pattern recognition, New Orleans, LA, USA (pp. 7151-7160). IEEE.

  • Li, W., Liu, X., Bilen, H. (2022). Universal representation learning and task-specific adaptation for few-shot learning. https://github.com/VICO-UoE/URL. (Accessed: 2022-09-29)

  • Liu, L., Hamilton, W.L., Long, G., Jiang, J., Larochelle, H. (2021). A universal representation transformer layer for few-shot image classification. 9th international conference on learning representations, Virtual Event, Austria. OpenReview.net.

  • Liu, Y., Lee, J., Zhu, L., Chen, L., Shi, H., Yang, Y. (2021). A multi-mode modulator for multi-domain few-shot classification. 2021 IEEE/CVF international conference on computer vision, Montreal, QC, Canada (pp. 8433-8442). IEEE.

  • Mensink, T., Verbeek, J., Perronnin, F., & Csurka, G. (2013). Distance-based image classification: Generalizing to new classes at near-zero cost. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(11), 2624–2637.

  • Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Chintala, S. (2019). Pytorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32, Vancouver, BC, Canada (pp. 8024-8035).

  • Perez, E., Strub, F., de Vries, H., Dumoulin, V., Courville, A.C. (2018). FiLM: Visual reasoning with a general conditioning layer. Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, the 30th innovative Applications of Artificial Intelligence, and the 8th AAAI Symposium on Educational Advances in Artificial Intelligence, New Orleans, Louisiana, USA (pp. 3942-3951). AAAI Press.

  • Requeima, J., Gordon, J., Bronskill, J., Nowozin, S., Turner, R.E. (2019). Fast and flexible multi-task classification using conditional neural adaptive processes. Advances in Neural Information Processing Systems 32, Vancouver, BC, Canada (pp. 7957-7968).

  • Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., & Fei-Fei, L. (2015). Imagenet large scale visual recognition challenge. International Journal of Computer Vision, 115(3), 211–252.

  • Snell, J., Swersky, K., Zemel, R.S. (2017). Prototypical networks for few-shot learning. Advances in Neural Information Processing Systems 30, Long Beach, CA, USA (pp. 4077-4087).

  • Tan, M., & Le, Q.V. (2021). Efficientnetv2: Smaller models and faster training. Proceedings of the 38th International Conference on Machine Learning, Virtual Event (Vol. 139, pp. 10096-10106). PMLR.

  • Tibshirani, R., Saunders, M., Rosset, S., Zhu, J., & Knight, K. (2005). Sparsity and smoothness via the fused lasso. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 67(1), 91–108.

  • Triantafillou, E., Larochelle, H., Zemel, R.S., Dumoulin, V. (2021). Learning a universal template for few-shot dataset generalization. Proceedings of the 38th international conference on machine learning, Virtual Event (Vol. 139, pp. 10424-10433). PMLR.

  • Triantafillou, E., Zhu, T., Dumoulin, V., Lamblin, P., Evci, U., Xu, K., . . . Larochelle, H. (2020). Meta-dataset: A dataset of datasets for learning to learn from few examples. 8th International conference on learning representations, Addis Ababa, Ethiopia. OpenReview.net.

  • Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., . . . Polosukhin, I. (2017). Attention is all you need. Advances in Neural Information Processing Systems 30, Long Beach, CA, USA (pp. 5998-6008).

  • Vinyals, O., Blundell, C., Lillicrap, T., Kavukcuoglu, K., Wierstra, D. (2016). Matching networks for one shot learning. Advances in Neural Information Processing Systems 29, Barcelona, Spain (pp. 3630-3638).

  • Wolpert, D. H. (1992). Stacked generalization. Neural Networks, 5(2), 241–259.

  • Zaheer, M., Kottur, S., Ravanbakhsh, S., Póczos, B., Salakhutdinov, R., Smola, A.J. (2017). Deep sets. Advances in Neural Information Processing Systems 30, Long Beach, CA, USA (pp. 3391-3401).

Funding

This research was funded by the Ministry of Business, Innovation and Employment of New Zealand as part of a Smart Ideas project entitled "User-friendly Deep Learning". For details of the fund, see https://www.mbie.govt.nz/science-and-technology/science-and-innovation/funding-information-and-opportunities/investment-funds/endeavour-fund/.

Author information

Contributions

All authors contributed to the study conception and design. Material preparation, data collection, and analysis were performed by Hongyu Wang, who also wrote the first draft of the manuscript; all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Hongyu Wang.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

Editor: Steven Schockaert.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A: Additional heatmaps

Additional heatmaps visualising kernel weights on target domains with TSA fine-tuning are shown in Figs. 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39 and 40.
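For orientation, a heatmap in this style can be rendered with a few lines of matplotlib, as in the sketch below. The weight matrix is a random placeholder, and the row labels assume the extractors come from the eight Meta-Dataset training domains; neither is taken from the paper's plotting code.

```python
# Hypothetical plotting sketch for a stacking-kernel heatmap (cf. Figs. 14-40).
# In practice the weights would be the learned kernel mapping each extractor's
# class scores to the final prediction, not random numbers.
import numpy as np
import matplotlib.pyplot as plt

extractors = ["ilsvrc_2012", "omniglot", "aircraft", "cu_birds",
              "dtd", "quickdraw", "fungi", "vgg_flower"]  # Meta-Dataset training domains
n_classes = 5
weights = np.random.default_rng(0).random((len(extractors), n_classes))

fig, ax = plt.subplots(figsize=(6, 4))
im = ax.imshow(weights, cmap="viridis", aspect="auto")
ax.set_yticks(range(len(extractors)))
ax.set_yticklabels(extractors)
ax.set_xticks(range(n_classes))
ax.set_xlabel("target-task class")
ax.set_ylabel("source extractor")
fig.colorbar(im, ax=ax, label="stacking weight")
fig.tight_layout()
fig.savefig("fes_kernel_heatmap.png", dpi=150)
```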

Fig. 14: FES kernel for mscoco
Fig. 15: ConFES kernel for mscoco
Fig. 16: ReFES kernel for mscoco
Fig. 17: FES kernel for mnist
Fig. 18: ConFES kernel for mnist
Fig. 19: ReFES kernel for mnist
Fig. 20: FES kernel for cifar10
Fig. 21: ConFES kernel for cifar10
Fig. 22: ReFES kernel for cifar10
Fig. 23: FES kernel for cifar100
Fig. 24: ConFES kernel for cifar100
Fig. 25: ReFES kernel for cifar100
Fig. 26: FES kernel for CropDisease
Fig. 27: ConFES kernel for CropDisease
Fig. 28: ReFES kernel for CropDisease
Fig. 29: FES kernel for EuroSAT
Fig. 30: ConFES kernel for EuroSAT
Fig. 31: ReFES kernel for EuroSAT
Fig. 32: FES kernel for ISIC
Fig. 33: ConFES kernel for ISIC
Fig. 34: ReFES kernel for ISIC
Fig. 35: FES kernel for ChestX
Fig. 36: ConFES kernel for ChestX
Fig. 37: ReFES kernel for ChestX
Fig. 38: FES kernel for Food101
Fig. 39: ConFES kernel for Food101
Fig. 40: ReFES kernel for Food101

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wang, H., Frank, E., Pfahringer, B. et al. Feature extractor stacking for cross-domain few-shot learning. Mach Learn 113, 121–158 (2024). https://doi.org/10.1007/s10994-023-06483-x
