Semi-Supervised Learning on a Budget: Scaling Up to Large Datasets

  • Conference paper
Computer Vision – ACCV 2012 (ACCV 2012)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 7724)

Abstract

Internet data sources provide large image datasets, most of them without explicit labels. This setting is ideal for semi-supervised learning, which seeks to exploit labeled data together with a large pool of unlabeled data points to improve learning and classification. Although theory and algorithms have made considerable progress, there has been limited success in translating that progress to the large-scale datasets that inspired these methods. We investigate the computational complexity of popular graph-based semi-supervised learning algorithms together with different possible speed-ups. Our findings lead to a new algorithm that scales to datasets up to 40 times larger than previous approaches can handle while also increasing classification performance. Our method is based on the key insight that, by employing a density-based measure, unlabeled data points can be selected in a manner similar to an active learning scheme. This leads to a compact graph, which improves performance by up to 11.6% at reduced computational cost.
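
The idea in the abstract (pick a small, dense subset of the unlabeled pool, then run graph-based propagation on the resulting compact graph) can be illustrated with a minimal sketch. This is not the authors' algorithm: the density proxy (inverse mean distance to the k nearest neighbours), the Gaussian-kernel graph, the closed-form label-spreading step, and all names and parameters (`knn_density`, `select_budget`, `propagate`, `k`, `sigma`, `alpha`) are assumptions made here for illustration only.

```python
import numpy as np

def knn_density(X, k=5):
    """Assumed density proxy: inverse mean distance to the k nearest neighbours."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)          # a point is not its own neighbour
    knn = np.sort(d, axis=1)[:, :k]      # k smallest distances per row
    return 1.0 / (knn.mean(axis=1) + 1e-12)

def select_budget(X_unlabeled, budget, k=5):
    """Return indices of the `budget` unlabeled points with the highest density."""
    dens = knn_density(X_unlabeled, k)
    return np.argsort(-dens)[:budget]

def propagate(X, y, n_classes, sigma=1.0, alpha=0.9):
    """Closed-form label spreading on a dense Gaussian graph.

    y holds class indices for labeled points and -1 for unlabeled ones.
    """
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)             # no self-loops
    deg = W.sum(axis=1)
    S = W / np.sqrt(np.outer(deg, deg))  # symmetric normalisation D^-1/2 W D^-1/2
    Y = np.zeros((len(X), n_classes))
    idx = np.flatnonzero(y >= 0)
    Y[idx, y[idx]] = 1.0                 # one-hot rows for labeled points
    # Solving an n x n system is cubic in n, which is why a small graph helps.
    F = np.linalg.solve(np.eye(len(X)) - alpha * S, Y)
    return F.argmax(axis=1)
```

In this sketch the expensive step is the linear solve over all graph nodes, so restricting the graph to `budget` selected points plus the labeled ones directly bounds the cost. A sparse k-NN graph or an iterative solver would be the natural next step for larger budgets.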


Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ebert, S., Fritz, M., Schiele, B. (2013). Semi-Supervised Learning on a Budget: Scaling Up to Large Datasets. In: Lee, K.M., Matsushita, Y., Rehg, J.M., Hu, Z. (eds) Computer Vision – ACCV 2012. ACCV 2012. Lecture Notes in Computer Science, vol 7724. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37331-2_18

  • DOI: https://doi.org/10.1007/978-3-642-37331-2_18

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-37330-5

  • Online ISBN: 978-3-642-37331-2

  • eBook Packages: Computer Science, Computer Science (R0)
