Abstract
As computer vision research considers more object categories and greater variation within each category, larger and more exhaustive datasets become necessary. Collecting such datasets, however, is laborious and monotonous. We consider the setting in which many images have been collected automatically for a visual category (typically by internet search), and relevant images must be separated from noise. We present a discriminative learning process that employs active, online learning to classify many images quickly with minimal user input. The principal advantage of this work over previous approaches is its scalability. We demonstrate precision that is often superior to the state of the art, with scalability that exceeds previous work.
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Collins, B., Deng, J., Li, K., Fei-Fei, L. (2008). Towards Scalable Dataset Construction: An Active Learning Approach. In: Forsyth, D., Torr, P., Zisserman, A. (eds) Computer Vision – ECCV 2008. ECCV 2008. Lecture Notes in Computer Science, vol 5302. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88682-2_8
DOI: https://doi.org/10.1007/978-3-540-88682-2_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-88681-5
Online ISBN: 978-3-540-88682-2
eBook Packages: Computer Science (R0)