Abstract
We present a noise-resilient probabilistic model for active learning of a Gaussian process classifier from crowds, i.e., a set of noisy labelers. It explicitly models both the overall label noise and the expertise level of each individual labeler with two levels of flip models. Expectation propagation is adopted for efficient approximate Bayesian inference of our probabilistic model for classification, based on which a generalized EM algorithm is derived to estimate both the global label noise and the expertise of each individual labeler. The probabilistic nature of our model immediately allows the adoption of the prediction entropy for active selection of data samples to be labeled, and of the estimated expertise for active selection of high-quality labelers to label the data. We apply the proposed model to four visual recognition tasks, i.e., object category recognition, multi-modal activity recognition, gender recognition, and fine-grained classification, on four datasets with real crowdsourced labels from Amazon Mechanical Turk. The experiments clearly demonstrate the efficacy of the proposed model. In addition, we extend the proposed model with the Predictive Active Set Selection Method to speed up the active learning system, and verify its efficacy with experiments on the first three datasets. The results show that the extended model not only maintains a higher accuracy but also achieves a higher efficiency.
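The two active-selection criteria in the abstract can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: in the actual system the predictive probabilities come from the EP-approximated Gaussian process posterior, and the expertise values come from the generalized EM estimates.

```python
import math

def predictive_entropy(p):
    """Binary prediction entropy H(p) = -p*log(p) - (1-p)*log(1-p)."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))

def select_sample(probs):
    """Active sample selection: pick the unlabeled sample whose
    predictive probability is most uncertain (maximal entropy)."""
    return max(range(len(probs)), key=lambda i: predictive_entropy(probs[i]))

def select_labeler(expertise):
    """Active labeler selection: pick the labeler with the highest
    estimated expertise to label the chosen sample."""
    return max(range(len(expertise)), key=lambda j: expertise[j])
```

For example, with predictive probabilities `[0.9, 0.5, 0.1]` the second sample (probability 0.5, maximal entropy) is selected for labeling.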
References
Ambati, V., Vogel, S., & Carbonell, J. (May 2010). Active learning and crowd-sourcing for machine translation. In: Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC’10).
Branson, S., Perona, P., & Belongie, S. (November 2011). Strong supervision from weak annotation: Interactive training of deformable part models. In: Proceedings of the IEEE International Conference on Computer Vision, Barcelona, Spain.
Branson, S., Wah, C., Schroff, F., Babenko, B., Welinder, P., Perona, P., & Belongie, S. (September 2010). Visual recognition with humans in the loop. In: Proceedings of the European Conference on Computer Vision, Heraklion, Crete.
Burl, M., & Perona, P. (1998). Using hierarchical shape models to spot keywords in cursive handwriting data. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 23–28). IEEE.
Burl, M., Leung, T. K., & Perona, P. (1995). Face localization via shape statistics. In: Proceedings of the First International Workshop on Automatic Face and Gesture Recognition (pp. 154–159). Zurich.
Chen, S., Zhang, J., Chen, G., & Zhang, C. (2010). What if the irresponsible teachers are dominating? A method of training on samples and clustering on teachers. In: Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence (AAAI).
Dekel, O., & Shamir, O. (2009). Good learners for evil teachers. In: Proceedings of the IEEE International Conference on Machine Learning. IEEE.
Dekel, O., & Shamir, O. (2009). Vox populi: Collecting high-quality labels from a crowd. In: Proceedings of the 22nd Annual Conference on Learning Theory.
Deng, J., Krause, J., & Fei-Fei, L. (June 2013). Fine-grained crowdsourcing for fine-grained recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE.
Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., & Fei-Fei, L. (June 2009). ImageNet: A large-scale hierarchical image database. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 248–255). IEEE.
Donmez, P., Carbonell, J., & Schneider, J. (2009). Efficiently learning the accuracy of labeling sources for selective sampling. In: Special Interest Group on Knowledge Discovery in Data (SIGKDD).
Donmez, P., Carbonell, J., & Schneider, J. (2010). A probabilistic framework to learn from multiple annotators with time-varying accuracy. In: Proceedings of the SIAM Conference on Data Mining (SDM). Philadelphia: SIAM.
Ebert, S., Fritz, M., & Schiele, B. (2012). Ralf: A reinforced active learning formulation for object class recognition. In: Proceedings of the IEEE International Conference on Computer Vision. IEEE.
Fergus, R., Fei-Fei, L., Perona, P., & Zisserman, A. (Oct. 2005). Learning object categories from Google's image search. In: Proceedings of the 10th International Conference on Computer Vision, Beijing.
Gibbs, M., & Mackay, D. (2000). Variational gaussian process classifiers. IEEE Transactions on Neural Networks, 11(6), 1458–1464.
Groot, P., Birlutiu, A., & Heskes, T. (2011). Learning from multiple annotators with gaussian processes. In: Proceedings of the Artificial Neural Networks and Machine Learning—ICANN 2011 21st International Conference on Artificial Neural Networks, Part II, Espoo, June 14–17, 2011 (pp. 159–164).
Henao, R., & Winther, O. (2010). Pass-gp: Predictive active set selection for gaussian processes. In: Proceedings of the Machine Learning for Signal Processing (MLSP), 2010 IEEE International Workshop (pp. 148–153).
Henao, R., & Winther, O. (2012). Predictive active set selection methods for gaussian processes. Neurocomputing, 80, 10–18.
Hua, G., Long, C., Yang, M., & Gao, Y. (2013). Collaborative active learning of a kernel machine ensemble for recognition. In: Proceedings IEEE International Conference on Computer Vision (pp. 1209–1216). IEEE
Kapoor, A., Grauman, K., Urtasun, R., & Darrell, T. (2007). Active learning with gaussian processes for object categorization. In: Proceedings IEEE International Conference on Computer Vision.
Kapoor, A., Hua, G., Akbarzadeh, A., & Baker, S. (2009). Which faces to tag: Adding prior constraints into active learning. In: Proceedings IEEE International Conference on Computer Vision. IEEE
Kim, H.-C., & Ghahramani, Z. (2006). Bayesian gaussian process classification with the EM-EP algorithm. IEEE Transactions Pattern Analysis and Machine Intelligence, 28(12), 1948–1959.
Kim, H.-C., & Ghahramani, Z. (2008). Outlier robust gaussian process classification. In: Proceedings of the Structural, Syntactic, and Statistical Pattern Recognition. Joint IAPR International Workshops (SSPR/SPR) (pp. 896–905).
Krizhevsky, A., Sutskever, I., & Hinton, G. (2012). ImageNet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems 25 (pp. 1097–1105).
Lawrence, N. D., Seeger, M., & Herbrich, R. (2003). Fast sparse gaussian process methods: The informative vector machine. In: Advances in Neural Information Processing Systems (vol. 15, pp. 609–616). Cambridge: MIT Press.
Lin, Y., Lv, F., Zhu, S., Yang, M., Cour, T., Yu, K., Cao, L., & Huang, T. (2011). Large-scale image classification: Fast feature extraction and svm training. In: Proceedings IEEE International Conference on Computer Vision. IEEE
Liu, D., Hua, G., Viola, P., & Chen, T. (2008). Integrated feature selection and higher-order spatial feature extraction for object categorization. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2008 (pp. 1–8). IEEE.
Long, C., Hua, G., & Kapoor, A. (December 2013). Active visual recognition with expertise estimation in crowdsourcing. In: Proceedings IEEE International Conference on Computer Vision. IEEE
Loy, C., Hospedales, T., Xiang, T., & Gong, S. (2012). Stream-based joint exploration-exploitation active learning. In: Proceedings IEEE International Conference on Computer Vision. IEEE
Minka, T. (2001). A family of algorithms for approximate Bayesian inference. Ph.D. Thesis. Cambridge: MIT.
Naish-Guzman, A., & Holden, S.B. (2007). The generalized FITC approximation. In: Neural Information Processing Systems (NIPS) (pp. 1057–1064).
Neal, R. M. (1997). Monte Carlo implementation of gaussian process models for bayesian regression and classification. Technical Report CRG-TR-97-2, University of Toronto.
Opper, M., & Winther, O. (2000). Gaussian processes for classification: Mean-field algorithms. Neural Computation, 12(11), 2655–2684.
Parikh, D. (November 2011). Recognizing jumbled images: The role of local and global information in image classification. In: Proceedings IEEE International Conference on Computer Vision. IEEE
Parikh, D., & Zitnick, L. (June 2010). The role of features, algorithms and data in visual recognition. In: Proceedings IEEE Conference on Computer Vision and Pattern Recognition. IEEE
Parikh, D., & Zitnick, L. (June 2011). Finding the weakest link in person detectors. In: Proceedings IEEE Conference on Computer Vision and Pattern Recognition. IEEE
Parikh, D., Zitnick, C. L., & Chen, T. (2012). Exploring tiny images: The roles of appearance and contextual information for machine and human object recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(10), 1978–1991.
Patterson, G., Horn, G. V., Belongie, S., Perona, P., & Hays, J. (2013). Bootstrapping fine-grained classifiers: Active learning with a crowd in the loop. In: Proceedings Neural Information Processing Systems (NIPS) 2013 Crowd Workshops.
Quiñonero-Candela, J., Rasmussen, C. E., & Herbrich, R. (2005). A unifying view of sparse approximate gaussian process regression. Journal of Machine Learning Research, 6, 1939–1959.
Rasmussen, C. E. (2006). Gaussian processes for machine learning. Cambridge, MA: MIT Press.
Raykar, V. C., & Yu, S. (2012). Eliminating spammers and ranking annotators for crowdsourced labeling tasks. Journal of Machine Learning Research, 13, 491–518 .
Raykar, V. C., Yu, S., Zhao, L. H., Jerebko, A., Florin, C., Valadez, G. H., Bogoni, L., & Moy, L. (2009). Supervised learning from multiple experts: whom to trust when everyone lies a bit. In: Proceedings IEEE International Conference on Machine Learning. IEEE
Rodrigues, F., Pereira, F. C., & Ribeiro, B. (2013). Learning from multiple annotators: Distinguishing good from random labelers. Pattern Recognition Letters, 34(12), 1428–1436.
Rodrigues, F., Pereira, F., & Ribeiro, B. (2014). Gaussian process classification and active learning with multiple annotators. In: Proceedings IEEE International Conference on Machine Learning. IEEE
Rodrigues, F., Pereira, F.C., & Ribeiro, B. (2013). Sequence labeling with multiple annotators. Machine Learning, 95(2), 165–181.
Roy, N., & Mccallum, A. (2001). Toward optimal active learning through sampling estimation of error reduction. In: Proceedings IEEE International Conference on Machine Learning, pp. 441–448. Burlington, MA: Morgan Kaufmann.
Sanchez, J., & Perronnin, F. (2011). High-dimensional signature compression for large-scale image classification. In: Proceedings IEEE International Conference on Computer Vision. IEEE
Seeger, M. (2002). Pac-Bayesian generalisation error bounds for gaussian process classification. Journal of Machine Learning Research, 3, 233–269.
Seeger, M., Williams, C. K. I., & Lawrence, N. D. (2003). Fast forward selection to speed up sparse gaussian process regression. In: Proceedings of the Workshop on Artificial Intelligence and Statistics, (vol. 9).
Simpson, E., Roberts, S. J., Psorakis, I., & Smith, A. (2013). Dynamic bayesian combination of multiple imperfect classifiers. In: Decision Making and Imperfection (pp. 1–35). Berlin: Springer
Snelson, E., & Ghahramani Z. (2006). Sparse gaussian processes using pseudo-inputs. In: Advances in Neural information Processing Systems (pp. 1257–1264). Cambridge: MIT Press.
Snelson, E., & Ghahramani, Z. (2006). Variable noise and dimensionality reduction for sparse gaussian processes. In: Proceedings of the Conference on Uncertainty in Artificial Intelligence (UAI). Edinburgh: AUAI Press.
Spriggs, E. H., Torre, F. D. L., & Hebert, M. (2009). Temporal segmentation and activity classification from first-person sensing. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshop. IEEE
Thouless, D. J., Anderson, P. W., & Palmer, R. G. (1977). Solution of a “solvable model of a spin glass”. Philosophical Magazine, 35, 593.
Titsias, M. K. (2009). Variational learning of inducing variables in sparse gaussian processes. Artificial Intelligence and Statistics, 12, 567–574.
Tivive, F. H. C., & Bouzerdoum, A. (2006). A gender recognition system using shunting inhibitory convolutional neural networks. In: Proceedings of the International Joint Conference on Neural Networks, IJCNN 2006, Part of the IEEE World Congress on Computational Intelligence, WCCI 2006, Vancouver, BC, 16–21 July 2006, (pp. 5336–5341). IEEE
Vijayanarasimhan, S., & Grauman, K. (2014). Large-scale live active learning: Training object detectors with crawled data and crowds. International Journal of Computer Vision (IJCV), 108(1–2), 97–114.
von Ahn, L., & Dabbish, L. (2004). Labeling images with a computer game. In: Proceedings ACM Conference on Human Factors in Computing Systems (pp. 319–326). New York, NY: ACM
von Ahn, L., Liu, R., & Blum, M. (2006). Peekaboom: A game for locating objects in images. In: Proceedings ACM Conference on Human Factors in Computing Systems (pp. 55–64). New York, NY: ACM
Vondrick, C., & Ramanan, D. (2011). Video annotation and tracking with active learning. In: Neural Information Processing Systems (NIPS) (pp. 28–36). Cambridge, MA: MIT Press
Wah, C., Branson, S., Perona, P., & Belongie, S. (November 2011). Multiclass recognition and part localization with humans in the loop. In: Proceedings IEEE International Conference on Computer Vision, Barcelona, Spain. IEEE
Welinder, P., & Perona, P. (June 2010). Online crowdsourcing: Rating annotators and obtaining cost-effective labels. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, San Francisco. IEEE.
Welinder, P., Branson, S., Belongie, S., & Perona, P. (2010). The multidimensional wisdom of crowds. In: Neural Information Processing Systems (NIPS).
Williams, C., & Barber, D. (1998). Bayesian classification with gaussian processes. IEEE Trans Pattern Analysis and Machine Intelligence, 20(12), 1342–1351.
Wu, O., Hu, W., & Gao, J. (2011). Learning to rank under multiple annotators. In: Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence, Volume Two (pp. 1571–1576). Menlo Park, CA: AAAI Press.
Yan, F., & Qi, Y. A. (2010). Sparse gaussian process regression via l1 penalization. In: Proceedings IEEE International Conference on Machine Learning (pp. 1183–1190). IEEE
Yan, Y., Rosales, R., Fung, G., & Dy, J. G. (2011). Active learning from crowds. In: Proceedings IEEE International Conference on Machine Learning (pp. 1161–1168). IEEE
Yan, Y., Rosales, R., Fung, G., & Dy, J. (2012). Active learning from multiple knowledge sources. In: Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics (AISTATS).
Yao, A., Gall, J., Leistner, C., & Van Gool, L. (2012). Interactive object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE
Yao, B., Khosla, A., & Fei-Fei, L. (2011). Combining randomization and discrimination for fine-grained image categorization. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Springs, Colorado, June 2011. IEEE
Zhang, Z., Dai, G., & Jordan, M. I. (2011). Bayesian generalized kernel mixed models. Journal of Machine Learning Research, 12, 111–139.
Zhao, L., Sukthankar, G., & Sukthankar, R. (2011). Incremental relabeling for active learning with noisy crowdsourced annotations. In: Proceedings of the 2011 IEEE Third International Conference on Privacy, Security, Risk and Trust and the 2011 IEEE Third International Conference on Social Computing (SocialCom). IEEE.
Zhu, X., Lafferty, J., & Ghahramani, Z. (2003). Combining active learning and semi-supervised learning using gaussian fields and harmonic functions. In: Proceedings of the ICML 2003 Workshop on the Continuum from Labeled to Unlabeled Data in Machine Learning and Data Mining (pp. 58–65).
Zhu, C., Byrd, R. H., Lu, P., & Nocedal, J. (1997). Algorithm 778: L-BFGS-B: Fortran subroutines for large-scale bound-constrained optimization. ACM Transactions on Mathematical Software (TOMS), 23, 550–560.
Zitnick, L., & Parikh, D. (2012). The role of image understanding in contour detection. In: Proceedings IEEE Conference on Computer Vision and Pattern Recognition (pp. 622–629). IEEE
Acknowledgments
Research reported in this publication was partly supported by the National Institute of Nursing Research of the National Institutes of Health under Award Number R01NR015371. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. This work is also partly supported by US National Science Foundation Grant IIS 1350763, China National Natural Science Foundation Grant 61228303, GH's start-up funds from Stevens Institute of Technology, a Google Research Faculty Award, a gift grant from Microsoft Research, and a gift grant from NEC Labs America.
Additional information
Communicated by Jakob Verbeek.
Appendices
Appendix 1: Derivation of the Normalization Factor \(Z_i\)
where
Consider that
and we can rewrite the equation as:
Appendix 2: Moment Matching in the Expectation Propagation Algorithm
To obtain an update for the approximation \({\tilde{F}}_i(s_i)\), we minimize \(KL[Q_{-i}(s_i)p(\mathbf{t}_i|s_i)||Q_{-i}(s_i){\tilde{F}}_i(s_i)]\) and recompute the parameters according to the normalization constant derived in Appendix 1, i.e.,
where
By moment matching (Minka 2001), we have
where \(\alpha = \frac{1}{\sqrt{v_{-i}^{\text {old}}}} \cdot \frac{(C_2 - C_1) \mathcal{N}(z_i;0,1)}{Z_i}\). Hence we can obtain a new \({\tilde{F}}_i(s_i)\) by recomputing its parameters \(A_i\), \(\tilde{m}_i\), and \(v_i\) as
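The moment-matching step can be made concrete for the standard probit likelihood \(\Phi (y s)\). Note this is a simplification: the paper's tilted distribution also involves the flip-noise constants \(C_1\) and \(C_2\) appearing in \(\alpha \) above, which are omitted here. A sketch of updating the cavity moments:

```python
import math

def norm_pdf(x):
    """Standard normal density N(x; 0, 1)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def norm_cdf(x):
    """Standard normal CDF Phi(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def ep_moment_match(y, mu_cav, var_cav):
    """One EP moment-matching step for a probit likelihood Phi(y*s).

    Given the cavity distribution N(s; mu_cav, var_cav), returns the
    normalizer Z and the mean/variance of the tilted distribution
    Z^{-1} Phi(y*s) N(s; mu_cav, var_cav), y in {-1, +1}.
    """
    denom = math.sqrt(1.0 + var_cav)
    z = y * mu_cav / denom
    Z = norm_cdf(z)
    ratio = norm_pdf(z) / Z
    # The observation pulls the mean toward the label and shrinks the variance.
    mu_new = mu_cav + y * var_cav * ratio / denom
    var_new = var_cav - (var_cav ** 2) * ratio * (z + ratio) / (1.0 + var_cav)
    return Z, mu_new, var_new
```

The new site parameters (the analogue of \(\tilde{m}_i\) and \(v_i\)) would then be recovered by dividing the tilted Gaussian by the cavity, exactly as in the update above.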
Appendix 3: Gradients of the Lower Bound
The gradients of the lower bound with respect to the parameters \(\varvec{\varepsilon }\) are as follows
where
The gradient with respect to a kernel parameter \(\theta \in \varvec{\vartheta }\) is
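In practice, analytic gradients such as these are handed to a bound-constrained optimizer like L-BFGS-B (Zhu et al. 1997), and it is prudent to first verify them against finite differences. A generic sketch for a scalar parameter (hypothetical helper names, not the paper's code):

```python
def numerical_grad(f, theta, eps=1e-6):
    """Central finite-difference approximation of df/dtheta, used to
    sanity-check an analytic gradient of the lower bound."""
    return (f(theta + eps) - f(theta - eps)) / (2.0 * eps)

def gradient_ascent(f, grad, theta0, lr=0.1, steps=200):
    """Maximize f over a scalar parameter by plain gradient ascent.
    A real system would instead call a quasi-Newton method such as
    L-BFGS-B with the analytic gradient."""
    theta = theta0
    for _ in range(steps):
        theta += lr * grad(theta)
    return theta
```

For instance, maximizing the toy bound \(f(\theta ) = -(\theta - 2)^2\) with its analytic gradient \(-2(\theta - 2)\) converges to \(\theta = 2\), and the finite-difference check agrees with the analytic gradient at any test point.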
Cite this article
Long, C., Hua, G. & Kapoor, A. A Joint Gaussian Process Model for Active Visual Recognition with Expertise Estimation in Crowdsourcing. Int J Comput Vis 116, 136–160 (2016). https://doi.org/10.1007/s11263-015-0834-9