
Towards Realistic Predictors

  • Pei Wang
  • Nuno Vasconcelos
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11217)

Abstract

A new class of predictors, denoted realistic predictors, is defined. Like humans, these predictors assess the difficulty of examples, decline to operate on those deemed too hard, and guarantee good performance on the examples they do accept. This paper considers a particular instance, realistic classifiers. The central problem in realistic classification, the design of an inductive predictor of hardness scores, is studied. It is argued that this predictor should be independent of the classifier itself, yet tuned to it, and learned without explicit supervision, so that it learns from the classifier's mistakes. A new architecture is proposed to accomplish these goals by complementing the classifier with an auxiliary hardness prediction network (HP-Net). The HP-Net shares the classifier's inputs and outputs hardness scores that are fed to the classifier as loss weights; in turn, the classifier's outputs are fed to the HP-Net through a newly defined loss, a variant of the cross-entropy loss. The two networks are trained jointly in an adversarial manner: as the classifier learns to improve its predictions, the HP-Net refines its hardness scores. Given the learned hardness predictor, a simple realistic classifier is implemented by rejecting examples with large hardness scores. Experimental results not only support the effectiveness of the proposed architecture and of the learned hardness predictor, but also show that the realistic classifier consistently improves performance on the examples it accepts, performing better on them than an equivalent non-realistic classifier. This allows realistic classifiers to guarantee good performance on the examples they classify.
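
The joint training described above can be sketched as follows. This is a minimal PyTorch-style illustration, assuming the hardness score down-weights the classifier's per-example cross-entropy and the HP-Net is trained with a binary-cross-entropy-style loss driven by the classifier's confidence in the true class; the module names, loss forms, and alternating update schedule are illustrative assumptions, not the paper's exact formulation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class HPNet(nn.Module):
    """Auxiliary hardness predictor: shares the classifier's input and
    outputs a per-example hardness score s in (0, 1). (Hypothetical
    architecture; the paper's HP-Net may differ.)"""
    def __init__(self, in_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 1))

    def forward(self, x):
        return torch.sigmoid(self.net(x)).squeeze(-1)

def joint_step(classifier, hp_net, opt_cls, opt_hp, x, y):
    # Classifier update: hardness scores act as (detached) loss weights,
    # down-weighting examples the HP-Net currently rates as hard.
    # (The exact weighting scheme is an assumption.)
    s = hp_net(x)
    ce = F.cross_entropy(classifier(x), y, reduction="none")
    cls_loss = ((1.0 - s.detach()) * ce).mean()
    opt_cls.zero_grad()
    cls_loss.backward()
    opt_cls.step()

    # HP-Net update: the classifier's output is fed to the HP-Net through
    # a cross-entropy-style loss, so the score tracks how wrong the
    # (here frozen) classifier currently is on each example.
    with torch.no_grad():
        p_true = F.softmax(classifier(x), dim=1).gather(1, y.unsqueeze(1)).squeeze(1)
    hp_loss = F.binary_cross_entropy(hp_net(x), 1.0 - p_true)
    opt_hp.zero_grad()
    hp_loss.backward()
    opt_hp.step()
    return cls_loss.item(), hp_loss.item()

def realistic_predict(classifier, hp_net, x, threshold=0.5):
    """Realistic classification: reject examples whose predicted hardness
    exceeds a threshold, classify the rest."""
    with torch.no_grad():
        accept = hp_net(x) < threshold
        preds = classifier(x).argmax(dim=1)
    return preds, accept

At test time, a realistic classifier reports predictions only for the examples flagged as accepted, so its accuracy is measured on the subset it agrees to handle.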

Keywords

Hardness score prediction · Realistic predictors


Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Statistical and Visual Computing Lab, UC San Diego, San Diego, USA
