Deep Learning Applications, pp. 113-135

# Deep Active Learning for Image Regression

## Abstract

Image regression is an important problem in computer vision and is useful in a variety of applications. However, training a robust regression model requires large amounts of labeled training data, which are time-consuming and expensive to acquire. Active learning algorithms automatically identify the most informative and representative instances in a large pool of unlabeled data, and so substantially reduce the human annotation effort needed to induce a machine learning model. Further, deep learning models such as Convolutional Neural Networks (CNNs) have gained popularity because they automatically learn representative features from a given dataset, and they have shown promising performance in a variety of classification and regression applications. In this chapter, we exploit the feature learning capabilities of deep neural networks and propose a novel framework to address the problem of active learning for regression. We formulate a loss function (based on the expected model output change) relevant to the research task and use gradient descent to minimize this loss and train the deep CNN. To the best of our knowledge, this is the first research effort to learn a discriminative set of features using deep neural networks to actively select informative samples in the regression setting. Our extensive empirical studies on five benchmark regression datasets (from three application domains: rotation angle estimation of handwritten digits, age estimation, and head pose estimation) demonstrate the merit of our framework in substantially reducing the human annotation effort needed to induce a robust regression model.
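To make the selection idea concrete, the following is a minimal sketch of pool-based active learning for regression with a score in the spirit of expected model output change. It is not the chapter's framework: it assumes a plain linear model in place of the deep CNN, guesses the unknown label of each candidate with a small bootstrap committee, and approximates the model update by a single gradient step. All function names and parameters (`fit_gd`, `emoc_scores`, `select_batch`, `n_committee`, the learning rate) are hypothetical choices made for illustration.

```python
import numpy as np

def fit_gd(X, y, lr=0.1, steps=500):
    """Fit linear weights by gradient descent on the mean squared error."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def emoc_scores(w, committee, X_pool, lr=0.1):
    """Approximate the expected model output change of each pool sample.

    For every candidate x, each committee member supplies a guessed label.
    One gradient step is taken on (x, guess), and the mean absolute change
    in predictions over the pool is averaged across the guesses.
    """
    scores = np.zeros(len(X_pool))
    for i, x in enumerate(X_pool):
        changes = []
        for w_k in committee:
            y_guess = x @ w_k                       # assumed label (illustrative)
            w_new = w - lr * 2.0 * (x @ w - y_guess) * x
            changes.append(np.mean(np.abs(X_pool @ (w_new - w))))
        scores[i] = np.mean(changes)
    return scores

def select_batch(X_lab, y_lab, X_pool, batch_size=5, n_committee=4, seed=0):
    """Return indices of the highest-scoring pool samples and all scores."""
    rng = np.random.default_rng(seed)
    w = fit_gd(X_lab, y_lab)
    committee = []
    for _ in range(n_committee):
        # Bootstrap resample of the labeled set for each committee member.
        idx = rng.integers(0, len(y_lab), size=len(y_lab))
        committee.append(fit_gd(X_lab[idx], y_lab[idx]))
    scores = emoc_scores(w, committee, X_pool)
    picked = np.argsort(scores)[::-1][:batch_size]
    return picked, scores

if __name__ == "__main__":
    # Synthetic data standing in for image features; purely illustrative.
    rng = np.random.default_rng(1)
    X_lab = rng.normal(size=(20, 3))
    y_lab = X_lab @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=20)
    X_pool = rng.normal(size=(50, 3))
    picked, _ = select_batch(X_lab, y_lab, X_pool)
    print("Pool indices to send for annotation:", picked)
```

In a full active learning loop, the selected samples would be labeled by an annotator, moved from the pool to the labeled set, and the model retrained before the next round of selection.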
