Abstract
Deep convolutional neural networks are currently applied to computer vision tasks, especially object detection. Because the output space is high-dimensional, with four dimensions per object bounding box, classification techniques do not apply easily. We propose to adapt a structured loss function for neural network training that directly maximizes the overlap of the prediction with ground truth bounding boxes. We show how this structured loss can be implemented efficiently, and demonstrate bounding box prediction on two of the Pascal VOC 2007 classes.
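To make the notion of "overlap of the prediction with ground truth bounding boxes" concrete, the following is a minimal sketch of one common overlap measure, intersection over union (IoU), which is also the criterion used in Pascal VOC detection evaluation. The abstract does not state which overlap measure the structured loss uses, so treating it as IoU is an assumption here. The function names and the (x_min, y_min, x_max, y_max) box convention are also illustrative choices, not taken from the paper.

```python
from typing import Tuple

# Assumed convention (not from the paper): (x_min, y_min, x_max, y_max).
# These four numbers are the "four dimensions per bounding box".
Box = Tuple[float, float, float, float]


def box_area(box: Box) -> float:
    """Area of an axis-aligned box; inverted or empty boxes count as 0."""
    x_min, y_min, x_max, y_max = box
    return max(0.0, x_max - x_min) * max(0.0, y_max - y_min)


def iou(pred: Box, truth: Box) -> float:
    """Intersection over union of two axis-aligned boxes, in [0, 1]."""
    # The intersection of two axis-aligned boxes is itself a box
    # (possibly empty, in which case box_area returns 0).
    inter: Box = (
        max(pred[0], truth[0]),
        max(pred[1], truth[1]),
        min(pred[2], truth[2]),
        min(pred[3], truth[3]),
    )
    intersection = box_area(inter)
    union = box_area(pred) + box_area(truth) - intersection
    # Guard against division by zero when both boxes are degenerate.
    return intersection / union if union > 0.0 else 0.0
```

Under this sketch, identical boxes score 1 and disjoint boxes score 0. A loss that rewards high overlap would push predictions toward the former, although how the paper integrates the overlap into training is not described in the abstract.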
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Schulz, H., Behnke, S. (2014). Structured Prediction for Object Detection in Deep Neural Networks. In: Wermter, S., et al. Artificial Neural Networks and Machine Learning – ICANN 2014. ICANN 2014. Lecture Notes in Computer Science, vol 8681. Springer, Cham. https://doi.org/10.1007/978-3-319-11179-7_50
DOI: https://doi.org/10.1007/978-3-319-11179-7_50
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11178-0
Online ISBN: 978-3-319-11179-7
eBook Packages: Computer Science, Computer Science (R0)