Multiple Object Detection in \(360^{\circ }\) Videos for Robust Tracking

  • V. Vineeth Kumar (Email author)
  • Shanthika Naik
  • Polisetty L. Sarvani
  • Shreya M. Pattanshetti
  • Uma Mudenagudi
  • Meena Maralappanavar
  • Priyadarshini Patil
  • Ramesh A. Tabib
  • Basavaraja S. Vandrotti
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11942)


In this paper, we propose an efficient way to detect objects in 360\(^{\circ }\) videos in order to improve tracking performance on such videos. Although 2D video processing has been studied extensively, 360\(^{\circ }\) video processing remains largely unexplored because it poses several difficulties: (1) the unavailability of annotated datasets, (2) severe geometric distortions near the poles of the panoramic image, and (3) the high resolution of the media, which demands substantial computing resources. State-of-the-art detection algorithms use convolutional neural networks (CNNs) trained on large datasets; Faster R-CNN, SSD, YOLO, YOLO9000, and YOLOv3 are examples. Among these, YOLOv3 may not be the most accurate, but it is the fastest, and we consider this trade-off between speed and accuracy acceptable. We improve upon this algorithm to make it suitable for 360\(^{\circ }\) data. We propose YOLO360, a CNN that detects objects in 360\(^{\circ }\) videos and thereby increases tracking precision and accuracy. This is achieved by performing transfer learning on YOLOv3 with a manually annotated dataset.
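The geometric distortion mentioned in difficulty (2) can be illustrated with a small calculation. The sketch below is not taken from the paper. It assumes a standard equirectangular frame in which pixel rows map linearly to latitude, and it uses hypothetical helper names (`row_to_latitude`, `horizontal_stretch`) and a hypothetical frame height. Under that assumption, content at latitude \(\phi\) is stretched horizontally by a factor of \(1/\cos \phi\), which grows without bound toward the poles.

```python
import math


def row_to_latitude(row, height):
    """Latitude (radians) at the centre of pixel row `row`; row 0 is the top of the frame.

    Assumes rows map linearly from +pi/2 (top) to -pi/2 (bottom).
    """
    return math.pi / 2 - (row + 0.5) * math.pi / height


def horizontal_stretch(row, height):
    """Horizontal stretch of an equirectangular frame at this row.

    The factor is 1/cos(latitude): about 1 at the equator, very large near the poles.
    """
    return 1.0 / math.cos(row_to_latitude(row, height))


if __name__ == "__main__":
    height = 1000  # hypothetical frame height in pixels
    for row in (500, 250, 50, 0):
        lat_deg = math.degrees(row_to_latitude(row, height))
        print(f"row {row:4d}  latitude {lat_deg:7.2f} deg  "
              f"stretch x{horizontal_stretch(row, height):.2f}")
```

Rows near the middle of the frame are almost undistorted, while an object near the top or bottom edge appears many times wider than it would in a conventional 2D frame. This is one plausible reason why a detector trained only on 2D images may transfer poorly without retraining.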


Computer vision · Equirectangular frames · 360\(^{\circ }\) images · Detection · Tracking · Transfer learning



This work was supported and mentored by Samsung R&D Institute Bangalore, India.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • V. Vineeth Kumar (1), Email author
  • Shanthika Naik (1)
  • Polisetty L. Sarvani (1)
  • Shreya M. Pattanshetti (1)
  • Uma Mudenagudi (1)
  • Meena Maralappanavar (1)
  • Priyadarshini Patil (1)
  • Ramesh A. Tabib (1)
  • Basavaraja S. Vandrotti (2)
  1. KLE Technological University, Hubballi, India
  2. Samsung R&D Institute, Bangalore, India
