SmartTennisTV: Automatic Indexing of Tennis Videos

Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 841)


In this paper, we demonstrate a score-based indexing approach for tennis videos. Given a broadcast tennis video (BTV), we index all the video segments with their scores to create a navigable and searchable match. Our approach temporally segments the rallies in the video, recognizes the score in each segment, and then refines the recognized scores using knowledge of the tennis scoring system. Finally, we build an interface to effortlessly retrieve and view the relevant video segments, automatically tagging the segmented rallies with human-accessible tags such as ‘fault’ and ‘deuce’. The efficiency of our approach is demonstrated on BTVs from two major tennis tournaments.
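To illustrate how knowledge of the tennis scoring system can constrain recognized scores, here is a minimal Python sketch. It is not the authors' implementation: the function names, the `(server, receiver)` tuple representation, and the rule that a repeated score is acceptable (for example, a serve replayed after a fault) are assumptions made for this example. It covers only points within a standard game, not tie-breaks or set scores.

```python
# Hypothetical sketch: consistency checks on per-rally point scores.
# Scores are (server, receiver) tuples of strings; "AD" marks advantage.
POINTS = ["0", "15", "30", "40"]


def next_scores(server, receiver):
    """Return the point scores reachable after exactly one point is played.

    ("0", "0") stands for the start of a new game after a game-winning point.
    """
    if server == "40" and receiver == "40":
        # Deuce: one player moves to advantage.
        return {("AD", "40"), ("40", "AD")}
    if "AD" in (server, receiver):
        # Advantage: either the game ends or play returns to deuce.
        return {("0", "0"), ("40", "40")}
    outcomes = set()
    # Server wins the point.
    if server == "40":
        outcomes.add(("0", "0"))
    else:
        outcomes.add((POINTS[POINTS.index(server) + 1], receiver))
    # Receiver wins the point.
    if receiver == "40":
        outcomes.add(("0", "0"))
    else:
        outcomes.add((server, POINTS[POINTS.index(receiver) + 1]))
    return outcomes


def flag_inconsistent(scores):
    """Return indices whose score cannot follow the previous segment's score.

    Assumption: an unchanged score is allowed, since a segment may repeat
    the score when no point was awarded (e.g. after a fault).
    """
    flagged = []
    for i in range(1, len(scores)):
        prev, cur = scores[i - 1], scores[i]
        if cur != prev and cur not in next_scores(*prev):
            flagged.append(i)
    return flagged


def is_deuce(score):
    """Tag helper: a 40-40 score is a deuce."""
    return score == ("40", "40")


recognized = [("0", "0"), ("15", "0"), ("15", "0"), ("40", "0"), ("30", "0")]
print(flag_inconsistent(recognized))
```

In this sketch, flagged segments are only marked as inconsistent; a full refinement step would also have to decide which reading to correct, which is beyond what the abstract specifies.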


Tennis Video; Score Refinement; Recurrent Convolutional Neural Network; Scene Text Detection Method; Tennis Match
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer Nature Singapore Pte Ltd. 2018

Authors and Affiliations

  1. CVIT, KCIS, IIIT Hyderabad, India
