Automatic analysis of broadcast football videos using contextual priors
Abstract
Broadcast sports videos, such as football, follow standard video editing practices, which gives them stronger contextual priors than most generic videos. In this paper, we show that this information can be harnessed for automatic analysis of sports videos. Specifically, given an input video, we output per-frame information about camera angles and events (goal, foul, etc.). Our main insight is that, given temporal context (camera angles) for a video, the problem of event tagging (fouls, corners, goals, etc.) can be cast as a per-frame multi-class classification problem. We show that even with simple classifiers such as a linear SVM, we obtain a significant improvement in the event tagging task when contextual information is included. We present extensive results for 10 matches from the recently concluded Football World Cup to demonstrate the effectiveness of our approach.
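As a rough illustration of the idea of adding camera-angle context to per-frame features, the following minimal sketch appends one-hot camera-angle labels from a small temporal window to each frame's feature vector. The camera-angle vocabulary, the function names, and the window size are assumptions made for illustration; the abstract does not specify the label set or how the context is actually encoded.

```python
from typing import List, Sequence

# Hypothetical camera-angle vocabulary (an assumption; the abstract
# does not list the camera-angle classes used in the paper).
CAMERA_ANGLES = ["wide", "medium", "close_up", "replay"]


def one_hot(label: str, vocabulary: Sequence[str]) -> List[float]:
    """Encode a label as a one-hot vector over the given vocabulary."""
    return [1.0 if label == entry else 0.0 for entry in vocabulary]


def add_camera_context(
    frame_features: Sequence[Sequence[float]],
    camera_labels: Sequence[str],
    window: int = 1,
) -> List[List[float]]:
    """Append camera-angle context to each frame's feature vector.

    For frame t, the one-hot camera labels of frames t-window .. t+window
    are concatenated to the frame's own features. Indices outside the
    video are clamped to the first or last frame.
    """
    if len(frame_features) != len(camera_labels):
        raise ValueError("need exactly one camera label per frame")

    n_frames = len(frame_features)
    augmented = []
    for t in range(n_frames):
        row = list(frame_features[t])
        for offset in range(-window, window + 1):
            idx = min(max(t + offset, 0), n_frames - 1)  # clamp at the ends
            row.extend(one_hot(camera_labels[idx], CAMERA_ANGLES))
        augmented.append(row)
    return augmented
```

Each augmented row could then be passed, together with a per-frame event label, to any multi-class classifier, for example a linear SVM. Whether this matches the feature design used in the paper is not stated in the abstract.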
Keywords
Sports video analysis, Broadcast video, Event classification, Content-based retrieval