Fusion of Multiple Cue Detectors for Automatic Sports Video Annotation

  • Josef Kittler
  • Marco Ballette
  • W. J. Christmas
  • Edward Jaser
  • Kieron Messer
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2396)

Abstract

This paper describes one aspect of ASSAVID, a system under development that will provide automatic semantic annotation of sports video. The annotation process segments sports video into semantic categories (e.g. type of sport) and lets the user formulate queries to retrieve events that are significant to a particular sport (e.g. goal, foul). The system relies on the concept of “cues”, which attach semantic meaning to low-level features computed on the video. In this paper we adopt the multiple classifier system approach, fusing the outputs of multiple cue detectors using Behaviour Knowledge Space fusion. Using this technique, previously unseen sports video can be classified according to the type of sport being played. Experimental results on sports video provided by the BBC demonstrate that this method works well.
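As a rough illustration of the fusion step named in the abstract, the sketch below implements Behaviour Knowledge Space fusion in its generic form (a lookup table indexed by the joint decisions of the individual detectors, as introduced by Huang and Suen). It is not the ASSAVID implementation: the cue names, the assumption that each cue detector output has been reduced to a binary decision, the sport labels, and the function names `train_bks` and `classify_bks` are all hypothetical and chosen only for this example.

```python
from collections import Counter, defaultdict


def train_bks(decisions, labels):
    """Build the BKS table.

    Each cell is indexed by the tuple of detector decisions and stores
    how often each true label occurred with that tuple in training.
    """
    table = defaultdict(Counter)
    for decision, label in zip(decisions, labels):
        table[tuple(decision)][label] += 1
    return table


def classify_bks(table, decision, reject=None):
    """Return the most frequent training label for this decision tuple.

    If the tuple was never seen in training, the cell is empty and the
    sample is rejected (returns `reject`).
    """
    cell = table.get(tuple(decision))
    if not cell:
        return reject
    return cell.most_common(1)[0][0]


# Hypothetical binary cue decisions per shot: (grass, net, water).
train_decisions = [
    (1, 1, 0), (1, 1, 0), (1, 1, 0),
    (0, 0, 1), (0, 0, 1),
    (1, 0, 0),
]
# Hypothetical ground-truth sport labels for the same shots.
train_labels = ["tennis", "tennis", "rugby", "swimming", "swimming", "rugby"]

bks_table = train_bks(train_decisions, train_labels)

print(classify_bks(bks_table, (1, 1, 0)))  # majority label of the (1, 1, 0) cell
print(classify_bks(bks_table, (0, 1, 1)))  # unseen combination, so rejected
```

Because the table has one cell per combination of detector decisions, its size grows exponentially with the number of detectors, so in practice a fallback for empty or sparsely populated cells is usually needed; the simple rejection used here is only one possible choice.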

Keywords

Semantic Annotation, Sport Video, Video Annotation, Colour Pair, British Broadcasting Corporation
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2002

Authors and Affiliations

  • Josef Kittler (1)
  • Marco Ballette (1)
  • W. J. Christmas (1)
  • Edward Jaser (1)
  • Kieron Messer (1)

  1. Centre for Vision, Speech and Signal Processing, University of Surrey, Surrey, UK
