A Novel Framework for Robust Annotation and Retrieval in Video Sequences

  • Arasanathan Anjulan
  • Nishan Canagarajah
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4071)


This paper describes a method for automatic video annotation and scene retrieval based on local region descriptors. A novel framework is proposed that combines video segmentation, content extraction and retrieval. A similarity measure based on local region features, previously proposed by the authors, is used for video segmentation. The local regions are tracked throughout a shot and stable features are extracted. These stable local features replace the conventional key frame method for characterising different shots. Compared to previous video annotation approaches, the proposed method is highly robust to camera and object motion and can withstand severe illumination changes and spatial editing. We apply the proposed framework to shot cut detection and scene retrieval and demonstrate superior performance compared to existing methods. Furthermore, because segmentation and content extraction are performed in the same step, the overall computational complexity of the system is considerably reduced.
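
The abstract does not give the similarity measure or the tracking procedure, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' method. It assumes each frame is already represented by a 2-D NumPy array of local region descriptors (one row per region), scores two frames by the fraction of descriptors that find a close nearest neighbour, declares a shot cut when that score drops below a threshold, and keeps as "stable" the descriptors of a shot's first frame that reappear in most of its frames. The function names (`match_fraction`, `detect_cuts`, `stable_features`) and the parameters `max_dist`, `sim_threshold` and `min_presence` are hypothetical choices made for illustration.

```python
import numpy as np


def _nearest_distances(desc_a, desc_b):
    """For each row of desc_a, the Euclidean distance to its nearest row in desc_b."""
    diffs = desc_a[:, None, :] - desc_b[None, :, :]
    return np.linalg.norm(diffs, axis=2).min(axis=1)


def match_fraction(desc_a, desc_b, max_dist=0.5):
    """Fraction of descriptors in desc_a with a neighbour in desc_b within max_dist.

    Assumed stand-in for a local-region similarity measure; both inputs are
    2-D arrays of shape (n_regions, descriptor_length).
    """
    if len(desc_a) == 0 or len(desc_b) == 0:
        return 0.0
    return float(np.mean(_nearest_distances(desc_a, desc_b) <= max_dist))


def detect_cuts(frames, sim_threshold=0.3, max_dist=0.5):
    """Return every index i where a cut is declared between frame i-1 and frame i."""
    cuts = []
    for i in range(1, len(frames)):
        if match_fraction(frames[i - 1], frames[i], max_dist) < sim_threshold:
            cuts.append(i)
    return cuts


def stable_features(shot_frames, max_dist=0.5, min_presence=0.8):
    """Keep first-frame descriptors that are matched in at least min_presence of the shot.

    Simplification (assumption): regions are matched against the first frame
    directly instead of being chained from frame to frame.
    """
    ref = shot_frames[0]
    hits = np.zeros(len(ref))
    for frame in shot_frames:
        if len(frame) == 0:
            continue  # no regions detected in this frame, so nothing can match
        hits += _nearest_distances(ref, frame) <= max_dist
    keep = hits / len(shot_frames) >= min_presence
    return ref[keep]
```

In such a sketch, the same descriptor matching serves both purposes: `detect_cuts` splits the sequence into shots, and `stable_features` can then be run on each shot to obtain a compact description in place of a key frame.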


Keywords (added by machine, not by the authors): Video Sequence, Video Segmentation, Video Annotation, Maximally Stable Extremal Region, Region Descriptor





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Arasanathan Anjulan (1)
  • Nishan Canagarajah (1)
  1. Department of Electrical and Electronic Engineering, University of Bristol, Bristol, UK
