Signal, Image and Video Processing, Volume 10, Issue 2, pp 319–326

Use of trajectory and spatiotemporal features for retrieval of videos with a prominent moving foreground object

Original Paper


This paper presents the generalized spatiotemporal analysis and lookup tool (GESTALT), an unsupervised framework for content-based video retrieval. Given a query video, GESTALT retrieves “similar” videos from a database. The motion of a prominent moving foreground object and the dynamics of its appearance (shape) are treated as the key components of the video content, and each is captured by a corresponding feature descriptor. GESTALT automatically segments the moving foreground object from the query video shot and estimates its motion trajectory. A graph-based framework explicitly captures the structural and kinematic properties of the motion trajectory, while an improved version of an existing spatiotemporal feature descriptor models the change in object shape and movement over time. The match scores obtained from the two feature descriptors are merged by convex combination into a single match cost, which is used to rank-order the retrieved video shots. Extensive experiments demonstrate the effectiveness of GESTALT, and a comparative study indicates that it outperforms recent techniques.
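The ranking step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the weight, the score ranges, or the sort direction, so the names `combined_cost`, `rank_shots` and `alpha`, the default weight of 0.5, and the convention that a lower cost means a closer match are all assumptions made for illustration, not the authors' implementation.

```python
from typing import Dict, List, Tuple


def combined_cost(trajectory_cost: float,
                  spatiotemporal_cost: float,
                  alpha: float = 0.5) -> float:
    """Convex combination of two match costs.

    alpha is a hypothetical weight in [0, 1]; the abstract does not
    state the value used in the paper.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1] for a convex combination")
    return alpha * trajectory_cost + (1.0 - alpha) * spatiotemporal_cost


def rank_shots(costs: Dict[str, Tuple[float, float]],
               alpha: float = 0.5) -> List[str]:
    """Return shot identifiers ordered by ascending combined cost.

    costs maps a shot identifier to (trajectory_cost, spatiotemporal_cost).
    Assumption: a lower cost means the shot is more similar to the query.
    """
    return sorted(costs, key=lambda shot: combined_cost(*costs[shot], alpha))


if __name__ == "__main__":
    # Illustrative numbers only; they are not results from the paper.
    example = {
        "shot_a": (0.2, 0.8),
        "shot_b": (0.1, 0.3),
        "shot_c": (0.9, 0.2),
    }
    print(rank_shots(example, alpha=0.5))
```

Setting `alpha` to 1 ranks by the trajectory descriptor alone and setting it to 0 ranks by the spatiotemporal descriptor alone, which is one way to see how much each descriptor contributes to the final ordering.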


Keywords: CBVR, Spatiotemporal, Time series, Trajectory, Hyperstring, Tracking



Copyright information

© Springer-Verlag London 2015

Authors and Affiliations

  1. Indian Institute of Technology Madras, Chennai, India
