High-level representation sketch for video event retrieval

Abstract

Representing video events is an essential step for a wide range of visual applications. In this paper, we propose the event sketch, a high-level event representation that depicts the dynamic properties of video events composed of actions of semantic objects. We show that this representation can facilitate a novel sketch-based video retrieval (SBVR) system, which, to the best of our knowledge, has not been considered before. In this system, users draw the evolution of an event (e.g. the spatiotemporal layouts and behaviors of semantic objects) on a drawing board and retrieve from a database the events whose semantic objects evolve similarly. To do this, event sketches are constructed for both the user queries and the database videos, and compared under a novel graph-matching scheme based on data-driven Markov chain Monte Carlo (DDMCMC). To test our approach, we collect a novel dataset of goal events in real soccer videos, which consists of actions of multiple players and shows large variability in how the events evolve. Experiments on this dataset and on the publicly available CAVIAR dataset demonstrate the effectiveness of the proposed approach.
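To make the graph-matching step more concrete, the following is a minimal sketch of matching two small layouts of objects by Markov chain Monte Carlo sampling. It is not the scheme proposed in the paper: the function names `matching_energy` and `mcmc_match`, the energy definition (position mismatch plus pairwise-distance mismatch) and the temperature are all assumptions made for illustration, and a plain Metropolis sampler with symmetric swap proposals stands in for the data-driven proposals of DDMCMC. Both layouts are assumed to contain the same number of objects (at least two).

```python
import math
import random


def matching_energy(query, target, assignment):
    """Hypothetical matching cost (lower is better).

    query, target: lists of (x, y) object positions of equal length.
    assignment[i] is the index of the target object matched to query object i.
    """
    n = len(query)
    energy = 0.0
    # Unary term: matched objects should lie at similar positions.
    for i in range(n):
        energy += math.dist(query[i], target[assignment[i]])
    # Pairwise term: distances between objects should be preserved.
    for i in range(n):
        for j in range(i + 1, n):
            d_query = math.dist(query[i], query[j])
            d_target = math.dist(target[assignment[i]], target[assignment[j]])
            energy += abs(d_query - d_target)
    return energy


def mcmc_match(query, target, steps=5000, temperature=1.0, seed=0):
    """Sample one-to-one assignments with Metropolis moves; return the best seen."""
    rng = random.Random(seed)
    n = len(query)
    current = list(range(n))
    rng.shuffle(current)
    current_energy = matching_energy(query, target, current)
    best, best_energy = current[:], current_energy
    for _ in range(steps):
        # Symmetric proposal: swap the targets of two query objects.
        i, j = rng.sample(range(n), 2)
        proposal = current[:]
        proposal[i], proposal[j] = proposal[j], proposal[i]
        proposal_energy = matching_energy(query, target, proposal)
        # Metropolis rule: always accept improvements, sometimes accept worse moves.
        delta = proposal_energy - current_energy
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_energy = proposal, proposal_energy
            if current_energy < best_energy:
                best, best_energy = current[:], current_energy
    return best, best_energy


if __name__ == "__main__":
    sketch = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 3.0)]
    # The same layout with the objects listed in a different order.
    video = [sketch[2], sketch[0], sketch[3], sketch[1]]
    print(mcmc_match(sketch, video))
```

In a data-driven variant, the proposal step would be biased toward swaps that the observed features make plausible, with the acceptance ratio corrected for the asymmetric proposal; the uniform swap above is only the simplest valid choice.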



Author information

Correspondence to Xiaowu Chen.


About this article


Cite this article

Zhang, Y., Chen, X., Lin, L. et al. High-level representation sketch for video event retrieval. Sci. China Inf. Sci. 59, 072103 (2016). https://doi.org/10.1007/s11432-015-5494-4


Keywords

  • video retrieval
  • event representation
  • event sketch
  • drawing board
  • distance function
  • relevance feedback
  • DDMCMC