Multimedia Tools and Applications, Volume 76, Issue 5, pp 6521–6549

SurvSurf: human retrieval on large surveillance video data

  • Sihao Ding
  • Gang Li
  • Ying Li
  • Xinfeng Li
  • Qiang Zhai
  • Adam C. Champion
  • Junda Zhu
  • Dong Xuan
  • Yuan F. Zheng


Surveillance video is growing rapidly in volume, and humans are the main objects of interest in it. Rapid human retrieval in surveillance videos is therefore desirable and applicable to a broad spectrum of applications. Existing big data processing tools, which mainly target textual data, cannot be applied directly to timely processing of large video data because of three main challenges: videos are more data-intensive than textual data; visual operations have higher computational complexity than textual operations; and traditional segmentation may damage the continuous semantics of video data. In this paper, we design SurvSurf, a human retrieval system for large surveillance video data that exploits the characteristics of these data and of big data processing tools. We propose using the motion information contained in videos to segment video data. The basic data unit after segmentation is called an M-clip. M-clips help remove redundant video content and reduce data volume. We use the MapReduce framework to process M-clips in parallel for human detection and appearance/motion feature extraction. We further accelerate vision algorithms by processing only sub-areas with significant motion vectors rather than entire frames. In addition, we design a distributed data store called V-BigTable to structure the semantic information of M-clips. V-BigTable enables efficient retrieval over a huge number of M-clips. We implement the system on Hadoop and HBase. Experimental results show that our system outperforms basic solutions by one order of magnitude in computational time while achieving satisfactory human retrieval accuracy.
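The abstract does not spell out how M-clips are formed or how motion sub-areas are selected, so the following Python sketch is only an illustration of the general idea, not the authors' algorithm. It assumes a per-frame motion score and a per-block grid of motion-vector magnitudes are already available; the function names, the thresholds, and the `max_gap` and `min_len` parameters are all hypothetical.

```python
from typing import List, Optional, Tuple


def segment_m_clips(
    motion: List[float],
    threshold: float,
    min_len: int = 1,
    max_gap: int = 0,
) -> List[Tuple[int, int]]:
    """Group consecutive frames with motion >= threshold into clips.

    Returns (start, end) frame ranges with `end` exclusive. Up to `max_gap`
    low-motion frames are tolerated inside a clip (hypothetical parameter).
    Clips shorter than `min_len` frames are dropped.
    """
    clips: List[Tuple[int, int]] = []
    start: Optional[int] = None
    gap = 0  # number of trailing low-motion frames in the open clip
    for i, m in enumerate(motion):
        if m >= threshold:
            if start is None:
                start = i
            gap = 0
        elif start is not None:
            gap += 1
            if gap > max_gap:
                end = i - gap + 1  # one past the last high-motion frame
                if end - start >= min_len:
                    clips.append((start, end))
                start, gap = None, 0
    if start is not None:
        end = len(motion) - gap
        if end - start >= min_len:
            clips.append((start, end))
    return clips


def motion_bounding_box(
    magnitudes: List[List[float]],
    threshold: float,
) -> Optional[Tuple[int, int, int, int]]:
    """Bounding box (row0, col0, row1, col1), end-exclusive, of all blocks
    whose motion-vector magnitude is >= threshold, or None if there are none.
    A detector could then be run on this sub-area instead of the full frame.
    """
    rows = [r for r, row in enumerate(magnitudes)
            if any(v >= threshold for v in row)]
    cols = [c for row in magnitudes
            for c, v in enumerate(row) if v >= threshold]
    if not rows:
        return None
    return (min(rows), min(cols), max(rows) + 1, max(cols) + 1)


if __name__ == "__main__":
    scores = [0, 0, 5, 6, 0, 7, 0, 0, 0, 4, 0]
    print(segment_m_clips(scores, threshold=1, min_len=2, max_gap=1))
    grid = [[0, 0, 0, 0], [0, 3, 4, 0], [0, 0, 5, 0]]
    print(motion_bounding_box(grid, threshold=1))
```

In a MapReduce setting, each resulting frame range could serve as one independent input record, which is one plausible reason motion-based segmentation fits parallel processing better than fixed-size splitting that may cut through an event.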


Keywords: Video analysis, Surveillance, MapReduce



Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  • Sihao Ding (1)
  • Gang Li (2)
  • Ying Li (1)
  • Xinfeng Li (2)
  • Qiang Zhai (2)
  • Adam C. Champion (2)
  • Junda Zhu (3)
  • Dong Xuan (2)
  • Yuan F. Zheng (1)

  1. Department of Electrical and Computer Engineering, The Ohio State University, Columbus, USA
  2. Department of Computer Science and Engineering, The Ohio State University, Columbus, USA
  3. Department of Electrical and Computer Engineering, University of Macau, Macau, China
