Multimedia Tools and Applications, Volume 78, Issue 5, pp 5233–5254

Multimedia file forensics system exploiting file similarity search

  • Min-Ja Kim
  • Chuck Yoo
  • Young-Woong Ko


With the rapid growth of multimedia content, efficient forensic investigation methods for multimedia files are required. For multimedia files, similarity means that identical media (audio and video) data exist across multiple files. This paper proposes an efficient multimedia file forensics system based on file similarity search of video content. The proposed system relies on two key techniques. The first is a media-aware information detection technique: the first critical step in the similarity search is to find meaningful keyframes or key sequences in the shots of a multimedia file, so that altered files derived from the same source file can be recognized. The second is a video fingerprint-based (VFB) technique for file similarity search. Byte-by-byte comparison is an inefficient similarity search method for large files such as multimedia files. The VFB technique extracts video features from large multimedia files efficiently. It also provides an independent media-aware identification method for detecting alterations to the source video file (e.g., changes in frame rate, resolution, or format). In this paper, we focus on two key challenges: generating robust video fingerprints by finding meaningful boundaries in a multimedia file, and measuring video similarity using fingerprint-based matching. Our evaluation shows that the proposed system can be applied to realistic multimedia file forensics tools.
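To make the idea of fingerprint-based matching concrete, the following Python sketch shows one simple way a keyframe could be reduced to a compact fingerprint and compared by Hamming distance instead of byte by byte. It is an illustration under stated assumptions, not the VFB method of the paper, whose details are not given in this abstract: the block-mean hash is a swapped-in, simplified perceptual hash, frames are assumed to be already decoded into grayscale rows of 0-255 values, and every name here (`block_mean_hash`, `hamming`, `video_similarity`, `grid`, `max_distance`) is hypothetical.

```python
from statistics import median
from typing import List, Sequence

# Assumption: a keyframe is a grayscale image given as rows of 0-255 values,
# with at least `grid` rows and `grid` columns.
Frame = List[List[int]]


def block_mean_hash(frame: Frame, grid: int = 4) -> int:
    """Return a grid*grid-bit fingerprint: one bit per block, set when the
    block's mean intensity is above the median of all block means."""
    height, width = len(frame), len(frame[0])
    means = []
    for i in range(grid):
        rows = range(i * height // grid, (i + 1) * height // grid)
        for j in range(grid):
            cols = range(j * width // grid, (j + 1) * width // grid)
            total = sum(frame[r][c] for r in rows for c in cols)
            means.append(total / (len(rows) * len(cols)))
    threshold = median(means)
    bits = 0
    for m in means:
        bits = (bits << 1) | (1 if m > threshold else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


def video_similarity(fps_a: Sequence[int], fps_b: Sequence[int],
                     max_distance: int = 2) -> float:
    """Fraction of fingerprints in fps_a that have a near match in fps_b.
    max_distance is an illustrative threshold, not a value from the paper."""
    if not fps_a:
        return 0.0
    matched = sum(
        1 for fa in fps_a
        if any(hamming(fa, fb) <= max_distance for fb in fps_b)
    )
    return matched / len(fps_a)
```

Because each bit depends only on relative block brightness, a keyframe and an exactly upscaled copy of it yield the same fingerprint in this sketch, which is the kind of tolerance to resolution changes that a byte-by-byte comparison cannot offer.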


Multimedia file forensics; File similarity search; Video fingerprint; Media-aware information detection; Fingerprint-based matching



This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT and Future Planning (2014R1A2A1A11054160). It was also supported by the Leading Human Resource Training Program of Regional Neo industry through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT and Future Planning (2016H1D5A1910630).



Copyright information

© Springer Science+Business Media, LLC 2017

Authors and Affiliations

  1. Department of Computer Science and Engineering, Korea University, Seoul, South Korea
  2. Department of Computer Engineering, Hallym University, Chuncheon, Republic of Korea
