VPCID—A VoIP Phone Call Identification Database

  • Conference paper
Digital Forensics and Watermarking (IWDW 2018)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 11378)

Abstract

Audio forensics plays an important role in information security by resolving disputes about the authenticity and originality of audio. However, some audio forensic methods in the existing literature were evaluated either on databases not designed for forensics or on private databases that are not publicly available, which makes it difficult for researchers to compare methods. In this paper we establish VPCID, a VoIP phone call identification database for audio forensics. Phone scams and voice phishing via VoIP, through which the caller’s identity can easily be hidden or forged, are on the rise, so there is a demand for identifying VoIP phone calls. The VPCID database comprises 1152 VoIP call recordings and 1152 mobile phone call recordings, each longer than two minutes. The recordings were collected from 48 speakers using different smartphones under various recording conditions, such as different VoIP software and locations. We performed classification on the database using MFCC (Mel-Frequency Cepstral Coefficients) and ACV (Amplitude Co-occurrence Vector) features, each paired with an SVM (Support Vector Machine) classifier. We also evaluated a CNN (convolutional neural network) on the database, but its performance was not very satisfactory. The VoIP phone call identification problem is therefore challenging and calls for more effective solutions. We hope the database will be useful beyond this paper and inspire future studies. It is openly available at http://media-sec.szu.edu.cn/VPCID.html, and we welcome its use.
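To give a rough idea of what a co-occurrence style feature looks like, the sketch below quantises the waveform amplitude and builds a normalised histogram of amplitude pairs that occur a fixed lag apart. It is only an illustration in the spirit of the ACV feature named in the abstract, not the definition used in the paper: the function name `cooccurrence_vector`, the number of quantisation levels, the lag, and the toy sine input are all assumptions made for this example.

```python
from collections import Counter
import math


def cooccurrence_vector(samples, levels=8, lag=1):
    """Normalised co-occurrence histogram of quantised amplitude pairs.

    Hypothetical helper for illustration only. `samples` are floats in
    [-1.0, 1.0]; `levels` and `lag` are illustrative choices, not values
    taken from the paper.
    """
    if lag < 1 or len(samples) <= lag:
        raise ValueError("lag must be >= 1 and smaller than the signal length")

    def quantise(x):
        # Map [-1, 1] onto integer bins 0..levels-1, clamping out-of-range input.
        x = max(-1.0, min(1.0, x))
        return min(levels - 1, int((x + 1.0) / 2.0 * levels))

    q = [quantise(x) for x in samples]
    # Count how often amplitude bin a is followed, `lag` samples later, by bin b.
    counts = Counter(zip(q[:-lag], q[lag:]))
    total = len(q) - lag
    # Flatten the levels x levels matrix row by row into one feature vector.
    return [counts.get((a, b), 0) / total
            for a in range(levels) for b in range(levels)]


# Toy input: one second of a 440 Hz sine at 8 kHz (not data from VPCID).
tone = [math.sin(2 * math.pi * 440 * n / 8000) for n in range(8000)]
features = cooccurrence_vector(tone)
print(len(features), round(sum(features), 6))
```

A vector like this, computed per recording, could then be passed to a classifier such as an SVM; the feature design and classifier settings actually used for VPCID are described in the full paper rather than in this abstract.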

This work was supported in part by the NSFC (U1636202, 61572329, 61772349), Shenzhen R&D Program (JCYJ20160328144421330). This work was also supported by Alibaba Group through Alibaba Innovative Research (AIR) Program.

Corresponding author

Correspondence to Bin Li.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Huang, Y., Tan, S., Li, B., Huang, J. (2019). VPCID—A VoIP Phone Call Identification Database. In: Yoo, C., Shi, YQ., Kim, H., Piva, A., Kim, G. (eds) Digital Forensics and Watermarking. IWDW 2018. Lecture Notes in Computer Science, vol 11378. Springer, Cham. https://doi.org/10.1007/978-3-030-11389-6_23

  • DOI: https://doi.org/10.1007/978-3-030-11389-6_23

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-11388-9

  • Online ISBN: 978-3-030-11389-6

  • eBook Packages: Computer Science, Computer Science (R0)
