VPCID—A VoIP Phone Call Identification Database

Huang, Yuankun; Tan, Shunquan; Li, Bin; Huang, Jiwu

doi:10.1007/978-3-030-11389-6_23

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 11378)

Included in the following conference series:

International Workshop on Digital Watermarking

Abstract

Audio forensics plays an important role in information security by addressing disputes over the authenticity and originality of audio. However, some audio forensics methods in the literature were evaluated on either non-forensic-oriented databases or private databases that are not publicly available, which makes it difficult for researchers to compare methods. In this paper we establish VPCID, a VoIP phone call identification database for audio forensic purposes. Phone scams and voice phishing via VoIP, through which a caller's identity can easily be hidden or forged, are on the rise, so there is a growing need to identify VoIP phone calls. The VPCID database comprises 1152 VoIP call recordings and 1152 mobile phone call recordings, each longer than two minutes. Recordings were collected from 48 different speakers using different smartphones under various recording conditions, such as VoIP software and location. We used MFCC (Mel-Frequency Cepstral Coefficients) and ACV (Amplitude Co-occurrence Vector) based features, each paired with an SVM (Support Vector Machine) classifier, to perform classification on the database. We also evaluated the database with a CNN (convolutional neural network), but the performance was not satisfactory. The VoIP phone call identification problem is therefore challenging and calls for more effective solutions. We hope the database will convey more than this paper and inspire future studies. It is openly available at http://media-sec.szu.edu.cn/VPCID.html, and we welcome its use.
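The figures in the abstract (1152 recordings per class, 48 speakers) suggest one way a user of the database might set up an evaluation. The following is a minimal sketch, not the authors' protocol: it assumes recordings are spread evenly across speakers (24 per class per speaker, which the abstract does not state), uses hypothetical placeholder file names, and applies a speaker-disjoint train/test split, a common precaution so that a classifier cannot rely on speaker identity. The helpers `build_manifest` and `speaker_disjoint_split` are illustrative names, not part of VPCID.

```python
import random


def build_manifest(num_speakers=48, per_class=1152):
    """Build a placeholder manifest of (speaker, label, filename) tuples.

    Assumption: recordings are divided evenly among speakers. The file
    names are hypothetical and do not reflect the real VPCID layout.
    """
    per_speaker = per_class // num_speakers
    manifest = []
    for spk in range(num_speakers):
        for label in ("voip", "mobile"):
            for k in range(per_speaker):
                name = f"spk{spk:02d}_{label}_{k:02d}.wav"
                manifest.append((spk, label, name))
    return manifest


def speaker_disjoint_split(manifest, test_fraction=0.25, seed=0):
    """Split records so that no speaker appears in both train and test."""
    speakers = sorted({spk for spk, _, _ in manifest})
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    rng.shuffle(speakers)
    n_test = max(1, round(len(speakers) * test_fraction))
    test_speakers = set(speakers[:n_test])
    train = [rec for rec in manifest if rec[0] not in test_speakers]
    test = [rec for rec in manifest if rec[0] in test_speakers]
    return train, test


if __name__ == "__main__":
    manifest = build_manifest()
    train, test = speaker_disjoint_split(manifest)
    # Total recordings, training recordings, test recordings
    print(len(manifest), len(train), len(test))  # 2304 1728 576
```

Under these assumptions, holding out 12 of the 48 speakers leaves 1728 recordings for training and 576 for testing, with both classes balanced in each part. Features such as MFCC or ACV would then be extracted per recording before training a classifier.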

This work was supported in part by the NSFC (U1636202, 61572329, 61772349) and the Shenzhen R&D Program (JCYJ20160328144421330). This work was also supported by Alibaba Group through the Alibaba Innovative Research (AIR) Program.

Author information

Authors and Affiliations

Guangdong Key Laboratory of Intelligent Information Processing and Shenzhen Key Laboratory of Media Security, College of Information Engineering, Shenzhen University, Shenzhen, 518060, China
Yuankun Huang, Bin Li & Jiwu Huang
National Engineering Laboratory for Big Data System Computing Technology, College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, 518060, China
Shunquan Tan

Corresponding author

Correspondence to Bin Li.

Editor information

Editors and Affiliations

Korea Advanced Institute of Science and Technology, Daejeon, Korea (Republic of)
Chang D. Yoo
New Jersey Institute of Technology, Newark, NJ, USA
Yun-Qing Shi
Korea University, Seoul, Korea (Republic of)
Hyoung Joong Kim
University of Florence, Florence, Italy
Alessandro Piva
Korea Advanced Institute of Science and Technology, Daejeon, Korea (Republic of)
Gwangsu Kim

About this paper

Cite this paper

Huang, Y., Tan, S., Li, B., Huang, J. (2019). VPCID—A VoIP Phone Call Identification Database. In: Yoo, C., Shi, YQ., Kim, H., Piva, A., Kim, G. (eds) Digital Forensics and Watermarking. IWDW 2018. Lecture Notes in Computer Science, vol 11378. Springer, Cham. https://doi.org/10.1007/978-3-030-11389-6_23

DOI: https://doi.org/10.1007/978-3-030-11389-6_23
Published: 24 January 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-11388-9
Online ISBN: 978-3-030-11389-6
eBook Packages: Computer Science, Computer Science (R0)