Speech Quality Enhancement in Digital Forensic Voice Analysis

  • Moses Ekpenyong
  • Okure Obot
Part of the Studies in Computational Intelligence book series (SCI, volume 555)


Noise and reverberation in Digital Forensic voice evidence can hinder the identification, verification and processing of crime data. Computationally, the efficiency of speech signal processing depends largely on the integrity and authenticity of the audio/voice recordings. Our interest is in improving integrity, specifically the intelligibility of speech signals. We pursue this in four ways. First, we propose a speech quality enhancement technique that cleans and rebuilds defective speech data for quality Forensic analysis. It explores an optimal estimator for the magnitude spectrum, in which the Discrete Fourier Transform (DFT) coefficients of clean speech are modelled by a Laplacian distribution and the noise DFT coefficients by a Gaussian distribution. Second, we introduce an automatic speech pre-processing algorithm for phoneme segmentation of raw speech data, capable of iteratively refining Hidden Markov Model (HMM) speech labels for improved intelligibility. Third, we simulate the distortion introduced by R-bit quantisation and compute the signal-to-quantisation-noise ratio (SNR), for the purpose of managing speech signal distortions. Fourth, we investigate the effect of confused phonemic and tone-bearing unit features on the intelligibility of speech, to help Forensic experts decode voice disguise or language “barriers” that may impede proper Forensic voice analysis. The results point to promising prospects for the field of Forensic intelligence and are likely to reduce unnecessary setbacks during Forensic analysis.
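The third item, R-bit quantisation and the resulting signal-to-quantisation-noise ratio, can be illustrated with a minimal sketch. This is not the chapter's simulation, which is not reproduced on this page. It assumes a uniform mid-rise quantiser and a full-scale sine test signal, and the names `quantise`, `sqnr_db` and `sine` are hypothetical. Under these assumptions the measured SNR should lie close to the textbook approximation 6.02R + 1.76 dB.

```python
import math


def quantise(samples, r_bits, full_scale=1.0):
    """Uniform mid-rise quantiser with 2**r_bits levels (an assumed design)."""
    levels = 2 ** r_bits
    step = 2.0 * full_scale / levels
    out = []
    for x in samples:
        # Index of the quantisation cell, clipped to the representable range.
        idx = math.floor(x / step)
        idx = max(-levels // 2, min(levels // 2 - 1, idx))
        # Reconstruct at the centre of the cell.
        out.append((idx + 0.5) * step)
    return out


def sqnr_db(clean, quantised):
    """Ratio of signal power to quantisation-noise power, in decibels."""
    signal_power = sum(x * x for x in clean)
    noise_power = sum((x - q) ** 2 for x, q in zip(clean, quantised))
    return 10.0 * math.log10(signal_power / noise_power)


def sine(n_samples=20000, cycles=37.3):
    """Full-scale sine; a non-integer cycle count spreads samples over many phases."""
    return [math.sin(2.0 * math.pi * cycles * n / n_samples)
            for n in range(n_samples)]


if __name__ == "__main__":
    signal = sine()
    for r in (4, 8, 12, 16):
        measured = sqnr_db(signal, quantise(signal, r))
        expected = 6.02 * r + 1.76  # approximation for a full-scale sine
        print(f"R={r:2d}  measured={measured:6.2f} dB  approx={expected:6.2f} dB")
```

Each additional bit halves the quantisation step, so the noise power drops by a factor of four and the SNR rises by roughly 6 dB. Real speech rarely fills the full scale, so its measured SNR at a given R would typically be lower than for this test signal.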


Keywords: Forensic science, intelligent system, speech quality evaluation, speech synthesis, voice adaptation





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  1. Department of Computer Science, University of Uyo, Uyo, Nigeria
  2. Centre for Speech Technology Research (CSTR), University of Edinburgh, Edinburgh, UK
