Advertisement

Speaker Identification for OFDM-Based Aeronautical Communication System

  • Sara Sekkate
  • Mohammed Khalil
  • Abdellah Adib
Article
  • 14 Downloads

Abstract

Although a lot of research has been done on speaker identification in the presence of noise and channel variation, to the best of our knowledge, no work has been reported for aeronautical applications. In this paper, we aim to fulfill this goal by developing a Speaker Identification System (SIS) for future aeronautical communications systems. Furthermore, we present a novel feature extraction scheme based on multi-resolution analysis. The proposed features called SMFCC use Mel Frequency Cepstral Coefficients (MFCCs) features of stationary wavelet transform sub-bands. The extracted features are modeled using the i-vector approach, and support-vector machines are adopted as a back-end classifier. The performance of the proposed SIS is evaluated using two publicly available databases. Comparison of the proposed approach with the baseline MFCC feature extraction shows the feasibility and the robustness of the proposed method. Besides the noise reduction, the identification accuracy is improved by about 12% at higher signal-to-noise ratios and reaches 97.33% as compared to 88.33% using MFCC for ATCOSIM database.

Keywords

Speaker identification MFCC SWT Air Traffic Control (ATC) i-vector 

Notes

References

  1. 1.
    S. Abd El-Moneim, M.I. Dessouky, F.E. Abd El-Samie, M.A. Nassar, M. Abd El-Naby, Hybrid speech enhancement with empirical mode decomposition and spectral subtraction for efficient speaker identification. Int. J. Speech Technol. 18(4), 555–564 (2015)CrossRefGoogle Scholar
  2. 2.
    N. Asbai, A. Amrouche, M. Debyeche, Performances evaluation of GMM-UBM and GMM-SVM for speaker recognition in realistic world, in Neural Inf. Process., ed. by B.-L. Lu, L. Zhang, J. Kwok (Springer, Berlin, 2011), pp. 284–291CrossRefGoogle Scholar
  3. 3.
    J.G.P. Bernal, A.P. Guerrero, J.G. Close, A speaker verification system using SVM over a Spanish corpus, in 2009 Mexican International Conference on Computer Science, Sept 2009, pp. 381–386Google Scholar
  4. 4.
    J.P. Campbell, Speaker recognition: a tutorial. Proc. IEEE 85(9), 1437–1462 (1997)CrossRefGoogle Scholar
  5. 5.
    J.P. Campbell, D.A. Reynolds, Corpora for the evaluation of speaker recognition systems, in 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), vol. 2, pp. 829–832, March 1999Google Scholar
  6. 6.
    M. Carey, E. Parris, H. Lloyd-Thomas, S. Bennett, Robust prosodic features for speaker identification, in Proceedings of ICSLP-96, November 1996Google Scholar
  7. 7.
    P.M. Chauhan, N.P. Desai, Mel Frequency Cepstral Coefficients (MFCCs) based speaker identification in noisy environment using wiener filter, in 2014 International Conference on Green Computing Communication and Electrical Engineering (ICGCCEE), Mar 2014, pp. 1–5Google Scholar
  8. 8.
    S.-H. Chen, H.-C. Wang, Improvement of speaker recognition by combining residual and prosodic features with acoustic features, in 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 1, May 2004, pp. 93–96Google Scholar
  9. 9.
    B.J. Chua, X.J. Li, H.D. Tran, Study of automatic biosounds detection and classification using SVM and GMM, in 2011 IEEE/NIH Life Science Systems and Applications Workshop (LiSSA), April 2011, pp. 155–158Google Scholar
  10. 10.
    C. Cortes, V. Vapnik, Support-vector networks. Mach. Learn. 20(3), 273–297 (1995)zbMATHGoogle Scholar
  11. 11.
    S. Davis, P. Mermelstein, Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences. IEEE Trans. Acoust., Speech Signal Process. 28(4), 357–366 (1980)CrossRefGoogle Scholar
  12. 12.
    N. Dehak, P.J. Kenny, R. Dehak, P. Dumouchel, P. Ouellet, Front-end factor analysis for speaker verification. IEEE Trans. Acoust., Speech Signal Process. 19(4), 788–798 (2011)Google Scholar
  13. 13.
    S. Dey, S. Barman, R.K. Bhukya, R.K. Das, B.C. Haris, S.R.M. Prasanna, R. Sinha, Speech biometric based attendance system, in 2014 Twentieth National Conference on Communications (NCC), Feb 2014, pp. 1–6Google Scholar
  14. 14.
    H. Ding, Z.-M. Tang, L.-H. Wei, Y.-P. Li, A study on speaker identification based on weighted LS-SVM. Autom. Control Comput. Sci. 43(6), 328–335 (2009)CrossRefGoogle Scholar
  15. 15.
    M. Dutta, C. Patgiri, M. Sarma, K.K. Sarma, Closed-Set Text-Independent Speaker Identification System Using Multiple ANN Classifiers (Springer, Cham, 2015), pp. 377–385Google Scholar
  16. 16.
    M. Faundez-Zanuy, E. Monte-Moreno, State-of-the-art in speaker recognition. IEEE Aerosp. Electron. Syst. Mag. 20(5), 7–12 (2005)CrossRefGoogle Scholar
  17. 17.
    J. Garofolo, L. Lamel, Fisher, J. Fiscus, D. Pallett, N. Dahlgren, V. Zue, Timit acoustic-phonetic continuous speech corpus. Linguistic Data Consortium (1993)Google Scholar
  18. 18.
    H. Gish, M. Schmidt, Text-independent speaker identification. IEEE Signal Process. Mag. 11(4), 18–32 (1994)CrossRefGoogle Scholar
  19. 19.
    S.M. Govindan, P. Duraisamy, X. Yuan, Adaptive wavelet shrinkage for noise robust speaker recognition. Digit. Signal Process. 33, 180–190 (2014)CrossRefGoogle Scholar
  20. 20.
    E. Haas, Aeronautical channel modeling. IEEE Trans. Veh. Technol. 51(2), 254–264 (2002)CrossRefGoogle Scholar
  21. 21.
    M. Hagmüller, G. Kübin, Speech watermarking for air traffic control. Eurocontrol Experimental Centre, EEC Note 05/05 (2005)Google Scholar
  22. 22.
    M. Hébert, Text-Dependent Speaker Recognition (Springer, Berlin, 2008), pp. 743–762Google Scholar
  23. 23.
    HINDSIGHT, Nn\(^{\circ }\)2 communication. Technical report, EUROCONTROL (2006)Google Scholar
  24. 24.
    K. Hofbauer, H. Hering, G. Kübin, Speech watermarking for the VHF radio channel, in 4th EUROCONTROL Innovative Research Workshop, December 2005Google Scholar
  25. 25.
    K. Hofbauer, S. Petrik, H. Hering, The ATCOSIM corpus of non-prompted clean air traffic control speech, in Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC’08), Marrakech, Morocco, May 2008Google Scholar
  26. 26.
    Y. Hu, P. Loizou, Subjective comparison and evaluation of speech enhancement algorithms. Speech Commun. 49, 588–601 (2007)CrossRefGoogle Scholar
  27. 27.
    A.A.A.A. Khalil, E.S.M. Saad, M.A. El-Nabi, F.E.A. El-Samie, Efficient speaker identification from speech transmitted over Bluetooth based system, in 2013 8th International Conference on Computer Engineering Systems (ICCES), Nov 2013, pp. 190–193Google Scholar
  28. 28.
    W.B. Kheder, D. Matrouf, P.-M. Bousquet, J.-F. Bonastre, M. Ajili, Fast i-vector denoising using map estimation and a noise distributions database for robust speaker recognition. Comput. Speech Lang. 45, 104–122 (2017)CrossRefGoogle Scholar
  29. 29.
    S.G. Koolagudi, K. Sreenivasa Rao, R. Reddy, V.A. Kumar, S. Chakrabarti, Robust Speaker Recognition in Noisy Environments: Using Dynamics of Speaker-Specific Prosody (Springer, New York, 2012), pp. 183–204Google Scholar
  30. 30.
    K.A. Lee, A. Larcher, H. Thai, B. Ma, H. Li, Joint application of speech and speaker recognition for automation and security in smart home, in INTERSPEECH, 2011, pp. 3317–3318Google Scholar
  31. 31.
    K.A. Lee, B. Ma, H. Li, Speaker verification makes its debut in smartphone. IEEE Signal Processing Society Speech and Language Technical Committee Newsletter (2013)Google Scholar
  32. 32.
    Y. Lei, L. Burget, N. Scheffer, A noise robust i-vector extractor using vector Taylor series for speaker recognition, in2013 IEEE international conference on acoustics, speech and signal processing, May 2013, pp. 6788–6791Google Scholar
  33. 33.
    B.G. Nagaraja, H.S. Jayanna, Multilingual Speaker Identification with the Constraint of Limited Data Using Multitaper MFCC (Springer, Berlin, 2012), pp. 127–134Google Scholar
  34. 34.
    M. Neffe, V. Pham, H. Horst, G. Kubin, Speaker segmentation for air traffic control, in Lecture Notes in Artificial Intelligence (Springer, Berlin, 2007), pp. 177–191Google Scholar
  35. 35.
    NIST. The NIST year 2002 speaker recognition evaluation plan. National Institute of Standards and Technology of USA, February (2002). Available: http://www.nist.gov/speech/tests/spk/2002/doc/2002-spkrec-evalplan-v60.pdf
  36. 36.
    S. Qureshi, I. Masood, M. Hashmi, S. Hanninen, M. Sarwar, A. Jameel, Noise reduction of electrocardiographic signals using wavelet transforms. Elektron. Elektrotech. 20(3), 29–32 (2014)Google Scholar
  37. 37.
    D. Reynolds, W. Andrews, J. Campbell, J. Navratil, B. Peskin, A. Adami, Q. Jin, D. Klusacek, J. Abramson, R. Mihaescu, J. Godfrey, D. Jones, B. Xiang, The Supersid Project: exploiting high-level information for high-accuracy speaker recognition, in 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP ’03), vol. 4, Apr 2003, pp. IV–784Google Scholar
  38. 38.
    D.A. Reynolds, An overview of automatic speaker recognition technology, in 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 4, May 2002, pp. IV–4072–IV–4075Google Scholar
  39. 39.
    D.A. Reynolds, R.C. Rose, Robust text-independent speaker identification using Gaussian mixture speaker models. IEEE Trans. Speech Audio Process. 3(1), 72–83 (1995)CrossRefGoogle Scholar
  40. 40.
    S. Sadjadi, J. Hansen, Mean Hilbert envelope coefficients (MHEC) for robust speaker and language identification. Speech Commun. 72, 138–148 (2015)CrossRefGoogle Scholar
  41. 41.
    M. Sajatovic, et al., L-DACS1 system definition proposal: ddeliverable D2. Technical report version 1.0, Feb 2009Google Scholar
  42. 42.
    S. Sekkate, M. Khalil, A. Adib, An improved automatic aircraft identification system, in 2016 International Conference on Wireless Networks and Mobile Communications (WINCOM), Oct 2016, pp. 47–51Google Scholar
  43. 43.
    S. Sekkate, M. Khalil, A. Adib, Speaker identification: a way to reduce call-sign confusion events, in 2017 International Conference on Advanced Technologies for Signal & Image Processing, May 2017Google Scholar
  44. 44.
    U. Seljuq, F. Himayun, H. Rasheed, Selection of an optimal mother wavelet basis function for ECG signal denoising, in 17th IEEE International Multi Topic Conference 2014, Dec 2014, pp. 26–30Google Scholar
  45. 45.
    S. Selva Nidhyananthan, R. Shantha Selva Kumari, T. Senthur Selvi, Noise robust speaker identification using RASTA–MFCC feature with quadrilateral filter bank structure. Wirel. Pers. Commun. 91(3), 1321–1333 (2016)CrossRefGoogle Scholar
  46. 46.
    A. Shafik, S.M. Elhalafawy, S.E.M. Diab, B.M. Sallam, F.E.A. El-Samie, A wavelet based approach for speaker identification from degraded speech. IJCNIS 1(3), 52–58 (2009)Google Scholar
  47. 47.
    Y. Shao, S. Srinivasan, D. Wang, Incorporating auditory feature uncertainties in robust speaker identification, in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2007, Honolulu, HI, USA, 15–20 Apr 2007, pp. 277–280Google Scholar
  48. 48.
    D. Snyder, D. Garcia-Romero, G. Sell, D. Povey, S. Khudanpur, X-vectors: robust DNN embeddings for speaker recognition, in 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2018), pp. 5329–5333Google Scholar
  49. 49.
    M. Stark, T.V. Pham, F. Pernkopf, G. Kubin, H. Hering, Speaker verification for air traffic control, in EUROCONTROL Innovative Research Workshop and Exhibition, Dec 2006Google Scholar
  50. 50.
    V.N. Vapnik, Statistical Learning Theory. Adaptive and Learning Systems for Signal Processing, Communications, and Control (Wiley, New York, 1998)Google Scholar
  51. 51.
    Voxforge database. Technical reportGoogle Scholar
  52. 52.
    A. Vuppala, K.S. Rao, Speaker identification under background noise using features extracted from steady vowel regions. Int. J. Adapt Control Signal Process 27(9), 781–792 (2013)CrossRefGoogle Scholar
  53. 53.
    L. Xu, R.K. Das, E. Yilmaz, J. Yang, H. Li, Generative x-vectors for text-independent speaker verification (2018). CoRR, arXiv:1809.06798
  54. 54.
    X. Zhao, Y. Shao, D. Wang, Casa-based robust speaker identification. IEEE Trans. Audio Speech Lang. Process. 20(5), 1608–1616 (2012)CrossRefGoogle Scholar
  55. 55.
    X. Zhao, Y. Wang, D. Wang, Robust speaker identification in noisy and reverberant conditions. IEEE Trans. Audio Speech Lang. Process. 22(4), 836–845 (2014)CrossRefGoogle Scholar
  56. 56.
    T.F. Zheng, Q. Jin, L. Li, J. Wang, F. Bie, An overview of robustness related issues in speaker recognition, inSignal and Information Processing Association Annual Summit and Conference (APSIPA), 2014 Asia-Pacific, Dec 2014, pp. 1–10Google Scholar

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. 1.Team Networks, Telecoms and MultimediaLIM@II-FSTMMohammediaMorocco

Personalised recommendations