
Evolutionary speech quality estimation in VoIP


Estimating the quality of Voice over Internet Protocol (VoIP) as perceived by humans is considered a formidable task. This is partly because a relatively large number of variables act as determinants of quality, and partly because it is difficult to discern the significance of one variable relative to another. This paper presents a novel approach based on genetic programming (GP) that maps network traffic parameters to listeners’ perception of speech quality. The ITU-T Recommendation P.862 (PESQ) algorithm is used as the reference model in this research. The GP-discovered models provide effective VoIP quality estimation and are highly correlated with ITU-T Recommendation P.862 (PESQ). They also outperform ITU-T Recommendation P.563 in estimating the effect that packet loss has on speech quality. The GP-discovered models prove suited to real-time and in vivo evaluation of VoIP calls. Additionally, they are deployable on a wide variety of hardware platforms.
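As a rough illustration of what "highly correlated" with a reference model means in practice, the sketch below scores a candidate estimator against reference quality scores using the Pearson correlation coefficient. Everything in it is an assumption made for illustration: `candidate_model` is a hypothetical closed-form expression standing in for an evolved model (it is not a model from the paper), and the packet loss rates and reference scores are made-up values, not measured PESQ results.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def candidate_model(loss_rate):
    """Hypothetical stand-in for an evolved expression (NOT from the paper).

    Maps a packet loss rate in [0, 1] to an estimated quality score.
    """
    return 4.2 - 1.1 * math.log(1.0 + 10.0 * loss_rate)


# Made-up illustrative data: packet loss rates and the scores a reference
# model might assign to the corresponding degraded speech samples.
loss_rates = [0.0, 0.02, 0.05, 0.10, 0.20]
reference_scores = [4.2, 3.9, 3.5, 3.0, 2.4]

# Estimate quality from network parameters only, then compare the estimates
# with the reference scores.
estimates = [candidate_model(p) for p in loss_rates]
r = pearson(estimates, reference_scores)
print(f"Pearson correlation with reference scores: {r:.3f}")
```

In a GP setting, a measure of this kind could serve as part of the fitness evaluation of each candidate expression; whether the authors used exactly this measure is not stated in the abstract.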




  2. Adaptive operator probabilities are discussed on page 31 of the GPLab manual.



Corresponding author

Correspondence to Adil Raja.


Cite this article

Raja, A., Azad, R.M.A., Flanagan, C. et al. Evolutionary speech quality estimation in VoIP. Soft Comput 15, 89–94 (2011).


  • Packet Loss
  • Genetic Programming
  • Packet Loss Rate
  • Mean Opinion Score
  • Speech Quality