Abstract
Estimating the quality of Voice over Internet Protocol (VoIP) speech as perceived by humans is a formidable task, partly because a relatively large number of variables determine quality and the relative significance of each is difficult to discern. This paper presents a novel approach based on genetic programming (GP) that maps network traffic parameters to listeners’ perception of speech quality. The ITU-T Recommendation P.862 (PESQ) algorithm serves as the reference model. The GP-discovered models provide effective VoIP quality estimation: their estimates are highly correlated with those of PESQ, and they outperform ITU-T Recommendation P.563 in estimating the effect of packet loss on speech quality. The models are suited to real-time, in vivo evaluation of VoIP calls and are deployable on a wide variety of hardware platforms.
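As an illustration of what "highly correlated" with the reference model means in practice, the following minimal sketch computes the Pearson correlation between a candidate model's quality estimates and reference scores. Everything in it is an assumption made for illustration: `candidate_model` is a hypothetical stand-in for an evolved expression, not one of the models from the paper, and the packet loss rates and reference scores are made-up placeholder values, not PESQ output or experimental data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def candidate_model(loss_rate):
    """Hypothetical stand-in for an evolved model: quality falls with packet loss."""
    return 4.0 - 0.1 * loss_rate  # illustrative coefficients only


# Placeholder inputs: packet loss in percent and made-up reference scores.
loss_rates = [0.0, 5.0, 10.0, 20.0]
reference_scores = [4.0, 3.4, 3.1, 2.0]

estimates = [candidate_model(p) for p in loss_rates]
r = pearson(estimates, reference_scores)
print(f"correlation with reference scores: {r:.3f}")
```

A correlation near 1 indicates that the estimates track the reference scores closely; it does not by itself show that the absolute values agree, so an error measure is usually reported alongside it.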
Notes
Adaptive operator probabilities are discussed on page 31 of the GPLab manual.
References
Cole RG, Rosenbluth JH (2001) Voice over IP performance monitoring. SIGCOMM Comput Commun Rev 31(2):9–24
Davis L (1989) Adapting operator probabilities in genetic algorithms. In: Proceedings of the third international conference on genetic algorithms, San Mateo, CA
ETSI EN 301 704 V7.2.1. Digital cellular telecommunications system; Adaptive Multi-Rate (AMR) speech transcoding
Gustafson S, Burke EK, Krasnogor N (2005) On improving genetic programming for symbolic regression. In: Corne D et al (eds) Proceedings of the 2005 IEEE congress on evolutionary computation, vol 1. IEEE Press, Edinburgh, UK, pp 912–919
Hoene C, Karl H, Wolisz A (2004) A perceptual quality model for adaptive VOIP applications. In: Proceedings of international symposium on performance evaluation of computer and telecommunication systems (SPECTS), vol 4. San Jose, California, USA, pp 2573–2577
ITU-T (1996a) Coding of speech at 8 kbit/s using conjugate-structure algebraic-code-excited linear-prediction (CS-ACELP). International Telecommunications Union, Geneva, Switzerland. ITU-T Recommendation G.729
ITU-T (1996b) Dual rate speech coder for multimedia communication transmitting at 5.3 and 6.3 kbit/s. International Telecommunications Union, Geneva, Switzerland. ITU-T Recommendation G.723.1
ITU-T (2005a) The E-model, a computational model for use in transmission planning. International Telecommunications Union, Geneva, Switzerland. ITU-T Recommendation G.107
ITU-T (2005b) Network model for evaluating multimedia transmission performance over internet protocol. International Telecommunications Union, Geneva, Switzerland. ITU-T Recommendation G.1050
ITU-T (2005c) Single-ended method for objective speech quality assessment in narrow-band telephony applications. International Telecommunications Union, Geneva, Switzerland. ITU-T Recommendation P.563
Keijzer M (2003) Improving symbolic regression with interval arithmetic and linear scaling. In: Ryan C, Soule T, Keijzer M, Tsang E, Poli R, Costa E (eds) Genetic programming. Proceedings of EuroGP’2003, vol 2610 of LNCS. Springer-Verlag, Essex, pp 70–82
Keijzer M (2004) Scaled symbolic regression. Genetic Program Evolvable Mach 5(3):259–269
Koza JR (1992) Genetic programming: on the programming of computers by means of natural selection. MIT Press, Cambridge
Luke S, Panait L (2002) Lexicographic parsimony pressure. In: Langdon WB et al (eds) GECCO 2002: proceedings of the genetic and evolutionary computation conference. New York, pp 829–836
Mitchell T (1997) Machine learning. McGraw Hill, New York
Mohamed S, Cervantes-Perez F, Afifi H (2001) Integrating networks measurements and speech quality subjective scores for control purposes. In: Annual joint conference of the IEEE computer and communications societies (INFOCOM), pp 641–649
Mohamed S, Rubino G, Varela M (2004) A method for quantitative evaluation of audio quality over packet networks and its comparison with existing techniques. In: Measurement of speech and audio quality in networks (MESAQIN)
Raja A, Azad RMA, Flanagan C, Picovici D, Ryan C (2006) Non-intrusive quality evaluation of VOIP using genetic programming. In: First international conference on bio inspired models of network, information and computer systems, vol 4, pp 2573–2577
Raja A, Azad RMA, Flanagan C, Ryan C (2007) Real-time, non-intrusive evaluation of VoIP. In: Ebner M, O’Neill M, Ekárt A, Vanneschi L, Isabel Esparcia-Alcázar A (eds) Proceedings of the 10th European conference on genetic programming, vol 4445 of LNCS. Springer, Valencia, Spain, pp 217–228
Sun L, Ifeachor EC (2002) Perceived speech quality prediction for voice over IP-based networks. In: IEEE international conference on communications (ICC), vol 4, pp 2573–2577
Sun L, Ifeachor EC (2006) Voice quality prediction models and their application in VoIP networks. IEEE Trans Multimed 8(4):809–820
Sun L, Wade G, Lines BM, Ifeachor EC (2001) Impact of packet loss location on perceived speech quality. In: 2nd IP-telephony workshop. Columbia University, New York
Thorpe L, Yang W (1996) Performance of current perceptual objective speech quality measures. In: IEEE international speech coding, vol 1, pp 144–146
Cite this article
Raja, A., Azad, R.M.A., Flanagan, C. et al. Evolutionary speech quality estimation in VoIP. Soft Comput 15, 89–94 (2011). https://doi.org/10.1007/s00500-009-0521-2
DOI: https://doi.org/10.1007/s00500-009-0521-2