Abstract
Classification falls under supervised learning. In supervised learning, a model learns from a given training dataset in which both the inputs and their corresponding outputs are provided. Decision rules are derived by observing the training dataset and are then used to determine the category, or class, of future inputs. Classification is therefore the process of assigning an individual item to one of a number of existing categories or classes based on the characteristics, or features, of the input data.
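As a minimal illustration of this idea, the sketch below classifies a new sample by the majority class among its nearest neighbours in a toy training set (a k-nearest-neighbour rule). The feature values and the "speech"/"music" labels are invented for illustration only:

```python
from collections import Counter
from math import dist

# Toy training set: (feature vector, class label) pairs.
# Both features and labels are hypothetical examples.
training_data = [
    ((1.0, 1.2), "speech"),
    ((0.9, 1.0), "speech"),
    ((3.1, 2.8), "music"),
    ((3.0, 3.2), "music"),
]

def knn_classify(sample, data, k=3):
    """Assign `sample` to the majority class among its k nearest neighbours."""
    # Sort training pairs by Euclidean distance to the sample, keep the k closest.
    neighbours = sorted(data, key=lambda pair: dist(sample, pair[0]))[:k]
    labels = [label for _, label in neighbours]
    # Majority vote over the neighbours' labels.
    return Counter(labels).most_common(1)[0][0]

print(knn_classify((1.1, 1.1), training_data))  # near the "speech" cluster
```

Here the "decision rule" is implicit: the stored training pairs plus the distance metric together determine the class assigned to any future input.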
Copyright information
© 2019 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
Sen, S., Dutta, A., & Dey, N. (2019). Audio Classification. In: Audio Processing and Speech Recognition. SpringerBriefs in Applied Sciences and Technology. Springer, Singapore. https://doi.org/10.1007/978-981-13-6098-5_4
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-6097-8
Online ISBN: 978-981-13-6098-5