Abstract
We propose an approach that generates a hybrid feature set and uses prior knowledge in multi-stage CNNs for robust infant sound classification. The dominant and auxiliary features within the set enlarge coverage while keeping good resolution for modeling the diversity of variations within infant sound. The novel multi-stage CNN method works together with prior knowledge constraints in decision making to overcome the limited-data problem in infant sound classification. Prior knowledge, drawn either from rules or from statistical results, provides good guidance for searching and classification. The effectiveness of the proposed method is evaluated on the commonly used Dunstan Baby Language Database and the Baby Chillanto Database. It gives an encouraging reduction of 4.14% in absolute classification error rate compared with the best one-stage CNN model. In addition, on the Baby Chillanto Database, a significant absolute error reduction of 5.33% is achieved compared to the one-stage CNN, and the method outperforms all other existing related studies.
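The abstract does not spell out how the prior knowledge constraints enter the decision. As a rough illustration only, the sketch below assumes a two-stage setup in which a first stage picks a coarse group, a second stage scores the classes inside that group, and a prior reweights those scores before the final choice. The function names, the group and class labels, and all numeric values are hypothetical and are not taken from the paper.

```python
from typing import Dict

Scores = Dict[str, float]


def apply_prior(scores: Scores, prior: Scores) -> Scores:
    """Multiply each class score by its prior weight and renormalize to sum to 1.

    Assumption: the prior is a per-class weight (from a rule or from
    class statistics). Classes missing from the prior get weight 0.
    """
    weighted = {label: s * prior.get(label, 0.0) for label, s in scores.items()}
    total = sum(weighted.values())
    if total <= 0.0:
        # The prior ruled out every class; fall back to the raw scores.
        return dict(scores)
    return {label: w / total for label, w in weighted.items()}


def two_stage_decision(stage1: Scores, stage2: Dict[str, Scores], prior: Scores) -> str:
    """Pick the best coarse group, then the best prior-weighted class within it."""
    group = max(stage1, key=stage1.get)
    refined = apply_prior(stage2[group], prior)
    return max(refined, key=refined.get)


# Hypothetical scores standing in for the outputs of two CNN stages.
stage1 = {"group_x": 0.7, "group_y": 0.3}
stage2 = {
    "group_x": {"class_a": 0.55, "class_b": 0.45},
    "group_y": {"class_c": 1.0},
}
prior = {"class_a": 0.2, "class_b": 0.8, "class_c": 0.5}

decision = two_stage_decision(stage1, stage2, prior)
print(decision)
```

In this toy case the second stage slightly prefers `class_a`, but the prior shifts the final decision to `class_b`. This is the kind of correction a prior might supply when training data is scarce.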
Acknowledgement
We thank Dr. Eduard Franti for sharing the Dunstan Baby database, and Dr. Orion Reyes and Dr. Carlos A. Reyes for providing access to the Baby Chillanto database. We express our deepest gratitude to Dr. Carlos A. Reyes-Garcia, Dr. Emilio Arch-Tirado and his INR-Mexico group, and Dr. Edgar M. Garcia-Tamayo for their dedication in collecting the Infant Cry database. We also gratefully acknowledge the support of NVIDIA Corporation with the donation of the Tesla K40 GPU used for this research.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Ji, C., Basodi, S., Xiao, X., Pan, Y. (2020). Infant Sound Classification on Multi-stage CNNs with Hybrid Features and Prior Knowledge. In: Xu, R., De, W., Zhong, W., Tian, L., Bai, Y., Zhang, LJ. (eds) Artificial Intelligence and Mobile Services – AIMS 2020. AIMS 2020. Lecture Notes in Computer Science, vol 12401. Springer, Cham. https://doi.org/10.1007/978-3-030-59605-7_1
DOI: https://doi.org/10.1007/978-3-030-59605-7_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-59604-0
Online ISBN: 978-3-030-59605-7
eBook Packages: Computer Science (R0)