Infant Sound Classification on Multi-stage CNNs with Hybrid Features and Prior Knowledge

Ji, Chunyan; Basodi, Sunitha; Xiao, Xueli; Pan, Yi

doi:10.1007/978-3-030-59605-7_1

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12401)

Included in the following conference series:

International Conference on AI and Mobile Services

Abstract

We propose an approach that generates a hybrid feature set and uses prior knowledge in multi-stage CNNs for robust infant sound classification. The dominant and auxiliary features within the set enlarge coverage while keeping good resolution for modeling the diversity of variations within infant sounds. The novel multi-stage CNN method works together with prior knowledge constraints in decision making to overcome the limited data problem in infant sound classification. Prior knowledge, derived either from rules or from statistical results, provides good guidance for searching and classification. The effectiveness of the proposed method is evaluated on the commonly used Dunstan Baby Language Database and Baby Chillanto Database. It gives an encouraging absolute reduction of 4.14% in classification error rate compared with the best model using a one-stage CNN. In addition, on the Baby Chillanto Database, a significant absolute error reduction of 5.33% is achieved compared with the one-stage CNN, and the method outperforms all other existing related studies.
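
As a rough illustration of how prior knowledge constraints might enter a staged decision, the following minimal sketch reweights hypothetical first-stage scores by a statistical prior and a rule-based mask, then lets a second stage choose among the surviving candidates. It is not the authors' implementation: the function names, the placeholder classes "A", "B" and "C", the scores, and the top-k hand-off between stages are all assumptions made for this example, and plain dictionaries stand in for CNN outputs.

```python
# Illustrative only: none of these names or numbers come from the paper.

def apply_prior(posteriors, prior, allowed):
    """Reweight classifier posteriors by a statistical prior and a rule mask.

    posteriors: dict class -> probability from a classifier (hypothetical)
    prior:      dict class -> prior weight, e.g. a class frequency (hypothetical)
    allowed:    set of classes that a rule permits in this context (hypothetical)
    """
    scores = {
        c: (p * prior.get(c, 0.0) if c in allowed else 0.0)
        for c, p in posteriors.items()
    }
    total = sum(scores.values())
    if total == 0.0:
        # The constraints ruled out everything; fall back to the raw posteriors.
        return dict(posteriors)
    return {c: s / total for c, s in scores.items()}


def two_stage_decision(stage1, stage2, prior, allowed, top_k=2):
    """Keep the top_k constrained stage-1 candidates, then let stage 2 decide."""
    constrained = apply_prior(stage1, prior, allowed)
    candidates = sorted(constrained, key=constrained.get, reverse=True)[:top_k]
    return max(candidates, key=lambda c: stage2.get(c, 0.0))


# Made-up scores over placeholder class names.
stage1 = {"A": 0.40, "B": 0.35, "C": 0.25}
stage2 = {"A": 0.30, "B": 0.20, "C": 0.50}
prior = {"A": 0.2, "B": 0.3, "C": 0.5}
allowed = {"A", "C"}

decision = two_stage_decision(stage1, stage2, prior, allowed)
```

With these made-up numbers the unconstrained first stage would pick "A", whereas the constrained two-stage decision picks "C". The point of the sketch is only that constraints can narrow the search before a later stage commits to a label.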

Acknowledgement

We would like to thank Dr. Eduard Franti for sharing the Dunstan Baby database, and Dr. Orion Reyes and Dr. Carlos A. Reyes for providing access to the Baby Chillanto database. We want to express our deepest gratitude to Dr. Carlos A. Reyes-Garcia, Dr. Emilio Arch-Tirado and his INR-Mexico group, and Dr. Edgar M. Garcia-Tamayo for their dedication in the collection of the Infant Cry database. We also gratefully acknowledge the support of NVIDIA Corporation with the donation of the Tesla K40 GPU used for this research.

Author information

Authors and Affiliations

Georgia State University, Atlanta, GA, 30303, USA
Chunyan Ji, Sunitha Basodi, Xueli Xiao & Yi Pan

Corresponding author

Correspondence to Yi Pan.

Editor information

Editors and Affiliations

Harbin Institute of Technology, Shenzhen, China
Ruifeng Xu
Sunmi US Inc., Pleasanton, CA, USA
Wang De
University of South Carolina Upstate, Spartanburg, SC, USA
Wei Zhong
University of Electronic Science and Technology of China, Chengdu, China
Ling Tian
Eastern Michigan University, Ypsilanti, MI, USA
Yongsheng Bai
Kingdee International Software Group Co., Ltd., Shenzhen, China
Liang-Jie Zhang

About this paper

Cite this paper

Ji, C., Basodi, S., Xiao, X., Pan, Y. (2020). Infant Sound Classification on Multi-stage CNNs with Hybrid Features and Prior Knowledge. In: Xu, R., De, W., Zhong, W., Tian, L., Bai, Y., Zhang, LJ. (eds) Artificial Intelligence and Mobile Services – AIMS 2020. AIMS 2020. Lecture Notes in Computer Science, vol 12401. Springer, Cham. https://doi.org/10.1007/978-3-030-59605-7_1

DOI: https://doi.org/10.1007/978-3-030-59605-7_1
Published: 18 September 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-59604-0
Online ISBN: 978-3-030-59605-7
eBook Packages: Computer Science, Computer Science (R0)
