Infant Sound Classification on Multi-stage CNNs with Hybrid Features and Prior Knowledge

  • Conference paper
Artificial Intelligence and Mobile Services – AIMS 2020 (AIMS 2020)

Abstract

We propose an approach that generates a hybrid feature set and applies prior knowledge in multi-stage CNNs for robust infant sound classification. The dominant and auxiliary features in the set enlarge coverage while keeping good resolution for modeling the diverse variations in infant sound. The novel multi-stage CNN method works together with prior knowledge constraints in decision making to overcome the limited-data problem in infant sound classification. Prior knowledge, drawn either from rules or from statistical results, provides good guidance for searching and classification. The effectiveness of the proposed method is evaluated on the commonly used Dunstan Baby Language Database and Baby Chillanto Database. It gives an encouraging reduction of 4.14% in absolute classification error rate compared with the best model using a one-stage CNN. In addition, on the Baby Chillanto Database, a significant absolute error reduction of 5.33% is achieved compared with the one-stage CNN, and the method outperforms all other existing related studies.
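The abstract does not spell out how prior knowledge enters the decision. As a rough illustration only, the sketch below shows one common way a statistical prior could constrain a two-stage decision: a first stage picks a coarse group, a second stage scores the labels in that group, and the scores are reweighted by a prior before the final choice. The function names, the group and label names, and all numeric scores are hypothetical and are not taken from the paper; in practice the scores would come from trained CNNs.

```python
from typing import Dict


def apply_prior(scores: Dict[str, float], prior: Dict[str, float]) -> Dict[str, float]:
    """Reweight classifier scores by a prior and renormalize to sum to 1."""
    weighted = {label: s * prior.get(label, 0.0) for label, s in scores.items()}
    total = sum(weighted.values())
    if total == 0.0:
        # The prior excludes every candidate: fall back to the raw scores.
        return dict(scores)
    return {label: w / total for label, w in weighted.items()}


def classify(
    stage1: Dict[str, float],
    stage2: Dict[str, Dict[str, float]],
    prior: Dict[str, float],
) -> str:
    """Two-stage decision: coarse group first, then prior-weighted fine label."""
    group = max(stage1, key=stage1.get)          # stage 1: pick the coarse group
    fine = apply_prior(stage2[group], prior)     # stage 2: constrain with the prior
    return max(fine, key=fine.get)


# Hypothetical scores standing in for CNN outputs (assumed values).
stage1_scores = {"g1": 0.7, "g2": 0.3}
stage2_scores = {"g1": {"A": 0.55, "B": 0.45}, "g2": {"C": 1.0}}
label_prior = {"A": 0.2, "B": 0.8, "C": 0.5}

decision = classify(stage1_scores, stage2_scores, label_prior)
```

With these assumed numbers the second-stage scores alone would favor label A, while the prior shifts the decision to label B. This is the kind of correction a prior can supply when training data are too scarce for the network scores to be reliable on their own.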

Acknowledgement

We thank Dr. Eduard Franti for sharing the Dunstan Baby database, and Dr. Orion Reyes and Dr. Carlos A. Reyes for providing access to the Baby Chillanto database. We express our deepest gratitude to Dr. Carlos A. Reyes-Garcia, Dr. Emilio Arch-Tirado and his INR-Mexico group, and Dr. Edgar M. Garcia-Tamayo for their dedication in collecting the Infant Cry database. We also gratefully acknowledge the support of NVIDIA Corporation with the donation of the Tesla K40 GPU used for this research.

Author information

Corresponding author

Correspondence to Yi Pan.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Ji, C., Basodi, S., Xiao, X., Pan, Y. (2020). Infant Sound Classification on Multi-stage CNNs with Hybrid Features and Prior Knowledge. In: Xu, R., De, W., Zhong, W., Tian, L., Bai, Y., Zhang, LJ. (eds) Artificial Intelligence and Mobile Services – AIMS 2020. AIMS 2020. Lecture Notes in Computer Science, vol 12401. Springer, Cham. https://doi.org/10.1007/978-3-030-59605-7_1

  • DOI: https://doi.org/10.1007/978-3-030-59605-7_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-59604-0

  • Online ISBN: 978-3-030-59605-7

  • eBook Packages: Computer Science, Computer Science (R0)
