Tagging of Weakly Labeled Acoustic Data Using Skip Layer Connection Detection Classification Model

  • Conference paper
Advances in Automation, Signal Processing, Instrumentation, and Control (i-CASIC 2020)

Abstract

Weakly labeled data is hard to classify because it carries no time tags, which makes it difficult to locate the target events to train on. Previous work has used joint detection-classification models, in which a detector and a classifier operate together to identify the presence of events in frames and then classify them. To increase the efficiency of the neural network, several approaches use very deep models of arbitrary depth, but such models plateau quickly and take a long time to train. We propose skip layer connections in deep neural networks: rather than passing only the previous layer's output to the next layer, we pass the previous layer's output together with its input, carried over by an identity connection. As a result, the model can decide for itself how much of the previous layer's output is necessary, thus pseudo-adapting its depth.
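
The skip layer idea described in the abstract can be illustrated with a minimal NumPy sketch. It assumes the common residual form y = F(x) + x with a fully connected layer and ReLU as F. The chapter's actual architecture is not visible in this preview, so the function names, layer type, and sizes below are hypothetical choices for illustration only.

```python
import numpy as np

def relu(x):
    """Element-wise rectified linear unit."""
    return np.maximum(x, 0.0)

def plain_layer(x, w, b):
    """Ordinary layer F(x): only the transformed output moves forward."""
    return relu(x @ w + b)

def skip_layer(x, w, b):
    """Skip layer: the layer input is added back to the layer output, F(x) + x.
    Assumes input and output have the same width so the sum is defined."""
    return plain_layer(x, w, b) + x

def skip_network(x, params):
    """Stack of skip layers; params is a list of (weights, bias) pairs."""
    for w, b in params:
        x = skip_layer(x, w, b)
    return x

rng = np.random.default_rng(0)
frames = rng.standard_normal((4, 8))  # hypothetical: 4 frames, 8 features each

# With all-zero weights every F(x) is zero, so each layer acts as an identity
# and the three-layer stack behaves like a network of depth zero.
zero_params = [(np.zeros((8, 8)), np.zeros(8)) for _ in range(3)]
identity_out = skip_network(frames, zero_params)

# With non-zero weights each layer adds its own contribution on top of its input.
random_params = [(0.1 * rng.standard_normal((8, 8)), np.zeros(8)) for _ in range(3)]
deep_out = skip_network(frames, random_params)
```

In this sketch, a layer whose weights are driven toward zero during training contributes almost nothing beyond passing its input through, which is one way to read the abstract's statement that the model pseudo-adapts its depth.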



Author information

Corresponding author

Correspondence to Shaurya Pratap Singh Sisodia.

Copyright information

© 2021 Springer Nature Singapore Pte Ltd.

Cite this paper

Sisodia, S.P.S., Sisodia, R.P.S., Bharath, K.P., Muthu, R.K. (2021). Tagging of Weakly Labeled Acoustic Data Using Skip Layer Connection Detection Classification Model. In: Komanapalli, V.L.N., Sivakumaran, N., Hampannavar, S. (eds) Advances in Automation, Signal Processing, Instrumentation, and Control. i-CASIC 2020. Lecture Notes in Electrical Engineering, vol 700. Springer, Singapore. https://doi.org/10.1007/978-981-15-8221-9_5

  • DOI: https://doi.org/10.1007/978-981-15-8221-9_5

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-15-8220-2

  • Online ISBN: 978-981-15-8221-9

  • eBook Packages: Engineering, Engineering (R0)
