Using Deep Rectifier Neural Nets and Probabilistic Sampling for Topical Unit Classification

Cognitive Infocommunications, Theory and Applications

Part of the book series: Topics in Intelligent Engineering and Informatics (TIEI, volume 13)

Abstract

Topical units (TUs) play an important role both in human-computer interaction and in interaction among humans. This motivated our investigation of topical unit recognition. To lay the foundations for this, we first create a classifier for topical units using Deep Neural Nets with rectifier units (DRNs) and the probabilistic sampling method. Evaluating the resulting models on the HuComTech corpus using the Unweighted Average Recall (UAR) measure, we find that this method produces significantly higher classification scores than either Support Vector Machines or DRNs trained without probabilistic sampling. We also examine experimentally how many topical unit labels should be used. We demonstrate that not having to discriminate between variations of topic change leads to better classification scores. However, some applications may require this distinction, and for such cases we introduce a hierarchical classification method. Results show that this method increases the UAR scores by more than 7%.
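The two ingredients named in the abstract, UAR and probabilistic sampling, can be illustrated with a minimal sketch. This is not the authors' implementation. UAR is computed as the mean of the per-class recalls. For probabilistic sampling the abstract gives no formula, so the sketch assumes a common formulation in which a class c is drawn with probability lam / K + (1 - lam) * prior(c), and an example of that class is then drawn uniformly. The parameter name lam and all function names are hypothetical.

```python
import random
from collections import Counter, defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls; every class counts equally, whatever its size."""
    totals = Counter(y_true)
    hits = Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            hits[t] += 1
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


def class_sampling_probs(labels, lam):
    """Assumed formulation: P(c) = lam / K + (1 - lam) * prior(c)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: lam / k + (1 - lam) * counts[c] / n for c in counts}


def sample_indices(labels, lam, size, seed=0):
    """Draw a class first, then a uniformly random example of that class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, c in enumerate(labels):
        by_class[c].append(i)
    probs = class_sampling_probs(labels, lam)
    classes = sorted(probs)  # assumes the labels are sortable
    weights = [probs[c] for c in classes]
    drawn = rng.choices(classes, weights=weights, k=size)
    return [rng.choice(by_class[c]) for c in drawn]
```

Under this assumed formulation, lam = 0 reproduces the original class distribution of the training data, while lam = 1 draws every class equally often. A classifier that always predicts the majority class can reach high accuracy on skewed data, but its UAR on a two-class task is only 0.5.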

Notes

  1. The current study is an extended version of this conference paper.

  2. The difference between the two is due to the different lengths of the occurrences of the various topical units in the annotation. A 3.2 s long topic elaboration and a 32 s long topic elaboration both count as one occurrence in the annotation, but the former is associated with 10 frames, while the latter is associated with 100 frames.

  3. Before applying our machine learning methods, we normalized all non-binary features to have zero mean and unit variance.
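Notes 2 and 3 can be made concrete with a short sketch. It is an illustration under stated assumptions, not the authors' preprocessing code. The frame length of 0.32 s is inferred from the example in Note 2 (3.2 s corresponds to 10 frames). Using the population standard deviation, treating any column that holds only 0 and 1 as binary, and all function names are assumptions.

```python
from statistics import fmean, pstdev

FRAME_SEC = 0.32  # inferred from Note 2: 3.2 s -> 10 frames


def frames_for(duration_sec):
    """Number of frames covered by one annotated occurrence."""
    return round(duration_sec / FRAME_SEC)


def standardize(column):
    """Scale a non-binary feature column to zero mean and unit variance."""
    mean, std = fmean(column), pstdev(column)
    if std == 0:  # constant column: only centre it
        return [x - mean for x in column]
    return [(x - mean) / std for x in column]


def normalize_features(rows):
    """Standardize every column except those holding only 0/1 values."""
    columns = list(zip(*rows))
    scaled = [
        list(col) if set(col) <= {0, 1} else standardize(col)
        for col in columns
    ]
    return [list(row) for row in zip(*scaled)]
```

Both example occurrences in Note 2 count once at the annotation level, yet frames_for assigns them 10 and 100 frames respectively, which is why class proportions differ between occurrence counts and frame counts.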

References

  1. Abuczki A (2011) A multimodal analysis of the sequential organization of verbal and nonverbal interaction. Argumentum 7:261–279

  2. Abuczki A, Baiat GE (2013) An overview of multimodal corpora, annotation tools and schemes. Argumentum 9:86–98

  3. Babbar R, Partalas I, Gaussier E, Amini MR (2013) On flat versus hierarchical classification in large-scale taxonomies. In: Advances in neural information processing systems, vol 26. Curran Associates, Inc., pp 1824–1832

  4. Baiat GE, Szekrényes I (2012) Topic change detection based on prosodic cues in unimodal setting. In: Proceedings of the CogInfoCom, pp 527–530

  5. Baranyi P, Csapó A, Sallai G (2015) Cognitive infocommunications (CogInfoCom). Springer International, Cham, Switzerland

  6. Chang CC, Lin CJ (2011) LIBSVM: a library for support vector machines. ACM Trans Intell Syst Technol 2:27:1–27:27

  7. Demichelis P, Rinotti A, Martin JCD (2005) Performance analysis of distributed speech recognition over 802.11 wireless networks on the Timit database. In: Proceedings of the VTC, pp 2751–2754

  8. Grósz T, Nagy I (2014) Document classification with deep rectifier neural networks and probabilistic sampling. In: Proceedings of the TSD, pp 108–115

  9. Hunyadi L, Szekrényes I, Borbély A, Kiss H (2012) Annotation of spoken syntax in relation to prosody and multimodal pragmatics. In: Proceedings of the CogInfoCom, pp 537–541

  10. Silla CN Jr, Freitas AA (2009) Novel top-down approaches for hierarchical classification and their application to automatic music genre classification. In: Proceedings of the IEEE SMC, pp 182–196

  11. Lee KF, Hon HW (1989) Speaker-independent phone recognition using Hidden Markov Models. IEEE Trans Acoust Speech Signal Process 37(11):1641–1648

  12. Kovács G, Grósz T, Váradi T (2016) Topical unit classification using deep neural nets and probabilistic sampling. In: Proceedings of the CogInfoCom, pp 199–204

  13. Lawrence S, Burns I, Back A, Tsoi AC, Giles CL (1998) Neural network classification and prior class probabilities. In: Orr GB, Müller KR (eds) Neural networks: tricks of the trade. Springer, Heidelberg, Berlin, pp 299–313

  14. Pápay K, Szeghalmy S, Szekrényes I (2011) HuComTech multimodal corpus annotation. Argumentum 7:330–347

  15. Rosenberg A (2012) Classifying skewed data: importance weighting to optimize average recall. In: Proceedings of the Interspeech, pp 2242–2245

  16. Sapru A, Bourlard H (2014) Detecting speaker roles and topic changes in multiparty conversations using latent topic models. In: Proceedings of the Interspeech, pp 2882–2886

  17. Schmidt AP, Stone TKM (2013) Detection of topic change in IRC chat logs. http://www.trevorstone.org/school/ircsegmentation.pdf

  18. Shriberg E, Stolcke A, Hakkani-Tür D, Tür G (2000) Prosody-based automatic segmentation of speech into sentences and topics. Speech Commun 32(1–2):127–154

  19. Silla CN Jr, Freitas AA (2011) A survey of hierarchical classification across different application domains. Data Min Knowl Discov 22(1–2):31–72

  20. Su J (2011) An analysis of content-free dialogue representation, supervised classification methods and evaluation metrics for meeting topic segmentation. PhD thesis, Trinity College

  21. Szekrényes I (2015) Prosotool, a method for automatic annotation of fundamental frequency. In: Proceedings of the CogInfoCom, pp 291–296

  22. Tóth L (2013) Phone recognition with deep sparse rectifier neural networks. In: Proceedings of the ICASSP, pp 6985–6989

  23. Tóth L, Kocsor A (2005) Training HMM/ANN hybrid speech recognizers by probabilistic sampling. In: Proceedings of the ICANN, pp 597–603

  24. Tür G, Hakkani-Tür DZ, Stolcke A, Shriberg E (2001) Integrating prosodic and lexical cues for automatic topic segmentation. CoRR 31–57

  25. Zellers M, Post B (2009) Fundamental frequency and other prosodic cues to topic structure. In: Workshop on the discourse-prosody interface, pp 377–386

Acknowledgements

The research reported in the paper was conducted with the support of the Hungarian Scientific Research Fund (OTKA) grant # K116938. Tamás Grósz was supported by the ÚNKP-16-3 new national excellence programme of the Ministry of Human Capacities.

Corresponding author

Correspondence to György Kovács.

Copyright information

© 2019 Springer International Publishing AG, part of Springer Nature

Cite this chapter

Kovács, G., Grósz, T., Váradi, T. (2019). Using Deep Rectifier Neural Nets and Probabilistic Sampling for Topical Unit Classification. In: Klempous, R., Nikodem, J., Baranyi, P. (eds) Cognitive Infocommunications, Theory and Applications. Topics in Intelligent Engineering and Informatics, vol 13. Springer, Cham. https://doi.org/10.1007/978-3-319-95996-2_1
