Attention-Based Neural Text Segmentation

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10772)


Text segmentation plays an important role in various Natural Language Processing (NLP) tasks such as summarization, context understanding, document indexing and document noise removal. Previous methods for this task require manual feature engineering, large amounts of memory and long execution times. To the best of our knowledge, this paper is the first to present a supervised neural approach for text segmentation. Specifically, we propose an attention-based bidirectional LSTM model in which sentence embeddings are learned using CNNs and the segments are predicted based on contextual information. This model can automatically handle variable-sized context information. Compared to the existing competitive baselines, the proposed model shows a performance improvement of approximately 7% in WinDiff score on three benchmark datasets.
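
To make the reported metric concrete, the following is a minimal sketch, assuming that "WinDiff" refers to the WindowDiff metric of Pevzner and Hearst (2002). WindowDiff is an error rate, so lower values indicate better segmentation. The function name `window_diff`, the 0/1 boundary encoding (one flag per gap between consecutive sentences) and the example inputs are assumptions made for illustration; this is not the authors' evaluation code.

```python
from typing import Sequence


def window_diff(reference: Sequence[int], hypothesis: Sequence[int], k: int) -> float:
    """Return the WindowDiff error between two segmentations (lower is better).

    Each sequence holds one 0/1 flag per gap between consecutive sentences;
    1 marks a segment boundary at that gap (an assumed encoding).
    """
    if len(reference) != len(hypothesis):
        raise ValueError("reference and hypothesis must have the same length")
    n = len(reference)
    if not 0 < k <= n:
        raise ValueError("k must satisfy 0 < k <= len(reference)")

    num_windows = n - k + 1
    errors = 0
    for i in range(num_windows):
        # Count boundaries inside the window of k consecutive gaps.
        ref_count = sum(reference[i:i + k])
        hyp_count = sum(hypothesis[i:i + k])
        # A window is penalised whenever the two counts disagree.
        if ref_count != hyp_count:
            errors += 1
    return errors / num_windows


if __name__ == "__main__":
    # Hypothetical example: the first predicted boundary is one gap too late.
    reference = [0, 0, 1, 0, 0, 0, 1, 0, 0]
    hypothesis = [0, 0, 0, 1, 0, 0, 1, 0, 0]
    print(window_diff(reference, hypothesis, k=3))
```

The window size `k` is commonly set to half the average segment length in the reference segmentation; the value used in the example is arbitrary.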


Texture Segmentation; Long Short-term Memory (LSTM); Sentence Embedding; Bidirectional LSTM (BiLSTM); Segmentation Boundaries
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. IIIT-H, Hyderabad, India
