
A hybrid approach to recognize generic sections in scholarly documents

  • Original Paper
International Journal on Document Analysis and Recognition (IJDAR)


Discourse parsing of scholarly documents underpins standardizing how they are written, understanding their content, and quickly locating and extracting specific information from them. As the volume of scholarly documents continues to grow, analyzing them automatically, quickly, and effectively has become an active research topic. In this paper, we propose a hybrid model that considers both section headers and body text to recognize generic sections in scholarly documents automatically. We conduct a comprehensive analysis of the semantic difference between short phrases and long narrative text chunks on the SectLabel dataset. The experimental results show that our model achieves an \(F_{1}\) score of 91.67% in generic section recognition, which is better than the baseline.
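As a rough illustration of what combining header and body evidence can look like, the sketch below fuses two per-label probability distributions with a weighted average and computes a per-label \(F_{1}\) score. It is not the model described in this paper: the label set, the `alpha` weight, and the function names `fuse`, `predict`, and `f1_score` are all assumptions made for the example, and the input probabilities are invented.

```python
from typing import Dict, Sequence

# Hypothetical label set; the paper's actual generic-section inventory is not listed here.
LABELS = ("introduction", "method", "results", "conclusion")


def fuse(header_probs: Dict[str, float],
         body_probs: Dict[str, float],
         alpha: float = 0.5) -> Dict[str, float]:
    """Weighted average of header-based and body-based label probabilities.

    alpha is an assumed mixing weight: 1.0 trusts only the header,
    0.0 trusts only the body text.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    return {
        label: alpha * header_probs.get(label, 0.0)
        + (1.0 - alpha) * body_probs.get(label, 0.0)
        for label in LABELS
    }


def predict(header_probs: Dict[str, float],
            body_probs: Dict[str, float],
            alpha: float = 0.5) -> str:
    """Return the label with the highest fused probability."""
    fused = fuse(header_probs, body_probs, alpha)
    return max(fused, key=fused.get)


def f1_score(gold: Sequence[str], predicted: Sequence[str], label: str) -> float:
    """F1 for one label: harmonic mean of precision and recall."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Invented scores for a single section, for illustration only.
header = {"introduction": 0.2, "method": 0.7, "results": 0.05, "conclusion": 0.05}
body = {"introduction": 0.1, "method": 0.3, "results": 0.5, "conclusion": 0.1}
print(predict(header, body, alpha=0.5))
```

With equal weighting the header evidence dominates in this toy input; setting `alpha=0.0` makes the prediction depend on the body text alone.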




  1. Pdftotext command line tools



Author information

Authors and Affiliations


Corresponding author

Correspondence to Shoubin Li.


About this article


Cite this article

Li, S., Wang, Q. A hybrid approach to recognize generic sections in scholarly documents. IJDAR 24, 339–348 (2021).
