Generating a related work section for scientific papers: an optimized approach with adopting problem and method information

Scientometrics

Abstract

The rapid growth of scientific publications has made writing related work sections increasingly laborious. In this paper, we propose a fully automated approach that generates related work sections with a seq2seq neural network. Our main goal is to improve the abstractive generation of related work by introducing problem and method information, which serves as a pivot connecting the previous works discussed in a related work section and has been ignored by existing studies. Specifically, we employ a title-generation strategy to automatically obtain problem and method information from the given references, and we add this information as an additional feature to enhance generation. To verify the effectiveness and feasibility of our approach, we conduct a comparative experiment on publicly available datasets using several common neural summarizers. The experimental results indicate that introducing problem and method information improves the generation of related work and that our approach substantially outperforms the informed baseline on ROUGE-1 and ROUGE-L. A case study shows that the problem and method information yields considerable topic coherence between the generated related work section and the original paper.
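As a rough illustration of the two ideas named in the abstract, the sketch below shows one way a problem and method phrase could be attached to the source text as an extra feature, and how ROUGE-1 and ROUGE-L F1 scores can be computed. The function names, the `[SEP]` separator, and the concatenation order are assumptions made for illustration and are not taken from the paper. The ROUGE functions are a simplified variant (lowercased whitespace tokens, no stemming, F1 only) and are not a substitute for the official toolkit.

```python
from collections import Counter
from typing import Sequence


def build_augmented_source(problem_method: str,
                           reference_abstracts: Sequence[str],
                           sep: str = " [SEP] ") -> str:
    """Prepend a problem/method phrase to the concatenated reference abstracts.

    Hypothetical input format: the separator and the ordering are
    illustrative assumptions, not the configuration used in the paper.
    """
    return sep.join([problem_method, *reference_abstracts])


def _f1(matches: int, cand_len: int, ref_len: int) -> float:
    """Harmonic mean of precision and recall; 0.0 when nothing matches."""
    if matches == 0:
        return 0.0
    precision, recall = matches / cand_len, matches / ref_len
    return 2 * precision * recall / (precision + recall)


def rouge_1_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1 with clipped counts."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    overlap = sum((Counter(cand) & Counter(ref)).values())
    return _f1(overlap, len(cand), len(ref))


def lcs_length(a: Sequence[str], b: Sequence[str]) -> int:
    """Length of the longest common subsequence via dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate: str, reference: str) -> float:
    """F1 based on the longest common subsequence of tokens."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    return _f1(lcs_length(cand, ref), len(cand), len(ref))
```

For example, `rouge_1_f1("a b c", "c b a")` is 1.0 because all unigrams match, while `rouge_l_f1` on the same pair is lower because the word order differs and ROUGE-L rewards in-order matches.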

Notes

  1. https://github.com/smalot/pdfparser.

  2. https://github.com/dwadden/dygiepp.

  3. https://huggingface.co/models.

  4. https://github.com/nlpyang/PreSumm.

  5. https://github.com/abisee/pointer-generator.

  6. https://github.com/zhongxiangboy/Improving-related-work-generation-by-introducing-problem-and-method-information.

Acknowledgements

This work was partially supported by the Major Projects of the National Social Science Foundation of China (No. 17ZDA292).

Author information

Contributions

PL: Conceptualization, Methodology, Writing—Original Draft. WL: Conceptualization, Methodology, Formal analysis, Supervision. QC: Data Curation, Writing—Review & Editing.

Corresponding author

Correspondence to Qikai Cheng.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

About this article

Cite this article

Li, P., Lu, W. & Cheng, Q. Generating a related work section for scientific papers: an optimized approach with adopting problem and method information. Scientometrics 127, 4397–4417 (2022). https://doi.org/10.1007/s11192-022-04458-8

  • DOI: https://doi.org/10.1007/s11192-022-04458-8
