Volume 121, Issue 3, pp 1563–1582

Characterizing human summarization strategies for text reuse and transformation in literature review writing

  • Kokil Jaidka
  • Christopher S. G. Khoo
  • Jin-Cheon Na


Citations are useful signals of information salience, but little research has identified the patterns of information selection, transformation, and organization that they reflect. This paper investigated the summarization strategies followed in writing the literature review sections of information science research papers. We found that the strategies differ between the two major styles of literature review writing: descriptive and integrative. Descriptive literature reviews, which focus on individual descriptions of research papers, are more likely to reference the Method and Result sections of the cited paper and to copy-paste the referenced text. In contrast, integrative literature reviews, which synthesize the main ideas from many papers, contain more critiques and focus mainly on the Conclusion sections. These findings, based on a hand-annotated dataset, have the potential to scale up into a transformation-invariant neural architecture for scientific summarization that can generate different summaries of the input text with integrative or descriptive characteristics.


Literature review writing; Scientific summarization; Discourse analysis; Citance; Abstracting; Citation analysis


Supplementary material

11192_2019_3250_MOESM1_ESM.pdf (173 kb)
Supplementary material 1 (PDF 172 kb)



Copyright information

© Akadémiai Kiadó, Budapest, Hungary 2019

Authors and Affiliations

  1. Wee Kim Wee School of Communication and Information, Nanyang Technological University, Singapore
