Characterizing human summarization strategies for text reuse and transformation in literature review writing

Jaidka, Kokil; Khoo, Christopher S. G.; Na, Jin-Cheon

doi:10.1007/s11192-019-03250-5

Published: 08 October 2019

Volume 121, pages 1563–1582 (2019)

Scientometrics

Abstract

Citations are useful signals of information salience, but little research has identified the patterns of information selection, transformation, and organization that they embody. This paper investigated the summarization strategies followed in writing the literature review sections of information science research papers. We found that the strategies differ between the two major styles of literature review writing: descriptive and integrative. Descriptive literature reviews, which focus on individual descriptions of research papers, are more likely to reference the Method and Result sections of the cited paper and to copy-paste the referenced text. In contrast, integrative literature reviews, which synthesize the main ideas of many papers, contain more critiques and focus mainly on the Conclusion sections. These findings, based on a hand-annotated dataset, have the potential to scale up into a transformation-invariant neural architecture for scientific summarization that can generate different summaries of the input text with integrative or descriptive characteristics.

Author information

Authors and Affiliations

Wee Kim Wee School of Communication and Information, Nanyang Technological University, 31 Nanyang Link, Singapore, 637718, Singapore
Kokil Jaidka, Christopher S. G. Khoo & Jin-Cheon Na

Corresponding author

Correspondence to Kokil Jaidka.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (PDF 172 kb)

About this article

Cite this article

Jaidka, K., Khoo, C.S.G. & Na, JC. Characterizing human summarization strategies for text reuse and transformation in literature review writing. Scientometrics 121, 1563–1582 (2019). https://doi.org/10.1007/s11192-019-03250-5

Received: 15 April 2019
Published: 08 October 2019
Issue Date: December 2019
DOI: https://doi.org/10.1007/s11192-019-03250-5

Keywords

Mathematics Subject Classification

62H20
