Development and Evaluation of a Multi-document Summarization Method Focusing on Research Concepts and Their Research Relationships

  • Shiyan Ou
  • Christopher S. G. Khoo
  • Dion H. Goh
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3815)

Abstract

This paper reports the design and evaluation of a method for summarizing a set of related research abstracts. The method extracts research concepts and their research relationships from different abstracts, integrates the extracted information across abstracts, and presents the integrated information in a Web-based interface as a multi-document summary. The study focused on sociology dissertation abstracts, but the method can be extended to other research abstracts. A user study assessed the quality and usefulness of the generated summaries against two baselines: the sentence extraction method used in MEAD and a method that extracts only research objective sentences. The results indicated that the majority of sociology researchers preferred our variable-based summary generated with the use of a taxonomy.
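
The integration step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the concepts and relationships have already been extracted as (concept, relationship, concept) triples, and the input data, the function names `integrate` and `render`, and the exact-match merging rule are all hypothetical choices made for illustration. The paper's method uses a taxonomy, which this sketch does not model.

```python
from collections import defaultdict

# Hypothetical, hand-written input: triples assumed to have been extracted
# from each abstract already. The relationships are invented for illustration.
extracted = {
    "abstract_1": [("childhood sexual abuse", "affects", "adult depression")],
    "abstract_2": [
        ("Childhood Sexual Abuse", "affects", "adult depression"),
        ("social support", "moderates", "adult depression"),
    ],
}


def integrate(extracted):
    """Merge identical triples across abstracts (case-insensitive exact match)
    and record which abstracts report each one."""
    merged = defaultdict(set)
    for doc_id, triples in extracted.items():
        for source, relation, target in triples:
            key = (source.lower(), relation.lower(), target.lower())
            merged[key].add(doc_id)
    # Triples reported by more abstracts come first; ties are ordered alphabetically.
    return sorted(merged.items(), key=lambda item: (-len(item[1]), item[0]))


def render(integrated):
    """Format each integrated triple as one line of a plain-text summary."""
    lines = []
    for (source, relation, target), docs in integrated:
        doc_list = ", ".join(sorted(docs))
        lines.append(f"{source} {relation} {target} [{len(docs)} abstract(s): {doc_list}]")
    return lines


if __name__ == "__main__":
    for line in render(integrate(extracted)):
        print(line)
```

Ranking triples by the number of abstracts that report them is one simple way to surface findings shared across studies; a real system would also need to merge concepts that are worded differently.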

Keywords

Childhood Sexual Abuse; Digital Library; Noun Phrase; Dissertation Abstract; Research Concept

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Shiyan Ou (1)
  • Christopher S. G. Khoo (1)
  • Dion H. Goh (1)
  1. Division of Information Studies, School of Communication & Information, Nanyang Technological University, Singapore