Automatic Text Summarization Using a Machine Learning Approach

  • Conference paper
Advances in Artificial Intelligence (SBIA 2002)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2507)

Abstract

In this paper we address the automatic summarization task. Recent research on extractive-summary generation employs heuristics, but few works indicate how to select the relevant features. We present a summarization procedure based on trainable Machine Learning algorithms that employs a set of features extracted directly from the original text. These features are of two kinds: statistical, based on the frequency of some elements in the text; and linguistic, extracted from a simplified argumentative structure of the text. We also present computational results obtained by applying our summarizer to some well-known text databases, and we compare these results with some baseline summarization procedures.
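To make the idea of a trainable extractive summarizer concrete, the following is a minimal sketch, not the procedure used in the paper. The abstract does not name the features or the learning algorithms, so everything here is an assumption made for illustration: the three binary statistical features (frequent words, first position, minimum length), the choice of a Bernoulli Naive Bayes classifier, and all function names. The linguistic features derived from argumentative structure are not sketched at all.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    return re.findall(r"[a-z]+", sentence.lower())


def sentence_features(text):
    """Return (sentences, features): one tuple of binary features per sentence."""
    sentences = split_sentences(text)
    if not sentences:
        return [], []
    tokens = [tokenize(s) for s in sentences]
    tf = Counter(w for toks in tokens for w in toks)
    # Mean document-level term frequency of the words in each sentence.
    mean_tf = [sum(tf[w] for w in toks) / len(toks) if toks else 0.0
               for toks in tokens]
    avg = sum(mean_tf) / len(mean_tf)
    features = []
    for i, toks in enumerate(tokens):
        features.append((
            mean_tf[i] >= avg,  # hypothetical feature: uses frequent words
            i == 0,             # hypothetical feature: opens the document
            len(toks) >= 5,     # hypothetical feature: not too short
        ))
    return sentences, features


def train_naive_bayes(feature_rows, labels):
    """Fit a Bernoulli Naive Bayes model with Laplace smoothing.

    labels[i] is True when sentence i belongs to the reference summary.
    """
    n_feats = len(feature_rows[0])
    model = {}
    for c in (True, False):
        rows = [f for f, y in zip(feature_rows, labels) if y == c]
        prior = (len(rows) + 1) / (len(labels) + 2)
        probs = [(sum(r[j] for r in rows) + 1) / (len(rows) + 2)
                 for j in range(n_feats)]
        model[c] = (prior, probs)
    return model


def log_score(model, feats, c):
    prior, probs = model[c]
    return math.log(prior) + sum(
        math.log(p if f else 1.0 - p) for f, p in zip(feats, probs))


def summarize(model, text, k=1):
    """Pick the k sentences most likely to be summary sentences, in document order."""
    sentences, feats = sentence_features(text)
    margin = [log_score(model, f, True) - log_score(model, f, False)
              for f in feats]
    top = sorted(range(len(sentences)), key=lambda i: margin[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

In use, one would call `sentence_features` on training documents, pair each sentence with a label saying whether it appears in a human-written extract, fit a model with `train_naive_bayes`, and then call `summarize(model, new_text, k)` on unseen text. Framing the task as per-sentence classification is what makes the summarizer trainable: changing the feature set or swapping in another classifier leaves the rest of the pipeline untouched.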




Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Neto, J.L., Freitas, A.A., Kaestner, C.A.A. (2002). Automatic Text Summarization Using a Machine Learning Approach. In: Bittencourt, G., Ramalho, G.L. (eds) Advances in Artificial Intelligence. SBIA 2002. Lecture Notes in Computer Science, vol 2507. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36127-8_20


  • DOI: https://doi.org/10.1007/3-540-36127-8_20

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-00124-9

  • Online ISBN: 978-3-540-36127-5

  • eBook Packages: Springer Book Archive
