Experimenting with Automatic Text Summarisation for Arabic

  • Mahmoud El-Haj
  • Udo Kruschwitz
  • Chris Fox
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6562)

Abstract

The volume of information available on the Web is increasing rapidly, and systems that can automatically summarise documents are becoming ever more desirable. For this reason, text summarisation has quickly grown into a major research area, as illustrated by the DUC and TAC conference series. Summarisation systems for Arabic are, however, still not as sophisticated or as reliable as those developed for languages like English. In this paper we discuss two summarisation systems for Arabic and report on a large user study performed on these systems. The first system, the Arabic Query-Based Text Summarisation System (AQBTSS), uses standard retrieval methods to match a query against a document collection and to create a summary. The second system, the Arabic Concept-Based Text Summarisation System (ACBTSS), creates a query-independent document summary. Five groups of users of different ages and educational levels participated in evaluating our systems.
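
To make the idea of query-based summarisation concrete, the following is a minimal sketch and not the AQBTSS implementation. The abstract says only that "standard retrieval methods" are used, so the choice of a TF-IDF vector space model with cosine similarity is an assumption, as are all function names (`tokenize`, `tfidf_vectors`, `cosine`, `query_based_summary`) and the parameter `k`. The tokenizer is deliberately naive and omits the normalisation and stemming that real Arabic text would need.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Naive tokenizer (hypothetical helper): splits on non-word characters.
    # Real Arabic text would need normalisation and stemming, omitted here.
    return re.findall(r"\w+", text.lower())


def tfidf_vectors(token_lists):
    # Count in how many sentences each term occurs (document frequency).
    n = len(token_lists)
    df = Counter(term for tokens in token_lists for term in set(tokens))
    # Smoothed IDF, so terms found in every sentence keep a small weight.
    idf = {term: math.log((1 + n) / (1 + count)) + 1.0
           for term, count in df.items()}
    vectors = [{t: tf * idf[t] for t, tf in Counter(tokens).items()}
               for tokens in token_lists]
    return vectors, idf


def cosine(a, b):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def query_based_summary(sentences, query, k=2):
    # Assumed approach: score each sentence against the query and
    # return the k best-scoring sentences in their original order.
    token_lists = [tokenize(s) for s in sentences]
    vectors, idf = tfidf_vectors(token_lists)
    # Query terms that never occur in the sentences are ignored.
    query_vec = {t: tf * idf[t]
                 for t, tf in Counter(tokenize(query)).items() if t in idf}
    scores = [cosine(query_vec, v) for v in vectors]
    top = sorted(range(len(sentences)),
                 key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

Calling `query_based_summary(sentences, query, k=1)` on a list of sentences returns the single sentence most similar to the query. A query-independent summariser such as ACBTSS would need a different scoring signal, which this sketch does not attempt to model.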

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Mahmoud El-Haj (1)
  • Udo Kruschwitz (1)
  • Chris Fox (1)
  1. School of Computer Science and Electronic Engineering, University of Essex, UK
