Abstract
The volume of information available on the Web is increasing rapidly, and systems that can automatically summarise documents are increasingly needed. As a result, text summarisation has quickly grown into a major research area, as illustrated by the DUC and TAC conference series. Summarisation systems for Arabic, however, are still not as sophisticated or as reliable as those developed for languages such as English. In this paper we discuss two summarisation systems for Arabic and report on a large user study performed on these systems. The first system, the Arabic Query-Based Text Summarisation System (AQBTSS), uses standard retrieval methods to map a query against a document collection and to create a summary. The second system, the Arabic Concept-Based Text Summarisation System (ACBTSS), creates a query-independent document summary. Five groups of users of different ages and educational levels participated in evaluating our systems.
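The abstract says only that AQBTSS uses "standard retrieval methods" to match a query against documents. As a minimal sketch of what query-based extractive summarisation can look like, the Python below ranks the sentences of a document by TF-IDF cosine similarity to the query and returns the best matches in their original order. The function names, the naive sentence splitter, and the smoothed idf formula are illustrative assumptions, not the authors' implementation.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # \w+ is Unicode-aware for Python 3 strings, so Arabic letters match too.
    return re.findall(r"\w+", text.lower())


def split_sentences(document):
    # Assumption: a naive split after Latin or Arabic sentence punctuation.
    parts = re.split(r"(?<=[.!?؟])\s+", document.strip())
    return [p for p in parts if p]


def tfidf_vectors(token_lists):
    # Treat every token list (sentences plus the query) as one small "document".
    n = len(token_lists)
    df = Counter()
    for tokens in token_lists:
        df.update(set(tokens))
    # Smoothed idf (an assumption) keeps every weight strictly positive.
    idf = {t: math.log((1 + n) / (1 + d)) + 1.0 for t, d in df.items()}
    return [
        {t: count * idf[t] for t, count in Counter(tokens).items()}
        for tokens in token_lists
    ]


def cosine(a, b):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def summarise(document, query, max_sentences=2):
    # Score each sentence against the query and keep the top matches.
    sentences = split_sentences(document)
    token_lists = [tokenize(s) for s in sentences] + [tokenize(query)]
    vectors = tfidf_vectors(token_lists)
    query_vec = vectors[-1]
    scored = [(cosine(v, query_vec), i) for i, v in enumerate(vectors[:-1])]
    top = sorted(scored, key=lambda pair: (-pair[0], pair[1]))[:max_sentences]
    # Drop sentences with no overlap, then restore document order.
    keep = sorted(i for score, i in top if score > 0.0)
    return [sentences[i] for i in keep]
```

Calling `summarise(document, query, max_sentences=2)` returns a list of sentences. A real Arabic system would also need language-specific preprocessing such as stemming and stop-word removal, which this sketch leaves out.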
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
El-Haj, M., Kruschwitz, U., Fox, C. (2011). Experimenting with Automatic Text Summarisation for Arabic. In: Vetulani, Z. (ed.) Human Language Technology. Challenges for Computer Science and Linguistics. LTC 2009. Lecture Notes in Computer Science, vol 6562. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20095-3_45
DOI: https://doi.org/10.1007/978-3-642-20095-3_45
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-20094-6
Online ISBN: 978-3-642-20095-3
eBook Packages: Computer Science, Computer Science (R0)