Towards an automatic extraction of synonyms for Quranic Arabic WordNet

AlMaayah, Manal; Sawalha, Majdi; Abushariah, Mohammad A. M.

doi:10.1007/s10772-015-9301-9

Published: 18 September 2015

International Journal of Speech Technology, Volume 19, pages 177–189 (2016)

Abstract

In this paper, we develop a model for the automatic extraction of synonyms, which we use to construct our Quranic Arabic WordNet (QAWN) from traditional Arabic dictionaries. The work relies on three resources. First, the Boundary Annotated Quran Corpus, which contains Quran words, part-of-speech tags, roots and other related information. Second, lexicon resources, which were used to collect a set of derived words for Quranic words. Third, traditional Arabic dictionaries, which were used to extract the meanings of words while distinguishing their different senses. The objective is to link Quranic words of similar meaning in order to generate synonym sets (synsets). To accomplish this, we represented words in a vector space model weighted by term frequency and inverse document frequency, and then computed cosine similarities between Quranic words based on the textual definitions extracted from traditional Arabic dictionaries. Words of highest similarity were grouped together to form a synset. QAWN consists of 6918 synsets constructed from about 8400 unique word senses, with an average of 5 senses per word. In our experimental evaluation, the average recall of the baseline system was 7.01%, whereas the average recall with QAWN was 34.13%, improving the recall of semantic search for Quran concepts by about 27 percentage points.
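
The pipeline described in the abstract (TF-IDF weighting of dictionary definitions, cosine similarity, grouping of the most similar words) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the toy English definitions, the particular TF-IDF variant, and the greedy threshold-based grouping are all assumptions made for illustration, since the abstract does not specify these details.

```python
import math
from collections import Counter


def tfidf_vectors(definitions):
    """Build a sparse TF-IDF vector for each word from its definition text.

    Assumed weighting: tf = count / definition length, idf = ln(N / df).
    """
    tokenized = {word: text.split() for word, text in definitions.items()}
    n_docs = len(tokenized)
    # Document frequency: number of definitions that contain each term.
    df = Counter(t for tokens in tokenized.values() for t in set(tokens))
    vectors = {}
    for word, tokens in tokenized.items():
        counts = Counter(tokens)
        vectors[word] = {
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def group_synsets(definitions, threshold=0.3):
    """Greedily group words whose definitions are similar to a seed word.

    The threshold and the greedy strategy are hypothetical choices.
    """
    vectors = tfidf_vectors(definitions)
    words = list(vectors)
    assigned = set()
    synsets = []
    for i, seed in enumerate(words):
        if seed in assigned:
            continue
        synset = [seed]
        assigned.add(seed)
        for other in words[i + 1:]:
            if other in assigned:
                continue
            if cosine(vectors[seed], vectors[other]) >= threshold:
                synset.append(other)
                assigned.add(other)
        synsets.append(synset)
    return synsets


# Toy placeholder data: hypothetical words with English stand-in definitions.
definitions = {
    "word_a": "to move quickly on foot",
    "word_b": "to move fast on foot",
    "word_c": "a large body of salt water",
}
synsets = group_synsets(definitions)
```

In this toy input, `word_a` and `word_b` share most of their definition terms and fall into one group, while `word_c` shares none and forms its own. A real system would need Arabic-specific tokenization and normalization of the dictionary text before weighting.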

Author information

Authors and Affiliations

Computer Information Systems Department, King Abdullah II School for Information Technology, The University of Jordan, Amman, Jordan
Manal AlMaayah, Majdi Sawalha & Mohammad A. M. Abushariah

Corresponding author

Correspondence to Majdi Sawalha.

About this article

Cite this article

AlMaayah, M., Sawalha, M. & Abushariah, M.A.M. Towards an automatic extraction of synonyms for Quranic Arabic WordNet. Int J Speech Technol 19, 177–189 (2016). https://doi.org/10.1007/s10772-015-9301-9

Received: 26 June 2015
Accepted: 12 August 2015
Published: 18 September 2015
Issue Date: June 2016
DOI: https://doi.org/10.1007/s10772-015-9301-9
