Defining Language-Specific Synsets in IndoWordNet: Some Theoretical and Practical Issues

Chapter in The WordNet in Indian Languages

Abstract

A WordNet is a digital network of semantically linked words organized around the synsets of a language. A synset is a set of synonyms, mostly of the same part of speech, that can potentially be interchanged in certain contexts of expression and information exchange within or across languages. The presence of synsets in a WordNet attests to the basic argument that a single word can refer to multiple concepts (i.e., polysemy) and, conversely, that several words can point to a single concept (i.e., synonymy). Based on the general notion of WordNet, it is possible to assume that synsets can be universal or language specific. In this chapter, I critically evaluate the concept of ‘synset’ in WordNet as well as the problems of defining language-specific synsets (LSSs) for the Indian languages, with special reference to Bangla. Defining LSSs, unlike defining universal synsets, is a real challenge, since the very idea of language specificity is still a fuzzy notion in lexicology, lexical knowledge representation and language understanding. Therefore, I first address the question of language specificity, then explore the existence and use of LSSs in a language, define the methods for LSS selection, and finally refer to the process of LSS generation in a language, with reference to Bangla, within a network of cross-cultural lexical percolation.
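The two-way relation between words and concepts described above can be pictured with a small sketch. This is only an illustration under stated assumptions: the synset identifiers (`S1`, `S2`, `S3`), the English member words, the glosses, and the helper names `build_word_index`, `senses`, and `synonyms` are all hypothetical and are not taken from IndoWordNet or from the chapter.

```python
from collections import defaultdict

# Hypothetical toy data: identifiers, glosses, and members are invented
# for illustration and do not come from IndoWordNet or any real WordNet.
SYNSETS = {
    "S1": {"pos": "noun", "gloss": "a financial institution",
           "words": ["bank", "depository"]},
    "S2": {"pos": "noun", "gloss": "sloping land beside a river",
           "words": ["bank", "riverside"]},
    "S3": {"pos": "noun", "gloss": "a motor vehicle with four wheels",
           "words": ["car", "automobile"]},
}


def build_word_index(synsets):
    """Map each word to the list of synset IDs it belongs to."""
    index = defaultdict(list)
    for synset_id, entry in synsets.items():
        for word in entry["words"]:
            index[word].append(synset_id)
    return dict(index)


def senses(word, index):
    """Polysemy: one word may point to several synsets (concepts)."""
    return index.get(word, [])


def synonyms(synset_id, synsets):
    """Synonymy: one synset (concept) may be expressed by several words."""
    return list(synsets[synset_id]["words"])


WORD_INDEX = build_word_index(SYNSETS)

if __name__ == "__main__":
    print(senses("bank", WORD_INDEX))   # the word "bank" has two senses here
    print(synonyms("S3", SYNSETS))      # the concept S3 has two member words
```

In this toy setup, looking up a word returns every synset it participates in, while looking up a synset returns every word that expresses it. A language-specific synset could be modelled in the same structure as an entry that has no counterpart in another language's table, though how such entries are identified is exactly the question the chapter raises.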

Author information

Corresponding author

Correspondence to Niladri Sekhar Dash.

Copyright information

© 2017 Springer Science+Business Media Singapore

About this chapter

Cite this chapter

Dash, N.S. (2017). Defining Language-Specific Synsets in IndoWordNet: Some Theoretical and Practical Issues. In: Dash, N., Bhattacharyya, P., Pawar, J. (eds) The WordNet in Indian Languages. Springer, Singapore. https://doi.org/10.1007/978-981-10-1909-8_3

  • DOI: https://doi.org/10.1007/978-981-10-1909-8_3

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-1907-4

  • Online ISBN: 978-981-10-1909-8

  • eBook Packages: Social Sciences, Social Sciences (R0)
