Context Analysis for Computer-Assisted Near-Synonym Learning

Computational and Corpus Approaches to Chinese Language Learning

Part of the book series: Chinese Language Learning Sciences (CLLS)

Abstract

Despite their similar meanings, near-synonyms may be used differently in different contexts. Second-language learners often find these differences hard to grasp in practice. This chapter introduces several context analysis techniques, such as pointwise mutual information (PMI), n-gram language models, latent semantic analysis (LSA), and independent component analysis (ICA), for verifying whether a near-synonym matches a given context. Applications can use these techniques to provide contextual information that makes it easier for learners to understand how near-synonyms differ in usage. Based on these techniques, we build a prototype computer-assisted near-synonym learning system. In experiments, we evaluate the context analysis methods on both Chinese and English sentences and compare their performance with that of several previously proposed supervised and unsupervised methods. The results show that training on independent components that contain useful contextual features with minimized term dependence improves the classifiers’ ability to discriminate among near-synonyms, thus yielding better performance.
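To make the idea of matching a near-synonym to its context concrete, here is a minimal PMI-based sketch. It is an illustration only, not the chapter's implementation: the toy corpus, the function names (`build_counts`, `pmi`, `best_candidate`), sentence-level co-occurrence counting, and the choice to score a never-co-occurring pair as 0.0 (the true value is negative infinity) are all assumptions made for this example.

```python
import math
from collections import Counter
from itertools import combinations


def build_counts(sentences):
    """Count how many sentences contain each word and each word pair."""
    word_counts = Counter()
    pair_counts = Counter()
    for sentence in sentences:
        tokens = set(sentence.lower().split())
        word_counts.update(tokens)
        pair_counts.update(frozenset(p) for p in combinations(sorted(tokens), 2))
    return word_counts, pair_counts, len(sentences)


def pmi(w1, w2, word_counts, pair_counts, n):
    """PMI(w1, w2) = log2(P(w1, w2) / (P(w1) * P(w2))).

    Assumption: pairs that never co-occur score 0.0 instead of -infinity.
    """
    joint = pair_counts[frozenset((w1, w2))]
    if joint == 0:
        return 0.0
    return math.log2(joint * n / (word_counts[w1] * word_counts[w2]))


def best_candidate(context, candidates, word_counts, pair_counts, n):
    """Pick the candidate whose summed PMI with the context words is highest."""
    scores = {
        c: sum(pmi(c, w, word_counts, pair_counts, n) for w in context)
        for c in candidates
    }
    return max(scores, key=scores.get), scores


# Hypothetical toy corpus; a real system would use a large corpus.
corpus = [
    "strong tea is popular",
    "she drinks strong tea",
    "strong coffee and strong tea",
    "a powerful computer",
    "powerful computer chips",
    "the powerful engine",
]
word_counts, pair_counts, n = build_counts(corpus)

choice, scores = best_candidate(
    ["tea"], ["strong", "powerful"], word_counts, pair_counts, n
)
print(choice, scores)
```

On this toy corpus, "strong" co-occurs with "tea" in three of six sentences while "powerful" never does, so the sketch prefers "strong" for the context word "tea". A learning tool could surface such scores to show a learner which context words favor which near-synonym.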

Corresponding author

Correspondence to Liang-Chih Yu.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter

Yu, LC., Chien, WN., Hsu, KH. (2019). Context Analysis for Computer-Assisted Near-Synonym Learning. In: Lu, X., Chen, B. (eds) Computational and Corpus Approaches to Chinese Language Learning. Chinese Language Learning Sciences. Springer, Singapore. https://doi.org/10.1007/978-981-13-3570-9_7

  • DOI: https://doi.org/10.1007/978-981-13-3570-9_7

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-13-3569-3

  • Online ISBN: 978-981-13-3570-9

  • eBook Packages: Education, Education (R0)
