Abstract
Despite their similar meanings, near-synonyms may be used differently in different contexts. Second-language learners often find such differences hard to grasp in practical use. This chapter introduces several context analysis techniques, namely pointwise mutual information (PMI), n-gram language models, latent semantic analysis (LSA), and independent component analysis (ICA), to verify whether a near-synonym matches a given context. Applications can use these techniques to provide learners with useful contextual information, making it easier for them to understand the different usages of various near-synonyms. Based on these context analysis techniques, we build a prototype computer-assisted near-synonym learning system. In experiments, we evaluate the context analysis methods on both Chinese and English sentences and compare their performance with that of several previously proposed supervised and unsupervised methods. Experimental results show that training on independent components that contain useful contextual features with minimized term dependence can improve the classifiers' ability to discriminate among near-synonyms, thus yielding better performance.
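To make the idea of context matching concrete, the following is a minimal sketch of how a PMI-based check might pick the near-synonym that best fits a context. It is an illustration under stated assumptions, not the chapter's implementation: the toy corpus, the sentence-level co-occurrence window, the zero score for unseen pairs, and the helper names `train_counts`, `pmi`, and `best_near_synonym` are all hypothetical.

```python
import math
from collections import Counter
from itertools import combinations

# Toy tokenized corpus (hypothetical data, for illustration only).
CORPUS = [
    ["strong", "tea"],
    ["strong", "tea", "cup"],
    ["powerful", "engine"],
    ["powerful", "engine", "car"],
    ["strong", "coffee"],
]


def train_counts(sentences):
    """Count words and within-sentence word pairs (one count per sentence)."""
    word_counts, pair_counts = Counter(), Counter()
    for tokens in sentences:
        unique = sorted(set(tokens))
        word_counts.update(unique)
        pair_counts.update(frozenset(p) for p in combinations(unique, 2))
    return word_counts, pair_counts, len(sentences)


def pmi(w1, w2, word_counts, pair_counts, n):
    """PMI(w1, w2) = log2(P(w1, w2) / (P(w1) * P(w2))), estimated per sentence."""
    joint = pair_counts[frozenset((w1, w2))]
    if joint == 0:
        return 0.0  # assumption: unseen pairs contribute nothing to the score
    return math.log2(joint * n / (word_counts[w1] * word_counts[w2]))


def best_near_synonym(candidates, context, word_counts, pair_counts, n):
    """Score each candidate by summing its PMI with every context word."""
    scores = {
        c: sum(pmi(c, w, word_counts, pair_counts, n) for w in context)
        for c in candidates
    }
    return max(scores, key=scores.get), scores


word_counts, pair_counts, n = train_counts(CORPUS)
choice, scores = best_near_synonym(
    ["strong", "powerful"], ["tea"], word_counts, pair_counts, n
)
print(choice, scores)
```

In this toy setting the candidate that co-occurs with the context word more often than chance receives the higher score. A realistic system would estimate the counts from a large corpus and would likely need smoothing for rare pairs.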
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
Yu, LC., Chien, WN., Hsu, KH. (2019). Context Analysis for Computer-Assisted Near-Synonym Learning. In: Lu, X., Chen, B. (eds) Computational and Corpus Approaches to Chinese Language Learning. Chinese Language Learning Sciences. Springer, Singapore. https://doi.org/10.1007/978-981-13-3570-9_7
DOI: https://doi.org/10.1007/978-981-13-3570-9_7
Published:
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-3569-3
Online ISBN: 978-981-13-3570-9
eBook Packages: Education, Education (R0)