Context Analysis for Computer-Assisted Near-Synonym Learning

Yu, Liang-Chih; Chien, Wei-Nan; Hsu, Kai-Hsiang

doi:10.1007/978-981-13-3570-9_7


Part of the book series: Chinese Language Learning Sciences (CLLS)

Abstract

Despite their similar meanings, near-synonyms may have different usages in different contexts. For second-language learners, such differences are not easily grasped in practical use. This chapter introduces several context analysis techniques, including pointwise mutual information (PMI), n-gram language models, latent semantic analysis (LSA), and independent component analysis (ICA), to verify whether near-synonyms match the given contexts. Applications can use such techniques to provide useful contextual information for learners, making it easier for them to understand the different usages of near-synonyms. Based on these context analysis techniques, we build a prototype computer-assisted near-synonym learning system. In experiments, we evaluate the context analysis methods on both Chinese and English sentences and compare their performance to several previously proposed supervised and unsupervised methods. Experimental results show that training on the independent components that contain useful contextual features with minimized term dependence can improve the classifiers’ ability to discriminate among near-synonyms, thus yielding better performance.
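As a rough illustration of the first technique named in the abstract, the sketch below scores candidate near-synonyms against a context using PMI. It is not the chapter's implementation: the toy corpus, the function names, the choice of the sentence as the co-occurrence window, and the decision to score unseen pairs as 0 are all assumptions made for this example.

```python
import math
from collections import Counter
from itertools import combinations


def build_counts(sentences):
    """Count sentence-level word occurrences and unordered co-occurrences."""
    word_counts = Counter()
    pair_counts = Counter()
    for tokens in sentences:
        unique = sorted(set(tokens))  # count each word once per sentence
        word_counts.update(unique)
        pair_counts.update(combinations(unique, 2))
    return word_counts, pair_counts, len(sentences)


def pmi(w1, w2, word_counts, pair_counts, n):
    """PMI(w1, w2) = log2(P(w1, w2) / (P(w1) * P(w2))) over n sentences."""
    a, b = sorted((w1, w2))
    joint = pair_counts[(a, b)]
    if joint == 0:
        return 0.0  # assumption: unseen pairs are treated as neutral
    return math.log2(joint * n / (word_counts[w1] * word_counts[w2]))


def choose_near_synonym(candidates, context, word_counts, pair_counts, n):
    """Return the candidate with the highest summed PMI over the context words."""
    scores = {
        c: sum(pmi(c, w, word_counts, pair_counts, n) for w in context)
        for c in candidates
    }
    return max(scores, key=scores.get), scores


# Hypothetical toy corpus of tokenized sentences (not data from the chapter).
corpus = [
    ["strong", "tea"],
    ["strong", "coffee"],
    ["strong", "tea", "cup"],
    ["powerful", "engine"],
    ["powerful", "computer"],
    ["powerful", "engine", "car"],
]
word_counts, pair_counts, n = build_counts(corpus)

# Which near-synonym fits the context "___ tea" better?
best, scores = choose_near_synonym(
    ["strong", "powerful"], ["tea"], word_counts, pair_counts, n
)
print(best, scores)
```

On a realistic corpus the counts would come from a large collection, and the co-occurrence window would normally be a fixed number of neighboring words rather than the whole sentence.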

Author information

Authors and Affiliations

Yuan Ze University, Taoyuan, Taiwan
Liang-Chih Yu & Wei-Nan Chien
Yuanze University, Taoyuan, Taiwan
Kai-Hsiang Hsu

Corresponding author

Correspondence to Liang-Chih Yu.

Editor information

Editors and Affiliations

Department of Applied Linguistics, The Pennsylvania State University, University Park, PA, USA
Xiaofei Lu
Department of Computer Science and Information Engineering, National Taiwan Normal University, Taipei, Taiwan
Berlin Chen

About this chapter

Cite this chapter

Yu, LC., Chien, WN., Hsu, KH. (2019). Context Analysis for Computer-Assisted Near-Synonym Learning. In: Lu, X., Chen, B. (eds) Computational and Corpus Approaches to Chinese Language Learning. Chinese Language Learning Sciences. Springer, Singapore. https://doi.org/10.1007/978-981-13-3570-9_7

DOI: https://doi.org/10.1007/978-981-13-3570-9_7
Published: 07 February 2019
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-3569-3
Online ISBN: 978-981-13-3570-9
eBook Packages: Education, Education (R0)
