Abstract
Lexical simplification generally replaces complex words in a sentence with simpler synonyms. Most current methods improve lexical simplification by optimizing the ranking algorithm, and their performance is limited. This paper proposes a hybrid model that merges the candidate words generated by a Context2vec neural model and a Context-aware model using a weighted average method. The model consists of four steps: candidate word generation, candidate word selection, candidate word ranking, and candidate word merging. Evaluated on standard datasets, our hybrid model outperforms a list of baseline methods, including the Context2vec method, the Context-aware method, and the state-of-the-art semantic-context ranking method, indicating its effectiveness for the community-oriented lexical simplification task.
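The merging step can be illustrated with a minimal sketch. The abstract does not give the exact weighting formula, so everything here is an assumption for illustration: the function name `merge_candidates`, the `weight_a` parameter, the convention that each model yields a word-to-score map with higher scores meaning better substitutes, and the choice to score a word missing from one model as 0.0.

```python
from typing import Dict, List, Tuple


def merge_candidates(
    scores_a: Dict[str, float],
    scores_b: Dict[str, float],
    weight_a: float = 0.5,
) -> List[Tuple[str, float]]:
    """Merge two candidate-score maps with a weighted average.

    Hypothetical helper: scores_a and scores_b stand in for the ranked
    outputs of two candidate generators (for example a Context2vec-style
    model and a context-aware model). A word absent from one map is
    treated as having score 0.0 there, which is an assumption.
    """
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError("weight_a must be in [0, 1]")
    weight_b = 1.0 - weight_a

    merged: Dict[str, float] = {}
    for word in set(scores_a) | set(scores_b):
        merged[word] = (
            weight_a * scores_a.get(word, 0.0)
            + weight_b * scores_b.get(word, 0.0)
        )

    # Highest merged score first; ties broken alphabetically for determinism.
    return sorted(merged.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    # Toy scores, invented purely for illustration.
    model_a = {"help": 0.9, "aid": 0.6}
    model_b = {"help": 0.7, "support": 0.8}
    for word, score in merge_candidates(model_a, model_b, weight_a=0.5):
        print(f"{word}\t{score:.2f}")
```

With the toy scores above and equal weights, a word proposed by both models ("help") ranks ahead of words proposed by only one. In practice the weight would presumably be tuned on held-out data, and the two models' scores would need to be on a comparable scale before averaging.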
Acknowledgements
This work was supported by the National Natural Science Foundation of China (No. 61772146) and the Natural Science Foundation of Guangdong Province (No. 2018A030310051).
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Song, J., Shen, Y., Lee, J., Hao, T. (2020). A Hybrid Model for Community-Oriented Lexical Simplification. In: Zhu, X., Zhang, M., Hong, Y., He, R. (eds) Natural Language Processing and Chinese Computing. NLPCC 2020. Lecture Notes in Computer Science, vol 12430. Springer, Cham. https://doi.org/10.1007/978-3-030-60450-9_11
DOI: https://doi.org/10.1007/978-3-030-60450-9_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-60449-3
Online ISBN: 978-3-030-60450-9
eBook Packages: Computer Science, Computer Science (R0)