A Hybrid Model for Community-Oriented Lexical Simplification

Song, Jiayin; Shen, Yingshan; Lee, John; Hao, Tianyong

doi:10.1007/978-3-030-60450-9_11

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12430)

Included in the following conference series:

CCF International Conference on Natural Language Processing and Chinese Computing

Abstract

Lexical simplification replaces complex words in a sentence with simpler synonyms. Most current methods improve lexical simplification by optimizing the ranking algorithm, and their performance is limited. This paper proposes a hybrid model that merges candidate words generated by a Context2vec neural model and a Context-aware model, based on a weighted average method. The model consists of four steps: candidate word generation, candidate word selection, candidate word ranking, and candidate word merging. Evaluated on standard datasets, our hybrid model outperforms a list of baseline methods, including the Context2vec method, the Context-aware method, and the state-of-the-art semantic-context ranking method, indicating its effectiveness in the community-oriented lexical simplification task.
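
The merging step can be illustrated with a minimal sketch. The abstract does not give the scoring details, so everything below is an assumption made for illustration: that the weighted average is applied at the merging step, the min-max normalization, the single weight parameter, the function names `normalize` and `merge_candidates`, and the example scores. It shows only the general idea of combining two ranked candidate lists by a weighted average, not the authors' implementation.

```python
from typing import Dict, List, Tuple


def normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Min-max scale scores to [0, 1] so the two models are comparable.

    Assumption: the abstract does not say how scores are made comparable.
    """
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {word: 1.0 for word in scores}
    return {word: (s - lo) / (hi - lo) for word, s in scores.items()}


def merge_candidates(
    scores_a: Dict[str, float],
    scores_b: Dict[str, float],
    weight_a: float = 0.5,
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Merge two candidate lists by a weighted average of normalized scores.

    A candidate missing from one model contributes 0.0 for that model
    (an illustrative choice, not taken from the paper).
    """
    norm_a, norm_b = normalize(scores_a), normalize(scores_b)
    merged = {
        word: weight_a * norm_a.get(word, 0.0)
        + (1.0 - weight_a) * norm_b.get(word, 0.0)
        for word in set(norm_a) | set(norm_b)
    }
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(merged.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]


if __name__ == "__main__":
    # Hypothetical scores for substitutes of a complex target word.
    context2vec_scores = {"buy": 0.9, "get": 0.6, "obtain": 0.3}
    context_aware_scores = {"buy": 0.7, "purchase": 0.8, "get": 0.5}
    for word, score in merge_candidates(context2vec_scores, context_aware_scores):
        print(f"{word}\t{score:.3f}")
```

With equal weights, a candidate proposed by both models tends to rank above one proposed by only a single model, which is one plausible motivation for merging the two lists.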

Acknowledgements

This work was supported by the National Natural Science Foundation of China (No. 61772146) and the Natural Science Foundation of Guangdong Province (No. 2018A030310051).

Author information

Authors and Affiliations

School of Computer Science, South China Normal University, Guangzhou, China
Jiayin Song, Yingshan Shen & Tianyong Hao
Department of Linguistics and Translation, City University of Hong Kong, Hong Kong, China
John Lee

Corresponding author

Correspondence to Tianyong Hao.

Editor information

Editors and Affiliations

ECE & Ingenuity Labs Research Institute, Queen’s University, Kingston, ON, Canada
Xiaodan Zhu
Department of Computer Science and Technology, Tsinghua University, Beijing, China
Min Zhang
School of Computer Science and Technology, Soochow University, Suzhou, China
Yu Hong
College of Intelligence and Computing, Tianjin University, Tianjin, China
Ruifang He

About this paper

Cite this paper

Song, J., Shen, Y., Lee, J., Hao, T. (2020). A Hybrid Model for Community-Oriented Lexical Simplification. In: Zhu, X., Zhang, M., Hong, Y., He, R. (eds) Natural Language Processing and Chinese Computing. NLPCC 2020. Lecture Notes in Computer Science, vol 12430. Springer, Cham. https://doi.org/10.1007/978-3-030-60450-9_11

DOI: https://doi.org/10.1007/978-3-030-60450-9_11
Published: 02 October 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-60449-3
Online ISBN: 978-3-030-60450-9
eBook Packages: Computer Science, Computer Science (R0)
