Modelling the Orthographic Neighbourhood for Japanese Kanji
Japanese kanji recognition experiments are typically narrowly focused and feature only native speakers as participants. It remains unclear how to apply their results to kanji similarity applications, especially since learners are much more likely to make similarity-based confusion errors. We describe an experiment to collect authentic human similarity judgements from participants at all levels of Japanese proficiency, from non-speaker to native. The data were used to construct simple similarity models for kanji, based on pixel difference and on radical cosine similarity, as a step towards genuine confusability data. The latter model, radical cosine similarity, proved the best predictor of human responses.
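The two model families named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the paper's exact definitions, so everything below is an assumption: the function names `radical_cosine` and `pixel_similarity` are hypothetical, radicals are treated as a bag of components compared by cosine similarity, and pixel difference is taken as the mean absolute difference between two equally sized greyscale renderings, turned into a similarity score.

```python
import math
from collections import Counter


def radical_cosine(radicals_a, radicals_b):
    """Cosine similarity between two kanji, each given as a list of radicals.

    Assumption: each kanji is represented as a bag (multiset) of radicals.
    """
    a, b = Counter(radicals_a), Counter(radicals_b)
    # Dot product over the radicals the two kanji share.
    dot = sum(a[r] * b[r] for r in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a kanji with no listed radicals shares nothing
    return dot / (norm_a * norm_b)


def pixel_similarity(img_a, img_b):
    """1 minus the mean absolute pixel difference of two renderings.

    Assumption: images are equally sized 2D lists of intensities in [0, 1].
    """
    if len(img_a) != len(img_b) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(img_a, img_b)
    ):
        raise ValueError("images must have the same dimensions")
    total = sum(len(row) for row in img_a)
    if total == 0:
        raise ValueError("images must not be empty")
    diff = sum(
        abs(p - q)
        for row_a, row_b in zip(img_a, img_b)
        for p, q in zip(row_a, row_b)
    )
    return 1.0 - diff / total


# Hypothetical decompositions, for illustration only.
print(radical_cosine(["日", "寺"], ["言", "寺"]))
print(pixel_similarity([[0, 1], [1, 0]], [[0, 1], [0, 0]]))
```

Under these assumptions, two kanji that share one of their two components score about 0.5 on the radical measure, while the pixel measure depends entirely on how the glyphs are rendered and aligned.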
Keywords: Radical Model, Native Speaker, Chinese Character, Stroke Group, Shared Radical