Abstract
Semantic similarity plays an important role in a number of applications, including information extraction, information retrieval, document clustering and ontology learning. Most work has concentrated on English and other European languages; for the Thai language, however, there has been no research on word semantic similarity. This paper presents an experiment and benchmark data sets investigating the application of a WordNet-based machine measure to Thai similarity. Because there is no functioning Thai WordNet, we also investigate the use of English WordNet with machine translation of Thai words.
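The idea of scoring Thai word pairs through an English lexical taxonomy can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the tiny `HYPERNYM` taxonomy stands in for English WordNet, the `THAI_TO_ENGLISH` lookup stands in for a machine translation step, and the inverse path-length score is a generic taxonomy measure, not necessarily the measure evaluated in the paper.

```python
# Toy hypernym taxonomy (child -> parent). A hypothetical stand-in for
# English WordNet; the real resource is far larger and handles word senses.
HYPERNYM = {
    "dog": "canine",
    "canine": "mammal",
    "cat": "feline",
    "feline": "mammal",
    "mammal": "animal",
    "bird": "animal",
    "animal": "entity",
    "car": "vehicle",
    "vehicle": "entity",
}

# Hypothetical Thai -> English lookup standing in for machine translation.
THAI_TO_ENGLISH = {
    "สุนัข": "dog",
    "แมว": "cat",
    "นก": "bird",
    "รถยนต์": "car",
}


def path_to_root(word):
    """Return the chain of nodes from a word up to the taxonomy root."""
    path = [word]
    while path[-1] in HYPERNYM:
        path.append(HYPERNYM[path[-1]])
    return path


def path_length(word_a, word_b):
    """Count the edges between two words via their lowest common ancestor."""
    depth_from_a = {node: i for i, node in enumerate(path_to_root(word_a))}
    for steps_from_b, node in enumerate(path_to_root(word_b)):
        if node in depth_from_a:
            return depth_from_a[node] + steps_from_b
    raise ValueError(f"no common ancestor for {word_a!r} and {word_b!r}")


def english_similarity(word_a, word_b):
    """Inverse path-length similarity in (0, 1]; identical words score 1."""
    return 1.0 / (1.0 + path_length(word_a, word_b))


def thai_similarity(thai_a, thai_b):
    """Translate both Thai words, then score them in the English taxonomy."""
    return english_similarity(THAI_TO_ENGLISH[thai_a], THAI_TO_ENGLISH[thai_b])


if __name__ == "__main__":
    for pair in [("สุนัข", "แมว"), ("สุนัข", "รถยนต์")]:
        print(pair, round(thai_similarity(*pair), 3))
```

In this sketch the translation step is the weak point: a Thai word with several senses, or one the lookup does not cover, yields a wrong or missing score, which is why benchmark data with human ratings is needed to judge such a pipeline.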
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Osathanunkul, K., O’Shea, J., Bandar, Z., Crockett, K. (2011). Semantic Similarity Measures for the Development of Thai Dialog System. In: O’Shea, J., Nguyen, N.T., Crockett, K., Howlett, R.J., Jain, L.C. (eds) Agent and Multi-Agent Systems: Technologies and Applications. KES-AMSTA 2011. Lecture Notes in Computer Science(), vol 6682. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22000-5_56
DOI: https://doi.org/10.1007/978-3-642-22000-5_56
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-21999-3
Online ISBN: 978-3-642-22000-5
eBook Packages: Computer Science (R0)