Abstract
Derivational relations are an important part of the lexical semantics system in many languages, especially those with rich inflection. They represent a wide variety of semantic oppositions. Analysis of morphological word forms in terms of prefixes and suffixes provides only limited information about their semantics. We propose a method for the semantic classification of potential derivational pairs. The method is based on supervised learning, but requires only a list of word pairs assigned to the derivational relations. Classification is based on a combination of features describing the distribution of a derivative and its derivational base in a large corpus, together with their morphological and morpho-syntactic properties. The method does not use patterns based on close co-occurrence of a derivative and its base. Two classification schemes were evaluated: a multiclass classifier and a cascade of binary classifiers. Both performed well in experiments on selected nominal derivational relations.
Partially financed by the Polish Ministry of Education and Science, Project N N516 068637.
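To make the cascade scheme mentioned in the abstract more concrete, the following is a minimal sketch rather than the authors' implementation. It assumes each word pair has already been turned into a numeric feature vector, uses a toy nearest-centroid rule as a stand-in for whatever learner the paper actually employs, and assumes that each stage is trained only on the examples not claimed by earlier stages. The relation names (relation_a, relation_b), the feature values, and all function names are hypothetical.

```python
from math import dist


def centroid(rows):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


class CentroidBinary:
    """Toy binary classifier (a stand-in, not the paper's learner):
    accepts a vector if it lies closer to the positive centroid."""

    def fit(self, pos, neg):
        self.pos_c = centroid(pos)
        self.neg_c = centroid(neg)
        return self

    def accepts(self, x):
        return dist(x, self.pos_c) < dist(x, self.neg_c)


def train_cascade(examples, relation_order):
    """Train one binary stage per relation, in the given order.

    examples: list of (feature_vector, label) pairs.
    Assumption: each stage sees only examples left over by earlier stages.
    """
    stages = []
    remaining = list(examples)
    for rel in relation_order:
        pos = [x for x, y in remaining if y == rel]
        neg = [x for x, y in remaining if y != rel]
        if not pos or not neg:
            continue  # nothing to separate at this stage
        stages.append((rel, CentroidBinary().fit(pos, neg)))
        remaining = [(x, y) for x, y in remaining if y != rel]
    return stages


def classify(stages, x, fallback="none"):
    """Return the relation of the first stage that accepts x."""
    for rel, clf in stages:
        if clf.accepts(x):
            return rel
    return fallback


# Hypothetical two-dimensional feature vectors for illustration only.
examples = [
    ([1.0, 0.0], "relation_a"), ([0.9, 0.1], "relation_a"),
    ([0.0, 1.0], "relation_b"), ([0.1, 0.9], "relation_b"),
    ([0.0, 0.0], "none"), ([0.1, 0.1], "none"),
]
relation_order = ["relation_a", "relation_b"]
stages = train_cascade(examples, relation_order)

print(classify(stages, [0.8, 0.2]))
print(classify(stages, [0.2, 0.8]))
print(classify(stages, [0.05, 0.0]))
```

A multiclass classifier would instead choose among all labels in a single decision. In a cascade, a pair rejected by every stage falls through to the fallback label, which is one way to filter out pairs that are only formally, not semantically, related.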
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Piasecki, M., Ramocki, R., Minda, P. (2012). Corpus-Based Semantic Filtering in Discovering Derivational Relations. In: Ramsay, A., Agre, G. (eds) Artificial Intelligence: Methodology, Systems, and Applications. AIMSA 2012. Lecture Notes in Computer Science, vol 7557. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33185-5_2
DOI: https://doi.org/10.1007/978-3-642-33185-5_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33184-8
Online ISBN: 978-3-642-33185-5
eBook Packages: Computer Science, Computer Science (R0)