Polysemous Verb Classification Using Subcategorization Acquisition and Graph-Based Clustering

  • Fumiyo Fukumoto
  • Yoshimi Suzuki
  • Kazuyuki Yamashita
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6562)

Abstract

This paper presents a method for classifying Japanese polysemous verbs. We use a graph-based unsupervised clustering algorithm that detects the spin configuration minimizing the energy of the material. Comparing the global and local minima of the energy function makes it possible to detect spins (nodes) that belong to more than one cluster. We apply the algorithm to cluster polysemous verbs. In addition, we use link analysis to detect subcategorization frames, which are then used to calculate the distributional similarity between verbs. The evaluation is carried out on a set collected from a Japanese dictionary, and the results suggest that polysemy, rather than being an obstacle to word sense discovery and identification, may actually be of benefit.
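
The idea of reading overlapping cluster membership off competing energy minima can be illustrated with a toy sketch. The code below is not the authors' implementation: it assumes a Potts-style energy with a constant edge-density null model, only two spin states, and a brute-force search instead of an annealing procedure. It also treats configurations whose energy lies within a tolerance of the global minimum as the minima to compare. The graph `EDGES` and all function names are hypothetical and chosen purely for illustration.

```python
from itertools import combinations, product

# Hypothetical toy graph: two triangles {0,1,2} and {4,5,6},
# with node 3 linked equally to both (edges stored as (i, j), i < j).
EDGES = {(0, 1), (0, 2), (1, 2), (4, 5), (4, 6), (5, 6),
         (1, 3), (2, 3), (3, 4), (3, 5)}


def potts_energy(spins, edges, n, gamma=1.0):
    """H = -sum_{i<j} (A_ij - gamma * p) * delta(s_i, s_j).

    p is the overall edge density (an assumed null model).
    """
    p = len(edges) / (n * (n - 1) / 2)
    energy = 0.0
    for i, j in combinations(range(n), 2):
        if spins[i] == spins[j]:
            a_ij = 1.0 if (i, j) in edges else 0.0
            energy -= a_ij - gamma * p
    return energy


def low_energy_configs(edges, n, q=2, tol=1e-9):
    """Enumerate all spin configurations and keep the (near-)minimal ones.

    Node 0 is pinned to spin 0 so that relabelled copies of the same
    partition are not counted twice (sufficient for q = 2).
    """
    configs = [(0,) + rest for rest in product(range(q), repeat=n - 1)]
    scored = [(potts_energy(s, edges, n), s) for s in configs]
    e_min = min(e for e, _ in scored)
    return e_min, [s for e, s in scored if e - e_min <= tol]


def ambiguous_nodes(configs):
    """Nodes whose spin differs between the retained configurations."""
    n = len(configs[0])
    return [i for i in range(n) if len({s[i] for s in configs}) > 1]


e_min, minima = low_energy_configs(EDGES, 7)
print(e_min, minima, ambiguous_nodes(minima))
```

In this toy graph the two lowest-energy configurations differ only in the spin of node 3, so node 3 is flagged as belonging to more than one cluster. Exhaustive enumeration is only feasible for very small graphs; larger graphs would need a heuristic optimizer.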

Keywords

Polysemies, Verb Classification, Soft Clustering Algorithm, Markov Random Walk Model

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Fumiyo Fukumoto (1)
  • Yoshimi Suzuki (1)
  • Kazuyuki Yamashita (2)
  1. Interdisciplinary Graduate School of Medicine and Engineering, Univ. of Yamanashi, Kofu, Japan
  2. Faculty of Education Human Sciences, Univ. of Yamanashi, Kofu, Japan
