Research on Language and Computation, Volume 8, Issue 1, pp 1–22

Exploiting Semantic Information for HPSG Parse Selection

  • Sanae Fujita
  • Francis Bond
  • Stephan Oepen
  • Takaaki Tanaka
Article

Abstract

In this article, we investigate the use of semantic information in parse selection. We show that fully disambiguated sense-based semantic features, smoothed using ontological information, are effective for parse selection. Training and testing were undertaken using definition and example sentences taken from a Japanese dictionary corpus (Hinoki), which is manually annotated with senses. A model employing both syntactic and semantic information provides better parse selection accuracy than a model using only syntactic features.

Keywords

HPSG, Parse selection, Semantic information

Copyright information

© Springer Science+Business Media B.V. 2010

Authors and Affiliations

  • Sanae Fujita (1)
  • Francis Bond (2)
  • Stephan Oepen (3)
  • Takaaki Tanaka (4)

  1. NTT Communication Science Laboratories, Nippon Telegraph and Telephone Corporation, Kyoto, Japan
  2. Division of Linguistics and Multilingual Studies, Nanyang Technological University, Singapore, Singapore
  3. Department of Informatics, University of Oslo, Oslo, Norway
  4. NTT West, Osaka, Japan
