Exploiting Semantic Information for HPSG Parse Selection

Research on Language and Computation

Abstract

In this article, we investigate the use of semantic information in parse selection. We show that fully disambiguated, sense-based semantic features, smoothed using ontological information, are effective for parse selection. Training and testing were undertaken using definition and example sentences taken from a Japanese dictionary corpus (Hinoki), which is manually annotated with senses. A model employing both syntactic and semantic information provides better parse selection accuracy than a model using only syntactic features.

References

  • Abney S. P. (1997) Stochastic attribute-value grammars. Computational Linguistics 23: 597–618

  • Agirre, E., Baldwin, T., & Martinez, D. (2008). Improving parsing and PP attachment performance with sense information. In Proceedings of the 46th annual meeting of the association for computational linguistics: ACL/HLT-2008 (pp. 317–325).

  • Baldridge J., Osborne M. (2007) Active learning and logarithmic opinion pools for HPSG parse selection. Natural Language Engineering 13(1): 1–32

  • Bikel, D. M. (2000). A statistical model for parsing and word-sense disambiguation. In ACL-2000 student research workshop (pp. 1–7). Hong Kong.

  • Bond F., Fujita S., Tanaka T. (2006) The Hinoki syntactic and semantic treebank of Japanese. Language Resources and Evaluation 40(3–4): 253–261 (Special issue on Asian language technology)

  • Bond, F., & Shirai, S. (1997). Practical and efficient organization of a large valency dictionary. In NLPRS-97 Workshop on Multilingual Information Processing. Phuket.

  • Ciaramita, M., & Altun, Y. (2006). Broad-coverage sense disambiguation and information extraction with a supersense sequence tagger. In Proceedings of the 2006 conference on empirical methods in natural language processing: EMNLP-2006 (pp. 594–602). Sydney, Australia.

  • Copestake A., Flickinger D., Pollard C., Sag I. A. (2005) Minimal recursion semantics: An introduction. Research on Language and Computation 3(4): 281–332

  • Dong, Z., & Dong, Q. (2000). http://www.keenage.com.

  • Fellbaum, C. (Ed.) (1998) WordNet: An electronic lexical database. MIT Press, Cambridge, MA

  • Fujita S., Bond F. (2007) A method of creating new valency entries. Machine Translation Journal 21(1): 1–28

  • Fujita, S., Bond, F., Oepen, S. & Tanaka, T. (2007). Exploiting semantic information for HPSG parse selection. In Proceedings of ACL-2007 workshop on deep linguistic processing (pp. 25–32). Prague, Czech Republic.

  • Ginzburg J., Sag I. A. (2000) Interrogative investigations. The form, meaning, and use of English interrogatives. CSLI Publications, Stanford, CA

  • Hovy, E., Marcus, M., Palmer, M., Ramshaw, L., & Weischedel, R. (2006). OntoNotes: The 90% Solution. In Proceedings of the human language technology conference of the NAACL, companion volume: short papers (pp. 57–60). New York City, USA: Association for Computational Linguistics

  • Ikehara, S., Miyazaki, M., Shirai, S., Yokoo, A., Nakaiwa, H., Ogura, K., Ooyama, Y., & Hayashi, Y. (1997). Goi-Taikei—A Japanese Lexicon. Iwanami Shoten, Tokyo. 5 volumes/CD-ROM.

  • Johnson, M., Geman, S., Canon, S., Chi, Z. & Riezler, S. (1999). Estimators for Stochastic ‘Unification-based’ Grammars. In Proceedings of the 37th annual meeting of the association for computational linguistics: ACL-99 (pp. 535–541). College Park, MD.

  • Kasahara, K., Sato, H., Bond, F., Tanaka, T., Fujita, S., Kanasugi, T., & Amano, S. (2004). Construction of a Japanese semantic Lexicon: Lexeed. In IEICE technical report: 2004-NLC-159, pp. 75–82. (in Japanese).

  • Kindaichi H., Ikeda Y. (1988) Gakken Japanese Dictionary. Gakken Co Ltd, Tokyo, Japan

  • Klein, D. & Manning, C. D. (2003). Accurate unlexicalized parsing. In Hinrichs, E. & Roth, D. (Eds.), Proceedings of the 41st annual meeting of the association for computational linguistics (pp. 423–430).

  • Malouf, R. (2002). A comparison of algorithms for maximum entropy parameter estimation. In Proceedings of the 6th conference on computational natural language learning: CoNLL-2002. Taipei, Taiwan.

  • Malouf, R., & van Noord, G. (2004). Wide coverage parsing with stochastic attribute value grammars. In IJCNLP-04 Workshop: Beyond shallow analyses—Formalisms and statistical modeling for deep analyses. JST CREST.

  • Miyao Y., Tsujii J. (2008) Feature forest models for probabilistic HPSG parsing. Computational Linguistics 34(1): 35–80

  • Oepen S., Flickinger D., Toutanova K., Manning C. D. (2004) LinGO Redwoods: A rich and dynamic treebank for HPSG. Research on Language and Computation 2(4): 575–596

  • Oepen, S. & Lønning, J. T. (2006). Discriminant-based MRS banking. In Proceedings of the 5th international conference on language resources and evaluation: LREC-2006. Genoa, Italy.

  • Pollard C., Sag I. A. (1994) Head-driven phrase structure grammar. University of Chicago Press, Chicago

  • Riezler, S., King, T. H., Kaplan, R. M., Crouch, R., Maxwell III, J. T., & Johnson, M. (2002). Parsing the Wall Street Journal using a lexical-functional grammar and discriminative estimation techniques. In Proceedings of the 40th annual meeting of the association for computational linguistics: ACL-2002. Philadelphia, PA.

  • Siegel, M., & Bender, E. M. (2002). Efficient deep processing of Japanese. In Proceedings of the 3rd workshop on Asian language resources and international standardization at the 19th international conference on computational linguistics. Taipei.

  • Toutanova K., Manning C. D., Flickinger D., Oepen S. (2005) Stochastic HPSG parse disambiguation using the Redwoods corpus. Research on Language and Computation 3(1): 83–105

  • Velldal, E. (2008). Empirical realization ranking. Doctoral dissertation, University of Oslo.

  • Xiong, D., Li, S., Liu, Q., Lin, S., & Qian, Y. (2005). Parsing the Penn Chinese treebank with semantic knowledge. In Dale, R., Su, J., Wong, K.-F., & Kwong, O. Y. (Eds.), Natural language processing—IJCNLP 2005: Second international joint conference proceedings (pp. 70–81). Springer.

Author information

Corresponding author

Correspondence to Sanae Fujita.

Additional information

This article presents an updated and extended version of results first published by Fujita et al. (2007).

About this article

Cite this article

Fujita, S., Bond, F., Oepen, S. et al. Exploiting Semantic Information for HPSG Parse Selection. Res on Lang and Comput 8, 1–22 (2010). https://doi.org/10.1007/s11168-010-9069-7

  • DOI: https://doi.org/10.1007/s11168-010-9069-7
