A method of creating new valency entries

Machine Translation

Abstract

Information on subcategorization and selectional restrictions in a valency dictionary is important for natural language processing tasks such as monolingual parsing, accurate rule-based machine translation and automatic summarization. In this paper we present an efficient method of assigning valency information and selectional restrictions to entries in a bilingual dictionary, based on information in an existing valency dictionary. The method is based on two assumptions: words with similar meaning have similar subcategorization frames and selectional restrictions; and words with the same translations have similar meanings. Based on these assumptions, new valency entries are constructed for words in a plain bilingual dictionary, using entries with similar source-language meaning and the same target-language translations. We evaluate the effects of various measures of semantic similarity.
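The construction step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the `ValencyEntry` structure, the `create_entries` function, the Jaccard overlap used as the similarity measure, the 0.5 threshold, and the toy romanized entries and semantic-class labels are all assumptions made here for illustration. The abstract states only that several measures of semantic similarity are evaluated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValencyEntry:
    source: str   # source-language predicate
    target: str   # target-language translation
    frame: tuple  # (case slot, selectional restriction) pairs


def jaccard(a: set, b: set) -> float:
    """One possible similarity measure: overlap of two semantic-class sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def create_entries(bilingual, valency, classes, threshold=0.5):
    """Copy frames onto bilingual pairs that have no valency entry yet."""
    by_target = {}
    for entry in valency:
        by_target.setdefault(entry.target, []).append(entry)
    known = {(e.source, e.target) for e in valency}

    created = []
    for source, target in bilingual:
        if (source, target) in known:
            continue
        # Assumption 2: sharing a translation suggests similar meaning.
        candidates = by_target.get(target, [])
        # Assumption 1: similar meaning suggests a similar frame, so keep
        # only candidates whose source word is similar enough.
        scored = [
            (jaccard(classes.get(source, set()), classes.get(c.source, set())), c)
            for c in candidates
        ]
        scored = [(score, c) for score, c in scored if score >= threshold]
        if scored:
            _, best = max(scored, key=lambda pair: pair[0])
            created.append(ValencyEntry(source, target, best.frame))
    return created


# Toy data, hypothetical and not taken from the paper.
buy_frame = (("ga", "agent"), ("o", "concrete"))
valency = [ValencyEntry("kau", "buy", buy_frame)]
bilingual = [("kau", "buy"), ("kounyuu-suru", "buy"), ("taberu", "eat")]
classes = {
    "kau": {"acquisition", "transaction"},
    "kounyuu-suru": {"acquisition", "transaction"},
    "taberu": {"ingestion"},
}

new_entries = create_entries(bilingual, valency, classes)
for entry in new_entries:
    print(entry)
```

In this toy run only the pair that shares a translation with an existing valency entry, and whose source word is similar enough to that entry's source word, receives a copied frame. Swapping `jaccard` for another measure is how different notions of semantic similarity could be compared.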

Author information

Corresponding author

Correspondence to Sanae Fujita.

About this article

Cite this article

Fujita, S., Bond, F. A method of creating new valency entries. Machine Translation 21, 1–28 (2007). https://doi.org/10.1007/s10590-008-9032-7

  • DOI: https://doi.org/10.1007/s10590-008-9032-7
