Abstract
For researchers in fields with high industrial relevance, retrieving both research papers and patents has become an important part of assessing the scope of a field. However, retrieving patents by keyword is laborious, because the terms used in patents (patent terms) are often more abstract than those used in research papers (scholarly terms) or in ordinary language, a choice intended to widen the scope of the claims. We propose a method for translating scholarly terms into patent terms (e.g. translating “word processor” into “document editing device” or “document writing support system”). Specifically, we propose two translation methods, the “citation-based method” and the “thesaurus-based method”, together with a method that combines these two with the existing “Mase’s method”. To confirm the effectiveness of our methods, we conducted experiments and found that the combined method performed best in terms of recall, precision, and ε, an extension of Mean Reciprocal Rank (MRR), a measure widely used for the evaluation of question-answering systems.
Notes
- 1.
Technical terms are generally defined as terms used in a particular research field. Under this definition, “floppy disc” and “word processor” are not technical terms, because they are in common use. In this paper, we define “scholarly terms” as terms used in research papers, even if, like “floppy disc” or “word processor”, they are also used more generally.
- 2.
We define the task of “translation of scholarly terms into patent terms” as outputting all patent terms that are useful for patent retrieval. In many cases, patent terms are hypernyms or synonyms of the given scholarly term, and include part of the scholarly term.
- 3.
http://www.kantei.go.jp/jp/singi/titeki2/keikaku2009_e.pdf. Cited 30 June 2010.
- 4.
In this case, the scholarly term “floppy disc” was already inserted in the “title” field.
- 5.
For example, when the scholarly term “floppy disc” is given, the thesaurus-based method outputs its hypernyms, such as “removable recording medium”, as patent terms.
- 6.
We will report this experimental result later.
- 7.
When Mase’s method outputs three candidate terms “magnetic recording device” (freq. 10), “removable storage device” (freq. 5), and “information recording medium” (freq. 3), the three words “device” (freq. 10), “device” (freq. 5), and “medium” (freq. 3) are extracted from the terms.
- 8.
For the example in Step 2, “device” (score 15) and “medium” (score 3) are obtained. Then, the scores of the words are normalised by dividing by 15, which is the score for “device”, resulting in “device” (score 1) and “medium” (score 0.2).
- 9.
For example, if the citation-based method obtained the term “recording medium” (score 0.5), a score of 0.2×m for “medium” is added to 0.5. Here, m is a parameter that indicates the influence of Mase’s method on the citation-based method. We will describe how to determine m in Sect. 19.5.1.
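The arithmetic in Notes 7 to 9 can be traced with a short sketch. This is an illustration only, not the authors' implementation: the function names are hypothetical, taking the last word of each term is an assumption inferred from the examples in the notes, and the value m = 1.0 is a placeholder (the chapter determines m in Sect. 19.5.1).

```python
from collections import defaultdict


def word_scores(candidates):
    """Sum frequencies per extracted word, then normalise by the top score.

    `candidates` is a list of (term, frequency) pairs such as those output
    by Mase's method. Assumption: the extracted word is the last word of
    each term, as in the examples of Note 7.
    """
    totals = defaultdict(float)
    for term, freq in candidates:
        totals[term.split()[-1]] += freq
    if not totals:
        return {}
    top = max(totals.values())
    return {word: score / top for word, score in totals.items()}


def combine(citation_terms, scores, m):
    """Add m times the word score to each citation-based term score.

    Assumption: a term is matched to a word score through its last word.
    """
    return {
        term: score + m * scores.get(term.split()[-1], 0.0)
        for term, score in citation_terms.items()
    }


# Candidate terms and frequencies from Note 7.
mase_output = [
    ("magnetic recording device", 10),
    ("removable storage device", 5),
    ("information recording medium", 3),
]
scores = word_scores(mase_output)  # "device": 15/15, "medium": 3/15

# Citation-based term from Note 9; m = 1.0 is a placeholder value.
combined = combine({"recording medium": 0.5}, scores, m=1.0)
print(scores, combined)
```

With the placeholder m = 1.0, the term “recording medium” ends up with a score of about 0.5 + 0.2 = 0.7; a smaller m reduces the influence of Mase’s method accordingly.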
References
Atkinson KH (2008) Toward a more rational patent search paradigm. In: Proceedings of the 1st international CIKM workshop on patent information retrieval (PaIR’08), pp 37–40
Chen H, Ng TD, Martinez J, Schatz BR (1997) A concept space approach to addressing the vocabulary problem in scientific information retrieval: An experiment on the worm community system. J Am Soc Inf Sci 48(1):17–31
Fujii A, Iwayama M, Kando N (2004) Overview of patent retrieval task at NTCIR-4. Working notes of the 4th NTCIR workshop, pp 225–232
Fujii A, Iwayama M, Kando N (2005) Overview of patent retrieval task at NTCIR-5. In: Proceedings of the 5th NTCIR workshop meeting on evaluation of information access technologies: Information retrieval, question answering and cross-lingual information access, pp 269–277
Fujii A, Iwayama M, Kando N (2007) Overview of the patent retrieval task at NTCIR-6 workshop. In: Proceedings of the 6th NTCIR workshop meeting, pp 359–365
Hearst MA (1992) Automatic acquisition of hyponyms from large text corpora. In: Proceedings of the 14th international conference on computational linguistics, pp 539–545
Ikeda D, Fujiki T, Okumura M (2006) Automatically linking news articles to blog entries. In: Proceedings of AAAI spring symposium series computational approaches to analyzing weblogs, pp 78–82
Itoh H, Mano H, Ogawa Y (2002) Term distillation for cross-DB retrieval. Working notes of the 3rd NTCIR workshop meeting, Part III: patent retrieval task, pp 11–14
Iwayama M, Fujii A, Kando N, Takano A (2002) Overview of patent retrieval task at NTCIR-3. Working notes of the 3rd NTCIR workshop meeting, Part III: patent retrieval task, pp 1–10
Lupu M, Piroi F, Huang J, Zhu J, Tait J (2009) Overview of the TREC chemical IR track. In: Proceedings of the 18th text retrieval conference
Mase H, Matsubayashi T, Ogawa Y, Yayoi Y, Sato Y, Iwayama M (2005) NTCIR-5 patent retrieval experiments at Hitachi. In: Proceedings of NTCIR-5 workshop meeting, pp 318–323
Mase H, Iwayama M (2007) NTCIR-6 patent retrieval experiments at Hitachi. In: Proceedings of the 6th NTCIR workshop meeting, pp 403–406
Nanba H (2007) Query expansion using an automatically constructed thesaurus. In: Proceedings of the 6th NTCIR workshop meeting, pp 414–419
Nanba H, Anzen N, Okumura M (2008) Automatic extraction of citation information in Japanese patent applications. Int J Digit Libr 9(2):151–161
Nanba H, Fujii A, Iwayama M, Hashimoto T (2008) The patent mining task in the seventh NTCIR workshop. In: Proceedings of the 1st international CIKM workshop on patent information retrieval (PaIR’08), pp 25–31
Nanba H, Fujii A, Iwayama M, Hashimoto T (2010) Overview of the patent mining task at the NTCIR-8 workshop. In: Proceedings of the 8th NTCIR workshop meeting on evaluation of information access technologies: Information retrieval, question answering and cross-lingual information access, pp 293–302
Shinmori A, Okumura M, Marukawa Y, Iwayama M (2002) Rhetorical structure analysis of Japanese patent claims using cue phrases. In: Proceedings of the 3rd NTCIR workshop meeting, Part III: patent retrieval task, pp 69–77
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Nanba, H., Kamaya, H., Takezawa, T., Okumura, M., Shinmori, A., Tanigawa, H. (2011). Automatic Translation of Scholarly Terms into Patent Terms. In: Lupu, M., Mayer, K., Tait, J., Trippe, A. (eds) Current Challenges in Patent Information Retrieval. The Information Retrieval Series, vol 29. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19231-9_19
DOI: https://doi.org/10.1007/978-3-642-19231-9_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-19230-2
Online ISBN: 978-3-642-19231-9
eBook Packages: Computer Science, Computer Science (R0)