Automatic Translation of Scholarly Terms into Patent Terms

  • Chapter
Current Challenges in Patent Information Retrieval

Part of the book series: The Information Retrieval Series (INRE, volume 29)

Abstract

For a researcher in a field with high industrial relevance, retrieving both research papers and patents has become an important part of assessing the scope of the field. However, retrieving patents by keyword is laborious for researchers, because the terms used in patents (patent terms) are often more abstract than those used in research papers (scholarly terms) or in ordinary language, in order to widen the scope of the claims. We address the task of translating scholarly terms into patent terms (e.g. translating “word processor” into “document editing device” or “document writing support system”). For this task, we propose two methods: the “citation-based method” and the “thesaurus-based method”. We also propose a method that combines these two with the existing “Mase’s method”. To confirm the effectiveness of our methods, we conducted experiments and found that the combined method performed best in terms of recall, precision, and ε, an extension of Mean Reciprocal Rank (MRR), a measure widely used for evaluating question-answering systems.

Notes

  1. Technical terms are generally defined as terms used in a particular research field. Under this definition, “floppy disc” and “word processor” are not technical terms, because they are in common use. In this paper, we define “scholarly terms” as terms used in research papers, even if, like “floppy disc” or “word processor”, they are also used more generally.

  2. We define the task of “translation of scholarly terms into patent terms” as outputting all patent terms that are useful for patent retrieval. In many cases, a patent term is a hypernym or synonym of the given scholarly term and contains part of that term.

  3. http://www.kantei.go.jp/jp/singi/titeki2/keikaku2009_e.pdf. Cited 30 June 2010.

  4. In this case, the scholarly term “floppy disc” was already inserted in the “title” field.

  5. For example, when the scholarly term “floppy disc” is given, the thesaurus-based method outputs its hypernyms, such as “removable recording medium”, as patent terms.

  6. We will report this experimental result later.

  7. For example, when Mase’s method outputs the three candidate terms “magnetic recording device” (freq. 10), “removable storage device” (freq. 5), and “information recording medium” (freq. 3), the three words “device” (freq. 10), “device” (freq. 5), and “medium” (freq. 3) are extracted from these terms.

  8. For the example in Step 2, the frequencies are summed per word, giving “device” (score 15) and “medium” (score 3). The scores are then normalised by dividing by 15, the score for “device”, resulting in “device” (score 1) and “medium” (score 0.2).

  9. For example, if the citation-based method obtained the term “recording medium” (score 0.5), a score of 0.2 × m for “medium” is added to 0.5. Here, m is a parameter that indicates the influence of Mase’s method on the citation-based method. We will describe how to determine m in Sect. 19.5.1.

  10. http://kantan.nexp.jp/.

  11. http://geta.ex.nii.ac.jp/.

  12. http://jdream2.jst.go.jp/html/thesaurus99/thesaurus_index99.htm.
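
The worked example in Notes 7 to 9 can be sketched in code. This is only a minimal illustration under stated assumptions, not the authors' implementation: the function names head_word_scores and combine are hypothetical, and the notes do not say how a word is extracted from a candidate term or matched against a citation-based term, so the sketch assumes the final word of the English rendering is used in both cases.

```python
from collections import defaultdict


def head_word_scores(candidates):
    """Sum frequencies per extracted word, then normalise by the top score.

    `candidates` is a list of (term, frequency) pairs such as Mase's method
    might output. ASSUMPTION: the extracted word is the last word of the term.
    """
    totals = defaultdict(float)
    for term, freq in candidates:
        head = term.split()[-1]
        totals[head] += freq
    if not totals:
        return {}
    top = max(totals.values())
    return {word: score / top for word, score in totals.items()}


def combine(citation_scores, head_scores, m):
    """Add m times the normalised word score to each citation-based score.

    ASSUMPTION: a term is matched to a word score by its last word; terms
    with no matching word keep their citation-based score unchanged.
    """
    combined = {}
    for term, score in citation_scores.items():
        head = term.split()[-1]
        combined[term] = score + m * head_scores.get(head, 0.0)
    return combined


# Values taken from the example in Notes 7 to 9.
mase_candidates = [
    ("magnetic recording device", 10),
    ("removable storage device", 5),
    ("information recording medium", 3),
]
head_scores = head_word_scores(mase_candidates)

# m = 1.0 is an arbitrary illustrative value; the chapter tunes m separately.
combined = combine({"recording medium": 0.5}, head_scores, m=1.0)
```

With these inputs, “device” receives a normalised score of 1 and “medium” a score of 0.2, so “recording medium” ends up with 0.5 + 0.2 × m, in line with Note 9.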

References

  1. Atkinson KH (2008) Toward a more rational patent search paradigm. In: Proceedings of the 1st international CIKM workshop on patent information retrieval (PaIR’08), pp 37–40

  2. Chen H, Ng TD, Martinez J, Schatz BR (1997) A concept space approach to addressing the vocabulary problem in scientific information retrieval: An experiment on the worm community system. J Am Soc Inf Sci 48(1):17–31

  3. Fujii A, Iwayama M, Kando N (2004) Overview of patent retrieval task at NTCIR-4. Working notes of the 4th NTCIR workshop, pp 225–232

  4. Fujii A, Iwayama M, Kando N (2005) Overview of patent retrieval task at NTCIR-5. In: Proceedings of the 5th NTCIR workshop meeting on evaluation of information access technologies: Information retrieval, question answering and cross-lingual information access, pp 269–277

  5. Fujii A, Iwayama M, Kando N (2007) Overview of the patent retrieval task at NTCIR-6 workshop. In: Proceedings of the 6th NTCIR workshop meeting, pp 359–365

  6. Hearst MA (1992) Automatic acquisition of hyponyms from large text corpora. In: Proceedings of the 14th international conference on computational linguistics, pp 539–545

  7. Ikeda D, Fujiki T, Okumura M (2006) Automatically linking news articles to blog entries. In: Proceedings of AAAI spring symposium series computational approaches to analyzing weblogs, pp 78–82

  8. Itoh H, Mano H, Ogawa Y (2002) Term distillation for cross-db retrieval. Working notes of the 3rd NTCIR workshop meeting, Part III: Patent retrieval task, pp 11–14

  9. Iwayama M, Fujii A, Kando N, Takano A (2002) Overview of patent retrieval task at NTCIR-3. Working notes of the 3rd NTCIR workshop meeting, Part III: patent retrieval task, pp 1–10

  10. Lupu M, Piroi F, Huang J, Zhu J, Tait J (2009) Overview of the TREC chemical IR track. In: Proceedings of the 18th text retrieval conference

  11. Mase H, Matsubayashi T, Ogawa Y, Yayoi Y, Sato Y, Iwayama M (2005) NTCIR-5 patent retrieval experiments at Hitachi. In: Proceedings of NTCIR-5 workshop meeting, pp 318–323

  12. Mase H, Iwayama M (2007) NTCIR-6 patent retrieval experiments at Hitachi. In: Proceedings of the 6th NTCIR workshop meeting, pp 403–406

  13. Nanba H (2007) Query expansion using an automatically constructed thesaurus. In: Proceedings of the 6th NTCIR workshop meeting, pp 414–419

  14. Nanba H, Anzen N, Okumura M (2008) Automatic extraction of citation information in Japanese patent applications. Int J Digit Libr 9(2):151–161

  15. Nanba H, Fujii A, Iwayama M, Hashimoto T (2008) The patent mining task in the seventh NTCIR workshop. In: Proceedings of the 1st international CIKM workshop on patent information retrieval (PaIR’08), pp 25–31

  16. Nanba H, Fujii A, Iwayama M, Hashimoto T (2010) Overview of the patent mining task at the NTCIR-8 workshop. In: Proceedings of the 8th NTCIR workshop meeting on evaluation of information access technologies: Information retrieval, question answering and cross-lingual information access, pp 293–302

  17. Shinmori A, Okumura M, Marukawa Y, Iwayama M (2002) Rhetorical structure analysis of Japanese patent claims using cue phrases. In: Proceedings of the 3rd NTCIR workshop meeting, Part III: patent retrieval task, pp 69–77

Author information

Corresponding author

Correspondence to Hidetsugu Nanba.

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Nanba, H., Kamaya, H., Takezawa, T., Okumura, M., Shinmori, A., Tanigawa, H. (2011). Automatic Translation of Scholarly Terms into Patent Terms. In: Lupu, M., Mayer, K., Tait, J., Trippe, A. (eds) Current Challenges in Patent Information Retrieval. The Information Retrieval Series, vol 29. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19231-9_19

  • DOI: https://doi.org/10.1007/978-3-642-19231-9_19

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-19230-2

  • Online ISBN: 978-3-642-19231-9

  • eBook Packages: Computer Science, Computer Science (R0)
