Expanding Queries with Term and Phrase Translations in Patent Retrieval

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6653)

Abstract

Patent retrieval is a branch of Information Retrieval (IR) that addresses the challenging task of retrieving highly technical and often complicated patents. Typically, patent granting bodies translate patents into several major foreign languages so that language boundaries do not hinder their accessibility. Given such multilingual patent collections, we posit that patent translations can be exploited to facilitate patent retrieval.

Specifically, we focus on the translation of patent queries from German and French, the morphology of which poses an extra challenge to retrieval. We compare two translation approaches that expand the query with (i) translated terms and (ii) translated phrases. Experimental evaluation on a standard CLEF-IP European Patent Office dataset reveals a novel finding: phrase translation may be more suited to French, and term translation may be more suited to German. We trace this finding to language morphology, and we conclude that tailoring the query translation per language can lead to improved results in patent retrieval.
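The difference between the two expansion strategies can be illustrated with a minimal sketch. This is not the authors' implementation: the translation tables, the German example query, the English target language, and the function names `expand_with_terms` and `expand_with_phrases` are all assumptions invented for illustration. In practice the translations would presumably come from a statistical translation model trained on parallel patent text rather than from hand-written dictionaries.

```python
# Toy illustration only: the tables below are invented for this sketch and
# are not taken from the paper or from any real translation model.

# Hypothetical term-level translations (German -> English).
TERM_TABLE = {
    "verfahren": ["method", "process"],
    "herstellung": ["production"],
    "halbleiter": ["semiconductor"],
}

# Hypothetical phrase-level translations (German -> English).
PHRASE_TABLE = {
    ("verfahren", "zur", "herstellung"): ["method for producing"],
}


def expand_with_terms(query_tokens, term_table):
    """Append the translations of each individual query term to the query."""
    expansion = []
    for token in query_tokens:
        expansion.extend(term_table.get(token, []))
    return query_tokens + expansion


def expand_with_phrases(query_tokens, phrase_table, max_len=3):
    """Append translations of contiguous token sequences found in the phrase table."""
    expansion = []
    n = len(query_tokens)
    for start in range(n):
        for length in range(1, max_len + 1):
            if start + length > n:
                break
            key = tuple(query_tokens[start:start + length])
            expansion.extend(phrase_table.get(key, []))
    return query_tokens + expansion


query = ["verfahren", "zur", "herstellung", "halbleiter"]
term_expanded = expand_with_terms(query, TERM_TABLE)
phrase_expanded = expand_with_phrases(query, PHRASE_TABLE)
print(term_expanded)
print(phrase_expanded)
```

In this toy setting, term expansion adds a translation for every word that has a table entry, including ambiguous alternatives, whereas phrase expansion adds only the translation of the one multi-word unit it recognizes. The original query tokens are kept in both cases, since the approaches expand the query rather than replace it.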

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Jochim, C., Lioma, C., Schütze, H. (2011). Expanding Queries with Term and Phrase Translations in Patent Retrieval. In: Hanbury, A., Rauber, A., de Vries, A.P. (eds) Multidisciplinary Information Retrieval. IRFC 2011. Lecture Notes in Computer Science, vol 6653. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21353-3_3

  • DOI: https://doi.org/10.1007/978-3-642-21353-3_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-21352-6

  • Online ISBN: 978-3-642-21353-3

  • eBook Packages: Computer Science, Computer Science (R0)
