Effects of Language and Topic Size in Patent IR: An Empirical Study

  • Florina Piroi
  • Mihai Lupu
  • Allan Hanbury
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7488)


We revisit the effects that various characteristics of topic documents have on system effectiveness in the task of finding prior art in the patent domain. In doing so, we give readers new to the domain a guide to the issues that need to be addressed in this context.

For the current study, we select two patent-based test collections with a common document representation schema and look at topic characteristics specific to the objectives of the collections. Specifically, we examine the effect of topic language and of topic document length on retrieval. We present the correlations between these topic facets and the retrieval results, as well as between the facets and the topics' relevant documents.
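As an illustration of the kind of analysis described above, the following minimal sketch computes a rank correlation between one topic facet (topic length) and a per-topic effectiveness score. It is not the authors' method: the abstract does not say which correlation coefficient was used, so the choice of Spearman's rho is an assumption, and the variable names and all numbers are hypothetical placeholders rather than results from the study.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences with nonzero variance."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def spearman(xs, ys):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical per-topic values, invented purely for illustration.
topic_length_in_words = [850, 1200, 400, 3100, 1200, 2300]
average_precision = [0.21, 0.08, 0.15, 0.12, 0.30, 0.05]

rho = spearman(topic_length_in_words, average_precision)
print(f"Spearman rho between topic length and AP: {rho:.3f}")
```

A rank-based coefficient is used here only because per-topic scores and document lengths tend to be skewed; any other correlation measure could be substituted in `spearman` without changing the rest of the sketch.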


Keywords (added by machine, not by the authors): Patent Application, Patent Citation, Test Collection, Patent Document, Relevance Judgement





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Florina Piroi (1)
  • Mihai Lupu (1)
  • Allan Hanbury (1)
  1. Vienna University of Technology, Vienna, Austria
