Improving Retrievability of Patents in Prior-Art Search

Bashir, Shariq; Rauber, Andreas

doi:10.1007/978-3-642-12275-0_40

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5993)

Included in the following conference series:

European Conference on Information Retrieval

Abstract

Prior-art search is an important task in patent retrieval. The success of this task relies on the selection of relevant search queries. Typically, terms for prior-art queries are extracted from the claim fields of query patents. However, because of the complex technical structure of patents, term mismatch, and vague terms, selecting relevant terms for queries is difficult. When evaluating the retrievability coverage of prior-art queries generated from query patents, we observe a large bias toward a subset of the collection: a large number of patents either have a very low retrievability score or cannot be discovered via any query. To increase the retrievability of patents, in this paper we expand the prior-art queries generated from query patents using query expansion with pseudo-relevance feedback. Terms missing from query patents are discovered from feedback patents, and better patents for relevance feedback are identified using a novel approach that checks their similarity with query patents. We specifically focus on how to automatically select better terms from query patents, based on their proximity distribution with prior-art queries, for use as features when computing similarity. Our results show that the coverage of prior-art queries can be increased significantly by incorporating relevant query terms using query expansion.
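The retrievability notion the abstract relies on can be illustrated with a minimal sketch. It assumes the simple cumulative form of the measure, in which a patent's score is the number of queries that rank it at or above a rank cutoff. The function names, the toy rankings, and the cutoff value below are hypothetical and chosen for illustration only; they are not the authors' implementation or data.

```python
def retrievability_scores(rankings, patent_ids, cutoff):
    """Count, for each patent, how many queries rank it within `cutoff`.

    rankings: dict mapping a query id to its ranked list of patent ids
    patent_ids: every patent in the collection (so unretrieved ones get 0)
    cutoff: rank threshold a searcher is assumed to inspect (an assumption)
    """
    scores = {pid: 0 for pid in patent_ids}
    for ranked in rankings.values():
        for pid in ranked[:cutoff]:
            if pid in scores:  # ignore ids outside the collection
                scores[pid] += 1
    return scores


def unreachable_share(scores):
    """Fraction of patents that no query retrieves within the cutoff."""
    if not scores:
        return 0.0
    return sum(1 for s in scores.values() if s == 0) / len(scores)


# Toy data (hypothetical): three prior-art queries over five patents.
rankings = {
    "q1": ["p1", "p2", "p3"],
    "q2": ["p1", "p3", "p2"],
    "q3": ["p2", "p1", "p4"],
}
patent_ids = ["p1", "p2", "p3", "p4", "p5"]

scores = retrievability_scores(rankings, patent_ids, cutoff=2)
print(scores)
print(unreachable_share(scores))
```

In this toy collection a few patents absorb most of the top ranks while others are never returned within the cutoff, which is the kind of bias the abstract describes. Comparing the two quantities before and after query expansion is one plausible way to check whether coverage improves.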

Author information

Authors and Affiliations

Institute of Software Technology and Interactive Systems, Vienna University of Technology, Austria
Shariq Bashir & Andreas Rauber

Editor information

Editors and Affiliations

Adaptive Information Cluster, Dublin City University, Dublin, 9, Ireland
Cathal Gurrin
The Open University, Walton Hall, MK7 6HF, Milton Keynes, UK
Yulan He
Microsoft Research Ltd, 7 JJ Thomson Avenue, CB3 0FB, Cambridge, UK
Gabriella Kazai
Department of Computer Science, University of Essex, Wivenhoe Park, CO4 3SQ, Colchester, UK
Udo Kruschwitz
The Open University, Walton Hall, Milton Keynes, UK
Suzanne Little
University of London, London, UK
Thomas Roelleke
Knowledge Media Institute, The Open University, MK7 6AA, Milton Keynes, UK
Stefan Rüger
Department of Computing Science, University of Glasgow, 17 Lilybank Gardens, G12 8QQ, Glasgow, UK
Keith van Rijsbergen

About this paper

Cite this paper

Bashir, S., Rauber, A. (2010). Improving Retrievability of Patents in Prior-Art Search. In: Gurrin, C., et al. Advances in Information Retrieval. ECIR 2010. Lecture Notes in Computer Science, vol 5993. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12275-0_40

DOI: https://doi.org/10.1007/978-3-642-12275-0_40
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-12274-3
Online ISBN: 978-3-642-12275-0
eBook Packages: Computer Science, Computer Science (R0)
