Efficient Prior-Art Retrieval of Patent Documents Using MapReduce Paradigm

  • Conference paper
Proceedings of the International Conference on Computing and Communication Systems

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 24)

Abstract

A patent is a legal right granted for a novel, non-obvious, and useful invention. A prior-art search retrieves earlier works related to an invention so that a duplicate invention is not granted a patent. Such a search analyzes a variety of documents, including newspaper articles, proceedings, and journals. The number of patent documents and the volume of filings increase at an unprecedented rate every year, and processing this enormous volume of data sequentially is time-consuming. Hence, the proposed Prior-Art Retrieval System (PARS) retrieves only patent documents, through the Google patent API, and clusters them with K-Means running in parallel. Relevance Mapping identifies the prominent document clusters, the documents within the relevant clusters are ranked by their citations, and the top-ranked documents are displayed to the patent analyst. The results show that MapReduce reduced the processing time significantly and that the accuracy of the clusters was around 50%.
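
The parallel K-Means step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the system's actual implementation, so everything below is an assumption made for illustration: the function names (map_assign, reduce_mean, kmeans_iteration, kmeans) are hypothetical, documents are assumed to be already converted to numeric vectors, and the map and reduce phases run in a single process rather than on a real MapReduce cluster. The map step assigns each document vector to its nearest centroid; the reduce step averages the vectors assigned to each centroid.

```python
from collections import defaultdict
from math import dist


def map_assign(point, centroids):
    """Map step: emit (index of the nearest centroid, (point, 1))."""
    nearest = min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))
    return nearest, (point, 1)


def reduce_mean(values):
    """Reduce step: average all points that share one centroid index."""
    total = [0.0] * len(values[0][0])
    count = 0
    for point, n in values:
        total = [t + x for t, x in zip(total, point)]
        count += n
    return tuple(t / count for t in total)


def kmeans_iteration(points, centroids):
    """One K-Means pass expressed as map, shuffle (group by key), reduce."""
    groups = defaultdict(list)
    for p in points:  # on a cluster, each mapper would handle one data split
        key, value = map_assign(p, centroids)
        groups[key].append(value)
    # A centroid that received no points keeps its previous position.
    return [
        reduce_mean(groups[i]) if i in groups else tuple(centroids[i])
        for i in range(len(centroids))
    ]


def kmeans(points, centroids, max_iter=20):
    """Repeat map/reduce passes until the centroids stop moving."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iter):
        updated = kmeans_iteration(points, centroids)
        if updated == centroids:
            break
        centroids = updated
    return centroids


if __name__ == "__main__":
    # Toy two-dimensional vectors standing in for document feature vectors.
    docs = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
    print(kmeans(docs, [(0.0, 0.0), (10.0, 10.0)]))
```

Because each map call depends only on one point and the current centroids, the assignment work can be split across machines, which is the property that would let a MapReduce framework shorten the processing time relative to a sequential run.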

Corresponding author

Correspondence to K. Girthana.

Copyright information

© 2018 Springer Nature Singapore Pte Ltd.

Cite this paper

Girthana, K., Swamynathan, S. (2018). Efficient Prior-Art Retrieval of Patent Documents Using MapReduce Paradigm. In: Mandal, J., Saha, G., Kandar, D., Maji, A. (eds) Proceedings of the International Conference on Computing and Communication Systems. Lecture Notes in Networks and Systems, vol 24. Springer, Singapore. https://doi.org/10.1007/978-981-10-6890-4_70

  • DOI: https://doi.org/10.1007/978-981-10-6890-4_70

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-6889-8

  • Online ISBN: 978-981-10-6890-4

  • eBook Packages: Engineering, Engineering (R0)
