Efficient Prior-Art Retrieval of Patent Documents Using MapReduce Paradigm

Girthana, K.; Swamynathan, S.

doi:10.1007/978-981-10-6890-4_70

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 24)

Abstract

A patent is a legal right granted for novel, non-obvious, and useful inventions. A prior-art search retrieves earlier works related to an invention, to avoid duplicating an existing invention and granting a patent for it. Such a search analyzes a variety of documents, such as newspaper articles, proceedings, and journals. The number of patent documents and the volume of filings keep increasing at an unprecedented rate every year, and processing this enormous volume of data sequentially is time-consuming. Hence, the proposed Prior-Art Retrieval System (PARS) retrieves only patent documents, through the Google patent API, and applies K-Means clustering in parallel to cluster them. Relevance Mapping then identifies the prominent document clusters. The documents within the relevant clusters are ranked by their citations, and the top-ranked documents are displayed to the patent analyst. The results show that MapReduce reduces the processing time significantly and that the accuracy of the clusters was around 50%.
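
The abstract does not show how PARS implements its parallel clustering, so the following is only a minimal single-process sketch of how one K-Means iteration is commonly split into a map phase and a reduce phase. It assumes each document has already been converted to a numeric vector; the function names (`map_phase`, `reduce_phase`, `kmeans`) and the toy data are hypothetical and are not taken from the paper.

```python
def nearest(point, centroids):
    """Return the index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centroids[i])),
    )


def map_phase(points, centroids):
    """Map: emit (cluster_id, (vector, 1)) for every document vector."""
    for point in points:
        yield nearest(point, centroids), (point, 1)


def reduce_phase(pairs, centroids):
    """Reduce: sum the vectors per cluster id and average them into new centroids."""
    sums = {}
    for cluster_id, (point, count) in pairs:
        acc, n = sums.get(cluster_id, ([0.0] * len(point), 0))
        sums[cluster_id] = ([a + p for a, p in zip(acc, point)], n + count)
    updated = list(centroids)  # a cluster with no members keeps its old centroid
    for cluster_id, (acc, n) in sums.items():
        updated[cluster_id] = tuple(a / n for a in acc)
    return updated


def kmeans(points, centroids, iterations=10):
    """Repeat map and reduce until the centroids stop changing or the budget runs out."""
    for _ in range(iterations):
        updated = reduce_phase(map_phase(points, centroids), centroids)
        if updated == centroids:
            break
        centroids = updated
    return centroids


# Toy document vectors (assumed data, for illustration only).
docs = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
final_centroids = kmeans(docs, [(0.0, 0.0), (10.0, 10.0)])
```

In a real MapReduce deployment the map calls would run on separate partitions of the corpus and the framework would group the emitted pairs by cluster id before the reduce step; here both phases run in one process purely to show the data flow.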

Author information

Authors and Affiliations

Department of Information Science and Technology, Anna University, Chennai, 600025, India
K. Girthana & S. Swamynathan

Corresponding author

Correspondence to K. Girthana.

Editor information

Editors and Affiliations

Department of Computer Science and Engineering, University of Kalyani, Kalyani, West Bengal, India
J. K. Mandal
Department of Information Technology, North-Eastern Hill University, Shillong, Meghalaya, India
Goutam Saha
Department of Information Technology, North-Eastern Hill University, Shillong, Meghalaya, India
Debdatta Kandar
Department of Information Technology, North-Eastern Hill University, Shillong, Meghalaya, India
Arnab Kumar Maji

About this paper

Cite this paper

Girthana, K., Swamynathan, S. (2018). Efficient Prior-Art Retrieval of Patent Documents Using MapReduce Paradigm. In: Mandal, J., Saha, G., Kandar, D., Maji, A. (eds) Proceedings of the International Conference on Computing and Communication Systems. Lecture Notes in Networks and Systems, vol 24. Springer, Singapore. https://doi.org/10.1007/978-981-10-6890-4_70

DOI: https://doi.org/10.1007/978-981-10-6890-4_70
Published: 30 March 2018
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-6889-8
Online ISBN: 978-981-10-6890-4
eBook Packages: Engineering, Engineering (R0)
