Abstract
This paper proposes a methodology for prior-art patent search that consists of statistical and semantic analysis of patent documents, machine translation of the patent application, and calculation of the semantic similarity between the application and existing patents. We consider several variants of statistical analysis based on the LDA method. At the semantic analysis step, we apply a new method for building a semantic network based on Meaning-Text Theory. Prior-art search also requires the patent application to be translated beforehand using machine translation tools. At the semantic similarity step, we compare the semantic trees of the application and of the patent claims. We developed an automated system for the patent examination task; it is designed to reduce the time an expert spends on prior-art search and is adapted to handle large amounts of patent information.
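The tree-comparison step can be illustrated with a minimal sketch. It assumes each semantic tree is reduced to a set of (head, relation, dependent) triples and scored by Jaccard overlap. The triple representation, the Jaccard measure, the sample edges, and the names `tree_edges` and `jaccard_similarity` are all illustrative assumptions; the abstract does not specify how the trees are actually compared.

```python
from typing import FrozenSet, Iterable, Tuple

# One edge of a semantic tree as (head, relation, dependent).
# This flat representation is a hypothetical simplification.
Edge = Tuple[str, str, str]


def tree_edges(edges: Iterable[Edge]) -> FrozenSet[Edge]:
    """Lowercase every edge so the comparison ignores capitalization."""
    return frozenset((h.lower(), r.lower(), d.lower()) for h, r, d in edges)


def jaccard_similarity(a: FrozenSet[Edge], b: FrozenSet[Edge]) -> float:
    """Fraction of edges shared by both trees (1.0 identical, 0.0 disjoint)."""
    if not a and not b:
        return 1.0  # two empty trees are treated as identical
    return len(a & b) / len(a | b)


# Invented example edges for an application and a patent claim.
application = tree_edges([
    ("comprises", "subject", "device"),
    ("comprises", "object", "sensor"),
    ("sensor", "attribute", "optical"),
])
patent_claim = tree_edges([
    ("comprises", "subject", "device"),
    ("comprises", "object", "sensor"),
    ("sensor", "attribute", "thermal"),
])

score = jaccard_similarity(application, patent_claim)
print(f"similarity = {score:.2f}")
```

Two of the four distinct edges are shared here, so the score is 0.50. A set-based overlap ignores tree depth and word order, which keeps the sketch short but discards structure that a real tree comparison would likely use.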
Acknowledgement
This research was partially supported by the Russian Foundation for Basic Research (grants No. 15-07-09142 A, No. 15-07-06254 A, No. 16-07-00534 A).
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Korobkin, D., Fomenkov, S., Kravets, A., Kolesnikov, S. (2017). Methods of Statistical and Semantic Patent Analysis. In: Kravets, A., Shcherbakov, M., Kultsova, M., Groumpos, P. (eds) Creativity in Intelligent Technologies and Data Science. CIT&DS 2017. Communications in Computer and Information Science, vol 754. Springer, Cham. https://doi.org/10.1007/978-3-319-65551-2_4
DOI: https://doi.org/10.1007/978-3-319-65551-2_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-65550-5
Online ISBN: 978-3-319-65551-2
eBook Packages: Computer Science, Computer Science (R0)