Abstract
Patent search is a substantial basis for many operational questions and scientometric evaluations. We consider it as a sequence of distinct stages. The “patent wide search” involves a definition of system boundaries by means of classifications and a keyword search, producing a patent set with a high recall level (see Schmitz 2010 for an overview of searchable patent metadata). Within this set of patents a “patent near search” takes place, producing a patent set with high(er) precision. Hence, the question arises how the researcher has to operate within this patent set to efficiently identify patents that contain paraphrased descriptions of the sought inventive elements in contextual information, and whether this produces different results compared to a conventional search. We present a semiautomatic iterative method for the identification of such patents, based on semantic similarity. In order to test our method, we generate an initial dataset in the course of a patent wide search. This dataset is then analyzed by means of the semiautomatic iterative method as well as by an alternative method emulating the conventional process of keyword refinement. The comparison shows that both methods have their particular “raison d’être”, and that the semiautomatic iterative method seems to be able to support a conventional patent search very effectively.
Notes
See Appendix “Identified reference documents” with a list of the ten reference documents.
To prove the robustness of our method, we compared its results with a relaxed approach (see Appendix “Robustness check”).
References
Aamodt, A., & Plaza, E. (1994). Case-based reasoning: foundational issues, methodological variations, and system approaches. AI Communications, 7(1), 39–59.
Abercrombie, R. K., Udoeyop, A. W., & Schlicher, B. G. (2012). A study of scientometric methods to identify emerging technologies via modeling of milestones. Scientometrics, 91(2), 327–342.
Alberts, D., Yang, C. B., Fobare-DePonio, D., Koubek, K., Robins, S., Rodgers, M., et al. (2011). Introduction to patent searching. In M. Lupu, K. Mayer, J. Tait, & A. J. Trippe (Eds.), Current challenges in patent information retrieval (pp. 3–44). Heidelberg: Springer-Verlag.
Benson, C. L., & Magee, C. L. (2013). A hybrid keyword and patent class methodology for selecting relevant sets of patents for a technological field. Scientometrics, 96(1), 69–82.
Breitzmann, A., & Thomas, P. (2002). Using patent citation analysis to target/value M&A candidates. Research Technology Management, 45(5), 28–36.
Carpineto, C., & Romano, G. (2012). A survey of automatic query expansion in information retrieval. ACM Computing Surveys, 44(1), 1–50.
Carterette, B., & Voorhees, E. M. (2013). Overview of information retrieval evaluation. In M. Lupu, K. Mayer, J. Tait, & A. J. Trippe (Eds.), Current challenges in patent information retrieval (pp. 69–86). Berlin: Springer.
Cascini, G., Fantechi, A., & Spinicci, E. (2004). Natural language processing of patents and technical documentation. Lecture Notes in Computer Science, 3163, 508–520.
Cascini, G., & Zini, M. (2008). Measuring patent similarity by comparing inventions functional trees. Computer-Aided Innovation (CAI), 277, 31–42.
Choi, S., Yoon, J., Kim, K., Lee, J. Y., & Kim, C. (2011). SAO network analysis of patents for technology trends identification: A case study of polymer electrolyte membrane technology in proton exchange membrane fuel cells. Scientometrics, 88(3), 863–883.
DIN Deutsches Institut für Normung e.V. (2010). DIN SPEC 1060: Dienstleistungsqualität im intellectual property management. Berlin: Beuth Verlag.
Dirnberger, D. (2011). A guide to efficient keyword, sequence and classification search strategies for biopharmaceutical drug-centric patent landscape searches—A human recombinant insulin patent landscape case study. World Patent Information, 33, 128–143.
Dixon, R. M. W. (1992). A new approach to English grammar, on semantic principles. Oxford: Oxford University Press.
Dong, H., Hussain, F. K., & Chang, H. (2011). A context-aware semantic similarity model for ontology environments. Concurrency and Computation: Practice and Experience, 23(5), 505–524.
Ernst, H. (2001). Patent applications and subsequent changes of performance: Evidence from time-series cross-section analysis on the firm level. Research Policy, 30, 143–157.
Ervilia, F. T., & Herstatt, C. (2009). Exploring the relation of patent ownership and market success—Cases from the LCD flat panel display industry. International Journal of Technology Intelligence and Planning, 5(1), 90–109.
Field, A. (2009). Discovering statistics using SPSS. London et al.: Sage Publications.
Gambardella, A., & McGahan, A. M. (2010). Business-model innovation: General purpose technologies and their implications for industry structure. Long Range Planning, 43, 262–271.
Große, D., Fey, G., & Drechsler, R. (2007). SATRIX: Algorithmen für Boolsche Erfüllbarkeit. Herzogenrath: Shaker Verlag.
Harhoff, D., et al. (2003). Citations, family size, opposition and the value of patent rights. Research Policy, 32, 1343–1363.
Jang, S.-L., Yu, Y.-C., & Wang, T.-Y. (2011). Emerging firms in an emerging field: An analysis of patent citations in electronic-paper display technology. Scientometrics, 89(1), 259–272.
Kim, Y., Suh, J., & Park, S. (2008). Visualization of patent analysis for emerging technology. Expert Systems with Applications, 34(3), 1804–1812.
Krause, J. (Ed.). (1987). Inhaltserschließung von Massendaten: Zur Wirksamkeit informationslinguistischer Verfahren am Beispiel des Deutschen Patentinformationssystems. Hildesheim: Georg Olms.
Lee, S. (2013). Linking technology roadmapping to patent analysis. In M. G. Moehrle, R. Isenmann, & R. Phaal (Eds.), Technology roadmapping for strategy and innovation (pp. 267–284). Berlin: Springer.
Mayring, P. (2003). Qualitative Inhaltsanalyse. Grundlagen und Techniken. Weinheim: Beltz.
Mitchell, M., & Jolley, J. M. (2012). Research design explained (8th ed.). Wadsworth: Cengage Learning Emea.
Moehrle, M. G. (2010). Measures for textual patent similarities: A guided way to select appropriate approaches. Scientometrics, 85(1), 95–109.
Moehrle, M. G., & Gerken, J. (2012). Measuring textual patent similarity on basis of combined concepts: Design decisions and their consequences. Scientometrics, 91, 805–826.
Moehrle, M. G., & Walter, L. (2009). Patentierung von Geschäftsprozessen. Monitoring—strategien—schutz. Berlin: Springer.
Moehrle, M. G., et al. (2010). Patinformatics as a business process: A guideline through patent research task and tools. World Patent Information, 32, 291–299.
Mogee, M., & Breitzmann, A. (2002). The many applications of patent analysis. Journal of Information Science, 28, 187–205.
Moskovkin, V. M., Shigorina, N. A., & Popov, D. (2012). The possibility of using google patents search tool in patentometric analysis. Scientific and Technical Information Processing, 39, 107–112.
Niemann, H., Moehrle, M. G., & Walter, L. (2013). The development of business method patenting in the logistics industry—Insights from the case of intelligent sensor networks. International Journal of Technology Management, 61(2), 177–197.
Nijhof, E. (2007). Subject analysis and search strategies—Has the searcher become the bottleneck in the search process? World Patent Information, 29(1), 20–25.
Salton, G. (1988). A simple blueprint for automatic Boolean query processing. Information Processing and Management, 24(3), 269–280.
Salton, G., & McGill, M. J. (1988). Information retrieval. Grundlegendes für Informationswissenschaftler. Hamburg: McGraw-Hill Book Company GmbH.
Sánchez, D., Batet, M., Isern, D., & Valls, A. (2012). Ontology-based semantic similarity: A new feature-based approach. Expert Systems with Applications, 39(9), 7718–7728.
Schmitz, J. (2010). Patentinformetrie: Analyse und Verdichtung von technischen Schutzrechtsinformationen. Frankfurt (Main): DGI.
Stefanov, V., & Tait, J. I. (2011). An introduction to contemporary search technology. In M. Lupu, K. Mayer, J. Tait, & A. J. Trippe (Eds.), Current challenges in patent information retrieval (pp. 45–68). Berlin: Springer-Verlag.
Stock, W. G. (2007). Information retrieval: Informationen suchen und finden. München: Oldenbourg.
Teece, D. J. (2010). Business models, business strategy and innovation. Long Range Planning, 43, 172–194.
Tinsley, H. E. A., & Weiss, D. J. (2000). Interrater reliability and agreement. In H. E. A. Tinsley & S. D. Brown (Eds.), Handbook of applied multivariate statistics and mathematical modeling (pp. 95–124). San Diego: Academic Press.
Trajtenberg, M., et al. (1997). University versus corporate patents: A window on the basicness of invention. Economics of Innovation and New Technology, 5, 19–50.
Trippe, A. J. (2003). Patinformatics: Tasks and tools. World Patent Information, 25(3), 211–221.
Van der Drift, J. (1991). Effective strategies for searching existing patent rights. World Patent Information, 13, 67–71.
Verhaegen, P. A., et al. (2011). Searching for similar products through patent analysis. Procedia Engineering, 9, 431–441.
von Proff, S., & Dettmann, A. (2012). Inventor collaboration over distance: A comparison of academic and corporate patents. Scientometrics, 1–22. doi:10.1007/s11192-012-0812-8.
von Wartburg, I., et al. (2005). Inventive progress measured by multi-stage patent citation analysis. Research Policy, 34, 1591–1607.
Yin, R. K. (2009). Case study research (4th ed.). Thousand Oaks et al.: Sage.
Yoon, J., & Kim, K. (2011). Identifying rapidly evolving technological trends for R&D planning using SAO-based semantic patent networks. Scientometrics, 88(1), 213–228.
Yoon, B., & Park, Y. (2004). A text-mining-based patent network: Analytical tool for high-technology trend. The Journal of High Technology Management Research, 15(1), 37–50.
Acknowledgments
The authors wish to thank Dr. Lothar Walter and Dr. Jan M. Gerken for their critical comments and constructive ideas, particularly regarding the keyword search emulating method. We would also like to acknowledge the contributions of two anonymous reviewers, whose suggestions helped enhance this paper’s readability and general quality.
Appendix
Identified reference documents
Table 9.
Robustness check
Analogous to an experimental design, we create an additional procedure. We interpret the ranking of patents in our semiautomatic iterative method as the treatment for an “experimental group” and define a second procedure without treatment for a “control group” (see Mitchell and Jolley 2012 for the design of experiments). In other words, the second procedure is a relaxation of our method that ignores the ranking of the dataset produced by the similarity measurement. As the comparative value for the qualitative evaluation of the presented method, we chose pure chance: we assume that as many patents are reviewed at random as were marked when the dataset was examined with the presented method. In the semiautomatic iterative method, the review of the semantically most similar documents produced 63 marked documents on average. If the same number of documents were reviewed at random, the resulting average recall would amount to 63 % (see Table 10). The precision of this purely random approach can be calculated by means of the given formula, but it merely equals the share of relevant documents in the total number of documents (in this case 36 %). Thus, the semiautomatic iterative method presented here proves to be robust in this comparison, as it is superior to the random selection of patents in terms of both recall and precision.
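The arithmetic behind this random baseline can be illustrated with a minimal sketch. It assumes that patents are drawn uniformly at random without replacement, so that the expected number of relevant hits follows the hypergeometric mean. The function name and the dataset size of 100 patents with 36 relevant ones are hypothetical values, chosen only because they are consistent with the 63 % and 36 % reported above; they are not taken from the study's data.

```python
from fractions import Fraction


def random_review_baseline(total_docs, relevant_docs, reviewed_docs):
    """Expected recall and precision of reviewing patents purely at random.

    Hypothetical helper: assumes `reviewed_docs` patents are drawn uniformly
    at random, without replacement, from a dataset of `total_docs` patents
    of which `relevant_docs` are relevant.
    """
    if not 0 < relevant_docs <= total_docs:
        raise ValueError("relevant_docs must be in 1..total_docs")
    if not 0 < reviewed_docs <= total_docs:
        raise ValueError("reviewed_docs must be in 1..total_docs")

    # Mean of the hypergeometric distribution: expected relevant hits.
    expected_hits = Fraction(reviewed_docs * relevant_docs, total_docs)

    # Recall: expected hits over all relevant documents (= reviewed / total).
    recall = expected_hits / relevant_docs
    # Precision: expected hits over all reviewed documents (= relevant / total).
    precision = expected_hits / reviewed_docs
    return recall, precision


# Assumed illustration values, not the study's data: 100 patents,
# 36 of them relevant, 63 reviewed at random.
recall, precision = random_review_baseline(100, 36, 63)
print(float(recall), float(precision))  # prints: 0.63 0.36
```

Under these assumptions the expected recall depends only on the share of the dataset that is reviewed, and the expected precision only on the share of relevant documents, which is why random review offers no gain over the base rate.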
Cite this article
Moeller, A., Moehrle, M.G. Completing keyword patent search with semantic patent search: introducing a semiautomatic iterative method for patent near search based on semantic similarities. Scientometrics 102, 77–96 (2015). https://doi.org/10.1007/s11192-014-1446-9
DOI: https://doi.org/10.1007/s11192-014-1446-9