Abstract
The paper presents a text mining approach to identifying technological trajectories. The main problem addressed is the selection of documents related to a particular technology, since these documents are needed to identify the trajectory of that technology. Two methods are compared: one based on word2vec and one based on lexical-morphological and syntactic search. The aim of the developed approach is to retrieve more information about a given technology and about technologies that could affect its development. We present the results of experiments on a dataset of over 4.4 million documents from the USPTO patent database, using self-driving car technology as an example. The results show that the developed methods are useful for automated information retrieval as the first stage of analyzing and identifying technological trajectories.
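As a rough illustration of embedding-based retrieval of similar documents, the sketch below averages word vectors for each document and ranks documents by cosine similarity to a query. It is not the authors' implementation: the vectors in TOY_VECTORS, the document texts, and the function names embed, cosine, and rank_documents are all hypothetical, and a real system would presumably use vectors trained with word2vec on the patent corpus.

```python
import math

# Hypothetical toy word vectors (assumption for illustration only);
# a real system would load vectors trained with word2vec.
TOY_VECTORS = {
    "vehicle": [0.9, 0.1, 0.0],
    "autonomous": [0.8, 0.2, 0.1],
    "sensor": [0.6, 0.4, 0.1],
    "lidar": [0.7, 0.3, 0.0],
    "polymer": [0.0, 0.1, 0.9],
    "coating": [0.1, 0.0, 0.8],
}


def embed(text):
    """Average the vectors of known words; return None if no word is known."""
    vecs = [TOY_VECTORS[w] for w in text.lower().split() if w in TOY_VECTORS]
    if not vecs:
        return None
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_documents(query, documents):
    """Return (doc_id, score) pairs sorted by descending similarity to the query."""
    query_vec = embed(query)
    if query_vec is None:
        return []
    scored = []
    for doc_id, text in documents.items():
        doc_vec = embed(text)
        if doc_vec is not None:
            scored.append((doc_id, cosine(query_vec, doc_vec)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical miniature "patent" collection.
DOCS = {
    "patent_a": "autonomous vehicle lidar sensor",
    "patent_b": "polymer coating",
    "patent_c": "vehicle sensor coating",
}

ranking = rank_documents("autonomous vehicle", DOCS)
for doc_id, score in ranking:
    print(doc_id, round(score, 3))
```

Averaging word vectors is only one simple way to obtain a document vector. It ignores word order and term weighting, which is one reason a lexical and syntactic search method can behave differently on the same query.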
Keywords
- Text mining
- Technological trajectories
- Similar document retrieval
This work was supported by the RFBR grant № 17-29-07016 ofi_m.
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Volkov, S.S., Devyatkin, D.A., Sochenkov, I.V., Tikhomirov, I.A., Toganova, N.V. (2019). Towards Automated Identification of Technological Trajectories. In: Kuznetsov, S., Panov, A. (eds) Artificial Intelligence. RCAI 2019. Communications in Computer and Information Science, vol 1093. Springer, Cham. https://doi.org/10.1007/978-3-030-30763-9_12
DOI: https://doi.org/10.1007/978-3-030-30763-9_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-30762-2
Online ISBN: 978-3-030-30763-9
eBook Packages: Computer Science, Computer Science (R0)