Scientometrics, Volume 105, Issue 3, pp 2005–2022

A systematic method to create search strategies for emerging technologies based on the Web of Science: illustrated for ‘Big Data’

  • Ying Huang
  • Jannik Schuehle
  • Alan L. Porter
  • Jan Youtie

DOI: 10.1007/s11192-015-1638-y

Cite this article as:
Huang, Y., Schuehle, J., Porter, A.L. et al. Scientometrics (2015) 105: 2005. doi:10.1007/s11192-015-1638-y


Bibliometric and “tech mining” studies depend on a crucial foundation: the search strategy used to retrieve relevant research publication records. Database searches for emerging technologies can be problematic in several respects, for example the rapid evolution of terminology, the use of common phraseology, or the extent of “legacy technology” terminology. Searching on such legacy terms may or may not pick up R&D pertaining to the emerging technology of interest, so one challenge is to assess the relevance of legacy terminology in building an effective search model. Common-usage phraseology additionally confounds searches in domains where broader managerial, public interest, or other considerations are prominent; in contrast, searching for highly technical topics is relatively straightforward. In setting forth to analyze “Big Data,” we confront all three challenges: emerging terminology, common-usage phrasing, and intersecting legacy technologies. In response, we have devised a systematic methodology to help identify research relating to Big Data. This methodology uses complementary search approaches, starting with a Boolean search model and subsequently employing contingency term sets to further refine the selection. The four search approaches considered are: (1) core lexical query, (2) expanded lexical query, (3) specialized journal search, and (4) cited reference analysis. Of special note here is the use of a “Hit-Ratio” that helps distinguish Big Data elements from less relevant legacy technology terms. We believe that such systematic search development positions us to do meaningful analyses of Big Data research patterns, connections, and trajectories. Moreover, we suggest that such a systematic search approach can help formulate more replicable searches with high recall and satisfactory precision for other emerging technology studies.
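The abstract names a “Hit-Ratio” but does not define it. As an illustration only, the sketch below assumes the ratio is the share of records retrieved by a candidate term that also appear in the core lexical query result. That definition, the 0.5 threshold, the function names, and the sample terms and record IDs are all hypothetical and are not taken from the paper.

```python
def hit_ratio(candidate_ids, core_ids):
    """Share of a candidate term's records that also fall in the core set.

    Assumed definition for illustration; the abstract does not give one.
    """
    candidate_ids = set(candidate_ids)
    if not candidate_ids:
        return 0.0
    return len(candidate_ids & set(core_ids)) / len(candidate_ids)


def select_terms(term_hits, core_ids, threshold=0.5):
    """Keep candidate terms whose hit ratio meets a (hypothetical) threshold."""
    return sorted(
        term for term, ids in term_hits.items()
        if hit_ratio(ids, core_ids) >= threshold
    )


# Hypothetical record IDs retrieved by the core query and by two candidate terms.
core_ids = {1, 2, 3, 4}
term_hits = {
    "mapreduce": {1, 2, 3, 9},                 # 3 of 4 records overlap the core set
    "cloud computing": {4, 10, 11, 12, 13},    # 1 of 5 records overlaps
}

print(select_terms(term_hits, core_ids))
```

Under this assumption, a term that mostly retrieves records already in the core set would be kept for the expanded query, while a broad legacy term with little overlap would be dropped or restricted further.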


Search strategy; Lexical query; Citation analysis; Big Data

Funding information

Funder Name: Forecasting Innovation Pathways of Big Data & Analytics
  • Grant Number: 1527370
Funder Name: China Scholarship Council
  • Grant Number: 201406030005

Copyright information

© Akadémiai Kiadó, Budapest, Hungary 2015

Authors and Affiliations

  • Ying Huang (1, 2, 3)
  • Jannik Schuehle (5)
  • Alan L. Porter (3, 4)
  • Jan Youtie (6)
  1. School of Management and Economics, Beijing Institute of Technology, Beijing, China
  2. Lab of Knowledge Management and Data Analysis (KMDA), Beijing Institute of Technology, Beijing, China
  3. School of Public Policy, Georgia Institute of Technology, Atlanta, USA
  4. Search Technology, Inc., Atlanta, USA
  5. Department of Economics and Management, Karlsruhe Institute of Technology, Karlsruhe, Germany
  6. Enterprise Innovation Institute, Georgia Institute of Technology, Atlanta, USA
