INTRODUCTION

Nielsen and colleagues (2020) have produced very valuable insights into the methodological choices of articles published in the Journal of International Business Studies (JIBS) over the last 50 years. Among other things, their findings lend support to the idea that the increased use of secondary data has established itself as a methodological convention in the journal. However, while Nielsen et al. (2020) call for more triangulation to counterbalance the decline in methodological diversity, their interpretation of the JIBS data does not adequately emphasize important additional implications that are shaping the international business (IB) domain. Specifically, the surge in secondary data usage implies a declining share of primary data studies among all published empirical papers. This is important because, all else being equal, a systematic preference for one type of data over another can have serious implications for theory generation and testing, the field’s exposure to specific methodological risks, and ultimately the field’s advancement. Furthermore, in their analysis, Nielsen et al. (2020) do not address earlier speculations that the IB field may have neglected the role of individuals, their motivations, abilities, and actions, or interactions between them, which means that many IB phenomena are incompletely captured by our field. While Nielsen et al. (2020) remain silent on this issue, previous reviews have only selectively focused on specific topics within the IB domain, such as knowledge sharing (Foss & Pedersen, 2019), subsidiary management (Meyer, Li, & Schotter, 2020), or global strategy (Contractor, Foss, Kundu, & Lahiri, 2019), therefore calling for a more comprehensive analysis. Finally, while Nielsen and colleagues’ insights are very valuable, they do not take into account other relevant IB journals.

In this commentary, which we see as an extension of Nielsen et al.’s (2020) key tenets, we focus on the six leading IB journals and the empirical publications therein over the last 20 years. Specifically, we analyze the share of studies that use secondary vs. primary data over time. Our results clearly show a declining share of primary research in IB. This pattern is robust across all IB journals. Furthermore, we investigate the extent to which the secondary data in IB covers individual-level constructs and issues. Our data show that a sharp increase in the share of secondary data in IB is accompanied by a persistently low share of secondary data research focusing on individual-level constructs and issues. We discuss the theoretical mechanisms and implications of the observed trends. We conclude by calling for more healthy skepticism towards secondary data constructs and their origins in IB, and for further leveraging the untapped potential of individual-level secondary data to advance the IB domain theoretically and empirically.

APPROACH AND RESULTS

Approach

We applied a systematic review approach (Tranfield, Denyer, & Smart, 2003) to examine the type of data collected in both quantitative and qualitative empirical papers published in the most prominent IB journals: Global Strategy Journal (GSJ), International Business Review (IBR), Journal of International Business Studies (JIBS), Journal of International Management (JIM), Journal of World Business (JWB), and Management International Review (MIR).¹ We followed all necessary steps and transparency requirements as suggested by Aguinis, Ramani and Alabduljader (2018). In line with their recommendations, we only included research articles (“Original Papers” or “Articles” in JIBS). We excluded articles published in the “From the Editor” and “Research Notes and Commentaries” sections.² In total, this amounted to 4202 articles published between 2000 and 2019.

Extant literature rarely provides a precise definition for what distinguishes primary data from secondary data. In fact, recent articles on the topic do not explicitly define the two types of data (e.g., Aguinis, Cascio, & Ramani, 2017; Bosco, Aguinis, Field, Pierce, & Dalton, 2016; Miller, Davis-Sramek, Fugate, Pagell, & Flynn, 2020). Furthermore, previous researchers, including Nielsen et al. (2020), have coded data according to the type of data collection (e.g., surveys or interviews) rather than the type of data. In an attempt to bring more clarity to this issue, and after fruitful discussions with the editorial team,³ we define primary data as raw data, i.e., data that provide raw information and evidence about a study object. Secondary data, in turn, are defined as data that are no longer in raw format but are instead obtained from sources that provide descriptions, interpretations, syntheses, or aggregations of primary data. Because we acknowledge that the definition may still leave some ambiguities, for example as to the precise meaning of raw data, we detail our approach to coding below.

Specifically, we coded data as primary when they were collected first-hand through structured surveys⁴ (including structured questionnaires and structured interviews), unstructured/semi-structured interviews, observations, and experiments⁵ (including simulations). Furthermore, we coded the following as primary data: meta-analyses, studies based on data directly provided by companies that were not publicly available (e.g., data from HR archives), and publicly available data obtained directly from companies, such as annual reports or company websites. Similarly, company presentations and other data collected as sources of triangulation in qualitative research (e.g., case study research) were considered raw data. As such, our coding may slightly overstate the prevalence of primary data. All other sources were coded as secondary data. For example, this included all work that used Hofstede’s country scores, as the scores represent aggregated information from the original raw survey data. Similarly, work using indices that aggregate raw data, such as the World Bank’s Ease of Doing Business index or Policymaking Uncertainty of the POLCON data (Henisz, 2000), as well as financial performance figures and ratios such as those produced by Compustat (see Calantone & Vickery, 2010), was coded as secondary data.

When coding all articles, we specifically focused on the substantive variables. Substantive variables comprise all independent, moderator, mediator, and dependent variables that were part of the main model, process, or typology examined in that particular study. Thus, if a paper used survey items for both independent and dependent variables and a number of control variables from secondary data sources, then this study was coded as “exclusively primary data”. When the independent variable was based on secondary data while the dependent variable was based on primary data, we coded the paper as using both “secondary data” and “primary data”. Interviews used merely to determine the appropriate sample, develop questionnaires, or make sense of secondary data findings were not considered.
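The coding rule described above can be illustrated with a minimal sketch. It is not the coding instrument used in the study; the class `Variable`, the function `code_paper`, and the label strings other than “exclusively primary data” are hypothetical names introduced only for illustration. The sketch assumes that each variable has already been assigned a role and a data source by a human coder.

```python
from dataclasses import dataclass

# Roles that count as substantive under the rule described in the text.
SUBSTANTIVE_ROLES = {"independent", "moderator", "mediator", "dependent"}


@dataclass(frozen=True)
class Variable:
    name: str
    role: str    # one of SUBSTANTIVE_ROLES, or "control"
    source: str  # "primary" or "secondary", as judged by a human coder


def code_paper(variables):
    """Code one paper by the data sources of its substantive variables only."""
    sources = {v.source for v in variables if v.role in SUBSTANTIVE_ROLES}
    if sources == {"primary"}:
        return "exclusively primary data"
    if sources == {"secondary"}:
        return "exclusively secondary data"   # hypothetical label
    if sources == {"primary", "secondary"}:
        return "primary and secondary data"   # hypothetical label
    return "not classifiable"                 # no substantive variables found


# Survey-based independent and dependent variables, secondary controls:
survey_paper = [
    Variable("entry mode", "dependent", "primary"),
    Variable("perceived risk", "independent", "primary"),
    Variable("GDP per capita", "control", "secondary"),
]
print(code_paper(survey_paper))
```

Because control variables are filtered out before the sources are compared, the example paper is coded as relying exclusively on primary data even though it draws some controls from a secondary source.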

In a second step, we coded each study using secondary data in terms of whether the substantive variables in its core model were measured at the individual level or at higher levels of analysis (i.e., team, firm, industry, clusters, province/region, and country level). Specifically, if data were initially collected at the individual level but were later aggregated to a higher level of analysis, and if the research question and main statistical analyses focused on the higher level of analysis, we coded our data at that level. For example, there is a long tradition in IB research of drawing on individual patent applications (e.g., Phene & Almeida, 2008). We thus coded studies drawing on individual patent applications at a higher level of analysis if the data were aggregated to a higher level, but accounted for them as individual-level studies if the data were used both at the individual level and at higher levels of analysis.
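The level-of-analysis rule can likewise be sketched in a few lines. This is an illustrative reading of the rule rather than the procedure the coders followed; the function `code_level` and its return labels are hypothetical, and the input is assumed to be the set of levels at which the substantive variables enter the core model after any aggregation.

```python
# Higher levels of analysis named in the text.
HIGHER_LEVELS = {"team", "firm", "industry", "cluster", "province/region", "country"}


def code_level(levels_in_core_model):
    """Code a secondary-data study by the levels used in its core model.

    `levels_in_core_model` holds the levels at which substantive variables
    are analyzed AFTER aggregation, not the level at which the raw data
    were originally collected.
    """
    levels = set(levels_in_core_model)
    unknown = levels - HIGHER_LEVELS - {"individual"}
    if not levels or unknown:
        raise ValueError(f"unexpected levels: {sorted(unknown) or 'none given'}")
    if "individual" not in levels:
        return "higher level"
    if levels == {"individual"}:
        return "individual level (single-level)"
    return "individual level (multi-level)"


# Patent applications aggregated to firm-level counts:
print(code_level({"firm"}))
# Patent data analyzed per inventor and per firm:
print(code_level({"individual", "firm"}))
```

Under this reading, patent data that are collapsed into firm-level counts yield a higher-level study, whereas the same data analyzed at both inventor and firm level yield an individual-level, multi-level study.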

We took great care to validate our coding procedure. Four coders independently performed coding work and 60.7% of all analyzed articles (2550 articles) were coded by two or more coders. Coders 1, 2, and 4 were graduate students who all possessed research assistant experience and who had gone through rigorous training with Coder 3, one of the co-authors. A senior scholar was consulted in case of ambiguities. In the first round, Coder 1 examined 86.4% (3632) and Coder 2 17.2% (721) of all articles. Both coders wrote detailed memos explaining their coding, and Coder 3 subsequently checked all codes. Whenever ambiguities occurred (which applied to 150 articles), Coder 3 marked these papers and handed them for additional checks to the respective other coder of this first round. For all cases that did not reach inter-coder agreement between Coders 1 and 2, Coder 3 reread the full methods section of the papers and made a final decision on the coding.

In the second round, when we examined the share of studies based on secondary data that used individual-level variables, an additional fourth coder reanalyzed all papers except literature analyses (107 articles), for a total of 1666 articles (45.8% of all empirical articles). Coder 3 rechecked all articles that were either unclear for Coder 4 or that were used during the training of Coder 4. In total, Coder 3 coded 720 papers (19.8% of all empirical articles) in the first and the second round.
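The coverage figures reported for the two coding rounds can be cross-checked with simple arithmetic. The sketch below uses only the counts stated above; the number of empirical articles is not reported directly, so `implied_empirical` is an inference from the statement that 1666 articles correspond to 45.8%, and it should be read as approximate.

```python
TOTAL_ARTICLES = 4202  # all articles in the sample, as reported


def share(count, total):
    """Percentage share rounded to one decimal place."""
    return round(100 * count / total, 1)


# First round: shares of all 4202 articles.
first_round = {"two or more coders": 2550, "Coder 1": 3632, "Coder 2": 721}
first_round_shares = {k: share(v, TOTAL_ARTICLES) for k, v in first_round.items()}
print(first_round_shares)

# Second round: shares refer to empirical articles only. The base is inferred
# from "1666 articles (45.8%)" and is therefore an assumption.
implied_empirical = round(1666 / 0.458)
print(implied_empirical, share(720, implied_empirical))
```

The first-round shares reproduce the reported 60.7%, 86.4%, and 17.2%, and the inferred base of roughly 3,640 empirical articles is consistent with the 19.8% reported for Coder 3.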

We acknowledge that our approach of coding secondary data studies according to the level at which the raw data have been aggregated and analyzed according to the main research question may understate the prevalence of individual-level data in IB. However, we would argue that statistical aggregation not only omits individual variance from all further analyses but, importantly, the conceptualizations and theoretical conclusions drawn from such studies are necessarily limited to that aggregate level of analysis and do not permit inferences about individual heterogeneity (Foss & Pedersen, 2019). Our detailed methodological approach is available upon request.

Results

Table 1 presents the number and share of articles per data type across journals, while Table 2 aggregates this information for all journals over the entire time frame.

Table 1 Number and share of papers according to data type across journals between 2000 and 2019
Table 2 Number and share of papers according to data type over specific time periods

Our results show that secondary data have been widely used in IB and account for 48.7% of all published empirical papers. More importantly, the share of empirical papers using secondary data has been growing over the years, as shown by an increase of 52.6% from the period of 2000–2004 to 2015–2019 (see Figure 1). This trend is very similar for the share of papers using secondary data either exclusively or in combination with primary data, in all journals, and cumulatively for all IB journals together. At the same time, the share of papers using primary data has shrunk by 21.7% over the same period (see Figure 2). Again, this trend is similar for the share of empirical papers using primary data either exclusively or in combination with secondary data, in all journals, and cumulatively for all IB journals together.

Figure 1 Share of empirical papers using secondary data. *Since 2011 for GSJ, as this was the year of its first publication. Note: Percentages refer to all empirical papers using secondary data, both exclusively and in combination with primary data.

Figure 2 Share of empirical papers using primary data. *Since 2011 for GSJ, as this was the year of its first publication. Note: Percentages refer to all empirical papers using primary data, both exclusively and in combination with secondary data.

Furthermore, of all articles exclusively drawing on secondary data, only about 4.1% (62 articles, out of which 16 are single-level studies and 46 are multi-level studies) use individual-level variables. Adding studies that combine secondary and primary data, this percentage rises to 5.5% (97 articles, out of which 20 are single-level studies and 77 are multi-level studies). These small shares are relatively persistent over time, even though there is a slight increase specifically in the last period (with a share of 5.3%) compared to earlier periods (2.2–4.0%, see Table 2). IBR, JIBS, and JWB (with 4.7, 4.3, and 6.2%, respectively) were slightly above average (4.1%, see Table 1).

DISCUSSION

Two mechanisms may explain the observed shifts: (1) increases in the quality and versatility of secondary data relative to primary data and (2) perceived increases in the quality and versatility of secondary data that have led to institutional reinforcement over time. While our data do not allow us to substantiate the relative salience of each mechanism, we believe that both mechanisms are at play and we illustrate each using example constructs or fields of study within IB.

First, the sharp relative increase of secondary data might be due to their quality and versatility having improved over time while persistent issues inherent in primary data have become more transparent. For example, scholars have argued that the declining survey and interview response rates indicate that scholars fail to impress executives with the relevance of their work (Chidlow, Ghauri, Yeniyurt, & Cavusgil, 2015) while also biasing samples and estimates (Aguinis et al., 2018). Furthermore, researchers have voiced methodological challenges of primary research in IB, including common method variance (Chang, Van Witteloostuijn, & Eden, 2010), limited reproducibility and replicability (Aguinis & Solarino, 2019; Aguinis et al., 2017), and language effects (Harzing, Reiche, & Pudelko, 2013). Secondary data address some of these concerns. They facilitate replicability and largely avoid data privacy concerns (Salkind, 2010). Secondary data, especially in the case of financial figures or technical ratios, may also be less biased by respondent perceptions (Calantone & Vickery, 2010). They further enable the researcher to perform longitudinal analyses over longer time frames, wider geographical areas, and larger samples than would often be possible through the collection of primary data, thereby allowing for more sophisticated statistical techniques.

Second, Nielsen et al. (2020) invoked Kuhn’s (2012) framework of normal science to explain how methodological choices become entrenched as a dominant paradigm and lead to a decline in methodological pluralism in a research domain like IB. From that perspective, secondary data use may not (only) have increased due to its improved quality and versatility per se, but also as a result of perceived increases in quality and versatility that have established and institutionally reinforced secondary data as the “normal” or typical source of IB research. There are indeed good reasons for why secondary data use may have become a methodological convention in IB. Specifically, secondary data have become more available and accessible, thus often saving substantial time and money during data collection (Ketchen, Ireland, & Baker, 2013; Nielsen et al., 2020). Measures of country risk, political risk, corruption, as well as variables capturing institutional and cultural context or distance are readily available, can be processed fast, and have been used in many studies (e.g., Bekaert, Harvey, Lundblad, & Siegel, 2014; Beugelsdijk, Ambos, & Nell, 2018). While increased availability and accessibility are not an issue per se, they may become problematic when researchers respond by increasingly investigating topics that fit the data as opposed to focusing on relevant research questions and collecting data to address them. To this end, the increased competition for journal space and growing publication pressures (Aguinis, Cummings, Ramani & Cummings, 2020) may have reinforced the trend towards secondary data.

Importantly, the institutionalization of secondary data use in IB may have also come at the expense of recognizing salient drawbacks inherent in secondary data. These involve risks surrounding secondary data in general such as construct validity problems that arise from using archival proxies that may not properly reflect the underlying construct or are used to measure a wide range of different constructs (Ketchen et al., 2013), and concerns over the selective choice of substantive variables, which may be more prevalent in large secondary data sets (Aguinis et al., 2017). However, some problems are highly specific to IB. For example, consider conceptual equivalence issues that stem from countries reporting different values for a given statistic due to differences in how the units and thresholds are defined (Doole & Lowe, 2008; Sorrentino, 2000). While conceptual equivalence issues are of course relevant for primary data as well (Hult et al., 2008), secondary data usually do not allow the researcher to control or mitigate these concerns in the research design.

Furthermore, governments, national statistical offices, international organizations, or multinational corporations may have reasons to manipulate the data they gather, which then find their way into widely used secondary databases. For example, politicians in many countries might aim to overstate some factors to attract more FDI while understating others, for example to receive more foreign aid (Doole & Lowe, 2008). A case in point is Greece, which falsified data about its public finances (Barber, 2010). Similarly, cultural measures (e.g., Hofstede, Schwartz, and GLOBE) as well as the Eurobarometer, Transparency International Corruption Index, World Bank Governance Indicators, World Values Survey, or the Global Competitiveness Report have been heavily criticized for their methodological problems (e.g., Andersson & Heywood, 2009; Hofstede, 2006; Kirkman, Lowe & Gibson, 2017; Maseland, Dow & Steel, 2018; Shenkar, 2012). In fact, all of these measures are originally based on survey data, which means that they frequently suffer from survey-related problems and that secondary data researchers have difficulties controlling for or mitigating these problems in their research. At the same time, references to the biases and weaknesses of such data are conspicuously absent in much of the literature. In other words, researchers seem to assume that because a credible source such as the World Bank made the data available, the data are unproblematic and can be taken at face value. Given the sharp increase of secondary data, it thus appears that editors and reviewers do not consider the weaknesses and lack of mitigation possibilities inherent in secondary data as critical.

The idea that relevant concerns of secondary data may have been neglected is supported by empirical evidence regarding particular constructs. For example, in the case of cultural distance, Beugelsdijk et al. (2018) refer to Hofstede-based Kogut and Singh’s cultural distance as having achieved the status of a “quasi-objectified” measure that is insufficiently questioned and discussed. Similarly, Ambos and Håkanson (2014: 2) suggest that the distance construct has been subject to a process of reification “whereby we come to take constructs for granted (…) largely forgetting the assumptions and rationale that underpinned them originally.” In short, prior work has criticized that researchers using these indices and datasets simply reference other papers discussing respective problems without any further explanation of how these problems may affect the results of the current study (Beugelsdijk et al., 2018; Shenkar, 2012). Beyond distance, a number of other constructs measured through secondary data such as innovation performance and international knowledge transfer measured via patents (e.g., Berry, 2020) or absorptive capacity measured via R&D spending (Lane, Koka, & Pathak, 2006) may have experienced reification and become quasi-objectified, thereby unduly increasing the perceived validity and reliability of secondary data.

Our finding that secondary data-based research at the individual level remains scarce might similarly be the result of researchers’ focus shifting from the research question to the data available. This can be highly problematic to the extent that it is a systematic shift. However, there is a range of novel sources of secondary data at the individual level that IB researchers may simply not be sufficiently aware of or equipped to use. For example, virtual communication habits such as language use, turn-taking in virtual meetings, or frequency of use, which may be aggregated per individual over time or across meeting type by platforms such as Zoom or Microsoft Teams, have received hardly any application in IB. Additionally, mobility tracking during international business travel, and individual differences in web browsing habits or social media usage that can be compiled by analytics providers or technology firms such as Google or Facebook, would be fruitful for individual-level research. Such data may advance our understanding of how individuals experience global work, how global leaders communicate and motivate in a virtual space, and how decision-making heuristics emerge and impact key IB outcomes.

The persistently low share of individual-level secondary data in IB also risks reducing theoretical pluralism. The type of data used is closely connected to the type of research questions that can be answered with them given that measurement should occur at the level of the hypothesized mechanism. The neglect of individual-level data thus likely leads to an underrepresentation of theories that contain individual-level concepts and by extension our ability to further develop salient micro-level mechanisms in IB theories. Neglecting the individual level also effectively reduces possibilities for multi-level theorizing. As Buckley and Lessard (2005, p. 595) point out, “the key to international business is that it approaches empirical phenomena at a variety of levels of analysis.” Yet, multi-level models in IB have unrealized potential (Peterson, Arregle, & Martin, 2012). For example, Maitland and Sammartino (2015) show that individuals’ assessments and judgments of relevant information are important microfoundations for internationalization theory, yet rarely find their way into IB theory generation and testing.

The concept of psychic distance is another case in point. Originally introduced in the literature by Beckerman (1956) as “the subjectively perceived distance to a given foreign country” (Håkanson & Ambos, 2010, p. 196), the concept has since departed from its original meaning. It is now frequently used and interpreted at the country level (see Håkanson & Ambos, 2010 for a more detailed overview), often framed as “psychic distance stimuli” (Dow & Karunaratna, 2006), and measured via secondary data. In fact, research capturing the original concept of psychic distance at the individual level through primary research is much less prevalent than research on country-level psychic distance stimuli (Baak, Dow, Parente, & Bacon, 2015; Håkanson & Ambos, 2010). Yet, jointly considering individual-level and country-level assessments of psychic distance would have important implications for a range of IB phenomena such as internationalization, locational choice, headquarters–subsidiary relationships, or global talent management. Taken together, our results extend recent claims (e.g., Contractor et al., 2019; Foss & Pedersen, 2019; Meyer et al., 2020) that the individual-level mechanisms underlying many phenomena remain underdeveloped and suggest that this may be true in IB more broadly and not only in specific domains such as knowledge sharing, subsidiary management, or global strategy.

CONCLUSION

Following Nielsen et al.’s (2020) results, we report a sharp increase in the share of secondary data to the detriment of primary data in the six leading IB journals over the last 20 years. While secondary data are often highly valid, reliable, and less biased, we would caution that the documented trend increasingly exposes IB to the specific risks and problems of secondary data. Specifically, we contend that the advent of more high-quality and easily accessible secondary data has influenced the generally accepted paradigm governing normal science in ways that may be problematic, increasing, perhaps unduly, the perceived as opposed to the factual quality and versatility of secondary data. Secondary data should be treated with the same healthy skepticism and quality checks as primary data, and researchers should report on the specific limitations of the original data as well as their potential effects on the focal study. Yet, we believe that more could be done in this regard within the IB domain as we strive for increasing standards of rigor (Chang et al., 2010). Additionally, we demonstrated a persistently low level of individual-level studies in research based on secondary data, and suggest that this untapped potential provides many valuable research avenues for advancing IB.

Notes

1. All these journals have ratings of 4 or 3 in the Academic Journal Guide 2018 in the field of International Business or Strategy. We excluded African Affairs, Asia Pacific Journal of Management, Journal of Common Market Studies, and Management and Organization Review (all ranked 3 on this list), since they are rather focused on area studies.

2. We also excluded non-original articles, such as JIBS Decade Award Articles, and editorials, perspectives, keynotes, book reviews, points, counterpoints, credits, glossaries, and letters from the editors.

3. We particularly thank the consulting editor for crucial input on this matter.

4. Surveys use a response format that is fully determined prior to administration and use numerical responses (or those that will be coded into numerical values in the analysis) in nominal, ordinal, or metric form.

5. Studies where at least two randomly assigned groups participate either in a lab or their natural “field” conditions and where only one group receives a treatment, while both groups receive post-tests in order to check the effect of the treatment; we also subsume the quasi-experimental method within this group, since this method has all attributes of an experiment, except for the random assignment to a treatment group.