Introduction

International business (IB) phenomena provide new opportunities for identifying interesting and important relationships that are often overlooked in other studies, adding many dimensions of complexity to the research we conduct. This additional complexity emerges from several sources: from the cross-border relationships that organizations engage in as they have to deal with differences in economic, political, social, and geographic conditions; from the cross-country comparison of relationships that take into account additional variation in how the environment shapes relationships; and from the inclusion of the country level of analysis that alters relationships at lower levels of analysis. Previous editorials have explained how to deal with some of these issues by, for example, providing suggestions on how to: explain interaction effects within and across levels of analysis (Andersson, Cuervo-Cazurra, & Nielsen, 2014; Cortina, Köhler, & Nielsen, 2015), address multilevel challenges (Peterson, Arregle, & Martin, 2012), solve endogeneity problems (Reeb, Sakakibara, & Mahmood, 2012), improve qualitative research (Birkinshaw, Brannen, & Tung, 2011), address common method challenges (Chang, Van Witteloostuijn, & Eden, 2010), improve the theoretical identification of relationships (Bello & Kostova, 2012; Thomas, Cuervo-Cazurra, & Brannen, 2011), and more generally how to benefit from, and deal with, the inherent interdisciplinary nature of IB research (Cantwell & Brannen, 2011; Cantwell, Piepenbrink, & Shukla, 2014; Cheng, Birkinshaw, Lessard, & Thomas, 2014; Cheng, Henisz, Roth, & Swaminathan, 2009).

We build on these ideas and focus on providing a better understanding of how to ensure that the findings coming out of empirical studies are trustworthy, i.e., “worthy of confidence” (Merriam-Webster Dictionary, 2016; see also Lapan & deMarrais, 2003). Ensuring that the relationships identified in an empirical study are trustworthy is important in IB studies in particular and in management studies in general, because there is a very limited tradition of replicability that can help uncover researchers’ biases and differences in empirical techniques (Bettis, Ethiraj, Gambardella, Helfat, & Mitchell, 2016; Silberzahn & Uhlmann, 2015). This limited replicability is due to several reasons. Many of the samples are proprietary and closely guarded by researchers who want to extract the maximum number of publications out of their data collection effort. Even if the datasets are not proprietary, the specific samples may be difficult to replicate because researchers do not make their samples available to others, in contrast to studies in economics. Further, replication tends to be discouraged from publication in leading journals, including Journal of International Business Studies (JIBS), which prioritize novelty in ideas and analyses. In the absence of replication, each paper in and of itself has to demonstrate that it is worthy of confidence.

In doing so, a key task in empirical papers is ruling out alternate explanations for the phenomena under investigation. It is easy for this step to be neglected. Authors are encouraged to increase the storytelling nature of their articles (e.g., Haley & Boje, 2014; Pollock & Bono, 2013) and this includes developing a straightforward, accessible story line (Ragins, 2012). It can be tricky to introduce the possibility of alternative explanations without deviating from the plot of the narrative. However, it is necessary to do so for a paper to be considered trustworthy.

The objective of this editorial is to provide guidance to help IB scholars address alternate explanations in their empirical manuscripts, and ensure that they have identified the correct relationships and mechanisms so that readers can place higher trust in their findings. The editorial is organized in two parts that address the particular challenges of two distinct empirical traditions: qualitative and quantitative. Part A deals with qualitative research methods. It discusses multiple and integrated techniques to strengthen readers’ belief that the explanations arising from the analysis of one or few cases are the correct ones and not subject to alternative influences that emerge from data limitations or the inherent biases in the minds of the researchers. Part B deals with quantitative studies. It discusses large sample studies that test whether theoretically-derived relationships hold on a large number of individuals, teams, organizations, or countries. It provides suggestions on how to control for alternative explanations not only in the analyses of data, which has been the usual focus of the discussions of controls in quantitative studies, but also in the theoretical explanation of the hypothesized relationships as well as in the research design.

In discussing qualitative and quantitative research separately in two distinct sections, we recognize that we are inviting at least two types of criticism. The first is that there is no objective, clear-cut delineation between qualitative and quantitative research. As Small points out “the quantitative versus qualitative opposition has been used to contrast many kinds of alternative studies: large-n versus small-n, nomothetic versus idiographic, causal versus interpretive, variable-based versus case-based, explanatory versus descriptive, probabilistic versus deterministic, and numerous others” (Small, 2011: 59). We agree that the two categories are not mutually exclusive. Further, we agree with Small that “qualitative” and “quantitative” can refer independently to data, to data collection, and to data analysis, which renders the binary classification of many studies difficult. For example, with increased accessibility of electronic text and software tools such as sentiment analysis, there is greater quantification of qualitative data (Kaplan, 2015). However, we retain the distinction here for clarity of exposition, and for each section of the editorial we draw on scholarly authorities that are unambiguously about qualitative or quantitative research.

The second criticism we invite is that the relevant issues associated with the two research traditions are very different; in other words, we are discussing apples and oranges with only a loose connection under the umbrella of trustworthiness. However, we have two reasons for combining both research traditions in a single editorial. The first reason is to highlight the importance of being open minded. Too often scholars dismiss research that does not conform to their expected standards of analysis, and this is in part because of a natural tendency of paying attention to what one is familiar with. The current JIBS editorial team values theoretical and methodological pluralism to promote complementary ways to address new and difficult research questions and enhance the overall development of the field. The second reason is the value that one can gain from better understanding an alternative research tradition. We want to emphasize the value of comparing and contrasting research traditions next to one another. Although the specifics differ, both traditions face challenges in ensuring the identification of findings that other researchers can trust. Authors should recognize that reviewers will include experts in the methods they use, who will prioritize method-specific standards, but they are also likely to get reviewers who may be more familiar with other types of empirical methods. It is imperative that authors can explain to this second group of reviewers how they are establishing trustworthiness through the methods-related choices they make. An IB scholar keeping up with the literature needs to understand how trustworthiness is established for empirical approaches they may have little experience with. Thus both parts of this editorial are relevant to IB scholars even if they self-identify with only one of the research traditions covered.

Part A: Producing Trustworthy Qualitative International Business Research

Research based on qualitative data has played a long and illustrious role in IB (Birkinshaw et al., 2011); yet, the proportion of qualitative research appearing in JIBS is lower than this track record might warrant. There are probably several interrelated reasons for this. There are few submissions of qualitative papers to JIBS. There is limited training in many PhD programs in qualitative methods, and so researchers may lack familiarity with them. There is a lack of established standards for analyzing and presenting data (e.g., Bansal & Corley, 2012; Pratt, 2008), which makes the research process seem uncertain. It is time-consuming to embark on the long journey involved in collecting and analyzing qualitative data, such as gaining access to research sites, conducting interviews, and analyzing interview transcripts and documents. On top of all of this, there is the language challenge: primary data from interviews and participant observation often need to be collected in more than one language, transcriptions must be done by a native speaker and at some point translated into English for publication in JIBS, and assuring meaning congruence and functional equivalence of terms is challenging.

In addition to these supply-based reasons, we believe that a factor constraining the publication of qualitative research papers is that they are having difficulty getting through the review process successfully. While the nature of the difficulties varies, we have noticed that a weakness common to many qualitative research submissions is that the authors have not paid sufficient attention to demonstrating the trustworthiness of their research. To address this, we provide guidelines as to how qualitative researchers in IB can establish this trustworthiness in their manuscripts.

At the outset we note that researchers wishing to use qualitative methods have many resources from which to draw inspiration. There was a JIBS Special Issue on Qualitative Methods (Birkinshaw et al., 2011), and there have been recent JIBS articles on qualitative methods in general (e.g., Doz, 2011) and on specific topics related to qualitative methods such as longitudinal historical research (Burgelman, 2011), grounded theory (Gligor, Esmark, & Gölgeci, 2016), case-based research (Welch, Piekkari, Plakoyiannaki, & Paavilainen-Mäntymäki, 2011), and ethnography (Westney & Van Maanen, 2011). There are articles in other journals on topics particularly relevant to IB, such as process-based research (e.g., Langley, 1999; Welch & Paavilainen-Mäntymäki, 2014) and there are classic texts such as Corbin and Strauss (2008), Glaser and Strauss (2011), Miles and Huberman (1994), Marschan-Piekkari and Welch (2011), Piekkari, Welch and Paavilainen (2009), Van Maanen (1998) and Yin (2009). We encourage authors to consult these and other resources when they are making research design and analysis decisions, and to use them to justify these decisions when reporting research results in their manuscripts.

Our intention in this editorial is to highlight the importance of making explicit and consistent choices in order to establish trustworthiness in a qualitative manuscript submitted for publication. This requires rigor from the start of a research project, because the conceptualization and design of a project influence the nature of the analysis that can be undertaken, and therefore the findings that constitute a scholarly contribution to the field. There are three well-known paths that are unlikely to lead to successful outcomes. One such path is converting a teaching case into a research case, which is problematic because a teaching case will rarely have the theoretical relevance and the rich data required of a research case. A second questionable path can occur in situations where it is difficult to collect data from a sample large enough to establish statistical significance, and so a researcher collects data from several companies and attempts to establish generalization by showing that multiple companies are engaged in the same strategies. This use of case studies is an example of a theoretical contribution that small n studies cannot make. Small n studies cannot make frequency-based insights, such as the propensity to engage in a particular firm behavior, because the frequency observed is highly dependent on the particular cases selected for examination. Moreover, small n studies can rarely explain outcomes such as performance, which are affected by many factors, because they cannot control for these factors as can large-scale quantitative studies. Finally, a third questionable path is “convenient sample driven” research, or “squat ethnography” (Van Maanen, 1998), where a researcher has access to a subject (individual, team, company, country) and starts collecting data. Once the data are collected, the researcher starts analyzing them and thinking about what to do with them, hoping to have a eureka moment in which something that seems to be different emerges from the data. This approach tends to be justified with an argument along the lines of “with an open mind and with no prior biases I studied company x to be able to identify new patterns.” However, such an approach mistakes having an open mind for having no clue about what to do!

None of these paths are likely to result in a trustworthy manuscript. Instead, trustworthiness needs to be built into the start of a manuscript and maintained consistently throughout it. We next provide some guidance as to how this can be done in the research context, research design and empirical analysis. Table 1 summarizes the ideas presented in these discussions.

Table 1 Recommendations for establishing trustworthiness in qualitative research

Trustworthiness in Research Context

Qualitative methods are inherently embedded in context and so it is critical that the context of studies based on qualitative methods be explicitly defined. The type of context that is relevant to one study may be different from the type of context relevant to another study – for example, it could be an event, a type of environment, or a particular situational strength (Johns, 2006) – but it is important that the contextual nature of the research be consistent across all aspects of the manuscript – the research question, the literature review, methodological choices and the theoretical interpretation of the findings. If this is done effectively, then the contextual delineation of the study bounds the theoretical claims that can be made, thereby providing clarity around what is and what is not explained.

Because context is so central to the theoretical and empirical aspects of qualitative research, it is incumbent on authors to justify the particular context they are studying. At a basic level, authors should consult the JIBS Statement of Editorial Policy, which describes the meaning of IB with respect to submissions to the journal. Beyond this, it is advantageous for authors to show that the specific context they are studying is theoretically interesting and relevant to current scholarly IB conversations.

In many qualitative studies, the motivation to study a particular context is based on observations of real world phenomena. For example, Brannen and Peterson (2009) justify their study of a Japanese acquisition in the US by highlighting the high failure rate of cross-border mergers and acquisitions and the lack of theory to explain them. In the absence of prior theory, as in this case, it is difficult to develop hypotheses to be tested in a large scale study, and so inductive, qualitative methods are used to generate or create theory (Edmondson & McManus, 2007). Sometimes, however, the motivation to select a particular context is based on prior research and the questions it leaves unaddressed. For example, Jonsson and Foss (2011) justify their study of the Swedish furniture retailer IKEA by noting that although scholars understand the trade-offs between replication (scale) and local adaptation, little is known about the processes through which both can be accomplished. It is interesting to note that in both of these papers, the context is just one organization. That is not always the case in qualitative studies, of course. For example, Caprar’s (2011) study of the culture of local employees of MNEs is based on focus groups of employees of American MNEs in Romania. He frames this choice of context as relevant to culture – the key theoretical construct – since Romanians are both welcoming of foreign investment and sufficiently culturally distant from Americans to be theoretically interesting.

These examples illustrate that in justifying a research context, it is important to clarify what is and what is not known about the phenomena under investigation, and to be explicit about why a qualitative research approach is used. The first task, positioning a scholarly paper in prior literature, is beneficial regardless of the empirical method. However, Pratt (2008) points out that a particular challenge for qualitative researchers is to manage the tension between recognizing and drawing on existing theory, while also distancing from it to show that new theory has been generated. He suggests developing open theoretical frameworks that describe prior research while highlighting where prior research has been largely silent, in order to create a new space for an author’s contribution. In creating these boundaries between what is known and what is not yet known, an author can credibly signal that alternative explanations for the paper’s findings are unlikely.

The second task, justifying the use of qualitative methods, is important in conveying the overall theoretical objectives of the research. While articulating an explicit research question is beneficial in conveying the specific focus of the research, communicating the nature of the findings in theoretical terms helps readers to follow the thread of the storyline. Are you using qualitative methods to extend theory in a particular direction or are you building new theory? Are you generating variance theory or process theory (Langley, 1999)? Are you intending to develop testable propositions or reveal new interpretations of theoretical constructs or relationships? An important dimension of communicating the nature of your findings is being precise with respect to the outcome you are explaining; for example, learning processes within MNEs (e.g., Jonsson & Foss, 2011), variation in SME internationalization practices (e.g., Lamb, Sandberg, & Liesch, 2011) or variation in managerial narratives (e.g., Haley & Boje, 2014). Since choices among these theoretical objectives are connected with choices related to research design, empirical analysis and reporting of findings, expressing them clearly and early in the paper helps the reader understand the subsequent choices you make. This consistency therefore enhances the trustworthiness of the explanations offered as theoretical contributions.

Trustworthiness in Research Design

We have come across misperceptions that research based on quantitative data and deductive reasoning is empirical research, while research based on qualitative data and inductive reasoning is conceptual research. These perceptions are wrong. Both are empirical studies and in both the quality of the research design is crucial for establishing trustworthiness. Moreover, it is crucial to check for data quality in qualitative research, because there are no statistical tests to provide assurances about the operationalization of theoretical constructs and the strength of the relationships among them.

Three aspects of the design of qualitative research can substantially influence perceptions of its trustworthiness: site selection, data replication, and data triangulation. First, with respect to site or sample selection, the researcher needs to justify how and why they chose a single site (one case), or how and why they constructed a sample of multiple cases, such as individuals, teams, organizations, events, regions or countries. Whether one case or a sample of cases is selected, the basis of selection needs to be tightly coupled with the theoretical context of the study and the interpretation of its findings in order for the choice to be seen as trustworthy. Single cases can be justified because they are extreme, unique, representative, revelatory or longitudinal (Yin, 2009: 47–49) and it is important to embed the justification in the theoretical contribution of the paper. As Siggelkow (2007) points out, it is easier to justify a special case than a representative case because you can show that it was selected to allow you to gain insights that other cases would not provide. For example, in order to reveal insights about the liability of foreignness, Brannen (2004) chose the US entertainment firm Walt Disney Company as a research site because it was an extreme case of paradoxes regarding foreignness. When the objective is to investigate variance, it is important to justify the selection of several cases on the basis of theoretical diversity, so individual cases can serve as replications, contrasts and extensions to the emerging theory (Eisenhardt & Graebner, 2007). For example, Lamb et al. (2011) wanted to capture the greatest possible variation in small firm internationalization and so they justified their cases by emphasizing that they reflected a variety of international experiences and histories within and across different wine export networks that helped better understand internationalization. In the field of IB it is not unusual to combine a single site with a theoretically diverse sample within that site. For example, Jonsson and Foss (2011) chose IKEA as a site because it exhibits a unique combination of format standardization and local adaptation, but to investigate variance in learning within IKEA, they interviewed employees in three markets (China, Japan and Russia) whose differing degrees of development were likely to be associated with variance in learning.

A second aspect of research design that influences the trustworthiness of a manuscript is data replication. Replication adds credibility to findings because it provides support that they are deeply grounded in diverse empirical evidence and not idiosyncratic to one particular case (Eisenhardt & Graebner, 2007). As we have already pointed out, including multiple cases (interviewees, firms) in a sample provides replication. Researchers can also provide replication by collecting data more than once. For example, in Caprar’s (2011) study of the culture of local employees, he conducted three focus groups, varying their composition and timing in order to be able to assess whether these factors impacted the findings. In process studies, replication can be provided through data collected on multiple observations longitudinally (Langley, Smallman, Tsoukas, & Van de Ven, 2013). For example, Bingham (2009) captured data on processes associated with multiple foreign entries over time. In this case, the study was designed with replication across organizations (cases) and within organizations (entries), but longitudinal data collection can also provide within-case replication when the study is based on a single organization. In ethnographies, which are designed specifically to describe and understand how groups of individuals (cultures) function (their norms and patterns of behavior, values and basic assumptions), replication is characterized by its continuous nature. The research outcomes of ethnography are detailed narrative accounts of cultural phenomena told as much as possible from the native’s point of view, and so participant observation is a key aspect of the methodology. The ethnographer needs to find a role within the group under observation from which to participate in some manner, even if only as “outside observer.” Participant observation, therefore, is limited to contexts where the community under study understands and permits it. Further, since the ethnographer’s aim is to understand predominantly tacit, complex, contextually embedded, existential phenomena, the amount of time spent in the field must be substantial – to an anthropologist this means at least 1 year, though a year may be too brief if the research involves learning or perfecting a new language on the part of the researcher. Thus, rather than being characterized by discrete replications, ethnographic research is characterized by diverse and continuous data collection and it is important for the ethnographer to describe in detail both the research data and how data collection took place. For example, in studying a Japanese acquisition of an American manufacturing plant, Brannen and Peterson (2009) provide a rich description of the plant before and after the acquisition, as well as the nature of their participant observation activities and other data collection techniques that were used.

A third element of research design that enhances the trustworthiness of a manuscript is data triangulation. It is common for authors to state that they have supplemented interviews with archival data about the entities they study, but positioning such data as supplemental detracts from their credibility. If the data are not relevant to the analysis and the findings, it is preferable to leave them out of the discussion. It is rare for authors to show how they incorporated diverse types of data in their analysis. If the data are relevant, it is important to justify both how they were collected and how they were used. For example, in their study of MNEs’ storytelling, Haley and Boje (2014) describe their diverse data sources – including onsite observation, interviews, videos, TV commercials, and transcripts of legal disputes – and weave all of these into their discussion of the study’s findings. Likewise, in their study of Englishization in the provision of cross-border services, Boussebaa, Sinha and Gabriel (2014) carefully detail and justify collecting interview data from different types of employees, as well as data from internal documents, company intranet pages and onsite observation. In discussing their findings, they are able to deepen their interpretation of interview data by portraying it in conjunction with the company’s human resources policies and with the physical work set-up that they observed.

Trustworthiness in Empirical Analysis

The empirical analysis of qualitative data, and thus the confidence of the scholarly IB community in the interpretation of the data presented, can be enhanced in three ways: navigating multilingual and multicultural boundaries, establishing clarity in the analysis, and reporting both evidence and theory and the links between the two.

First, multilingual and multicultural boundaries are particularly prevalent in the field of IB because much of the scholarly inquiry crosses national, cultural or linguistic lines. It is important for researchers to show how they navigate such boundaries effectively, because accurate data interpretation is so important in establishing the credibility of qualitative research findings. This navigation involves accurate translation of documents and interview transcripts. However, most qualitative IB researchers do not discuss their translation decisions in their manuscripts, even though there are substantial theoretical differences among approaches to translation (Chidlow, Plakoyiannaki, & Welch, 2014). It also involves an intimate knowledge of the cultural milieus being examined, both to be sufficiently accepted to be able to collect meaningful data and to be sufficiently acclimatized to be able to interpret that data. This is often achieved by ensuring that someone on the research team has the required language skills and cultural familiarity.

Second, with respect to providing a clear analysis, authors can be overwhelmed by the quantity of data to be analyzed and by the lack of prescriptions for how the analysis should be conducted, and for this reason they need to pay particular attention to how to analyze data in the most effective way. In contrast to quantitative studies, in qualitative studies there are no standard formats for the methods and findings sections (e.g., Bansal & Corley, 2012; Pratt, 2008). However, this does not mean that any approach for analyzing data is valid. Indeed, qualitative researchers are recognizing that there are templates for distinct styles of qualitative research (e.g., Gioia, Corley, & Hamilton, 2012; Langley & Abdallah, 2011). Regardless of the type of analysis used, it is important that the reader understand in detail what was done and why. Too often, manuscripts go from a description of the sample to a description of the findings and provide little detail on how data were analyzed. One way to show how data analysis was conducted is to show examples of work products, such as the coding schemes developed. This not only helps increase confidence in the analysis, but can also help other researchers improve their own research designs.

While data analysis in qualitative studies tends to be focused on identifying dominant patterns in the data, it is also important to recognize that there may be “negative cases” (Corbin & Strauss, 2008: 84); i.e., cases that do not fit the dominant pattern. These are important to acknowledge and explain. Rather than detracting from a study’s credibility, they can signal analytic rigor because rarely are dominant patterns universal. Moreover, negative cases can provide an opportunity to deepen the theoretical claims that are being made by taking exceptions into account.

Third, deciding how to report the findings of a qualitative study can be challenging, because there are no standardized tables that are expected, and because qualitative data do not always lend themselves to being summarized. One of the key issues that an author faces is deciding what to show and what to tell (Pratt, 2009). Focusing on showing the data (the evidence for theoretical claims) can make the paper seem overly descriptive, while focusing on telling about the data (the theoretical interpretations) can make the theory seem unsubstantiated. Successful qualitative researchers address this difficulty by coming up with creative ways to display their data (Bansal & Corley, 2012). It is important for the reports of the findings to transcend description and indicate clearly the new theory that was generated from the investigation.

Towards More Trustworthy Qualitative Manuscripts

In Part A of this editorial we have provided suggestions for how IB scholars can enhance readers’ confidence in research findings that are based on qualitative data. Scholarly insights are more trustworthy when they take into account extraneous factors that may have affected research results. As discussed in Part B, large sample quantitative studies rule out alternative explanations by controlling for them. In qualitative research, however, the likelihood and magnitude of alternative explanations cannot be measured. Instead, as we have explained, there are multiple and integrated mechanisms to strengthen a reader’s belief that the explanations presented in a qualitative research study are accurate and valid. These mechanisms include ensuring that the boundaries of the theoretical claims are delineated, the research site is appropriate, the data are rich and robust and there is transparency in data analysis and the interpretation of the findings. Moreover, it is important that there be coherence and consistency across these mechanisms so that the thread from theoretical purpose to method to findings to theoretical contribution is clearly visible and easy to follow. We hope that these suggestions are useful for producing more sophisticated and trustworthy qualitative studies.

Part B: Using Controls in International Business Research

Trusting the findings from empirical analyses has a longer tradition and there are already several JIBS editorials that have analyzed ways to handle the analysis of large samples (e.g., Andersson et al., 2014; Cortina et al., 2015; Peterson et al., 2012; Reeb et al., 2012). To complement and extend these ideas, in this editorial we analyze how to use controls in IB. Controls are particularly important in quantitative IB research, which is characterized by analyzing complex phenomena, often spanning multiple disciplines, theories and levels of analysis. The study of cross-border phenomena not only adds an additional layer of country-level influences to the relationships, but can also modify how such relationships operate as new mechanisms emerge that alter existing arguments (Andersson et al., 2014; Cortina et al., 2015). This complexity is the source of new insights on the behavior of economic actors that extend not only IB theory but also theories developed with a single country in mind. However, despite its importance, this complexity needs to be controlled for to avoid confusion and ambiguity.

Controls are commonly used in large sample empirical studies to address spuriousness and hence enhance confidence in results. In these studies, the standard solution is to focus on a few focal influences and include controls for other characteristics that may have an additional impact on the dependent variable, but that are not the focus of interest of the particular study. However, in some cases these controls are included without due justification, often seemingly as a mechanical way of addressing potential reviewers’ concerns rather than as a concerted effort to account for alternative influences that may pollute the proposed relationships. Yet the inclusion of controls does not by itself address the inherent complexity in IB research. In fact, the inclusion of the wrong controls, or the exclusion of relevant controls, may seriously affect empirical results and cast doubt on the validity of a study.

In this editorial, we argue that including the appropriate controls is essential for the validity of a study and that researchers in general, and IB researchers in particular, need to pay more attention to the nature and role of controls when conducting their studies to increase the trustworthiness of the ideas and findings presented. Here we go beyond previous discussions of controls that have focused on their use in large sample studies (e.g., Becker, 2005; Breaugh, 2008; Spector & Brannick, 2011; Moody & Marvell, 2010) and propose that future research can improve by taking into account controls in three areas: theory, research design and empirical analysis. Table 2 summarizes the recommendations we discuss in this editorial. First, we explain how to use controls to theoretically establish the boundaries of arguments and dismiss alternative and competing explanations of the proposed relationships. Second, we explain how to design studies to include a control group in the sample to facilitate the comparison to the group of interest in order to identify whether the arguments are general or apply only to certain groups. Third, we explain how to use appropriate statistical techniques which account for alternative influences on the dependent variable by including relevant control variables.

Table 2 Recommendations for using controls in quantitative research

Trustworthiness Through Controls in Theoretical Development

Despite our quest for generalization, theoretical arguments rarely have universal applicability. Typically, a theory is developed with a particular, often rather narrow, set of assumptions regarding its boundaries and potential applicability. Though often not stated explicitly, such boundary conditions regarding the use of a theory may result in its applicability being limited to particular contexts, for instance countries with democratic political systems and efficient market mechanisms, or individuals with a minimum level of education or income. When the context changes, as is often the case in IB research, the underlying theory or some of its arguments may need modification. Indeed, such modifications may constitute the very essence of the contribution that an IB study provides to the literature. Even in cases in which the contribution is the modification of assumptions, the theoretical development may need two sets of controls: (1) theoretical boundaries that establish the limits of the applicability of the arguments and (2) clarifications that account for the existence of alternative explanations of the arguments.

Establishing Theoretical Boundaries

Articles need a clear statement of the theoretical boundaries. Although the search for a generalizable argument is the objective that researchers aim to achieve, in reality most research has limited applicability, either because the researchers have not explained assumptions (Bello & Kostova, 2012; Thomas et al., 2011), or because relationships depend on particular environmental conditions (Cuervo-Cazurra, 2012). Thus a clear and explicit statement of the conditions under which the proposed relationships hold is needed as a first theoretical control.

A statement of theoretical boundaries is not the same as saying that the specified relationships only hold in the context in which they are later tested, but rather that the proposed arguments assume the existence of particular conditions. There is nothing wrong with having a study or arguments that assume certain conditions or specific contexts (Barkema, Chen, George, Luo, & Tsui, 2015). Such studies may provide important steps to our understanding of how a theory can be extended to explain situations that have not been considered in the initial development of the theory, but the conditions need to be made explicit.

To specify such boundary conditions we recommend the following. First, think about your unstated assumptions and the complementary (or substituting) factors or characteristics at various levels (e.g., individual, team, firm, country). Second, once you have identified these characteristics, discuss how the arguments proposed apply to certain types of individuals, companies or countries. You can do this with an initial paragraph before the theoretical arguments in which you acknowledge such boundary conditions with statements such as ‘the theoretical boundaries of the arguments are the following. First, in the current paper we assume individuals or companies of [insert particular type, characteristics, etc.] and thus the following arguments may need modification when analyzing individuals or companies of a different type or characteristics.’

In conceptual work you may include boundary conditions in the development of specific propositions, much in the same way as one would do in empirical work. For instance, one may specify that ‘we expect Y to be positively influenced by X in emerging economies, whereas the relationship is reversed in advanced economies.’ Note how explaining the boundary conditions of the theoretical arguments rather than simply stating ‘other things being equal’ provides a more precise application of the theory. At the same time, it provides future researchers with useful guidance for how to design an empirical study to test the relationship, and it even specifies variables that should be controlled for. Finally, in both empirical and conceptual work, you must return to the issue of the boundary conditions of theory when discussing results and implications as this provides the basis upon which the contributions should be judged.

Theoretically Controlling for Alternative Explanations

Once you have established the theoretical boundaries, a second level of theoretical controls involves theoretically accounting for alternative explanations of the proposed phenomenon. In many cases the propositions or hypotheses establish a relationship between independent variables and the dependent variable. However, these relationships can be explained with many alternative theories and theoretical arguments. Thus the burden falls on the researcher not only to explain the proposed relationship(s), but also to rule out alternative accounts for such relationship(s). To do this, you need first to identify alternative theories that may explain the proposed relationships, and then discuss how the mechanisms proposed by such alternative theories differ from the ones proposed by your preferred theory. After this, the next step is to argue and explain how the predictions driven by the theory proposed by you are better than the predictions driven by the alternative theory; especially if this provides a simpler explanation with fewer assumptions (i.e., Occam’s razor, Duignan, 2015) and one that can be falsified with data (Popper, 2002).

IB research may require alternative explanations because of differences in context or relationships across time and space (Dunning, 1998). First, under the conditions established in the theoretical boundaries, the initial explanation may no longer hold and thus we need a more sophisticated explanation. It is your responsibility to provide proof that the new mechanisms are better than the old ones. You may explain how the previous arguments are theoretically constrained to particular situations, and a new explanation is needed for the new situation. Second, new influences and relationships may emerge, which previous theoretical explanations had not taken into account. In this case, you can explain how the previous mechanisms are too simplistic and extend theory to account for new conditions and assumptions.

Trustworthiness Through Controls in Research Design

Unlike in the natural sciences, in management studies, with the exception of some psychology-based analyses, random samples are rare and there are limited opportunities for conducting experiments in which some firms are assigned to receive a treatment and others to be a control group (Banerjee & Duflo, 2009; Cook & Campbell, 1979). In order to encourage more experimental design in IB research, Zellmer-Bruhn, Caligiuri and Thomas (2016) outline these opportunities and explain the value and limitations of experiments in the IB context.

Despite these possibilities, however, most management studies use convenience samples that have data or surveys on companies or individuals that are easily accessible or effortlessly identified. If, in addition to not having a random sample, the researcher restricts the sample to firms or individuals that have a characteristic of interest, the researcher is in many cases bound to find the expected relationships. Without a control group the author cannot know whether this behavior is exclusive to the group under analysis, or whether it is generalizable to other firms or individuals that were not included in the sample. For example, authors may argue that emerging-market multinationals (EMNCs) are internationalizing quickly nowadays. If such arguments are tested on a sample that only includes EMNCs, researchers may find that this is indeed the case. However, if advanced economy multinationals are included as well, researchers may find that these firms are also internationalizing quickly thanks to, for example, advances in information and transportation technologies and the reduction of constraints on trade and investment. Hence the argument applies to all multinationals and not just EMNCs.

Including a Control Group

We recommend including a control group in the research design against which the relation of interest can be contrasted and compared. This helps understand whether: (1) the arguments presented apply to all individuals or companies in general or only to individuals or companies of a particular nature, and (2) the arguments presented apply to all individuals or companies in general but individuals or companies of a particular nature exhibit some additional different behaviors.

Including a control group also requires the modification of the arguments and hypotheses in the theoretical development, so that such arguments and hypotheses are presented in comparison to the control group and not just as general arguments. One interesting way of doing such comparison can be not only to include individuals or firms that do not have the required novel characteristic, but also, if data are available, to do a matched sample in order to identify how the characteristics of interest indeed drive the proposed relationships (Estrin, Meyer, Nielsen, & Nielsen, 2016; Reeb et al., 2012). Naturally, including a control group requires more work collecting data. This does not, however, excuse researchers from doing so. If the research question warrants the introduction of a control group in order to analyze a particular phenomenon, reviewers and editors must insist on such steps being taken. You will do well to consider this issue early on when designing your research in order to avoid rejection due to design issues. Inadequate attention to theoretical and empirical boundary conditions is often grounds for rejection in JIBS and excuses due to data limitations are not valid if data can be obtained.
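As a rough illustration of what a matched sample can look like in practice, the sketch below pairs each firm that has the characteristic of interest with the most similar firm that lacks it. The firms, the matching variables (industry and log assets) and the greedy nearest-neighbor rule are all hypothetical assumptions chosen for brevity; other matching approaches may be more appropriate for a given study.

```python
# Hypothetical firms: (name, has_characteristic, industry, log_assets)
firms = [
    ("A", True, "auto", 9.1),
    ("B", True, "tech", 7.4),
    ("C", False, "auto", 9.3),
    ("D", False, "auto", 6.0),
    ("E", False, "tech", 7.5),
    ("F", False, "tech", 10.2),
]

def match_controls(firms):
    """Pair each focal firm with the closest-sized unused firm in the same industry."""
    focal = [f for f in firms if f[1]]
    pool = [f for f in firms if not f[1]]
    pairs = []
    for name, _, industry, size in focal:
        candidates = [c for c in pool if c[2] == industry]
        if not candidates:
            continue  # no comparable control available; focal firm stays unmatched
        best = min(candidates, key=lambda c: abs(c[3] - size))
        pool.remove(best)  # match without replacement
        pairs.append((name, best[0]))
    return pairs

pairs = match_controls(firms)
print(pairs)
```

Matching on industry and size here is only an assumption for the example; the variables used for matching should follow from the theoretical boundaries of the arguments.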

Using Natural Experiments

In some instances researchers can take advantage of natural experiments to identify a control group. Natural experiments are examples of designs that are able to isolate (control) the effects of the focal (treatment) variable by eliminating the effects of extraneous factors. For example, Kogut and Zander (2000) analyzed the ability to innovate of the German optics firm Carl Zeiss in different political environments in East and West Germany as a result of the division into two firms post-World War II. The rather dramatic division of Germany after the War provided a fertile ground for a natural experiment which utilized a matched-pair design of two entities that had hitherto been part of the same organization, thus avoiding some of the problems of conjectural causality (or multiple causes) inherent in comparative work (Ragin, 2014).

Trustworthiness Through Controls in Empirical Analyses

A typical way of controlling for alternative explanations in large sample analyses is to include in the empirical model other variables that may influence the dependent variable but that are not the focus of discussion in the theoretical development. Unfortunately, some studies do not even include controls and merely use an analysis of differences in means between groups to test hypotheses; such analysis cannot be used to test theoretical arguments, because there may be many other alternative factors that influence behavior beyond belonging to one group or another.
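To illustrate why a simple difference in means can mislead, the following minimal sketch uses simulated data in which group membership has no effect on the outcome, yet the groups differ in firm size, which does drive the outcome. All variable names, effect sizes and the random seed are hypothetical assumptions, not results from any study.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Hypothetical data: 'group' (1 = firm type of interest) has NO true effect on y.
group = rng.integers(0, 2, size=n)
# Firms in group 1 are larger on average; size is the alternative influence.
size = rng.normal(loc=1.0 * group, scale=1.0, size=n)
y = 2.0 * size + rng.normal(scale=1.0, size=n)

# Naive test: raw difference in group means.
mean_diff = y[group == 1].mean() - y[group == 0].mean()

# Regression of y on group while controlling for size (OLS via least squares).
X = np.column_stack([np.ones(n), group, size])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
group_effect_controlled = beta[1]

print(f"difference in means:        {mean_diff:.2f}")
print(f"group effect, size control: {group_effect_controlled:.2f}")
```

In this constructed example the raw difference in means is large, while the estimated group effect is close to zero once size is controlled for, which is the pattern the paragraph above warns about.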

Even in cases when researchers include controls in empirical analyses, their inclusion often does not seem to be adequately justified or guided by theory. First, it appears that specific controls are sometimes included merely because previous papers have used them. In such cases one usually finds citations to previous work without an explanation of the reasons why such controls need to be included. Second, on other occasions controls are included because they exert influence on some of the independent variables of interest. The inclusion of such controls raises two issues. One is the creation of multicollinearity that results in the independent variables of interest becoming statistically significant merely because some of the controls are included in the analysis. Another is a misunderstanding of the need to include controls; while controls need to be included as alternative explanations of the dependent variable, they should not serve as competitive explanations of other independent variables.

Third, there is a difference between theoretically irrelevant and not statistically significant controls. In the former case, if theory does not call for the inclusion of a variable in order to control for alternative influences on the dependent variable, it should not be part of the statistical model. If the variable happens to be statistically significant, it presents itself as an opportunity to develop theory (best case) or it represents a collinear relation with another independent variable (worst case). In the latter case, if theory calls for the inclusion of a particular control variable, it should be part of the statistical model irrespective of its statistical significance. In practice, selection of appropriate control variables may be difficult but should be guided by whether they satisfy the criteria for spuriousness based on theory, prior empirical studies, and common knowledge about the phenomenon under investigation. It is better to err on the side of caution by including all the theoretically relevant controls, even if many of them are not statistically significant, though such practice may result in unstable results due to overfitting of the model.

Based on our editorial experiences and to facilitate a better use of controls in large empirical analyses, we summarize the following observations on the common mistakes made in the use of controls in large sample empirical studies in IB, and provide some suggestions for solving them. We group them in four themes: inclusion, exclusion, measurement and reporting.

Inclusion of Controls (1): Justified Controls

A common mistake is that there is often little or no theoretical justification for inclusion of specific controls, apart from inserting references to previous studies that have used the control. However, in many cases these references have little to do with the current dependent variable and may be contextually irrelevant.

Our recommendation is to include a theoretical justification. Avoid mimicry of other studies and instead provide sound theoretical reasoning for each and every control included. This should include a brief discussion of why a particular variable is a biasing (control) rather than a substantive (independent) variable in a particular model.

Inclusion of Controls (2): Relevant Controls

Impotent control variables are often included, for example ones that are uncorrelated with the dependent variable, without justification for inclusion. Unless a control can be legitimately justified as a suppressor, it should be excluded as it will reduce power in the analysis. Alternatively, controls are sometimes included to improve the statistical significance of key relationships or to increase the model fit by reducing error terms. This includes instances where certain controls are included in some analyses but not in others, or where the nature or even the measurement of controls varies within the same study.

We recommend that you make sure to include the ‘correct’ controls. This should be driven by theory and not by previous research (which may be flawed or contextually different) or what works statistically. Also, avoid including too many controls in the pursuit of ‘methodological trickery’ – more is not necessarily better and each and every control must be theoretically and logically justified. Finally, select controls that explain the dependent variable, not those associated with independent variables.

Exclusion of Controls: Excluding Dimensions of Controls

Some studies conveniently exclude related dimensions of the independent construct, which may artificially inflate the significance of the dimension of the construct included. For example, when analyzing the impact of culture on finance, many studies select individualism/collectivism as the key cultural variable and exclude the other dimensions of culture in the controls; if the other dimensions are included, the significance of individualism/collectivism may likely be affected, maybe to the point in which it loses its statistical significance.

Our recommendation is that instead of excluding certain dimensions of a construct, either include all dimensions or explain the logic for excluding the dimensions and potential biases. Also, if controls, their measurement or their treatment vary within the same study, this needs to be clearly explained and justified.

Measurement of Controls (1): Specify Controls

In some cases there is little information on the specific measurement of controls, using some vague indications rather than providing precise explanation of how the measure was created (e.g., discussing GDP per capita without specifying where data came from and how it was measured: using GDP in current dollar terms, in international dollar terms, in PPP terms, dividing GDP by the estimated or census population, etc.).

We suggest that you provide information and clearly discuss in the method section how controls were measured and why a particular measure is adequate for the context of the study. This may include a discussion of the validity and reliability of controls, as well as an explanation of choices of controls. Without specific knowledge about which controls were included, how they were measured and where they come from, replication is impossible.

Measurement of Controls (2): Describe Controls

Often some controls are not included in the correlation matrix. Their exclusion may be a sign of sloppy work or a sign of trying to conceal potential multicollinearity problems between controls and other independent variables. Also, in some cases the effect size is not provided for all controls in tables, which results in missing information.

We would recommend you report descriptive statistics for all controls including means, standard deviations, range, and so on, and provide evidence of reliability and validity where appropriate.
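A minimal sketch of such reporting, assuming NumPy is available and using simulated data with hypothetical variable names, computes descriptive statistics and the full correlation matrix for every variable, controls included:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200

# Hypothetical variables: one independent variable and two controls.
data = {
    "x_focal": rng.normal(size=n),
    "ctrl_size": rng.normal(loc=8.0, scale=1.5, size=n),
    "ctrl_age": rng.normal(loc=20.0, scale=5.0, size=n),
}
names = list(data)
matrix = np.column_stack([data[k] for k in names])

# Descriptive statistics for every variable, controls included.
for j, name in enumerate(names):
    col = matrix[:, j]
    print(f"{name:10s} mean={col.mean():6.2f} sd={col.std(ddof=1):5.2f} "
          f"min={col.min():6.2f} max={col.max():6.2f}")

# Full correlation matrix (rows and columns follow the order in `names`).
corr = np.corrcoef(matrix, rowvar=False)
print(np.round(corr, 2))
```

Keeping the controls in the same table as the independent variables makes high correlations between them visible to readers rather than hidden.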

Reporting of Controls (1): Impact of Controls

There is often no discussion of the impact of controls in the results or discussion sections. Far too often we are left to speculate what significant controls may mean and almost never is the relationship between controls and the dependent variable explicitly discussed.

Our recommendation is to discuss in the results section how controls influence dependent variable(s) and key relationships in your model and offer insights for future researchers on what to control for in studies of a particular phenomenon.

Reporting of Controls (2): Importance of Controls

In many cases, controls account for more explanatory power than the main effects, but this is almost never discussed. This raises the question of whether the statistical significance of the variables of interest has any economic significance, which again is rarely computed and discussed.

We recommend that you explain the impact of the controls in comparison to the impact of the independent variables, and compute and discuss the economic significance of the variables on the dependent variable.
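One common convention for gauging economic significance, used here as an assumption rather than as the only valid approach, is to express the change in the dependent variable associated with a one-standard-deviation change in a predictor, both in raw units and relative to the mean of the dependent variable. The coefficients, standard deviations and mean below are hypothetical.

```python
def economic_significance(coef, sd_x, mean_y):
    """Change in y for a one-SD change in x, in raw units and as a share of mean y."""
    delta_y = coef * sd_x
    return delta_y, delta_y / mean_y

# Hypothetical estimates: both are assumed to be statistically significant.
mean_roa = 5.0  # mean of the dependent variable (e.g., ROA in percent)
focal = economic_significance(coef=0.02, sd_x=1.5, mean_y=mean_roa)
control = economic_significance(coef=0.40, sd_x=2.0, mean_y=mean_roa)

print(f"focal variable: {focal[0]:.2f} points ({focal[1]:.1%} of mean)")
print(f"control:        {control[0]:.2f} points ({control[1]:.1%} of mean)")
```

In this made-up case the focal variable moves the outcome by well under one percent of its mean, whereas the control moves it by about sixteen percent, a contrast that would deserve explicit discussion.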

Reporting of Controls (3): Comparisons

Another common mistake is that the baseline model with only the impact of the controls on the dependent variable is excluded from the table of results, or the full model is not run both with and without controls. Thus we cannot fully assess how much explanatory power the inclusion of all the relevant independent variables provides beyond the controls.

Our recommendation is that you run a model that only includes controls before adding explanatory variables to models and report significance levels and betas. Also run full models with and without controls to rule out controls as potential explanation for results. Explain what it may mean if results differ markedly when controls are included and when they are not; this may help future researchers rule out potential biasing effects. Discuss the results in relation to the specific controls included using language like ‘controlling for A, B and C, the relationship between X and Y was…’ and make sure to relate this to prior studies in the literature – this may include references to other studies of the same phenomenon in which certain controls were found to have similar, opposite, or no effects.
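The sequence of models described above can be sketched as follows. This is a minimal illustration on simulated data, assuming NumPy is available; the helper `ols_r2`, the variable names and the effect sizes are hypothetical and stand in for whatever estimation routine a study actually uses.

```python
import numpy as np

def ols_r2(X, y):
    """Fit OLS with an intercept; return coefficients and R-squared."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    r2 = 1.0 - resid.var() / y.var()
    return beta, r2

rng = np.random.default_rng(2)
n = 1000
controls = rng.normal(size=(n, 3))             # hypothetical controls A, B, C
x = 0.5 * controls[:, 0] + rng.normal(size=n)  # focal variable, correlated with A
y = 0.3 * x + controls @ np.array([1.0, 0.5, 0.0]) + rng.normal(size=n)

# Model 1: controls only (the baseline model).
_, r2_controls = ols_r2(controls, y)
# Model 2: controls plus the focal variable (the full model).
b_full, r2_full = ols_r2(np.column_stack([controls, x]), y)
# Model 3: focal variable without controls.
b_nox, r2_x_only = ols_r2(x.reshape(-1, 1), y)

print(f"R2 controls only: {r2_controls:.3f}")
print(f"R2 full model:    {r2_full:.3f} (increment {r2_full - r2_controls:.3f})")
print(f"beta on x with controls: {b_full[-1]:.2f}; without: {b_nox[1]:.2f}")
```

Because the focal variable is constructed to correlate with control A, its coefficient is inflated when the controls are left out, which is the kind of marked difference between specifications that should be reported and explained.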

Towards More Trustworthy Quantitative Manuscripts

In Part B of this editorial we have provided suggestions on how to control for alternative influences in the complex phenomena analyzed in IB research to increase the trustworthiness of the ideas and findings presented in the research. We argued that studies need to include controls at the level of theory, research design and analysis to account for alternative explanations and influences to understand this complexity, in addition to providing more sophisticated theoretical development and the explanation of the mechanisms (e.g., Bello & Kostova, 2012; Thomas et al., 2011). The need to control for alternative explanations applies to (1) theoretical development, by explaining the boundaries of the analyses on the applicability of the theory; (2) research design, by including a control group against which to compare the characteristic of interest; and (3) empirical analysis, by including relevant control variables that account for alternative influences on the dependent variable.

The overall intention of Part B is to make researchers aware of what actually controlling for alternative explanations entails, which goes beyond what has in many cases become an automatic or mechanistic process of adding a few variables to the statistical analysis. We do so by providing specific recommendations for selection and treatment of control variables in IB research. Hopefully these recommendations will result in better and more trustworthy quantitative studies in the future.

Looking Ahead: Mixed Methods and Ambidextrous IB Scholars

Both qualitative and quantitative research can improve their trustworthiness by paying attention to the theoretical development, research design and data analysis, to ensure that the insights gained from the analyses are not subject to alternative, unaccounted influences. Although we have divided the discussion in this editorial into two parts to provide depth to the suggestions, both qualitative and quantitative data are complementary in developing IB as a field of scholarly inquiry. Mixed method approaches, in which researchers undertake both qualitative and quantitative studies to answer their research question, are worthy of greater scholarly attention than they have hitherto attracted. Using qualitative and quantitative methods in tandem can increase the trustworthiness of a study by compensating for the weaknesses inherent in any one method alone, and can yield a richer answer to a research question (Brannen & Peterson, 2009; Kaplan, 2015; Small, 2011). This points to the benefit of scholars investing in becoming more ambidextrous with respect to their methods-related skills and/or in establishing ambidextrous research teams; and we hope that there are increasing numbers of these. However, we need to heed the caution that a mixed methods scholar “risks being jack-of-all-trades and master of none” (Kaplan, 2015: 431). Trustworthiness through cohesiveness, depth and rigor still needs to be incorporated into the design and analysis of all datasets in order for a mixed method study to be substantive and persuasive. We hope that this editorial provides a useful framework for sparking interest in gaining expertise in a different tradition and creating more trustworthy studies that provide deeper insights.