INTRODUCTION

Developed over half a century ago, the event study is a research method that continues to gain popularity and acceptance as an important tool in the field of finance. In recent years, event studies have evaluated the impact of corporate initiatives (e.g., mergers and acquisitions, equity and debt issuance, dividends and repurchases, corporate restructuring), regulatory changes (e.g., board reform, compensation, changes in taxation, workplace safety), and macroeconomic shocks (e.g., the COVID-19 pandemic, Brexit, the Paris Agreement) on stock prices for a single country (usually the U.S.). However, only limited attention has been paid to cross-country event studies, i.e., those that simultaneously analyze and compare price reactions in multiple countries. For example, of the 699 event studies published in premier international business and finance journals, including the Journal of International Business Studies (JIBS), the Journal of Finance (JF), the Journal of Financial Economics (JFE), and the Review of Financial Studies (RFS), only about 12% (81) are cross-country event studies. Apart from the difficulty of obtaining adequate cross-country data, the main obstacles to this line of research are the identification of an appropriate international benchmark for expected returns and the lack of clear guidance on how best to leverage cross-country results to assess the role of various national institutions (e.g., cultural, economic, legal, political) in affecting event outcomes. Nevertheless, there is much to be learned from cross-country event studies, particularly about how institutions and environments shape events deemed important.

Event studies measure abnormal changes in stock prices (or returns) that occur in conjunction with an event, such as the announcement of a corporate decision (e.g., an increase in dividends). They are based on two main principles: (1) the semi-strong version of the efficient market hypothesis, the idea that prices fully reflect all publicly available information (Fama, 1970), and (2) the notion that the price of an asset is equal to the present value of all of its future free cash flows. Thus, the price impacts of an event can reveal the effects of that event on future cash flows. To test market efficiency, researchers examine how asset prices react to an event relevant to particular firms, industries, or markets (see, e.g., Campbell, Lo, & MacKinlay, 1997). The choice of event study topics is vast, given the breadth and depth of the fields in which they are prevalent, but the key requirement is that the outcome further deepen our understanding of how firms are valued.

In this paper, we survey prior event studies in international finance, review the approaches taken, highlight the many challenges facing researchers who conduct cross-country comparative event studies, and offer guidance on how to address them in practice. We also extend event studies to an important but less studied asset class in an international setting – the fixed-income market. We place particular emphasis on how researchers define the event under investigation, choose the study period (i.e., short or long term), compute abnormal returns, and infer statistically from the measured abnormal returns whether the event under investigation produced a reliable price reaction.

Our survey of prior event studies published in JIBS and the top three finance journals – JF, JFE, and RFS – reveals that single-country studies dominate cross-country studies, with mergers and acquisitions (M&A) being the most studied event. Against this backdrop, our comprehensive analysis of event-study methodologies and ensuing recommendations provide a number of avenues for future research in international finance. First, scholars should examine understudied corporate events and the effects of global events in an international setting. Second, we urge researchers to conduct long-horizon event studies. Some events may span a relatively long period (e.g., the COVID-19 pandemic), or may not be fully or correctly evaluated by the stock market over a short-term window due to cultural or market microstructure considerations1. Third, we recommend examining how differences in formal institutions (e.g., legal rules and their enforcement) and informal institutions (e.g., culture, social norms) affect cross-country differences in stock market reactions to corporate and global events. Obviously, the choice of country-level conditioning factors will depend on the nature of the event (e.g., legal factors are more relevant in governance-related events).

Our paper is related to, but distinct from, the recent work of Eden, Miller, Khan, Weiner, and Li (2022) in two important ways. First, while Eden et al. (2022) examine the event-study methodology in international business, we focus on international finance, where event studies are lacking, perhaps due to the complexities inherent in cross-country research. Second, we extend our implementation to the international bond market, which is larger in size than the equity market. In addition, we highlight the more recently available and higher-quality databases that can be used for future research in international finance as well as topics worthy of investigation.

Our paper contributes to the growing literature on identification and estimation techniques in international business research (Eden & Nielsen, 2020; Nielsen, Eden, & Verbeke, 2020; Nielsen et al., 2020; Reeb, Sakakibara, & Mahmood, 2012). Although researchers often use event studies to examine a market reaction to an event in a single country, we identify the current state of knowledge in the finance literature and areas for future research. We emphasize key methodological concerns and offer thorough guidance on the best practices applicable in cross-country and cross-asset class settings.

Because the event-study method moves beyond a simple association driven by underlying firm characteristics to the direct market reaction to unanticipated events, it is a powerful tool for determining causality in cross-country studies. Our comprehensive approach to conducting event studies and addressing the various methodological issues in an international setting improves internal and external validity, thus reducing the need for further investigations. Our analysis and recommendations can also be extended to other areas of international business, such as accounting, marketing, management, and law. Furthermore, by extending the methodology to international bond markets, we contribute to the literature on the efficiency of information flow between financial markets (Kwan, 1996).

The remainder of the paper proceeds as follows. The next section surveys event studies published in top-tier finance and international business (IB) journals. The subsequent section outlines the steps involved in an event study, discusses the specific challenges associated with using the method in international finance, and advises how to overcome them. We then discuss the event-study method in the fixed-income market. We conclude with a summary and further recommendations.

WHICH EVENTS?

The starting point of any event study in international finance is the choice of which phenomenon to study, whether intrinsic to a particular company, to the industry, or market as a whole. The choice of topics is vast, but the key is to deepen understanding of how firms are valued.

We first explore which finance areas have already been studied, and which topics need more attention in cross-country analysis. Table 1 lists the number of studies conducted by topic in the four major IB and finance journals – JIBS, JF, JFE, and RFS.2 In total, we find 699 event-study papers, of which 545 investigated firm-level events (78%), 38 dealt with peer-level events (5.4%), and the remaining 116 focused on country-level events (16.6%). Of the four journals, JFE emerges as the front runner of event studies, with 319 publications (45.6% of the total), followed by JF (31.6%), RFS (17.9%), and JIBS (4.9%).

Table 1 Number of single- and cross-country event studies published in the four major finance and IB journals, by topic

Surprisingly, we find a wide breadth of topics covered by event studies in finance. Indeed, for firm-level event studies alone, we identify 47 distinct topics with at least two articles and 17 more single-article topics aggregated in the “others” category. For peer- and country-level studies, we identify 8 and 12 topics, respectively. The topics also differ in prevalence. For firm-level event studies, M&A is the most studied topic, with as many as 117 articles. The other important firm-level topics are restructuring (39), equity issuance (37), dividends (23), analyst forecasts and recommendations (20), and earnings (20). For peer-level event studies, the most common topic is distress in the bank–borrower relationship (four studies), followed by bankruptcy, M&A, and security issuance (three each). For country-level studies, the most researched topics are governance reform/legislative change (32 studies), elections/political risk events (17), monetary policy (12), market trading mechanism changes (11), and government intervention (7).

However, what is striking about Table 1 is that relatively few of these studies (81 of 699, or about 11.6%) are cross-country event studies (see Figures 1 and 2). Of note, JIBS accounts for the lion’s share, with 33 of the 81 studies. Few of the event studies published in JF (6.3%), JFE (4.7%), or RFS (15.2%) cover more than one country, whereas the large majority of those published in JIBS (97.1%) are cross-country studies. Also, we identify 117 M&A studies, but just 19 (16.2%) examine this topic in a cross-country context. We find no cross-country studies covering dividends, board structure changes, or investor activism and voting. In fact, there is not a single cross-country analysis in 35 of the 47 firm-level topics with at least two publications in the major journals considered. We therefore call for future cross-country event studies on these understudied topics.

Figure 1 Distribution of single- and cross-country studies, by event type.

Figure 2 Distribution of single- and cross-country event studies for the top topics.

CONDUCTING AN INTERNATIONAL EVENT STUDY

This section describes the steps necessary to conduct a cross-country event study. We begin with data selection, followed by a discussion of event and estimation windows. We then discuss the modeling of normal returns, the calculation of abnormal returns, inferences about the nature of an event in the short and long run, and the role country-level institutions play in determining an event’s outcome.

Data Gathering

Once the event to study is chosen, the next challenge is obtaining the appropriate data. Two types of data are generally required. The first pertains to the announcement of the event itself. Data gathering is much easier for national, industry-wide, or global events, because these events affect many companies at once and announcement details are widely available. For firm-specific events, researchers can consider local news sources or established providers such as LexisNexis’s Nexis Uni, Dow Jones’s Factiva, Bloomberg News, Capital IQ, FactSet, Refinitiv SDC Platinum, Thomson Reuters, the New York Times, and the Wall Street Journal. Tapping multiple announcement sources is important for obtaining the largest possible sample and the most precise information about dates. This precision significantly affects the quality of measured abnormal returns.3

The second type of data consists of returns, prices, and other performance metrics that researchers need in order to assess whether they are affected by the event. For an international event study, up-to-date, comprehensive, and accurate data are critical. There are several sources of cross-country data typically used in the literature. Bloomberg is the most prominent finance data provider, and offers the most complete data on fixed-income securities and derivatives. Bureau van Dijk’s Osiris offers a range of accounting and market data for a large cross-section of countries, but Compustat Global (Global Vantage) and Refinitiv (previously Thomson) Datastream tend to be the most widely used in finance and accounting research.

For large-firm returns and financial statements in developed markets, the best choice is generally Compustat Global. It provides excellent coverage and standardizes data across various accounting standards and practices to deliver high international comparability. Compustat Global is also considered the most accurate. However, it features only limited coverage for small firms and emerging markets, and little coverage outside equities. Datastream is the most widely used international finance database because it provides broad coverage of large and small stocks in developed and emerging markets. It also contains comprehensive coverage of non-stock instruments, such as bonds, currencies, options, and CDSs. However, Ince and Porter (2006) find numerous material errors in terms of coverage, classification, and data integrity that can be difficult to mitigate without an alternative data source or screening procedure. Therefore, when using Datastream, we recommend including a thorough screening tool as well (e.g., Griffin, Kelly, & Nardari, 2010; Ince & Porter, 2006; Lee, 2011; Tobek & Hronec, 2018).

Regardless of which database is used, certain steps are necessary to avoid mistakes. First, when aggregating data from different countries, it is important to convert the variables into a common currency, preferably the U.S. dollar. Second, researchers should examine the robustness of their findings to outliers, especially if using a relatively small cross-section of data. Third, to mitigate potential market microstructure issues, such as price rounding, assets with low prices (say, less than $5) should be excluded from the main analysis.
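As a rough illustration of the three screening steps above, consider the following Python sketch. It is a minimal example under stated assumptions rather than a definitive procedure: the function names, the dictionary keys (`price`, `fx`, `ret`), and the quantile cut-offs used for winsorizing are hypothetical choices made for the example; only the $5 price floor comes from the text.

```python
def to_usd(local_price, usd_per_local):
    """Convert a local-currency price into U.S. dollars."""
    return local_price * usd_per_local


def winsorize(values, lower=0.01, upper=0.99):
    """Clip values at empirical quantiles to limit the influence of outliers.

    The cut-offs are an assumption for illustration, not a recommendation
    taken from the text.
    """
    ordered = sorted(values)
    n = len(ordered)
    low_cut = ordered[int(lower * (n - 1))]
    high_cut = ordered[int(upper * (n - 1))]
    return [min(max(v, low_cut), high_cut) for v in values]


def screen_observations(observations, min_usd_price=5.0):
    """Keep observations whose dollar price is at least `min_usd_price`.

    Each observation is a dict with hypothetical keys:
    'price' (local currency), 'fx' (USD per unit of local currency), 'ret'.
    """
    kept = []
    for obs in observations:
        usd_price = to_usd(obs["price"], obs["fx"])
        if usd_price >= min_usd_price:
            kept.append({**obs, "usd_price": usd_price})
    return kept
```

In practice, the robustness check would compare results with and without the winsorized returns, and the low-price stocks would be dropped only from the main analysis.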

Once the data for the event firms are obtained, it is important to provide descriptive statistics on the sample characteristics, such as size, value, and industrial make-up. These statistics can be contrasted with those obtained for the population to determine whether additional measures are needed to control for sample selection bias.

Delineating the Event and Estimation Periods

MacKinlay (1997: 13) defines an event study as the measure of “the impact of a specific event on the value of a firm.” To measure this impact, the starting point is to determine the window over which the stock price response is calculated, often referred to as the “event period.” The efficient-market hypothesis predicts that the price impact will be immediate and correct, making the event study’s announcement day the most important date. The event period typically goes beyond the announcement day to include the surrounding period.

Once the event window is known, the next step is to determine the estimation period – a different time window used to measure all the parameters needed to estimate normal returns (the returns one would expect to have had if the event had not occurred). Differentiation between the estimation and event windows is critical for measuring the normal returns correctly.

The estimation period is sometimes referred to as the “pre-event period,” because researchers often use a window before the event to estimate the parameters. However, there are not always sufficient data before the event window (Mikkelson & Partch, 1986), or there may be reasons to believe the parameters have changed between the pre-event and event windows (Masulis, 1980). In these cases, a researcher can use data from the post-event period, provided it is far enough from the announcement to avoid potential contamination. Moreover, researchers must ensure the estimation window is free from extraordinary events such as crashes that could distort the measured normal returns.

For event studies based on daily returns, using 100 days or more as the estimation period is reasonable (Armitage, 1995). However, Corrado and Zivney (1992) find that their simulation results are largely unaffected by the number of pre-event days considered. For monthly returns, 60-month estimation periods are typical. As Shanken (1992: 2) notes: “Measurement error in beta declines as the time-series sample size, \(T\), increases.” Note, however, that extending the estimation window is likely to induce inaccuracies when risk varies over time. Researchers therefore face a trade-off between measuring the parameters precisely and limiting the inaccuracies introduced by a longer estimation window.
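The separation of the estimation and event windows can be sketched as follows. This is a minimal example, assuming daily returns stored in chronological order; the function name `split_windows`, the 10-day gap between the two windows, and the three-day event window are hypothetical choices for illustration, while the 100-day estimation length follows the guidance above.

```python
def split_windows(returns, event_index, est_length=100, gap=10, pre=1, post=1):
    """Return (estimation_returns, event_returns) for one asset.

    returns     : list of daily returns in chronological order
    event_index : position of the announcement day in `returns`
    est_length  : number of days in the estimation window
    gap         : days left unused between the two windows (assumption)
    pre, post   : days before/after the announcement in the event window
    """
    event_start = event_index - pre
    event_end = event_index + post      # inclusive
    est_end = event_start - gap         # exclusive upper bound
    est_start = est_end - est_length
    if est_start < 0 or event_end >= len(returns):
        raise ValueError("not enough data around the event")
    estimation = returns[est_start:est_end]
    event = returns[event_start:event_end + 1]
    return estimation, event
```

Keeping a gap between the windows is one simple way to ensure that the parameters are not estimated on returns contaminated by the event.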

According to Fama (1991), an efficient market is one in which “prices fully reflect all available information” (p. 1575). This definition implies that the price adjustment to a crucial event must be immediate, and explains why most event studies cover the day of the announcement and surrounding days. If significant abnormal returns are found in the run-up period, this could indicate insider trading. Merger announcements, for example, are often poorly concealed. This is consistent with Keown and Pinkerton’s (1981) finding with U.S. data that “leakage of inside information is a pervasive problem occurring at a significant level up to 12 trading days prior to the first public announcement of a proposed merger” (p. 855). A non-zero abnormal return in the run-up period could also indicate that market players anticipated the event (Jarrell & Poulsen, 1989), which does not necessarily indicate market inefficiency.4 A significant abnormal return on the announcement day is not incompatible with market efficiency, and can be interpreted as the value created or destroyed by the event. Additionally, if significant abnormal returns are observed after the announcement day, it could be evidence against market efficiency because the price impact of the event should not persist for days.

Measuring Normal Returns

To measure abnormal returns, we must first determine normal (expected) returns, the return in the absence of an event. Caution should be exercised due to the numerous potential choices. If the event period considered is short, measured abnormal returns will not be sensitive to the model used (Brown & Warner, 1980) because risk is unlikely to change dramatically over a short period. On the other hand, if the event period considered is long, estimating normal returns properly becomes more important.

Typically, an event study involves determining whether an event has created or destroyed value around the announcement date. To this end, we need a benchmark for the returns realized during the event. This benchmark is referred to as “normal returns,” those expected if the event had not occurred. The following equations denote some of the models used to determine normal returns (\({\text{E}}\left[ {r_{it} \left| {{\Omega }_{t} } \right.} \right]\)) on asset \(i\) (\(i = 1, \ldots ,n\)):5

$$ {\text{E}}\left[ {r_{it} \left| {{\Omega }_{t} } \right.} \right] = \mathop \sum \limits_{\tau = t - \delta }^{t - T} \frac{{r_{i\tau } }}{T - \delta + 1}, $$
(1)
$$ {\text{E}}\left[ {r_{it} \left| {{\Omega }_{t} } \right.} \right] = r_{Mt} , $$
(2)
$$ {\text{E}}\left[ {r_{it} \left| {{\Omega }_{t} } \right.} \right] = \alpha_{i} + \beta_{iM} r_{Mt} , $$
(3)
$$ {\text{E}}\left[ {r_{it} \left| {{\Omega }_{t} } \right.} \right] = \alpha_{i} + \mathop \sum \limits_{\tau = - \ell }^{\ell } \beta_{i\tau } r_{Mt + \tau } , $$
(4)
$$ {\text{E}}\left[ {r_{it} \left| {{\Omega }_{t} } \right.} \right] = \alpha_{i} + \beta_{iM} r_{Mt} + \beta_{iS} {\text{SMB}}_{t} + \beta_{iH} {\text{HML}}_{t} , $$
(5)
$$ {\text{E}}\left[ {r_{it} \left| {{\Omega }_{t} } \right.} \right] = \alpha_{i} + \beta_{iM} r_{Mt} + \beta_{iS} {\text{SMB}}_{t} + \beta_{iH} {\text{HML}}_{t} + \beta_{iU} {\text{UMD}}_{t} , $$
(6)
$$ {\text{E}}\left[ {r_{it} \left| {{\Omega }_{t} } \right.} \right] = \alpha_{i} + \beta_{iM} r_{Mt} + \beta_{iS} {\text{SMB}}_{t} + \beta_{iH} {\text{HML}}_{t} + \beta_{iR} {\text{RMW}}_{t} + \beta_{iC} {\text{CMA}}_{t} , $$
(7)
$$ {\text{E}}\left[ {r_{it} \left| {{\Omega }_{t} } \right.} \right] = \alpha_{i} + \beta_{iMKT} {\text{MKT}}_{t} + \beta_{{i{\text{ME}}}} r_{{{\text{ME}}t}} + \beta_{iI/A} r_{I/At} + \beta_{{i{\text{ROE}}}} r_{{{\text{ROE}}t}} , $$
(8)
$$ {\text{E}}\left[ {r_{it} \left| {{\Omega }_{t} } \right.} \right] = \alpha_{i} + \beta_{iW} r_{Wt} , $$
(9)
$$ {\text{E}}\left[ {r_{it} \left| {{\Omega }_{t} } \right.} \right] = \alpha_{i} + \theta_{i} \beta_{iM} r_{Mt} + \left( {1 - \theta_{i} } \right)\beta_{iW} r_{Wt} , $$
(10)

where \({\Omega }_{t}\) denotes the conditioning information for a normal performance model; and \(r_{it}\), \(r_{Mt}\), and \(r_{Wt}\) are the returns on asset \(i\), the local market, and the global market, respectively. Note that, at the outset, all parameters (the alphas, betas, and thetas) in the models above are measured using data in the estimation window.

Equation (1) describes the mean-adjusted model conventionally used in the literature (Masulis, 1980). This model takes the average return obtained during the estimation period as the normal return. Besides simplicity, its main advantage is that it is constant in the event window while providing a different benchmark for each asset. One major drawback, however, is that it produces biased abnormal returns if the event in question coincides with large market movements. This is because the method ignores market risk. Using Monte Carlo simulations, Klein and Rosenfeld (1987) find that the mean-adjusted method yields upwardly (downwardly) biased abnormal returns during a bull (bear) market.

Equation (2) – known as the market-adjusted model (see, e.g., Lakonishok & Vermaelen, 1990) – adjusts for market-wide movements, and is thus likely to work best for events that coincide with significant market movements. However, one drawback is that it assumes all assets have the same market sensitivity (a beta of one, with a zero alpha). Brown and Warner (1980) find in simulations that, in short-term event studies, the mean-adjusted model can detect the presence of abnormal returns. The performance of the mean-adjusted model deteriorates when the announcement date is not known with precision, when the value-weighted index is used, or when there is clustering in event dates.

Equation (3) represents the market model, classically used in Fama, Fisher, Jensen, and Roll (1969). This model continues to be the most widely used in the field, including in international finance. It takes market risk into account, and allows different assets to exhibit different sensitivities to market movements.6 Brown and Warner (1980: 205) find that it “performs well under a wide variety of conditions”. In international finance, the local market factor can be replaced by the global factor, or by a combination of local and global factors depending on the characteristics and the origin of the event firms.
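The first three benchmarks can be illustrated with a short Python sketch. It is a minimal example, assuming that the estimation-window and event-window returns have already been separated; the function names are hypothetical, and the market model is estimated by ordinary least squares, which is a common choice rather than a requirement of the text.

```python
from statistics import mean


def fit_market_model(est_asset, est_market):
    """Eq. (3): OLS estimates of alpha_i and beta_iM from the estimation window."""
    mean_a, mean_m = mean(est_asset), mean(est_market)
    cov = sum((a - mean_a) * (m - mean_m) for a, m in zip(est_asset, est_market))
    var = sum((m - mean_m) ** 2 for m in est_market)
    beta = cov / var
    alpha = mean_a - beta * mean_m
    return alpha, beta


def normal_mean_adjusted(est_asset, n_event_days):
    """Eq. (1): the estimation-window average, constant over the event window."""
    return [mean(est_asset)] * n_event_days


def normal_market_adjusted(event_market):
    """Eq. (2): the market return itself is the normal return."""
    return list(event_market)


def normal_market_model(est_asset, est_market, event_market):
    """Eq. (3): alpha_i + beta_iM * r_Mt for each event-window day."""
    alpha, beta = fit_market_model(est_asset, est_market)
    return [alpha + beta * m for m in event_market]
```

For the global variant, the same functions can be called with the global market return in place of the local one.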

Equation (4) is Dimson’s (1979) variant of the market model. It considers up to \(\ell\) lags and leads of the market factor. Many studies, including Fama and French (1992) and Hou, Xue, and Zhang (2020), use special cases of (4) to adjust for non-synchronous trading. This problem is less acute in long-run event studies focusing on large, mature markets (Jain, 1986). However, it can be problematic in international finance because the various developed and emerging markets have different trading hours, market participants, volumes, and liquidity. For example, the Australian Securities Exchange is never open at the same time as the New York Stock Exchange because of the 14-hour time difference between Sydney and New York. Therefore, if Australian and U.S. stocks are considered together, the lack of synchrony must be addressed. Dimson’s (1979) adjustment is more important for short-term event studies that focus on a few days around announcements than for long-horizon studies.
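A regression of the form in Eq. (4) can be sketched in plain Python as follows. This is an illustrative example only: the helper `ols` solves the normal equations by Gauss-Jordan elimination to keep the sketch free of external libraries, and the names `ols`, `dimson_regression`, and `ell` are hypothetical.

```python
def ols(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(k)]
    for col in range(k):
        # Partial pivoting for numerical stability
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def dimson_regression(asset, market, ell=1):
    """Regress asset returns on `ell` lags, the contemporaneous value, and
    `ell` leads of the market return, as in Eq. (4).

    Returns (alpha, slopes), with slopes ordered from lag ell to lead ell.
    """
    X, y = [], []
    for t in range(ell, len(market) - ell):
        X.append([1.0] + [market[t + tau] for tau in range(-ell, ell + 1)])
        y.append(asset[t])
    coefs = ols(X, y)
    return coefs[0], coefs[1:]
```

The sum of the estimated slopes is often used as a single beta adjusted for non-synchronous trading; the normal return in Eq. (4) uses each slope with its own lead or lag.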

Note that Eqs. (1) to (4) are generally adequate in short-horizon event studies, not only because risk is unlikely to change significantly over a few days but also because short-horizon data “can attenuate or eliminate the joint-hypothesis problem, that market efficiency must be tested jointly with an asset-pricing model” (Fama, 1991: 1601). However, over longer horizons, systematic risk does tend to change. Hence, how the various risks are measured and captured can strongly impact the inferences. Equations (5) to (8) show several multifactor models that have been developed in the literature, primarily because of the market model’s inability to explain anomalies. For example, Eq. (5) depicts Fama and French’s (1993) three-factor model, which adds size (SMB) and value (HML) factors to the market factor. The logic behind measuring abnormal returns with such a model is simple. Although the sample firms differ in size and value, the associated size and value premiums are likely to be constant when the event window is short. In this case, if the market model is used, the alpha parameter should capture the omitted size and value premiums, so the market model should work well at measuring the event’s impact. In contrast, if the event period is relatively long, a constant alpha will not be able to capture the time variation in the size and value premiums. These time-varying premiums can significantly bias the abnormal returns measured and exacerbate the joint hypothesis problem (Fama, 1970). Therefore, despite the apparent scarcity of empirical work in the international finance literature, we recommend exercising caution when modeling expected returns in long-term event studies.

Following Carhart (1997), Eq. (6) adds the momentum factor (UMD) to the three-factor model. Note that the primary justification for considering UMD is Fama and French’s (1996) conclusion that their three-factor model cannot account for the momentum effect. More specifically, Fama and French (2016) find that the UMD factor is crucial when the left-hand test assets are formed on the basis of past returns. The implication for event studies is that momentum should be controlled for when the events in question deal with past returns.

Equations (7) and (8) describe the five-factor model of Fama and French (2015) and the q-factor model of Hou, Xue, and Zhang (2015), respectively. In addition to the factors discussed above, these models adjust for profitability (RMW and \(r_{{{\text{ROE}}}}\)) and investment (CMA and \(r_{I/A}\)) factors. The adjustments are necessitated by the inability of the three- and four-factor models to explain return patterns related to abnormal capital investment (Titman, Wei, & Xie, 2004), asset growth (Cooper, Gulen, & Schill, 2008), investment growth (Xing, 2008), investment-to-assets (Lyandres, Sun, & Zhang, 2008), net stock issues (Loughran & Ritter, 1995), gross profits-to-assets (Novy-Marx, 2013), return on equity (Haugen & Baker, 1996), and profit margins (Soliman, 2008). Given these empirical results, if the event firms have varying levels of investment and profitability (often the case with international data), it will be essential to control for these factors.

When conducting an event study in international finance, a key question is whether the factors should be local or global, or a combination thereof. Of course, the answer ultimately depends on the origin of the selected firms, and the level of financial and economic integration of the countries (Akbari, Ng, & Solnik, 2020; Pukthuanthong & Roll, 2009). Under the assumption of perfect integration, all assets will be priced similarly regardless of location. The primary factor then becomes the global market factor (Grauer, Litzenberger, & Stehle, 1976; Harvey, 1991). If only firms in large developed markets are considered, then Eq. (9) – which can be viewed as the global version of the market model – is more appropriate for short-term studies.

However, even if the global market factor is considered instead of the local one, we find it more relevant to use local non-market factors instead of their global counterparts, for two reasons. First, a purely local asset pricing model with \(N\) non-market factors can never be integrated into an international model with \(N\) purely global factors. At best, it can only integrate into an international model with one global and many local factors. Second, the existing empirical evidence suggests that local versions of non-market factors perform better than their global counterparts. For example, Griffin (2002) examines the local and global versions of Fama and French’s (1992) three-factor model in explaining time series variations in international stock returns. He finds that local factors outperform global factors.

If the selected firms are from developed and emerging markets, adjusting for local and global market factor sensitivities will be necessary. In this case, Eq. (10), which is motivated by the mild segmentation model of Errunza and Losq (1985), is likely more appropriate for conducting an event study (see, e.g., Bekaert & Harvey, 1995; Karolyi & Stulz, 2003). Ultimately, we recommend using the model that gives the best average R-squared when firm returns are regressed on the factors in the pre-event window.8
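The selection rule based on the average R-squared can be sketched as follows. For brevity, the example is restricted to single-factor candidates such as the local market model of Eq. (3) and the global market model of Eq. (9), for which the R-squared equals the squared correlation between the firm and factor returns. The function names and the dictionary layout are hypothetical.

```python
from statistics import mean


def r_squared(y, x):
    """R-squared of a simple regression of y on x (squared correlation)."""
    my, mx = mean(y), mean(x)
    sxy = sum((a - my) * (b - mx) for a, b in zip(y, x))
    sxx = sum((b - mx) ** 2 for b in x)
    syy = sum((a - my) ** 2 for a in y)
    return sxy ** 2 / (sxx * syy)


def best_single_factor(firm_returns, candidate_factors):
    """Pick the factor with the highest average pre-event R-squared.

    firm_returns      : dict mapping firm name -> list of pre-event returns
    candidate_factors : dict mapping factor name -> list of factor returns
    """
    averages = {
        name: mean([r_squared(r, factor) for r in firm_returns.values()])
        for name, factor in candidate_factors.items()
    }
    best = max(averages, key=averages.get)
    return best, averages
```

Extending the comparison to Eq. (10) or to the multifactor models would require a multiple regression and, arguably, an adjusted R-squared so that models with more factors are not favored mechanically.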

In a case where the selected event firms come from both (more integrated) developed markets and (more segmented) emerging markets, it is reasonable to assume that Purchasing Power Parity will be violated. Thus, expected returns would include additional currency risk premiums (Adler & Dumas, 1983; Solnik, 1974; Stulz, 1981). Note that studies such as Dumas and Solnik (1995) and De Santis and Gerard (1998) find that exchange rate risk is priced into conditional international asset pricing tests, but we do not recommend controlling for foreign exchange risk premia in cross-country event studies that do not involve currencies. This is because event study methods typically deal with unconditional tests, and exchange rate risk is not priced into unconditional international asset pricing tests (e.g., Korajczyk & Viallet, 1989; Stehle, 1977). In addition, Heston and Rouwenhorst (1994) find that currency movements do not explain much of the variation in international stock returns.

Because of the growing literature on market anomalies, researchers are now much more aware of the importance of firm characteristics related to size, past performance, valuation, financing, investing, etc. The higher relevance of these characteristics raises an important question: How are event studies affected by event firm characteristics? Firms experiencing an event often share a particular characteristic; for example, they may all come from the same industry. Therefore, we must account for the impact of any differences in characteristics between the selected firms and those in the general population of firms (e.g., Dimson & Marsh, 1986).

Ahern (2009) runs simulations where the event firms are not randomly selected, but are chosen to be tilted toward specific characteristics based on size, value, and momentum. He finds that standard event study methods, including those that rely on Fama and French’s (1996) multifactor models, are subject to significant statistical errors. The bias induced by the characteristics of the event firms is likely to be greater in international finance because firms coming from the same countries tend to share certain characteristics (for example, Chinese firms tend to be larger than the median international firm). Using a cross-sectional model to compute normal returns (Lewellen, 2015) is an excellent way to mitigate bias, but we recommend using a characteristic-based benchmark that adjusts the returns of event stocks by the returns of a portfolio of control stocks in the same characteristic quintiles or deciles (see, e.g., Ahern, 2009; Daniel, Grinblatt, Titman, & Wermers, 1997).
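A characteristic-based adjustment of this kind can be sketched as follows. To keep the example short, it matches on a single characteristic (size) and uses quintiles; an actual application would typically sort on several characteristics, such as size, value, and momentum. The function names, the rank-based bucketing rule, and the equal-weighted control portfolio are assumptions made for illustration.

```python
from bisect import bisect_right
from statistics import mean


def size_bucket(size, sorted_sizes, n_buckets=5):
    """Assign a size to one of `n_buckets` groups using its rank among controls."""
    rank = bisect_right(sorted_sizes, size)  # number of controls no larger than `size`
    return min(n_buckets - 1, rank * n_buckets // (len(sorted_sizes) + 1))


def characteristic_adjusted(event_return, event_size, controls, n_buckets=5):
    """Event-stock return minus the equal-weighted return of control stocks
    in the same size bucket.

    controls : list of (size, return) pairs for non-event stocks
    """
    sorted_sizes = sorted(s for s, _ in controls)
    target = size_bucket(event_size, sorted_sizes, n_buckets)
    peers = [r for s, r in controls
             if size_bucket(s, sorted_sizes, n_buckets) == target]
    return event_return - mean(peers)
```

In a cross-country setting, one design choice is whether to form the buckets within each country or across the pooled sample; the sketch is agnostic on this point and simply uses whatever control stocks are passed in.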

Finally, we do not recommend using the measures of expected return listed above when the event involves a major international crisis, such as the 2007–2009 global financial crisis or the COVID-19 pandemic. Emerging evidence shows that trends toward greater globalization and integration of international markets tend to retrench during major crises (Doidge, Karolyi, & Stulz, 2020; Heikki, 2015; Milesi-Ferretti & Tille, 2011), and this phenomenon is often amplified by higher financial protectionism (Rose & Wieladek, 2014) and increased cross-border funding frictions (Akbari, Carrieri, & Malkhozov, 2022; Lane & Milesi-Ferretti, 2017). With such retrenchments, the loadings (betas) on local and global factors are likely to change significantly between the estimation and event windows, implying biases in the measured abnormal returns. One way to rectify this problem is to modify the estimating equations to allow the alphas and betas to vary conditionally with state variables that capture changes in the levels of market integration (e.g., Shanken, 1990).

Computing Abnormal Returns and Making Inferences from the Event

To infer from the selected sample of firms whether the event type considered creates or destroys value or violates market efficiency, we must first measure the abnormal returns generated in the event window. An abnormal return is the portion of the realized return left unexplained by the normal return. In other words, the abnormal return is the difference between the realized and normal returns:

$$ {\text{AR}}_{it} = r_{it} - {\text{E}}\left[ {r_{it} \left| {{\Omega }_{t} } \right.} \right]. $$
(11)

If the normal returns are well specified, the abnormal returns must average to zero in the estimation window. Hence, if the event under investigation is not impactful, the abnormal return will be indistinguishable from zero in the event window. In contrast, if the event is consequential, the abnormal return will differ from zero as soon as the event is announced. It can be helpful to observe the abnormal returns generated in the run-up and post-event periods to test for the possible presence of market inefficiencies.

When the event period is short (say, a few days), the statistical inference from the measured abnormal returns is relatively easy to make, and the tests are usually very powerful. However, if the event period is long (say, a few years), many complications can ensue, and the inherent tests are known to lack statistical power. Next, we discuss in more detail the inferences in short-term event studies, as well as the difficulties with long-horizon event studies.

Short-term event studies

Most event studies focus on a short window surrounding the event announcement, for two reasons. First, the efficient-market hypothesis posits that prices should quickly and correctly reflect new information. Under this view, as Fama (1998: 284) argues: “Any lag in the response of prices to an event is short-lived.” Second, because both the normal returns and any change in risk over a short time interval are small, the measured abnormal returns over such a short period are generally robust to modeling choices, and less afflicted by the joint hypothesis problem.

One firm, one day

For testing purposes, the case of a single firm for one day is easier to handle. Under the usual assumption that firm returns are independent and identically distributed, we can make this inference through a t-statistic:

$$ t\left( {{\text{AR}}_{it} } \right) = \frac{{{\text{AR}}_{it} }}{{s\left( {{\text{AR}}_{it} } \right)}}, $$
(12)

where \(s\left( {{\text{AR}}_{it} } \right)\) is the standard deviation of the abnormal return measured in the estimation window. Under the null hypothesis that the event has no valuation effects, \(t\left( {{\text{AR}}_{it} } \right)\) has a t distribution. If the estimation window is long (say, 120 days), \(t\left( {{\text{AR}}_{it} } \right)\) will have an approximately standard normal distribution. The distribution is not exactly normal because the standard deviation \(s\left( {{\text{AR}}_{it} } \right)\) in the denominator of (12) is itself estimated with error. If the statistic is higher (in absolute terms) than the critical value for the chosen significance level, we can reject the null hypothesis of zero abnormal performance at that level.
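As a minimal sketch of Eqs. (11) and (12), the Python snippet below computes abnormal returns and the single-firm, single-day t-statistic. The function names and all numbers are hypothetical and serve only as illustration; the expected returns are taken as given rather than estimated from a particular model.

```python
from statistics import stdev


def abnormal_returns(realized, expected):
    """Eq. (11): abnormal return = realized return - normal (expected) return."""
    return [r - e for r, e in zip(realized, expected)]


def single_day_t_stat(event_ar, estimation_ars):
    """Eq. (12): event-day AR scaled by the estimation-window standard deviation."""
    return event_ar / stdev(estimation_ars)


# Hypothetical estimation-window data (five days only, to keep the example short).
realized = [0.015, -0.005, 0.025, -0.015, 0.005]
expected = [0.005, 0.005, 0.005, 0.005, 0.005]
estimation_ars = abnormal_returns(realized, expected)

# Hypothetical event-day abnormal return of 5%.
t_stat = single_day_t_stat(0.05, estimation_ars)
print(round(t_stat, 3))
```

With a realistic estimation window (say, 120 days), the resulting statistic could be compared with standard normal critical values, as discussed above.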

However, in practice, researchers are not interested solely in measuring the abnormal performance of a single firm for a single day. They need to measure (1) abnormal performance across several firms (which requires cross-sectional aggregation of the abnormal returns of individual firms), and (2) abnormal performance over several days around the announcement (which requires temporal aggregation).

Several firms, one day

When researchers measure the event impact for a sample of firms, a simple method is to compute the abnormal returns of the average firm (in the estimation and event windows), and make inferences based on the t-statistic obtained for this average firm. We can compute abnormal returns of the average firm in two ways:

$$ {\text{AR}}_{t}^{ew} = \mathop \sum \limits_{i = 1}^{n} \frac{{{\text{AR}}_{it} }}{n}, $$
(13)
$$ {\text{AR}}_{t}^{vw} = \mathop \sum \limits_{i = 1}^{n} w_{i} {\text{AR}}_{it} . $$
(14)

Equations (13) and (14) aggregate abnormal returns across assets, but the former weights all abnormal returns equally, while the latter weights each abnormal return by \(w_{i}\), which can be taken as the capitalization weight of firm \(i\) in the estimation window. In international markets, the value-weighted average abnormal return obtained in (14) makes more sense because international stocks tend to be smaller, and small stocks become more influential when event firms’ returns are weighted equally.

Hou et al. (2020) argue that equal-weighted portfolio results often lean heavily on microcaps. As such, they may provide deceptive results because the investment capacity of microcaps is very limited. Fama (1998: 296) points out that “apparent anomalies in long-term post-event returns typically shrink a lot and often disappear when event firms are value-weighted rather than equal-weighted” and that “bad-model problems are more severe in inferences from equal-weight returns.” Finally, Asparouhova, Bessembinder, and Kalcheva (2013) note that value-weighted portfolios are better than their equal-weighted counterparts at alleviating the biases induced by pricing noise.

One advantage of using an average firm is that it naturally captures the cross-sectional dependence of the event firms. Inferences can also be made, as in the single firm case discussed above, via a t-statistic:

$$ t\left( {{\text{AR}}_{t}^{ew} } \right) = \frac{{{\text{AR}}_{t}^{ew} }}{{s\left( {{\text{AR}}_{t}^{ew} } \right)}}, $$
(15)
$$ t\left( {{\text{AR}}_{t}^{vw} } \right) = \frac{{{\text{AR}}_{t}^{vw} }}{{s\left( {{\text{AR}}_{t}^{vw} } \right)}}, $$
(16)

where the standard deviations (\(s\left( {{\text{AR}}_{t}^{ew} } \right)\) and \(s\left( {{\text{AR}}_{t}^{vw} } \right)\)) are estimated using the average abnormal returns obtained in the estimation window.
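The cross-sectional aggregation in Eqs. (13) to (16) can be sketched as follows. This is an illustrative example under stated assumptions: the helper names, the two-firm sample, and the market capitalizations are all made up, and the weights are computed from those hypothetical capitalizations.

```python
from statistics import stdev


def average_ar(ars, weights=None):
    """Eq. (13) when weights is None (equal weights), Eq. (14) otherwise."""
    if weights is None:
        weights = [1.0 / len(ars)] * len(ars)
    return sum(w * ar for w, ar in zip(weights, ars))


def average_firm_t_stat(event_ars, estimation_ars_by_day, weights=None):
    """Eqs. (15)-(16): average-firm AR over the s.d. of its estimation-window series."""
    series = [average_ar(day, weights) for day in estimation_ars_by_day]
    return average_ar(event_ars, weights) / stdev(series)


# Hypothetical sample: one large firm and one small firm.
market_caps = [900.0, 100.0]
weights = [cap / sum(market_caps) for cap in market_caps]

# Rows are estimation-window days; columns are firms.
estimation_ars_by_day = [
    [0.01, -0.03],
    [-0.01, 0.03],
    [0.02, -0.02],
    [-0.02, 0.02],
]
event_ars = [0.01, 0.09]  # the small firm reacts far more strongly

ar_ew = average_ar(event_ars)
ar_vw = average_ar(event_ars, weights)
t_ew = average_firm_t_stat(event_ars, estimation_ars_by_day)
t_vw = average_firm_t_stat(event_ars, estimation_ars_by_day, weights)
```

In this made-up sample the equal-weighted average abnormal return (5%) is driven by the small firm, whereas the value-weighted average (1.8%) is not, which illustrates why the choice of weights can matter.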

Several firms, several days

If investors have anticipated the event, the run-up abnormal returns are likely to be significant, but this may also be the case due to insider trading before an announcement. Post-announcement, market efficiency dictates that no abnormal returns should exist. Therefore, researchers are interested in making statistical inferences from the abnormal returns of event firms over time intervals around the event dates. This is often done by combining abnormal returns over these intervals.

Let CAR be the cumulative abnormal return of the average firm for the length of time \(T\) between two dates, \(\tau_{1}\) and \(\tau_{2}\) (\(T = \tau_{2} - \tau_{1} + 1\)). For the average firms defined above, the CARs are given by:

$$ {\text{CAR}}_{T}^{ew} = \mathop \sum \limits_{{t = \tau_{1} }}^{{\tau_{2} }} {\text{AR}}_{t}^{ew} , $$
(17)
$$ {\text{CAR}}_{T}^{vw} = \mathop \sum \limits_{{t = \tau_{1} }}^{{\tau_{2} }} {\text{AR}}_{t}^{vw} . $$
(18)

The statistical significance of these CARs is usually tested while assuming that abnormal returns are independent and identically distributed under the null, thus yielding the following t-statistics:

$$ t\left( {{\text{CAR}}_{T}^{ew} } \right) = \frac{{{\text{CAR}}_{T}^{ew} }}{{\sqrt T s\left( {{\text{AR}}_{t}^{ew} } \right)}}, $$
(19)
$$ t\left( {{\text{CAR}}_{T}^{vw} } \right) = \frac{{{\text{CAR}}_{T}^{vw} }}{{\sqrt T s\left( {{\text{AR}}_{t}^{vw} } \right)}}. $$
(20)

The assumption of independence is generally correct when there is no clustering of event windows across event firms. However, event windows may overlap in calendar time in some circumstances. In this case, the standard deviation of the cumulative abnormal returns, \(s\left( {{\text{CAR}}_{T} } \right) = \sqrt T s\left( {{\text{AR}}_{t} } \right)\), will be biased downward because of missing positive covariance terms (see Table 2 in Collins & Dent, 1984). This would lead to upwardly biased t-statistics, as in (19) and (20). Kothari and Warner (2007) find that we can address this bias by using a portfolio of event firms (i.e., the average firm) and evaluating the variance of the abnormal returns in the period preceding or following the event date. If a researcher has a legitimate reason to believe that the event under investigation is associated with higher uncertainty, the standard deviation of the abnormal returns measured in the estimation window will be biased downward. In this case, this standard deviation must be adjusted by multiplying it by the ratio of the cross-sectional standard deviation of the abnormal returns in the event window to that in the estimation window, so that the adjusted value is scaled up.
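Equations (17) to (20) can be sketched in a few lines of Python. The example assumes that the average-firm abnormal returns are independent and identically distributed, as stated above; the function names and the data are hypothetical.

```python
from math import sqrt
from statistics import stdev


def car(average_ars):
    """Eqs. (17)-(18): sum of the average firm's ARs over the window."""
    return sum(average_ars)


def car_t_stat(event_window_ars, estimation_window_ars):
    """Eqs. (19)-(20): CAR divided by sqrt(T) times the estimation-window s.d."""
    T = len(event_window_ars)
    return car(event_window_ars) / (sqrt(T) * stdev(estimation_window_ars))


# Hypothetical average-firm abnormal returns.
estimation_window_ars = [0.01, -0.01, 0.02, -0.02, 0.0]
event_window_ars = [0.01, 0.03, 0.0, 0.0]  # T = 4 days

car_value = car(event_window_ars)
t_value = car_t_stat(event_window_ars, estimation_window_ars)
```

The same functions apply to the equal- and value-weighted series, since the two differ only in how the average-firm abnormal returns are constructed.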

Note that there may be outliers in the abnormal returns of the event firms, or their distributions may depart from normality. Non-parametric tests can be used to compare the distributions of the realized and normal returns (e.g., Wilcoxon’s signed-rank test).

Long-term event studies

Under the efficient-market hypothesis, the price impact of an event should be swift and correct. After the adjustment, the new price will reflect the true intrinsic value. No wealth transfer should exist between investors that decide to sell shares and those that continue to remain shareholders. In contrast, direct evidence contradicting market efficiency would be found when the price reaction to an event lasts over a long mark-up period. This could lead to wealth transfers between existing shareholders and those trading based on readily available information.

While modeling choices about risk adjustment are unimportant for short-horizon event studies, the opposite is true for long-horizon studies. Kothari and Warner (2007: 22) provide a cogent description of this problem: “In multi-year long-horizon tests, risk-adjusted return measurement is the Achilles' heel for at least two reasons. First, even a small error in risk adjustment can make an economically large difference when calculating abnormal returns over horizons of 1 year or longer. […] Second, it is unclear which expected return model is correct, and therefore estimates of abnormal returns over long horizons are highly sensitive to model choice.”

The problem arises because market efficiency can only be tested jointly with a specific model for normal returns. However, all models are somewhat false descriptions of reality – and the modeling errors compound (faster than standard errors) over the long run. This is the “bad-model problem” described by Fama (1998). The bad-model problem is likely to be more acute in international finance, given higher uncertainty about which model – e.g., Sharpe-Lintner’s CAPM, the world CAPM (Grauer et al., 1976), the international CAPM with exchange rate risk (Solnik, 1974), the local q-factor model (Hou et al., 2015), the global version of the five-factor model (Fama & French, 2017), or the global characteristics model (Hou, Karolyi, & Kho, 2011; Nazaire, Pacurar, & Sy, 2020) – best describes the behavior of expected international stock returns. Even if we choose one model, its performance will vary over time with the state of the global economy because the level of market integration and cross-border activity is affected strongly by the global outlook (e.g., Doidge et al., 2020; Milesi-Ferretti & Tille, 2011). Hence, the long-term event study outcome is more sensitive to modeling choices, especially in international finance. Given this reality, we recommend using a variety of specifications when conducting a long-run international event study.

Next, we briefly describe three of the most widely used methods. The first two are based on event time; the last one uses calendar time.

Cumulative abnormal return (CAR) method

The CAR method is frequently used in long-term event studies. For example, Rau and Vermaelen (1998) use several variants of CARs to examine the 3-year post-announcement performance of acquirers in mergers and tender offers conditional on their levels of book-to-market. Perhaps the main reason why CAR has been frequently used is that it can be easily performed: A researcher only needs to increase the length of time \(T\) between the announcement date and the end of the study. In long-term event studies, \(T\) is typically between 1 and 5 years, and the test statistics are computed as in the short-term study (see (19) and (20)). To make inferences, we recommend using a bootstrapping approach instead of one based on asymptotic assumptions (e.g., Brock, Lakonishok, & LeBaron, 1992; Ikenberry, Lakonishok, & Vermaelen, 1995).
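To illustrate the general idea of bootstrapped inference, the sketch below builds an empirical null distribution of CARs by resampling abnormal returns with replacement. It is a generic resampling example, not the specific procedure of any of the papers cited above: the function name, the seed, the number of draws, and the data are assumptions, and simple i.i.d. resampling does not by itself address time series or cross-sectional dependence.

```python
import random


def bootstrap_car_p_value(observed_car, null_ars, T, n_draws=2000, seed=0):
    """Two-sided p-value: share of resampled CARs at least as extreme as the observed one.

    Each pseudo-CAR sums T abnormal returns drawn with replacement from
    `null_ars` (abnormal returns assumed to be unaffected by the event).
    """
    rng = random.Random(seed)
    extreme = 0
    for _ in range(n_draws):
        pseudo_car = sum(rng.choice(null_ars) for _ in range(T))
        if abs(pseudo_car) >= abs(observed_car):
            extreme += 1
    return extreme / n_draws


# Hypothetical abnormal returns from a period without the event.
null_ars = [0.01, -0.01, 0.02, -0.02, 0.0, 0.015, -0.015]

p_large = bootstrap_car_p_value(0.50, null_ars, T=3)  # far outside the null range
p_mid = bootstrap_car_p_value(0.03, null_ars, T=3)    # inside the null range
p_zero = bootstrap_car_p_value(0.0, null_ars, T=3)    # every draw is at least as extreme
```

A small p-value indicates that the observed CAR is unusual relative to the resampled CARs; in an actual study, the resampling scheme would need to reflect the dependence structure of the data.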

While easy to implement and interpret, the CAR method is prone to one serious problem: It assumes that the parameters in the estimation window (including the measured risk) are not only correct but stable enough to be used in the event window to measure expected returns. Fama (1998: 292) criticizes this hypothesis in the following terms: “For many events, long periods of unusual pre-event returns are common. Thus, the choice of a normal period to estimate a stock’s expected return or its market model parameters is problematic.”

Furthermore, Barber and Lyon (1997) recommend using buy-and-hold abnormal returns instead of cumulative abnormal returns because CARs ignore compounding.

Buy-and-hold abnormal return (BHAR) method

Barber and Lyon (1997) conduct a study of the statistical power of long-term event studies. They argue that tests based on CARs, which typically contrast realized with expected returns calculated using a reference portfolio, are inappropriate for long-run event studies. Part of their reasoning is that one must aggregate returns through time by using an approach that accounts for compounding. However, because the CAR method does not account for compounding, it provides a biased measure of true performance to long-term investors. The BHAR does account for compounding, but it is subject to the new listing bias (Ritter, 1991), and it increasingly departs from normality (positively skewed) as \(T\) increases (Fama, 1976).9

To control for these biases, Barber and Lyon (1997) propose measuring long-horizon abnormal performance using a buy-and-hold approach that matches the event firms to control firms of similar characteristics (see also Ikenberry et al., 1995; Lyon, Barber, & Tsai, 1999):

$$ {\text{BHAR}}_{iT} = \mathop \prod \limits_{t = 1}^{T} \left( {1 + r_{it} } \right) - \mathop \prod \limits_{t = 1}^{T} \left( {1 + {\text{E}}\left[ {r_{it} \left| {{\Omega }_{t} } \right.} \right]} \right), $$
(21)

where \({\text{E}}\left[ {r_{it} \left| {{\Omega }_{t} } \right.} \right]\) is the expected return on firm \(i\) in the event period, regardless of whether the reference return is determined using a characteristics-matched non-event firm, a reference portfolio obtained by sorting stocks, or a benchmark given by a particular asset pricing model. For the equal- and value-weighted average firms, the BHARs are given by:

$$ {\text{BHAR}}_{T}^{ew} = \mathop \sum \limits_{i = 1}^{n} \frac{{{\text{BHAR}}_{iT} }}{n}, $$
(22)
$$ {\text{BHAR}}_{T}^{vw} = \mathop \sum \limits_{i = 1}^{n} w_{i} {\text{BHAR}}_{iT} . $$
(23)

Barber and Lyon (1997) argue that long-term event studies conducted with this BHAR approach offer better tests in most situations. Moreover, by relying on the average event firm instead of individual firms, researchers can avoid the complications created by the correlation of returns across event stocks, often observed when there is clustering in event dates.

The BHAR approach is widely used in the literature, and provides some improvements over the CAR approach in long-term studies. However, both methods exacerbate the bad-model problem discussed above, especially the BHAR because it compounds instead of accumulating errors. As Fama (1998: 291) notes: “Bad-model problems are most acute with long-term buy-and-hold abnormal returns (BHARs), which compound (multiply) an expected-return model’s problems in explaining short-term returns.”
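The difference between compounding (BHAR) and summing (CAR) can be seen in a small numerical sketch of Eqs. (21) to (23). The function names and the two-period returns are hypothetical, and the expected returns are simply assumed rather than derived from a matched firm or a pricing model.

```python
from math import prod


def bhar(realized, expected):
    """Eq. (21): compounded realized return minus compounded expected return."""
    return prod(1 + r for r in realized) - prod(1 + e for e in expected)


def car_single_firm(realized, expected):
    """Sum of per-period abnormal returns, for comparison with the BHAR."""
    return sum(r - e for r, e in zip(realized, expected))


def average_bhar(bhars, weights=None):
    """Eq. (22) when weights is None (equal weights), Eq. (23) otherwise."""
    if weights is None:
        weights = [1.0 / len(bhars)] * len(bhars)
    return sum(w * b for w, b in zip(weights, bhars))


# Hypothetical firm: 10% realized versus 5% expected return in each of two periods.
realized = [0.10, 0.10]
expected = [0.05, 0.05]

bhar_value = bhar(realized, expected)            # 1.21 - 1.1025
car_value = car_single_firm(realized, expected)  # 0.05 + 0.05
```

Here the BHAR (10.75%) exceeds the CAR (10%) because the abnormal performance of the first period is itself compounded in the second; the same mechanism compounds any error in the expected-return benchmark.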

Jensen’s alpha method

Given the difficulties with event-time-based methods (CAR and BHAR), researchers such as Jaffe (1974) and Mandelker (1974) have suggested an alternative approach based on calendar time (see also Brav & Gompers, 1997; Mitchell & Stafford, 2000). It consists of three steps. First, define the study horizon (i.e., the number \(T\) of post-announcement months covered). Second, each month, form a portfolio of all firms that have gone through the event in the previous \(T\) months. The firms can be equal- or value-weighted in order to obtain the monthly portfolio return (\(r_{pt}\)). Third, run the following time series regression to obtain the portfolio alpha (\(\alpha_{p}\)) and the associated t-statistics:

$$ r_{pt} - r_{ft} = \alpha_{p} + \mathop \sum \limits_{k = 1}^{K} \beta_{pk} f_{kt} + \varepsilon_{pt} , $$
(24)

where \(f_{kt}\) is the return on factor \(k\) (\(k = 1, 2, \ldots, K\)) at time \(t\), \(r_{ft}\) is the risk-free return, \(\beta_{pk}\) is the loading on factor \(k\), and \(\varepsilon_{pt}\) is the zero-mean residual term. The statistical inference from (24), using the t-statistic on the alpha generated by the statistical package, is generally straightforward. However, we recommend that researchers present bootstrapped skewness-adjusted t-statistics that jointly adjust for correlation and skewness biases (Lyon et al., 1999, Eq. 6).
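The calendar-time steps can be sketched as follows. This is a simplified illustration under several assumptions: months are indexed by integers, the portfolio is equal-weighted, the regression in Eq. (24) is restricted to a single factor (\(K = 1\)) so that ordinary least squares has a closed form, and the t-statistic is the plain OLS one rather than the bootstrapped skewness-adjusted statistic. All names and data are hypothetical.

```python
from math import sqrt
from statistics import mean


def calendar_time_returns(event_month, firm_returns, horizon):
    """Equal-weighted monthly return of firms whose event fell in the prior `horizon` months."""
    months = sorted({m for returns in firm_returns.values() for m in returns})
    portfolio = {}
    for m in months:
        members = [
            firm for firm, e in event_month.items()
            if 1 <= m - e <= horizon and m in firm_returns[firm]
        ]
        if members:
            portfolio[m] = mean([firm_returns[firm][m] for firm in members])
    return portfolio


def jensen_alpha(excess_returns, factor):
    """OLS of Eq. (24) with one factor: returns (alpha, beta, t-statistic of alpha)."""
    n = len(excess_returns)
    x_bar, y_bar = mean(factor), mean(excess_returns)
    sxx = sum((x - x_bar) ** 2 for x in factor)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(factor, excess_returns))
    beta = sxy / sxx
    alpha = y_bar - beta * x_bar
    ssr = sum((y - alpha - beta * x) ** 2 for x, y in zip(factor, excess_returns))
    se_alpha = sqrt(ssr / (n - 2) * (1.0 / n + x_bar ** 2 / sxx))
    return alpha, beta, alpha / se_alpha


# Step 2 (hypothetical data): firm A's event is in month 0, firm B's in month 1.
event_month = {"A": 0, "B": 1}
firm_returns = {"A": {1: 0.02, 2: 0.04, 3: 0.01}, "B": {2: 0.00, 3: 0.03}}
portfolio = calendar_time_returns(event_month, firm_returns, horizon=2)

# Step 3 (hypothetical data): excess returns built with alpha = 0.002 and beta = 1.2.
factor = [0.01, -0.02, 0.03, 0.00]
noise = [0.001, -0.001, -0.001, 0.001]
excess = [0.002 + 1.2 * x + e for x, e in zip(factor, noise)]
alpha, beta, t_alpha = jensen_alpha(excess, factor)
```

With more than one factor, the same regression would typically be run with a statistical package; the single-factor case is used here only to keep the sketch self-contained.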

Even if the calendar time approach based on Jensen’s alpha does not compound the bad-model problems, it does give equal weight to each calendar month. Loughran and Ritter (2000) argue that this may reduce the power to reject the efficient-market hypothesis if managers time events to opportunistically exploit pricing errors, which tend to cluster in time within some industries. However, as Fama (1998) notes, this problem can be solved by weighting the abnormal monthly returns by their standard deviations (see also Jaffe, 1974; Mandelker, 1974).

Another important question in international finance is which model should be used (i.e., the identity of the factors). There is no consensus among researchers about the true underlying model, so tests based on (24) may be subject to the bad-model problem. This implies that researchers must explore the robustness of their results to various models: the purely local CAPM, the world CAPM, the international CAPM with exchange rate risk (Solnik, 1974), the global five-factor model (Fama & French, 2017), a local characteristics-based model (Lewellen, 2015), a global characteristics-based model (e.g., Hou et al., 2011; Nazaire et al., 2020). Given that measured abnormal returns (alphas) usually depend on the pricing model used, we recommend that researchers exercise caution when deriving strong conclusions from a long-term event study.

Ultimately, each method of studying the long-term effects of events has advantages and disadvantages. Therefore, test results are likely to be sensitive to the method used. Fama (1998: 304) uses this sensitivity to argue against rejecting the efficient market hypothesis, concluding: “The recent finance literature seems to produce many long-term return anomalies. Subjected to scrutiny, however, the evidence does not suggest that market efficiency should be abandoned. Consistent with the market efficiency hypothesis that the anomalies are chance results, apparent overreaction of stock prices to information is about as common as underreaction.” In contrast, Loughran and Ritter (2000: 362) caution against dismissing evidence of market inefficiency merely because it is not robust to alternative tests. They state: “We argue that if there are significant misvaluations in the stock market, abnormal returns should not be robust to alternative methodologies. In particular, some methods have little power to pick up material misvaluations that display predictable patterns”.

Finally, researchers should recognize the method sensitivity of long-horizon event studies by using at least one event time and one calendar time method. If both approaches yield abnormal performance, this would provide more solid evidence against market efficiency. If results are sensitive to the method used, caution should be exercised in drawing conclusions. In any case, researchers should dig deeper into the cross-country differences in outcomes.

Exploring the Determinants of Abnormal Returns

Cross-country event studies are interesting because they provide a natural way to examine country differences in short-term price reactions or long-term price deviations caused by an event. Once we establish such differences, the crucial question is: What drives them? To address this, scholars can examine the role of country-level institutions in a multivariate regression setting. Prior IB research (e.g., Boubakri, Guedhami, Kwok, & Saffar, 2016; Chen, El Ghoul, Guedhami, Kwok, & Nash, 2021) has typically built on the analytical framework developed by institutional economists such as North (1990) and Williamson (2000). It also distinguishes between formal (e.g., legal, financial, political) and informal (e.g., national culture and social) institutions.

To conduct such an analysis, the usual method is to run the following regression model (see, e.g., Brockman, Rui, & Zou, 2013):

$$ {\text{Performance}}_{ickt} = \theta_{0} + \theta_{1} {\text{Institutions}}_{ct} + \theta_{2} X_{ickt} + \mu_{c} + \mu_{k} + \mu_{t} + \varepsilon_{ickt} , $$
(25)

where the dependent variable \({\text{Performance}}_{ickt}\) is the measure of abnormal performance (AR, CAR, BHAR, or alpha depending on the situation) for firm \(i\) in country \(c\), industry \(k\), and year \(t\). Depending on the data used for the event study, \({\text{Performance}}\) can take several forms. For example, if the event study is conducted over a short period, \({\text{Performance}}\) can be measured at various calendar times and matched with the corresponding independent variables in a panel data framework. In contrast, with a 5-year event study, Eq. (25) is likely to be seen as a purely cross-sectional regression.

In each situation, it is important to include the appropriate fixed effects (\(\mu_{c}\), \(\mu_{k}\), and/or \(\mu_{t}\)) to control for unobserved heterogeneities shared by particular groups of observations, as well as the set of control variables (\(X_{ickt}\)) deemed important in the particular context of the study. The choice of institutional variables (\({\text{Institutions}}\)) depends on what the researcher believes determines the cross-country differences in performance following the event.

INTERNATIONAL BOND EVENT STUDIES: AN EXTENSION

In addition to equities, multinational corporations and governments regularly issue bonds denominated in various currencies with different maturities, coupons, and credit ratings to benefit from lower borrowing costs. They also serve to hedge profits by matching currency inflows and outflows. In this section, we extend the cross-country event study method in equity markets to another important but less studied asset class: international bonds (e.g., Eurobonds, foreign bonds, Yankee bonds). Historically, there have been fewer international bond event studies because of the lack of data. In fact, a review of all articles published in the JIBS shows only one recent study related to cross-border M&A activity (Renneboog, Szilagyi, & Vansteenkiste, 2017). Other bond event studies exist in finance, but only applied to U.S. data (see, e.g., Bessembinder, Kahle, Maxwell, & Xu, 2008).

Thus, although largely unexplored, bond market event studies play an important role in international finance in several settings: when the interests of stockholders and bondholders diverge (e.g., payout policy, capital structure changes, seasoned equity offering), when the interests of managers and all stakeholders – bondholders and stockholders – are not aligned (e.g., value-destroying acquisitions, spinoffs, CEO turnover), and in situations where there are several theories for why the stock market moves in a certain direction but with different implications for the bond market (Ederington, Guan, & Yang, 2015).10

Several issues arise for researchers conducting bond event studies in an international context. First, there are serious data limitations. Bonds are thinly traded. Several U.S. studies report that the majority of industrial bonds do not trade at all (e.g., Ederington et al., 2015). This problem is likely to be amplified in international markets, especially in emerging economies. Infrequent trading complicates bond return calculations because data for consecutive days are not available. To obtain a workable sample, researchers may require that bonds trade a certain number of times (for example, 50–100) over the period under study, and that returns be calculable for a minimum number of days (5–10 days). In addition, currently available bond datasets report end-of-day prices that are not always based on actual trades. These so-called “matrix priced” bonds should be excluded from the analysis because their values can deviate from actual prices. Bonds with short-term maturities (less than 1 year), those that have defaulted, been called, or retired, trades with settlement dates more than a week in the future, and “when issued” and “special price” trades should also be considered for exclusion.

Second, multinational corporations often issue several bonds with varying characteristics, such as maturities, investment grades, and ratings, in order to finance their needs and match the duration of their assets with their liabilities. Thus, researchers must determine how to calculate bond returns at a firm level when more than one bond is outstanding. Early work in the bond market (see, e.g., Cook & Easterwood, 1994; Hand, Holthausen, & Leftwich, 1992; Warga & Welch, 1993) considers each bond as a separate observation. However, this approach has the potential to bias the sample toward larger firms. Bond returns of the same firm are also correlated, leading to biased test statistics (Ederington et al., 2015). Later research has computed bond returns at a firm level using the weighted average of each bond in the sample. The weight is the amount outstanding for each bond, scaled by the total amount outstanding for all bonds (see, e.g., Maxwell & Stephens, 2003).

Third, prior research argues that bond returns exhibit considerable cross-sectional heteroscedasticity (Ederington et al., 2015). Because bond price variability depends on issue characteristics – maturity, coupon, ratings, and investment status – volatility is not constant across bonds. One way to control for this variability is to run separate tests for investment-grade and non-investment-grade bonds. Alternatively, researchers may reduce the heteroscedasticity by standardizing each bond’s event window return by its return volatility. This method can substantially increase the power of the signed-rank test.
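The firm-level aggregation described in the second point and the standardization described in the third point can be sketched as below. The function names and numbers are hypothetical, and measuring a bond’s return volatility as the sample standard deviation of its past returns is an assumption made for this illustration.

```python
from statistics import stdev


def firm_level_bond_return(bond_returns, amounts_outstanding):
    """Average of a firm's bond returns, weighted by each bond's share of the amount outstanding."""
    total = sum(amounts_outstanding)
    return sum(r * a / total for r, a in zip(bond_returns, amounts_outstanding))


def standardized_return(event_window_return, past_returns):
    """Event-window return divided by the bond's own return volatility (assumed: sample s.d.)."""
    return event_window_return / stdev(past_returns)


# Hypothetical firm with two bonds outstanding (amounts in millions).
firm_return = firm_level_bond_return([0.01, 0.03], [300.0, 100.0])

# Hypothetical bond with a 2% event-window return and five past returns.
z = standardized_return(0.02, [0.01, -0.01, 0.02, -0.02, 0.0])
```

Standardized returns of this kind put high- and low-volatility bonds on a comparable scale before a test such as the signed-rank test is applied.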

In order to measure abnormal bond returns, scholars must obtain bond prices. Currently, the main datasets for international bond prices are Bloomberg and Refinitiv Eikon Datastream, which provide actual dealer quotes data for Eurobonds.

Bond return (\(R\left( { - k, + n} \right)\)) during the announcement period from business day \(- k\) to \(+ n\) (i.e., \(k + n + 1\) days return) should be computed as:

$$ R\left( { - k, + n} \right) = \frac{{P_{n} + {\text{AI}}_{n} - \left( {P_{ - k - 1} + {\text{AI}}_{ - k - 1} } \right) + C}}{{P_{ - k - 1} + {\text{AI}}_{ - k - 1} }}, $$
(26)

where \(P_{t}\) is the price at time \(t\) with day 0 the disclosure date, \({\text{AI}}_{t}\) is accrued coupon interest as of trading day \(t\), and \(C\) is the coupon payment received during the event period. Prices should be obtained using the quote of the last trade that day.
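Equation (26) translates directly into code. The sketch below is illustrative only: the function and argument names are hypothetical, and prices, accrued interest, and coupons are assumed to be quoted in the same units (e.g., per 100 of face value).

```python
def bond_return(p_end, ai_end, p_start, ai_start, coupon=0.0):
    """Eq. (26): change in price plus accrued interest, plus coupons, over the starting value.

    p_start and ai_start refer to day -k-1; p_end and ai_end refer to day +n.
    """
    start_value = p_start + ai_start
    return (p_end + ai_end - start_value + coupon) / start_value


# Hypothetical window with no coupon payment.
r_no_coupon = bond_return(p_end=100.5, ai_end=1.5, p_start=99.0, ai_start=1.0)

# Hypothetical window in which a coupon of 3.0 is paid, resetting accrued interest to zero.
r_coupon = bond_return(p_end=99.5, ai_end=0.0, p_start=99.0, ai_start=1.0, coupon=3.0)
```

In the first case the return is 2%; in the second, the coupon received offsets the drop in accrued interest, giving a return of 2.5%.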

Because they are traded infrequently, not all bonds issued by a firm at the time of the event are traded during the event window. Thus, abnormal returns may not be cumulative for all firms. For example, the abnormal return from day 0 to day 5 is not the sum of abnormal returns from day 0 to day 1 and from day 1 to day 5 because the sample bonds with valid bond trading information in those 3 days are not the same. To obtain robust results, scholars should consider \(k\) to be 1 and 0 and \(n\) to be any number from 0 to 5. Thus, when \(k = 1\) and \(n = 5\), we can obtain an event study window of 7 days from \(-1\) to \(+5\).

After computing the bond return during the announcement period, researchers can construct bond indices from their own dataset in the announcement window by matching according to the various categories of bond characteristics. These include currency, rating, and maturity. They should then adjust the raw bond return by the bond index return to obtain the abnormal bond return. In this way, the computed abnormal bond return (ABR) is free of market, rating, and maturity effects, so the statistical inferences about the bond market reaction to the disclosure of the event are not subject to these confounding effects. That is:

$$ {\text{ABR}}\left( { - k, + n} \right)_{i} = R\left( { - k, + n} \right)_{i} - {\text{BM}}\left( { - k, + n} \right)_{i} , $$
(27)

where \({\text{BM}}\left( { - k, + n} \right)_{i}\) is the mean return on a benchmark rating–maturity matched portfolio corresponding to bond \(i\). We recommend using benchmark portfolios based on six rating classes (Aaa and Aa, A, Baa, Ba, B, and below B) and four maturity groupings (1 to 3 years, 3+ to 5 years, 5+ to 10 years, and over 10 years). Moody’s and/or the S&P rating should also be used to assign bonds to portfolios, if available. To calculate the benchmark return for each rating-maturity group, at least five bonds in that group are required to trade on days \(-k - 1\) and \(+n\), the two days whose prices enter (26).
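The benchmark adjustment in Eq. (27) can be sketched as follows. The names, group labels, and data are hypothetical; returning `None` for bonds with less than 1 year to maturity or for groups with fewer than five traded bonds is one possible way to handle those cases, chosen here for illustration.

```python
from statistics import mean

MIN_BONDS_PER_GROUP = 5  # minimum number of traded bonds needed for a benchmark return


def maturity_group(years_to_maturity):
    """Assign a bond to one of the four maturity groupings (None if under 1 year)."""
    if years_to_maturity < 1:
        return None
    if years_to_maturity <= 3:
        return "1 to 3"
    if years_to_maturity <= 5:
        return "3+ to 5"
    if years_to_maturity <= 10:
        return "5+ to 10"
    return "over 10"


def abnormal_bond_return(raw_return, group, benchmark_returns):
    """Eq. (27): raw return minus the mean return of the matched portfolio.

    `benchmark_returns` maps (rating class, maturity group) to the window
    returns of the bonds in that group. Returns None if the group is too thin.
    """
    peers = benchmark_returns.get(group, [])
    if len(peers) < MIN_BONDS_PER_GROUP:
        return None
    return raw_return - mean(peers)


# Hypothetical benchmark data for two rating-maturity groups.
benchmark_returns = {
    ("Baa", "3+ to 5"): [0.001, 0.002, 0.003, 0.004, 0.005],
    ("B", "over 10"): [0.01, 0.02],
}

group = ("Baa", maturity_group(4.2))
abr = abnormal_bond_return(0.010, group, benchmark_returns)
abr_thin = abnormal_bond_return(0.010, ("B", maturity_group(12.0)), benchmark_returns)
```

In practice the grouping key could also include the currency of denomination, in line with the matching categories mentioned above.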

CONCLUSION

This study explores the challenges of conducting event studies in international finance research. It has always been somewhat problematic to find an appropriate sample of cross-country event firms and data for these firms and their potential counterparts (control firms). However, significant progress has been made in this area. Many local and global news platforms are now available online, which makes it easier to obtain more precise announcement details and gather more comprehensive samples of event firms across international markets. Our work shows that, for research dealing with the returns of relatively large firms in major markets, Compustat Global is the database of choice. If small firms, especially those in emerging markets, are an essential part of the study, Datastream would be the best choice. In this case, the data should be cleaned for errors using existing screening procedures (e.g., Griffin et al., 2010; Ince & Porter, 2006; Lee, 2011; Tobek & Hronec, 2018). Once the event and the sample of firms have been chosen, it is important to control for the presence of potential sample selection biases. To this end, researchers should compare the distribution of the sample of event firms to that of the population with regard to characteristics such as size, book-to-market, momentum, industrial makeup, etc. If the two samples differ significantly, we recommend using techniques that match event and control firms with similar characteristics.

For short-horizon event studies involving returns, various models can be used. These include the usual market model, a purely global version, and a two-factor mild-segmentation model that accommodates both local and global market factors; the choice depends on whether the assets are primarily sampled from emerging markets, developed markets, or both. It is always important to explore the robustness of the results to non-synchronous trading by incorporating the leads and lags of the factors considered. When considering alternative performance metrics instead of returns, such as operating performance or Tobin’s Q, it is vital to control for the effects of outliers and to use non-parametric test statistics to improve the robustness of the conclusions. In this case, it also helps to match each event firm with a carefully chosen counterpart with the same pre-event return and characteristics.
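As a sketch of what such a specification could look like, the two-factor model with one lead and one lag of each factor can be written as follows. The symbols are assumptions chosen for illustration and may differ from the notation used elsewhere in the paper: \(R_{i,t}\) is the return on asset \(i\) on day \(t\), and \(R^{L}_{m,t}\) and \(R^{G}_{m,t}\) are the local and global market returns.

\[
R_{i,t} = \alpha_i + \sum_{j=-1}^{+1} \beta^{L}_{i,j} R^{L}_{m,t+j} + \sum_{j=-1}^{+1} \beta^{G}_{i,j} R^{G}_{m,t+j} + \varepsilon_{i,t}
\]

Dropping the global terms gives the usual (local) market model with leads and lags, and dropping the local terms gives the purely global version. The one-day lead and lag length is also an assumption; longer windows may be warranted for thinly traded assets.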

For long-horizon event studies, the choice of the model for expected returns becomes more important. We recommend using at least one measure of abnormal performance based on event time (such as CAR or BHAR) and one measure based on calendar time (such as alpha). Researchers should also consider the robustness to various models for expected returns (i.e., pricing models). These models capture a fully segmented market (i.e., all factors are purely local), a fully integrated market (i.e., all factors are global), and a mixed situation, where returns react to both local and global factors. If results based on both equal and value weighting cannot be presented, the value-weighted results should be preferred. This is because they are (1) more conservative and less subject to bad-model problems (Fama, 1998), (2) more realistic in terms of investment capacity (Hou et al., 2020), and (3) less subject to biases induced by price noise (Asparouhova et al., 2013). Fama (1998) explains that strong evidence against market efficiency requires that the results be robust across various approaches.
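To make the two event-time measures concrete, the following minimal Python sketch computes a cumulative abnormal return (CAR) and a buy-and-hold abnormal return (BHAR) for one firm, using their standard textbook definitions. The function names and the toy inputs are assumptions, and the benchmark series stands in for whichever expected-return model is chosen.

```python
from math import prod


def car(returns, expected):
    """Cumulative abnormal return: sum of per-period abnormal returns."""
    return sum(r - e for r, e in zip(returns, expected))


def bhar(returns, benchmark):
    """Buy-and-hold abnormal return: compounded firm return minus
    compounded benchmark return over the same window."""
    return prod(1 + r for r in returns) - prod(1 + b for b in benchmark)


# Hypothetical two-period example.
firm = [0.10, -0.05]
bench = [0.02, 0.01]
print(car(firm, bench), bhar(firm, bench))
```

The two measures generally differ over long windows because BHAR compounds returns while CAR adds them, which is one reason to check whether conclusions hold under both.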

The test statistics obtained for short-horizon event studies are generally correct, but we suggest (in addition to asymptotic tests) always using the bootstrap method (e.g., Brock et al., 1992; Ikenberry et al., 1995) to assess statistical significance for two reasons.11 First, it is easy to implement, reasonably precise, and reliable in smaller samples. Second, measures of abnormal performance in cross-country studies are often, if not always, subject to non-trivial time series and cross-sectional correlations, which makes asymptotic inferences unreliable.
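The general idea of a resampling-based significance test can be illustrated with a deliberately simplified Python sketch. It is a plain bootstrap of the mean abnormal return under a zero-mean null, not the specific procedures of Brock et al. (1992) or Ikenberry et al. (1995); the function name, the number of draws, and the seed are assumptions. Because it resamples firms independently, it does not by itself address the cross-sectional correlation discussed above.

```python
import random


def bootstrap_p_value(abnormal_returns, n_draws=10_000, seed=0):
    """Two-sided bootstrap p-value for the mean abnormal return.

    The sample is recentered so that the null of a zero mean holds,
    then resampled with replacement n_draws times.
    """
    rng = random.Random(seed)
    n = len(abnormal_returns)
    observed = sum(abnormal_returns) / n
    centered = [ar - observed for ar in abnormal_returns]

    extreme = 0
    for _ in range(n_draws):
        draw_mean = sum(rng.choices(centered, k=n)) / n
        if abs(draw_mean) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the p-value strictly positive.
    return observed, (extreme + 1) / (n_draws + 1)
```

A call such as `bootstrap_p_value(cars)` on a list of firm-level CARs returns the observed mean and its bootstrap p-value, which can be reported alongside the asymptotic test statistic.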

Over the past half-century, a large number of event studies have been conducted, and they have contributed to a unique body of knowledge in various areas of the social sciences. Although much has been discovered about how prices react to various events, little is yet known about how international environments and institutions shape event outcomes. Indeed, the top four IB and finance journals we reviewed contain 699 event studies, but fewer than 12% of these studies compare events across countries, and fewer still have examined the determinants of cross-country differences. We highlight many areas where such cross-country analyses are lacking, and indicate how future studies can be conducted.

Notes

  1. For example, a dividend increase may not be fully captured by market participants in high-conformity cultures. Thus, to better assess the stock market reaction to this event in such cultures, a long-term approach is needed.

  2. We search the full text of articles in the JIBS (journal website), JF (JSTOR), JFE (journal website), and RFS (journal website) using the keyword “event study”. Then, we manually verify that these papers actually use an event-study methodology. The detailed list of the event studies is available upon request.

  3. Brown and Warner (1980) report little difference in the statistical power of various event study approaches if the moment at which the event occurs is known with precision.

  4. If significant abnormal returns are found in the run-up period, it would be interesting to measure how they correlate with those obtained in the post-announcement (mark-up) period to gauge who bears the cost of this insider trading/information leakage.

  5. The notation used here is similar to that in Campbell et al. (1997). Parametric tests rely on distributional assumptions such as the normality of returns. These can be violated, especially when dealing with daily returns (Fama, 1976). Brown and Warner (1980) find that non-parametric statistics (which do not make any distributional assumptions) are misspecified because they tend not to reject the null of zero abnormal returns often enough. However, when measures of operating performance are used instead of market returns, Barber and Lyon (1996) find that non-parametric tests such as the Wilcoxon statistic are more powerful than parametric statistics. They attribute this result to the presence of outliers in the measures of operating performance. Although our survey deals with standard event-study methodology, it should be noted that some events are not random, but result from the deliberate choices of firms or other entities. Failure to recognize this self-selection via a conditional event study can lead to bias (e.g., Acharya, 1988). Prabhala (1997) argues, however, that the traditional approach usually leads to valid inferences under certain conditions.

  6. The mean adjusted model is the particular case of the market model where the beta is 0; the market-adjusted model is the particular case where the beta is 1 and the alpha is 0.

  7. This global adjustment is important because some studies, such as De Santis and Gerard (1997), find in asset pricing tests that local market risk is not priced in developed markets.

  8. Campbell et al. (1997) show how the precision of the inference is related to the R-squared.

  9. Eckbo, Masulis, and Norli (2000) criticize the BHAR approach on the grounds that the assets of the buy and sell portfolios are not known in advance.

  10. See, for example, Hand, Holthausen, and Leftwich (1992), Maxwell and Stephens (2003), Maxwell and Rao (2003), Adams and Mansi (2009), Kecskés, Mansi, and Zhang (2013), Gao, Liao, and Wang (2011), Klein and Zur (2011), and DeFond and Zhang (2014).

  11. See Mitchell and Stafford (2000) for a relevant criticism of the use of bootstrapped test statistics.