Organizational resilience remains an important topic in both the research and practice of organization design, and it is studied through a diverse set of perspectives and methods (see Table 1). In this note, we propose an approach to measuring organizational resilience that is scalable and generalizable across contexts. Such an approach may prove useful to test the qualitative conclusions drawn from a large variety of inductive studies about resilient organizations that have accumulated in the literature (Van der Vegt et al. 2015).

Table 1 Empirical works that were published since 2000 in 11 top management journals and operationalized organizational resilience (see Appendix for more information)

The scope of our analysis is limited by three choices we make. First, our focus is on organizational (rather than individual) resilience, though we take a broad perspective on what an organization is (Puranam 2018). The measure of organizational resilience we propose here can be applied to organizations that are smaller (e.g., divisions, departments and teams) or larger (e.g., alliances, eco-systems, meta-organizations) than a single firm.

Second, our focus is on measuring organizational resilience as an observable outcome, and we are agnostic to the antecedents that produce it. The literature on the mechanisms that underlie organizational resilience is vast, beginning at least with Thompson’s (1967) distinction between buffering vs. adaptation, and recent reviews provide a detailed account of how those seminal ideas have been developed (e.g., Williams et al., 2017; Mithani, 2020). Accordingly, we do not offer any deep elaboration on the antecedents of these outcomes and encourage further research to do so.

Third, we propose to measure resilience in terms of changes to organizational performance after unexpected adversity (Weick and Sutcliffe 2001; Lengnick-Hall and Beck 2009). This means that our approach is ideally suited to measure organizational resilience when time series data on organizational performance are available.

Organizational resilience as a performance outcome

Typically, an organization’s performance after an unexpected adversity (such as the entry of a competitor, Argyres et al. 2015; an unexpected terror attack, Kendra and Wachtendorf 2003; an extreme weather event, Dutta 2017; an epidemic, Rao and Greve 2018; to name a few) will exhibit a drop at the onset of adversity, as well as a possible recovery. To avoid circularity, the incidence and the magnitude of the adversity experienced by an organization should not be measured by the same performance metric that is used to assess its resilience.

Intuitively, there are at least four separately measurable components of resilience as an outcome: (a) The magnitude and (b) the rate of the drop in performance, and (c) the magnitude and (d) the rate of recovery in performance after the adverse event. Several labels have been used to describe these measurable components derived from performance trajectories which have been discussed by Ayyub (2014) in detail (see also Tang 2019). For instance, the rate of drop in performance ranges from graceful to brittle as it increases. Robustness is described as the residual performance, associated with the magnitude of the drop of performance. Lack of full recovery in magnitude indicates scarring. In certain cases, the magnitude of performance recovery in the post-shock period may lead to higher performance than in the pre-shock period and is labelled antifragility (e.g., Taleb 2012; Kupers and Mullie 2014; Martin 2020).
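As a minimal sketch of how these four components could be read off a discrete performance series, the following assumes a flat counterfactual equal to the last pre-shock observation and measures recovery up to the highest performance observed after the trough within the window. The function name `resilience_components`, its arguments, and the example data are hypothetical and serve illustration only.

```python
def resilience_components(performance, shock_index):
    """Split a performance trajectory into drop and recovery components.

    Assumption of this sketch: the reference level is the observation at
    `shock_index` (the last pre-shock period), i.e. a flat counterfactual.
    """
    post = list(performance[shock_index:])
    if len(post) < 2:
        raise ValueError("need at least one post-shock observation")
    reference = post[0]
    # Trough: first occurrence of the lowest post-shock performance.
    trough_offset = min(range(len(post)), key=post.__getitem__)
    drop = reference - post[trough_offset]
    # Recovery: climb from the trough to the highest later performance.
    after = post[trough_offset:]
    peak_offset = max(range(len(after)), key=after.__getitem__)
    recovery = after[peak_offset] - after[0]
    # Rates are magnitudes divided by the number of periods they took.
    rate_of_drop = drop / trough_offset if trough_offset else 0.0
    rate_of_recovery = recovery / peak_offset if peak_offset else 0.0
    if recovery > drop:
        outcome = "antifragile"
    elif recovery < drop:
        outcome = "scarred"
    else:
        outcome = "fully recovered"
    return {
        "drop": drop,
        "rate_of_drop": rate_of_drop,
        "recovery": recovery,
        "rate_of_recovery": rate_of_recovery,
        "outcome": outcome,
    }


# Hypothetical quarterly performance; the shock hits after index 1.
example = [100, 100, 80, 60, 80, 100, 110]
print(resilience_components(example, shock_index=1))
```

Because the recovery is measured up to the end of the supplied series, the labels this sketch returns depend on the observation window, which is exactly the implicit assumption discussed next.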

While these notions are intuitive, when used in isolation, they make implicit assumptions about the window over which observations take place, and the counterfactual being used to evaluate performance. To make such assumptions explicit, we build on previous work (such as Tierney and Bruneau 2007; Cimellaro et al. 2016; Zhang et al. 2019; Tang 2019) and define a measure of organizational resilience as the negatively signed cumulative performance difference between a hypothetical performance trajectory that would have been realized in a world without the shock \({\pi }_{0}\) and the realized performance trajectory \(\pi\) (see the area between dashed and solid curves in Fig. 1). In formal terms, this corresponds to the integral:

Fig. 1

Measurement of absolute revenue resilience of Martin Marietta Materials Inc. for illustration purposes. The pre-shock period is used for forecasting to construct the counterfactual. The 2008 Financial Crisis impacts the company in Q2 2008. After a 50% drop in its revenues, the company returns to forecasted revenues in Q2 2014, whereas its revenues recover to pre-shock levels in Q3 2013. For companies with high seasonality and growth, forecasting will offer more accurate measures. Accordingly, in the post-shock period, the company improves on its pre-shock trend and experiences larger growth (anti-fragile outcome)

$$R\left(\alpha \right)= -{\int }_{{t}_{\mathrm{pre}}}^{{t}_{\mathrm{post}}}\left[{\pi }_{0}\left(t\right)-\pi \left(t;\alpha \right)\right]\mathrm{d}t,$$
(1)

where \({t}_{\mathrm{pre}}\) and \({t}_{\mathrm{post}}\) correspond to moments in time before and after the shock happens, \({\pi }_{0}\) corresponds to the counterfactual performance, \(\pi\) to the actual performance, and \(\alpha\) denotes a set of parameters associated with organizational mechanisms determining resilience. The negative sign ensures that the more the realized performance matches or exceeds the counterfactual, the greater the organizational resilience and vice versa.
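With quarterly or annual data, the integral has to be approximated numerically. The following is a minimal sketch that uses the trapezoidal rule on equally spaced observations; the function name `resilience`, the spacing argument `dt`, and the example series are hypothetical, and other quadrature rules would serve equally well.

```python
def resilience(actual, counterfactual, dt=1.0):
    """Approximate R = -integral of [pi_0(t) - pi(t)] dt by the trapezoidal rule.

    `actual` and `counterfactual` are equally long sequences of performance
    observations sampled every `dt` time units between t_pre and t_post.
    """
    if len(actual) != len(counterfactual):
        raise ValueError("series must have the same length")
    if len(actual) < 2:
        raise ValueError("need at least two observations")
    # Gap between the counterfactual and the realized performance per period.
    gaps = [c - a for a, c in zip(actual, counterfactual)]
    # Sum the trapezoid areas between consecutive observations.
    area = sum((gaps[i] + gaps[i + 1]) / 2.0 * dt for i in range(len(gaps) - 1))
    return -area


# Hypothetical data: flat counterfactual of 100 and a V-shaped dip.
counterfactual = [100, 100, 100, 100, 100]
actual = [100, 80, 60, 80, 100]
print(resilience(actual, counterfactual))  # -80.0
```

A trajectory that stays above the counterfactual yields a positive value, in line with the sign convention of the measure.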

The four intuitive measures (drop, rate of drop, recovery and rate of recovery) can be shown to be approximations of R and positively correlated with it (please see Appendix). However, there may be negative correlations among them, implying that researchers who use different subsets of these measures may reach opposite conclusions about the same firms. For instance, drop of performance and time to recovery are two variables measured in several works (Table 1). However, time to recovery is itself a function of a subset of the variables we have derived, given that Time to Rec. = Drop \(\times\) (1/Rate of Drop + 1/Rate of Rec.). Using only these two variables has two important consequences. First, it omits the possibility that organizations may recover beyond their previous performance, leading to an incomplete description of resilience by omitting the fourth component of the R measure (the level of recovery). Second, when not adjusted for the drop, time to recovery becomes a measure of the sum of correlations between the drop of performance and the inverse rate of drop and the inverse rate of recovery; this runs the risk of drawing inconsistent conclusions from measurements. We argue that, if all four variables are used together, they constitute a minimal set of variables that can capture an unbiased (even if approximate) measure of the resilience of an organization. Therefore, a key injunction from our analysis is that researchers should ideally aim to use all four measures, or at least make explicit their assumptions about the measures they are not capturing, in order to make the resilience measurement more transparent.
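To see where the time-to-recovery identity comes from, consider a stylized case that is an assumption of this illustration rather than part of the formal analysis: a flat counterfactual, a linear decline of magnitude \(D\) at rate \({r}_{d}\), and a linear recovery back to the pre-shock level at rate \({r}_{r}\). The symbols \(D\), \({r}_{d}\), \({r}_{r}\) and \(T\) are introduced here for illustration only. The decline then lasts \(D/{r}_{d}\) periods and the recovery \(D/{r}_{r}\) periods, and the area between the two curves is a triangle with base \(T\) and height \(D\):

$$T=\frac{D}{{r}_{d}}+\frac{D}{{r}_{r}}=D\left(\frac{1}{{r}_{d}}+\frac{1}{{r}_{r}}\right),\qquad R=-\frac{1}{2}DT=-\frac{{D}^{2}}{2}\left(\frac{1}{{r}_{d}}+\frac{1}{{r}_{r}}\right).$$

For example, with \(D=40\), \({r}_{d}=20\) and \({r}_{r}=10\) per quarter, we obtain \(T=40\times (1/20+1/10)=6\) quarters and \(R=-120\). Under these assumptions, two organizations with the same time to recovery but different drops have different values of R, which illustrates why time to recovery is difficult to interpret without the drop.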

In its general form, this measure directs our attention to two explicit choices, namely, the time interval \(({t}_{\mathrm{pre}},{t}_{\mathrm{post}})\), and the counterfactual trajectory, \({\pi }_{0}(t)\). Below we discuss the implications of these choices for observed organizational resilience. We summarize the following discussion in Table 2, highlighting the assumptions, advantages, limitations, and conditions according to which one assumption may be more suitable than the other.

Table 2 Summary table for our discussion of the underlying assumptions in measuring organizational resilience

Counterfactuals and time windows in measuring organizational resilience

Choice of counterfactual \({\pi }_{0}(t)\)

Measuring the absolute resilience of an organization requires determining a counterfactual which describes the performance of an organization as if the shock did not take place. This approach is, for example, visualized by Ayyub (2014); the labels arise from the comparison of a realized performance curve with an established counterfactual one. In the organizational context, when the exposure to the adverse event cannot be randomized, the determination of absolute resilience relies on forecasting methods. For example, in the context of regional economic organization, Sensier et al. (2016) use macro business cycles to forecast expected performances and determine the drop of performance at each cycle to measure resilience. In Fig. 1, we see an illustration of this: Martin Marietta Materials Inc. (MMM) is an S&P500 company in the construction materials industry per GICS categorization, coded as 151020. As the quarterly revenue curve (solid line) shows, during the 2008 Financial Crisis, the company suffered a 50% drop in revenues over the course of almost 2 years. To assess the absolute resilience, we used the Holt-Winters exponential smoothing method to forecast what would have happened if the shock had not occurred (dashed line beginning in 2008-Q1). In this case, we observe that the actual curve surpasses the forecasted one in 2014-Q1, indicating that MMM experienced an anti-fragile outcome after this shock. However, such forecasting methods rely on time series data up to the moment of the shock and strictly assume that a model derived from past data predicts future outcomes well enough; forecasts over prolonged horizons may not satisfy this assumption.
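To make the forecasting step concrete, the following is a hand-rolled sketch of the additive variant of Holt-Winters smoothing in plain Python. It is not the implementation used for Fig. 1: the additive form, the heuristic initialization, the fixed smoothing parameters `alpha`, `beta` and `gamma`, and the example data are all assumptions of this illustration. In practice, an established statistical library that estimates the smoothing parameters from the data would be preferable.

```python
def holt_winters_additive(series, season_len, horizon,
                          alpha=0.3, beta=0.1, gamma=0.1):
    """Forecast `horizon` steps ahead with additive Holt-Winters smoothing."""
    m = season_len
    if len(series) < 2 * m:
        raise ValueError("need at least two full seasons of pre-shock data")
    first, second = series[:m], series[m:2 * m]
    mean_first = sum(first) / m
    mean_second = sum(second) / m
    # Heuristic initial state (an assumption of this sketch): the per-period
    # trend is the change in seasonal means, the level is placed one step
    # before the first observation, and the seasonal terms are the detrended
    # first-season observations.
    trend = (mean_second - mean_first) / m
    level = mean_first - trend * (m + 1) / 2.0
    seasonals = [first[i] - (level + trend * (i + 1)) for i in range(m)]

    for t, y in enumerate(series):
        s = seasonals[t % m]
        prev_level = level
        # Update level, trend, and the seasonal term of the current period.
        level = alpha * (y - s) + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        seasonals[t % m] = gamma * (y - level) + (1 - gamma) * s

    n = len(series)
    # Extrapolate the last level and trend, reusing the seasonal pattern.
    return [level + h * trend + seasonals[(n + h - 1) % m]
            for h in range(1, horizon + 1)]


# Hypothetical pre-shock quarterly revenues: linear growth plus seasonality.
season = [5, -5, 10, -10]
pre_shock = [100 + 2 * t + season[t % 4] for t in range(12)]
counterfactual = holt_winters_additive(pre_shock, season_len=4, horizon=4)
print([round(x, 2) for x in counterfactual])
```

On noise-free data such as this example, the forecast simply continues the linear trend and the seasonal pattern; with real revenue data, the forecast error grows with the horizon, which is the limitation noted above.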

An alternative approach measures the relative resilience of an organization, where the counterfactual is the observed performance of a group of organizations that have faced the same adversity. For instance, one can compare the performance of a firm to the industry average, in an industry that has been affected by an adversity befalling all its firms at the same point in time. The relative resilience of firms to the COVID-19 pandemic or the government regulatory restrictions that arose to cope with it can easily be measured in this way. Measurement of relative resilience is considerably easier as the counterfactual curve is derived from observable performance. The counterfactual group can also be constructed using synthetic controls (Abadie et al. 2010; Tirunillai and Tellis 2017; Conti and Valentini 2018) ahead of a shock to establish more accurate counterfactuals after the shock period. In this measurement, it is assumed that the only heterogeneity in shock response arises from the underlying resilience-related capabilities (the \(\alpha\) parameters in Eq. 1), not from differing levels of shocks across organizations. Otherwise, organizations that are hit by smaller shocks may appear to be more resilient, although they just faced less adversity.
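A minimal sketch of the relative measure follows. Each trajectory is normalized by its first (pre-shock) observation, the peer counterfactual is the simple average of the normalized trajectories, and Eq. (1) is then approximated with the trapezoidal rule. The function names, the firm labels and the data are hypothetical, and including the focal firm in its own peer average is a design choice of this sketch.

```python
def normalize(series):
    """Express a trajectory relative to its first (pre-shock) observation."""
    base = series[0]
    if base == 0:
        raise ValueError("pre-shock performance must be non-zero")
    return [x / base for x in series]


def peer_counterfactual(group):
    """Average the normalized trajectories of all firms, period by period."""
    normalized = [normalize(s) for s in group.values()]
    return [sum(vals) / len(vals) for vals in zip(*normalized)]


def relative_resilience(name, group):
    """Approximate Eq. (1) with the peer average as the counterfactual."""
    own = normalize(group[name])
    counterfactual = peer_counterfactual(group)
    gaps = [c - a for a, c in zip(own, counterfactual)]
    # Trapezoidal rule with one period between observations.
    return -sum((gaps[i] + gaps[i + 1]) / 2.0 for i in range(len(gaps) - 1))


# Hypothetical revenues of three peers, starting at the last pre-shock period.
group = {
    "A": [100, 90, 80, 100],
    "B": [200, 160, 120, 160],
    "C": [50, 30, 20, 50],
}
for firm in group:
    print(firm, round(relative_resilience(firm, group), 4))
```

When the focal firms are part of the average, the scores sum to zero across the group by construction, which makes explicit that this measure ranks firms against each other rather than against a world without the shock.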

In Fig. 2a (top), we illustrate the measurement of relative resilience for MMM and its peers, Vulcan Materials Co. (VM) and James Hardie Industries PLC (JHI). A closer look at the industry of MMM shows that it is rather concentrated; the top 3 companies with more than 10% market share cover almost 80% of the market in the time period of interest. Since the impact of the shock will differ across different market shares, we focus on these top three companies to establish the counterfactual through averaging. Adjusting the performance for pre-shock performance (2007-Q4), we see that during the shock period the market leader VM has taken advantage of its market position and continued to perform better compared to its peers. However, its relative resilience declines drastically over the longer term. Meanwhile, JHI, MMM’s next closest rival, falls below the population response, only to recover strongly in the post-shock period. Overall, MMM’s relative resilience remains unperturbed. In Fig. 2b, we further illustrate how the choice of peers matters. The resilience of all three companies is considerably underestimated when an industry-wide average is used instead of the peer-based one. To minimize such errors in choosing the appropriate peer group, a bottom-up determination of peers through synthetic control approaches may prove to be much more appropriate. For comparison with the absolute resilience measure, we also illustrate the results following from Eq. (1) for the three companies in Fig. 2a (bottom). We observe that the choice of counterfactual can impact the analysis: MMM turns out to be the most resilient almost throughout the whole observation period. Meanwhile, VM’s early resilience remains for both relative and absolute measures.

Fig. 2

a (Top) Measurement of relative resilience using Eq. (1). All companies with more than 5% market share (top three companies registered under GICS Industry code 151020 cover more than 75% of the mining and quarrying industry). The 3rd one is dropped as its downturn (and eventual bankruptcy) during the financial crisis is due to idiosyncratic reasons. Martin Marietta Materials Inc. is resilient during the shock, overall maintaining its performance during and beyond the shock. The market leader Vulcan Materials Co. shows a resilient response in the short term, only to fall behind peer response in the long run. Finally, James Hardie Industries PLC shows a highly resilient response in the long run, although in the immediate short term it performs worst in the peer group. (Bottom) As comparison, we show the absolute resilience measure of each company. MMM appears to be the most resilient compared with its own forecasted counterfactual, whereas the rest of its peers continue to underperform. This illustrates how the counterfactual choice can impact analysis. b Measurement of relative resilience using Eq. (1) using all companies categorized under the same 6-digit GICS code. Resilience is drastically lower for the top companies in the long run, whereas it is higher in the short run. More specifically, we see that MMM and James Hardie Industries PLC show an almost non-negative resilience measure. However, MMM remains scarred in the long run, which is not the case in a (top)

Choice of time interval \(({t}_{\mathrm{pre}},{t}_{\mathrm{post}})\)

We see two potential ways of determining the interval of measurement. Many researchers apply a rule-based choice of time window, such as the time to full recovery. However, its implementation poses several problems that are typical of time series. For example, fast-growing organizations tend to grow fast in the post-shock period as well, which shortens the time to full recovery. Here fast growth is a firm-specific parameter that is not part of \(\alpha\). Another issue arises from seasonality effects, as the seasonal fluctuations make it difficult to pinpoint the start and the end of the shock period. When we revisit Fig. 1, we see that MMM’s revenue stream demonstrates a positive trend and strong seasonality, with maxima frequently coinciding with summers and minima with winters. First, the onset and the end of the shock will depend on which quarter is being considered, with Q2s showing quicker recovery than Q4s. Meanwhile, the pre-shock trend and the post-shock trend do not seem to differ significantly; determining the shock duration through the moment of full recovery (without considering the trend) may lead to an overestimation of resilience by 1 year. This has also been noted by Tang (2019). Finally, rules that assume the shock period to be over at full recovery directly limit the assessment of the post-shock magnitude of performance recovery, as it may be realized long after the recovery period. Using the time to full recovery by definition misses out on the anti-fragile outcome of MMM, which is observed past 2014. As such, the ease of implementing “time to full recovery” windows comes with crude and potentially misleading approximations to an organization’s resilience.

An alternative to this rule-based choice is the consideration of time windows that stretch both into the past and the future of the shock period. This ensures a better estimation of the magnitude of recovery in the post-shock period. Moreover, one could consider short-term, mid-term, and long-term resilience outcomes that could offer a more nuanced understanding of an organization’s resilience. In Fig. 2a (top), using Eq. (1), we see that the market leader’s resilient response in the short- to mid-term did not last forever. JHI’s post-shock performance, meanwhile, indicates anti-fragility, with strong post-shock revenue growth. On the other hand, the choice of these time windows has to avoid including other shocks in the past and the future.

The resilience outcome should be assessed on a single shock basis to avoid construct validity problems. The time window should also be sufficiently localized to capture the resilience response. Nokia’s decline in performance due to competition from Blackberry, Microsoft, and Apple in 2004 lasted a long time (Doz and Wilson 2017). After almost a decade, we observe that Nokia is once again a rising player in the semiconductor and infrastructure markets. Nevertheless, these two performance outcomes are not associated with a response to the same unexpected shock. This tension between the length of the observation window and the need to exclude other adverse events is very similar to that arising in using cumulative abnormal returns on share prices when conducting event studies (e.g., Christie 1983). As with event studies, it may be useful to report resilience measures for different time windows to assess the robustness of conclusions.
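Reporting the measure for several windows can be sketched as follows. The function names, the horizons and the data are hypothetical; the counterfactual is assumed to be given, and the integral in Eq. (1) is approximated with the trapezoidal rule over unit-spaced periods.

```python
def cumulative_gap(actual, counterfactual):
    """Trapezoidal area between the counterfactual and the realized series."""
    gaps = [c - a for a, c in zip(actual, counterfactual)]
    return sum((gaps[i] + gaps[i + 1]) / 2.0 for i in range(len(gaps) - 1))


def resilience_by_window(actual, counterfactual, shock_index, horizons):
    """Return {horizon: R}, integrating from shock onset to onset + horizon."""
    results = {}
    for h in horizons:
        end = shock_index + h
        if end >= len(actual):
            raise ValueError("horizon %d exceeds the observed series" % h)
        results[h] = -cumulative_gap(actual[shock_index:end + 1],
                                     counterfactual[shock_index:end + 1])
    return results


# Hypothetical trajectory: a sharp dip followed by growth past the counterfactual.
counterfactual = [100] * 9
actual = [100, 70, 80, 100, 110, 120, 120, 120, 120]
print(resilience_by_window(actual, counterfactual, shock_index=0,
                           horizons=[2, 4, 8]))
# {2: -40.0, 4: -45.0, 8: 30.0}
```

In this example the sign of the measure flips between the short and the long window, so a conclusion drawn from a single window would not be robust.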

While the time window to study organizational resilience to any particular shock should not accidentally include other shocks, somewhat paradoxically it is useful to study multiple shocks to estimate the organizational parameters captured by \(\alpha\) in (1) in an unbiased manner—to say something confidently about what the mechanisms and antecedents to an organization’s resilience are. This is because an empirical study that considers a single shock will ultimately be equivalent to conducting a cross-sectional study—we cannot be sure if the observed resilience is due to observed organizational features or unobserved heterogeneity. In contrast, observing resilience to multiple shocks is equivalent to estimating a fixed effect in panel data, which allows for control of all stable unobserved heterogeneity at the panel level. Table 1 shows that Dutta (2017) and Rao and Greve (2018) are rare instances of works comparing the resilience of organizations across multiple shocks. It is important to note that in such a case the interpretation of an organization’s resilience will differ depending on the nature of the multiple shocks: The literature differentiates between general and specific resilience (Nykvist and Von Heland 2014). When recurring shocks are of the same nature, we measure specific resilience. On the other hand, currently, many scholars question whether resilience to the 2008 Financial Crisis is predictive of resilience to the 2020 Coronavirus healthcare crisis. Such studies may inform us regarding the general resilience of an organization. Finally, we also highlight the fact that studies identifying the antecedents of organizational resilience (\(\alpha\)) through such a longitudinal method have to also assume stability in the antecedents over the time frame considered to make consistent inferences.

Measurement and research design

So far we have set forth a rather technical discussion of how to measure organizational resilience, which we summarized in Table 2. In addition to this, we would like to raise several caveats regarding its application in the context of larger research design and address fundamental blocks of organizational resilience research, namely, the organization, the shock, and the performance. The organization-shock pair constitutes the main unit of analysis as the performance outcome can only be generated by such a pair. Accordingly, their properties require close scrutiny in the larger research design context.

(1) Consistent unit of analysis. The initial response to the 2020 Coronavirus pandemic included some companies (such as General Motors, L’Oreal, and Dyson) taking on pandemic-specific business activities thanks to their economies of scope. More drastically, some companies engaged in divestitures (e.g., GE divesting GECAS and Dell divesting VMware) and some others filed for bankruptcy (e.g., Hertz). Beyond these observations, Lin et al. (2006) find that 25 out of 80 organizations observed in crisis have changed their organizational design. These responses to crises direct our attention to the consistency of the unit of analysis. Researchers will need to specify the identity of the organizational entity whose resilience they are interested in; the identity of this entity has to be stable and its performance should remain measurable and consistent before and after the shock.

(2) Appropriate performance metric. Markman and Venzin (2014) document that six major banks performed the best along seven different performance metrics during the 2008 Financial Crisis. This observation demonstrates that an organization can be assessed along many performance metrics and that the choice will determine the interpretation accordingly. Once again, the research context will be the ultimate guide in choosing the performance metric of interest (e.g., see Modica and Reggiani’s (2015) review documenting various metrics used in regional economics). In the organizational context, we would like to point out that organizations may differ in their purposes and their performance metric of concern may vary accordingly. Performance feedback and aspiration level theory (Greve 1998) indicates that organizations will respond to a downturn in some performance metrics more than in others. Such context-specific central performance metrics may be useful leads to follow in choosing the appropriate one.

(3) Shock duration can range from short-term perturbations to long-lasting environmental shifts. We argue that for short-term perturbations, where the organizational environment mostly returns to its original state, forecasting methods facilitating the measurement of absolute resilience will be suitable. On the other hand, for long-lasting systemic shifts, measurement of relative resilience may be preferable, as other methods of building counterfactuals may become questionable in their long-term accuracy. The nature of the organization-shock pair will determine the applicability of these methods. For instance, during the 2008 Financial Crisis, many companies were hit by the lack of financial resources and lower demand due to the recession. On the other hand, banks faced a series of regulation changes corresponding to long-term environmental shifts. Such shock-related factors play an important role in choosing the appropriate method, as illustrated in the last column of Table 2.

We encourage future empirical research to elaborate carefully on these research design elements as well as discuss other properties of these elements that may be impactful in measuring organizational resilience, eventually contributing to and extending the earlier discussions by Carpenter et al. (2001) and Powley et al. (2020).

Conclusion

Several conceptualizations of resilience in terms of performance outcomes following an unexpected adversity are available in the literature (e.g., robustness, anti-fragility, brittleness) but, as we have shown, they involve implicit assumptions about time windows and counterfactuals. Counterfactuals can be built through forecasting, leading to absolute measures, or through peer-based estimations, leading to relative measures. Meanwhile, time windows can be determined by rule-based methods or can be varied for richer interpretations. These choices have a significant effect on what can be measured and how the measurements should be interpreted. The appropriate choices will, however, depend on their suitability for the research question. A field such as strategic management may value competitive dynamics more and accordingly emphasize relative resilience, whereas organization design may rather be interested in the relationship between certain organizational design choices and absolute resilience. Furthermore, we pointed out three important caveats regarding the match between the measurement and the research design: the consistency of the unit of analysis, the relevance of the performance metric, and the consideration of the shock duration (among other properties) are important research design related factors to consider in measurement and should, at a minimum, be discussed transparently. The minimal set of variables derived from Eq. (1) may open up a fruitful research direction aimed at capturing the configurational nature of organizational resilience, more specifically the correlation structure among all four variables of the minimal set. We are optimistic about future theoretical developments in this direction, which may complement research on the mechanisms that produce resilience as an outcome.