Test power properties of within-firm estimators of ownership and board-related explanatory variables with low time variation

Corporate governance research is often limited in its ability to employ within-firm estimators, which address time-invariant endogeneity, when the variables of interest exhibit low time variation (for example, ownership and board independence). The problem is further exacerbated if data for multiple points in time needs to be hand-collected. We offer simulation-based methodological guidance to improve the statistical power of within-firm estimators in the presence of low time variation. We illustrate the usefulness of our simulation results by replicating two influential studies on ownership and board independence and extending them with a within-firm estimator. Based on widely used databases as well as a novel granular database, we document the different degrees and nature of time variation of ownership and board independence across jurisdictions and subgroups by listed status, family control and complexity of ownership structure. Researchers can use our findings to optimize the hand-collection and pre-processing of governance data and thereby increase statistical power and/or to distinguish whether lack of significance is due to low time variation as opposed to absence of a true relationship between their governance variable of interest and the respective outcome.


Introduction
In studies explaining firm outcomes with governance variables, researchers commonly assume that these variables exhibit very little time variation. However, unless there is time variation in the data, there are limited ways to strengthen a claim of a causal effect, for example by using a within-firm estimator. We report power properties of the within-firm estimator for different degrees of time variation, length of time series and frequency of sampling of ownership and board independence. Our results help researchers determine whether they have sufficient power in a given empirical setting and how to increase it.
Typical research questions in finance and accounting consider how the controlling ownership stakes of different types of investors affect firm decisions (Banerjee and Homroy 2018; Larrain and Francisco 2013), how outcomes like cost of capital or performance depend on different degrees of conflict of interest among controlling stakeholders (Bertrand et al. 2002; Lin et al. 2011), or how board independence relates to performance or CEO turnover (Coles et al. 2008; Graham et al. 2020). Establishing a causal relationship between governance and these outcomes is very difficult because, among many other reasons, accounting for unquantifiable confounding factors is rarely possible. For example, a significant relationship between a governance variable and performance may not truly exist, but may appear as a result of unobservable firm characteristics (Lins 2003; Bennedsen and Nielsen 2010). In a survey of the board literature, Hermalin and Weisbach (2003) argue that most board research as of that time had failed to address the endogeneity in board composition. More recent survey and editorial articles continue to highlight the problem of reliably addressing endogeneity in governance research (Adams 2017; Edmans and Holderness 2017). The state-of-the-art best practice employs empirical designs based on natural experiments. However, such experiments are rare, or, if present, the setting is generally imperfect and requires additional strengthening tests that are often based on time variation (Atanasov and Black 2016, 2020).
From an empirical design perspective, the problem of insufficient time variation in ownership is exacerbated by the limitations of vendor-constructed databases that do not provide all ownership links and require extensive pre-processing and verification (Holderness 2009;Dlugosz et al. 2006), and/or the necessity to hand collect data. Board related variables bring fewer data processing challenges, but they are only easily available for listed firms and similarly may change slowly over time (Black et al. 2017). To facilitate research centered on governance variables with low time variation and/or limited availability, our solution consists of methodological guidance based on simulations and verified using real data from a variety of sources (including a rare granular ownership data source). Our simulations generate artificial data series that satisfy a number of constraints to imitate the underlying economic processes behind governance variables. The simulation results provide guidance on the amount of time variation required for sufficient statistical power of within-firm estimators, depending on a variety of institutional features. Our findings are useful to researchers in at least three ways. First, when a project requires the hand collecting of governance data with low variation, by following our simulation results a researcher can efficiently collect non-consecutive data points to ensure sufficient time variation. For example, if a researcher has hand collected two time observations of ownership (board independence) data four years apart for a sample of firms of interest and less than 60% (55%) of them change between the two periods, she could decide whether it is worthwhile collecting one additional time observation to increase the power properties of her tests. This additional time period would reduce the required proportion of changing firms to around 42% (45%) to detect statistical significance if it was present. 
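The check described in the example above can be sketched in a few lines of Python. This is only an illustration: the function names and the sample values are hypothetical, and the only figures taken from the text are the benchmark proportions (60% and 55% for two observations four years apart, roughly 42% and 45% with one additional observation). How the share of changing firms should be measured once a third observation is available is not spelled out here, so the sketch simply compares the same share with both benchmarks.

```python
# Benchmarks quoted in the text: minimum share of firms whose governance
# variable must change for adequate power when sampling four years apart.
BENCHMARKS = {
    ("ownership", 2): 0.60,
    ("ownership", 3): 0.42,
    ("board_independence", 2): 0.55,
    ("board_independence", 3): 0.45,
}

def share_changing(first, last, tol=1e-9):
    """Share of firms whose value differs between two observations."""
    if len(first) != len(last) or not first:
        raise ValueError("need two equally long, non-empty samples")
    changed = sum(abs(a - b) > tol for a, b in zip(first, last))
    return changed / len(first)

def enough_variation(first, last, variable, n_obs):
    """Compare the observed share of changers with the quoted benchmark."""
    return share_changing(first, last) >= BENCHMARKS[(variable, n_obs)]

# Hypothetical ownership stakes for ten firms, observed four years apart.
own_2010 = [1.00, 0.51, 0.75, 0.30, 1.00, 0.66, 0.20, 1.00, 0.55, 0.90]
own_2014 = [1.00, 0.60, 0.75, 0.25, 0.90, 0.66, 0.20, 1.00, 0.51, 1.00]

share = share_changing(own_2010, own_2014)
print(f"share of firms changing: {share:.2f}")
print("meets two-observation benchmark:",
      enough_variation(own_2010, own_2014, "ownership", 2))
print("meets three-observation benchmark:",
      enough_variation(own_2010, own_2014, "ownership", 3))
```

In this made-up sample half of the firms change, which falls short of the two-observation benchmark but exceeds the three-observation one, the situation in which collecting one more time observation may be worthwhile.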
Second, suppose a researcher uses one of the widely available databases, for example board data from BoardEx or Execucomp, or any of the corporate ownership products of Bureau van Dijk (now part of Moody's), Thomson Reuters (now Refinitiv) or Standard & Poor's. She can use our results to check whether the amount of time variation in her sample of interest is sufficient. This is important because, if she finds no significant relationship, knowing whether this may be due to the lack of time variation helps her determine the choice of additional tests to employ in the project. Alternatively, if time variation is sufficient, it helps strengthen her conclusion that a relationship is unlikely to exist. Third, if a researcher decides not to use within-firm estimators and substantiates this decision based on the lack of time variation in the data, our benchmark results help add a degree of formality to this reasoning.
To show the usefulness of our simulation findings and as a proof of concept, we replicate regressions from two existing studies: Lin et al. (2013) and Coles et al. (2008). Lin et al. (2013) examine how the divergence between cash flow and control rights of the controlling owner affects the proportion of public debt a firm holds. Coles et al. (2008) analyze the relationship between board independence, complexity of expertise required by a firm's board and Tobin's Q. Neither study originally uses a within-firm estimator for its baseline results, a choice motivated by a lack of time variation. We show modifications to their design, guided by our simulation results, where time variation is sufficient, and the more reliable within-firm estimator is able to detect statistically significant relationships as a result of the improved power. For example, in the case of Lin et al. (2013), using firms from the same jurisdiction as in the original paper with ten consecutive time observations (whereas they have on average four per firm) allows the within-firm estimator to detect statistical significance. Importantly, if having ten consecutive time observations is prohibitive, we show that sampling five years apart and having four time observations would be sufficient to find statistically significant results. The conclusions change when time variation is especially low (for example, for US firms). In this case neither more time observations per firm nor sampling with gaps allows the detection of significance even though it exists.
Lastly, we provide evidence of the different patterns of time variation in governance variables. We employ commonly used data sources (BoardEx, Capital IQ, Thomson Reuters, CSMAR) plus a unique database (ICO by Statistics Canada) that provides a long time series, complete corporate structures and exact direct ownership stakes for all global multinationals that have any subsidiary operating in Canada. We document that the degree of time variation in ownership and board independence differs substantially depending on jurisdiction of control, listed status, family ownership and ownership structure complexity. Therefore, the stability of governance variables cannot always be taken for granted and should be tested for the particular dataset in question, as we do here. For example, the time variation in board independence is greater for non-listed than listed firms and for Anglo-Saxon 1 than US firms. Time variation in ownership is indeed modest for multinational firms controlled from Anglo-Saxon jurisdictions, and it is likely the case for Continental Europe; however, we find a considerable time variation in the population of Canadian firms. We further find that there is greater time variation in the ownership of family versus non-family-controlled firms belonging to pyramid structures, regardless of their control jurisdiction. In addition, the prevalence of complex pyramidal corporate structures is greater among family firms. Although the distinct features of family owners have been studied extensively (e.g., Almeida and Wolfenzon 2006;Anderson and Reeb 2003;Belenzon et al. 2019), the fact that they tend to adjust the ownership structures of their firms more frequently is a new finding. Across size groupings, unsurprisingly, firms belonging to larger structures change more frequently than in smaller structures. This greater variability for family firms is preserved when grouping by (i.e., controlling for) size and jurisdiction.
We further document the following patterns of time variation: (1) the averages of ownership and board independence change little over time; (2) the average proportion of firms that change over two consecutive years is relatively small for ownership and larger for board independence of listed firms; (3) extended periods of stability are common for ownership and less so for board independence of listed firms, and it takes several years for a considerable proportion of firms to leave a period of stability 2; (4) the standard deviation of ownership over time is greater for family firms and pyramids in univariate and multivariate analyses; and (5) ANOVA and persistence regressions confirm the differing patterns of stability by family control, jurisdiction and ownership structure complexity.
The importance of considering corporate ownership and board independence in empirical research cannot be overstated. Corporate ownership is central to our understanding of firm decisions like growth (Belenzon et al. 2019), innovation (Aghion et al. 2013), mergers and acquisitions (Basu et al. 2009; Boateng et al. 2017), internationalization (Singla et al. 2017) and financing (Lin et al. 2011, 2013). Ownership and board independence shape incentives, preferences, and short-term versus long-term orientation of managers and boards in listed firms (Siegel and Choudhury 2012; Banerjee and Homroy 2018; Ellis et al. 2020) and of owner-managers in family firms (Deephouse and Jaskiewicz 2013); tip the balance of power among managers, and majority and minority owners (Aguilera and Crespi-Cladera 2016; Guo and Masulis 2015); allow for unique capabilities that different owners and directors bring to their firms (Rabbiosi et al. 2019; Edmans and Holderness 2017; Kim and Starks 2015), etc. The importance of quantifying governance variables more precisely (including ownership and board independence) is underscored by the ever-rising use of Environmental, Social and Governance (ESG) considerations in investment decision-making. They are already formally regulated in the UK and across Europe and gradually gaining importance in the US (Eccles and Klimenko 2019; Mooney 2018). 3 In turn, the presence of sizable controlling stakes versus more diversified holdings of different types of institutional investors has implications for monitoring incentives, shareholder activism and adjustments in executive pay structures, which ultimately affect firm decision making (Azar et al. 2018; Flammer and Bansal 2017; Lardon et al. 2019).
The finance and accounting literature abounds with studies that exploit key strategic changes to answer questions about firm outcomes. These settings often involve ownership structure adjustments (M&As, reorganizations, vertical and horizontal integration, new product and market entry) and board composition changes (independence, minority representation, etc.). Our results open the way for time-variation-based methods of analysis where ownership or board independence is a key variable of interest. We help researchers understand the reason behind a finding of no significance of a within-firm estimator, highlight settings where time variation is likely higher and show researchers how to increase the power of within-firm estimators without necessarily having to collect a large number of time observations. This way we answer the call in several recent editorials and review articles highlighting the importance of addressing endogeneity if the quality of business research is to be improved (Semadeni et al. 2014;Reeb et al. 2012;Abdallah et al. 2015;Holmes et al. 2018).
The rest of the study is organized as follows: Sect. 2 outlines the background and justifies our simulation design; Sect. 3 describes our data sources and replication regressions; Sect. 4 consists of the detailed analysis of time variation in ownership and board independence variables; and Sect. 5 concludes.

Background and simulation design
Corporate ownership and board composition governance variables exhibit little time variation, which limits their potential for clean causal inference. For example, the earliest ownership studies constructed hand-collected samples which were typically static because data was not available in electronic form even for listed firms (La Porta et al. 1999; Claessens et al. 2000; Faccio and Lang 2002, among many). 4 More recently, electronic sources improved coverage and technological advancements have allowed construction of time series datasets, but they have rarely been exploited to address endogeneity. For example, Villalonga and Amit (2009, p. 3083) explain: "Because these variables exhibit very little time-series variation, we abstain from using firm fixed effects," while Banalieva and Eddleston (2011, p. 1065) acknowledge that "the standard fixed effects estimation is infeasible". More recently, Pronobis and Schaeuble (2020, footnote 21) write "A closer look to the dataset reveals that our foreign ownership variables are relatively sticky. That makes it difficult to get a sufficient number of observations for which we can identify changes in foreign ownership." On this basis, they acknowledge the subsequent lack of power: "Therefore, the results provided by estimating our change model have only limited explanatory power". Similar examples abound in the literature on board independence. Coles et al. (2008) argue that including firm fixed effects is not appropriate in their setting because most of the variation in board size arises in the cross-section instead of in the time series. Choi et al. (2007) report that their results on the value of outside directors disappear once they introduce firm fixed effects. Black and Kim (2012) highlight the advantages of their Korean data set, as this provides enough time variation in the variable "outside directors" to make within-estimation feasible. Wintoki et al. (2012, p. 591) acknowledge that "[b]oard structure is highly persistent [, which] can reduce the power of any panel data estimator". More recently, Frye et al. (2021) refer to the stickiness of board structure and in all their specifications they only use industry fixed effects, rather than a within-firm estimator.
Faced with this lack of time variation, a few of the papers employ a random effects estimation which does not rely exclusively on the time dimension of the data. However, it is not appropriate to consider this estimator as an alternative to fixed effects because the underlying assumption of the random effects estimator is exogeneity. 5 Empirically, Black et al. (2014) perform extensive testing and reject the equivalence of the fixed effects and random effects estimators in the context of multi-country governance studies. The fixed effects and first difference estimators are the ones alleviating time invariant endogeneity. 6 As Reeb et al. (2012, p. 214) highlight: "Evidence of causal relation with unit level fixed effects can be quite compelling […]." Bliese et al. (2019, p. 9) go as far as to refer to the within-firm estimator as the "gold standard to which results from other analytic options are compared".
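Why low time variation translates into low power can be seen from the textbook variance of the within-firm estimator. The sketch below assumes a single regressor and homoskedastic, serially uncorrelated errors with variance $\sigma^2$; the notation ($Y_{it}$, $X_{it}$, $a_i$, $\varepsilon_{it}$) is generic and chosen for illustration only.

$$Y_{it} = a_i + \beta X_{it} + \varepsilon_{it}$$

$$Y_{it} - \bar{Y}_i = \beta \,(X_{it} - \bar{X}_i) + (\varepsilon_{it} - \bar{\varepsilon}_i)$$

$$\hat{\beta}_{FE} = \frac{\sum_i \sum_t (X_{it} - \bar{X}_i)(Y_{it} - \bar{Y}_i)}{\sum_i \sum_t (X_{it} - \bar{X}_i)^2}, \qquad \mathrm{Var}\big(\hat{\beta}_{FE} \mid X\big) = \frac{\sigma^2}{\sum_i \sum_t (X_{it} - \bar{X}_i)^2}$$

Subtracting firm means removes $a_i$, but it also removes all cross-sectional variation in $X$. A firm whose $X_{it}$ never changes contributes nothing to the denominator, so under these assumptions the fewer firms change, and the less often, the larger the variance of $\hat{\beta}_{FE}$ and the lower the power of the associated significance test.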
There is growing scientific consensus that the most reliable causal inference methodologies are shock-based designs often referred to as quasi-natural experiments (Atanasov and Black 2016, 2020). These approaches often require at least two time points, before and after the shock, for a difference-in-difference (DiD) design (for example, Aguilera et al. 2017; Liu et al. 2015). Importantly, even when an instrument and/or a shock is available, the design may be imperfect and require additional strengthening measures to rule out alternative explanations (Atanasov and Black 2020). For example, one could include additional covariates and/or firm fixed effects in a DiD design to get closer to satisfying the parallel trends assumption. Furthermore, routinely required tests to support causal inference findings are placebo tests to show that a significant finding around the shock disappears around a different date. However, the absence of significance of the placebo test could be a mechanical consequence of the lack of time variation in the key explanatory variable, in which case it cannot be used as supporting evidence for the DiD findings.
Given the small amount of time variation in ownership and board independence and/or limited data availability, some researchers have opted to use data with gaps. For example, Franks et al. (2012) focus on family control at two points in time ten years apart, Basu et al. (2017) study blocks of any kind of owners five years apart, while Lin et al. (2011, 2012) construct ultimate ownership for four points in time two and three years apart. Wintoki et al. (2012) sample board independence every two years, while Boone et al. (2007) and Linck et al. (2008) sample every three years. There is no formal guidance in the literature on the gap length that ensures sufficient time variation in the variable of interest. Even in cases where electronic databases make uninterrupted time series readily available, extensive pre-processing and cleaning may preclude the researcher from examining every single consecutive observation. In addition, in the case of ownership, these series may not reflect the complete set of ownership links, and additional data collection is likely to be necessary. 7 The first article to employ the fixed effects estimator in the study of managerial ownership and firm performance, using consecutive observations from a widely available database, found no significance (Himmelberg et al. 1999). However, one more recent extension (Kim and Lu 2011) and a comprehensive in-depth analysis (Fabisik et al. 2021), using a longer time period with greater time variation as a result of stock-based executive compensation, identified a significant relationship. Similarly, Graham et al. (2020) document that although there is large persistence in board independence, over longer horizons there are significant within-firm changes in board structure. This invites the question of the amount of time variation sufficient to identify a relationship, if it exists, as well as the number of observations necessary to detect it when data is not readily available, both of which we address in this work.
5 The key assumption for the validity of the random effects estimator is $\mathrm{cov}(x_{it}, a_i) = 0$, i.e., $x$ should not be correlated with the unobservable firm-specific heterogeneity $a_i$ (Wooldridge 2010). We thank an anonymous reviewer for suggesting the correlated random effects (CRE) estimator to us. If we were to treat ownership or board independence as one of the time-varying explanatory variables in the CRE setting, Wooldridge (2019) shows that the coefficient vector $\hat{\beta}$ is identical to the within-firm estimator in the case of an unbalanced panel, while Mundlak (1978) originally showed this result for the case of a balanced panel. Therefore, the simulation results would be mathematically equivalent if we were to use CRE instead of the within-firm estimator.
6 There is one setting where within-firm estimators are inappropriate due to its institutional specificity: CEO ownership. As Adams (2017) explains, using a fixed effect estimator to test the effect of CEO ownership on firm value will result in estimating the effect of CEO turnover instead of managerial incentives, because big changes in CEO ownership within a firm arise primarily from changes in the CEO.
The time variation in most governance variables is shaped by the regulatory and institutional environment in which firms operate. In the case of ownership, the values naturally cluster at important threshold points: simple majority, supermajority and 100%. For board independence, different jurisdictions have gradually introduced legal minimums for listed firms. In addition, firm charters determine the frequency of replacing board members. There are also differences in terms of the range of values governance variables can take. For example, cash flow rights can be very close to 0, for firms held by multiple layers of subsidiaries, all the way to 1, for wholly owned firms. Board independence of listed firms is most likely to vary between 0.5 and 0.95. All these differences are paramount in both parts of our empirical design. In this section, we simulate separately the time variation in ownership and board independence as close as possible to their real-life characteristics, whereas in section four, we capture these differences by performing statistical analysis of real data from multiple angles (averages over time, cross-sectional and time-series standard deviation, year-to-year variation, persistence, etc.).
We begin by modelling the time series evolution of ownership that reflects its specific nature: being a particularly stable ("sticky") variable, changing in a step-wise fashion and clustering in ranges with economic importance (for example, slightly above 50%). We employ transition probability matrices for this purpose. 8 A transition matrix $T$ contains the probabilities $p_{rc}$ of moving from state $r$ (along the rows) to state $c$ (along the columns). The states are ownership ranges that reflect important cutoffs used for significant decisions in corporate charters, for example, absolute majority (50% + 1 vote) or supermajority at two thirds and three quarters. We pick the minimum number of states that capture the important cutoffs and the typical patterns of ownership stakes as summarized in Faccio and Lang (2002), Claessens et al. (2000) and Holderness (2009): own < 0.05; 0.05 ≤ own < 0.5; 0.5 ≤ own < 0.51; 0.51 ≤ own < 0.6; 0.6 ≤ own < 0.75; 0.75 ≤ own < 0.9; 0.9 ≤ own < 1; own = 1. For board independence, common institutional cut-offs are less likely, because mandated minimums are jurisdiction-specific, not always codified, and often under a comply-or-explain regime. Thus, our simulations use data-driven quartiles (as in Graham et al. 2020), where the transition probabilities and bin cutoffs are determined by real-life global data in the BoardEx and CSMAR databases. 9
We impose a structure on the ownership transition matrix generated in each iteration of the simulations satisfying a minimum number of rules consistent with stylized facts we already know from existing work and the nature of ownership data: (1) rows sum up to 1 (general property of transition probabilities); (2) the diagonal elements are very high (reflecting stickiness); (3) probabilities along the two off-diagonals (superdiagonal and subdiagonal) are relatively higher (reflecting the greater chance of moving to an adjacent state; for example, the probability of moving from bin 3 (0.5 ≤ own < 0.51) to bins 2 or 4 is higher than to other bins); and (4) the probabilities along the last column are also relatively higher (reflecting a tendency towards wholly owned subsidiaries (Nicodano and Regis 2019)). 10 This structure would hold for any one of the three controlling ownership measures used in the literature: ultimate cash flow rights, control rights or the ratio between the two (often referred to as wedge; for example in Faccio et al. 2011 and Lin et al. 2011). It is likely that the ultimate cash flow rights variable exhibits greater time variation than control rights, because any rearrangement of the corporate structure will affect it, while control rights are based on a threshold of control and will only change if that threshold is reached. In this case, the wedge variable will co-vary with ultimate cash flow rights. Of course, there may be cases where both ultimate cash flow rights and control rights change to the same degree, whereby the wedge may appear more stable. To accommodate research applications with any of these ownership measures and a variety of settings from different jurisdictions where ownership may vary to a different degree over time, we define two degrees of stringency of the regularity constraints (2)-(4) above: one for a case of relatively higher time variation and another for lower time variation.
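To make rules (1)-(4) concrete, the following Python sketch draws one random transition matrix over the eight ownership states. It is not the algorithm from the online appendix: the function name, the weighting scheme and the numerical values (`diag`, `adjacent_weight`, `last_col_weight`, and the two `diag` levels used for the high and low variation regimes) are assumptions chosen only to illustrate how such constraints can be imposed.

```python
import random

def random_sticky_matrix(n_states=8, diag=0.90, adjacent_weight=3.0,
                         last_col_weight=3.0, rng=None):
    """Draw one transition matrix obeying rules (1)-(4).

    diag: probability of staying in the same state (rule 2).
    adjacent_weight / last_col_weight: how much more likely a move to a
    neighbouring state (rule 3) or to the last state, own = 1 (rule 4),
    is relative to any other move. All values are illustrative.
    """
    rng = rng or random.Random()
    matrix = []
    for r in range(n_states):
        weights = []
        for c in range(n_states):
            if c == r:
                weights.append(0.0)            # diagonal is set separately
                continue
            w = rng.uniform(0.5, 1.0)          # random baseline weight
            if abs(c - r) == 1:
                w *= adjacent_weight           # rule 3: adjacent states
            if c == n_states - 1:
                w *= last_col_weight           # rule 4: wholly owned state
            weights.append(w)
        total = sum(weights)
        # Spread the probability of leaving the state over the other states.
        row = [(1.0 - diag) * w / total for w in weights]
        row[r] = diag                          # rule 2: sticky diagonal
        matrix.append(row)                     # rule 1: each row sums to 1
    return matrix

# Two hypothetical stringency regimes for the constraints.
high_variation = random_sticky_matrix(diag=0.85, rng=random.Random(1))
low_variation = random_sticky_matrix(diag=0.97, rng=random.Random(1))
print([round(p, 3) for p in low_variation[2]])  # row for bin 3
```

Because the baseline weights lie between 0.5 and 1 and the multipliers are 3, a move to an adjacent state or to the last state is always more likely than a move to any other non-adjacent state, while the diagonal dominates each row.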
For board independence we define the data-driven quartile bins as follows: board indep < 0.5; 0.5 ≤ board indep < 0.65; 0.65 ≤ board indep < 0.8; and 0.8 ≤ board indep < 0.95. These cutoffs are based on the actual distribution of board independence in the BoardEx database for global listed and non-listed firms and in the CSMAR database for Chinese listed firms. Transition probabilities and bin widths by jurisdiction, listed status and state ownership are given in the online appendix. In the context of board independence, time variation will depend on the regulatory and institutional setting in different countries with respect to listed, private and state-owned firms. For example, the China Securities Regulatory Commission (CSRC) mandates that as of June 30, 2003, a minimum of one third of all board directors of a listed firm should be independent. By contrast, among Anglo-Saxon countries and most of Europe, listing rules require that more than half of directors be independent (Papadopoulos 2019). To reflect this variety in the legal range of levels board independence can assume and thereby its scope for change, we again adopt a low and high time variation regime for the simulations of board independence.
The simulation analysis follows the standard power calculation framework outlined in Murphy et al. (2014). We generate artificial data to assess the power of hypothesis tests in a firm-fixed effects specification when using different time-series lengths and frequency of sampling, under the absence or presence of time-variant endogeneity. Our strategy is to generate pseudo-random samples by a process following a theoretical relationship of interest. Our focus is on determining whether tests of statistical significance of the coefficient of interest are able to detect this relationship (by rejecting the null hypothesis). Intuitively, since our artificial data was generated under the alternative hypothesis, we should reject the null hypothesis most of the time. Conventionally, a test with a power of 80% or above is considered adequate (see Murphy et al. 2014).
9 Refer to the online appendix for transition probabilities and bin widths of board independence for different jurisdictions and by listed and state ownership status. Ownership simulation results based on data-driven quartile bins are also available in the online appendix.
10 Full details and the algorithm used in the execution of the simulations are available in the accompanying online appendix.
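The power calculation described above can be sketched as a small Monte Carlo exercise in Python. The sketch is deliberately simplified and should not be read as the simulation design of this study: the regressor follows a hypothetical process in which it jumps to a new level with probability `p_change` each year (rather than a transition matrix), there is no control variable and no time-varying endogeneity, standard errors are conventional rather than clustered, and the sample size and number of repetitions are smaller than the 500 firms and 1000 iterations used in the paper. All function and parameter names are assumptions.

```python
import math
import random

def simulate_panel(n_firms, n_years, beta, p_change, rng):
    """Firm effect plus a sticky regressor that jumps with prob. p_change."""
    panel = []
    for _ in range(n_firms):
        alpha = rng.gauss(0.0, 1.0)            # time-invariant firm effect
        x = rng.random()
        rows = []
        for _ in range(n_years):
            if rng.random() < p_change:
                x = rng.random()               # step change to a new level
            y = alpha + beta * x + rng.gauss(0.0, 1.0)
            rows.append((x, y))
        panel.append(rows)
    return panel

def within_t_stat(panel):
    """t-statistic of the within-firm (fixed effects) slope estimate."""
    sxx = sxy = 0.0
    demeaned = []
    for rows in panel:
        mx = sum(x for x, _ in rows) / len(rows)
        my = sum(y for _, y in rows) / len(rows)
        for x, y in rows:
            dx, dy = x - mx, y - my            # subtract firm means
            demeaned.append((dx, dy))
            sxx += dx * dx
            sxy += dx * dy
    if sxx == 0.0:
        return 0.0                             # no within variation at all
    b = sxy / sxx
    rss = sum((dy - b * dx) ** 2 for dx, dy in demeaned)
    dof = len(demeaned) - len(panel) - 1       # obs - firm means - slope
    return b / math.sqrt(rss / dof / sxx)

def power(beta, p_change, n_firms=100, n_years=10, reps=200, seed=1):
    """Share of simulated samples in which H0: beta = 0 is rejected at 5%."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        panel = simulate_panel(n_firms, n_years, beta, p_change, rng)
        if abs(within_t_stat(panel)) > 1.96:   # normal approximation
            rejections += 1
    return rejections / reps

print("power, frequent changes:", power(0.4, 0.50, reps=100))
print("power, rare changes:    ", power(0.4, 0.02, reps=100))
```

Since the data are generated with a non-zero coefficient, every rejection is a correct one, and the rejection share can be compared with the conventional 80% threshold.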
More specifically, the true model underlying our generated data takes the form:

$$Y_{it} = \alpha_i + \beta X_{it} + \gamma W_{it} + \varepsilon_{it} \qquad (1)$$

where $\beta$ is the primary coefficient to be estimated, $X_{it}$ is artificially generated ownership/board independence data based on a random transition probability matrix $T$ following the minimal structure described above, and $W_{it}$ is artificially generated data for a generic control variable (like size, leverage, tangibility, etc.) that is correlated directly with $X$ and with $Y$ via its error term $\varepsilon$, where $W_{it} = \rho X_{it} + \varepsilon_{it} + \nu_{it}$. We choose a relatively high level of 0.9 for the two endogeneity parameters to generate a conservative case of high time-varying endogeneity. 11 In the set of results under time-varying exogeneity, both parameters are set to 0.
Under time-varying endogeneity we also allow the explanatory variable to be related to $\varepsilon_{it}$ through the transition probabilities $p(X_{rcit} \mid \theta, f_t)$, following Bazzi et al. (2017). The parameter vector $\theta$ contains all static parameters that govern the transition probabilities, while $f_t$ captures the dynamic elements that depend on $\varepsilon_{it}$. Making $X$ and $W$ dependent on the disturbance term of the outcome and correlated with each other is realistic and creates correlation with the error, which violates the zero conditional mean assumption and leads to biased estimates. 12
11 The two endogeneity parameters and the implicit coefficient of 1 in front of $\varepsilon$ in the $W$ equation all contribute to endogeneity. It propagates via several channels: through the key variable of interest $X$ (ownership or board independence), as well as the control variable $W$ and the correlation between them. We have selected moderately high levels of endogeneity (close or equal to 1, around the upper range of values we explore for $\beta$, namely 0.4, 0.6 and 0.8) to show a situation where statistical power is substantially affected. The time-variant exogeneity case serves as a lower bound where power is not affected. We experimented with levels of these parameters in the range (0, 0.9), and the results, as expected, fall between the two cases we show here (exogeneity and moderately high endogeneity). For even more extreme levels of the endogeneity parameters that result in highly positive or highly negative bias, again as expected, power increases because we always reject the null as the estimated $\hat{\beta}$ is very far away from 0. In these cases, the degree of time variation plays no role in whether statistical significance is found when it is present.
12 We thank an anonymous reviewer for suggesting that we add an additional control variable that is also endogenous. This setup is realistic and introduces more channels for endogeneity to propagate. When endogeneity is introduced only through the transition probabilities of $X$, which is stable by construction, endogeneity has little scope to pass on. Another suggestion by the anonymous reviewer was to model the endogeneity in $X$ through measurement error instead of the transition probabilities. The results are consistent with the ones presented here. The online appendix provides details on these alternative approaches.

We simulate data under (1) using three different values for beta, = 0.4, 0.6, 0.8, for two transition matrices reflecting high and low time variation in X . We then estimate regression model (1) on a subsample of the generated data using different gaps and lengths of time. The null and alternative hypothesis are: Table 1 reports the proportion of times out of 1000 iterations that the null hypothesis is (correctly) rejected. If we find a rejection rate for hypothesis H0 of at least 80%, the amount of time variation is sufficient for statistical power purposes (shown in bold). In Panel A of Table 1 we simulate data that mimics a situation in which the researcher has available consecutive data points for 10, 15 and 20 years shown along the x-axis. The different values of the theoretical beta coefficient in (1), i.e., = 0.4, 0.6 and 0.8, are given along the y-axis. The reported results are based on a conservatively small sample of 500 firms, given the limitations of hand-collecting data. The first and third sections summarize the results where strict exogeneity holds; i.e., cov X it , it = 0 in (1), and the second and fourth sections summarize the results where ownership is endogenous; i.e., cov X it , it ≠ 0 in (1) and cov W it , it ≠ 0 . Results for the low (high) time variation case are at the left (right) of each table.
We find that for the simulations exploiting consecutive time observations (10, 15 and 20), whether based on relatively high or low time variation, if there is a true relationship between X and y, it should be uncovered by hypothesis tests under time-varying exogeneity of X (first and third sections of Panel A Table 1). However, under time-varying endogeneity (second and fourth sections of Panel A Table 1) a weak theoretical relationship (β = 0.4) cannot be detected for any time series length. In the case of β = 0.6 and low time variation, power is insufficient for ownership, but for board independence, 20 years of data overcomes this. 13 Since in many cases collecting ten consecutive years of data may not be practical, we repeat our simulations using different time-series lengths and frequency of sampling. In Panels B and C we consider data with gaps (every two to five years, along the y-axis) for two, three or four time observations (along the x-axis). For example, the coordinate (two time obs, every three years) is consistent with collecting data in 2010 and 2013. The different values for the beta coefficient in Eq. (1) are represented vertically. The first section of Panels B and C shows results where strict exogeneity holds, whereas the second section shows results for the presence of time-varying endogeneity.
In Panel B (ownership), for the high time variation case (right-hand side), we find that the 80% power threshold is reached for a theoretical β = 0.8 and any number of time points and gaps in sampling, while for β = 0.4, two or three time observations less than four years apart are no longer sufficient. The results are somewhat stronger in the right-hand side of Panel C (board independence), with additional cases of sufficient power for two time observations every three years under β = 0.6, as well as three time observations every three years under β = 0.4. For low time variation (left-hand side), in Panel B (ownership), power is only sufficient for β = 0.8, with four time observations sampled three years apart or more. In the left-hand side of Panel C (board independence), power is a lot higher, being sufficient in all cases under β = 0.8, for three and four time observations under β = 0.6, and even under β = 0.4 for four time observations every three years or more. The results in Panel B show that collecting ownership data with gaps does not generally lead to good power properties when time variation is low, except when the theoretical relationship is strong and there are at least four time observations. By contrast, in the case of board independence (Panel C), even for weaker theoretical strength under relatively low time variation, sampling every three years or more and having at least four time observations provides sufficient power.

13 We stress that our simulation results have no bearing on addressing endogeneity. Our decision to run the simulations under exogeneity and in the presence of endogeneity is only intended to diagnose the ability of regression analysis to detect a relationship, if it exists, under these two sets of assumptions.
In the right-hand sections of Panels B and C we present results under time-varying endogeneity. We find that endogeneity leads to worse power properties than under exogeneity for both the low and high time variation cases. For ownership (Panel B), power is only sufficient under β = 0.8, high time variation and three or more time observations, while for board independence (Panel C), we see cases of sufficient power for low time variation and β = 0.6. We caution that the power measure is not as informative in the presence of endogeneity since the t-statistic at its basis has a biased numerator.
To sum up, if a relationship between X and y exists, consecutive data of ownership or board independence covering at least ten years should detect it when using a within-firm estimator under exogeneity. Under time-varying endogeneity, however, a weaker theoretical relationship cannot be detected even with 20 years of data and large time variation. Under the practical approach to collecting data with gaps in the presence of time-varying endogeneity, hypothesis tests are more likely to detect a relationship with ownership (board independence) if there are at least three time observations for a relatively strong (even for less strong) theoretical relationship and time variation is high (or even when time variation is low).
In light of our simulation results, we can revisit some existing work where time variation can be deduced. Donelli et al. (2013) explicitly report that, on average, 6-7% of firms change each year over a 20-year period in their Chilean data. This degree of time variation clearly corresponds to our low variation case (lower left corner of section two of Panel A Table 1). In their Table 8 they report various outcome regressions, whereby ownership changes are significant only when firm fixed effects are not included. There could be two possible explanations: (1) either time-invariant endogeneity leads to bias in the OLS estimator and erroneously shows significance, or (2) the relationship indeed exists but the low time variation in their data prevents them from detecting it with a within-firm estimator. Our simulation results show lack of power for the same 20-year period, with theoretical relationship strength below β = 0.8 and low time variation, as is the case in their data, and therefore provide support for explanation (2). Thus, the conclusion of Donelli et al. (2013) regarding a weak theoretical relationship between ownership and real outcomes is supported by our work.
In Panels A (ownership) and C (board independence) of Fig. 1 we summarize the simulation results under time-varying endogeneity and sampling data with gaps to produce a relationship between a measure of time variation (the proportion of firms changing between one time point and the next) and statistical power. These graphs show the predicted marginal probabilities of rejecting H0 based on a logit specification using the data from all simulations. 14 The logit model regresses power on frequency of sampling, the proportion of firms that change between the first and next period, true beta, number of time observations, and starting year of sampling. 15 We show statistical power (along the y-axis) as a function of the proportion of firms that change between two time periods (along the x-axis) for three different strengths of the theoretical relationship being tested, sampling gaps and number of time observations. Suppose a researcher has collected two time observations five years apart for a sample of firms, as in Basu et al. (2017). She can compute the proportion of firms that change between the two points in time and then compare it to the proportion required for statistical power of at least 80%: 63% (for β = 0.8) or 86% (for β = 0.6) (bottom row of Panel A of Fig. 1). If she observes lower proportions, she could decide to collect an extra time observation, which reduces the required proportions to 40% and 60% respectively.
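The check described for the researcher with two time observations can be sketched as follows. The thresholds are the ones quoted above for two observations five years apart; the function names, the tolerance used to decide whether a value has changed, and the example stakes are hypothetical.

```python
def proportion_changing(x_first, x_next, tol=1e-9):
    """Share of firms whose governance variable differs between two sampling dates.

    x_first and x_next map firm identifiers to the variable's value at the
    first and the next sampling date; only firms observed at both dates count.
    """
    common = x_first.keys() & x_next.keys()
    if not common:
        raise ValueError("no firm is observed at both dates")
    changed = sum(abs(x_first[f] - x_next[f]) > tol for f in common)
    return changed / len(common)


# Required proportions quoted in the text for two observations five years apart,
# keyed by the assumed strength of the theoretical relationship (beta).
REQUIRED_TWO_OBS_FIVE_YEARS = {0.8: 0.63, 0.6: 0.86}


def has_sufficient_power(share, beta, required=REQUIRED_TWO_OBS_FIVE_YEARS):
    """True if the observed share of changing firms reaches the required level."""
    return share >= required[beta]


# Hypothetical ownership stakes at two dates five years apart
stakes_first = {"A": 0.50, "B": 1.00, "C": 0.30, "D": 1.00}
stakes_next = {"A": 0.60, "B": 1.00, "C": 0.30, "D": 0.90}
share = proportion_changing(stakes_first, stakes_next)
enough = has_sufficient_power(share, beta=0.8)
```

In this made-up example two of four firms change, which falls short of the 63% required for β = 0.8, so collecting an additional time observation would be worth considering.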
Two of the studies we surveyed allow us to deduce the proportion of observations that change in their sample but only over the whole period of analysis: in Lin et al. (2011), 21% of firms exhibit changes in ownership over a 13-year period, while in Lin et al. (2013) the proportion is 39% over a 10-year period. In the first paper, the authors collect ownership data four years apart. To see how these degrees of time variation map with our results, in Panel B of Fig. 1 we present a subset for sampling every four years as a function of the proportion of firms that change between the first and last year of data (instead of the following year, as in Panels A and C). We note that Lin et al. (2011) opt for difference regressions not on their full data but only on the observations that do change. The 13 years of data sampled every four years corresponds to the right-most graph in Panel B. For 17% of firms changing (close to the 21% in Lin et al. (2011)) between the first and last period, sufficient power is only present for a theoretical beta coefficient of 0.8. Therefore, a presumed lack of power on a full first difference estimation in their case is consistent with a relatively lower theoretical beta.
For board independence under endogeneity (Panel C of Fig. 1) we show sampling every two years, consistent with the design in Wintoki et al. (2012), and every four years for contrast. For low theoretical strength and two time periods power is very low, but for four time observations we find that power is enough once 45% of the firms change between the first and next sampling period. For higher theoretical strength this proportion drops to 10%. We examine these findings further in the next section, where we replicate two influential studies where ownership and board independence are the key variables of interest. Figure 1 may help researchers in several ways. Even when consecutive years of data are easily available in electronic form from data vendors, careful empirical design will benefit from comparing the degree of time variation in the data (relevant to a research question of interest) to our benchmark results. For example, if the researcher finds no significant relationship and the amount of variation in their data is low relative to our benchmark results, the decision not to use within-firm estimators can be substantiated with more formality. When data collection is a hurdle, for example when board data for private firms is not readily available, starting with two non-consecutive time observations of data and measuring the proportion of firms that change allows the researcher to decide whether collecting an additional time observation of data is worthwhile. Furthermore, a researcher starting a new project could look up an existing study that uses data with similar characteristics (jurisdiction, family control, listed status, etc.), check the proportion of firms that change in that study, and, coupled with our power graphs, choose a sampling time gap that provides sufficient statistical power.

Data and replicating regressions
Next, we check whether our simulation findings play out in real-life data by replicating two highly cited studies focusing on ownership (Lin et al. 2013) and board independence (Coles et al. 2008). We use several data sources for ownership and board independence data. Our most comprehensive source (ICO) provides a long time series coverage of detailed controlling ownership for a large number of firms. It is distinctive from most of the popular ownership databases in the following ways. First, it provides the complete chain of ownership links from any firm operating in Canada (above a certain size threshold 16 ) to its ultimate for full coverage of the data and further splits by family control and ownership structure complexity. We perform an extensive hand-collection of family control status, apply a verification algorithm with multiple alternative sources and exploit the long time series to detect inconsistencies and outliers. Last, we compute three ownership variables-cash flow rights of the ultimate owner, control rights of the ultimate owner and the ratio between the two-using computational tools from graph theory and formalized in Almeida et al. (2011). To ensure greater representativeness for the rest of the world, we augment all analyses with the CSMAR database covering all listed firms in China and offer further splits by state ownership. The raw direct ownership stakes we start with come from the Intercorporate Ownership Database (ICO) compiled by Statistics Canada for the period 1995-2019. The CRA specifies a penalty of fines and/or prison for failure to disclose ownership information. In ICO, all firms are attributed to belong to structures referred to as "enterprises" based on common control. 17 The two groups of firms (domestic Canadian and multinationals) differ in that the domestic dataset includes a large majority of private and small firms, while the multinational (MNC) dataset represents firms that are larger and more likely to be listed. 
For the replicating regressions we are limited to listed firms for which there is financial data (in either Capital IQ or Thomson One) and for which we can construct an outcome variable (public-to-private debt ratio and firm performance as proxied by Tobin's Q) plus the control variables in each of the two studies we replicate. We compute three ownership variables: the ultimate cash flow rights of the controlling owner (ucfr), the control rights of the controlling owner (cr) and the ratio between the two (wedge). ucfr_i is the proportion of one unit of disbursement from firm i that is received by the controlling owner, while cr_i is the critical control threshold, which is shown to be equivalent to the concept of the weakest link (as used in La Porta et al. 1999, Claessens et al. 2000, and Faccio and Lang 2002) when cross-shareholdings and multiple links are absent, but which can also be computed for more complex structures. We follow Almeida et al. (2011) in the construction of ucfr and cr. In the remainder of the text, for brevity, we use the term "ownership" synonymously with any one of the three variables. Full definitions of ownership and board independence variables are given in Appendix 1, while summary statistics of all variables used in the two replications are in Appendix 2.
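To make the two concepts concrete, the following simplified sketch computes cash flow rights as the sum, over all ownership chains from the owner to a firm, of the product of the stakes along each chain, and control rights as the weakest link of a single chain. It assumes an acyclic structure without cross-shareholdings and is an illustration only, not the Almeida et al. (2011) algorithm; the data layout and function names are hypothetical.

```python
def cash_flow_rights(stakes, owner, firm):
    """Ultimate cash flow rights of `owner` in `firm` for an acyclic structure.

    stakes maps each holder to {held entity: direct stake}. The result sums,
    over every chain from owner to firm, the product of the stakes on the chain.
    """
    if owner == firm:
        return 1.0
    return sum(share * cash_flow_rights(stakes, held, firm)
               for held, share in stakes.get(owner, {}).items())


def weakest_link(chain_stakes):
    """Control rights along a single chain: the smallest stake on the chain."""
    return min(chain_stakes)


# Hypothetical pyramid: the owner holds 60% of A, and A holds 50% of B
pyramid = {"owner": {"A": 0.6}, "A": {"B": 0.5}}
ucfr_b = cash_flow_rights(pyramid, "owner", "B")   # product of stakes on the chain
cr_b = weakest_link([0.6, 0.5])                    # minimum stake on the chain
```

In this example cash flow rights in B are lower than control rights, which is the kind of separation that the wedge variable is meant to capture.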
We collect board independence data from BoardEx, which is the most comprehensive board composition database, covering more than 20,000 companies globally. We separate firms into four jurisdiction groups in descending order of coverage: US (12,559, of which 7342 are listed), Anglo-Saxon (7228, of which 5258 are listed), Continental Europe (3532, of which 2865 are listed) and Rest of the World (5716, of which 4993 are listed). Again, we augment the East-Asian coverage by analyzing board independence of all Chinese listed companies in the CSMAR database (2613, of which 1000 are state-owned). For the illustrative regression on board independence, we use the same data as in the original study-US listed firms-and retrieve the required financial variables from Compustat.
We begin with a quasi-replication of Lin et al. (2013). They study the choice between public and private debt among Western European and East-Asian companies. Lin et al.  Table 4, which contains the specification we use). Our ownership data cover 25 years and allow us to select firms that have sufficient number of consecutive time observations so that we are able to examine different time series properties.

We adopt their specification 18 : Our extensions employ the within-firm estimator: where we suppress firm-year subscripts for brevity; debtchoice is the ratio of public to private debt; i are firm fixed effects, which we use in our extensions but which are not present in Lin et al. (2013); t are year effects, j t are country × year effects; X k is a vector of control variables: cash-flow rights, leverage, tangibility, size, profitability and Tobin's Q, all defined exactly as in the original paper ( ind q are industry fixed effects, which are absorbed by the firm fixed effects in our extensions).
We use a sample of similar but not identical firms because the data source in Lin et al. (2013) is a proprietary database (ORBIS by BvD-refer to the online appendix for the differences in precision between the BvD family of databases and ICO). However, we apply the exact same regression specification as in their Table 4, given in Eq. (2) above, plus within-firm estimators with different frequencies of sampling, given in Eq. (3). We show results for three jurisdictional samples in Table 2: All firms (Panel A), Continental Europe and East Asian firms-CEEA (Panel B) and US & UK firms (Panel C). We examine how the results may differ for firms not analyzed in Lin et al. (2013)-US firms, which exhibit lower time variation in ownership than those in Western Europe and East-Asia.
Column (1) of Panels A, B and C in Table 2 reports the same OLS estimator used by Lin et al. (2013), while in column (2) we show the within-firm estimator using the full data. In columns (3) and (4) of Panels B and C we limit the sample to contain four time observations with sampling frequency every two and every five years. To maintain comparability between the two sampling frequencies in Panels B and C, we keep the number of time observations the same, which means that in the five-years-apart case we need at least 16 years of data (including the first and last year).
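The arithmetic behind the minimum span of data follows from counting the first year plus the gaps between observations. A small sketch, with hypothetical function names and an arbitrary starting year:

```python
def years_required(n_obs, gap):
    """Calendar years spanned by n_obs observations sampled every `gap` years,
    counting both the first and the last year."""
    return (n_obs - 1) * gap + 1


def sampled_years(first_year, n_obs, gap):
    """Calendar years to retain when sampling every `gap` years."""
    return [first_year + k * gap for k in range(n_obs)]


span_five_apart = years_required(4, 5)    # four observations, five years apart
span_two_apart = years_required(4, 2)     # four observations, two years apart
keep = sampled_years(2000, 4, 5)          # 2000 is an arbitrary example start
```

With four observations five years apart the span is 3 × 5 + 1 = 16 years, matching the requirement stated above.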
When using the full data with an OLS specification with industry dummies (column (1) of Panels A, B and C), we confirm the findings of Lin et al. (2013) that the use of public debt is lower when firms have a higher wedge and z-score, but the interaction of the two counteracts this effect. However, in column (2) of Panel A, using the within-firm estimator results in lack of significance. This could be attributed to the low time variation in ownership of US firms, which represent around half of the sample.
The within-firm estimator in column (2) of Panels A, B and C only detects significance for the sub-sample of CEEA firms. At the bottom of Panels B and C in Table 2 we report the proportion of firms that change between the first and second sampling period. In column (3) Panel B Table 2 only 14.6% of CEEA firms change and the wedge effect is not detected, but five years apart (column (4)), 35.7% of the firms change and the effect is revealed.
For US & UK firms (Panel C of Table 2), however, the potential effect of the wedge cannot be detected by the within-firm estimator (columns (2), (3) and (4)). The low degree of time variation for the US & UK sample is evident in that only 10.8% of firms change in two consecutive years, while the proportion increases slightly to 13.4% and 17.1% respectively for two and five years apart. Both sets of results in Panels B (CEEA firms) and C (US & UK firms) are consistent with our simulation findings. In particular, having ten years of data is sufficient to detect an effect if it exists and is relatively strong (second section of Panel B Table 1); however, sampling five years apart with four time observations requires at least 24% of firms to change (left-most graph on the bottom row of Panel A Fig. 1).
Next, we replicate the regressions in Table 5 of Coles et al. (2008). They study the effect of firm complexity on firm value as measured by Tobin's Q in requiring a variety of rich expertise from the board. Their specification is: Our extensions employ the within-firm estimator: where we suppress firm-year subscripts for brevity; Q is Tobin's Q; i are firm fixed effects, which we use in our extensions but are not present in Coles et al. (2008); t are year effects; insiderfrac = 1-board independence; outsiders is the log of the number of independent directors; advice is a dummy variable equal to 1 if the firm-year observation ranks above the median by the first principal component of firm complexity based on the number of segments, size and leverage; X k is a vector of control variables: R&D dummy, standard deviation of returns, profitability plus its lag, intangible assets and CEO ownership, all defined as in the original paper; and ind q are industry fixed effects, which are absorbed by the firm fixed effects in our extensions. Here we require three time observations four years apart, which means a minimum of nine years of data plus one for lagged profitability.
We begin by mimicking the number of observations in Coles et al. (2008) in Panel D of Table 2. We are unable to replicate their number of firms, since it is not reported in the original study. We show that the within-firm estimator only detects the significance on insider fraction for the subset of firms with high time variation in column (4). This implies that the sample Coles et al. (2008) were working with likely had low time variation. Next, we exploit our full sample, which is larger than that of Coles et al. (2008), despite the fact that we limit the time period to ten years as is the case in their paper. This is due to the better coverage in BoardEx, which begins in 1999, while the Execucomp data in Coles et al. (2008) covers the period 1992-2001, but is sparser. The OLS and within-firm estimators in columns (1) and (2) of Panel E Table 2 detect the effect of the proportion of insiders as found by Coles et al. (2008) in their Table 5 column (3). When we sample with gaps using three time observations, power decreases but is still enough to detect the effect in column (3) given that 34.8% of firms change between the first and next sampling period (consistent with the dashed line in the top right graph of Panel C in Fig. 1, which requires more than 30% of firms changing). In column (4), however, 41.2% of firms changing four years apart no longer provides sufficient power, as the bottom right graph of Panel C in Fig. 1 requires more than 45% of firms to change.
Our replicating regressions verify that the within-firm estimator can detect a relationship between a sticky variable of interest and an outcome even in a short panel, as long as the sampling is performed as far apart as necessary to capture sufficient time variation. This is important, because within-firm estimators (even if imperfect) are more reliable in establishing causality than alternative approaches in the absence of shocks or valid instruments. 19

Empirical analysis
Last, we perform six types of time variation analysis to investigate how the four variables we consider map along our simulation findings (Table 1 and Fig. 1) depending on jurisdiction, listed status, family or state control and ownership structure complexity. Three of the analyses are of descriptive nature (averages over time, year-to-year variation and stable regimes), while the other three analyses are regression-based (analysis of variance, regressions showing the determinants of the time-series standard deviation of ownership and board independence and persistence regressions). We show additional sub-sample results along all six types of analyses in the online appendix.
The granular ownership data in ICO with a long time dimension allows for the most detailed sample splits, the dominance of state-owned firms in China provides an additional interesting dimension, while BoardEx shows the time variation patterns in board independence by jurisdiction and listed status. The composition of the ICO database is presented in Panel A of Fig. 2. The group of MNCs controlled from Anglo-Saxon countries is large and provides relative jurisdictional homogeneity. Therefore, our baseline analysis focuses on the two largest groups: Canadian firms and Anglo-Saxon MNCs. We present results for the two remaining groups, Continental Europe and Rest of the World, in the accompanying online appendix (the right-most two rectangles of each graph in Panel A of Fig. 2). 20 Anglo-Saxon countries are united by a type of capitalism characterized by a lower degree of government intervention and greater reliance on free market mechanisms (Esping-Andersen 1990). We focus on productive 21 firms that are part of either a pyramid or a group corporate structure. 22,23 This approach may appear too restrictive at first glance, but we emphasize that the granularity of the data means that firms which usually would be classified as stand-alone here fall in the group or pyramid type. In particular, listed or large private firms, even if non-family-controlled, almost always have subsidiaries (often wholly owned and organized in flat structures in the case of US or UK control) and therefore will be classified as either group or pyramid (the bottom two groups in Panel B of Fig. 2). Our definition of a stand-alone firm applies to small firms without any links to other legal entities. Stand-alone firms exhibit almost no time variation in ownership (results available in the online appendix). While all member firms of a corporate structure are included in the calculation of ultimate cash flow rights, in the subsequent analyses we are only interested in productive firms, because the non-productive ones are most often shell companies or individual trusts without any business activity. This still leaves in the analysis the productive firms that may have an ultimate owner that is a financial institution. These filtering steps have a balancing role in that the differences in size, industry and listed status between Canadian firms and Anglo-Saxon MNCs become less pronounced. In particular, the variety of sizes and industry composition represented in the structures are not significantly different between the Canadian and Anglo-Saxon MNC samples (industry and size splits are presented in the online appendix). The proportions of listed firms remain different, whereby 21% of the Canadian enterprises contain a listed firm, while 55% of the Anglo-Saxon MNCs do. 24 Starting with the total counts of productive firms in the database shown at the bottom of each graph in Panel A of Fig. 2 (78,128 non-listed and 2472 listed), we focus on the pyramids and groups with Canadian (28,116) and Anglo-Saxon (17,620) control. After dropping the firms that disappear from the database in 2006 due to a higher size threshold for mandatory disclosure, we have 6033 Canadian and 4848 Anglo-Saxon firms split by family control and listed status in Panel B of Fig. 2.

19 In the ANOVA analysis in the following section we show that year, industry, size deciles and the interactions thereof capture a tiny proportion of the variation in ownership and board independence.

20 The results for the subsamples of Continental Europe and Rest of the World are more consistent with the Anglo-Saxon MNC than the Canadian subsample, but any conclusions based on them would be based on a much smaller sample size.

21 We refer to non-financial firms as 'productive' throughout the text, tables and graphs. Financial firms include all types of financial intermediaries, investment funds, trusts and holding companies.

22 Results for financials are available in the online appendix. Similarly, some of our baseline analyses focus on firms with at least 12 years of data; in these cases we show results for firms with fewer than 12 years of data in the online appendix as well.

23 We use an established definition of a pyramid without the restriction that it must contain a listed firm (consistent with Almeida et al. 2011 and Faccio et al. 2010).

Table 2 Illustrative regression results
Panel A: All firms, specification as in Lin et al. (2013)
Panel D: Board independence specification on subsamples with low and high time variation, specification as in Coles et al. (2008)
Panel E: Board independence specification on full data and sampling with gaps, specification as in Coles et al. (2008) Table 5 with robust standard errors as in their specification
The dependent variable is Tobin's Q. We use data on US listed firms as in the original paper. Panel D shows subsample splits with the same number of observations as in the original paper by low and high time variation in board independence. Panel E columns (1) and (2) show OLS and within-firm estimators on the full data, while columns (3) and (4) show sampling every two and four years respectively. Summary statistics of the variables used in replications are given in Appendix 2. All variable definitions are as in the original papers. Significance symbols and thresholds used: * for p < 0.1; ** for p < 0.05 and *** for p < 0.01
In Fig. 3  These examples illustrate typical economic processes behind the time variation in ownership. We now turn to a detailed analysis of the patterns of these changes for different groups of firms over time.
In our analysis a firm is considered to be family-controlled if its ultimate owner is reported as a single entity describing an individual, a family, a group of related individuals or a group of related families, as in Claessens et al. (2002). 25 We take great care to assign family status with as much precision as we can by verifying the data in at least five other sources (as described in the accompanying online appendix). 26

24 The descriptive statistics and analysis in this section are done at the firm level, where multiple firms belong to the same enterprise (corporate structure). Here we give an idea of the proportion of enterprises having a listed firm because the disclosure requirements pertaining to listed firms likely have relevance to the entire structure under common control. The balancing achieved by focusing on the complex structures is a side product of eliminating stand-alone firms, whose ownership largely does not change over time.

25 In defining family ownership, we include spouses and children, but not cousins. For example, the Bronfman family shows up in the data with two separate and distinct groups: the Charles and Edgar Bronfman group (the sons of the patriarch Samuel), which controlled Seagram until 2001; and the Peter and Edward Bronfman group (the sons of Samuel's brother Allen), which went through several metamorphoses: Edper investment company, Brookfield Asset Management, Brascan Ltd. and presently Partners Ltd.

26 Firms are considered non-family-controlled if we cannot find evidence otherwise. This approach means that any measurement error in the coding of the variable family control will work against finding a difference between family and non-family firms, because presumably a number of cases that are family-controlled may have been left in the non-family category. This, however, is not a problem for us, since any differences we document would in fact be even larger were the measurement error eliminated.
State control is another important type of ownership, but because we observe very few enterprises under state control in the ownership data from ICO, we exclude them from the baseline analysis.
We study the patterns of time variation in ownership across jurisdictions, family and non-family status, and ownership structure complexity. These groupings are guided by the extensive literature on family firms, ownership structures and institutions. Existing literature reveals a number of reasons why family owners make ownership change decisions differently than non-family ones. For example, inheritance planning is only relevant for family owners and therefore some changes in the ownership structure of family firms are motivated by the distribution of wealth among descendants (Villalonga and Amit 2009; Tsoutsoura 2015). Family owners have a longer-term investment horizon (Bertrand and Schoar 2006) and are more risk-averse (Faccio et al. 2011), which affects the timing and size of their ownership stake decisions. Importantly, family business groups are likely to be organized as pyramids (Almeida and Wolfenzon 2006) or employ other control enhancing mechanisms (Masulis et al. 2011). Board independence may vary more in mature markets with stricter regulatory or institutional shareholder scrutiny. Among 26 countries which undertook corporate governance reforms in the beginning of the 2000s, only three mandated a minimum board independence threshold, while most employed a comply-or-explain model (Kim and Lu 2013). At the same time the range of values board independence can take depends directly on the current legal minimum in the respective jurisdiction, with one third being most predominant in emerging markets versus one half in developed economies (Papadopoulos 2019). In addition, non-listed firms are likely not to be subject to any minimum board independence requirements, while state-owned firms may have even stricter limits (for example 90% in Sweden and 80% in Vietnam (OECD 2018)).

Fig. 3 Examples of ownership changes over time

Summary statistics and standard deviation regressions
We begin the analysis of time variation of ownership and board independence with a summary graph of yearly averages across all firms, which we show in Fig. 4. In Panel A, the cash flow rights of the ultimate owner exhibit great stability. Starting from the bottom, the group of Canadian family firms are held with the lowest stake on average, which changes the most year to year. The average ownership stake of the non-family Canadian and family Anglo-Saxon MNC groups is always close to 0.9 and fluctuates somewhat. The group with the least time variation in its average is that of Anglo-Saxon MNC non-family firms. In addition, the average of this group is always above 0.95, which suggests that the vast majority of these firms are wholly owned. The top line in Panel B shows very low variation in the average board independence of US listed firms, with no discernible adjustment following the Sarbanes-Oxley Act. On the other hand, the rest of the Anglo-Saxon countries display the gradual increase in board independence consistent with the implementation of legal minimums or comply-or-explain codes. The majority of non-listed firms start being covered by BoardEx in 2007 and show much lower average levels of independence, around 0.5, with slightly more adjustments over time for the US subsample. Overall, Fig. 4 points to little time variation in time averages within groupings but different degrees of time variation across type of firm and jurisdiction. Corresponding graphs for control rights and wedge are shown in the online appendix.
This broad view of time variation does not tell us what proportion of the observations are responsible for the movements in the average, nor whether the movements have a common source for all firms or are firm-specific; we analyze these possibilities in the following sections.
In Table 3 we present three types of summary statistics: overall means, cross-sectional standard deviations (equal to the standard deviation of the time averages for all firms) and average time standard deviations (computed over time and then averaged across firms) of ownership and board independence by subsamples. Not surprisingly, the average ownership (standard deviation) in a pyramid is lower (higher) than in a group. However, even for pyramids the means are relatively high, which means that the typical firm is wholly owned (Panel A of Table 3). For Chinese listed firms in Panel B the means are much lower, reflecting the absence of wholly owned firms. The time standard deviations are always lower than the cross-sectional ones, reflecting the sticky nature of the variables. In Panel C of Table 3 mean board independence is higher for listed than non-listed firms for all jurisdictions, while the cross-sectional standard deviation is much higher for non-listed firms. Time standard deviations are always low as expected for a stable variable. In China, both private and state-owned firms have mean board independence very close to the legal minimum, with both cross-sectional and time standard deviations also very low, suggesting that within-firm estimators are unlikely to detect effects in Chinese data. Tests for differences in variances by subgroups are almost always statistically significant (shown in the online appendix). Corresponding tables of summary statistics and tests for differences in variances for control rights and wedge are available in the online appendix.
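The three statistics described above can be illustrated with a minimal sketch. It assumes a balanced toy panel stored as a dictionary of firm-level series, the pooled mean as the overall mean, and sample (n-1) standard deviations; the function name, the data layout and the numbers are hypothetical and are not taken from the data behind Table 3.

```python
from statistics import mean, stdev

def summary_stats(panel):
    """panel: dict mapping firm id -> list of yearly values (hypothetical layout)."""
    # Overall mean: pooled over all firm-year observations (assumption).
    overall_mean = mean(x for series in panel.values() for x in series)
    # Cross-sectional SD: standard deviation of the firms' time averages.
    firm_means = [mean(series) for series in panel.values()]
    cross_sectional_sd = stdev(firm_means)
    # Average time SD: SD over time for each firm, then averaged across firms.
    avg_time_sd = mean(stdev(series) for series in panel.values() if len(series) > 1)
    return overall_mean, cross_sectional_sd, avg_time_sd

# Toy data: one wholly owned firm and one firm whose stake moves once.
toy_panel = {"A": [1.0, 1.0, 1.0], "B": [0.5, 0.5, 0.8]}
overall_mean, cross_sectional_sd, avg_time_sd = summary_stats(toy_panel)
```

In this toy panel the average time standard deviation is smaller than the cross-sectional one, which is the pattern a sticky variable would be expected to produce.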
To test whether time variation is statistically significantly different in a multivariate setting, in Table 4 we present regressions of the time-series standard deviation of ownership and board independence on indicators for the different groupings in Table 3. The dependent variable is constructed over 4-year rolling windows for each firm. We confirm that family-controlled firms have significantly higher ownership time-series standard deviation. Across jurisdictions, Canadian firms exhibit statistically significantly higher standard deviation of ownership than Anglo-Saxon MNCs. The greater time variability of family firms is maintained through different size groupings (Panel B of Table 4). Board independence of listed firms has statistically significantly higher time-series standard deviation across jurisdictions (Panel C). The size decile analysis in Panel D reveals that the lower time variability in board independence for US listed firms is driven by the two smallest deciles.
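A minimal sketch of how a dependent variable of this kind could be built for a single firm is given below. It assumes overlapping windows and the sample standard deviation; the function name and the example series are hypothetical, and the regression on group indicators is not shown.

```python
from statistics import stdev

def rolling_time_sd(values, window=4):
    """Time-series SD over each consecutive `window`-year span of one firm's series."""
    return [
        stdev(values[start:start + window])
        for start in range(len(values) - window + 1)
    ]

# Hypothetical firm: wholly owned for four years, then the stake drops to 0.6.
example_sd = rolling_time_sd([1.0, 1.0, 1.0, 1.0, 0.6])
```

The first window contains no change and yields a standard deviation of zero; only the window that includes the drop contributes variation.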
Overall, the summary statistics and test results in Tables 3 and 4 suggest that the amount of time variation in ownership and board independence differs across jurisdictions, family control and listed status.

Analysis of year-to-year variation
In the next two sections we examine how our granular data maps onto the generalized simulation results in Fig. 1, where we show the relationship between power and proportion of firms that change between the first and next (last) period. We start with year-to-year changes and the firms responsible for these changes. In Table 5 Panel A (Panel B) we report the proportion of Canadian (Anglo-Saxon MNC) firms that change in each year. We use two definitions of change: any change > 0 and changes > 0.05 (as in Donelli et al. 2013), and we report non-family and family firms separately. We also analyze the yearly proportions of changes by industry and size, but for brevity they are only reported in the online appendix. Corresponding tables with very similar results for control rights and wedge are also shown in the online appendix.
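The two definitions of change can be sketched as follows. The sketch assumes that a change is measured as the absolute difference in the ownership share between two consecutive calendar years and that only firms observed in both years enter the denominator; the function name, data layout and values are hypothetical.

```python
def share_changing(panel, year, threshold=0.0):
    """Proportion of firms whose ownership changes by more than `threshold`
    between `year - 1` and `year`. panel: dict firm -> dict year -> share."""
    changed = total = 0
    for series in panel.values():
        if year in series and (year - 1) in series:
            total += 1
            if abs(series[year] - series[year - 1]) > threshold:
                changed += 1
    return changed / total if total else float("nan")

# Hypothetical four-firm panel.
toy_panel = {
    "A": {2004: 1.0, 2005: 1.0},
    "B": {2004: 0.9, 2005: 0.88},
    "C": {2004: 0.5, 2005: 0.7},
    "D": {2004: 1.0, 2005: 1.0},
}
any_change = share_changing(toy_panel, 2005)            # any change > 0
large_change = share_changing(toy_panel, 2005, 0.05)    # changes > 0.05
```

Firm B moves by two percentage points, so it counts under the first definition but not under the second.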
In both Panels A and B of Table 5 we note the lack of systematic spikes or drops in the proportion of firms that change in particular years or periods. The only exception is a jump in the period 2005-2006, which coincides with the changes in disclosure rules and size thresholds in CRA. Table 5 Panel A (B) tells us that at a minimum of two years of data (that allows the computation of a change) on average we have only 13.8% (7.1%) of Canadian (Anglo-Saxon) family firms, and only 11.1% (4.3%) of Canadian (Anglo-Saxon) non-family firms, exhibiting a change in ownership. The numbers for changes greater than 5% are comparable to the results in Donelli et al. (2013) for Chile.
In both Panels C and D of Table 5 we see a very different picture for board independence. The minimum proportion of firms that change their board independence in any year is 45%, with the overall proportion of firms that change at least once becoming 100% over the entire period. Therefore, we uncover a large difference in the time variation of ownership and board independence when it comes to the proportion of firms changing each year.
Consider the findings in Table 5 relative to the generalized simulation results in Fig. 1. The minimum average proportion of firms that change ownership between two consecutive calendar years among all subsamples is 4.3%, which accumulates to 22.8% for the full period. Distributing this accumulated change equally over the period implies a rough estimate at year 13 of 21%, which would provide sufficient statistical power for four time observations and = 0.8, corresponding to the right-most graph of Panel B of Fig. 1. For the other subsamples time variation is higher, and they would provide sufficient power even for a shorter time series: the 13.8% average year-to-year proportion of changing firms for the Canadian family subsample would translate to 42.1% in year 9 and have sufficient power for close to 0.6.
Realistically, in empirical research involving sticky data it is much more likely to have a few time observations and be in the situation where the proportion of firms that account for the firm-specific time variation is relatively low. This highlights the local nature of an estimator based on time variation for a persistent variable and necessitates a closer look at the changing firms. We examine the kind of firms that change by industry and size (results in the online appendix). The industry and size splits reveal that the most changeable firms are larger and that in all categories family firms change more than non-family firms.

Analysis of stable regimes
Finally, we note that the year-to-year changes in the previous section are calendar year specific. Therefore, in Table 6 we introduce the concept of a stable regime that is calendar year independent. We follow DeAngelo and Roll (2015), who examine the stability of capital structures in US listed firms over long horizons. A stable regime is defined as a period of t years, where ownership does not change outside a predetermined bandwidth. For example, in Fig. 3 we see that Indigo is in a stable regime where no changes occur to its wholly owned status from 1997 until 2001, then it enters another stable regime at ~ 45% until 2006 and subsequently it experiences changes every year. Indigo will then be part of the proportion of firms that are in a stable regime for at least five years, while for Celestica, the longest stable regime is three years (~ 65% in 2005-2007).
We examine minimum lengths of stable regimes from 3 to 16 years and changes of 0%, 5% and 10%. For example, when we require the change to be greater than 10% for the stable regime to end, Celestica will be in one for nine years (1999-2007). However, when we require the change in ownership to be less than 5% or 0%, Celestica is never in a stable regime. We limit the analysis to firms that have at least 12 years of data to make the computed proportions in each year be driven only by changes and not by the number of available firms in the sample.
In Panel A of Table 6 we report the proportion of Canadian non-family firms with stable ownership regimes. Comparing the overall reduction in stability between Canadian non-family (Panel A) and family firms (Panel B), we note that the stability of family firms drops faster. From the second row of Panel A (Panel B) we note that although 98.2% (99.1%) of non-family (family) firms do not change their ownership structure within three years, by year 7 we see a decrease to 89.9% (84.4%), and this proportion further decreases to 67.2% (56.9%) by year 12 and 63.0% (51.6%) by year 16.
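One way to operationalize a stable regime is sketched below. It assumes that a run of consecutive years is stable when the range (maximum minus minimum) of ownership within the run does not exceed the bandwidth, in the spirit of DeAngelo and Roll (2015); this range-based rule, the function names and the example series are assumptions for illustration and may differ from the exact rule behind Table 6.

```python
def longest_stable_regime(values, bandwidth=0.0, tol=1e-12):
    """Length in years of the longest run whose max - min stays within `bandwidth`."""
    best = 0
    for start in range(len(values)):
        lo = hi = values[start]
        for end in range(start, len(values)):
            lo, hi = min(lo, values[end]), max(hi, values[end])
            if hi - lo > bandwidth + tol:
                break  # the regime ends once the range leaves the band
            best = max(best, end - start + 1)
    return best

def share_in_stable_regime(panel, min_years, bandwidth=0.0):
    """Proportion of firms whose longest stable regime lasts at least `min_years`."""
    lengths = [longest_stable_regime(series, bandwidth) for series in panel.values()]
    return sum(length >= min_years for length in lengths) / len(lengths)

# Hypothetical firm: wholly owned for five years, then 45% for five years,
# then a different stake every year.
series = [1.0] * 5 + [0.45] * 5 + [0.50, 0.38, 0.61]
strict = longest_stable_regime(series, bandwidth=0.0)
relaxed = longest_stable_regime(series, bandwidth=0.10)
```

Widening the bandwidth can only lengthen the longest regime, which is why the proportions in the relaxed rows of a table like Table 6 are at least as large as those in the strict rows.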
When we relax our definition of stable regime by increasing the bandwidth to 10%, we still find a clear reduction in the proportion of firms that follow stable regimes as time increases. The family effect is evident across jurisdictions, i.e., ownership stability dissipates faster for family firms. The median and mean number of years in a stable regime reported in the last two columns of Table 6 also show that ownership stability is lower for family firms in both the Canadian and Anglo-Saxon MNC sample, as is board independence stability for Anglo-Saxon versus US firms.
Table 4 notes: We present regressions of the time-series standard deviation of ownership and board independence on indicators for jurisdiction, family control, ownership structure complexity, listed status and size deciles. Time-series standard deviation is calculated over rolling windows of 4 years for each firm. The ownership sample consists of productive firms only (corresponding to the overall firm counts in Panel B of Fig. 2). Panel A shows full sample analysis, while in Panel B we run the regressions separately by jurisdiction. Size groups in the ownership data are proxied by the number of subsidiaries in each structure (group 1 for 10 subsidiaries or less; group 2 for 11-25 subsidiaries; group 3 for 26-50 and group 4 for 51 or more), a measure which we confirm has a high correlation with total assets for listed firms. Panels C and D show the corresponding regressions for board independence of the firms covered by the BoardEx database. Size deciles in the board data are formed based on total assets, where we have merged the BoardEx data with CapitalIQ using ISIN codes. Significance symbols and thresholds used: * for p < 0.1; ** for p < 0.05 and *** for p < 0.01.
Table 5 Year-to-year variation in ownership
This table covers productive firms for which we have data for at least 12 years and which have a group or a pyramid ownership structure (corresponding to the overall firm counts in Panel B of Fig. 2; results for the remaining firms are in the online appendix). In Panels C and D data for non-listed firms begins only in 2006 and is initially very sparse.
Table 6 Stable regimes in ownership
This table covers productive firms for which we have data for at least 12 years (corresponding to the overall firm counts in Panel B of Fig. 2; results for the remaining firms are in the online appendix). Ownership is ultimate cash flow rights of the controlling owner. We report the proportion of firms with the longest stable regime (based on each ownership range specified in the rows) having lasted at least as many years as specified in the columns.
On row four of all panels in Table 6 we report the year-to-year decrease in stability under the 0.05 bandwidth. In bold we have shown the years with the biggest drop. Using this metric, we find yet again that the biggest drop happens sooner (year 8) for family than non-family firms (year 12) in the Canadian sample. In Panels E and F of Table 6 we note that for board independence the largest drop in stability for US firms happens in year 9 (equal to the drop in year 10), while for Anglo-Saxon firms it is in year 5. Both groups begin with very high proportions of three-year stability, just as in the case of ownership, but the reductions are much larger, and by year 16 the proportion of firms still in a stable regime has dropped to 0 on row 1, whereas for ownership that proportion stays around or above 50% (Panels A-D of Table 6).
The stable regimes analysis shows that time variation accrues faster for some kinds of firms (Canadian family firms in the case of ownership and Anglo-Saxon firms for board independence). While within seven years close to 90% of non-family firms remain in stable ownership regimes, five more years reduces this proportion to two thirds. Stable regime tables for control rights and wedge are presented in the online appendix.

Analysis of variance and persistence regressions
In Table 7 we show disaggregation of the variation in ownership and board independence by firm fixed effects, year fixed effects, industry-year, industry-size deciles and firm-time interactions, following DeAngelo and Roll (2015) and Lemmon et al. (2008). The first column shows the adjusted R-sq of each model, while the remaining columns report the relative proportion of variation explained by each set of fixed effects. Exact regression specifications and methodological details are presented in the online appendix. The proportion of explained variation in ownership accounted for by firm-time interactions is substantial in cash flow rights: 0.1325 for Canadian non-family firms and 0.1756 for family firms; 0.1814 for Anglo-Saxon MNC non-family firms and 0.2808 for Anglo-Saxon MNC family firms. The results are similar for control rights, wedge and board independence. For comparison, industry-time interactions are ten times less important for all groups. The analysis-of-variance results tell us that over a longer period of 25 years the firm-specific time varying component of ownership is important, accounting for just under a fifth of all explained variation for family Canadian firms and close to a third for family Anglo-Saxon MNC firms. Therefore, the within-firm estimator should not be dismissed on the grounds of lack of firm-specific time variation. In contrast, the common-to-all-firms time series variation, industry-specific time variation and the one captured by industry-size deciles are all negligible. Existing studies often argue that using industry, time, size deciles and their interactions is a way to address the inability to use firm fixed effects. Our results, however, show that this approach does not come close to capturing the variation accounted for by firm fixed effects.
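The relative proportions reported in Table 7 can be written compactly. The notation below is introduced only for illustration: $SS_k$ stands for the Type III partial sum of squares of effect $k$ (firm, year, industry-year, industry-size decile or firm-time), and $s_k$ for its normalized share.

```latex
s_k = \frac{SS_k}{\sum_{j} SS_j}, \qquad \sum_{k} s_k = 1
```

Under this normalization, a value such as 0.1756 for firm-time interactions reads as 17.56% of the variation attributed to the included effects, not of the total variation in the variable.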
The persistence regressions in Table 8 follow Graham et al. (2020), where ownership and board independence are regressed on their initial level. The statistically significant and large coefficients confirm the stability we have documented in the previous analyses, as well as the fact that it differs across groups. For example, in Panel A we see that cash flow rights for Canadian non-family firms change by between 0.65 and 0.7 for a unit level change in the initial value, while this is only between 0.44 and 0.47 for family firms. The initial level coefficients for board independence in Panel B similarly show different degrees of stability across jurisdictions. The lower persistence for US listed firms is evident in line with the time-series standard deviation results in Table 4.
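The idea of a persistence regression can be illustrated with a stripped-down sketch: a pooled bivariate regression of each later observation on the firm's initial level, without the controls or fixed effects that the specifications in Table 8 may include. The function name and the toy panels are hypothetical.

```python
from statistics import mean

def persistence_slope(panel):
    """OLS slope from regressing later values on the firm's initial level.
    panel: dict firm -> list of yearly values, first entry = initial level."""
    xs, ys = [], []
    for series in panel.values():
        initial = series[0]
        for later in series[1:]:
            xs.append(initial)
            ys.append(later)
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx

# Fully persistent panel versus a panel that drifts away from its initial level.
sticky = persistence_slope({"A": [1.0, 1.0, 1.0], "B": [0.5, 0.5, 0.5]})
drifting = persistence_slope({"A": [1.0, 0.9, 0.8], "B": [0.5, 0.6, 0.7]})
```

A slope close to one indicates that current levels are almost fully explained by initial levels, whereas a smaller slope indicates that firms move away from where they started.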

Summary of empirical analysis
In the last part of this study we show the different time variation patterns in ownership and board independence manifesting in different subgroups by jurisdiction and type of control.
In year averages we find that both variables vary little for most subgroups (Fig. 4), and the standard deviation over time is lower than that computed cross-sectionally (Table 3). The year-to-year analysis shows that the proportion of firms that change between any two consecutive calendar years is different across subgroups (Table 5), with dramatically higher year-to-year variation for board independence than ownership. On the other hand, the stable regimes are calendar time independent and indicate that a relatively shorter period is required for time variation in ownership to accrue for Canadian family firms and longer for non-family firms and Anglo-Saxon MNCs, as well as for board independence of Anglo-Saxon versus US firms (Table 6). We confirm these descriptive findings in multivariate analyses. In particular, the time-series standard deviation of ownership is higher for family firms and pyramid structures, but lower for Anglo-Saxon firms (Table 4). In terms of listed status, the time-series standard deviation of board independence of non-US firms is higher relative to non-listed ones. Variance decomposition and persistence regressions (Tables 7 and 8) similarly demonstrate the different patterns of time variation revealed in the descriptive findings. Therefore, we confirm that the time variation in ownership and board independence exhibits different patterns across subgroups of firms, with the subsamples with higher time variation corresponding to sufficient power in the simulation results.

Conclusion
Recent methodological and editorial guidance calls for improving causal inference and ruling out alternative explanations in all sub-disciplines of business research. This can be done either through a source of exogenous variation or through careful and exhaustive tests that convincingly support the baseline results. Very often these approaches rely on time variation. For example, existing studies where controlling ownership or board independence is an explanatory variable often find it impossible to employ within-firm estimators (that address time-invariant endogeneity) because of the lack of time variation. Based on theoretically and empirically grounded rules, we simulate artificial data reflecting the evident stickiness of such variables in existing work and analyze the power properties of within-firm estimators under different degrees of time variation. The results provide guidance on key elements of the empirical design of governance research. Responding to common challenges, such as difficulty of data collection, we derive relationships between lengths of time and gaps in data collection on the one hand and statistical power on the other. Our results are useful for a variety of research settings in any jurisdiction and can be employed as a benchmark to assess whether time variation is sufficient to detect a relationship if it exists.
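The logic of such a power analysis can be conveyed with a deliberately simplified Monte Carlo sketch. It is not the simulation design used in this study: the data-generating process (every firm starts wholly owned, a given share of firms makes one permanent change of 10 to 50 percentage points), the parameter names, the default values and the use of the normal critical value 1.96 are all assumptions made for illustration.

```python
import math
import random

def simulate_panel(n_firms, n_periods, share_changing, beta, noise_sd, rng):
    """Generate one artificial panel; returns per-firm lists of x and y."""
    xs, ys = [], []
    for _ in range(n_firms):
        series = [1.0] * n_periods  # assumption: every firm starts wholly owned
        if rng.random() < share_changing:
            # One permanent change at a random later period (sticky variable).
            t_change = rng.randrange(1, n_periods)
            new_level = 1.0 - rng.uniform(0.1, 0.5)
            for t in range(t_change, n_periods):
                series[t] = new_level
        alpha = rng.gauss(0.0, 1.0)  # time-invariant firm effect
        xs.append(series)
        ys.append([alpha + beta * x + rng.gauss(0.0, noise_sd) for x in series])
    return xs, ys

def within_t_stat(xs, ys):
    """t-statistic of the within-firm slope; None if x has no time variation."""
    sxx = sxy = 0.0
    demeaned = []
    for x, y in zip(xs, ys):
        mx, my = sum(x) / len(x), sum(y) / len(y)
        for xt, yt in zip(x, y):
            dx, dy = xt - mx, yt - my
            sxx += dx * dx
            sxy += dx * dy
            demeaned.append((dx, dy))
    if sxx == 0.0:
        return None
    b = sxy / sxx
    rss = sum((dy - b * dx) ** 2 for dx, dy in demeaned)
    dof = len(demeaned) - len(xs) - 1  # observations - firm effects - slope
    return b / math.sqrt(rss / dof / sxx)

def estimate_power(share_changing, n_firms=100, n_periods=4, beta=0.5,
                   noise_sd=1.0, reps=50, seed=0):
    """Share of simulated panels in which the within-firm slope is significant."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        t = within_t_stat(*simulate_panel(n_firms, n_periods, share_changing,
                                          beta, noise_sd, rng))
        if t is not None and abs(t) > 1.96:  # normal approximation, 5% level
            rejections += 1
    return rejections / reps
```

Calling `estimate_power` over a grid of `share_changing` and `n_periods` values traces out how power responds to the proportion of changing firms and to the length of the time series. When no firm changes, the within-firm slope is not identified and the sketch counts the panel as a non-rejection.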
When data needs to be hand collected or extensively pre-processed, a researcher may begin by examining the proportion of firms that change between two time points. Our simulations suggest that for a range of strengths of the theoretical relationship of interest, when sampling ownership data three years apart, a minimum of between 25 and 55% of firms in the data should change between two neighboring time points for sufficient statistical power. For board independence, sampling data two years apart requires between 15 and 50% of firms changing. Even when a project uses an electronic data source, which does not present data collection challenges, our findings are useful in judging whether a finding of no significance is due to lack of power as opposed to ownership or board independence being unrelated to the outcome of interest.
Table 7 notes: [...] (1), and three sets of interactions: industry*year (2), industry*size deciles (3) and firm*time (4). The estimated Eqs. (1)-(4) are given in the online appendix. We show the Type III partial sum of squares for each effect in the model and then normalize it by the sum across the effects, forcing columns 2 through 6 on each row to sum to one. The ownership sample consists of productive firms only
We illustrate the simulation findings with quasi-replications of seminal studies of ownership (Lin et al. 2013) and board independence (Coles et al. 2008). In the case of ownership, a unique granular database (ICO) allows the exact computation of variables for consecutive years over a relatively longer time period and a large number of firms. For board independence, we use the most popular source with the widest coverage (BoardEx). Based on our replications, we confirm existing results that firms controlled by an Anglo-Saxon jurisdiction often have insufficient variation in ownership for statistical power purposes; however, non-Anglo-Saxon firms exhibit more time variation that allows the use of within-firm estimators. For board independence of US listed firms, a relationship is detectable only if the time series is sufficiently long or, for a shorter series with gaps, if the proportion of firms changing is sufficient. We further establish that there are indeed different degrees of time variation in governance variables across type of control, jurisdiction and complexity of ownership structure, which supports the usefulness of the simulation findings. When researchers use our findings for empirical design decisions and data collection strategies, they should condition these choices on the specific characteristics of their data as we do here.
Beyond methodological guidance addressing a particular problem with governance data, our findings open new avenues for future research. The fact that the amount of time variation in ownership and board independence differs across jurisdictions, listed status, family control, etc., invites a natural question about the drivers of this greater time variation. At least three possible theoretical explanations can be explored. Greater decision making flexibility for single/concentrated versus multiple/dispersed owners (Bhaumik et al. 2010) is consistent with more frequent adjustments in governance mechanisms in response to arising technological and competitive opportunities or regulatory and commodity pricing shocks (Graham et al. 2020). The joint pursuit of business and personal objectives by decision makers can manifest in governance changes associated with personal inheritance and tax planning (Carney et al. 2014; Tsoutsoura 2015) or reputational concerns (Belenzon et al. 2019; Deephouse and Jaskiewicz 2013). If certain types of controllers are more likely to expropriate outside shareholders, they may opportunistically adjust their own exposure to the cash flows generated by a firm they control in the expectation of future negative performance (Bertrand et al. 2002).
Our generalized simulation results can be used to revisit many of the research questions in existing literature, where adding one more time observation or sampling further apart can allow time-variation-based methods to be employed. Further, by being able to exploit the time dimension in the data, many new research projects are more likely to become viable. In terms of new avenues of research where time variation in ownership can be useful, consider the adoption of peer-to-peer technology that has democratized access to startup and secondary equity financing. This recent phenomenon can inform the long-lasting debate on whether business groups or other forms of concentrated corporate control are compensating for institutional weakness or hampering innovation and growth (Khanna and Yafeh 2007; Morck 2005). Researchers can revisit this question by studying the effect of changes in ownership patterns in industries that are relatively less costly for new entrants (software services, call centers) in jurisdictions with different degrees of institutional weakness, on firm-level innovation, growth and value added. Similarly, time-varying ownership data can be used to test whether the advent of block-chain technology, and the unprecedented degree of transparency of business transactions (including in ownership stakes) it allows, reduce the incentives to maintain complex ownership structures (Yermack 2017).
In terms of board independence, we know from Graham et al. (2020) that it exhibits long-term dynamics for US listed firms. Given regulatory and institutional constraints within other jurisdictions and different incentives for private as opposed to listed firms, further theories of bargaining and/or dynamic contracting become testable in data with manageable time series lengths.

Appendix 1 Variable definitions of ownership variables
Variable/concept: Pyramid
Definition: A dummy variable equal to 1 if an entity belongs to a group of connected legal entities, where if the ultimate owner is an individual, there must be at least one corporation that is a parent to a non-wholly-owned subsidiary. Where the ultimate owner is non-family controlled, then it must be a parent to a non-wholly-owned subsidiary. Often, researchers require that at least one of the entities in the group be publicly listed. We do not make this requirement because our data covers privately held entities and their ownership structure represents interest in itself
Source: Author definition consistent with Almeida and Wolfenzon (2006) and Khanna and Yafeh (2007)

Variable/concept: Apex
Definition: A dummy variable equal to 1 if an entity is the apex of an enterprise. An apex is the entity at the top of a corporate ownership structure, which exercises effective control over all other entities therein. The types of apexes we encounter are either families/individuals or companies for which control cannot be assigned to a single family/individual
Source: ICO

Variable/concept: Group
Definition: A group represents connected legal entities under common control. Groups are either flat structures, where the controlling owner holds all subsidiaries directly or could be layered, but all subsidiaries are wholly owned
Source: Author definition consistent with Almeida and Wolfenzon (2006) and Khanna and Yafeh (2007)

Variable/concept: Stand-alone firm
Definition: A firm that is not a subsidiary or is directly held by its controlling owner in case of an individual or family

Variable/concept: Productive
Definition: A dummy variable equal to 1 if an entity belongs to an industry that is not banking, finance or insurance (we do include utilities as a productive sector). It equals 0 for financial firms in banking, finance or insurance. Importantly, many of the financial firms are classified as holding companies (part of the financial group in the NAICS classification) without any productive activity. Non-profits are also included in the financial category as often corporate structures contain legal entities with a charity status that in practice serves as a holding company. We exclude entities classified as government bodies from the data and only retain them if they are the ultimate owner of a structure for classification purposes
Source: ICO and Thomson Financial

Variable/concept: Anglo-Saxon country
Definition: Anglo-Saxon jurisdictions are Australia, Ireland, New Zealand, UK and US (Esping-Andersen 1990). Canada is also considered an Anglo-Saxon country but here it represents a separate sample partly due to its hybrid common-civil law system, partly because it covers the population of Canadian firms as opposed to a subset of large global multinationals. In the case of board independence data, we treat the US as a separate subgroup for similar reasons of data coverage in BoardEx

Variable/concept: Ultimate cash flow rights
Definition: Ultimate cash flow rights ($ucfr_i$) equals the proportion of one unit of disbursement from firm i that is received by the apex. $ucfr_i$ is the i-th entry of the vector $ucfr = f'(I_n - A)^{-1}$, where $I_n$ is the identity matrix with dimensions n x n. We treat the ucfr of the apex in itself as 1. See Almeida et al. (2011)
Source: ICO by Statistics Canada and author calculations

Variable/concept: Control rights
Definition: Control rights ($cr_i$) is defined in Almeida et al. (2011) as the critical control threshold which is shown to be equivalent to the concept of the weakest link (as used in La Porta et al. 1999, Claessens et al. 2000, and Faccio and Lang 2002) when cross-shareholdings and multiple links are absent but can also be computed for more complex structures
Source: ICO by Statistics Canada and author calculations

Variable/concept: Wedge
Definition: The ratio of ultimate cash flow rights and control rights ($ucfr_i / cr_i$)
Source: ICO by Statistics Canada and author calculations

Variable/concept: Board independence
Definition: The ratio of independent directors out of the total number of directors
Source: BoardEx
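As a worked illustration of the ultimate cash flow rights definition above, the sketch below evaluates $f'(I_n - A)^{-1}$ for a small hypothetical pyramid. It assumes, following our reading of Almeida et al. (2011), that f[i] is the apex's direct stake in firm i and that A[i][j] is firm i's direct stake in firm j. It also assumes a structure without cross-holdings, so that the matrix inverse equals the finite sum I + A + A^2 + ... and can be computed without a linear algebra library. The function name and the stakes are hypothetical.

```python
def ultimate_cash_flow_rights(f, A):
    """Evaluate u = f'(I - A)^{-1} as f' + f'A + f'A^2 + ... (no cross-holdings)."""
    n = len(f)
    u = list(f)
    term = list(f)
    for _ in range(n):  # A^k vanishes for k >= n in a structure without cycles
        term = [sum(term[i] * A[i][j] for i in range(n)) for j in range(n)]
        u = [total + extra for total, extra in zip(u, term)]
    return u

# Hypothetical pyramid: the apex holds 50% of firm 0, firm 0 holds 40% of firm 1,
# and firm 1 wholly owns firm 2.
f = [0.5, 0.0, 0.0]
A = [
    [0.0, 0.4, 0.0],
    [0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0],
]
ucfr = ultimate_cash_flow_rights(f, A)
```

In this example the apex receives 50% of a unit disbursed by firm 0 and 20% (0.5 x 0.4) of a unit disbursed by firm 1 or firm 2, which is the product of the stakes along the ownership chain. Structures with cross-holdings would require the full matrix inverse.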

Appendix 2 Summary statistics of variables used in replications
Below are the summary statistics for all variables used in the replications reported in