
A Theory of Succession in Family Firms

  • Original Paper
  • Journal of Family and Economic Issues

Abstract

Succession is one of the most important issues facing the most common type of firm. The literature on family firm succession has developed in a scattered fashion across different paradigms, setting forth stylized facts, informal arguments and observations. In this paper, we present a theory of family firm succession that unifies and synthesizes the dispersed contributions found in family business research; specifically, the key role of the training activity in preparing the potential candidate, the importance of the amenity potential inherent to family businesses, the incumbent’s reluctance to step aside, underperforming successions, the role of trust in the succession process, and the barriers to a “non-family” succession. Within a simple microeconomic framework, we find that these different facts and arguments spelt out in the literature are reflections of the same fundamental economic trade-off between proficiency (skills) and honesty (incentives) when choosing among potential successors.


Notes

  1. See, for example, Smith and Amoako-Adu (1999), Shepherd and Zacharakis (2000), Pérez-González (2006), Villalonga and Amit (2006), Bennedsen et al. (2007), Cucculelli and Micucci (2008), Anderson et al. (2009), Eklund et al. (2013), Isakov and Weisskopf (2014) or Merchant et al. (2018).

  2. See, for example, Handler (1994), Chrisman et al. (1998), Cabrera-Suárez et al. (2001) or Le Breton-Miller et al. (2004).

  3. More specifically, Lee et al.’s Proposition 4a states that a highly proficient heir is preferred by the family to an outside manager of uncertain ability, while Proposition 4b shows that an heir of low proficiency who is (exogenously) endowed with idiosyncratic knowledge will be hired provided that the family firm’s performance is highly dependent on such idiosyncratic knowledge.

  4. See Blumentritt et al. (2013) for an overview of potential applications of game theory to understand the decisions and outcomes in family business succession.

  5. The existence of non-pecuniary sources of utility derived from the control over the firm can be found in Burkart et al. (2003) and Bhaumik and Gregoriou (2010) in the context of family-owned firms.

  6. Observe that the labels “smooth” and “harsh” are related to the efficiency of the training process, not to the candidate’s capacity as a manager. Thus, the revenue, \(v_{i}({\mathsf {e}}_{i})\), could be higher for a harsh process than for a smooth one.

  7. This specification is taken from Pagano and Röell (1998) and Burkart et al. (2003).

  8. For easy monitoring (\(\kappa _{i}<2\)) the incumbent need not spend all her time on this activity even in the case of full deprivation, i.e. \(s_{i}<1\) for \(m_{i}(s_{i};\kappa _{i})=1\); alternatively, for burdensome monitoring (\(\kappa _{i}>2\)) the incumbent cannot fully deprive the manager even if all her time is devoted to this activity, i.e. \(m_{i}(1;\kappa _{i})<1\).

  9. The explicit presence of a time constraint distinguishes our framework from Burkart et al.’s. Thus, their notion of “monitoring intensity” \(m_{i}\) becomes “deprivation intensity” in our setting and depends on the time devoted to monitoring activities, \(s_{i}\), which is restricted by temporal feasibility.

  10. Handler (1988) and Sharma et al. (2001) point out that the most cited barrier to effective succession is the incumbent’s personal sense of attachment to the business.

  11. Our specification—in which wages and monitoring are simultaneously and optimally determined—circumvents the time-consistency problems found in Burkart et al. (2003, Sect. II.B). There, once the manager has signed on to run the firm and revenues have been realized, the incumbent has an incentive to reduce the manager’s private benefits by monitoring more.

  12. The incumbent’s welfare (5) is a generalization of the founder’s welfare \(V^{s}\) in Burkart et al. (2003, p. 2176) with \(\breve{\beta }=1\) and \(\rho =0\), since these authors consider that the incumbent fully retires (\(\pi =1\)) and receives no outside-of-the-firm welfare (\(\delta _{F}=0\)).

  13. Observe that in Burkart et al. the founder deprives a fraction of the total revenue (\(m_{i}\upsilon _{i}\)), while in our work she deprives a fraction of the manager’s private benefit appropriation (\(m_{i}\phi _{i}\upsilon _{i}\)). Thus, they find a different interior optimal deprivation, \(m_{i}^{*}=\upsilon _{i}/\kappa _{i}\), which forces them to set exogenous bounds on deprivation: \(m_{i}\in [0,1]\) and \(m_{i}\le {\overline{\phi }}\), with \({\overline{\phi }}\) set by the legal protection of shareholders. Interestingly, all bounds on deprivation in (6) are obtained endogenously within our framework.

  14. See Proposition A.1 in "Appendix 4: Characterizing Potential Optimal Levels of Training".

  15. See Proposition A.2 in "Appendix 4: Characterizing Potential Optimal Levels of Training".

  16. Unlike Burkart et al. (2005) and Bhattacharya and Ravikumar (2010), we will not assume that the manager is better than the incumbent at managing the firm (i.e., \(\upsilon _{M}>\upsilon _{F}\)).

  17. Some authors, such as Kandel and Lazear (1992) and Davis et al. (1997), have argued that family managers could be exposed to higher non-monetary rewards associated with firm success that other managers do not share. More recently, Puri and Robinson (2013) find evidence of the existence of non-pecuniary benefits (measured as attitudes towards retirement) among family business owners and those who inherit a business.

  18. See, for example, Friedman and Olk (1995), Shen and Cannella (2002) or Klein and Bell (2007).

  19. Our interpretation of the ratio \(\lambda _{M}\) in the subsequent analysis focuses on the honesty attributes of the manager, \(\phi _{M}\), for a given monitoring parameter \(\kappa _{M}\).

  20. These two types are characterized by two frontiers, the non-family-manager deprivation and monitoring frontiers, formally defined in "Appendix 3: The Non-family Manager Deprivation and Monitoring Frontiers".

  21. For instance, Zellweger (2018, Chap. 7.7.5) poses two key dimensions of the ‘right’ successor: willingness—the successor’s commitment to the firm—and ability—the successor’s capacity for the job profile. These dimensions fit squarely with our variables honesty and relative performance. Interestingly, our Fig. 2 provides a formal depiction (as well as a deeper insight) of the succession options informally displayed in Zellweger’s Fig. 7.10, the willingness and ability diagram.

  22. For example, the “outsider successor” in the typology proposed by Shen and Cannella (2002).

  23. A recent example is the inability to find a suitable substitute for Sir Alex Ferguson in 2013. Over the course of his 27-year tenure, Manchester United won the Premier League title 13 times and the UEFA Champions League twice (see The Economist 2014).

  24. Unlike in the non-family manager case, the thresholds \(\underline{\mu }_{H}({\mathsf {e}}^{*})\) and \({\overline{\mu }}_{H}({\mathsf {e}}^{*})\) are not constant values; they depend on the optimal training \({\mathsf {e}}^{*}\). Interestingly, as optimal training increases, the region that depicts relatively average family managers shrinks, and it fully vanishes at \({\mathsf {e}}^{*}=1\), i.e. \({\underline{\mu }}_{H}(1)={\overline{\mu }}_{H}(1)\).

  25. Differently from the non-family manager case, the family manager’s honesty is a relative concept that depends on the optimal training \({\mathsf {e}}^{*}\). For easy monitoring (\(\kappa _{H}<2\)), the threshold of honesty decreases steadily as optimal training increases in the range \({\mathsf {e}}^{*}\le 1-\frac{\kappa _{H}}{2}\), beyond which it is constant at \(\breve{\lambda }_{H}({\mathsf {e}}^{*})=1/{\overline{\mu }}_{H}(1-\frac{\kappa _{H}}{2})\) for any \({\mathsf {e}}^{*}>1-\frac{\kappa _{H}}{2}\). For burdensome monitoring (\(\kappa _{H}>2\)), the threshold of honesty decreases steadily as the optimal training \({\mathsf {e}}^{*}\) increases, reaching \(\breve{\lambda }_{H}(1)=0\) at \({\mathsf {e}}^{*}=1\).

  26. This profile corresponds to the “high potential” type of successor in Blumentritt (2016).

  27. The reason is the following. The condition \(m^{*}_{i}({\mathsf {e}}^{*})=[\frac{2}{\kappa _{i}}(1-{\mathsf {e}} ^{*})]^{1/2}<\lambda _{i}\mu _{i}({\mathsf {e}} ^{*})\) entails that the resulting first-order condition in (8)—i.e. \(\mu _{i}^{\prime }({\mathsf {e}})=0\)—has no solution because of the monotonicity of the manager’s revenue technology.

  28. Observe that if \(\lambda _{i}=0\) then \(\tilde{{\mathsf {e}}}_{2}(0)=\tilde{ {\mathsf {e}}}_{1}\), so a necessary condition for \(\tilde{{\mathsf {e}}}_{2}\) to be a potential maximum in this region is \(\tilde{{\mathsf {e}}}_{1}>1-\frac{ \kappa _{i}}{2}\).

  29. Among the conditions defining \(\tilde{{\mathsf {e}}}_{1}\) and \(\tilde{{\mathsf {e}}}_{2}(\lambda _{i})\) for any \(\lambda _{i}\), we had to choose the more restrictive one—namely, the marginal condition in (A.7)—to prevent the existence of a root that could intersect the \(\lambda _{i}\)-axis (see Fig. 3). Note, however, that both marginal conditions match if the manager is only productive with the incumbent’s nurture, \(v_{i}(0)=0\).

  30. This is because \(\tilde{{\mathsf {e}}}_{2}(\lambda)\) may cross the full-deprivation frontier for some honesty level \(\lambda \ge 1/\mu _{i}(\breve{{\mathsf {e}}})\); it may cross the no-working frontier at some honesty level \(\lambda \le 1/\mu _{i}(\breve{{\mathsf {e}}})\); or it may cross both and cause \(\tilde{{\mathsf {e}}}_{2}\) to disappear as a potential maximum, which greatly complicates the analysis.

  31. This assumption is a requirement that \(\tilde{{\mathsf {e}}}_{2}(\lambda _{i})\not =\tilde{{\mathsf {e}}}_{4}(\lambda _{i})\) for any \(\lambda _{i}<1/\mu _{i}(\tilde{{\mathsf {e}}}_{3})\) (i.e., the two brackets in (A.7) have no common root).

  32. Recall that if \(\phi _{i}=0\)—i.e. \(\lambda _{i}=0\)—then \(\tilde{{\mathsf {e}}}_{2}(0)=\tilde{{\mathsf {e}}}_{1}\). Also, after denoting \(F(\phi _{i},\tilde{{\mathsf {e}}}_{2})=\mu _{i}^{\prime }(\tilde{{\mathsf {e}}}_{2})[1-\phi _{i}\lambda _{i}\mu _{i}(\tilde{{\mathsf {e}}}_{2})]-1\), the Implicit Function Theorem allows us to find that \(\partial \tilde{{\mathsf {e}}}_{2}(\phi _{i})/\partial \phi _{i}<0\) due to the concavity of the manager’s revenue technology.


Acknowledgements

We are grateful for the useful insights of two referees of this journal. We also acknowledge comments from Massimo Baù, Marco Cucculelli, Katiuska Cabrera, Susana Menéndez, Mattias Nordqvist, Alberto Vaquero and the participants of the IFERA conference (Lancaster), the 10th IBEW Workshop (Palma de Mallorca), the Workshop “Empresa Familiar” (Ourense) and the XXXIV and XXXVII Simposio de la Asociación Española de Economía (SAEe).

Funding

The first author acknowledges financial support from the Spanish Ministry of Economy and Competitiveness, Projects ECO2013-48884-C3-1-P and DER2014-52549-C4-2-R; the second acknowledges financial support from Inditex and the Galician Association of Family Business (AGEF) through the Family Business Chair of the Universidade da Coruña.

Author information

Corresponding author

Correspondence to Eduardo L. Giménez.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical Approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix 1: Proof of Theorem 1

Proof of Theorem 1

Initially, let us assume that \(\kappa _{M}<2\). From (10) we find the following three relevant conditions (see Fig. 2):

$$\begin{aligned} \mu _{M}\lambda _{M}=1 \end{aligned}$$
(A.1)
$$\begin{aligned} \mu _{M}=\frac{{\varOmega }_{M}}{\upsilon _{F}}+\frac{\kappa _{M}}{2}(\rho +\beta) \end{aligned}$$
(A.2)
$$\begin{aligned} \lambda _{M}=\frac{\left[ \frac{2}{(\rho +\beta)\kappa _{M}}\left(\mu _{M}-\frac{{\varOmega }_{M}}{\upsilon _{F}}\right) \right] ^{1/2}}{\mu _{M}} \end{aligned}$$
(A.3)

Note that equations (A.1)—the full-deprivation frontier—and (A.3) intersect at \((1/{\widehat{\mu }}_{M},{\widehat{\mu }}_{M})\), where \({\widehat{\mu }}_{M}\) is the value found in (A.2).

The level of deprivation can take the following values: \(m^*_M=\min \{\mu _{M}\lambda _{M},1\}\). Consider first that \(\mu _{M}\lambda _{M}>1\), so \(m^*_M=1\) (i.e., \(s^*_M=\kappa _M/2<1\)); then, (10) is positive provided \(\mu _{M}>({\varOmega }_{M}/\upsilon _{F})+(\rho +\beta)(\kappa _{M}/2)\). Accordingly, the incumbent will implement full deprivation of benefits in the upper contour set of the full-deprivation frontier (A.1) and to the right of condition (A.2). Now, consider the case \(m_M^*=\mu _{M}\lambda _{M}<1\). Then, (10) is positive whenever \(\lambda _{M}\mu _{M}>[(2/\kappa _{M})(\mu _{M}-{\varOmega }_{M}/[(\rho +\beta)\upsilon _{F}])]^{1/2}\). Accordingly, the incumbent will implement partial monitoring in the region below conditions (A.1) and (A.3). In both cases, the incumbent still has time available to work at the firm, since \(s^*_M<T=1\).

Now assume that \(\kappa _{M}\ge 2\). From (10) the three relevant conditions turn out to be

$$\begin{aligned} \mu _{M}\lambda _{M}=\left(\frac{2}{\kappa _{M}}\right) ^{1/2} \end{aligned}$$
(A.4)
$$\begin{aligned} \mu _{M}=\frac{{\varOmega }_{M}}{\upsilon _{F}}+(\rho +\beta) \end{aligned}$$
(A.5)

and (A.3). Note that equations (A.4)—the full-monitoring frontier—and (A.3) intersect at \((1/{\widehat{\mu }}_{M},{\widehat{\mu }}_{M})\), where \({\widehat{\mu }}_{M}\) is the value found in (A.5).

The level of deprivation can take the following values: \(m^*_M=\min \{\mu _{M}\lambda _{M},(2/\kappa _{M})^{1/2}\}\). Consider first that \(\mu _{M}\lambda _{M}>(2/\kappa _{M})^{1/2}\), so \(m^*_{M}=(2/\kappa _{M})^{1/2}<1\); then, (10) is positive whenever \(\mu _{M}>({\varOmega }_{M}/\upsilon _{F})+(\rho +\beta)\). Accordingly, the incumbent will spend all her time monitoring, \(s_{M}=T=1\), in the region above condition (A.4) and to the right of condition (A.5). Now, consider the case \(m^*_M=\mu _{M}\lambda _{M}<(2/\kappa _{M})^{1/2}\). Then, (10) is positive whenever \(\lambda _{M}\mu _{M}>[(2/\kappa _{M})(\mu _{M}-{\varOmega }_{M}/[(\rho +\beta)\upsilon _{F}])]^{1/2}\). Accordingly, the incumbent will implement partial monitoring in the upper contour set of the full-monitoring frontier (A.4) and (A.3), with \(s^*_M<T=1\), and thus she has time available for working or outside-of-the-firm activities. This concludes the proof of Theorem 1. \(\square\)

Appendix 2: Necessary and Sufficient Conditions to Offer a Contract at Date 0

Lemma A.1

Necessary and sufficient conditions to offer a contract at date 0. Consider a potential manager with an outside utility \(\omega _{i}\) and a level of expropriation \(\phi _{i}\). Then,

  (i)

    A necessary condition for an incumbent to offer a non-negative wage at date 0 is \(\omega _{i}/\upsilon _{i}\in [0,1)\) (or analogously \(\upsilon _{i}\ge \omega _{i}\));

  (ii)

    A sufficient condition for an incumbent to offer a non-negative wage at date 0 is \(\omega _{i}/\upsilon _{i}\in [\phi _{i},1)\). If full deprivation is optimal (\(m_{i}^{*}=1\)), the condition in (i) becomes a sufficient condition.

Proof

The proof is simple. Given that the wage compensation has to offset the manager’s opportunity cost, \(\upsilon _{i}w_{i}^{*}\ge \omega _{i}\), and the wage rate cannot be greater than 1, (i) follows straightforwardly. Observe that the wage rate cannot be negative and that the incumbent can deprive resources from the manager’s appropriation in the range \(m\in [0,1]\). Then, it is easy to show from (7) that the condition in (i) is also sufficient in the case of full deprivation (\(m_{i}^{*}=1\)). Otherwise, if full deprivation is not optimal, the extreme case of no deprivation (\(m^{*}_i=0\)) sets a lower threshold for a non-negative wage rate, characterized in (7) by \(\omega _{i}\ge \upsilon _{i}\phi _{i}\). \(\square\)
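
As a purely numerical illustration of the Lemma (the value of \(\phi _{i}\) here is hypothetical), take \(\phi _{i}=0.3\): any outside option with \(\omega _{i}/\upsilon _{i}\in [0.3,1)\) guarantees a non-negative wage offer at date 0 regardless of the optimal deprivation level, whereas an outside option with \(\omega _{i}/\upsilon _{i}\in [0,0.3)\) guarantees it only when full deprivation (\(m_{i}^{*}=1\)) turns out to be optimal.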

Appendix 3: The Non-family Manager Deprivation and Monitoring Frontiers

Concerning the already-trained non-family manager’s honesty, it will be useful to identify types of managers in order to formally characterize the non-family-manager deprivation and monitoring frontiers.

Definition A.1

The (non-family manager) full-deprivation frontier. If \(\kappa _{M}<2\), those combinations \((\mu _{M},\lambda _{M})\) satisfying

$$\begin{aligned} \lambda _{M}\mu _{M}=1 \end{aligned}$$

delineate a frontier beyond which a non-family manager is fully deprived, i.e. \(m_{M}^{*}=1\) so \(s_{M}^{*}=\kappa _{M}/2\).

Definition A.2

The (non-family manager) full-monitoring frontier. If \(\kappa _{M}\ge 2\), those combinations \((\mu _{M},\lambda _{M})\) satisfying

$$\begin{aligned} \lambda _{M}\mu _{M}=(2/\kappa _{M})^{1/2} \end{aligned}$$

delineate a frontier beyond which a non-family manager is fully monitored, i.e. \(s_{M}^{*}=1\) so \(m_{M}^{*}=(2/\kappa _{M})^{1/2}\).
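
Both bounds can be traced back to the incumbent’s time constraint together with the deprivation technology implied by the expressions above and by the proof of Theorem 1 (on our reading, \(m_{M}(s_{M};\kappa _{M})=(2s_{M}/\kappa _{M})^{1/2}\)): full deprivation requires \(s_{M}=\kappa _{M}/2\), which is feasible (\(s_{M}\le 1\)) only when \(\kappa _{M}\le 2\), while devoting all the available time to monitoring yields at most

$$\begin{aligned} m_{M}(1;\kappa _{M})=\left(\frac{2}{\kappa _{M}}\right) ^{1/2}<1 \quad \text{ whenever } \kappa _{M}>2. \end{aligned}$$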

Appendix 4: Characterizing Potential Optimal Levels of Training

The optimal level of training \({\mathsf {e}}^{*}\) depends on particular parameter values and specific functional forms. In Table 3 we display the potential optimal levels of training for different regions of parameters, depicted in the relative-performance–honesty plane in Fig. 3 for a particular case. Specifically, we are able to identify potential maxima of the incumbent’s problem (8) after determining a key threshold in the training intensity, \(\breve{{\mathsf {e}}}\equiv 1-\frac{\kappa _{i}}{2}\) for any given value of \(\kappa _{i}\)—a threshold found at the maximum deprivation level (see the inside bracket in the optimal deprivation condition (6) for \(T=1\)). This threshold allows us to distinguish between two cases: full deprivation is feasible for the incumbent (case i) or it is not (case ii).

Table 3 Potential optimal levels of training, with \(\breve{{\mathsf {e}}}=1-\frac{\kappa _{i}}{2}\); \(\tilde{{\mathsf {e}}}_{1}\) a root of \(\mu _{i}^{\prime }({\mathsf {e}})-(\rho +\beta)\); \(\tilde{{\mathsf {e}}}_{2}(\lambda _{i})\) a root of \(\mu _{i}^{\prime }({\mathsf {e}})[1-\phi _{i}\lambda _{i}\mu _{i}({\mathsf {e}})]-(\rho +\beta)\); and \(\tilde{{\mathsf {e}}}_{4}(\lambda _{i})\) a root of \({\mathsf {e}}+\frac{\kappa _{i}}{2}[\lambda _{i}\mu _{i}({\mathsf {e}})]^{2}-1\)
Fig. 3 The full-deprivation frontier, the no-working frontier and the potential optimal levels of training in the relative-performance–honesty plane (i.e., the \(\mu _{i}\)–\(\lambda _{i}\) plane). The figure depicts, for a given \(\kappa _{i}<2\), the case in which the training process exhibits decreasing returns-to-scale and both Assumption A.2 and \(\tilde{{\mathsf {e}}}_{1}<\tilde{{\mathsf {e}}}_{3}\) are satisfied

Case i Full deprivation is feasible: \({\mathsf {e}} ^{*}\le \breve{{\mathsf {e}}}\equiv 1-\frac{\kappa _{i}}{2}\). We begin by considering that full deprivation in (6) is feasible, i.e. \(1\le [\frac{2}{\kappa _{i}}(1-{\mathsf {e}} ^{*})]^{1/2}\); that is, the (non-negative) optimal training level must satisfy \({\mathsf {e}} ^{*}\le 1-\frac{\kappa _{i}}{2}\). The region of training values satisfying this full deprivation condition is fully characterized by the full-deprivation frontier displayed in the following Definition (see this frontier at the \(\mu _{i}\)-\(\lambda _{i}\)—space in Fig. 3):

Definition A.3

The full-deprivation frontier. For each honesty parameter \(\lambda\) there exists a training intensity \(\overline{ {\mathsf {e}} }(\lambda)\) such that those combinations \((\mu _{i}\left(\overline{{\mathsf {e}} }(\lambda)\right) ,\lambda)\) satisfying

$$\begin{aligned} \lambda \mu _{i}\left(\overline{{\mathsf {e}} }(\lambda)\right) =1, \end{aligned}$$
(A.6)

delineate a frontier beyond which a manager is fully deprived, i.e., \(m_{i}^{*}=1\).

The full-deprivation frontier allows us to characterize potential optimal training levels when full deprivation is feasible and optimal (case i.i.) or is feasible and not optimal (case i.ii.).

Case i.i. Full deprivation is feasible (\({\mathsf {e}} ^{*}\le \breve{{\mathsf {e}}}\)) and optimal (\(m_{i}({\mathsf {e}}^{*})=1\)). If full deprivation is optimal for the incumbent, then \(\lambda _{i}\mu _{i}({\mathsf {e}} ^{*})\ge 1\) is satisfied in (6). This means that the value of the parameters results in a combination \((\mu _{i}({\mathsf {e}} ^{*}),\lambda _{i})\) located at the upper contour set of the full-deprivation frontier (A.6). In this case, the first order condition in (8) is

$$\begin{aligned} \left[ \mu _{i}^{\prime }({\mathsf {e}})-(\rho +\beta)\right] \left[ {\mathsf {e}}+\frac{\kappa _{i}}{2}-1\right] =0. \end{aligned}$$

Here, there are two potential optimal levels of training: the interior potential maximum \(\tilde{{\mathsf {e}}}_{1}\), a root of \(\mu _{i}^{\prime }({\mathsf {e}})-(\rho +\beta)\); and the corner no-working potential maximum \(\tilde{{\mathsf {e}}}_{3}=1-\frac{\kappa _{i}}{2}\). Observe that the former is a marginal condition stating that the incumbent stops training the manager at \(\tilde{{\mathsf {e}}}_{1}\) because the benefit derived from devoting one additional unit of time to training activities, \(\mu _{i}^{\prime }(\tilde{{\mathsf {e}}}_{1})\), equals the time and welfare cost of this additional unit of time, \(\rho +\beta\). Because of the time constraint, \(\tilde{{\mathsf {e}}}_{1}\) must satisfy \(\tilde{{\mathsf {e}}}_{1}\le 1-\frac{\kappa _{i}}{2}\equiv \tilde{{\mathsf {e}}}_{3}\) to be considered a potential maximum.
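
As a purely illustrative example (this functional form is not one assumed in the paper), suppose the relative revenue technology were \(\mu _{i}({\mathsf {e}})=\mu _{i}(0)+2a{\mathsf {e}}^{1/2}\) with \(a>0\). Then the marginal condition defining \(\tilde{{\mathsf {e}}}_{1}\) reads

$$\begin{aligned} \mu _{i}^{\prime }({\mathsf {e}})=\frac{a}{{\mathsf {e}}^{1/2}}=\rho +\beta \quad \Longrightarrow \quad \tilde{{\mathsf {e}}}_{1}=\left(\frac{a}{\rho +\beta }\right) ^{2}, \end{aligned}$$

which is a valid potential maximum in this region only if \((a/(\rho +\beta))^{2}\le 1-\frac{\kappa _{i}}{2}\equiv \tilde{{\mathsf {e}}}_{3}\).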

Case i.ii Full deprivation is feasible (\({\mathsf {e}} ^{*}\le \breve{{\mathsf {e}}}\)), but not optimal (\(m_{i}({\mathsf {e}} ^{*})<1\)). If full deprivation is feasible but not optimal for the incumbent, then \(m^{*}_{i}({\mathsf {e}} ^{*})=\lambda _{i}\mu _{i}({\mathsf {e}} ^{*})<1\) must be satisfied in (6). The value of the parameters results in a combination \((\mu _{i}({\mathsf {e}} ^{*}),\lambda _{i})\) located below the full-deprivation frontier (A.6), and the first order condition in (8) becomes

$$\begin{aligned} \left[ \mu _{i}^{\prime }({\mathsf {e}})[1-\phi _{i}\lambda _{i}\mu _{i}({\mathsf {e}})]-(\rho +\beta)\right] \left[ {\mathsf {e}}+\frac{\kappa _{i}}{2}[\lambda _{i}\mu _{i}({\mathsf {e}})]^{2}-1\right] =0. \end{aligned}$$
(A.7)

Again, there are two potential optimal levels of training: the interior potential maximum \(\tilde{{\mathsf {e}}}_{2}(\lambda _{i})\), a root of the marginal condition \(\mu _{i}^{\prime }({\mathsf {e}})[1-\phi _{i}\lambda _{i}\mu _{i}({\mathsf {e}})]-(\rho +\beta)\) for any given \(\lambda _{i}\ge 0\); and the corner no-working potential maximum \(\tilde{{\mathsf {e}}}_{4}(\lambda _{i})\), a root of \({\mathsf {e}}+\frac{\kappa _{i}}{2}[\lambda _{i}\mu _{i}({\mathsf {e}})]^{2}-1\) for any given \(\lambda _{i}\ge 0\). The training level \(\tilde{{\mathsf {e}}}_{2}\) must satisfy the following three conditions to be considered a potential maximum for any given \(\lambda _{i}\): \(\tilde{{\mathsf {e}}}_{2}(\lambda _{i})<\tilde{{\mathsf {e}}}_{4}(\lambda _{i})\)—because of the time constraint; \(\tilde{{\mathsf {e}}}_{2}(\lambda _{i})\le 1-\frac{\kappa _{i}}{2}\equiv \tilde{{\mathsf {e}}}_{3}\)—because of the full-deprivation condition; and \(\lambda _{i}\mu _{i}(\tilde{{\mathsf {e}}}_{2})<1\)—since full deprivation cannot be optimal at \(\tilde{{\mathsf {e}}}_{2}\). (Note that if \(\lambda _{i}=0\) then \(\tilde{{\mathsf {e}}}_{2}(0)=\tilde{{\mathsf {e}}}_{1}\).) Observe, however, that the root \(\tilde{{\mathsf {e}}}_{4}(\lambda _{i})\) does not satisfy the full-deprivation condition for any \(\lambda _{i}\), since \(\tilde{{\mathsf {e}}}_{4}(\lambda _{i})>1-\frac{\kappa _{i}}{2}\equiv \tilde{{\mathsf {e}}}_{3}\)—because of \(\lambda _{i}\mu _{i}(\tilde{{\mathsf {e}}}_{4})<1\)—and accordingly this root cannot be considered a potential maximum within this region of parameters.
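
When the revenue technology is concave (the decreasing returns-to-scale case analyzed in D.1.2 below), the marginal condition defining \(\tilde{{\mathsf {e}}}_{2}\) also makes the ordering of the two interior candidates transparent; a one-line comparison, using only the concavity of \(\mu _{i}\) and the fact that \(1-\phi _{i}\lambda _{i}\mu _{i}(\cdot)\le 1\), gives

$$\begin{aligned} \mu _{i}^{\prime }(\tilde{{\mathsf {e}}}_{2})=\frac{\rho +\beta }{1-\phi _{i}\lambda _{i}\mu _{i}(\tilde{{\mathsf {e}}}_{2})}\ge \rho +\beta =\mu _{i}^{\prime }(\tilde{{\mathsf {e}}}_{1}) \quad \Longrightarrow \quad \tilde{{\mathsf {e}}}_{2}(\lambda _{i})\le \tilde{{\mathsf {e}}}_{1}, \end{aligned}$$

with equality at \(\lambda _{i}=0\), because \(\mu _{i}^{\prime }\) is decreasing under concavity.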

Case ii Full deprivation is not feasible (\({\mathsf {e}}^{*}>\breve{{\mathsf {e}}}\equiv 1-\frac{\kappa _{i}}{2}\) and \(m_{i}({\mathsf {e}}^{*})<1\)). The alternative case is the one in which full deprivation is not feasible; that is, the case in which the optimal training level must satisfy \({\mathsf {e}}^{*}>1-\frac{\kappa _{i}}{2}\) and, then, the value of the parameters results in a combination \((\mu _{i}({\mathsf {e}}^{*}),\lambda _{i})\) located below the full-deprivation frontier (A.6), i.e. \(\lambda _{i}\mu _{i}({\mathsf {e}}^{*})<1\). Here, the optimal deprivation can only be \(m_{i}^{*}=\lambda _{i}\mu _{i}({\mathsf {e}}^{*})<[\frac{2}{\kappa _{i}}(1-{\mathsf {e}}^{*})]^{1/2}\) (see note 27). For each given \(\lambda _{i}\ge 0\), the first-order condition (A.7) provides us with two potential optimal levels of training: the interior potential maximum \(\tilde{{\mathsf {e}}}_{2}(\lambda _{i})\); and the corner no-working potential maximum \(\tilde{{\mathsf {e}}}_{4}(\lambda _{i})\). Analogously to case i.ii, for any given \(\lambda _{i}\), the training level \(\tilde{{\mathsf {e}}}_{2}\) must satisfy the following three conditions to be considered a potential maximum: \(\tilde{{\mathsf {e}}}_{2}<\tilde{{\mathsf {e}}}_{4}\), \(\tilde{{\mathsf {e}}}_{2}>1-\frac{\kappa _{i}}{2}\) and \(\lambda _{i}\mu _{i}(\tilde{{\mathsf {e}}}_{2})<1\) (see note 28). Observe that the root \(\tilde{{\mathsf {e}}}_{4}(\lambda _{i})\) can now be considered a potential maximum, since it does not satisfy the full-deprivation condition, \(\tilde{{\mathsf {e}}}_{4}(\lambda _{i})>1-\frac{\kappa _{i}}{2}\).

What remains to be shown is that \(\tilde{{\mathsf {e}}}_{4}(\lambda _{i})\) always lies below the full-deprivation frontier (A.6) for any \(\lambda _{i}\). To prove this, we first characterize the following no-working frontier (see this frontier in Fig. 3).

Definition A.4

The no-working frontier. For each \(\lambda\) there exists a \(\overline{\overline{{\mathsf {e}} }}(\lambda)\) such that those combinations \((\mu _{i}(\overline{\overline{{\mathsf {e}} }}(\lambda)),\lambda)\) satisfy \(s(\overline{\overline{{\mathsf {e}} }}(\lambda))+\overline{ \overline{{\mathsf {e}} }}(\lambda)=1\); that is,

$$\begin{aligned} \frac{\kappa _{i}}{2}\left[ \lambda \mu _{i}\left(\overline{\overline{ {\mathsf {e}} }}(\lambda)\right) \right] ^{2}+\overline{\overline{{\mathsf {e}} }} (\lambda)=1, \end{aligned}$$
(A.8)

delineates a frontier beyond which the incumbent only monitors and trains the manager, but does not work.

Observe that whenever the manager is fully honest, \(\lambda _{i}=0\), the no-working frontier (A.8) intercepts the \(\mu _{i}\)-axis at \(\mu _{i}(1)\). In this case, the incumbent only performs training activities, \(\overline{\overline{{\mathsf {e}}}}(0)=1\). Next, we can state the following result characterizing the functional relationships (A.6) and (A.8) (see also Fig. 3), which guarantees that the incumbent never fully deprives her manager when the chosen level of training is \(\tilde{{\mathsf {e}}}_{4}\).

Lemma A.2

Characterizing the full-deprivation frontier and the no-working frontier. The functional relationships defined in conditions (A.6) and (A.8) at the \(\lambda _{i}\)-\(\mu _{i}\)-plane have a negative slope, the former is steeper, and both intersect only once at the training intensity \(\breve{{\mathsf {e}}}=1-\frac{\kappa _{i}}{ 2}\).

Proof

Initially, note that by substituting the right-hand-side term of Condition (A.6) into (A.8), it is easy to find that \(\breve{{\mathsf {e}}}=1-\frac{\kappa _{i}}{2}\) is an intersection. Thus, it only remains to verify the negativity of the slopes of conditions (A.6) and (A.8), to evaluate both slopes at \({\mathsf {e}}=1-\kappa _{i}/2\), and to find that the latter is steeper than the former. For any given level of training, condition (A.6) becomes an equilateral hyperbola, \(\lambda _{\text{(A.6) }}(\mu _{i})=1/\mu _{i}\), with slope \(-1/\mu _{i}^2\). Condition (A.8) becomes the function

$$\begin{aligned} \lambda _{\text{(A.8) }}(\mu _{i})=\frac{1}{\mu _{i}}\left[ \left(1-\mu ^{-1}(\mu _{i})\right) \frac{2}{\kappa _{i}}\right] ^{1/2} \end{aligned}$$
(A.9)

after defining the inverse function \(\mu ^{-1}(\mu _{i}({\mathsf {e}}))={\mathsf {e}}\), whose derivative with respect to \({\mathsf {e}}\) is \(\mu ^{-1\prime }(\mu _{i}({\mathsf {e}}))=1/\mu _{i}^{\prime }({\mathsf {e}})\) by the Chain Rule. Differentiating (A.9) with respect to \(\mu _{i}\) yields

$$\begin{aligned} \lambda _{\text{(A.8) }}^\prime (\mu _{i})=-\frac{1}{\mu _{i}^2}\left[ \frac{1}{\kappa _{i}}\left[ (1-{\mathsf {e}})\frac{2}{\kappa _{i}}\right] ^{-1/2} \frac{\mu _{i}}{\mu _{i}^\prime }+\left[ (1-{\mathsf {e}})\frac{2}{\kappa _{i}}\right] ^{1/2}\right] . \end{aligned}$$

The slope at \(\breve{{\mathsf {e}}}=1-\frac{\kappa _{i}}{2}\) turns out to be \(\lambda _{\text{(A.8) }}^\prime (\mu _{i}(\breve{{\mathsf {e}}}))\le -\frac{1}{(\mu _{i}(\breve{{\mathsf {e}}}))^2}= \lambda _{\text{(A.6) }}^\prime (\mu _{i}(\breve{{\mathsf {e}}}))\). Then, \(\lambda _{\text{(A.8) }}(\mu _{i}({\mathsf {e}}))>\lambda _{\text{(A.6) }}(\mu _{i}({\mathsf {e}}))\) is satisfied for any \({\mathsf {e}}<\breve{{\mathsf {e}}}\), and vice versa for \({\mathsf {e}}>\breve{{\mathsf {e}}}\), which entails that (A.6) and (A.8) intersect only once. This concludes the proof of Lemma A.2. \(\square\)

Case iii Corner solutions for the level of training. Finally, the time constraint provides us with two additional corner potential maxima: the full-training potential maximum \(\tilde{{\mathsf {e}}}_{5}=1\) and the no-training potential maximum \(\tilde{{\mathsf {e}}}_{6}=0\). The former entails that no time for monitoring or working activities is available to the incumbent—i.e., \(m^{*}(\tilde{{\mathsf {e}}}_{5})=0\) and \(n^{*}(\tilde{{\mathsf {e}}}_{5})=0\)—and \(\tilde{{\mathsf {e}}}_{5}=1\) can be considered a potential maximum provided the incumbent offers the manager a contract at date 0 with a non-negative wage rate \(w^{*}\) in (7), i.e. \(\frac{\omega _{i}}{\upsilon _{F}\kappa _{i}(\rho +\beta)}\ge \lambda _{i}\mu _{i}(\tilde{{\mathsf {e}}}_{5})\) (see Lemma A.1.ii). The latter, \(\tilde{{\mathsf {e}}}_{6}=0\), is the case in which the manager is hired because of his own abilities alone. Yet, we consider the incumbent to be prone to devoting time to the successor. Precluding the no-training potential maximum (\(\tilde{{\mathsf {e}}}_{6}=0\)) from being optimal depends on the value of \(\kappa _{i}\): if \(\kappa _{i}<2\)—the case depicted in Fig. 3—it must be required that neither \(\tilde{{\mathsf {e}}}_{1}\) nor \(\tilde{{\mathsf {e}}}_{2}(\lambda _{i})\), for any \(\lambda _{i}\), can take zero as an optimal value; if \(\kappa _{i}>2\)—the area to the right of \(\mu _{i}(\tilde{{\mathsf {e}}}_{3})\) in Fig. 3—it must be required that \(\tilde{{\mathsf {e}}}_{4}(\lambda _{i}^{max})>0\), with \(\lambda _{i}^{max}\equiv 1/[\kappa _{i}(\rho +\beta)]\). To this end, we state the following assumption (see note 29):

Assumption A.1

\(\mu _{i}^{\prime }(0)\left[ 1-\phi _{i}\lambda _{i}\mu _{i}(0)\right] >\rho +\beta\) for \(\kappa _{i}<2\); and, \(\frac{\kappa _{i}}{2}[\lambda ^{max}\mu _{i}(0)]^{2}<1\) for \(\kappa _{i}>2\) , with \(\lambda _{i}^{max}\equiv 1/[\kappa _{i}(\rho +\beta)]\).

Observe that, for any set of parameters, all potential maxima are fully identified except \(\tilde{{\mathsf {e}}}_{2}\) (see note 30). To guarantee that \(\tilde{{\mathsf {e}}}_{2}(\lambda _{i})\) can always be considered a candidate for any \(\lambda _{i}\), we present the following Assumption A.2, which states that the function \(\tilde{{\mathsf {e}}}_{2}(\lambda)\) crosses neither the full-deprivation frontier (Assumption A.2.1) nor the no-working frontier (Assumption A.2.2; see note 31).

Assumption A.2

A.2.1. There exists no \(\lambda \le 1/[\kappa _{i}(\rho +\beta)]\equiv \lambda _{i}^{max}\) such that \(\mu _{i}^{\prime }(\tilde{{\mathsf {e}}}_{2}(\lambda))(1-\phi _{i})=\rho +\beta\).

A.2.2. \(\mu _{i}^{\prime }(\tilde{{\mathsf {e}}} _{2}(\lambda))(1-\phi _{i})>\rho +\beta\) is satisfied for any \(\lambda \le 1/\mu _{i}(\breve{{\mathsf {e}}})\).

All the preceding analysis and interpretation have been developed for a given \(\kappa _{i}\). It is worth noting that if \(\kappa _{i}=0\), then the potential optimal levels of training are reduced to \(\tilde{{\mathsf {e}}}_{1}\) and \(\tilde{{\mathsf {e}}}_{5}=1\); while if \(\kappa _{i}>2\), then the potential optimal levels of training are restricted to \(\tilde{{\mathsf {e}}}_{5}=1\), and \(\tilde{{\mathsf {e}}}_{2}(\lambda)\) and \(\tilde{{\mathsf {e}}}_{4}(\lambda)\) for \(\lambda <\lambda _{0}\), with \(\lambda _{0}\) satisfying \(\frac{\kappa _{i}}{2}[\lambda _{0}\mu _{i}(0)]^{2}=1\) (i.e., \(\lambda _{0}\) is the level of the manager’s honesty such that \(\tilde{{\mathsf {e}}}_{4}(\lambda _{0})=0\)).

D.1 Optimal Training Decision and the Effectiveness of the Training Process

The incumbent’s optimal level of training eventually chosen (\({\mathsf {e}} ^{*}\)) depends on the particular values of the parameters that fulfill the corresponding restrictions (namely, the positive-wage, the full-monitoring and the no-working conditions). Among a myriad of cases, in this subsection we characterize the optimal training for different profiles of the effectiveness of the training process, represented by the increasing or decreasing returns-to-scale of the manager’s relative revenue technology (\(\mu _{i}({\mathsf {e}})\equiv v_{i}({\mathsf {e}})/\upsilon _{F}\)).

D.1.1 Increasingly Effective Training Process: \(\mu _{i}({\mathsf {e}})\) is Convex

If the training activities increasingly contribute to the revenue technology, it is intuitive to expect that the incumbent is prone to nurture the manager the most (i.e. \({\mathsf {e}}=1\)) and not to work for the firm. However, this need not be the case, since the manager might require some monitoring intensity if he is not honest enough. The less honest the manager is—i.e., the higher \(\lambda _{i}\)—the more time the incumbent has to devote to monitoring activities. All these intuitions are easy to characterize, as shown by the following result.

Proposition A.1

Suppose Assumption A.1 is satisfied. If the training process is increasingly effective, then the incumbent finds it optimal not to work (i.e., \(n^{*}=0\)) but to train and monitor her manager, with

$$\begin{aligned} {\mathsf {e}} ^{*}=\left\{ \begin{array}{ll} \tilde{{\mathsf {e}}}_{3}=1-\frac{\kappa _{i}}{2} &{} \text{ if } \lambda _{i}>\max \left\{ \frac{\omega _{i}}{\upsilon _{F}\kappa _{i}(\rho +\beta)}/\mu _{i}(1);1/\mu _{i}\left(1-\frac{\kappa _{i}}{2}\right) \right\} \\ \tilde{{\mathsf {e}}}_{4}(\lambda _{i}) &{} \text{ if } \lambda _{i}\in \left(\frac{\omega _{i}}{\upsilon _{F}\kappa _{i}(\rho +\beta)}/\mu _{i}(1),\max \left\{ \frac{\omega _{i}}{\upsilon _{F}\kappa _{i}(\rho +\beta)}/\mu _{i}(1);1/\mu _{i}\left(1-\frac{\kappa _{i}}{2}\right) \right\} \right] \\ \tilde{{\mathsf {e}}}_{5}=1 &{} \text{ if } \lambda _{i}\le \frac{\omega _{i}}{\upsilon _{F}\kappa _{i}(\rho +\beta)}/\mu _{i}(1) \end{array} \right. \end{aligned}$$

and \(s_{i}^{*}=1-{\mathsf {e}} ^{*}\in [0,\frac{\kappa _{i}}{2}]\).

The proof is straightforward: \(\tilde{{\mathsf {e}}}_{1}\) and \(\tilde{{\mathsf {e}}}_{2}\) are local minima—because of the convexity of the manager’s revenue technology—and \(\tilde{{\mathsf {e}}}_{3}<\tilde{{\mathsf {e}}}_{4}<\tilde{{\mathsf {e}}}_{5}=1\) implies \(V^{i}(1)=v_{i}(1)-\beta \upsilon _{F}-\omega _{i}+\gamma _{i}B>V^{i}(\tilde{{\mathsf {e}}}_{4})>V^{i}(\tilde{{\mathsf {e}}}_{3})\) in (8)—because of the monotonicity of the revenue technology.
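
To fix ideas, consider a purely hypothetical parameterization (none of these numbers is taken from the paper): \(\mu _{i}({\mathsf {e}})=1+{\mathsf {e}}+{\mathsf {e}}^{2}\) (convex), \(\kappa _{i}=1\), \(\rho +\beta =1/2\) and \(\omega _{i}/\upsilon _{F}=1/2\), assuming Assumption A.1 and the remaining restrictions hold for these values. Then \(\frac{\omega _{i}}{\upsilon _{F}\kappa _{i}(\rho +\beta)}/\mu _{i}(1)=1/3\) and \(1/\mu _{i}(1-\frac{\kappa _{i}}{2})=4/7\), so the incumbent fully trains the manager (\({\mathsf {e}}^{*}=1\)) if \(\lambda _{i}\le 1/3\), chooses \({\mathsf {e}}^{*}=\tilde{{\mathsf {e}}}_{4}(\lambda _{i})\) if \(\lambda _{i}\in (1/3,4/7]\), and stops training at \({\mathsf {e}}^{*}=\tilde{{\mathsf {e}}}_{3}=1/2\) for less honest managers (\(\lambda _{i}>4/7\)).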

D.1.2 Decreasingly Effective Training Process: \(\mu _{i}({\mathsf {e}})\) is Concave

The optimal level of training in the case of a harsh training process is much more difficult to characterize and, unlike in the case of increasing returns-to-scale, any potential maximum can now be an optimal level of training depending on the value of the parameters. The decreasing returns-to-scale of the revenue technology imply that, as the incumbent devotes more time to nurturing her heir, the opportunity cost of every additional unit of time—in terms of the incumbent’s productive revenue—increases more than proportionally. So eventually, the incumbent may find it optimal to stop training the successor and carry out other tasks in the firm instead. Notice that the manager’s honesty profile turns out to be crucial: the less honest the manager is—i.e. the higher \(\lambda _{i}\)—the sooner the incumbent finds it beneficial to stop training the manager.

Here, we can identify two extreme cases in light of Fig. 3. If the opportunity cost of training the manager remains low even for high \({\mathsf {e}}\), then full training—i.e. \({\mathsf {e}}^{*}=1\)—could be the case for a (relatively) honest heir. Alternatively, if the opportunity cost increases quickly and the manager is not honest, then the heir optimally receives a minimum level of training to become productive—i.e., \({\mathsf {e}}^{*}=\widetilde{{\mathsf {e}}}_{2}(\lambda _{i})\) (see Fig. 3)—and the incumbent finds it optimal to partially retire (that is, to keep devoting time to working at the firm together with the successor). These two extreme cases are presented in the following result. Any other possible optimal training falls between these two extremes.

Proposition A.2

Suppose Assumptions A.1 and A.2 are satisfied and that the training process exhibits decreasing returns-to-scale. Then the following holds:

(i) If \(\tilde{{\mathsf {e}}}_{1}<1\), then the incumbent finds it optimal to choose a level of training \({\mathsf {e}}^{*}=\tilde{{\mathsf {e}}}_{2}(\lambda _{i})\) for each \(\lambda _{i}\le 1/\mu _{i}(\breve{{\mathsf {e}}})\), a monitoring intensity \(s_{i}^{*}=\frac{\kappa _{i}}{2}\left[ \lambda _{i}\mu _{i}(\tilde{{\mathsf {e}}}_{2}(\lambda _{i}))\right] ^{2}\), and to work at the firm for \(n_{i}^{*}=1-s_{i}^{*}-{\mathsf {e}}^{*}>0\) units of time.

(ii) If \(\tilde{{\mathsf {e}}}_{1}>1\), then the incumbent’s optimal level of training \({\mathsf {e}} ^{*}\) is the same as in Proposition A.1.

Proof

To prove the Proposition, we proceed by steps.

Step 1. Initially, we rank the potential maxima considering Assumption A.2. See Table 3 and Fig. 3, which display the potential maxima for the optimal level of training in the \(\mu _{i}\)-\(\lambda _{i}\)-plane. Observe that the function \(\tilde{{\mathsf {e}}}_{2}(\lambda _{i})\) is decreasing (see note 32). In addition, Assumption A.2 guarantees that the function \(\tilde{{\mathsf {e}}}_2(\lambda _{i})\) satisfying \(\lambda _{i}\mu _{i}(\tilde{{\mathsf {e}}}_2(\lambda _{i}))<1\) intersects neither the full-deprivation frontier (A.6) nor the no-working frontier (A.8). The ranking of the potential optima is the following:

  (a)

    If \(\tilde{{\mathsf {e}}}_{1}\le 1-\frac{\kappa _{i}}{2}\), then \(\tilde{{\mathsf {e}}}_{2}(\lambda _{i})<\breve{{\mathsf {e}}}<\tilde{{\mathsf {e}}}_{4}(\lambda _{i})< \tilde{{\mathsf {e}}}_{5}=1\) is satisfied for any \(\lambda _{i}\le 1/\mu _{i}(\breve{{\mathsf {e}}})\);

  (b)

    if \(\tilde{{\mathsf {e}}}_{1}\in \left(1-\frac{\kappa _{i}}{2},1\right)\), then \(\tilde{{\mathsf {e}}}_{2}(\lambda _{i})<\tilde{{\mathsf {e}}}_{4}(\lambda _{i})< \tilde{{\mathsf {e}}}_{5}=1\) is satisfied for any \(\lambda _{i}\le 1/\mu _{i}(\breve{{\mathsf {e}}})\); and,

  (c)

    if \(\tilde{{\mathsf {e}}}_{1}>1\), then \(\breve{{\mathsf {e}}}<\tilde{{\mathsf {e}}}_{4}(\lambda _{i})< \tilde{{\mathsf {e}}}_{5}=1\) is satisfied for any \(\lambda _{i}\le 1/\mu _{i}(\breve{{\mathsf {e}}})\).

Step 2. Next, we present a partial result: due to the concavity of the incumbent’s welfare (8) for \(m^*_{i}({\mathsf {e}})=\lambda _{i}\mu _{i}({\mathsf {e}})\), optimality allows us to state that \(E[V^{i}(\tilde{{\mathsf {e}}}_{2}(\lambda _{i}))]>E[V^{i}(\tilde{{\mathsf {e}}}_{4}(\lambda _{i}))]\) is satisfied for any given \(\lambda _{i}\le 1/\mu _{i}(\breve{{\mathsf {e}}})\).

Step 3. Proof of (i). Recall that the potential maxima for optimal training in the interval \(\lambda _{i}\le 1/\mu _{i}(\breve{{\mathsf {e}}})\) are \(\tilde{{\mathsf {e}}}_2(\lambda _{i})\) and \(\tilde{{\mathsf {e}}}_5\); in particular, this holds for \(\lambda _{i}=0\). Substituting \(\tilde{{\mathsf {e}}}_2(0)\) and \(\tilde{{\mathsf {e}}}_5\) into (8), and owing to the concavity of the manager’s revenue technology, we obtain that \(E(V^i(\tilde{{\mathsf {e}}}_2(0)))>E(V^i(\tilde{{\mathsf {e}}}_5))\). Since the function \(\tilde{{\mathsf {e}}}_2(\lambda _{i})\) is decreasing, it can be the case that \(E[V^i(\tilde{{\mathsf {e}}}_5)]>E[V^i(\tilde{{\mathsf {e}}}_2(\lambda _{i}))]\) for some \(\lambda _{i}>0\). If so, Bolzano’s Theorem entails that there exists \({\hat{\lambda }}_{i}>0\) such that \(V^i(\tilde{{\mathsf {e}}}_{5})=V^i(\tilde{{\mathsf {e}}}_{2}({\hat{\lambda }}_{i}))\). This proves Proposition A.2.(i).

Step 4. Proof of (ii). From \(\tilde{{\mathsf {e}}}_1\ge 1\) and Assumption A.2.2, the set of potential maxima is restricted to \(\tilde{{\mathsf {e}}}_3\), \(\tilde{{\mathsf {e}}}_4(\lambda _{i})\) and \(\tilde{{\mathsf {e}}}_5\), so Proposition A.1 applies. This proves Proposition A.2.(ii) and concludes the proof of Proposition A.2. \(\square\)
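
To connect Proposition A.2 with the two extreme cases described before its statement, one can revisit the purely illustrative concave specification used in Case i.i (\(\mu _{i}({\mathsf {e}})=\mu _{i}(0)+2a{\mathsf {e}}^{1/2}\), not a form assumed by the paper), for which \(\tilde{{\mathsf {e}}}_{1}=[a/(\rho +\beta)]^{2}\). Then

$$\begin{aligned} a>\rho +\beta \;\Longrightarrow \;\tilde{{\mathsf {e}}}_{1}>1 \quad \text{ and } \quad a<\rho +\beta \;\Longrightarrow \;\tilde{{\mathsf {e}}}_{1}<1, \end{aligned}$$

so a sufficiently productive training technology (high \(a\)) places the incumbent in part (ii), where the characterization of Proposition A.1 applies and full training arises for sufficiently honest heirs, whereas a quickly exhausted one (low \(a\)) places her in part (i), where the heir receives \(\tilde{{\mathsf {e}}}_{2}(\lambda _{i})\) and the incumbent partially retires.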

About this article

Cite this article

Giménez, E.L., Novo, J.A. A Theory of Succession in Family Firms. J Fam Econ Iss 41, 96–120 (2020). https://doi.org/10.1007/s10834-019-09646-y
