Background

“Cometh the moment cometh the (wo)man” (Anon)

Academics rarely make good managers of high-tech businesses, and even when they do, their usefulness to the company is ephemeral and depends very much on the stage of development the business has reached. For example, a manager who is useful at start-up in product development may have skills that become redundant when full-scale production and marketing are required (Hellmann and Puri 2002; Wright et al. 2005; Wright and Lockett 2005; Wright et al. 2007). Very few first-time entrepreneurs (owner-managers) last the course from inception to maturity: in the first 7 to 8 years of the business’s life, a high proportion are replaced by professional managers (with extensive previous management experience), often at the behest of the venture capitalist (VC) or other financier (Baron et al. 2001; Hellmann 1998; Hellmann and Puri 2002) [a]. Clearly the replacement decision is an important one both for the entrepreneur and for the VC whose investment is tied up in the company. The VC needs someone who is good at managing people (optimising individual performance) and who is in touch with the market, the technology and the competition (Hellmann and Puri 2002). All these things will influence the business’s performance and ultimately the VC’s returns. However, when a start-up is run by inexperienced individuals (e.g. academics spinning out from a university science department or other technically oriented entrepreneurs with little management experience; see Wright et al. 2005, 2007), their quality as executives is (at least initially) unknown. The VC will learn about this quality over time through more or less frequent contact with the new company in the form of monitoring and advice (Cumming and Johan 2007). At some point the manager’s ability is sufficiently well known for the VC to decide whether to replace him with a professional manager.
This process and its criteria have an inherent economic logic, as we shall shortly see, but the theoretical literature provides little guidance on the matter. Hence the current paper.

In this paper we model the VC learning process and the replacement decision in a Bayesian dynamic programming framework [b]. Briefly, a venture capitalist monitors a start-up run by a manager of unknown quality over a finite horizon. The problem is when to replace him should he underperform. The VC knows that the manager’s unobservable quality affects the likelihood of an increment to firm value next period, which will ultimately enhance the VC’s return. The VC can, however, observe at cost $c$ an informative signal $S_t$ (e.g. ‘people skills’) [c] of the manager’s ability in period $t$. This enables her to update her prior on the manager’s ability and on the expected profits from retaining him for one more period rather than replacing him with a professional manager. The latter yields a known present discounted value to the VC of $\Pi_m$. In each period we derive an optimal cutoff $S_t^*$ for the signal, which yields a rule showing when to replace the manager. The chances of adding to firm value in any period are predicted to be positively related to past managerial performance (the mean value of the signal). The probability of manager replacement is thus lower for managers with good track records ($S_1$). We find that it is also lower for managers with higher incremental values ($\pi_3(\gamma_2)$) and for lower VC discount rates ($r$). Finally, it is higher the higher the return to professional replacement ($\Pi_m$), the cost of investment ($I_2$) and the costs of monitoring manager performance ($c$).

Basics

A VC does not know the quality of the manager she employs in her investee company. However, she has a prior distribution on manager quality and judges that the manager is of Good or Bad quality with probability $p(G)$ and $p(B) = 1 - p(G)$ respectively. Only if the manager is Good is the return to the firm’s project in a period positive. The value of the project at $t$, if successful, $\pi_t$, will in general itself depend on the manager’s track record, consisting of a set of observable past signals $S_\tau$, $\tau = 1, 2, \ldots, t-1$. Thus we write $\pi_t = \pi(S_1, S_2, \ldots, S_{t-1})$ (see Figure 1).

Figure 1. The payoff function.

Thus the expected value of the project conditional on the information set to date is positive if and only if the manager’s quality is Good.

The VC learning process

Consider first a discrete quality two-period model. We begin by showing that under the Monotone Likelihood Ratio Property (MLRP; Milgrom 1981) the posterior probability of the manager incrementing firm value is increasing in his track record, defined as his period 1 signal value, $S_1$. In the next section we develop the optimal value function in terms of these posterior probabilities and the optimal cutoffs associated with them.

The VC can observe a costly signal of the manager’s quality which is either High (H) or Low (L). She thus starts off with a prior on the manager’s quality and then updates this estimate as monitoring occurs. The probability that a manager is Good given a signal $S_1$ is, by Bayes’ rule:

$$
p(G \mid S_1) = \frac{p(S_1 \mid G)\, p(G)}{p(S_1 \mid G)\, p(G) + p(S_1 \mid B)\, p(B)} \tag{1}
$$

where $S_1 \in \{H, L\}$. This can be rewritten as

$$
p(G \mid S_1) = \frac{1}{1 + \dfrac{p(S_1 \mid B)\, p(B)}{p(S_1 \mid G)\, p(G)}} \tag{2}
$$

showing more explicitly the dependence of the posterior on the likelihood ratio

$$
\frac{p(S_1 \mid B)\, p(B)}{p(S_1 \mid G)\, p(G)} \tag{3}
$$

We shall without loss of generality assume in what follows that $p(B) = p(G) = 1/2$ and that the ratio (3) satisfies the following inequalities:

Assumption 1:

$$
\frac{p(H \mid B)}{p(H \mid G)} < 1 < \frac{p(L \mid B)}{p(L \mid G)} \tag{4}
$$

This is equivalent to assuming that the likelihood ratio $p(S \mid G)/p(S \mid B)$ is increasing in the signal $S$, or that the distribution function for quality $Q$ conditional on $H$ first order stochastically dominates that of $Q$ conditional on $L$ (see Milgrom (1981) for details [d]). We define for future use the terms $x$ and $y$:

Definition 1:

$$
x \equiv \frac{p(H \mid B)}{p(H \mid G)}, \qquad y \equiv \frac{p(L \mid B)}{p(L \mid G)} \tag{5}
$$

Using Assumption 1 we can conclude that $x < 1$, $y > 1$ and therefore that

$$
p(G \mid L) < \frac{1}{2} < p(G \mid H) \tag{6}
$$

Thus the probability of success (of the manager being Good), given the signal, is increasing in the value of the signal $S_1$ (management performance).
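As a quick numerical check of (1), (5) and (6), the following minimal Python sketch computes the one-signal posteriors. The likelihood values 0.7 and 0.4 and the function name `posterior_good` are illustrative assumptions, not values taken from the paper.

```python
def posterior_good(p_s_given_g, p_s_given_b, prior_g=0.5):
    """Posterior p(G | S) from Bayes' rule, as in Equation (1)."""
    prior_b = 1.0 - prior_g
    numerator = p_s_given_g * prior_g
    return numerator / (numerator + p_s_given_b * prior_b)

# Hypothetical likelihoods: a Good manager emits H 70% of the time, a Bad one 40%.
P_H_G, P_H_B = 0.7, 0.4
P_L_G, P_L_B = 1.0 - P_H_G, 1.0 - P_H_B

# Likelihood ratios of Definition 1.
x = P_H_B / P_H_G   # below 1 under Assumption 1
y = P_L_B / P_L_G   # above 1 under Assumption 1

post_H = posterior_good(P_H_G, P_H_B)
post_L = posterior_good(P_L_G, P_L_B)

print(f"x = {x:.4f}, y = {y:.4f}")
print(f"p(G|L) = {post_L:.4f} < 0.5 < p(G|H) = {post_H:.4f}")
```

With equal priors the posteriors reduce to $1/(1+x)$ and $1/(1+y)$, which is the form used in the two-signal case.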

A second observation of the quality signal, $S_2$, results in an updating of the VC’s prior to

$$
p(G \mid S_1, S_2) = \frac{1}{1 + \dfrac{p(S_1, S_2 \mid B)}{p(S_1, S_2 \mid G)}} \tag{7}
$$

If observations of the signal are independent conditional on the manager’s quality, this simplifies to

$$
p(G \mid S_1, S_2) = \frac{1}{1 + \dfrac{p(S_1 \mid B)\, p(S_2 \mid B)}{p(S_1 \mid G)\, p(S_2 \mid G)}} \tag{8}
$$

It follows that the four possible posterior probabilities are related as follows:

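$$
p(G \mid H, H) = \frac{1}{1 + x^2} \tag{9}
$$

$$
p(G \mid H, L) = p(G \mid L, H) = \frac{1}{1 + xy} \tag{10}
$$

$$
p(G \mid L, L) = \frac{1}{1 + y^2} \tag{11}
$$

And, using (5) with $x < 1 < y$:

$$
p(G \mid H, H) > p(G \mid L, H) = p(G \mid H, L) > p(G \mid L, L) \tag{12}
$$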

Thus a superior ‘track record’ (sequence of signals) of the manager results in a higher Bayesian estimate of his chances of producing an increment to firm value next period.

Using (5) and (9)-(11) we have

$$
\frac{1}{1 + x^2} > \frac{1}{1 + x} > \frac{1}{1 + y} > \frac{1}{1 + y^2} \tag{13}
$$

so that the dispersion of conditional probabilities of success (value increment) is predicted to increase over time (rounds).
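The ordering in (12) and the widening spread in (13) can be illustrated with a short Python sketch. The values of `x` and `y` below are hypothetical and chosen only to satisfy Assumption 1; equal priors and conditionally independent signals are assumed, as in (8).

```python
from itertools import product

x, y = 0.5, 2.0                      # hypothetical likelihood ratios, x < 1 < y
lik_ratio = {"H": x, "L": y}

def posterior_one(s):
    """p(G | S1) under equal priors."""
    return 1.0 / (1.0 + lik_ratio[s])

def posterior_two(s1, s2):
    """p(G | S1, S2) under equal priors and independent signals, Equation (8)."""
    return 1.0 / (1.0 + lik_ratio[s1] * lik_ratio[s2])

two = {(s1, s2): posterior_two(s1, s2) for s1, s2 in product("HL", repeat=2)}

# Spread between the best and worst track record after one and two signals.
spread_one = posterior_one("H") - posterior_one("L")
spread_two = two[("H", "H")] - two[("L", "L")]

for record, p in sorted(two.items(), key=lambda kv: -kv[1]):
    print(record, round(p, 4))
print("spread after one signal :", round(spread_one, 4))
print("spread after two signals:", round(spread_two, 4))
```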

The 2-period optimal value function

Consider again the discrete quality two-period model. Recall that under the Monotone Likelihood Ratio property (henceforth MLR) [e] the posterior probability of the manager adding value is increasing in his track record, defined as his period 1 signal value, $S_1$. We now develop the optimal value function in terms of these posterior probabilities and the optimal cutoffs associated with them.

The VC has some initial belief about the manager’s quality and updates it on observing the signal $S_t$, $t = 1, 2$, in period $t$, at a cost. A superior ‘track record’ (sequence of past signals) of the manager results in a higher Bayesian estimate of his chances of incrementing value (i.e. generating a positive payoff) next period.

Consider now the value function of the VC in period 2. Figure 2 shows the decision tree structure.

Figure 2. The VC’s decision tree.

Since we have just two periods, the optimal value function will be zero in period 3 and thereafter: $E V_3^* = E V_4^* = \cdots = 0$. We can therefore write the period 2 VC value function as

$$
V_2(\tilde S_2, S_1) = \max\left\{ -I_2(\tilde S_2) + \delta\left[ p_3(\tilde S_2, S_1)\, \pi_3 - c \right],\ \Pi_m \right\} \tag{14}
$$

where

$I_2(\tilde S_2)$ = signal-dependent investment in period 2, $I_2'(\tilde S_2) \geq 0$.

$\delta$ = discount factor ($\delta = 1/(1 + r)$, where $r$ = the risk-adjusted interest rate).

$p_3(\tilde S_2, S_1) \equiv p(G \mid \tilde S_2, S_1)$ = probability that the manager adds value (is Good) in period 3 given an observed signal about his ability from last period, $S_1$, and the random variable representing his period 2 signal, $\tilde S_2$.

$\pi_3$ = period 3 value increment of the manager under success [f].

$c$ = costs of monitoring managerial performance [g].

$\Pi_m$ = present discounted value (p.d.v.) of the VC’s return from the firm under professional management [h].

The second period value function $V_2$ then shows the present discounted value (p.d.v.) to the VC of either investing and continuing one more period with the existing manager of uncertain quality (yielding p.d.v. $-I_2(\tilde S_2) + \delta[p_3(\tilde S_2, S_1)\, \pi_3 - c]$) or investing and replacing him with an outsider of known quality (yielding p.d.v. $\Pi_m$) [i]. Note that the continuous signal version of the MLRP guarantees that the first term in the $\max\{\cdot\}$ expression in Equation (14) is increasing in the first period signal $S_1$, since it implies $\partial p_3(\tilde S_2, S_1)/\partial S_1 > 0$.

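The expected value of this function with respect to (w.r.t.) $S_2$ for an arbitrary cutoff signal $\hat S_2$ is given by

$$
\begin{aligned}
E_{S_2}\left[ V_2(\tilde S_2, S_1) \mid \hat S_2 \right]
&= E_{S_2} \max\left\{ -I_2(\tilde S_2) + \delta\left[ p_3(\tilde S_2, S_1)\, \pi_3 - c \right],\ \Pi_m \right\} \\
&= \Pi_m \int_0^{\hat S_2} dF(S_2 \mid S_1) + \int_{\hat S_2}^{\infty} \left\{ -I_2(S_2) + \delta\left[ p_3(S_2, S_1)\, \pi_3 - c \right] \right\} dF(S_2 \mid S_1)
\end{aligned} \tag{15}
$$

Choosing the cutoff optimally requires maximising (15) w.r.t. this cutoff and yields the first order condition

$$
-I_2(S_2^*) + \delta\left[ p_3(S_2^*, S_1)\, \pi_3 - c \right] = \Pi_m \tag{16}
$$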

(see Figure 3).

Figure 3. A better track record in period 1 reduces the chances of replacement in period 2.

The second order condition requires

$$
-I_2'(S_2^*) + \delta\, \pi_3\, \frac{\partial p_3(S_2^*, S_1)}{\partial S_2^*} > 0 \tag{17}
$$

We shall assume henceforth that this condition holds [j]. Together with the second order condition for a maximum, Equation (16) shows that the VC will at the beginning of period 2 choose to keep the manager if and only if the expected value to the company if he is retained, given his track record ($S_1$), is greater than the value of his replacement. More precisely we have the replacement rule:

Replace the manager in period 2 if and only if

$$
\delta\left[ p_3(S_2, S_1)\, \pi_3 - c \right] - I_2(S_2) < \Pi_m \tag{18}
$$

where $S_2$ is the realised value of $\tilde S_2$. Equivalently, we can say that the manager will be replaced, given his initial performance, if and only if his second period performance falls below a certain threshold:

Replace the manager in period 2 if and only if, given $S_1$,

$$
S_2 < S_2^* \tag{19}
$$
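The cutoff defined by Equation (16) can be found numerically once functional forms are chosen. The sketch below is only an illustration under stated assumptions that are not part of the paper: signals are normally distributed with common variance and means `mu_g` and `mu_b` for Good and Bad managers (which makes the posterior logistic in $S_1 + S_2$ under equal priors), investment is a constant `invest`, and all parameter values are hypothetical. Because the retention payoff is then increasing in $S_2$, bisection suffices.

```python
import math

def p3(s2, s1, mu_g=1.0, mu_b=0.0, sigma=1.0):
    """p(G | S2, S1) for normal signals with equal priors (assumed form)."""
    z = (mu_g - mu_b) * (s1 + s2 - (mu_g + mu_b)) / sigma**2
    return 1.0 / (1.0 + math.exp(-z))

def continuation_value(s2, s1, pi3, c, invest, delta):
    """Payoff from retaining the manager: -I2 + delta * (p3 * pi3 - c)."""
    return -invest + delta * (p3(s2, s1) * pi3 - c)

def cutoff(s1, pi3, c, invest, delta, pi_m, lo=-20.0, hi=20.0, tol=1e-10):
    """Solve continuation_value(S2*, S1) = Pi_m for S2* by bisection."""
    def gap(s2):
        return continuation_value(s2, s1, pi3, c, invest, delta) - pi_m
    if gap(lo) > 0 or gap(hi) < 0:
        raise ValueError("no interior cutoff inside the search bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) < 0:
            lo = mid      # retention still worth less than replacement
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical parameter values.
PARAMS = dict(pi3=10.0, c=0.5, invest=1.0, delta=0.9, pi_m=3.0)
cut_weak = cutoff(s1=0.5, **PARAMS)     # weaker period 1 record
cut_strong = cutoff(s1=1.0, **PARAMS)   # stronger period 1 record

print("S2* given S1 = 0.5:", round(cut_weak, 4))
print("S2* given S1 = 1.0:", round(cut_strong, 4))
```

Under these assumptions the manager with the stronger first-period signal faces the lower threshold, so rule (19) replaces him less often.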

Substituting (16) into (15), the optimal period 2 value function now becomes

$$
E_{S_2} V_2^*(\tilde S_2, S_1) = \Pi_m \tag{20}
$$

where $E_{S_2} V_2^*(\tilde S_2, S_1) \equiv \max_{\hat S_2} E_{S_2}\left[ V_2(\tilde S_2, S_1) \mid \hat S_2 \right]$.
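The maximisation over cutoffs can be illustrated by brute force. The Python sketch below evaluates the cutoff-rule payoff of Equation (15) on a grid of candidate cutoffs and compares the best candidate with the cutoff implied by the first order condition. Everything specific here is an assumption made for illustration: normal signals with common variance, equal priors, constant investment, the parameter values, and integration over the whole real line (the normal signal has no lower bound at 0).

```python
import math

MU_G, MU_B, SIGMA = 1.0, 0.0, 1.0                 # assumed signal distributions
PI3, C, INVEST, DELTA, PI_M = 10.0, 0.5, 1.0, 0.9, 3.0
S1 = 0.5                                          # observed period 1 signal

def normal_pdf(s, mu):
    return math.exp(-0.5 * ((s - mu) / SIGMA) ** 2) / (SIGMA * math.sqrt(2 * math.pi))

def p_good(total, n):
    """p(G | n signals summing to total) for normal signals, equal priors."""
    z = (MU_G - MU_B) * (total - n * (MU_G + MU_B) / 2) / SIGMA**2
    return 1.0 / (1.0 + math.exp(-z))

def density_s2(s2):
    """Predictive density of S2 given S1: a mixture weighted by p(G | S1)."""
    w = p_good(S1, 1)
    return w * normal_pdf(s2, MU_G) + (1 - w) * normal_pdf(s2, MU_B)

def retain(s2):
    return -INVEST + DELTA * (p_good(S1 + s2, 2) * PI3 - C)

STEP = 0.01
GRID = [-8 + STEP * i for i in range(1601)]       # S2 values on [-8, 8]

def expected_value(cut):
    """Riemann sum of (15): Pi_m below the cutoff, retention payoff above."""
    return sum((PI_M if s < cut else retain(s)) * density_s2(s) * STEP for s in GRID)

candidates = [-1 + 0.05 * i for i in range(61)]   # cutoffs on [-1, 2]
best = max(candidates, key=expected_value)

# Cutoff implied by the first order condition under the same assumptions.
p_star = ((PI_M + INVEST) / DELTA + C) / PI3
s2_star = (MU_G + MU_B) - S1 + SIGMA**2 / (MU_G - MU_B) * math.log(p_star / (1 - p_star))

print("best grid cutoff:", round(best, 2), " FOC cutoff:", round(s2_star, 4))
```

The grid maximiser should agree with the first order condition up to the spacing of the candidate grid.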

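We now let the manager’s incremental value, $\pi_3(\gamma_2)$, be increasing in a market demand parameter $\gamma_2$. Consider the continuous signal case. Using the MLR property of the distribution function we get

$$
\frac{\partial p_2}{\partial S_1} > 0 \tag{21}
$$

Differentiating w.r.t. the various parameters we then get the following comparative static results:

$$
\frac{\partial S_2^*}{\partial S_1},\ \frac{\partial S_2^*}{\partial \gamma_2},\ \frac{\partial S_2^*}{\partial \delta} < 0 \tag{22}
$$

$$
\frac{\partial S_2^*}{\partial \Pi_m},\ \frac{\partial S_2^*}{\partial \eta_2},\ \frac{\partial S_2^*}{\partial c} > 0 \tag{23}
$$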

where $\eta_2$ is a shift parameter in the function $I_2$ ($\partial I_2/\partial \eta_2 > 0$). Thus we have shown that in the second period the probability of manager replacement is lower for managers with good track records ($S_1$), higher incremental values ($\pi_3(\gamma_2)$) and lower VC discount rates ($r$), and that it is higher the higher the return to professional replacement ($\Pi_m$), the cost of investment ($I_2$) and the costs of monitoring manager performance ($c$). Figure 3 illustrates the effects of better performance on the likelihood of manager replacement.
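The signs in (22) and (23) can be checked by finite differences in a parametric example. The sketch below assumes, purely for illustration, normal signals with common variance, equal priors and a constant investment level `invest` (standing in for the shift parameter in $I_2$), with `pi3` standing in for the demand parameter through $\pi_3$. Under these assumptions the first order condition has a closed-form solution.

```python
import math

def cutoff_closed_form(s1, pi3, c, invest, delta, pi_m,
                       mu_g=1.0, mu_b=0.0, sigma=1.0):
    """S2* solving -I2 + delta * (p3 * pi3 - c) = Pi_m with a logistic posterior."""
    p_star = ((pi_m + invest) / delta + c) / pi3   # posterior needed to retain
    if not 0.0 < p_star < 1.0:
        raise ValueError("no interior cutoff for these parameters")
    logit = math.log(p_star / (1.0 - p_star))
    return (mu_g + mu_b) - s1 + sigma**2 / (mu_g - mu_b) * logit

# Hypothetical baseline parameters.
base = dict(s1=0.5, pi3=10.0, c=0.5, invest=1.0, delta=0.9, pi_m=3.0)

def effect(name, step=1e-4):
    """Forward-difference estimate of dS2*/d(name) at the baseline."""
    bumped = dict(base)
    bumped[name] += step
    return (cutoff_closed_form(**bumped) - cutoff_closed_form(**base)) / step

signs = {name: effect(name) for name in base}
for name, value in signs.items():
    print(f"dS2*/d{name:<6} = {value:+.4f}")
```

A higher cutoff means a higher probability of replacement, so negative entries correspond to (22) and positive entries to (23).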

We move back now to period one. The period 1 value function is given by

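$$
V_1(\tilde S_1) = \max\left\{ -I_1(\tilde S_1) + \delta\left[ p_2(\tilde S_1)\, \pi_2 - c + E_{S_2} V_2(\tilde S_2, \tilde S_1) \right],\ \Pi_m \right\} \tag{24}
$$

with expected value

$$
\begin{aligned}
E_{S_1} V_1(\tilde S_1)
&= E_{S_1} \max\left\{ -I_1(\tilde S_1) + \delta\left[ p_2(\tilde S_1)\, \pi_2 - c + E_{S_2} V_2(\tilde S_2, \tilde S_1) \right],\ \Pi_m \right\} \\
&= \Pi_m F(\hat S_1) + \int_{\hat S_1}^{\infty} \left\{ -I_1(S_1) + \delta\left[ p_2(S_1)\, \pi_2 - c + E_{S_2} V_2(\tilde S_2, S_1) \right] \right\} dF(S_1)
\end{aligned} \tag{25}
$$

Choosing the period 1 cutoff optimally requires

$$
-I_1(S_1^*) + \delta\left[ p_2(S_1^*)\, \pi_2 - c + E_{S_2} V_2(\tilde S_2, S_1^*) \right] = \Pi_m \tag{26}
$$

Substituting back into Equation (24), the optimal period 1 value function now becomes

$$
E_{S_1} V_1^*(\tilde S_1) = \Pi_m \tag{27}
$$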

It is clear that whilst the optimal value function is a constant, the optimal cutoffs will vary with the information available at the time. The comparative statics of the first period cutoff with respect to the relevant parameters, assuming symmetrically that $\pi_2 = \pi_2(\gamma_1)$ is increasing in the demand parameter $\gamma_1$, show that, as might be expected, the first period probability of manager replacement is lower for managers with good track records ($S_1$), higher incremental values ($\pi_2(\gamma_1)$) and lower VC discount rates ($r$); it is higher the higher the return to professional replacement ($\Pi_m$), the cost of investment ($I_1$) and the costs of monitoring manager performance ($c$) [k].

The T-period model

The generalisation of the model to $T$ periods is straightforward and we present most of the results rather than proving them in the text. The obvious way to represent the manager’s track record in the multiperiod context is by the mean of the signals over the periods up to the present ($t$). For some distribution functions (e.g. the Normal) the mean of the signal history and the number of periods before the present, $t-1$, will be a sufficient statistic for the signal history [l]. Restricting ourselves to such distributions we can write the $t$th period value function as

$$
V_t(\tilde S_t, \bar S_{t-1}) = \max\left\{ -I_t(\tilde S_t) + \delta\left[ p_{t+1}(\tilde S_t, \bar S_{t-1})\, \pi_{t+1} - c + E_{\tilde S_{t+1}} V_{t+1}(\tilde S_{t+1}, \bar S_t) \right],\ \Pi_m \right\} \tag{28}
$$

where

$\bar S_t = \sum_{i=1}^{t} S_i / t$ is the mean signal from the manager up to time $t$ [m].

Taking expectations with respect to the period t signal we get

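$$
\begin{aligned}
E_{\tilde S_t} V_t(\tilde S_t, \bar S_{t-1})
&= E_{\tilde S_t} \max\left\{ -I_t(\tilde S_t) + \delta\left[ p_{t+1}(\tilde S_t, \bar S_{t-1})\, \pi_{t+1} - c + E_{\tilde S_{t+1}} V_{t+1}(\tilde S_{t+1}, \bar S_t) \right],\ \Pi_m \right\} \\
&= \Pi_m F(\hat S_t) + \int_{\hat S_t}^{\infty} \left\{ -I_t(S_t) + \delta\left[ p_{t+1}(S_t, \bar S_{t-1})\, \pi_{t+1} - c + E_{\tilde S_{t+1}} V_{t+1}(\tilde S_{t+1}, \bar S_t) \right] \right\} dF(S_t)
\end{aligned} \tag{29}
$$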

Differentiating w.r.t. the tth period cutoff we get the optimality condition

$$
-I_t(S_t^*) + \delta\left[ p_{t+1}(S_t^*, \bar S_{t-1})\, \pi_{t+1} - c + E_{\tilde S_{t+1}} V_{t+1}(\tilde S_{t+1}, \bar S_t^*) \right] = \Pi_m \tag{30}
$$

where we define

$$
\bar S_t^* = t^{-1} S_t^* + t^{-1}(t-1)\, \bar S_{t-1} \tag{31}
$$
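Since the track record enters only through the running mean, it can be updated recursively from the previous mean and the new signal, which follows directly from the definition of the mean signal. A minimal Python sketch, with the signals coded as H = 1 and L = 0 and a hypothetical signal history:

```python
def update_mean(prev_mean, new_signal, t):
    """Mean of the first t signals from the mean of the first t-1 and signal t."""
    if t < 1:
        raise ValueError("t must be at least 1")
    return (new_signal + (t - 1) * prev_mean) / t

signals = [1, 0, 1, 1]          # hypothetical record: H, L, H, H
mean = 0.0                      # placeholder; it has zero weight when t = 1
history = []
for t, s in enumerate(signals, start=1):
    mean = update_mean(mean, s, t)
    history.append(mean)

print([round(m, 4) for m in history])
```

With this coding the running mean is the proportion of past periods in which the manager performed well.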

We have, using the MLR property, that the probability of success is increasing in the manager’s track record:

$$
\frac{\partial p_t}{\partial \bar S_{t-1}} > 0 \tag{32}
$$

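Comparative statics then go through as before with $S_1$ being replaced by $\bar S_{t-1}$:

$$
\frac{\partial S_t^*}{\partial \bar S_{t-1}},\ \frac{\partial S_t^*}{\partial \gamma_t},\ \frac{\partial S_t^*}{\partial \delta} < 0 \tag{33}
$$

$$
\frac{\partial S_t^*}{\partial \Pi_m},\ \frac{\partial S_t^*}{\partial \eta_t},\ \frac{\partial S_t^*}{\partial c} > 0 \tag{34}
$$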
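The backward induction behind the $T$-period model can be sketched for the discrete case with signals H = 1 and L = 0, where the track record is summarised by the number of High signals observed so far. The following Python sketch is an illustration under assumptions that go beyond the text: the value increment, monitoring cost and investment are constant over time, the professional-management payoff is available at every stage, priors are equal, and all parameter values are hypothetical.

```python
from functools import lru_cache

A, B = 0.7, 0.4                      # hypothetical p(H|G) and p(H|B)
X, Y = B / A, (1 - B) / (1 - A)      # likelihood ratios x < 1 < y
T = 4                                # number of periods
PI, C, INVEST, DELTA, PI_M = 10.0, 0.5, 1.0, 0.9, 3.0

def posterior(k, n):
    """p(G | k High signals out of n) under equal priors."""
    return 1.0 / (1.0 + X**k * Y**(n - k))

@lru_cache(maxsize=None)
def value(t, k):
    """VC value in period t after t signals with k Highs; zero beyond T."""
    if t > T:
        return 0.0
    return max(retain_value(t, k), PI_M)

def retain_value(t, k):
    """Payoff from keeping the manager one more period."""
    p = posterior(k, t)
    prob_high = p * A + (1 - p) * B              # predictive p(next signal = H)
    expected_next = (prob_high * value(t + 1, k + 1)
                     + (1 - prob_high) * value(t + 1, k))
    return -INVEST + DELTA * (p * PI - C + expected_next)

# policy[(t, k)] is True when the manager is retained in period t.
policy = {(t, k): retain_value(t, k) >= PI_M
          for t in range(1, T + 1) for k in range(t + 1)}

for t in range(1, T + 1):
    print(t, ["keep" if policy[(t, k)] else "replace" for k in range(t + 1)])
```

In each period the printed row is ordered by the number of High signals, so a threshold rule shows up as a run of "replace" entries followed by "keep" entries.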

Summary and conclusions

We developed a theory of managerial replacement in which a venture capitalist monitored an investee firm run by a manager of unknown quality (Good or Bad). An informative signal $S_t$ correlated with performance (value-added) was available to the VC at a cost in each period $t$. The problem was when to replace the manager if he underperformed. We derived a solution to this problem that took the form of an optimal cutoff for each period, namely $S_{t+1}^*$, such that, given his track record up to period $t$, the manager would be replaced if and only if next period’s signal fell below $S_{t+1}^*$. We showed that the probability of manager replacement was lower for managers with good track records, higher incremental values and lower VC discount rates, and was higher the higher the return to professional replacement, the cost of investment and the costs of monitoring manager performance. Replacement was also predicted to enhance company value.

Endnotes

[a] Hellmann reports statistics from Hannan et al. (1996), who found that in Silicon Valley high-tech startups 20% of owner-managers were replaced in the first 10 months of the business’s life, rising to 80% in the first 80 months. These figures, as we shall see later, are broadly consistent with those in the current dataset.

[b] There is a parallel here with the model of entrepreneurship as a learning experiment in Jovanovic (1982). Jovanovic argues that an entrepreneur learns about her ability in entrepreneurship only by starting a business. Her initial prior is updated by successive feedback from the market on her costs of operation. Our model is consistent with this view of the entrepreneur, but we look at it from the VC’s perspective, so that the VC learns about the entrepreneur’s ability by investing in her and observing her performance. In Jovanovic the entrepreneur decides if and when to quit based on her updated information on her skills; in our model the VC makes the decision for her.

[c] A great manager has the ability to bring out the very best in people, thus optimising their ability. This is modelled in the paper by assuming that the probability of success in any period increases in the value of the signal.

[d] Very briefly, Milgrom shows that under the Monotone Likelihood Ratio Property (henceforth MLRP), given in our case by inequalities (4), any risk averter (in our case the VC) will strictly prefer the posterior distribution of manager quality $Q$ (in our case $B$, $G$) conditional on the signal $H$ over the same distribution conditional on $L$.

[e] See Milgrom (1981).

[f] We shall henceforth, without loss of generality, drop the dependence of $\pi$ on the signals $S_t$.

[g] We assume $c < p_3(\tilde S_2, S_1)\, \pi_3$ with probability 1.

[h] We assume that this return is based on a known success probability (no learning needs to take place on the part of the VC about the parameters).

[i] Note that we are modelling only stages at which investment by the VC occurs. There is always in practice the possibility that the VC will not invest at all at a given stage. However, our data (like most other data) record only stages at which investment occurred. Hence our tests will be on ‘superior’ businesses in this sense. Our modelling therefore effectively assumes that the value function in (14) is positive with probability 1. It is very straightforward to adjust the model to take into account the possibility of no investment at a given stage.

[j] It is automatically satisfied, given the MLRP, if $I_2' = 0$.

[k] Note that because of the absence of observations on the managerial signal in period 1 (this is not visible until period 2) we cannot examine the impact of track record at this stage.

[l] We can assume that the signals $H$ and $L$ take the values 1 and 0 respectively. This gives us as the mean value the proportion of past periods in which the manager performed well.

[m] Bear in mind here that this mean now contains the random variable $\tilde S_t$ of period $t$.