1 Introduction

The rise in the level of executive compensation in international banking over the last two decades has been striking. By the end of 2003 Citigroup, Lehman Brothers, and Bear Stearns, large players in the banking industry at that time, were run by CEOs whose earnings were among the top ten in the S&P 500 (Hodgson 2004). Two of these banks, Lehman Brothers and Bear Stearns, collapsed in 2008. According to Bebchuk et al. (2010), in spite of obvious mismanagement the executives of these banks had received considerable performance-based compensation packages during the years preceding the financial crisis. Switzerland was not exempt. In 2012, UBS paid out 70 million Swiss Francs to the members of its executive committees despite a 2.5 billion loss that year. It stands to reason that the effectiveness of such compensation schemes has since become the subject of ever more heated discussions, not least in international banking. The recent rise in executive compensation has not been confined to the banking industry. Other industries have been following the same trend. Murphy (2013) documents that total pay for executives in the S&P 500 skyrocketed in the late 2000s.

This general development has also piqued the interest of economists, who are intrigued by the underlying pay-setting mechanism. Executive compensation is a classic example of a principal-agent problem and lies at the heart of the controversy of corporate separation of ownership and control (Jensen and Meckling 1976). Put succinctly, the challenge lies in motivating the CEO (the agent) to act in the best interest of the shareholder (the principal). Because the effort of the agent is not perfectly observable, the principal is not able to force the agent to choose the action that would be optimal from the principal’s perspective. This invokes a moral hazard problem. There has been much discussion about how firms are to solve this agency problem (Ross 1973; Gjesdal 1982; Mahoney 1995). A straightforward solution would involve a compensation scheme which provides proper incentives for the CEO. Economic intuition would suggest tying compensation to firm performance. However, firm performance is also influenced by a myriad of factors that are beyond the control of the agent. This exogeneity introduces undesirable risk into contracting.

This is where relative performance evaluation (RPE) comes in (Holmstrom 1982). RPE implies that compensation contracts should be linked to firm performance in relation to peers with similar characteristics. Such contracts account for common shocks that are out of an agent’s control and thus offer a more conclusive way to assess the agent’s individual performance. At the same time, RPE contracts offer the same incentives as contracts based on absolute performance. For shareholders, knowing that RPE is implemented is of particular importance because the mechanism creates incentives for CEOs to increase shareholder wealth.Footnote 1 The case for employing RPE in executive compensation contracts, then, seems clear-cut. Indeed, RPE has become seemingly popular in practice. Recent studies show that roughly every fourth firm in the S&P 1500 openly claims to use RPE in their compensation contracts (Carter et al. 2009; Gong et al. 2011).

In this paper, we test for the existence of RPE in international banking and pay particular attention to banks that claim to apply it. Banks might have incentives to misreport RPE practice if their boards of directors are prone to managerial influence. Board members might want to appease executives rather than constrain them. For example, managers of large banks in particular usually enjoy a good reputation inside and outside of the bank, which could be beneficial for the members of the board. Bebchuk and Fried (2003; 2004) note that managers also enjoy great authority within the company, rendering conflict less appealing. Designs of executive compensation schemes endorsed by the board of directors might thus succumb to this pressure. This is not without corporate risk. If managerial compensation packages are identified as unjustified by relevant outsiders, managers and directors face considerable social disapproval, denoted as “outrage” costs by Bebchuk and Fried (2004). The stronger the negative perception of outsiders, the greater the managers’ cost of enjoying large compensation packages. This criticism can be avoided by camouflaging compensation packages.Footnote 2 Camouflaging the peer group of an underperforming bank manager would make excessive pay more acceptable to the public and investors. In other words, we might want to be cautious about taking flaunted RPE statements at face value; some banks might want to avoid evaluating the actual performance of their underperforming manager relative to an appropriate peer group and instead pick particularly low-performing peers.

RPE has been investigated empirically. Most studies try to infer its usage by regressing executive compensation on the performance of a target firm and some measure of peer group performance. A negative and statistically significant coefficient on peer performance is taken as an indication that common shocks are being removed from compensation contracts, constituting evidence of RPE. The scope of the existing literature is rather limited. The focus lies on compensation practices of industrial firms, typically in the USA.Footnote 3 This regional limitation has its reasons. It is difficult to obtain comprehensive data on executive compensation outside of the USA. Despite the ubiquitously proclaimed use of RPE in practice, the empirical results of the literature have been mixed.Footnote 4 This is partly owing to the fact that the post hoc construction of peer groups is fraught with issues. If researchers identify a different peer group than the target firm itself had actually used, inferences on RPE are no longer meaningful. Incorrect peer identification is thus one reason why one may fail to find evidence of RPE. A simpler explanation would be that the RPE claims are merely empty rhetoric to appease stakeholders. As Albuquerque (2009) puts it, any empirical tests of RPE are, in this sense, joint tests.

This paper embraces this duality and tests for RPE in a new sample of large and globally operating non-U.S. banks.Footnote 5 We contend that the global banking industry is an ideal playground to test the usage of RPE, for at least three reasons. First, RPE makes particular sense for firms that are exposed to common shocks. This applies particularly well to international banking. The main reason for this exposure is that banks are highly leveraged institutions. Around 90 percent of their funding comes from debt, making them more prone to exogenous volatility (Houston and James 1995; Chen et al. 2006). Second, the barriers to global integration in the banking industry have been significantly trimmed in the last two decades, shifting banks from once centralized domestic organizations to global behemoths. In turn, the structure of competition in the industry has adjusted (Berger and Smith 2003). Large banks operating on the international level are now dealing with intense competition.Footnote 6 Third, the recent financial crisis was characterized by failures of large international banks such as Bear Stearns or Lehman Brothers. The downfall of these banks has drawn increasing attention to corporate governance issues in remuneration policy.Footnote 7 If anything, this pressure has prompted banks to make more efficient use of RPE.

Our study tackles the caveat that the soundness of empirical tests on RPE critically hinges on the correct identification of the peer groups. We follow the seminal approach by Albuquerque (2009) and aggregate peer performance on the basis of industry and industry/size peer groups. Aggregating in this manner accounts for the observation that industry affiliation and firm size are informative proxies for the common market risks that RPE-setting firms face. In doing so, this approach takes up Holmstrom’s (1979) theoretical requirement of common uncertainties. Our study also deals with the potential issue of RPE being corporate window dressing, with the firm not actually engaging in RPE. If that were the case then signaling the disclosure of peer group usage would be mere noise, and incorporating this information should not alter our results qualitatively. To test this hypothesis, we differentiate between disclosing and non-disclosing banks. Disclosing banks claim to compare their executive’s performance relative to peer group performance in determining one or several components of executive pay.Footnote 8

We collect a new data set with information on 46 large international banks. The results of our basic regression specification document negative and insignificant parameter estimates in industry peers. Taken by itself, this casts doubt on the use of RPE in our sample. When we perform tests of RPE on industry/size peers, we find moderate evidence consistent with RPE. In restricting attention to the subsample of disclosing banks, we find stronger and more conclusive evidence that systematic risk is filtered out from CEO compensation. Strong-form RPE tests support this conclusion. This result stands in contrast to Gong et al. (2011), who do not find informational value in RPE disclosure among US firms.Footnote 9 To gain more insight, we disentangle potential factors related to RPE usage. A logistic regression indicates that firm size, and, to a smaller extent, growth options are associated with RPE usage. The results imply that the larger a bank is, the higher is the probability that it will use RPE in its compensation contracts. On the other hand, the probability of using RPE decreases with the magnitude of growth options.

Our paper contributes to the ongoing discussion on RPE along several dimensions. Existing studies testing RPE on banks have focused solely on US data. This is hard to square with an industry that is characterized by pronounced international competition. We provide broader evidence by conducting tests on a newly collected sample of large international banks. We also shed new light on the informative value of disclosure. Our results withstand several robustness tests and suggest that the banks in our sample which proclaim the use of peers in assessing the performance of their CEOs are not merely window dressing: we do find stronger evidence for RPE usage among disclosing banks. This indicates that lumping together disclosing and non-disclosing firms can be detrimental to the conclusiveness of RPE tests. Finally, we examine the association of several bank characteristics with the intensity of RPE usage.

The rest of the paper is organized as follows. In the “Relative performance evaluation and the banking industry” section we describe the main characteristics of the banking industry. This section also introduces the empirical model and depicts the peer group construction mechanism. The “Data description” section presents our novel data set of international banks. The “Results” section reports summary statistics and regression results. In the “Extensions and robustness checks” section, we explore factors related to RPE, investigate the association of RPE pay practices with bank-level characteristics, and conduct several robustness checks. The last section concludes.

2 Relative performance evaluation and the banking industry

2.1 Executive pay in the banking industry

The literature on executive compensation in banking attends to the particularities of the industry. There are three characteristic features of banks (John and Qian 2003; Macey and O’Hara 2003; Tung 2011). First, banks have a peculiar capital structure. They hold much less equity than other companies, rendering them highly leveraged. Roughly 90% of a bank’s funds come from debt. In addition, a bank’s assets and liabilities are mismatched (Diamond and Dybvig 1983). Second, the presence of federal guarantees of bank deposits, a public measure to protect private depositors from losses in case of insolvency, distinguishes banks from other firms. Third, these deposits can increase the risk of fraud and self-dealing in the banking industry by reducing the incentives for monitoring (Macey and O’Hara 2003).

Against this background, the literature addresses three main topics (Houston and James 1995). One cluster of studies examines whether the sensitivity of executive compensation to a bank’s performance was affected by the US corporate control market deregulation (Crawford et al. 1995; Hubbard and Palia 1995; Cuñat and Guadalupe 2009).Footnote 10 Two other studies test whether the existing compensation policies promote risk-taking in the US banking sector by examining the relation between specific components of compensation and market measures of risk (Houston and James 1995; Chen et al. 2006).Footnote 11 Other research examines the usage of RPE in the US banking industry. Barro and Barro (1990) test RPE on a data set that covers 83 commercial banks in the US between 1982 and 1987. They regress the growth rate of real compensation on the average of the real total rate of return from the current and previous period, the first difference of accounting-based returns, and regional averages of both the accounting-based return and the average real total rate of return. This effectively compares the performance of banks relative to the performance of other banks in the same region. Their evidence is not consistent with the use of RPE. Crawford (1999) tests two hypotheses on 215 executives from 118 US commercial banks from 1976 to 1988. He regresses the change in CEO pay for a specific bank on the change in shareholder wealth for that bank, an industry relative performance measure, and a market performance measure using S&P 500 returns. His findings suggest that relative compensation is negatively related to market and industry returns and positively related to shareholder returns. In addition, in his sample, the use of RPE increases upon introduction of banking deregulation. Crawford reports evidence consistent with RPE if CEO compensation is evaluated relative to industry peers. He does not, however, find evidence of RPE when using market performance measures.

The literature also provides insight into remuneration practice in the banking industry. The data show that bank CEOs receive less cash compensation on average, are less likely to participate in a stock option plan, and hold fewer growth options than CEOs in other industries. These differences are likely to stem from banks’ different investment opportunities (Houston and James 1995). But not all is different in the banking industry. Houston and James do not find any differences between banking and non-banking industries regarding the overall sensitivity of pay to performance. They presume that the factors that influence compensation in the banking industry are similar to those in non-banking industries despite differences in the compensation structure. Adams and Mehran (2003) suggest that the differences in governance structures between manufacturing firms and banks are industry-specific. Furthermore, the differences seem to be mostly due to different investment opportunities of bank holding companies (BHCs) and pertinent regulation. Adams and Mehran’s study examines whether firm performance measures are influenced by the governance structure. Their results indicate that differences between the board structures of manufacturing firms and banks might not be a reason for concern in this respect. Aebi et al. (2012) study the strength of incentive features of top management compensation contracts in banks. They compare the pay-performance sensitivity in banks with that in manufacturing firms and show that debt ratio, firm size, risk, and regulation are important determinants of pay-performance sensitivity in banks. In sum, the executive compensation structure and the governance structure of banks differ from those of firms in other industries. Even so, the factors that influence the overall pay-performance sensitivity do not seem to differ significantly across industries.

2.2 Empirical model

To test for RPE, we employ a model that is based on Holmstrom and Milgrom (1987). Specifically, we use the following weak-form test of RPE:Footnote 12

$$\begin{array}{*{20}l} {Comp}_{it}=&\alpha_{0}+\alpha_{1}\cdot FirmPerf_{it} \\ &+\alpha_{2}\cdot {PeerPerf}_{it}+\alpha_{3}\cdot C_{it}+ \epsilon_{it}. \end{array} $$
(1)

\(Comp_{it}\) measures executive compensation in monetary terms, \(FirmPerf_{it}\) stands for the performance of firm \(i\) measured as the continuously compounded gross real rate of return to shareholders (assuming that dividends are reinvested), and \(PeerPerf_{it}\) denotes the performance of firm \(i\)’s peer group. To account for variation not included in firm \(i\)’s and its peer group’s performance we include several control variables, subsumed in the column vector \(C_{it}\). These variables include firm size and growth options. In addition, we include time, industry, and country dummies. The subscript \(t\) denotes the respective year and \(\epsilon_{it}\) represents an independent firm-specific white noise process. \(\alpha_{0}\), \(\alpha_{1}\), \(\alpha_{2}\), and \(\alpha_{3}\) denote model parameters.Footnote 13

In this model, rejecting the null hypothesis \(H_{0}: \alpha_{2}\geq 0\) against the one-sided alternative \(H_{1}: \alpha_{2}<0\) provides evidence of RPE in executive compensation contracts. In that case, exogenous shocks outside of the control of the executive management are filtered out from the compensation contract.
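The weak-form test can be illustrated with a minimal sketch, which is not the estimation code used for the results reported here. It fits Eq. (1) by ordinary least squares and reports a one-sided p-value for \(\alpha_{2}\). The function names (`solve`, `ols`, `weak_form_rpe_test`) are hypothetical, the standard errors assume homoskedastic errors, the p-value relies on a normal approximation to the t distribution, and the time, industry, and country dummies would have to be passed in as additional control columns.

```python
from statistics import NormalDist


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(y, X):
    """Return OLS coefficients and classical (homoskedastic) standard errors."""
    n, k = len(X), len(X[0])
    xtx = [[sum(row[p] * row[q] for row in X) for q in range(k)] for p in range(k)]
    xty = [sum(row[p] * yi for row, yi in zip(X, y)) for p in range(k)]
    beta = solve(xtx, xty)
    resid = [yi - sum(b * x for b, x in zip(beta, row)) for row, yi in zip(X, y)]
    s2 = sum(e * e for e in resid) / (n - k)  # residual variance
    se = []
    for p in range(k):
        unit = [1.0 if q == p else 0.0 for q in range(k)]
        # p-th diagonal element of s2 * (X'X)^{-1}
        se.append((s2 * solve(xtx, unit)[p]) ** 0.5)
    return beta, se


def weak_form_rpe_test(comp, firm_perf, peer_perf, controls=None):
    """Estimate Eq. (1) and test H0: alpha_2 >= 0 against H1: alpha_2 < 0."""
    controls = controls or [[] for _ in comp]  # optional rows of control values C_it
    X = [[1.0, f, p] + list(c) for f, p, c in zip(firm_perf, peer_perf, controls)]
    beta, se = ols(comp, X)
    t_stat = beta[2] / se[2]
    p_value = NormalDist().cdf(t_stat)  # lower-tail p-value (normal approximation)
    return beta[2], t_stat, p_value
```

A small p-value together with a negative estimate of \(\alpha_{2}\) would be read as evidence consistent with RPE. In an applied setting one would more likely rely on an established regression package and on standard errors suited to panel data.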

Researchers typically use the so-called strong-form RPE test to examine whether all exogenous shocks are removed from the compensation contract. The first step in conducting this test is to regress firm performance on peer performance (Antle and Smith 1986). The first step regression model is

$$\begin{array}{*{20}l} {FirmPerf}_{it}=&\gamma_{i}+\beta_{i}\cdot {PeerPerf}_{it}\\ &+ \eta_{i}\cdot C_{it} +\varepsilon_{it}. \end{array} $$
(2)

The unsystematic and systematic performance are obtained from the equation above in the following manner:

$$\begin{array}{*{20}l} \begin{aligned} {UnsysFirmPerf}_{it}&=\widehat{\eta}_{i}\cdot C_{it} + \widehat{\varepsilon}_{it}, \\ {SysFirmPerf}_{it}&=\widehat{\gamma}_{i}+\widehat{\beta}_{i}\cdot {PeerPerf}_{it}. \end{aligned} \end{array} $$
(3)

\(\widehat{\varepsilon}_{it}\) denotes regression residuals and \(\widehat{\gamma}_{i},\,\widehat{\beta}_{i},\,\widehat{\eta}_{i}\) denote parameter estimates.Footnote 14 The second step estimates the sensitivity of CEO compensation with respect to the unsystematic and systematic components of firm performance, that is:

$$\begin{array}{*{20}l} Comp_{it}&=\delta_{0}+\delta_{1}\cdot UnsysFirmPerf_{it} \\ &+\delta_{2} \cdot SysFirmPerf_{it}+ \delta_{3}\cdot C_{it}+ e_{it}. \end{array} $$
(4)

\(C_{it}\) denotes a column vector of control variables and the row vector \(\delta_{3}\) its coefficients. If the systematic risk is filtered out from the compensation contract, the coefficient \(\delta_{2}\) on systematic performance in Eq. (4) should not be significantly different from zero.
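The first step of the strong-form test can be sketched as follows for a single firm. This is an illustration under simplifying assumptions rather than the procedure used for the reported results: the control variables \(C_{it}\) are omitted, so the unsystematic component reduces to the regression residual, and the function name `decompose_performance` is hypothetical.

```python
def decompose_performance(firm_perf, peer_perf):
    """Split one firm's returns into systematic and unsystematic parts (Eqs. 2-3, no controls)."""
    n = len(firm_perf)
    mean_f = sum(firm_perf) / n
    mean_p = sum(peer_perf) / n
    sxx = sum((p - mean_p) ** 2 for p in peer_perf)
    sxy = sum((p - mean_p) * (f - mean_f) for p, f in zip(peer_perf, firm_perf))
    beta = sxy / sxx                 # firm-specific slope on peer performance
    gamma = mean_f - beta * mean_p   # firm-specific intercept
    systematic = [gamma + beta * p for p in peer_perf]               # fitted values
    unsystematic = [f - s for f, s in zip(firm_perf, systematic)]    # residuals
    return gamma, beta, systematic, unsystematic
```

The two returned series would then enter the second-step regression in Eq. (4) as regressors, with compensation as the dependent variable.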

3 Data description

This section describes the data preparation process. The “Compensation data” subsection reports the collection of the international compensation data, the “International banking sample” subsection provides details about the sample of international banks that we use in the regression analysis, and the “Peer group composition” subsection documents the peer group data selection process.

3.1 Compensation data

There is no standardized database for international corporate executive compensation. We collect our data from several sources for the years 2003–2014. Financial and accounting data are obtained from Thomson Reuters Datastream and Thomson Reuters Worldscope. Compensation data are collected manually either from annual reports or management proxy circulars available online. We do not include US banks in our analysis. In August 2006, a new regulatory requirement by the US Securities and Exchange Commission mandated, among other things, full disclosure of peer group compositions (if applicable) for fiscal years ending on or after December 15, 2006. In a recent study, Faulkender and Yang (2013) provide evidence that this event generated a structural break in the peer group selection, discouraging the use of US compensation data for our purpose.Footnote 15 For the other countries in our sample, we could not find any corresponding regulation that was introduced in our observed timeframe.Footnote 16

Our initial data set is composed of firms classified as banks from the FTSE All World Index with an index weight higher than 0.02. This yields 75 firms. Based on this list, we collect remuneration data for 46 different firms with a total of 335 firm-year observations (henceforth dubbed the “full sample”). A detailed list of all banks in our sample is shown in Table 12 in the Appendix. In addition, we list all non-US global systemically important banks (G-SIBs) from 2011 to 2012, indicating the inclusion in our sample or stating the reason for exclusion (see Table 13 in the Appendix).

In line with the source information, we quantify the compensation in nominal terms. We define CEO compensation as the compensation paid by the parent company plus any compensation paid by subsidiaries (for the CEO position). In rare cases, firms only provide a certain wage range. In that case, we always use the upper bound as the actual compensation. Our measure of CEO compensation does not include changes in the value of existing firm options and stock holdings owned by the CEO.

In order to collect the total compensation data, we focus on the amount the firm itself defines as the “total.” This always includes all the positions used for the fixed compensation amount as well as performance-related components. The name and the exact composition of these performance-related components vary significantly between firms. For example, some firms differentiate between long-term and short-term incentives, whereas others just talk about bonuses. This seems to be related to the pertaining country and its national regulations. We ignore any extraordinary compensation such as restricted shares (which had been allocated when starting as CEO), payment in lieu of notice, or buyout. Finally, we exclude all amounts received related to the holding of a director position in addition to the CEO position.

We also collect information to create a dummy variable that indicates explicit disclosure of peer firms in determining a company’s relative performance pay to its CEO. We interpret such disclosures as an indication of RPE usage and examine the subsample of disclosing firms in the “Regression results” section. We then identify possible factors related to this disclosure in the “Extensions and robustness checks” section. Note that this approach is less restrictive than a strict requirement of overt RPE claims, and would therefore also pick up simple benchmarking (for details on the difference between RPE and benchmarking, see Gong et al. (2011)). This runs a higher risk of not rejecting the null hypothesis even if it is false. If we do find evidence against the null hypothesis, however, we can be quite confident that disclosure has a significant impact.

3.2 International banking sample

We convert all compensation data into US Dollars by using exchange rates from Thomson Reuters Datastream. The exchange rate is determined by the day after the end of the fiscal year (e.g., if the fiscal year ends on December 31, 2010, we take the exchange rate on January 1, 2011). We measure firm performance with stock market return data from Thomson Reuters Datastream. Following the literature, we control for firm size (sales) (Smith and Watts 1992; Fama and French 1992) and growth opportunities (Fama and French 1992). In addition, we include dummies to control for year-specific differences in the level of compensation, industry dummies that capture unobservable variation at the industry level, and country dummies that capture any country-specific variation (e.g., due to different regulations or legal directives). In order to control for this possible country-specific heterogeneity, we only keep banks from countries with at least two observations.

Panel A of Table 1 shows the frequencies for the full sample, the RPE disclosure subsample, and the non-RPE disclosure subsample for each year. Altogether, the data for the full sample are evenly distributed over the years 2004–2013, though the frequency of the data tends to increase somewhat over time.Footnote 17 The same applies for both subsamples.

Panel B of Table 1 displays the sample frequency by industry group within the banking industry. In addition, it reports the frequency of RPE disclosing and non-disclosing banks by industry group. Subsector 6029 (Commercial Banks) dominates the full sample with more than 80% of all observations. The other subsectors are National Commercial Banks (6021), State Commercial Banks (6022), Federal Saving Institutions (6035), and Security Brokers and Dealers (6211). Similar to the full sample results, RPE disclosing (84.57%) and non-RPE disclosing banks (86.71%) mostly belong to the subsector 6029 (Commercial Banks). All the banks belonging to the subsector 6022 (State Commercial Banks) and the subsector 6035 (Federal Saving Institutions) disclose their peer groups.

Panel C of Table 1 depicts the sample frequency by country, including the RPE and non-RPE subsamples. Among the 14 countries in the full sample, Canada, Australia, Singapore, Sweden, and the UK provide the largest shares of our observations. The banks in Canada, Australia, and Germany have the highest propensities of RPE disclosure. In Australia and Germany, all banks provide information about their peer groups, whereas none of the banks in Hong Kong, China, and Norway do so.

Table 2 shows Pearson correlation coefficients between performance measures and the control variables firm size and growth options. Firm stock return and industry peer return as well as firm stock return and industry/size peer return display positive correlations. The correlation of firm stock return with its industry peer return (0.73) is lower than the correlation of firm stock return with its industry/size peer return (0.79), which is consistent with previous evidence (Albuquerque 2009).Footnote 18 The statistically significant correlation coefficients increase our confidence that industry and industry/size peers are suited for filtering out noise from firm performance measures. In addition, total executive compensation is positively and significantly correlated with stock return (0.15). The same holds for the correlation between total compensation and industry peer return, and total compensation and industry/size peer return. Not surprisingly, total compensation is positively correlated with firm size. In order to identify a possible multicollinearity problem in the upcoming regressions, we report variance inflation factors in all respective tables.Footnote 19

3.3 Peer group composition

For the selection of the peer firm pool, we start with a comprehensive list of 4,228 firms, most of which are financials. We use SIC-codes to remove firms which do not belong to the banking industry.Footnote 20 We also exclude other firms which we do not consider valid peers, such as the Allied Irish Banks, which technically became state-owned during the financial crisis. We then apply a number of screens to the return data to obtain a qualitatively sound data set (Ince and Porter 2006). First, we delete any consecutive zero returns at the end of the sample period. Second, we remove returns below −80% and above 300%. We also require that the one-year continuously compounded return obtained from monthly data is available. We end up with 1,570 firms as the pool of potential peers. Note that in order to mitigate survivorship bias, this pool also contains so-called “dead stocks” which were delisted from the stock market during the sample period.
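The two return screens can be sketched as below. This is a hedged illustration, not the authors’ data-cleaning code: the function name `screen_returns` is hypothetical, returns are assumed to be simple returns in decimal form ordered from oldest to newest, and a trailing run of zero returns of any length (including a single zero) is treated as stale and dropped.

```python
def screen_returns(returns, lower=-0.80, upper=3.00):
    """Apply two screens to one stock's return series (oldest observation first)."""
    # Screen 1: drop the run of zero returns at the end of the sample period
    end = len(returns)
    while end > 0 and returns[end - 1] == 0.0:
        end -= 1
    trimmed = returns[:end]
    # Screen 2: drop returns below -80% or above 300% (boundary values are kept)
    return [r for r in trimmed if lower <= r <= upper]
```

For example, `screen_returns([0.05, 0.0, -0.9, 3.5, 0.1, 0.0, 0.0])` keeps the interior zero but removes the two trailing zeros and the two extreme observations.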

RPE firms assess their CEOs’ compensation levels based on performance in relation to their respective peers. These peers are not simply a random draw of the market; firms follow a specific methodology in selecting their peers. Because researchers usually do not know a firm’s peers, a different approach is needed to approximate the peer group. Most studies assessing RPE employ broad industry or market indices as a comparison group for peer performance. This is not without problems. Firms within an industry are hardly homogenous in their characteristics, so simple benchmarks are not able to adequately capture common shocks (Albuquerque 2009).Footnote 21 This introduces a bias in the statistical estimation and can distort inferences. An inappropriate comparison group can lead to a higher (or lower) prescribed level of CEO pay. An expedient and replicable comparison group based on a reasonable and objective criterion is therefore the key element when empirically testing for RPE.

Albuquerque (2009) provides a pragmatic solution for the ex post reconstruction of RPE peer groups. She composes groups based on both the two-digit SIC level and firm size. The first step in her process sorts firms by beginning-of-year market value into size quartiles within an industry. This yields four peer groups per industry. Each firm is then matched with its industry-size peer group. It turns out that this approach yields stronger empirical support for the use of RPE in executive compensation than sorting by industry classification alone, an improvement that is due to the information that firm size captures. Firms of comparable size are similar along several other characteristics which proxy for systematic risk. Albuquerque shows how the levels of diversification, financing constraints, and operating leverage vary with industry-size-ranked portfolios and provides evidence that firm size subsumes these characteristics. She finds that larger firms tend to be more diversified, have greater operating leverage, and face smaller financing constraints. This claim is supported by other literature. Demsetz and Strahan (1997), for instance, construct a measure of diversification of BHCs. Their results establish a strong, positive effect of bank size on the diversification of BHCs. Moreover, small firms tend to face bigger financial constraints in comparison to large ones. In other words, firm size is a proxy with high explanatory power for the common uncertainty that Holmstrom (1982) described. We thus proceed to build the specific peer groups by adapting the industry/size approach by Albuquerque (2009).
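The industry/size sorting described above can be sketched as follows. The code is illustrative only: the function names and the dictionary keys (`name`, `industry`, `market_value`, `ret`) are hypothetical, quartiles are assigned by rank within each industry, and peer performance is aggregated as an equal-weighted average that excludes the firm itself, which is an assumption of this sketch rather than a detail stated here.

```python
from collections import defaultdict


def assign_size_quartiles(firms):
    """Map each firm name to (industry, size quartile 0..3) using beginning-of-year market value."""
    by_industry = defaultdict(list)
    for f in firms:
        by_industry[f["industry"]].append(f)
    groups = {}
    for industry, members in by_industry.items():
        ranked = sorted(members, key=lambda f: f["market_value"])
        n = len(ranked)
        for rank, f in enumerate(ranked):
            groups[f["name"]] = (industry, rank * 4 // n)  # 0 = smallest quartile
    return groups


def peer_return(firms, groups, name):
    """Equal-weighted return of the firm's industry/size peers, excluding the firm itself."""
    key = groups[name]
    peers = [f["ret"] for f in firms if groups[f["name"]] == key and f["name"] != name]
    return sum(peers) / len(peers) if peers else None  # None if the firm has no peers
```

Dropping the quartile from the group key would give the broader industry-only peer group against which the industry/size results are compared.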

4 Results

4.1 Full sample results

This subsection presents the results for our full banking sample. We first present descriptive statistics of compensation data, performance measures, and firm characteristics for the 46 firms during 2004–2013 (see the “Summary statistics” subsection). In the “Regression results” section, we then document the statistical results. We regress the logarithm of total CEO compensation on firm stock performance, peer return, and several control variables.

4.1.1 Summary statistics

Table 3 presents descriptive statistics for the full sample. We report two measures of compensation: total compensation and the logarithm of total compensation. In the regression analysis, we use the logarithm of total compensation as a dependent variable because its empirical distribution is more symmetrical than the one for total compensation. This mitigates heteroscedasticity as well as extreme skewness and allows for a direct comparison with results from previous studies (Murphy 1999). We also report summary statistics for the control variables firm size (log of sales and sales) and growth options. Table 3 shows that the average (median) total compensation of an executive in our sample is USD 5.28 million (USD 4.12 million), which is not all that surprising in a sample that largely consists of major global players in the banking industry. Firm performance is measured using log-returns. The mean firm stock return is 5% and the median return is 14%. Averages of peer returns hover around 8%. The average (median) size in terms of sales of a bank is USD 34.15 billion (USD 20.63 billion). Using total assets as a proxy for size, the corresponding value is USD 738.47 billion (USD 376.83 billion).

Table 3 Descriptive statistics

4.1.2 Regression results

We test the use of RPE in CEO compensation using Eq. 1. Peer groups are constructed with the industry and industry/size approach. We regress the logarithm of total CEO compensation on firm stock return, peer return, growth options, and log of sales. Year, country, and industry dummies are also included.

Panel A of Table 4 shows the sensitivity of CEO total compensation to RPE when using industry and industry/size peer groups. The coefficient on firm stock return is positive and statistically significant at the 1% level for both peer group specifications, with values of 0.47 and 0.55 for the industry and industry/size specifications, respectively. When the peer group is restricted to firms within the same industry, the coefficient of the peer portfolio is negative but not significant (− 0.11 with a p value of 0.56). Put differently, the performance of these peers does not seem to be filtered out from the CEO compensation contracts. If we include size into sorting and consider industry/size peers, the parameter estimates become statistically significant (with a coefficient of − 0.32 and a p value of 0.05). Robustness checks yield mixed results.Footnote 22Footnote 23
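To make the logic of the weak-form test concrete, the following sketch fits a stripped-down version of the regression by ordinary least squares. It is only an illustration: it omits the control variables and dummies, computes no standard errors, and uses made-up, noise-free data in which log compensation loads positively on firm return and negatively on peer return. The helper `ols` is a hypothetical name.

```python
def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Made-up illustration data (not from the sample used in this paper).
firm_ret = [0.10, -0.05, 0.20, 0.02, -0.15, 0.08]
peer_ret = [0.05, -0.02, 0.10, 0.06, -0.20, 0.01]
log_comp = [1.0 + 0.5 * f - 0.3 * p for f, p in zip(firm_ret, peer_ret)]

X = [[1.0, f, p] for f, p in zip(firm_ret, peer_ret)]
intercept, b_firm, b_peer = ols(X, log_comp)
# Weak-form RPE predicts b_firm > 0 and b_peer < 0.
```

In an actual application, the sign of the peer coefficient would have to be judged together with its standard error, which this sketch does not compute.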

By and large, the results for our international banks match previous findings for US firms, which showed that industry/size peers are better able to capture exogenous shocks than industry peers alone (Albuquerque 2009).

4.2 RPE subsample results

4.2.1 Weak tests of RPE

The results above are consistent with the notion that the 46 banks in our full sample follow an RPE scheme. We now turn to the informational value of peer disclosure. Although there is a risk in taking disclosure at face value, we exploit this information to sharpen our sample’s profile. We test the sensitivity of CEO pay to RPE in the subsample of 25 banks that explicitly declare the use of peers in determining the performance of their CEOs in their proxy statements (see the “Compensation data” section). We follow the same empirical specification used in the previous analysis.

Panel B of Table 4 shows the sensitivity of CEO total compensation to RPE when using industry and industry/size peers. The results show positive and statistically significant parameter estimates on firm stock performance for both peer group specifications. The estimates are 0.49 and 0.69, respectively, indicating that a CEO is being rewarded for positive firm performance. Hence, on average, CEO compensation increases with firm performance. When the peer groups are composed of banks within the same industry, the coefficient on the peer portfolio is negative and not significant (with a coefficient of − 0.30 and a p value of 0.40). The industry/size parameter estimate is also negative but statistically significant at the 5% level (with a coefficient of − 0.66 and a p value of 0.02). The results for the subsample of disclosing banks, too, provide evidence consistent with RPE, but more conclusively so than the results for the full sample. The coefficient on the peer portfolio doubles in size and increases in statistical significance.Footnote 24 This suggests that peer group disclosure holds informational value regarding RPE. One could also say that the inclusion of non-disclosing banks in the full sample tends to dilute the statistical inference and renders it less conclusive.Footnote 25 These results stand in contrast to Gong et al. (2011), who find no informational value of RPE disclosure. However, their sample only comprises US firms for 1 year.

4.2.2 Strong-form test of RPE

Following Antle and Smith (1986), we perform so-called strong-form tests of RPE on the subsample of RPE disclosures to verify the robustness of our results. Strong-form tests of RPE examine whether all the noise that can be removed is indeed filtered out from the compensation contracts. Details on the construction of systematic and unsystematic firm performances and the employed empirical model are reported in the “Empirical model” subsection. In a nutshell, the results are consistent with RPE if only the unsystematic performance exerts influence on CEO pay, and not the systematic one.

Panel C of Table 4 documents the regression results from Eq. (4) for the subsample of disclosing banks. Here, we regress the logarithm of CEO compensation on unsystematic firm performance, systematic firm performance, and control variables for 162 firm-year observations over the time span 2004–2013. In that specification, we restrict ourselves to industry/size groups for constructing the systematic performance variable. The systematic component is not significant with a coefficient estimate of 0.03 (p value = 0.89). The unsystematic performance variable, on the other hand, is positive and statistically significant with a coefficient of 0.69 (p value = 0.00). This suggests that the CEOs in our subsample are being compensated for unsystematic performance only. These results hold up to several robustness tests and provide evidence in keeping with the use of strong-form RPE and reinforce the previous finding that CEOs are not being compensated for systematic performance in the subsample of RPE disclosures.Footnote 26
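The split into systematic and unsystematic performance can be sketched as follows. The sketch assumes a simple first-stage regression of firm return on peer return, in the spirit of Antle and Smith (1986): the fitted value is treated as systematic performance and the residual as unsystematic performance. The exact construction used in this paper is given in the “Empirical model” subsection; the function name and the data below are hypothetical.

```python
def decompose_performance(firm_ret, peer_ret):
    """Split firm returns into a peer-explained part and a residual part."""
    n = len(firm_ret)
    mean_f = sum(firm_ret) / n
    mean_p = sum(peer_ret) / n
    cov = sum((p - mean_p) * (f - mean_f) for p, f in zip(peer_ret, firm_ret))
    var = sum((p - mean_p) ** 2 for p in peer_ret)
    if var == 0:
        raise ValueError("peer returns must vary across observations")
    beta = cov / var
    alpha = mean_f - beta * mean_p
    systematic = [alpha + beta * p for p in peer_ret]        # fitted values
    unsystematic = [f - s for f, s in zip(firm_ret, systematic)]  # residuals
    return systematic, unsystematic


# Made-up illustration data.
firm_ret = [0.12, -0.03, 0.25, 0.01, -0.10]
peer_ret = [0.08, -0.01, 0.15, 0.03, -0.12]
systematic, unsystematic = decompose_performance(firm_ret, peer_ret)
```

By construction, the unsystematic component is uncorrelated with peer return, so a second-stage regression of log compensation on both components can test whether only the unsystematic part is rewarded.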

5 Extensions and robustness checks

5.1 Associated factors of RPE in the banking industry

Prior studies have put forth a variety of factors that are related to the usage of RPE in compensation contracts in UK and US firms (Carter et al. 2009; Gong et al. 2011; Albuquerque 2014). They do not, however, examine the relation of one factor at a time on the usage of RPE while controlling for other factors. Gong et al. (2011) investigate explicit disclosures on RPE in the US to identify the factors that prompt the use of RPE in compensation contracts in 2006. Carter et al. (2009) examine the use of RPE in performance-vested equity grants in a sample of UK firms in 2002. This section examines international firms over a longer time span. Understanding what factors are linked to RPE is instructive for researchers testing for RPE and could offer yet another reason for the mixed evidence in existing empirical studies.

In order to pinpoint possible factors related to RPE, we conduct a logit regression. The dependent variable y_it is an indicator variable that equals 1 for banks that disclose information on the use of a peer group to determine the level of executive compensation, and 0 otherwise (see the “Compensation data” section). The independent variables include CEO pay (Comp_it), firm performance (FirmPerf_it), various specifications of peer return (PeerPerf_it), and control variables. We control for firm size (FirmSize_it) and growth options (GrowthOptions_it) and include year (YearDummy_it), industry (IndustryDummy_it), and country (CountryDummy_it) dummies to control for cross-sectional variation. Sales are used as a proxy for firm size. Growth options are calculated as follows: (Market Equity + Total Assets − Common Equity)/Total Assets.
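As a small worked example of the growth options measure, the following sketch evaluates the formula for made-up balance sheet figures. The function and argument names are hypothetical.

```python
def growth_options(market_equity, total_assets, common_equity):
    """(Market Equity + Total Assets - Common Equity) / Total Assets."""
    if total_assets <= 0:
        raise ValueError("total assets must be positive")
    return (market_equity + total_assets - common_equity) / total_assets


# Made-up figures in USD billion: the market value of equity exceeds
# its book value by 10, so the ratio lies slightly above one.
example = growth_options(market_equity=50.0, total_assets=1000.0,
                         common_equity=40.0)
```

The ratio equals one when the market value of equity coincides with its book value and rises above one as the market attaches a premium to the book value.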

Our logit model is based on the following latent variable model:

$$\begin{array}{*{20}l} y_{it} = & \;\gamma_{0}+\gamma_{1}\cdot {Comp}_{it}+ \gamma_{2}\cdot {FirmPerf}_{it}\\ &+ \gamma_{3} \cdot {PeerPerf}_{it} +\gamma_{4}\cdot {FirmSize}_{it} \\ &+ \gamma_{5} \cdot {GrowthOptions}_{it}+\gamma_{6} \cdot {YearDummy}_{it}\\ &+ \gamma_{7} \cdot {IndustryDummy}_{it} \\ &+ \gamma_{8} \cdot {CountryDummy}_{it} + u_{it}. \end{array} $$
(5)
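For intuition, the sketch below shows how a linear index of the form in Eq. (5) translates into a disclosure probability, assuming that the error term follows a logistic distribution. The coefficient values and the reduced set of regressors are made up for illustration and are not estimates from this paper.

```python
import math


def disclosure_probability(gamma, x):
    """Logistic probability that the indicator equals 1, given index x'gamma."""
    index = sum(g * xi for g, xi in zip(gamma, x))
    # Numerically stable evaluation of 1 / (1 + exp(-index)).
    if index >= 0:
        return 1.0 / (1.0 + math.exp(-index))
    e = math.exp(index)
    return e / (1.0 + e)


# Hypothetical coefficients: intercept, firm size, growth options.
gamma = [-2.0, 0.8, -1.5]
p_small = disclosure_probability(gamma, [1.0, 2.0, 1.1])
p_large = disclosure_probability(gamma, [1.0, 4.0, 1.1])
```

With a positive coefficient on firm size, the larger hypothetical bank receives the higher predicted probability, which mirrors the direction of the size effect discussed in the text.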

We estimate Eq. (5) with the full sample of 335 firm-year observations from 2004 to 2013. Table 5 reports the results. We find that the likelihood of using RPE is positively related to firm size for both industry and industry/size peers.Footnote 27 The opposite holds for growth options, which are negatively related to the likelihood of using RPE.Footnote 28 None of the other predictors is statistically significant, indicating that in our sample there is a strong link between size and growth options on the one hand and RPE on the other.Footnote 29

These results are in line with existing evidence. Gong et al. (2011) find that larger firms are more likely to use RPE. This could be for several reasons. Firm size might represent a crude proxy for public scrutiny and shareholder concerns about compensation practices. Larger firms are also more exposed to monitoring pressure than smaller firms. This might well force them to be more committed to RPE (Bannister and Newman 2003). Albuquerque (2014) and Gong et al. (2011) find that the level of RPE in CEO compensation contracts is negatively associated with a firm’s level of growth options. Carter et al. (2009) examine the disclosure of performance-based conditions in equity grants and document that growth options are inversely related to the performance-based conditions. Albuquerque (2014) argues that firms with high growth options have to bear more risks and thus exhibit a higher idiosyncratic variance. These firms are also characterized by firm-specific know-how and operate in markets with high barriers to entry. As a consequence, peer performance is uninformative about external shocks for such firms. This eventually leads to less use of RPE among firms with high growth options (Albuquerque 2014, p. 1).

5.2 RPE pay practices and bank-level characteristics

We next extend our analysis and investigate the relation between the magnitude of RPE pay practices and various bank-level characteristics. We first repeat the standard estimation (Eq. 1) conducted in the “Regression results” section with the industry/size peer group. We quantify RPE-intensity via the ratio of predicted (log) CEO-compensation to the actual (log) CEO-compensation. This prediction is only based on firm stock return and peer group return. The idea here is to separate firms (or firm-years) for which compensation is mainly based on firm performance and peer group performance from firms (or firm-years) for which other factors are more important. We then proceed to sort all firm-years based on this measure of RPE-intensity into four groups of equal size to examine if various bank-level characteristics are related to RPE-intensity. We analyze three different measures; two proxies for firm performance and one proxy for firm-specific risk: (1) return on equity (ROE), (2) the (yearly) firm stock return, and (3) the variance of the firm stock return. We calculate ROE by dividing net sales (Datastream code DWSL) by lagged common equity (Worldscope item WC03501). The calculation of the yearly stock return is described in the “International banking sample” section. Finally, firm stock return variance is calculated as the variance of the stock returns over the previous 36 months.
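The sorting step described above can be sketched in a few lines. The sketch takes predicted and actual log compensation as given, forms their ratio, and assigns firm-years to four equally sized groups. It also includes a trailing 36-month return variance; whether the sample or the population variance is used is not specified in the text, so the sample variance here is an assumption. All names and numbers are hypothetical.

```python
import statistics


def rpe_intensity(predicted_log_comp, actual_log_comp):
    """Ratio of predicted to actual log CEO compensation per firm-year."""
    return [p / a for p, a in zip(predicted_log_comp, actual_log_comp)]


def sort_into_groups(values, n_groups=4):
    """Group index per observation (0 = lowest values), equal group sizes."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    groups = [0] * len(values)
    for rank, i in enumerate(order):
        groups[i] = min(n_groups * rank // len(values), n_groups - 1)
    return groups


def trailing_variance(monthly_returns, window=36):
    """Sample variance of the most recent `window` monthly returns."""
    return statistics.variance(monthly_returns[-window:])


# Made-up firm-years: predictions from firm and peer return only.
predicted = [7.0, 7.5, 8.0, 6.0, 7.2, 7.8, 6.5, 7.9]
actual = [8.0] * 8
intensity = rpe_intensity(predicted, actual)
groups = sort_into_groups(intensity)
```

Bank-level characteristics such as ROE or return variance can then be averaged within each of the four groups.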

The results are shown in Table 6. No clear relation between RPE-intensity and ROE can be made out. Stock returns seem to decrease with RPE-intensity. This is a purely descriptive exercise which cannot pinpoint any causality, so one can only speculate whether lower stock returns lead to more RPE practices or whether fewer RPE practices lead to higher stock returns (or whether there is an unobservable characteristic driving both of them). There is no monotonic relation between stock return variance and RPE-intensity, but the correlations show that firms with comparatively high and comparatively low RPE-intensities have somewhat higher return variances, suggesting that these firms are riskier. A more detailed investigation of the mechanism behind these observations could be an area for future research.

Table 6 RPE pay practices and bank-level characteristics

5.3 Robustness checks

In this section, we conduct three different checks to gauge the robustness of the results obtained in the “Results” section: (1) we construct regional instead of global peer groups, (2) we construct value-weighted instead of equal-weighted peer groups, and (3) we examine the effect of excluding the years of the financial crisis (2007 and 2008).

5.3.1 Regional peer groups

Our first robustness check restricts the construction of the peer group by employing regional instead of global peer groups. We first classify seven regions as defined by the World Bank:Footnote 30 “Europe and Central Asia,” “Middle East and North Africa,” “Latin America and Caribbean,” “East Asia and Pacific,” “South Asia,” “Sub-Saharan Africa,” and “North America.” Not surprisingly, the correlations between peer and firm stock returns are somewhat higher with regional peer groups than with global peer groups (as shown in Tables 7 and 2, respectively). The correlation between industry peer group and total compensation is no longer significant. The correlation between industry/size peer group and total compensation, on the other hand, does not change when peer groups are regional.

Table 7 Pearson correlation coefficients with regional peer groups

Table 8 replicates the main results of the “Regression results” section using regional peer groups. By and large, the results are similar to the ones obtained with global peer groups in the “Regression results” section. The coefficients of the industry/size peer group remain significant at the 5% level for both the full sample and the disclosure subsample. The strong-form RPE test for the disclosure subsample continues to support the hypothesis that firms apply RPE.

Table 8 Regressions estimating the sensitivity of CEO compensation to RPE with regional peer groups

5.3.2 Value-weighted peer groups

With a skewed size distribution, equal weights might overstate the influence of smaller banks in peer groups. This is a concern in our sample because it contains the largest banks in the world, possibly biasing our results. To mitigate the impact of smaller banks, the next robustness check employs value-weighted instead of equal-weighted peer groups. For this purpose, we use the market capitalization at the fiscal year date. Table 9 shows the corresponding correlations. The results do not change much compared to equal-weighted peer groups as shown in Table 2.
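The difference between the two weighting schemes can be seen in a minimal sketch. The function name and the numbers are hypothetical; market capitalizations serve as value weights.

```python
def peer_return(returns, market_caps=None):
    """Equal-weighted peer return, or value-weighted if market caps are given."""
    if market_caps is None:
        return sum(returns) / len(returns)
    total = sum(market_caps)
    return sum(r * m for r, m in zip(returns, market_caps)) / total


# Made-up peer group: one large bank and one small bank.
returns = [0.10, -0.02]
market_caps = [900.0, 100.0]
equal_weighted = peer_return(returns)
value_weighted = peer_return(returns, market_caps)
```

In this example, the value-weighted return lies much closer to the return of the large bank, which illustrates how value weights reduce the influence of smaller peers.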

Table 9 Pearson correlation coefficients with value-weighted peer groups

Table 10 replicates the main results of the “Regression results” section using value-weighted peer groups. The differences between the industry and the industry/size peer groups become less pronounced. This is to be expected because value weights shift the focus to the biggest firms in each industry, and the banks in our sample are most likely in this group. Taken together, we come to the same conclusions: the size/industry peer group still performs better than the industry peer group, and the strong-form RPE test for the disclosure subsample continues to support the hypothesis that firms apply RPE.

Table 10 Regressions estimating the sensitivity of CEO compensation to RPE with value-weighted peer groups

5.3.3 Exclusion of financial crisis years

The financial crisis in 2007 and 2008 had far-reaching implications for the performance of banks (e.g., Fahlenbrach and Stulz 2011). These crisis years might distort the results of our analysis by driving the correlation between firm performance and industry/size peer return. In a third robustness check, we exclude the years 2007 and 2008 from our sample.Footnote 31

The correlations obtained without the years 2007 and 2008 strongly differ from the baseline results in Table 2 (see Table 16). While still significant, the correlations between peer return and firm stock return become much less pronounced. The correlations between total compensation and peer return even turn negative, albeit not statistically significantly.

We next examine whether this substantial change in correlations affects the main conclusions from our baseline results. Table 11 shows that the peer group coefficients become generally more negative. But the weak-form regression paints the same picture as the baseline results: Size/industry peer groups have larger coefficients than industry-only peer groups, and the disclosure subsample shows stronger evidence in favor of RPE than the full sample. The strong-form regressions reject the hypothesis that systematic risk is not filtered out of the compensation contract at a 5% level of statistical significance (instead of 10% in the other results).

Table 11 Regressions estimating the sensitivity of CEO compensation to RPE without the years 2007 and 2008

5.3.4 Additional robustness checks

We performed additional, unreported robustness checks.Footnote 32 Using total assets instead of sales as a proxy for firm size yields very similar results. Value-weighted regional peer groups do not substantially alter the results of the baseline specification, either. Finally, in order to disentangle cross-sectional from time-series effects, we conduct a panel estimation with fixed year-effects and fixed bank-effects. The between-group estimates yield mostly insignificant coefficients. The coefficient for the disclosure subsample with the size/industry peer group, however, is negative and significant (at the 5% level), and thus in line with our baseline result.

6 Conclusion

This paper tests for the presence of RPE in an original sample of 46 international banks from 2004 to 2013. We regress the logarithm of total compensation on firm performance, industry and industry/size peer performance, and control variables such as firm size and growth options. We control for unobservable variation in the level of compensation across years, industries, and countries. When we account for peer groups with peer selection based on industry and firm size, we find evidence for the use of RPE in international banking. This evidence becomes stronger once we focus on banks that openly disclose the use of peers in their remuneration practice. This insight contrasts with and complements previous findings for the US.

We next employ a logit regression model to identify factors related to RPE in international banking. The evidence supports the working theory that growth options and firm size play a crucial role in banks’ decisions to use RPE. Our results are robust to different model specifications and are consistent with existing evidence. We find that the likelihood of RPE usage is decreasing with growth options. A possible explanation for this result is that the implementation of RPE in high growth option banks might be too costly due to difficulties in identifying the correct peer group, rendering such banks less likely to use RPE. We also find that larger banks are more inclined to use RPE in their compensation contracts. This is a plausible finding. In light of the recent financial crisis, high levels of CEO compensation have attracted a lot of attention, and large banks in particular have been under significant monitoring and shareholder pressure. In response to such pressure, large banks are more likely to have become incentivized to be committed to RPE usage in determining the level of CEO pay.

Our overarching findings suggest at least four things. First, large international banks seem to use RPE in assessing the performance of their CEOs. This holds more conclusively for banks that disclose their peer groups. The latter implies the second point: disclosure statements seem to have some merit, at least in our sample, and credibly reflect good corporate practice on that score. Disclosing firms do not seem to limit themselves to preaching water; they likely drink it, too. This finding lends support to the credibility and thus to the informational value of RPE commitments. These first two points have important consequences for shareholders. Left on their own, CEOs would rather follow their own ways to maximize utility. RPE helps align the interests of shareholders and CEOs by creating incentives for CEOs to take actions that increase shareholder wealth. Third, in line with previous studies, our evidence indicates that industry/size peers are better able to capture exogenous shocks than industry peers alone. Finally, empirical evidence on RPE runs the risk of being diluted when disclosing and non-disclosing firms are pooled. In studies of RPE, it therefore seems advisable, if only for robustness, to stratify empirical samples by disclosure. This should inform future research.

7 Appendix

7.1 Sample banks and (non-US) global systemically important banks

Table 12 List of international banks in the full sample
Table 13 List of non-US global systemically important banks (G-SIBs), 2011–2012

7.2 Kernel-based peer group construction

This appendix addresses some issues one might have with the industry/size quartile approach and introduces a novel Kernel-based alternative. This alternative extends Albuquerque (2009) with a more flexible peer group construction method. The following example illustrates a possible caveat of the industry/size quartile approach. In Albuquerque (2009), all firms are partitioned and ranked into four size groups (per industry). In ascending order, the first group contains 25% of the firms with the smallest size, and the fourth group contains 25% of the firms with the largest size. The boundaries between the four groups, the so-called breakpoints, thus lie on 25%, 50%, and 75% of the ranked values of firm size. Now let us assume that we want to test the RPE hypothesis on a target company that is very close to the breakpoint between the first and the second quartile, but just happens to fall into the first one. In this particular case, it is not readily obvious why the first peer group, and not the second one, should be assigned to the target firm. Our alternative method of peer group composition addresses this issue. For every target firm, we assign a unique peer group that is determined by the target firm’s size. We implement this with a Kernel-based weighting scheme. Firms that are closer to the target firm in terms of firm size receive a weight specific to the distance from the target firm. More concretely, a weighting function assigns a higher weight to a peer firm if it exhibits a smaller distance to the target firm in terms of firm size (we will also allow for equal weights). We measure the differences of firm sizes as follows:

$$ \begin{aligned} \mathrm{D}_{i}=\text{Size}_{T}-\text{Size}_{i} && \text{where} && i=1,..., N. \end{aligned} $$
(A.6)

Size_T denotes the size of the target company measured in terms of firm sales, and Size_i is a proxy for the size of all other firms. We standardize the “distances” by dividing them by the cross-sectional standard deviation, s(D_i):

$$ \begin{aligned} \mathrm{D}^{*}_{i}=\frac{\mathrm{D}_{i}}{\mathrm{s}(\mathrm{D}_{i})} && \text{where} && i=1,..., N \end{aligned} $$
(A.7)

From these standardized distances, we construct weights using a kernel weighting function. The firm i in the sample of N firms will be assigned the weight

$$ \begin{aligned} \mathrm{w}_{i}=\mathrm{K}(\mathrm{D}^{*}_{i}) \end{aligned} $$
(A.8)

Additionally, we create weights by multiplying the standardized difference with the following scaling factor (SF):

$$ \begin{aligned} \text{SF} &= \text{Median}\left(\frac{\mathrm{s}(\text{Size}_{i})}{\mathrm{s}(\text{Size}_{T})}\right)\cdot 2 \\ \mathrm{D}^{**}_{i} &= \mathrm{D}^{*}_{i} \cdot \text{SF} \\ \mathrm{w}_{i} &= \mathrm{K}(\mathrm{D}^{**}_{i}). \end{aligned} $$
(A.9)

For robustness, we use three types of kernel functions to assign weights: (1) the probability density function (pdf) of the standard normal distribution, (2) the pdf of the uniform distribution, and (3) the pdf of the “cosine distribution”. In addition, we standardize each weight with the sum of all weights. This amounts to the following peer performance weight:

$$ {\mathrm{w}_{i}}^{*}=\frac{\mathrm{w}_{i}}{{\sum_{j=1}^{N}\mathrm{w}_{j}}} $$
(A.10)

such that

$$ \sum_{i=1}^{N}{\mathrm{w}_{i}}^{*}=1. $$
(A.11)

Finally, we use the performance weights and individual firm performance Perf_i to construct each target firm’s peer performance as follows:

$$ \text{PeerPerf}=\sum_{i=1}^{N}{\mathrm{w}_{i}^{*}\cdot \text{Perf}_{i}} \; \; \text{where} \;\; i=1,..., N. $$
(A.12)
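The steps in Eqs. (A.6) to (A.12), without the scaling factor of Eq. (A.9), can be sketched as follows. This is an illustration rather than the implementation used in this paper. Three choices are assumptions: the uniform kernel is taken to be the density on [−1, 1], the cosine kernel is taken to be the standard cosine kernel on [−1, 1], and s(D_i) is computed as the sample standard deviation. Function names and data are hypothetical.

```python
import math
import statistics


def normal_kernel(u):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def uniform_kernel(u):
    """Assumption: uniform density on [-1, 1]."""
    return 0.5 if abs(u) <= 1.0 else 0.0


def cosine_kernel(u):
    """Assumption: standard cosine kernel on [-1, 1]."""
    if abs(u) > 1.0:
        return 0.0
    return (math.pi / 4.0) * math.cos(math.pi * u / 2.0)


def kernel_peer_performance(target_size, peer_sizes, peer_perfs,
                            kernel=normal_kernel):
    """Return normalized weights and the weighted peer performance."""
    d = [target_size - s for s in peer_sizes]            # Eq. (A.6)
    sd = statistics.stdev(d)                             # s(D_i), sample version
    d_star = [x / sd for x in d]                         # Eq. (A.7)
    w = [kernel(x) for x in d_star]                      # Eq. (A.8)
    total = sum(w)
    if total == 0:
        raise ValueError("all kernel weights are zero")
    w_star = [x / total for x in w]                      # Eq. (A.10)
    peer_perf = sum(ws * p for ws, p in zip(w_star, peer_perfs))  # Eq. (A.12)
    return w_star, peer_perf


# Made-up example: target firm of size 25 and four peers.
sizes = [10.0, 20.0, 30.0, 40.0]
perfs = [0.0, 0.1, 0.1, 0.0]
w_norm, perf_norm = kernel_peer_performance(25.0, sizes, perfs)
w_unif, perf_unif = kernel_peer_performance(25.0, sizes, perfs, uniform_kernel)
```

In this example, the two peers closest in size to the target receive the largest weights under the normal kernel, while the uniform kernel assigns equal weights to peers within its bandwidth and zero weight to the others.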
Table 14 Regressions estimating the sensitivity of CEO compensation to RPE
Table 15 Logit regression of RPE usage in executive compensation contracts
Table 16 Pearson correlation coefficients without the years 2007 and 2008

7.3 Regression results (kernel-based peers)

7.3.1 Full sample of banks

Panel A of Table 14 reports the results from regressing the logarithm of total compensation on firm stock return, Kernel-based peers, growth options, and log of sales. The parameter estimates on the peer portfolio are negative but insignificant and thus provide no evidence for the presence of RPE. The estimates hardly differ across the different Kernel specifications. They are − 0.26 (p value = 0.38) for the normal Kernel function, − 0.16 (p value = 0.54) for the cosine Kernel function, and − 0.20 (p value = 0.48) for the uniform Kernel function. Panel A of Table 14 also reports results for a slightly adjusted Kernel-based approach, in which the difference in firm size is multiplied by the scaling factor introduced in the previous section. We again test for the presence of RPE by regressing the log of total CEO compensation on firm stock return, peer performance, growth options, and log of sales. We also include year, country, and industry dummies. The coefficient on the log of firm stock return is again positive and statistically significant at the 1% level for every specification. The negative coefficients on the Kernel-based peer portfolio persist. They are − 0.22 (p value = 0.39) for the normal Kernel function, − 0.20 (p value = 0.38) for the cosine Kernel function, and − 0.27 (p value = 0.29) for the uniform Kernel function. The adjusted Kernel-based approach yields smaller p values for the cosine and uniform Kernel functions. The coefficients nevertheless remain insignificant, revealing no evidence of RPE in the full sample.

7.3.2 Weak tests of RPE (disclosure subsample)

Panel B of Table 14 documents the same regression procedure on the subsample of banks that explicitly disclose the use of peers in determining their level of CEO compensation. Under the Kernel-based peer group specification, external shocks are removed from the compensation contract, which is consistent with RPE. The peer coefficients do not differ much across the Kernel specifications. They are − 0.99 (p value = 0.03) for the normal Kernel function, − 0.81 (p value = 0.05) for the “cosine” Kernel function, and − 0.88 (p value = 0.06) for the uniform Kernel function. The coefficient on firm stock performance is positive, statistically significant, and ranges from 0.70 to 0.77. Panel B of Table 14 also reports the regression results with the adjusted Kernel-based peers (columns labeled “scaled”). All the Kernel-based peer coefficients remain negative and statistically significant, soundly rejecting the null hypothesis of no RPE. The coefficient of the normal Kernel peer group is − 0.82 (p value = 0.03), of the cosine Kernel peer group − 0.83 (p value = 0.01), and of the uniform Kernel peer group − 0.77 (p value = 0.04).

7.3.3 Strong-form tests of RPE (disclosure subsample)

We now use strong-form RPE tests in order to test the RPE hypothesis on the disclosing subsample. We use the Kernel-based method to construct a systematic performance variable and run the same regression model. Panel C of Table 14 reports the results. We document insignificant parameter estimates on systematic firm performance. The coefficient of the normal Kernel peer group is 0.04 (p value = 0.85), of the cosine Kernel peer group 0.08 (p value = 0.70), and of the uniform Kernel peer group 0.09 (p value = 0.65). Our results are robust to different specifications of the weights in the Kernel-based approach, as also reported in panel C of Table 14. The parameter estimate of the normal Kernel peer group is 0.07 (p value = 0.75), of the cosine Kernel peer group 0.01 (p value = 0.96), and of the uniform Kernel peer group 0.09 (p value = 0.66). The unsystematic firm performance is significant at the 1% level for every Kernel-based specification.

Table 17 Regressions estimating the sensitivity of CEO compensation to RPE

7.3.4 Associated factors of RPE

In this section, we estimate Eq. (5). For this purpose, as in the previous section, we use the alternative peer group definitions based on the Kernel approach. The results are presented in Table 15. The parameter estimates on firm size remain statistically significant and have the same sign. The firm size coefficient of the normal Kernel peer group is 2.68 (p value = 0.00), of the cosine Kernel peer group 2.70 (p value = 0.00), and of the uniform Kernel peer group 2.69 (p value = 0.00). The coefficient for growth options remains negative and in most cases statistically significant. The results are qualitatively similar when we use the adjusted Kernel-based approach. The coefficients are − 19.64 (p value = 0.09) for the normal Kernel peer group, − 19.28 (p value = 0.10) for the cosine Kernel peer group, and − 19.74 (p value = 0.09) for the uniform Kernel peer group.

7.4 Correlation coefficients of non-crisis years

Table 16 shows Pearson correlations obtained without the years 2007 and 2008.

7.5 Clustered standard errors

Here, we consider the same regression procedure (Eq. (1)) for the full sample of 46 banks but cluster standard errors by industry code. Table 17 reports the regression results when peers are based on industry and industry/size. The coefficient on industry peers is − 0.06 (p value = 0.87), and the coefficient on industry/size peers is − 0.31 (p value = 0.06). That is to say, we find qualitatively similar results to those presented in panel A of Table 4. In addition, in unreported results, we find that the results for the Kernel-based approaches are robust to the inclusion of clustered standard errors.