Introduction

In 2020, 174 hedge funds, managing US$315 billion in total, had already signed the principles for sustainable investment designed by the United Nations in 2006. As for Environmental, Social and Governance (ESG) exchange-traded funds, cash flows increased threefold in the first quarter of 2020 compared to the previous year, according to Bloomberg (2020). The growing interest in ESG criteria stems both from their increasing popularity among investors and from the desire to enhance financial performance. Indeed, ESG criteria are used defensively for a better allocation of resources in relation to profitability and risk (Bass et al. 2017) and provide a variety of useful information for factor models (Ross 2017). Even though investment practices that include ESG criteria are growing significantly, management science research has not reached a clear consensus on the link between sustainable performance and financial performance, despite a large number of articles published over the last decade. In fact, it has been argued in the literature that this relation can be positive, negative, or neutral (Huang et al. 2020; Revelli and Viviani 2013), insignificant (Surroca et al. 2010), U-shaped (Barnett and Salomon 2012), inverted U-shaped (Lankoski 2008), or asymmetric (Jayachandran et al. 2013). Meta-analyses have led to the conclusion that the relation appears to be weak, positive, and significant (Orlitzky, Schmidt and Rynes 2003; Revelli and Viviani 2013). Overall, this link is difficult to measure, as it depends greatly on the methodologies, databases, and time periods used and on how the CSR level is measured.

Thus, our research aims to contribute to a better understanding of the relation between the levels of Corporate Social Performance (CSP) and Corporate Financial Performance (CFP). To do so, we employ an innovative methodological approach through the use of explainable artificial intelligence (XAI; see Footnote 1). Moreover, the social responsibility of companies is in perpetual evolution, and investors' recent enthusiasm about responsible funds makes it necessary to analyze the CSP–CFP link over a recent period. Therefore, we propose a longitudinal analysis of S&P 500 companies between 2014 and 2019. This time period is of additional interest, as it falls within a bullish market. In times of financial crisis, the level of corporate sustainability has proven to be beneficial to financial performance (e.g., Ducassy 2013); however, the benefit of the CSR level during bullish periods is still to be demonstrated. Finally, we also propose using a database that is widely available to all investors and therefore likely to be extensively used. We address some of the shortcomings of previous research by including a CSR level measured relative to the sector under study, together with interaction variables such as the level of ESG disclosure (Fatemi et al. 2018) and governance (Broadstock et al. 2020; Deng, Kang and Low 2013). This ensures that CSR is not indirectly linked to other non-financial variables.

Our results make several contributions to the existing literature. First, we show that firms whose CSR level is far superior to that of their sector peers see a positive impact on their ROA level. Second, we find that CSR expenditures are not well valued by the markets, but for companies that are CSR pioneers, performance improves in the long term. Third, XAI uses tools, methods, and algorithms to explain black-box models and expose their behavior and underlying decision-making processes. We advocate the implementation of a humanly reasonable (i.e., explainable) type of artificial intelligence, resulting in XAI, and by extension, interpretable machine learning.

Literature review

Two conceptual approaches currently coexist in the literature about the relation between CSP and CFP. The first approach is based on the agency theory (Jensen and Meckling, 1976). CSR investment must be in line with the company's core business; otherwise, the return on investment will be negative and will generate an opportunity cost (Zhang et al. 2021). Executive and managerial involvement is critical for the conversion of CSR policies into future financial results because CSR activities are multidimensional and often pictured as a collection of uncoordinated investments (Hasan et al. 2018). However, managers can use CSR spending for their own benefit in order to improve their image at the cost of shareholders (Jiao 2010). Thus, investors must consider the motivations inherent in the implementation of CSR activities before investing in firms. The trade-off hypothesis (Preston and Post 1981) considers that increasing the level of CSR generates unnecessary additional costs and creates a competitive disadvantage compared with less sustainable firms, which negatively impacts profitability. Sustainable expenditures are detrimental to financial profitability, as they constitute a significant resource diversification cost for the firm (Cordeiro and Sarkis 1997; Lu et al. 2013). The integration of CSR criteria into portfolio strategies is not a standard practice, due to the risk of sacrificing profitability.

Nevertheless, through the prism of the stakeholder theory (Freeman and Philips 2002) and the instrumental view of CSR (McWilliams and Siegel 2001; Surroca et al. 2010), CSR can also lead to a positive anticipation of the CSP–CFP link. Indeed, taking into account the expectations of primary and secondary stakeholders will create moral capital or goodwill, which will improve the long-term performance of the firm. Stakeholders can act in ways that either help or hinder the firm in achieving its objectives, and their increased engagement can lead to more productive and financially viable investments (Benabou and Tirole 2010; Eccles, Ioannou and Serafeim 2014; Hasan et al. 2018).

However, extra-financial information, through CSR, is increasingly considered by asset managers for several reasons. CSR information would allow the optimization of the risk-return measures. The contribution of qualitative criteria would decrease the risk, when used defensively as determinants in resource allocation (Bass et al. 2017). According to Ross (2017) and Bender and Samanta (2017), the search for diversified information is desirable in factor models. CSR would limit information asymmetry (Bouslah et al. 2013) and extra-financial risk (Archambeault, Dezoort and Hermanson 2008). Callan and Thomas (2009) argue that portfolios comprising sustainable stocks outperform their benchmarks. These results would be driven by an improvement in reputation (Salama et al. 2011). Hasan et al. (2018) argue that social legitimacy and moral capital, derived from better CSR engagement, can stabilize and enhance a firm's competitiveness and mitigate adverse consequences of negative events. Finally, (Akisik and Gal 2014; Dhaliwal et al. 2021-10-28) show that issuance of CSR reports is used as a proxy result to reduce forecast errors by financial analysts. Our study takes place during a bullish market, which should limit the effectiveness of the sustainable approach. Indeed, in a bullish market, we assume that the CSP–CFP link will be significant, positive, and very weak.

H1

The link between CSP and CFP is expected to be significant and marginally positive.

Moreover, financial performance is not a unidimensional construct, and accounting and market measures capture its distinct dimensions (Gentry and Shen 2010). Accounting-based measures are generally conceptualized as reflections of past, short-term financial performance (Hoskisson et al. 1994). In our study, we use ROA as our accounting-based measure (Stanwick and Stanwick 2000; Tebini et al. 2016). Market-based measures are conceptualized as reflections of future, long-term financial performance (Hoskisson et al. 1994). We use Tobin's Q as our measure of market performance.

H1a

The link between CSP and financial accounting-based measures is expected to be significant and marginally positive.

H1b

The link between CSP and financial market-based measures is expected to be significant and marginally positive.

A common issue is the search for the optimal level and combination of CSR activities in order to maximize their impact on performance (Hasan et al. 2018). There must be sub-optimal levels of CSR investment. Follower companies in CSR investment may be misperceived or may be greenwashing to keep up with sustainability trends. We therefore believe that there are minimum and/or maximum CSR thresholds to optimize the positive impact of CSR expenditures. We thus consider that the link between the CSR level and financial performance is not linear and that thresholds for the CSR level exist and modify the relations of CSR with the other financial variables. The CSP–CFP link is highly complex and multidimensional. Thus, we analyze the possible threshold effects on the relations between the CSR level and the control variables.

H2a

There are threshold effects under which the CSP–CFP link becomes positive.

H2b

There are some threshold effects under which financial variables are likely to improve the level of sustainable performance.

Data and methodology

Data and variables

All financial and extra-financial data are taken from the Bloomberg database (see Footnote 2), which is widely used by professional investors. We focus on the analysis of a bullish market from 2014 to 2019, between the subprime crisis and the COVID-19 crisis. This period of economic stability allows us to obtain information about the impact of CSR over a period that is not assumed to be fully favorable to its effects. To differentiate among firms by sector of activity, we use the Global Industry Classification Standard (GICS) code classification. To estimate the CSR level, we use the ESG score available on Bloomberg. This score is one of the ESG scores that Bloomberg provides from third-party rating agencies. We believe that this score has two essential qualities. First, it is widely available to professional investors. This wide distribution allows great replicability of our method. Second, this ESG score considers the impact of the sector. The score is provided by Sustainalytics. It looks at key ESG issues and indicators. Key ESG issues are split into three themes: environmental, social, and governance. The set of issues analyzed varies by industry, and a specific weight is placed on each ESG issue depending on the industry. Sustainalytics covers at least 70 indicators in each industry. ESG indicators are split into three dimensions: preparedness (assessment of the management systems and policies in place to help manage ESG risks); disclosure (whether company reporting meets international best practice standards and is transparent in relation to ESG issues); and performance, both quantitative and qualitative (ESG performance based on quantitative metrics and an assessment based on a review of controversial incidents in which the company may have been involved). Before publication of an ESG Rating Report, a draft report is sent to the company researched. The aim is to gather feedback and additional or updated information from the company. Reports are published annually.
Sustainalytics is a well-known score provider and has formed strategic relationships with Columbia Threadneedle, Norwegian Government Pension Fund, BNY Mellon, and City of London Investment Management (CLIM), who integrate Sustainalytics’ ESG research into their investment process. Sustainalytics ESG ratings are available on three third-party systems: Bloomberg, Factset, and IHS Markit. Recently, a fundamental difference has emerged between ESG ratings and ESG scorings and opinions. ESG ratings measure a company’s exposure to ESG risks: higher ratings indicate a less relevant exposure to such risks and a better ability to manage them. ESG scores measure a company’s ESG attitude, offering a valuation of how virtuous companies have been and currently are in managing ESG factors. Our research uses the latter because of the quality of the historical data. The ESG risk rating has only recently become available, and data are missing before 2019. To consider the different factors influencing a firm's financial performance and CSR level, we use several classical control variables (see Appendix A for details). We also add the ESG disclosure score (see Footnote 3) and the governance score (see Footnote 4), both retrieved from the Bloomberg database.

Methodology - explainable artificial intelligence (XAI)

Prior studies have demonstrated that linear regression is less accurate in making predictions than advanced techniques (Risse 2019) and suffers from a variety of statistical constraints, including endogeneity and multicollinearity. Over the past several years, the field of XAI has witnessed considerable development (Vilone and Longo 2021). A significant contributing factor has been the widespread use of machine learning, especially eXtreme Gradient Boosting (XGBoost) by Chen et al. (2015), which has resulted in the creation of extremely accurate models that are difficult to both explain and understand. XAI is a branch of artificial intelligence that uses tools, methods, and algorithms to give explanations for black-box models in order to expose the behavior and underlying decision-making processes of the models. The goal of XAI is to assist end users and domain specialists in understanding how black-box models generate predictions (Alicioglu and Sun 2021). A large number of XAI techniques have recently been developed to describe the inner workings of black-box models and their choices. To support the understanding of XAI methods, we present a visual explanation of the most popular one, namely SHapley Additive exPlanations (SHAP) by Lundberg and Lee (2017). SHAP draws on game theory and treats features as team members in a game. It estimates each feature's relative contribution to the individual prediction, which corresponds to each team member's contribution to the game's victory. The treeSHAP interaction values may be calculated as follows, according to Lundberg and Lee (2017):

$$\omega_{i,j} = \sum_{H \subseteq N \setminus \{i,j\}} \frac{|H|!\,(Z - |H| - 2)!}{2(Z - 1)!}\,\varphi_{ij}(H)$$

when i ≠ j, where \(\varphi_{ij}(H) = f_{x}(H \cup \{i,j\}) - f_{x}(H \cup \{i\}) - f_{x}(H \cup \{j\}) + f_{x}(H)\), N is the set of all features, Z is the number of features, and H is a subset of features that contains neither i nor j. By incorporating feature importance, feature dependence plots, local explanations, and summary plots, SHAP values help us comprehend tree models better (Tables 1 and 2).
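To make the interaction formula above concrete, the following minimal sketch evaluates it by brute force for a toy value function. It is an illustration only and not the treeSHAP algorithm, which obtains the same quantity efficiently from tree structures. The function `toy_model`, the feature names, and all numeric effects are hypothetical and are not taken from our data.

```python
from itertools import combinations
from math import factorial


def interaction_value(f, features, i, j):
    """Brute-force SHAP interaction value omega_{i,j} for a set function f.

    f maps a frozenset of "known" features to an output and stands in for
    f_x(H), the expected prediction when only the features in H are known.
    """
    others = [k for k in features if k not in (i, j)]
    z = len(features)  # Z in the formula: total number of features
    total = 0.0
    for size in range(len(others) + 1):
        # The weight |H|! (Z - |H| - 2)! / (2 (Z - 1)!) depends only on |H|.
        weight = factorial(size) * factorial(z - size - 2) / (2 * factorial(z - 1))
        for subset in combinations(others, size):
            h = frozenset(subset)
            phi_ij = f(h | {i, j}) - f(h | {i}) - f(h | {j}) + f(h)
            total += weight * phi_ij
    return total


def toy_model(present):
    """Hypothetical value function: additive effects plus one synergy term."""
    value = 0.0
    if "csr" in present:
        value += 1.0
    if "governance" in present:
        value += 2.0
    if "size" in present:
        value += 0.5
    if {"csr", "governance"} <= present:
        value += 4.0  # interaction between CSR and governance
    return value


features = ["csr", "governance", "size"]
omega_csr_gov = interaction_value(toy_model, features, "csr", "governance")
omega_csr_size = interaction_value(toy_model, features, "csr", "size")
print(omega_csr_gov, omega_csr_size)  # 2.0 0.0
```

In this toy setting, the synergy of 4.0 is split equally between the pairs (i, j) and (j, i), so each interaction value equals 2.0, while a pair with purely additive effects receives an interaction value of zero.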

Table 1 Descriptive statistics
Table 2 Correlation matrix

Results and discussion

According to Figure 1b, all variables in our model have a marginal impact on the performance measures. Figure 2b shows the impact of each variable on the financial performance level of each firm. For example, high levels of cash flow have a positive impact on financial performance for both ROA and Tobin's Q, as indicated by the red points on the right side of the figures (Figure 3).

Fig. 1

The model’s interpretation. a: The importance ranking of the variables according to the mean (|SHAP value|); b: Order of highest influence according to SHAP value. Red and blue dots indicate when each factor is high or low, respectively, to determine the direction of influence on ROA output.

Fig. 2

The model’s interpretation. a: The importance ranking of the variables according to the mean (|SHAP value|); b: Order of highest influence according to SHAP value. Red and blue dots indicate when each factor is high or low, respectively, to determine the direction of influence on Tobin’s Q output.

Fig. 3

The model’s interpretation. a: The importance ranking of the variables according to the mean (|SHAP value|); b: Order of highest influence according to SHAP value. Red and blue dots indicate when each factor is high or low, respectively, to determine the direction of influence on CSR output.
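The two panels described in these captions can be read as two simple summaries of a matrix of SHAP values. The sketch below, given under stated assumptions, computes the panel a ranking as the mean absolute SHAP value per variable and approximates the panel b reading (whether high values of a variable sit on the positive side) by the sign of the covariance between the variable and its SHAP values. The covariance sign is our simplification for illustration; all names and numbers are hypothetical.

```python
def mean_abs_importance(shap_rows, names):
    """Rank variables by mean(|SHAP value|), as in a bar-style summary plot."""
    n = len(shap_rows)
    scores = {
        name: sum(abs(row[k]) for row in shap_rows) / n
        for k, name in enumerate(names)
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def direction(feature_values, shap_values):
    """Sign of the covariance between a variable and its SHAP values.

    +1 suggests that high values (red dots) tend to lie on the positive side,
    -1 that they tend to lie on the negative side, 0 that no trend is visible.
    """
    n = len(feature_values)
    mean_x = sum(feature_values) / n
    mean_s = sum(shap_values) / n
    cov = sum((x - mean_x) * (s - mean_s)
              for x, s in zip(feature_values, shap_values))
    return (cov > 0) - (cov < 0)


# Hypothetical SHAP values for four firms (rows) and three variables (columns).
names = ["cash_flow", "csr", "rd"]
shap_rows = [
    [0.30, -0.02, 0.05],
    [-0.20, 0.01, -0.04],
    [0.40, -0.03, 0.06],
    [-0.10, 0.02, -0.01],
]
cash_flow = [0.15, 0.02, 0.20, 0.05]  # hypothetical cash flow over assets
csr = [80, 30, 90, 40]                # hypothetical CSR scores

ranking = mean_abs_importance(shap_rows, names)
cash_flow_direction = direction(cash_flow, [row[0] for row in shap_rows])
csr_direction = direction(csr, [row[1] for row in shap_rows])
print([name for name, _ in ranking], cash_flow_direction, csr_direction)
```

With these illustrative numbers, cash flow ranks first and points in the positive direction, while CSR ranks last and points in the negative direction.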

CSR and financial performance

Our first hypothesis questions the relation between CSR and financial performance. However, our methodology shows a marginal impact of the levels of CSR and ESG disclosure (see Figure 1a). As shown in Figure 1a, this marginal impact of CSR on the ROA level seems to be almost as important as the R&D level, even if the impact of extra-financial variables remains very limited. Figure 4a–b show the marginal impact of CSR on financial performance. Regarding Tobin's Q, Figure 4b shows a larger number of points for which a CSR level lower than 40 leads to better financial performance. For several companies with a CSR level between 40 and 100, there is a negative relation between the CSR level and Tobin's Q level. Regarding the ROA level, in Figure 4a, we also notice a link that seems to favor firms with a low CSR level. Nevertheless, for all companies with a CSR level above 95, the impact of the level of responsibility on the ROA is strongly positive. Thus, firms have to be the best in their sector in terms of CSR to benefit from the positive effects on financial accounting performance.

Fig. 4

SHAP dependence plots. a Effect of CSR on ROA output; b Effect of CSR on Tobin’s Q.

Moreover, for companies that are the best in their sector, with a CSR level above 95, the CSR level has a strong positive impact on accounting performance (ROA). These companies may have developed management processes and comparative advantages that favor their performance, which is in line with the instrumental theory of CSR. Our results therefore suggest that a threshold must be crossed for CSR expenditures to improve financial performance through ROA. Thus, we partially reject Hypothesis 1a (H1a): in general, the CSR level does not improve financial accounting performance, but for the most sustainable companies, the impact can be very positive. However, we must reject Hypothesis 1b (H1b) because we find that CSR has a negative impact on financial market performance in a bullish market, even if this impact remains weak.

Our results highlight trends and threshold effects between the CSR level and financial variables. Figure 3a shows the impact of the variables on CSR. ESG disclosure is the variable with the highest impact on the CSR level. This strong and positive link is not surprising, as these two variables are by nature interrelated. The level of ESG disclosure is a precondition for the collection of ESG data by the scoring agencies. This first result reinforces the relevance of the Bloomberg Sustainalytics ESG score in terms of quality and the limited divergence that can exist between ESG scorings (Berg et al. 2019). This result is consistent with Fatemi et al. (2018) on the importance of ESG disclosure in the link between ESG performance and financial performance. The impacts of the other variables remain fairly homogeneous, although it seems that R&D, governance, and size have the greatest impacts. Analyzing Figure 3b, we notice that R&D is generally positively related to CSR. Additionally, a good level of governance (see Footnote 5) seems to be positive for the level of responsibility of the company. These results are in line with those reported by Broadstock et al. (2020). Size also seems to be positively related to CSR, which can be explained by the additional constraints and pressures that the largest and most exposed companies may face from their numerous stakeholders, such as non-governmental organizations, states, and customers (Quéré, Nouyrigat and Baker 2018).

Figure 5 illustrates the interaction between the CSR level and the financial variables. The graphs in Figure 5 show some interesting threshold levels that determine the impact of the variables on the CSR level. Figure 5a shows that above an ESG disclosure level of 45, the impact of disclosing CSR information on the effective CSR level is positive. Companies that disclose more than the average level will therefore have a higher CSR level. Regarding Figure 5b, for a high level of governance (i.e., a score below 4), the impact on CSR is very high. Conversely, a low level of governance (a score above 7) has a negative impact on CSR. The level of governance is essential in the constitution of the CSR level.
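The thresholds discussed above are read visually from the SHAP dependence plots. As a rough sketch of how such a reading could be approximated numerically, the example below bins a variable, averages the SHAP values within each bin, and reports the bin edge at which the average changes sign. The binning rule, the bin width, and the data are assumptions made for illustration; they are not the procedure or the values used in our analysis.

```python
def sign_change_thresholds(feature_values, shap_values, bin_width):
    """Return the bin edges at which the binned mean SHAP value changes sign."""
    bins = {}
    for x, s in zip(feature_values, shap_values):
        key = int(x // bin_width)
        bins.setdefault(key, []).append(s)
    # Mean SHAP value per bin, ordered by the lower edge of each bin.
    means = [(key * bin_width, sum(v) / len(v)) for key, v in sorted(bins.items())]
    thresholds = []
    for (_, left_mean), (right_edge, right_mean) in zip(means, means[1:]):
        if left_mean * right_mean < 0:
            # Lower edge of the first bin on the other side of zero.
            thresholds.append(right_edge)
    return thresholds


# Hypothetical ESG disclosure scores and their SHAP values for the CSR output.
disclosure = [20, 28, 33, 38, 41, 44, 47, 52, 58, 63]
shap_csr = [-0.8, -0.6, -0.5, -0.3, -0.2, -0.1, 0.2, 0.4, 0.6, 0.9]

thresholds = sign_change_thresholds(disclosure, shap_csr, bin_width=5)
print(thresholds)  # [45]
```

The result depends on the chosen bin width, so a threshold obtained this way is best treated as a complement to, not a replacement for, visual inspection of the dependence plot.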

Fig. 5

SHAP dependence plots of all the features for CSR output.

Figures 5c to 5i show the relations between traditional financial variables and CSR. As shown in Figure 5c, an R&D expenditure level between 10 and 20% of assets seems to have a positive impact on CSR. However, above 20%, the impact is no longer positive. As shown in Figures 5d and 5e, companies characterized by a strong market capitalization or a high level of sales tend to have a higher CSR level. However, for a sales level higher than 3.5% of assets, the relation with CSR is negative. We assume that an excessively high level of sales may lead to larger negative externalities. For the level of cash flow of the previous year, Figure 5f shows an inverted U-shaped relation, suggesting an upward trend. Indeed, it seems that the higher the company's cash flow in the previous year, the more the CSR level will be positively affected. This is in line with Leyva-de la Hiz, Ferron-Vilchez and Aragon-Correa (2019), who have commented that a moderate level of slack resources reinforces the positive relation between CSP and CFP. We note that losses representing 10% of the company's assets in N-1 have a negative impact on the CSR level. We understand here that the company cannot afford CSR expenditures. Nevertheless, Figure 5f shows that for cash flow levels above 20% of assets, the impact seems almost negative. We assume that beyond a certain level, despite a significant amount of resources, there is no additional incentive to make CSR expenditures. The resources are then not wasted. The dividend level, as shown in Figure 5i, is positively related to the CSR level; a positive linear trend is observed. At the very least, high dividend levels compared with the net results of the previous year have a positive impact on the CSR level. In contrast, a dividend payout when net income is negative in N-1 has a negative impact on the CSR level.
We can consider that in the first case, sustainable companies will also create value, in line with the virtuous circle approach (Waddock and Graves, 1997). In the second case, issuing dividends when a company ran a deficit the previous year will starve it of the resources needed to implement a CSR approach. This distribution has a negative impact on the firm's CSR level.

CSR and risk level

Concerning the risk level, the link between the beta and the CSR levels is characterized by an inverted U-shaped form, as shown in Figure 5g. A beta level close to 1 seems generally positively related to the CSR level. However, for a certain number of companies whose beta level is below 0.5, a positive impact on the CSR level appears. This is partly in line with the expectation that sustainable companies are less risky and can therefore be added to investment portfolios as a defensive strategy (Viviani et al. 2019). Nevertheless, it is important to choose these companies carefully because this relation is not systematic. As shown in Figure 5h, the weighted average cost of capital (WACC) exhibits a positive linear trend as well. Thus, it appears that the higher the WACC level, the higher the CSR level. However, beyond a WACC of 11%, the impact on the CSR level is negative. This may also reflect companies that are potentially very risky and therefore have a higher return on capital. We also notice that for a WACC level between 2.5 and 5%, a significant number of companies break away from the positive linear trend. This may reflect the benefits of the CSR approach, which lead investors to revise their profitability requirements because of the social utility of the company. Alternatively, it could simply be that sustainable companies are less risky and therefore less profitable. In this case, a higher CSR level would translate into a lower refinancing cost. This is in line with the results of previous studies (Goss and Roberts 2011; Oikonomou, Brooks and Pavelin 2014).

Conclusion

Our study has been conducted with two aims. The first is to examine the impact of CSR on financial performance in a bullish market. The second is to investigate the link between the CSR level and the different financial variables for the purpose of revealing threshold effects. Our study highlights several interesting results. The use of a new methodology through machine learning, XAI, allows us to better understand the CSP–CFP link. We have shown threshold effects that modify the relation between the variables. Indeed, we have highlighted that for a CSR level much higher than the average in the sector to which a company belongs, the ROA level is positively and strongly affected by CSR. Thus, a firm needs to be the best in its sector in order to transform sustainability costs into value for the company. The instrumental approach to CSR would therefore only work at high levels, and companies in the average range would not be rewarded in times of economic growth.

Finally, we have analyzed the determinants of the CSR level. We have shown that R&D, governance, ESG disclosure, and market capitalization are key variables in the composition of CSR. These variables have a positive impact on the CSR level of companies. Nevertheless, we note other interesting elements that support the interrelation of financial variables with the CSR level. The link between the risk (through beta) and CSR takes an inverted U-shaped form. However, for a small beta level (below 0.5), the link with the CSR level is positive. The WACC level shows an increasing linear relation with the CSR level, but below 5%, the relation can be highly positive and therefore supports the theory of a lower refinancing cost for sustainable companies. In comparison, a level above 11% is highly detrimental to sustainable performance.

Thus, our research develops knowledge about the impact of CSR in three ways. First, we have highlighted that in a bullish market, the CSR level negatively affects the financial performance level. Second, we have shown that during this period, only the most sustainable companies in their sector manage to create value through ROA but not through Tobin's Q. Therefore, in a bullish market, asset managers cannot hope to improve the financial performance of their portfolios through the criterion of the level of sustainability. Nevertheless, the negative relation remains very weak, and a defensive asset allocation against the resurgence of crisis will not too strongly affect the profitability of assets. It is therefore always interesting to take the CSR level into account in asset management. Lastly, it is important to keep in mind the existence of threshold effects on the relations between the variables and the CSR level.

However, there are several limitations to our work that future research could investigate. First, it would be interesting to study the threshold effects over bearish periods to determine whether these relationships diverge as a result of the economic context. Second, there are two limitations to the use of ESG scoring. The first one is related to the divergence between the several ESG scores in terms of scope, weight and measurement (Berg et al. 2019). These divergences could slightly modify our conclusion. The second is related to the recent evolution of ESG scoring, with the new ESG ratings focusing more specifically on risk exposure. The lack of current historical data does not allow us to carry out this study for the moment. We could expect that the same effects would still be observed but inverted.