1 Introduction

Operational research on bank efficiency has developed rapidly. Many advanced parametric and non-parametric methods have been proposed and used to estimate efficiency in the banking industry. These methods include, but are not limited to: Bayesian stochastic frontier analysis; network Data Envelopment Analysis (DEA); quantile regression; hyperbolic measurement; fuzzy super-efficiency; hybrid minimax reference point-DEA; conditional directional distance approach; satisficing DEA approach; and parallel production frontier approach, among others. In general, Bayesian stochastic frontier analysis and quantile regression analysis belong to the parametric methods, while the rest of the above-mentioned methods are examples of non-parametric methods. Bayesian stochastic frontier analysis specifies a functional form in estimating efficiency; therefore, its main disadvantage is that the efficiency results are sensitive to the functional form chosen. Quantile regression analysis estimates the production process for benchmark banks located at the top conditional quantiles; however, it suffers from a similar disadvantage. The hyperbolic measurement is better able to model undesirable outputs, and the fuzzy super-efficiency model has the advantage of providing a better ranking among banks. The conditional directional distance approach can estimate bank efficiency while accounting for the time effect. The hybrid minimax reference point-DEA can measure efficiency from the perspective that each bank branch is unique and that different branches can have different sizes and targets for specific market segments. The satisficing DEA approach can incorporate risk factors as inputs in bank efficiency analysis, while the parallel frontiers provide consistent measures of technical changes for all Decision-Making Units (DMUs) at different periods.
The Malmquist Productivity Indices (MPIs) and technical changes obtained in this way have the property of circularity. Finally, the network DEA model has the advantage of being able to open the black box and better identify the sources of inefficiency in the production process. It can consider different aspects of the banking production process carefully and comprehensively by incorporating carryover variables in the design of the production process (Zhu, 2022), which is also one of the advantages of the non-parametric methods over the parametric ones. In addition to the estimation of bank efficiency, DEA has been used to evaluate merger gains in the banking industry (Amin & Boamah, 2020). Beyond the banking sector, DEA has been proposed and applied to different economic sectors and different types of DMUs from different perspectives. Examples include a hybrid DEA-machine learning approach for predicting the performance of micro, small and medium enterprises (Boubaker et al., 2023); the estimation of the managerial ability of listed firms (Dalwali et al., 2023); the development of DEA in supplier selection (Dutta et al., 2022); and the evaluation of healthcare system efficiency under DEA and compromise programming DEA (Habib & Shawan, 2020; Lozano et al., 2020; Mourad et al., 2021; Rouyendegh et al., 2019). For more detailed studies regarding DEA and its applications in operations and data analytics, please refer to Chen et al. (2019).

Few attempts have been made to consider the "carryover" characteristics of the banking production process using DEA. The DEA studies that incorporate carryover variables in the production process fall into two groups: one treats assets as a variable with carryover characteristics, and the other treats "credit risk" as the carryover variable. As can be seen in Sect. 2, the theoretical literature review, previous studies have used two credit risk indicators as carryover variables: non-performing loans and loan loss reserves. Both indicators have been treated as carryover variables with an undesirable characteristic. This is mainly because non-performing loans are loans that banks have granted but that borrowers fail to repay when they fall due. Therefore, non-performing loans are a type of bank risk that banks try to minimize. Loan loss reserves are closely linked to non-performing loans in that larger volumes of loan loss reserves are set aside when the volume of non-performing loans is expected to increase. Thus, banks pursue lower levels of loan loss reserves in their operations.

Our study considers credit risk in a different way. Although non-performing loans are an indicator of instability, banks take different measures to absorb this negative shock. A certain amount of loan loss provisions is set aside by the banks every year, and loan loss reserves are the accumulation of loan loss provisions over several years (Monokroussos et al., 2017). Therefore, from this perspective, setting aside higher volumes of loan loss reserves can be understood as a way to improve bank stability. The relationship between loan loss reserves and the risk of bank failure is further confirmed by Bushman and Williams (2012) as well as Ng and Roychowdhury (2014).

The desirability of loan loss reserves is reflected not only in their impact on credit risk; they are also found to have a positive impact on bank earnings (Beaver & Engel, 1996). Our proposal to treat loan loss reserves as having a desirable characteristic is therefore supported by the literature, and we build on that literature by incorporating them into the production process in a dynamic setting.

As an important component in the banking production process, the role played by deposits has been a debatable issue among academic researchers. Berger and Humphrey (1992) provided a systematic and comprehensive discussion related to the role of deposits from different perspectives. The asset approach in defining bank outputs argues that bank deposits and other liabilities should be treated as bank inputs due to the consideration that they are the raw materials in generating loans and other banking assets. In comparison, the user cost approach argues that the treatment of deposits as inputs or outputs depends on their net contribution to bank revenue. If the financial cost of deposits is higher than the opportunity cost, they will be considered inputs. If the opportunity cost is higher, deposits should be treated as outputs.

Thirdly, the value-added approach argues that all bank liabilities and assets have some output characteristics, and there is no clear boundary in defining the difference between inputs and outputs. Finally, explicit revenue in banking argues that banks accumulate implicit revenues because of the payment of below-market interest rates. This means that substantial service outputs are generated by deposits. However, because the explicit revenue generated by deposits is very small, there is still a high level of controversy as to whether to treat deposits as inputs or outputs.
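As a minimal illustration of the user cost rule described earlier in this discussion, the Python sketch below classifies deposits from two cost figures. The function name, the argument names and the label returned for ties are assumptions made for this illustration; the text does not say how the two costs are measured or what happens when they are equal.

```python
def classify_deposits(financial_cost: float, opportunity_cost: float) -> str:
    """Classify deposits under the user cost rule as summarised in the text.

    Both arguments are hypothetical cost rates expressed in the same unit.
    """
    if financial_cost > opportunity_cost:
        # Financial cost exceeds opportunity cost: deposits count as inputs.
        return "input"
    if opportunity_cost > financial_cost:
        # Opportunity cost exceeds financial cost: deposits count as outputs.
        return "output"
    # Assumption: ties are not covered by the text, so they are only flagged.
    return "undetermined"
```

The same deposit base can therefore switch classification from one period to the next if the relative size of the two costs changes, which is one reason the treatment of deposits remains contested.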

Although several rounds of banking reforms in China aimed to improve competitive conditions and bank stability (Tan, 2016), the non-performing loan ratios in the Chinese banking industry consistently increased from 2013 to 2019, with the highest ratio of 1.9%. In comparison, other countries, including those in Asia and Europe, had much lower non-performing loan ratios. For example, Singapore had non-performing loan ratios of no more than 1.4% over the same period. The United Kingdom experienced a significant drop in the non-performing loan ratio from 2013 to 2015, and afterwards, although the ratio suffered from some volatility, it remained no more than 1.2%. Finally, South Korea consistently kept its non-performing loan ratios no higher than 0.6%. These statistics and relevant comparisons show that the Chinese banking industry has higher levels of instability. As Tan and Floros (2013) argue, improving bank efficiency will improve bank stability. Therefore, investigating efficiency in the Chinese banking context is essential.

Over the past few years, several studies have examined efficiency in the Chinese banking context (2022b; Antunes et al., 2022; Fukuyama & Tan, 2021a, 2021b; Fukuyama et al., 2022a; Tan & Tsionas, 2022; Tan et al., 2021), and relevant methodological contributions have been made. However, little attempt has yet been made to clearly and explicitly discuss the concept and importance of loan loss reserves in the banking production process, particularly the dynamic nature of this variable, and no effort has yet been made to correctly incorporate this essential factor in the modeling framework. In addition, empirical banking studies place too much emphasis on the bank-specific determinants of efficiency, with no consideration given to the interaction between the banking industry and the operational performance of various economic sectors. Considering the nature of banking services, which cover loan services to various industries, examining the impact of non-performing loans derived from different economic sectors on bank performance will not only contribute significantly to the banking literature but will also provide more tailored policy implications in terms of credit allocation to different economic sectors.

The current study contributes to the literature as follows: (1) Loan loss reserves, rather than being regarded as an undesirable input or a carryover output variable, are treated as a "desirable input" with carryover characteristics because they provide stability to bank operations. We propose that loan loss reserves are produced at the first stage along with the intermediate products of loans and securities investment. Loan loss reserves affect the production at the second stage of the production process in the next period, not the current one. (2) In addition to considering the role played by loan loss reserves under a network DEA model, we engage in a second-phase analysis examining the influence of industry-level risk on efficiency in the Chinese banking sector. No previous studies in banking have addressed this issue. We justify our method by providing a causal interpretation within a dynamic network framework.

Establishing causal relationships without experimental data is known to be difficult (Pearl, 2009; Peters et al., 2013, 2017; Pfister et al., 2019). For causal models, it is well known that changes in the environment or active interventions in the covariates should not affect their prediction properties, since the effect of confounding variables has been accounted for. Without a correctly established causal relationship, the results associated with technical efficiency would be spurious and misleading. Our paper proposes a new technique that facilitates formal statistical testing of the causal interpretation of a given non-parametric model within a dynamic framework. The results of this causal modelling show that our proposed framework, which considers the carryover characteristics of loan loss reserves in the banking production process, is valid. This indicates that our theoretical proposal regarding the dynamic role played by loan loss reserves as desirable inputs is supported. In addition, the causal interpretation of our second-stage regression analysis is also accepted, which shows that our modelling, as well as the estimated results, are reasonable, reliable, and robust. Specifically, the fact that no study has empirically investigated the impact of risk derived from different economic sectors on bank efficiency does not by itself justify examining this issue without proper validation. In addition to our theoretical arguments, our causal modelling framework supports our proposal to investigate this topic. In summary, beyond serving as a robustness check and validation, our causal modelling techniques allow our study to provide a pioneering example of combining operational research methods and statistical modelling to address banking issues from the perspectives of both efficiency analysis and empirical banking. In future efficiency estimation studies, more effort should be devoted to validating the proposed modelling framework of the production process. For empirical banking analysis, we recommend incorporating our statistical causal modelling framework, in addition to relevant econometric techniques, for validation.

We set out the remainder of the current paper in the following way: Sect. 2 reviews the literature. Section 3 presents our research design. The empirical results and discussion are provided in Sect. 4. Finally, the concluding remarks are given in Sect. 5.

2 Literature review

2.1 Contextual background

A country's economy is composed of the public and private sectors. The public sector enjoys an advantage over the private sector in terms of receiving stronger government support, especially in investment and financial needs. In contrast, a lack of funding is one of the main obstacles faced by the private sector, which hinders its investment activities. The private sector is characterized by a large number of small and medium-sized enterprises in different economic sectors. According to Tan (2017), these enterprises provided 70% of employment and contributed 60% of China’s GDP in 2012, but only received 30% of bank loans. Based on data from the World Bank Open Data between 2013 and 2020, the volume of domestic credits provided to the private sector by banks in China has generally increased, with the volume of credits provided in 2020 being 1.82 times the country's GDP.

The lack of funding provided by banks to the private sector is mainly due to the small and medium-sized enterprises' lack of capital and the uncertainty in their operations. This presents a level of risk when granting credits to these enterprises, resulting in non-performing loans that have a significant impact not only on banks but also on the operation and development of other economic sectors. The risk derived from the volume of non-performing loans is a historical issue in the Chinese banking industry, which the Chinese government has addressed through various measures, including capital injection, write-off of non-performing loans, establishment of four asset management companies, and the China Banking Regulatory Commission (Tan, 2016). However, academic scholars have not investigated the source of non-performing loans, particularly the distribution of non-performing loans across different economic sectors and their impact on banks and the economy.

Apart from the issue of non-performing loans, the Chinese government has implemented relevant measures to enhance bank performance and competitiveness. The current Chinese banking industry structure includes banks with different ownership types, among which state-owned commercial banks, joint-stock commercial banks, and city commercial banks are the three largest ownership groups in terms of total assets. Competition among banks of the same ownership type is expected to have a positive impact on performance. The Chinese government has also introduced foreign strategic investors and allowed initial public offerings of domestic Chinese banks, among other measures (Tan, 2014).

2.2 Theoretical literature review

Originally, the static DEA model was proposed by Ferrier (1994) to investigate and compare the efficiency levels between proprietary and cooperative firms. This work can be regarded as one example of how DEA can be applied specifically to estimate the efficiency level of firms with different ownership types and/or characteristics. Shephard and Färe (1980) developed dynamic production theory by presenting an axiomatic analysis that allows for the inventories of standard inputs, intermediate products, and the final outputs. Following this study, Hackman (1990) extended this analysis and defined a dynamic production system as a network of production activities that produce the final outputs.
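To make the idea of a static efficiency score concrete, the following Python sketch covers only the simplest special case: one input, one output and constant returns to scale, where the score of each unit equals its output-to-input ratio divided by the best ratio observed in the sample. This is an illustrative simplification, not the model of Ferrier (1994) or of any other study cited here; the function name and the figures are hypothetical.

```python
from typing import List, Sequence


def crs_efficiency(inputs: Sequence[float], outputs: Sequence[float]) -> List[float]:
    """Efficiency scores for the one-input, one-output, constant-returns case.

    Each unit is compared with the best observed output-to-input ratio,
    so the best-practice unit scores 1.0 and all others score below 1.0.
    """
    if len(inputs) != len(outputs) or len(inputs) == 0:
        raise ValueError("inputs and outputs must be non-empty and of equal length")
    if any(x <= 0 for x in inputs):
        raise ValueError("all inputs must be strictly positive")
    ratios = [y / x for x, y in zip(inputs, outputs)]
    best = max(ratios)
    if best <= 0:
        raise ValueError("at least one unit must have a positive output")
    return [r / best for r in ratios]


# Hypothetical data for three units: (input, output) pairs.
scores = crs_efficiency([10.0, 20.0, 40.0], [5.0, 20.0, 20.0])
```

With several inputs and outputs the ratio comparison no longer suffices, and each score is obtained by solving a linear programme per unit, which is the setting of the network and dynamic models reviewed below.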

Building on Hackman’s analysis, Färe and Grosskopf (1996a, 1996b) presented a dynamic DEA framework that explicitly connects a sequence of single-period technologies and allows the producers to either produce the final outputs or carry over some outputs to adjust production in the subsequent period. In the framework of Färe and Grosskopf (1996a, 1996b), network DEA can be thought of as a formulation of intertemporal production in which each time-specific activity represents the same production unit in a different time period. Their framework can be regarded as the foundation of dynamic network DEA since some final outputs or intermediate products in period t are carried over to period t + 1 as inventories.

One study related to dynamic DEA other than that of Färe and Grosskopf (1996a, 1996b) is Sengupta (1995), who developed the adjustment cost-based analysis by seeking to decide the optimal levels of inputs over a time horizon and estimate the optimal inputs to calculate the level of overall efficiency. Another study is the cost function-based work of Nemoto and Goto (2003), who provided optimal control-theoretic models for examining the dynamic DEA efficiency of public entities. Tone and Tsutsui (2014) proposed a slacks-based dynamic network DEA approach that allows carryovers and classifies various kinds of production variables.

Applying the theoretical approach of Färe and Grosskopf (1996a, 1996b) to bank efficiency measurement, Fukuyama and Weber (2015) allow a bank to decrease or increase the current production of loans and securities investments by controlling the amount of carryovers in a future period. More recently, using the trade-off relationship (Eq. 4), Fukuyama et al. (2022a, 2022b) examined Chinese bank efficiency with a sequential structure and a behavioral causal analysis.

While we acknowledge Tone and Tsutsui (2014) as a contribution to DEA, their approach differs from that of Färe and Grosskopf (1996a, 1996b) in the sense that they do not have a constraint related to Eq. (2). Therefore, our dynamic network model is directly in line with Färe and Grosskopf (1996a, 1996b). Related to dynamic DEA, there are two useful survey papers. One is Fallah-Fini et al. (2014), who identified five primary characteristics: (i) production delays; (ii) inventories; (iii) quasi-fixity of physical capital leading to embodied technical change; (iv) adjustment costs; and (v) learning models dealing with disembodied technical change. The other is Mariz et al. (2018), which focused on intermediate and carryover variables in dynamic DEA. According to Mariz et al. (2018), the intertemporal variables are: (i) intertemporal inputs and outputs; (ii) storable inputs; (iii) quasi-fixed inputs; (iv) cost of adjustments; (v) lagged productive effects; and (vi) carryovers. Our approach considers (v) and (vi).

A dynamic model under a two-stage framework was proposed by Fukuyama and Weber (2015). The proposed model considered undesirable inputs, carryover inputs, and carryover outputs in addition to the original inputs, intermediate output, and outputs with desirable and undesirable characteristics. More specifically, labor, capital, and equity of the current year were used as the initial inputs, together with the undesirable input (non-performing loans) of the previous year (t-1), to produce two intermediate products in year t, including (1) deposits, and (2) other raised funds. These two intermediate outputs produced in year t at stage 1 were used as inputs, together with two carryover inputs in year t-1 (loans and securities), to produce three different types of final outputs in stage 2. These are desirable outputs in year t (loans and securities), carryover outputs in year t (carryover assets), and an undesirable output in year t (non-performing loans).

Another study by Bansal et al. (2022) proposed the use of dynamic Luenberger Productivity Indices and applied them to the Indian banking industry. The dynamic nature of the modelling framework was reflected by the inclusion of two desirable carryovers, namely unused assets and net profits, from period t to period t + 1. Non-performing loans were considered as the undesirable carryover input from period t to t + 1, while the study did not account for the dynamic nature of loan loss provisions and instead treated them as a desirable input. Efforts have been made to investigate the role of loan loss reserves as a carryover output in the banking production process (Chao et al., 2015).

Moreover, Fukuyama et al. (2023) proposed a dynamic DEA behavioural model and applied it to the Chinese banking industry. The model not only considered the dual role of intermediate and final outputs in a dynamic manner, but also treated loan loss reserves as the carryover generated at the second stage of the production process in time period t and used it as the carryover input in the first-stage production in time period t + 1. In this study, we differentiate ourselves from Fukuyama et al. (2023) by more explicitly considering the desirability and positive role played by loan loss reserves in the banking production process. Instead of considering the carryover input of loan loss reserves in the first-stage production in time period t + 1, we treat loan loss reserves in period t as a carryover variable in the first stage of production, along with the intermediate products generated in time period t + 1, to generate the final outputs. This modelling framework is designed to consider that the generation of final outputs (income) depends on loans, which are supported by loan loss reserves.

Fukuyama et al. (2023) not only contributed to the DEA banking literature by proposing an innovative behavioural model, but they were also pioneers in proposing a causal analysis to validate the framework of the production process. Their work provides a good example for future DEA studies, as the specification of inputs and outputs in the production process is the basis for the implementation of the model. The choice of inputs and outputs is typically based on relevant economic theory, but because different types of observational data are used for different sectors of the economy, there could be a causal interpretation problem. Proposing and estimating a causal model can address this issue and lend validity to the proposed DEA model. In the causal modelling framework, the study tested that, conditional on income, equity, and loan loss reserves in the previous period, no factor affects personal expenses, total deposits, and fixed assets. Personal expenses, total deposits, and fixed assets are the only factors influencing loans and securities investment. Loan loss reserves in the current period may be caused by themselves in the previous period, and income and equity are only affected by loans and securities investment. Table 1 provides a summary of the reviewed studies above.

Table 1 Review of previous efficiency studies

In order to highlight the importance of incorporating carryover variables in network DEA, we also review relevant studies that adopted network DEA in the banking industry without considering the carryover variable in the production process and make a comparison to the results of the above-reviewed literature. Regarding the Japanese banking industry, Fukuyama and Matousek (2018) estimated Nerlove’s inefficiency between 2001 and 2013 under a two-stage network DEA. Labor and capital were used as the original inputs in the first stage to generate the intermediate product (deposits). Deposits were then further used as inputs in the second stage to produce loans, non-performing loans, and securities investment. The results obtained are completely opposite to Fukuyama and Weber (2015), which shows that the level of inefficiency between 2007 and 2009 is higher than the one between 2008 and 2010.

Regarding Canadian credit unions, Dia et al. (2020, 2022) used a three-stage network DEA to estimate efficiency between 2007 and 2017. The first stage of production used total assets, employee compensation, and other operating expenses to generate deposits, which were further used as inputs to generate loans and securities in the second stage. Finally, the revenue generation stage generated interest income and non-interest income. The results show that there was a sharp decline in the level of efficiency over the examined period, which is mainly attributed to managerial issues at the revenue generation stage. Regarding the Chinese banking industry, Fukuyama and Tan (2022a, 2022b) designed a three-stage DEA framework to estimate efficiency without considering the carryover characteristics in the production process. Labor and other operating expenses were used to produce intangibles, which reflect innovation activities. Labor, as a shared input, together with fixed assets and deposits, was used to generate securities investment, good loans, and bad loans. The securities investment and good loans are related to the production of income in the final stage, and bad loans are related to the production of equity capital. The results show much stronger volatility in the inefficiency level in the Chinese banking industry compared to the results reported by Fukuyama et al. (2023), which considered the carryover characteristics in the production process. It is understandable that the different results obtained here are related to the data sample selected as well as the variables adopted, but whether or not carryover variables are incorporated in network DEA also matters for the results. As Zhou et al. (2019) argue, neglecting carryovers can overestimate bank performance.

2.3 Empirical literature review and hypothesis development

Thanks to the development of operational research, risk can now be incorporated in the evaluation of efficiency when addressing its influence on banking (Fukuyama & Matousek, 2018). Another stream of studies tests the impact of risk on efficiency using various econometric techniques, such as the three-stage least square estimator, Granger-causality test, bootstrapped truncated regression, fractional logit regression, and Tobit regression, among others (Tan & Anchor, 2017; Tan & Floros, 2013, 2018). All empirical banking studies focus on risk at the bank level. Tan and Floros (2013) find a positive relationship between bank risk and bank efficiency using a sample of Chinese commercial banks and the three-stage least square estimator. Tan and Floros (2018) use the Granger-causality test to investigate the relationship between risk and efficiency in the Chinese banking industry and find a significant and positive impact of credit risk on bank efficiency. Tan and Anchor (2017) use bootstrapped truncated regression and fractional logit regression to investigate the impact of credit risk on bank efficiency in China but do not find any robust impact.

In addition to empirical investigations and evidence, the bad luck hypothesis (Berger & DeYoung, 1997) states that an increase in problem loans is primarily attributable to external events rather than managers' skills or their appetite for risk. Higher risk increases costs and managerial efforts, leading to a decline in bank efficiency. These types of bank-specific risks are crucial for bank stability in China, where the banking industry provides credits to every sector of the economy. This raises important questions: (1) what is the level of credit risk from different economic sectors? (2) What would be the impact of credit risk from various economic sectors on efficiency in banking? Addressing these questions would help regulatory authorities in China generate concrete and important policies.

As argued by the bad luck hypothesis, the negative impact of risk on efficiency is primarily due to the increase in bank cost, while allocating credits to a specific economic sector comes with a level of risk. However, we must also consider various other factors when assessing the impact of industry-level risk on bank efficiency, such as the importance and size of the economic sector, the characteristics of a specific sector, the economic sector’s relationship with the banking sector, and the type of credits allocated. For example, for sectors that play a crucial role in the economy and have a high demand for banking credits, there could be cost reduction for the banks derived from economies of scale. On the other hand, if a sector is less developed and carries a higher level of risk, it could lead to an increase in bank cost, resulting in a negative impact on bank efficiency. Furthermore, if an economic sector is closely related to the banking sector, there would be a possibility that they could help reduce the level of risk by undertaking credit allocation to companies with higher levels of risk. Lastly, the type of credits allocated to a specific economic sector matters for the relationship between risk and efficiency. For instance, green credit, allocated to environmentally friendly projects, has a higher level of risk, and there is a negative impact of this type of loan on bank efficiency (Galan & Tan, 2022). Based on these arguments, we propose the following hypotheses:

H1

For the economic sector that plays an important role in the economy and demands a large amount of banking credits, banks may have a potential cost reduction advantage, and there may be a positive impact of risk from this sector on bank efficiency.

H2

If a specific economic sector has a higher level of risk, there will be a negative impact of risk from this sector on bank efficiency.

H3

For the economic sector that has a close relationship with the banking sector and plays a similar role, the impact of risk on bank efficiency will depend on the level of risk brought to the banks by this sector and the level of risk reduction made by this sector for the banking sector. If the former is higher than the latter, there will be a negative impact of risk on bank efficiency. Otherwise, if the latter is higher, there will be a positive impact of risk on bank efficiency.

H4

The impact of risk on bank efficiency from a specific economic sector will depend on the type of credits allocated. Specifically, for the economic sector receiving green credits from banks, risk would have a negative impact on bank efficiency.

3 Research design

3.1 Data description

The current paper collects data from 64 Chinese banks that operated between 2013 and 2020. The sample includes five different bank types: 22 foreign banks, 21 city banks, 7 rural banks, 9 joint-stock banks, and 5 state-owned banks. The data was collected from the Fitch Connect database, which provides financial data for over 30,000 banks worldwide. Three original inputs were used: (1) labor, (2) deposits, and (3) physical capital (fixed assets), to produce two intermediate products: loans and securities investments. We argue that loan loss reserves are produced together with the intermediate products of loans and securities investment, but will not directly affect the next stage of production. These two intermediate products are used as inputs in the next stage of the production process, together with loan loss reserves from the previous period, to generate income. One special characteristic of this paper is considering the role played by loan loss reserves in the production process. The loan loss reserves are produced in the first stage in year t, and we treat them as desirable inputs in year t + 1, together with loans and securities investment, to produce income. The desirable feature of loan loss reserves is attributed to the fact that they provide bank stability. In general, we follow the intermediation approach in selecting inputs and outputs (Sealey and Lindley, 1977). While building on the proposal of the intermediation approach, we extend the model into a multi-period, two-stage dynamic DEA model. In our model, we used physical capital as one input because it includes land, buildings and facilities, and computer equipment, which can be depreciated. Hence, physical capital can be regarded as the equivalent item of fixed assets. In contrast, one of the important components of total assets is loans, which have been widely used as outputs. Our use of physical capital as one of the inputs follows Tan et al. (2021).
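The timing of the variables described above can be sketched in Python as follows. The class name, field names and figures are hypothetical and serve only to show which period each Stage 2 input comes from; this is a sketch of the data layout under those assumptions, not the estimation code used in the study.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class BankYear:
    """One bank-year observation (hypothetical field names and units)."""
    # Stage 1 original inputs
    labor: float
    deposits: float
    physical_capital: float
    # Stage 1 intermediate products
    loans: float
    securities: float
    # Produced in year t alongside the intermediates, used in year t + 1
    loan_loss_reserves: float
    # Stage 2 final output
    income: float


def stage2_inputs(panel: Dict[int, BankYear], year: int) -> Tuple[float, float, float]:
    """Return (loans_t, securities_t, loan_loss_reserves_{t-1}) for one bank."""
    if year not in panel or (year - 1) not in panel:
        raise KeyError("Stage 2 in year t needs data for both year t and year t - 1")
    current, previous = panel[year], panel[year - 1]
    return (current.loans, current.securities, previous.loan_loss_reserves)


# Hypothetical two-year panel for a single bank.
panel = {
    2013: BankYear(labor=1.0, deposits=50.0, physical_capital=2.0,
                   loans=30.0, securities=10.0, loan_loss_reserves=0.8, income=3.0),
    2014: BankYear(labor=1.1, deposits=55.0, physical_capital=2.1,
                   loans=33.0, securities=11.0, loan_loss_reserves=0.9, income=3.4),
}
inputs_2014 = stage2_inputs(panel, 2014)
```

Because Stage 2 in year t draws on reserves from year t - 1, the first year of any panel can only supply the carryover and cannot itself be evaluated at Stage 2 under this layout.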

3.2 Study models and variable measurement

3.2.1 Dynamic network bank production

A two-stage network technology is developed in this section. Let “\({\text{T}}\)” represent the transposition operator. For time periods \(t = 0, 1, \ldots ,\tau\), bank \(j = 1, \ldots ,J^{t}\) converts three standard inputs, namely deposits, labour and physical capital, \({\mathbf{x}}^{t} = \left( {x_{1}^{t} ,x_{2}^{t} ,x_{3}^{t} } \right)^{{\text{T}}} \in {\mathbb{R}}_{ + }^{3}\), in Stage 1 to produce the two intermediate outputs of loans and securities investments, \({\mathbf{z}}^{t} = \left( {z_{1}^{t} ,z_{2}^{t} } \right)^{{\text{T}}} \in {\mathbb{R}}_{ + }^{2}\), which are then transformed in Stage 2 to produce a single desirable output of income, \(y^{t} \in {\mathbb{R}}_{ + }\), and a scalar undesirable output of loan loss reserve, \(c^{t} \in {\mathbb{R}}_{ + }\). The loan loss reserve is carried over to the next period. That is, this carryover variable \(c^{t}\) links two different periods in two ways: (a) the reserve for loan losses produced in Stage 1 is chosen by optimally deciding the configuration of good outputs and the carryover variable; and (b) the loan loss reserve in period t-1 (\(c^{t - 1}\)) affects Stage 2 production in period t in the sense that the reserve produced in the previous period is used in Stage 2 in the current period.

Therefore, we assume that managers decide the trade-offs between the final desirable outputs and the carryover between current and future periods. Greater amounts of loans \(z_{1}^{t}\) and securities investments \(z_{2}^{t}\), as well as the loan loss reserve in the previous period \(c^{t - 1}\), are desirable inputs because such activities create larger amounts of income \(y^{t}\). On the other hand, larger amounts of \(z_{1}^{t}\) and \(z_{2}^{t}\) lead to a contraction of \(c^{t}\), given fixed amounts of the other asset-related items, \(\left( {other \, assets} \right)^{t}\), on the asset side of the balance sheet. To explain the relationship among \(z_{1}^{t}\), \(z_{2}^{t}\) and \(c^{t}\), we write the total assets, \(\left( {total \, assets} \right)^{t}\), on the asset side of the balance sheet as follows:

$$\begin{array}{c}{\left(total \, assets\right)}^{t}={z}_{1}^{t}+ {z}_{2}^{t}+{\left(other \, assets\right)}^{t}\end{array}$$
(1)

We treat the loan loss reserve as a positive carryover variable. While the loan loss reserve is reported as a negative value on the asset side of the balance sheet, it takes a value greater than zero in this study. Hence, \(\left( {total \, assets} \right)^{t}\) is expressed as

$$ \begin{array}{*{20}c} {\left( {total\; assets} \right)_{{}}^{t} = \left( {nominal\; assets} \right)_{{}}^{t} - c_{{}}^{t} } \\ \end{array} $$
(2)

where \(\left( {nominal \, assets} \right)^{t}\) denotes the nominal assets at time t. Nominal assets are distinguished from the item called total assets in the balance sheet: as shown in Eq. (2), \(\left( {nominal \, assets} \right)^{t}\) is the sum of the total assets reported in the balance sheet and the loan loss reserve (the carryover variable). Combining Eq. (1) and Eq. (2) yields

$$\underbrace{ \mathop {\left( {nominal\; assets} \right)^{t} -c_{{}}^{t}}}\limits_{{\left( {total \;assets} \right)_{{}}^{t} } } = z_{1}^{t} + z_{2}^{t} + \left( {other\; assets} \right)_{{}}^{t} $$
(3)

Consequently, we can write (see Footnote 1)

$$ \begin{array}{*{20}c} {\left( {nominal\; assets} \right)_{{}}^{t} - \left( {other \;assets} \right)_{{}}^{t} = z_{1}^{t} + z_{2}^{t} + c_{{}}^{t} } \\ \end{array} $$
(4)

Assume that bank managers make decisions on the amounts of \({z}_{1}^{t}\), \({z}_{2}^{t}\) and \({c}^{t}\), while keeping \({\left(nominal \, assets\right)}^{t}\) fixed. Now suppose that the bank managers would like to increase \({c}^{t}\). To do so, the total assets item in the balance sheet must decrease in view of Eq. (2), and hence the sum of \({z}_{1}^{t}\), \({z}_{2}^{t}\), and \({\left(other \, assets\right)}^{t}\) must decrease according to Eq. (1). If the decrease in \({\left(total \, assets\right)}^{t}\) is achieved entirely by decreasing \({\left(other \, assets\right)}^{t}\), then Eq. (4) states that it is possible to increase \({c}^{t}\) without changing \({z}_{1}^{t}\) and \({z}_{2}^{t}\). It should be noted that \({z}_{1}^{t}\) and \({z}_{2}^{t}\) can also be changed when \({c}^{t}\) is increased, as long as Eq. (4) holds. Figure 1 shows the bank production structure.

Fig. 1 Dynamic network structure
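The balance-sheet reasoning behind Eqs. (1)-(4) can be checked numerically. The following is a minimal sketch with made-up figures (the function names and all numbers are illustrative assumptions, not values from the sample): it raises the loan loss reserve while holding nominal assets fixed and absorbing the whole change in other assets, so that loans and securities stay unchanged.

```python
def total_assets(z1, z2, other_assets):
    """Eq. (1): reported total assets = loans + securities + other assets."""
    return z1 + z2 + other_assets


def nominal_assets(z1, z2, other_assets, c):
    """Eq. (3) rearranged: nominal assets = total assets + loan loss reserve."""
    return total_assets(z1, z2, other_assets) + c


# Hypothetical balance-sheet figures (not taken from the paper's data).
z1, z2, other, c = 600.0, 250.0, 150.0, 20.0
nominal = nominal_assets(z1, z2, other, c)

# Increase the reserve by 10 with nominal assets held fixed; the whole
# adjustment is made through other assets, as discussed around Eq. (4).
c_new = c + 10.0
other_new = nominal - z1 - z2 - c_new

# Nominal assets are unchanged, and reported total assets fall by the
# amount of the increase in the reserve, in line with Eq. (2).
assert nominal_assets(z1, z2, other_new, c_new) == nominal
assert total_assets(z1, z2, other_new) == nominal - c_new
```

In this sketch the sum of loans, securities and the reserve rises by exactly the amount by which other assets fall, which is the content of Eq. (4).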

To develop a two-stage network framework, we add “j” in the subscripts of production factors for DMU j as \({\mathbf{x}}_{j}^{t}\), \({\mathbf{z}}_{j}^{t}\), \(y_{j}^{t}\), \({c}_{j}^{t-1}\) and \({c}_{j}^{t}\) to represent the jth decision-making unit,\({DMU}_{j}^{t}\). Then, the technology of stage 1 in period t is defined by

$$ T_{{}}^{1t} = \left\{ {\left( {\begin{array}{*{20}c} {{\mathbf{x}}_{{}}^{t} } \\ {{\mathbf{z}}_{{}}^{t} } \\ {c_{{}}^{t} } \\ \end{array} } \right) \in {\mathbb{R}}_{ + }^{3 + 2 + 1} {|} {\mathbf{x}}_{{}}^{t} \;{\text{can }}\;{\text{produce}}\, \left( {\begin{array}{*{20}c} {{\mathbf{z}}_{{}}^{t} } \\ {c_{{}}^{t} } \\ \end{array} } \right) } \right\} $$
$$=\left\{\left(\begin{array}{c}{\mathbf{x}}^{t}\\ {\mathbf{z}}^{t}\\ {c}^{t}\end{array}\right)\in {\mathbb{R}}_{+}^{3+2+1}| \begin{array}{c}{x}_{1}^{t}\ge {\sum }_{j}^{{J}^{t}}{x}_{1j}^{t}{\lambda }_{j}^{1t}, {x}_{2}^{t}\ge {\sum }_{j}^{{J}^{t}}{x}_{2j}^{t}{\lambda }_{j}^{1t}, {x}_{3}^{t}\ge {\sum }_{j}^{{J}^{t}}{x}_{3j}^{t}{\lambda }_{j}^{1t}, {z}_{1}^{t}\le {\sum }_{j}^{{J}^{t}}{z}_{1j}^{t}{\lambda }_{j}^{1t}, \\ {z}_{2}^{t}\le {\sum }_{j}^{{J}^{t}}{z}_{2j}^{t}{\lambda }_{j}^{1t}, {z}_{1}^{t}+{z}_{2}^{t}+{c}^{t}\le {\sum }_{j}^{{J}^{t}}\left({z}_{1j}^{t}+{z}_{2j}^{t}+{c}_{j}^{t}\right){\lambda }_{j}^{1t}, {\lambda }_{j}^{1,t}\ge 0, j=1,\dots ,{J}^{t}\end{array}\right\}$$
(5)

and the Stage 2 technology in the same period is defined by

$${T}^{2t}=\left\{\left(\begin{array}{c}{c}^{t-1}\\ {{\varvec{z}}}^{t}\\ {y}^{t}\end{array}\right)\in {\mathbb{R}}_{+}^{1+2+1}| \left(\begin{array}{c}{c}^{t-1}\\ {\mathbf{z}}^{t}\end{array}\right) \mathrm{can \, produce} {y}^{t} \right\}$$
$$\begin{array}{c}=\left\{\left(\begin{array}{c}{c}^{t-1}\\ {{\varvec{z}}}^{t}\\ {y}^{t}\end{array}\right)\in {\mathbb{R}}_{+}^{1+2+1}| \begin{array}{c}{c}^{t-1}\ge {\sum }_{j}^{{J}^{t}}{c}_{j}^{t-1}{\lambda }_{j}^{2t}, {z}_{1}^{t}\ge {\sum }_{j}^{{J}^{t}}{z}_{1j}^{t}{\lambda }_{j}^{2t}, {z}_{2}^{t}\ge {\sum }_{j}^{{J}^{t}}{z}_{2j}^{t}{\lambda }_{j}^{2t}, \\ {y}^{t}\le {\sum }_{j}^{{J}^{t}}{y}_{j}^{t}{\lambda }_{j}^{2t}, {\lambda }_{j}^{2t}\ge 0, j=1,\dots ,{J}^{t}\end{array}\right\} \end{array}$$
(6)

where \({\lambda }_{j}^{1,t}\) and \({\lambda }_{j}^{2,t}\) are intensity variables related to the bank j’s Stage 1 and Stage 2 technologies, respectively, at time t. In Eq. (5), a pair of period technologies are connected by the following constraint

$$\begin{array}{c}{z}_{1}^{t}+{z}_{2}^{t}+{c}^{t}\le {\sum }_{j}^{{J}^{t}}\left({z}_{1j}^{t}+{z}_{2j}^{t}+{c}_{j}^{t}\right){\lambda }_{j}^{1t}\end{array}$$
(7)

and Eq. (7) implements the dynamic DEA structure given in Eq. (3). We utilise Eq. (7) in view of the accounting practice that the loan loss reserve, a proxy of the carryover variable, is reported as a nonpositive number on the asset side of the bank balance sheet. In Stage 1, banks combine inputs \({\mathbf{x}}^{t}\) to produce intermediate products \({{\varvec{z}}}^{t}\) and carryover \({c}^{t}\). We believe that constraint (7) reflects the current regulatory situation in Chinese banking and other environmental conditions; consequently, the amount of carryover cannot be chosen completely freely. Within these limits, bank managers have some latitude in the choice of carryovers.
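To make the inequality systems in Eqs. (5)-(7) concrete, the sketch below checks whether a point satisfies them for a given intensity vector. It is only a verification helper under stated assumptions: the function names and the two-bank data are hypothetical, and finding an intensity vector in the first place would require a linear programming solver, which is not shown here.

```python
def weighted_sum(values, lam):
    """Return sum_j values[j] * lam[j]."""
    return sum(v * l for v, l in zip(values, lam))


def satisfies_stage1(x, z, c, X, Z, C, lam, tol=1e-9):
    """Check the inequalities of Eq. (5) for a given intensity vector lam.

    x, z, c : inputs, intermediates and carryover of the evaluated point.
    X, Z, C : observed data, one entry per bank j.
    """
    if any(l < 0 for l in lam):
        return False
    inputs_ok = all(
        x[i] + tol >= weighted_sum([row[i] for row in X], lam)
        for i in range(len(x))
    )
    intermediates_ok = all(
        z[k] <= weighted_sum([row[k] for row in Z], lam) + tol
        for k in range(len(z))
    )
    # Linking constraint (7): z1 + z2 + c appears on both sides.
    totals = [sum(row) + c_j for row, c_j in zip(Z, C)]
    link_ok = sum(z) + c <= weighted_sum(totals, lam) + tol
    return inputs_ok and intermediates_ok and link_ok


def satisfies_stage2(c_prev, z, y, C_prev, Z, Y, lam, tol=1e-9):
    """Check the inequalities of Eq. (6) for a given intensity vector lam."""
    if any(l < 0 for l in lam):
        return False
    carry_ok = c_prev + tol >= weighted_sum(C_prev, lam)
    intermediates_ok = all(
        z[k] + tol >= weighted_sum([row[k] for row in Z], lam)
        for k in range(len(z))
    )
    output_ok = y <= weighted_sum(Y, lam) + tol
    return carry_ok and intermediates_ok and output_ok


# Two hypothetical banks (illustrative numbers only).
X = [(10.0, 100.0, 5.0), (20.0, 180.0, 8.0)]  # three Stage 1 inputs
Z = [(60.0, 30.0), (120.0, 50.0)]             # loans, securities
C = [4.0, 6.0]                                # loan loss reserve in t
C_prev = [3.0, 5.0]                           # loan loss reserve in t-1
Y = [12.0, 20.0]                              # income

lam = [1.0, 0.0]  # the first bank used as its own reference unit
own_point_stage1 = satisfies_stage1(X[0], Z[0], C[0], X, Z, C, lam)
larger_reserve = satisfies_stage1(X[0], Z[0], C[0] + 6.0, X, Z, C, lam)
own_point_stage2 = satisfies_stage2(C_prev[0], Z[0], Y[0], C_prev, Z, Y, lam)
```

With this intensity vector the first bank's own data satisfy both stages, whereas the same point with a larger reserve violates the linking constraint (7), which illustrates why the carryover cannot be chosen completely freely.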

In Stage 2 the bank converts the earlier period carryover \({c}^{t-1}\) and \({{\varvec{z}}}^{t}\) to produce final output \({y}^{t}\). Connecting (5) and (6), the network technology in period t is described as:

$$\begin{array}{c}{NT}^{t} =\left\{\left(\begin{array}{c}\begin{array}{c}{\mathbf{x}}^{t}\\ {{\varvec{z}}}^{t}\\ {c}^{t}\end{array}\\ {c}^{t-1}\\ {y}^{t}\end{array}\right)\in {\mathbb{R}}_{+}^{3+2+1+1+1}| \left(\begin{array}{c}{\mathbf{x}}^{t}\\ {{\varvec{z}}}^{t}\\ {c}^{t}\end{array}\right)\in {T}^{1t} \& \left(\begin{array}{c}{c}^{t-1}\\ {{\varvec{z}}}^{t}\\ {y}^{t}\end{array}\right)\in {T}^{2t}\right\}\end{array}$$
(8)

Explicitly incorporating Eq. (3), we construct a nonparametric DEA technology in period t as follows:

$$\begin{array}{c}{NT}^{t} =\left\{\left(\begin{array}{c}\begin{array}{c}{\mathbf{x}}^{t}\\ {{\varvec{z}}}^{t}\\ {c}^{t}\end{array}\\ {c}^{t-1}\\ {y}^{t}\end{array}\right)\in {\mathbb{R}}_{+}^{3+2+1+1+1}| \begin{array}{c}{x}_{1}^{t}\ge {\sum }_{j}^{{J}^{t}}{x}_{1j}^{t}{\lambda }_{j}^{1t}, {x}_{2}^{t}\ge {\sum }_{j}^{{J}^{t}}{x}_{2j}^{t}{\lambda }_{j}^{1t}, {x}_{3}^{t}\ge {\sum }_{j}^{{J}^{t}}{x}_{3j}^{t}{\lambda }_{j}^{1t}, {z}_{1}^{t}\le {\sum }_{j}^{{J}^{t}}{z}_{1j}^{t}{\lambda }_{j}^{1t},\\ {z}_{2}^{t}\le {\sum }_{j}^{{J}^{t}}{z}_{2j}^{t}{\lambda }_{j}^{1t}, {z}_{1}^{t}+{z}_{2}^{t}+{c}^{t}\le {\sum }_{j}^{{J}^{t}}\left({z}_{1j}^{t}+{z}_{2j}^{t}+{c}_{j}^{t}\right){\lambda }_{j}^{1t},\\ \begin{array}{c}{c}^{t-1}\ge {\sum }_{j}^{{J}^{t}}{c}_{j}^{t-1}{\lambda }_{j}^{2t}, {z}_{1}^{t}\ge {\sum }_{j}^{{J}^{t}}{z}_{1j}^{t}{\lambda }_{j}^{2t}, {z}_{2}^{t}\ge {\sum }_{j}^{{J}^{t}}{z}_{2j}^{t}{\lambda }_{j}^{2t},\\ {y}^{t}\le {\sum }_{j}^{{J}^{t}}{y}_{j}^{t}{\lambda }_{j}^{2t}, {\lambda }_{j}^{2t}\ge 0, j=1,\dots ,{J}^{t}\end{array}\end{array}\right\}\end{array}$$
(9)

Then a \(\tau\)-period dynamic network technology is constructed using \({NT}^{t} \left(t=1, \dots ,\tau \right)\) as follows:

$$\begin{array}{c}{NT}^{t} =\left\{\left(\begin{array}{c}\begin{array}{c}{\left({\mathbf{x}}^{1},\dots , {\mathbf{x}}^{\tau }\right)}^{\mathsf{T}}\\ {\left({\mathbf{z}}^{1},\dots , {\mathbf{z}}^{\tau }\right)}^{\mathsf{T}}\\ {\left({c}^{0},{c}^{1},\dots , {c}^{\tau }\right)}^{\mathsf{T}}\end{array}\\ {\left({y}^{1},\dots , {y}^{\tau }\right)}^{\mathsf{T}}\end{array}\right)\in {\mathbb{R}}_{+}^{3\tau +2\tau +(\tau +1)+\tau }| \left(\begin{array}{c}\begin{array}{c}{\mathbf{x}}^{1}\\ {{\varvec{z}}}^{1}\\ {c}^{1}\end{array}\\ {c}^{0}\\ {y}^{1}\end{array}\right)\in {NT}^{1}, \dots , \left(\begin{array}{c}\begin{array}{c}{\mathbf{x}}^{\tau }\\ {{\varvec{z}}}^{\tau }\\ {c}^{\tau }\end{array}\\ {c}^{\tau -1}\\ {y}^{\tau }\end{array}\right)\in {NT}^{\tau }\right\}\end{array}$$
(10)

Let \({\mathbf{g}}^{t}={\left({\mathbf{g}}^{x,t},{g}^{y,t}\right)}^{\mathsf{T}}\in {\mathbb{R}}_{+}^{3+1}\), with \({\mathbf{g}}^{x,t}=\left({g}_{1}^{x,t},{g}_{2}^{x,t},{g}_{3}^{x,t}\right)\), be a period-dependent direction vector utilised to scale exogenous inputs and outputs to the efficient frontier.

Using dynamic network technology (10), we define the dynamic \(\tau\)-period network directional distance function for bank o as:

$$DNIneff=max \, {\alpha }^{1}+{\alpha }^{2}+\dots +{\alpha }^{\tau }$$

subject to:

$$\begin{array}{c}\begin{array}{c}{x}_{1o}^{t}-{\alpha }^{t}{g}_{1}^{x,t} \ge {\sum }_{j}^{{J}^{t}}{x}_{1j}^{t}{\lambda }_{j}^{1t}, {x}_{2o}^{t}-{\alpha }^{t}{g}_{2}^{x,t} \ge {\sum }_{j}^{{J}^{t}}{x}_{2j}^{t}{\lambda }_{j}^{1t}, {x}_{3o}^{t}-{\alpha }^{t}{g}_{3}^{x,t} \ge {\sum }_{j}^{{J}^{t}}{x}_{3j}^{t}{\lambda }_{j}^{1t}, \\ {z}_{1}^{t}\le {\sum }_{j}^{{J}^{t}}{z}_{1j}^{t}{\lambda }_{j}^{1t}, {z}_{2}^{t}\le {\sum }_{j}^{{J}^{t}}{z}_{2j}^{t}{\lambda }_{j}^{1t}, {z}_{1}^{t}+{z}_{2}^{t}+{c}^{t}\le {\sum }_{j}^{{J}^{t}}\left({z}_{1j}^{t}+{z}_{2j}^{t}+{c}_{j}^{t}\right){\lambda }_{j}^{1t},\\ \begin{array}{c} {c}^{t-1}\ge {\sum }_{j}^{{J}^{t}}{c}_{j}^{t-1}{\lambda }_{j}^{2t}, {z}_{1}^{t}\ge {\sum }_{j}^{{J}^{t}}{z}_{1j}^{t}{\lambda }_{j}^{2t}, {z}_{2}^{t}\ge {\sum }_{j}^{{J}^{t}}{z}_{2j}^{t}{\lambda }_{j}^{2t}, {y}_{o}^{t}+{\alpha }^{t}{g}^{y,t} \le {\sum }_{j}^{{J}^{t}}{y}_{j}^{t}{\lambda }_{j}^{2t}, \\ {\lambda }_{j}^{1t}\ge 0, {\lambda }_{j}^{2t}\ge 0, {z}_{1}^{t}\ge 0, {z}_{2}^{t}\ge 0, {c}^{t-1}\ge 0, {\alpha }^{t}\ge 0, \left( \forall j=1,\dots ,{J}^{t}, \forall t=1,\dots ,\tau \right)\end{array}\end{array}\end{array}$$
(11)

where \(DNIneff\) stands for dynamic network inefficiency.

Färe and Grosskopf (1996a) strove to utilise the relationship between final outputs and carryover outputs to model a dynamic DEA problem. In a banking context, Fukuyama and Weber (2013, 2015, 2017a, 2017b) and Yu et al. (2021), for example, implemented a dynamic production structure by considering the final outputs and the carryover simultaneously. The standard Malmquist indexes (Caves et al., 1982a, 1982b; Färe et al., 1994) and Luenberger productivity indicators (Chambers, 2002), which do not utilise dynamic structures such as (8), may be thought of as static models. A black-box version of the static directional distance function was developed by Chambers et al. (1996, 1998). The original version provides the maximum reduction in inputs and simultaneous expansion in outputs given a production technology by using a single scaling factor. The dynamic network directional distance function utilises \(\tau \) scaling factors \({\alpha }^{1},{\alpha }^{2},\dots ,{\alpha }^{\tau }\).
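For intuition about the single scaling factor of the static, black-box directional distance function, consider a deliberately simplified case that is not the model of this paper: one input, one output and constant returns to scale. Under these assumptions the frontier is the ray with the largest observed output-input ratio, and the scaling factor has a closed form. The function name and the data below are hypothetical.

```python
def static_ddf_crs(x_o, y_o, xs, ys, g_x, g_y):
    """Directional distance for one input and one output under CRS.

    Solves max alpha such that (x_o - alpha * g_x, y_o + alpha * g_y)
    stays in the technology {(x, y): y <= r * x}, where r is the best
    observed output-input ratio. Assumes positive inputs and
    g_y + r * g_x > 0.
    """
    best_ratio = max(y / x for x, y in zip(xs, ys))
    return (best_ratio * x_o - y_o) / (g_y + best_ratio * g_x)


# Hypothetical observations of three units.
xs = [2.0, 4.0, 5.0]
ys = [2.0, 6.0, 5.0]

# The first unit is evaluated with direction (g_x, g_y) = (1, 1).
alpha = static_ddf_crs(xs[0], ys[0], xs, ys, g_x=1.0, g_y=1.0)

# Projected point: input contracted and output expanded by the same factor.
x_proj = xs[0] - alpha * 1.0
y_proj = ys[0] + alpha * 1.0
```

The projected point lies on the frontier ray, and a unit that already attains the best ratio receives a score of zero. The dynamic network model differs in that it has one such factor per period and links the periods through the carryover.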

Before concluding this section, let us mention why we adopted the current dynamic framework, which uses the directional distance function (Chambers et al., 1996, 1998), rather than the slack-based measure (Lozano, 2016; Moreno & Lozano, 2018; Tone, 2001; Tone & Tsutsui, 2010; Fukuyama & Weber, 2010). The directional distance function is only consistent with the weakly efficient frontier, whereas the slack-based measure projects all decision-making units onto the strongly efficient frontier. Despite this fact, we have chosen the directional distance function as the basis because, in a static setting, the constant and the variable returns to scale directional distance functions completely characterise the production technologies consistent with the constant and the variable returns to scale production possibility sets due to Charnes et al. (1978) and Banker et al. (1984), respectively. By contrast, the reference technology derived from slack-based measures is not equivalent to the resultant constant or variable returns to scale technology (as stated, for example, by Fukuyama, Matousek & Tzeremes, 2021). The \(DNIneff\) model (11) shows how the current carryovers are not only identified (i.e., Eq. (3) or Eq. (4)) but also carried over to the next period (i.e., Eq. (7)). In other words, \(DNIneff\) considers the final outputs and carryovers jointly in a dynamic decision-making framework. In contrast, Tone and Tsutsui’s (2010) dynamic slacks-based framework does not directly incorporate such a dynamic role of carryovers (see Footnote 2 and, for example, the conclusion section of Tone et al., 2019). We think it is of great significance to consider the carryovers across time periods in productivity analysis, although we acknowledge the contribution of Tone and Tsutsui (2010), which stimulated many dynamic DEA-based publications.
Moreover, while the individual bank-based carryover specification by Fukuyama and Weber (2013, 2015) considers various levels of time deposits, our specification does not use time deposits, partly due to data limitations, and hence it is useful for dynamic bank production analysis when data are limited. In view of these considerations, the directional distance function is used in this study.

In this study, we have adopted network DEA because of the following strengths: (1) various returns to scale properties (i.e. variable returns to scale and constant returns to scale) and the dynamic network structure can be implemented simultaneously as a simple optimization problem by adding or deleting the constraint on the sum of the intensity variables; (2) our dynamic network DEA imposes the nature of the returns to scale with respect to the intensity variables, but not the functional form. The imposition of a functional form by a parametric model not only requires the addition of parameter constraints, but also makes the estimation framework difficult. For instance, if the multi-output setting is represented by the standard Cobb–Douglas production function, the convexity of the production possibility set will not be allowed. On the other hand, if the multi-output setting is represented by a more flexible functional model, such as a translog or quadratic model, the imposition of a dynamic network structure and various returns to scale may make the estimation complicated. It should be noted, however, that incorporating noise in a parametric model may sometimes be relatively easy. So, to cope with this limitation of the dynamic network DEA, we check its validity by means of causal modelling similar to Fukuyama et al. (2023).

Although returns to scale are usually considered in a static framework, this study defines the technology with a carryover variable with respect to previous and current periods. Therefore, our technology is somewhat different from the standard production technology, but each time-period technology \({NT}^{t}\) given in Eq. (9) exhibits constant returns to scale in the following sense: \({NT}^{t}=\delta {NT}^{t}\) for all \(\delta >0\). Note that if T is the production possibility set based on Charnes et al. (1978), then T exhibits constant returns to scale, i.e., \(T=\delta T\) for all \(\delta >0\). Note also that constant returns to scale is considered to be a socially desirable scale. Our model employs the input orientation because Chinese banks are considered too big and should be downsized (Fukuyama & Tan, 2021a, 2021b; Wilson & Zhao, 2022).
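The constant returns to scale property of the period technology in Eq. (9) can be verified directly; the following is a brief sketch of the argument, assuming the constraint set of Eq. (9) exactly as stated. Take any point of the period technology with intensity variables \({\lambda }_{j}^{1t}\) and \({\lambda }_{j}^{2t}\), and any \(\delta >0\). Multiplying each inequality by \(\delta\) gives, for a representative input \(i\), the linking constraint and the final output,

$$ \begin{aligned} \delta {x}_{i}^{t} & \ge {\sum }_{j}^{{J}^{t}}{x}_{ij}^{t}\left(\delta {\lambda }_{j}^{1t}\right), \\ \delta \left({z}_{1}^{t}+{z}_{2}^{t}+{c}^{t}\right) & \le {\sum }_{j}^{{J}^{t}}\left({z}_{1j}^{t}+{z}_{2j}^{t}+{c}_{j}^{t}\right)\left(\delta {\lambda }_{j}^{1t}\right), \\ \delta {y}^{t} & \le {\sum }_{j}^{{J}^{t}}{y}_{j}^{t}\left(\delta {\lambda }_{j}^{2t}\right), \end{aligned} $$

and the remaining constraints scale in the same way. Since \(\delta {\lambda }_{j}^{1t}\ge 0\) and \(\delta {\lambda }_{j}^{2t}\ge 0\), the scaled point satisfies Eq. (9) with the scaled intensity variables, so the scaled technology is contained in the original one. Applying the same argument with \(1/\delta\) gives the reverse inclusion, and hence equality. The argument relies on the absence of a constraint on the sum of the intensity variables; adding such a constraint, as under variable returns to scale, would break it.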

3.2.2 Causal interpretation

In general, we have a joint probability distribution in the form \(p({{\varvec{x}}}^{t},{{\varvec{z}}}^{t},{{\varvec{c}}}^{t},{{\varvec{c}}}^{t-1},{{\varvec{y}}}^{t})\) which can be written as

$$p({{\varvec{x}}}^{t},{{\varvec{z}}}^{t},{{\varvec{c}}}^{t},{{\varvec{c}}}^{t-1},{{\varvec{y}}}^{t})=p({{\varvec{y}}}^{t}|{{\varvec{x}}}^{t},{{\varvec{z}}}^{t},{{\varvec{c}}}^{t},{{\varvec{c}}}^{t-1})p({{\varvec{z}}}^{t}|{{\varvec{x}}}^{t},{{\varvec{c}}}^{t},{{\varvec{c}}}^{t-1})p({{\varvec{c}}}^{t}|{{\varvec{x}}}^{t},{{\varvec{c}}}^{t-1})p({{\varvec{x}}}^{t}|{{\varvec{c}}}^{t-1})p({{\varvec{c}}}^{t-1}).$$
(12)

Here \(p\) denotes a probability distribution and \(|\) stands for “conditional on”; therefore, \(p({{\varvec{y}}}^{t}|{{\varvec{x}}}^{t},{{\varvec{z}}}^{t},{{\varvec{c}}}^{t},{{\varvec{c}}}^{t-1})\) denotes the probability distribution of \({{\varvec{y}}}^{t}\) conditional on \({{\varvec{x}}}^{t},{{\varvec{z}}}^{t},{{\varvec{c}}}^{t},{{\varvec{c}}}^{t-1}\).

Under restriction (2), the general specification (12) is represented as follows:

$$p({{\varvec{x}}}^{t},{{\varvec{z}}}^{t},{{\varvec{c}}}^{t},{{\varvec{c}}}^{t-1},{{\varvec{y}}}^{t})=p\left({{\varvec{y}}}^{t}|{{\varvec{z}}}^{t}\right)p\left({{\varvec{z}}}^{t}|{{\varvec{x}}}^{t}\right)p\left({{\varvec{c}}}^{t}|{{\varvec{x}}}^{t}\right)p({{\varvec{x}}}^{t})p({{\varvec{c}}}^{t-1}).$$
(13)

which embodies the following claims:

1. \(p({{\varvec{y}}}^{t}|{{\varvec{x}}}^{t},{{\varvec{z}}}^{t},{{\varvec{c}}}^{t},{{\varvec{c}}}^{t-1})\) depends only on \({{\varvec{z}}}^{t}\).

2. \(p({{\varvec{z}}}^{t}|{{\varvec{x}}}^{t},{{\varvec{c}}}^{t},{{\varvec{c}}}^{t-1})\) depends only on \({{\varvec{x}}}^{t}\).

3. \(p({{\varvec{c}}}^{t}|{{\varvec{x}}}^{t},{{\varvec{c}}}^{t-1})\) does not depend on \({{\varvec{c}}}^{t-1}\).

4. \(p({{\varvec{x}}}^{t}|{{\varvec{c}}}^{t-1})\) does not depend on \({{\varvec{c}}}^{t-1}\).

Equivalently, based on the production process we designed in our dynamic network DEA, the causal analysis tests whether (1) securities and loans in the current period are the only factors affecting net income in the current period; (2) labour, capital and deposits in the current period are the only factors affecting the production of loans and securities in the current period; (3) loan loss reserves in the current period are not affected by those in the previous period; and (4) labour, capital and deposits in the current period are not affected by loan loss reserves in the previous period.
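The meaning of a claim such as "the distribution of income depends only on the intermediate products" can be illustrated with a toy discrete example. The sketch below is an assumption-laden illustration rather than the test used in the paper: it builds a joint distribution for three binary variables along a chain from inputs to intermediates to income, using made-up probabilities, and recovers the conditional distribution of income from the joint.

```python
from itertools import product

# Hypothetical binary toy model following the chain x -> z -> y.
p_x = {0: 0.6, 1: 0.4}
p_z_given_x = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}    # p(z | x)
p_y_given_z = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.25, 1: 0.75}}  # p(y | z)

# Joint distribution built from the restricted factorisation.
joint = {
    (x, z, y): p_x[x] * p_z_given_x[x][z] * p_y_given_z[z][y]
    for x, z, y in product((0, 1), repeat=3)
}


def cond_y(y, x, z):
    """p(y | x, z) recovered from the joint distribution."""
    denom = sum(joint[(x, z, other)] for other in (0, 1))
    return joint[(x, z, y)] / denom


# Once z is fixed, changing x leaves the distribution of y unchanged.
gaps = [abs(cond_y(1, 0, z) - cond_y(1, 1, z)) for z in (0, 1)]
```

In this toy model the entries of `gaps` are zero up to rounding, which is the discrete analogue of the first claim. With continuous data the same restriction has to be assessed statistically, which motivates the approximation approach described next.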

The decomposition (13) may or may not be supported by the data; this is a particular cause for concern when the sample consists of annual data, although our sample is a panel dataset. In order to deal with the curse of dimensionality, we approximate the probability distributions to test the decomposition. Let \({\widehat{u}}_{(k)}\) denote an inefficiency score and \({F}_{(k)}\) its machine-learning approximation, which depends on a parameter vector \({\theta }_{(k)}\) used to fit the values \({\widehat{u}}_{(k)}\). Let

$${\widehat{u}}_{(k)}={F}_{(k)}\left({{\varvec{y}}}^{t},{{\varvec{c}}}^{t},{{\varvec{c}}}^{t-1},{{\varvec{z}}}^{t},{{\varvec{x}}}^{t};{\theta }_{(k)}\right), k\in \{1,2,3,4\},$$
(14)

represent the four machine-learning approximations of probability distributions obtained from the DEA (in)efficiency scores. Figure 1 indicates the following:

(A) \({F}_{(1)}\) depends only on \({{\varvec{z}}}^{t}\) but not \({{\varvec{c}}}^{t},{{\varvec{c}}}^{t-1},{{\varvec{x}}}^{t}\).

(B) \({F}_{(2)}\) depends only on \({{\varvec{x}}}^{t}\) but not \({{\varvec{c}}}^{t-1},{{\varvec{z}}}^{t}\) or anything else.

(C) \({F}_{\left(3\right)}\) depends only on \({{\varvec{c}}}^{t-1}\) but not \({{\varvec{x}}}^{t}\) or anything else.

(D) \({F}_{\left(4\right)}\) does not depend on anything else.

Without experimental data, it is not easy to establish causal relationships (Pearl, 2009; Peters et al., 2013, 2017; Pfister et al., 2019). However, the prediction properties of causal models are not influenced by changes in the environment or by interventions in the covariates, because the confounding variables are taken into consideration. An example of confounding is that \({{\varvec{c}}}^{t-1}\) influences both \({{\varvec{z}}}^{t}\) and \({{\varvec{c}}}^{t}\), so that a spurious relationship between \({{\varvec{z}}}^{t}\) and \({{\varvec{c}}}^{t}\) occurs. If the confounding effect of \({{\varvec{c}}}^{t-1}\) is ignored, the analysis provides a misleading result.

Hence, we are interested in finding a reasonable causal relation, so we turn to the deconfounding approach established in a general setting. The deconfounding method is proposed by Wang and Blei (2019), in which a set of common factors are used to represent the relationship between certain variables:

$${{\varvec{Y}}}_{it}={{\varvec{\delta}}}_{i}{f}_{t}+{{\varvec{\xi}}}_{it},$$
(15)

where \({{\varvec{\delta}}}_{i}\) represents a \(d\times 1\) vector of factor loadings for bank \(i=1,\dots ,J\) (see Footnote 3), the time-dependent variable \({f}_{t}\) is the single common factor, and \({{\varvec{\xi}}}_{it}\) is a \(d\times 1\) error term vector. The common factor \({f}_{t}\) at time \(t\) is constant over \(i\), and the factor loadings \({{\varvec{\delta}}}_{i}\) of bank i are constant over \(t\). Here, the upper-case letter \({{\varvec{Y}}}_{it}\) represents a potential outcome from the population, assessed at its single cause \({f}_{t}\), which is in the data set.

A number of procedures can be used to extract the common factor from the data and the nonlinear models are recommended by Wang and Blei (2019); one example would be the quadratic factor model as below:

$${{\varvec{Y}}}_{it}={{\varvec{\delta}}}_{(\mathrm{I})}{f}_{t}+{{\varvec{\delta}}}_{(\mathrm{II})}{f}_{t}^{2}+{{\varvec{\xi}}}_{it},$$
(16)

where \({{\varvec{\delta}}}_{(\mathrm{I})}\) and \({{\varvec{\delta}}}_{(\mathrm{II})}\) are \(d\times 1\) vectors of factor loadings. A few additional studies discuss alternative methods for deconfounding (Chernozhukov et al., 2017; Schneeweiss et al., 2009; McCaffrey et al., 2004; Lee et al., 2010).

A dynamic factor model with the following form is proposed due to the consideration that high persistency is the characteristic of the missing environmental variables:

$$\begin{array}{c}{{\varvec{Y}}}_{it}=\boldsymbol{\varphi }\left({f}_{t};{{\varvec{\delta}}}_{(\mathrm{I})},{{\varvec{\delta}}}_{(\mathrm{II})}\right)+{{\varvec{\xi}}}_{it},\\ {f}_{t}=\vartheta {f}_{t-1}+{\xi }_{f,it},\end{array}$$
(17)

where the vector function \(\boldsymbol{\varphi }({f}_{t};{{\varvec{\delta}}}_{(\mathrm{I})},{{\varvec{\delta}}}_{(\mathrm{II})})\) determines the systematic part of \({{\varvec{Y}}}_{it}\), \(\vartheta \) is the autoregressive parameter, and \({\xi }_{f,it}\) is an error term with zero mean and variance normalised to one.
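To show how the pieces of Eq. (17) fit together, the following sketch simulates data from a dynamic factor model. It rests on several simplifying assumptions that are not stated in the text: the outcome is scalar (d = 1), the function \(\boldsymbol{\varphi }\) takes the quadratic form of Eq. (16), the loadings are common to all banks, and the factor starts at zero. All names and parameter values are hypothetical.

```python
import random


def simulate_dfm(n_banks, n_periods, delta_1, delta_2, vartheta, sigma, seed=0):
    """Simulate a scalar dynamic factor model with a quadratic loading.

    f_t    = vartheta * f_{t-1} + e_t,   e_t ~ N(0, 1)
    Y_{it} = delta_1 * f_t + delta_2 * f_t**2 + xi_{it},  xi_{it} ~ N(0, sigma^2)
    """
    rng = random.Random(seed)
    f_prev = 0.0  # assumed starting value of the common factor
    factor, panel = [], []
    for _ in range(n_periods):
        f_t = vartheta * f_prev + rng.gauss(0.0, 1.0)
        factor.append(f_t)
        panel.append([
            delta_1 * f_t + delta_2 * f_t ** 2 + rng.gauss(0.0, sigma)
            for _ in range(n_banks)
        ])
        f_prev = f_t
    return factor, panel


# Hypothetical configuration loosely mirroring the panel dimensions.
factor, panel = simulate_dfm(
    n_banks=64, n_periods=8, delta_1=1.0, delta_2=0.3, vartheta=0.9, sigma=0.5
)
```

A value of `vartheta` close to one produces the high persistence that motivates the dynamic specification. In estimation the direction is reversed: the factor and loadings are inferred from the observed panel rather than simulated.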

In addition, \({{\varvec{\xi}}}_{it}\) follows a multivariate normal distribution with a zero mean vector and a diagonal covariance matrix \(\Sigma \), i.e. \({{\varvec{\xi}}}_{it}\sim {\mathcal{N}}_{G}(0,\Sigma )\). Any persistent missing variables which are not observed by the researcher but are important in defining the operational context and environment can be picked up by the Dynamic Factor Model (DFM). For a set of variables under consideration, a proper causal interpretation can be facilitated by a proper deconfounding (Wang and Blei, 2019). In other words, a proper deconfounding indicates:

(a) Models pass the causal interpretation test.

(b) In the context of cross-validation or hold-out samples, the predictive ability of DEA models is reasonable.

Using the deconfounding method explained above, this paper estimates \({{\varvec{\delta}}}_{(\mathrm{I})}\) and \({{\varvec{\delta}}}_{(\mathrm{II})}\) in Eq. (17). In our empirical analysis, the sample distribution of \({{\varvec{Y}}}_{it}\), the realization of which is \({y}_{it}\), depends on the estimates of components of \({{\varvec{\delta}}}_{(\mathrm{I})}\) and \({{\varvec{\delta}}}_{(\mathrm{II})}\) for each bank \(i=1,\dots ,J\) at time \(t\).

3.2.3 Algorithm for implementing the causal structure

The finite mixture of multivariate normal (FMMN) distributions is used to approximate the underlying distributions, through which the restrictions (12)-(13) are tested. Let \({{\varvec{W}}}^{t}=({\mathbf{c}}^{t}, {{\mathbf{z}}^{t},\mathbf{x}}^{t},{\mathbf{c}}^{t-1})\). Noting \({{\varvec{c}}}^{t-1}\) is the auxiliary variable and applying the standard Markov chain Monte Carlo (MCMC) methods typically used in Bayesian modelling, we test \(p({{\varvec{y}}}^{t}|{{\varvec{x}}}^{t},{{\varvec{z}}}^{t},{{\varvec{c}}}^{t})\) as follows.

Step 1. A mixture of G \(M\)-variate normal distributions is used to approximate \(p\left({\mathbf{y}}^{t}|{{\varvec{W}}}^{t}\right)\), where \(M\) denotes the number of outputs. Hence, we can write

$$p\left({\mathbf{y}}^{t}|{{\varvec{W}}}^{t}\right)=\sum_{g=1}^{G}{\varpi }_{g}{f}_{M}^{\mathcal{N}}\left({\mathbf{y}}^{t};{\mu }_{g},{\Sigma }_{g}\right).$$
(18)

In Eq. (18), \({\varpi }_{g}\) denotes the mixing probabilities, \({\mu }_{g}\) is an \(M\times 1\) vector of means, \({\Sigma }_{g}\) is an \(M\times M\) covariance matrix, and \({f}_{M}^{\mathcal{N}}\left({\mathbf{y}}^{t};\mu ,\Sigma \right)\) is the density of an \(M\)-variate normal distribution with mean vector \(\mu \) and covariance matrix \(\Sigma \). The mixing probabilities, means, and covariances are specified as the following functions of \({{\varvec{W}}}^{t}\):

$${\mu }_{g}={{{\varvec{W}}}^{t}}^{\mathrm{^{\prime}}}{\theta }_{\mu ,g} \, \forall g=1,\dots ,G,$$
(19)
$${\varpi }_{g}=\mathrm{exp }\left\{{{{\varvec{W}}}^{t}}^{\mathrm{^{\prime}}}{\theta }_{\varpi ,g}\right\} \, \forall g=1,\dots ,G,$$
(20)
$${C}_{kh,g}=\mathrm{exp }\left\{{{{\varvec{W}}}^{t}}^{\mathrm{^{\prime}}}{\theta }_{\Sigma ,kh,g}\right\} \, \forall g=1,\dots ,G, k\le h=1,\dots ,M ,$$
(21)

Here \({C}_{g}=[{C}_{kh,g}]\) is a lower triangular matrix such that \({C}_{g}{{C}_{g}}^{\mathrm{^{\prime}}}={\Sigma }_{g}\), and \({\theta }_{\mu ,g}\), \({\theta }_{\varpi ,g}\) and \({\theta }_{\Sigma ,kh,g}\) are vectors of parameters (jointly called \(\theta \in {\mathbb{R}}^{d}\)). Because \({{\varvec{W}}}^{t}\) enters the functions of all mixture parameters, tight approximations to any conditional distribution can be obtained as G increases (Geweke & Keane, 2007; Norets, 2010; Norets & Pelenis, 2012).

Step 2. The elements of \(\theta \) related to \({{\varvec{x}}}^{t}\) and \({{\varvec{c}}}^{t}\) are tested for being jointly zero. Equivalently, the flexible FMMN is fitted to the distribution \(p({\mathbf{y}}^{t}|{{\varvec{W}}}^{t})\) using the same methodology as in Step 1.

Step 3. Compute the \(p\)-values of the three models in Steps 1 and 2.
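The structure of Eqs. (18)-(21) and the restriction examined in Step 2 can be sketched for the scalar case. The code below is a simplified illustration, not the estimation procedure of the paper: it assumes a single output (M = 1), so that the Cholesky factor reduces to a standard deviation, and it normalises the exponentiated weights so that the mixing probabilities sum to one, which is an added assumption. Parameter values and names are hypothetical, and no MCMC sampling is performed.

```python
import math


def normal_pdf(y, mu, sd):
    """Density of a univariate normal distribution."""
    return math.exp(-0.5 * ((y - mu) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def dot(w, theta):
    """Inner product W' theta."""
    return sum(a * b for a, b in zip(w, theta))


def mixture_density(y, w, theta_mu, theta_pi, theta_sd):
    """Scalar (M = 1) version of Eq. (18) with covariate-dependent parameters."""
    raw = [math.exp(dot(w, t)) for t in theta_pi]   # Eq. (20), unnormalised
    total = sum(raw)
    weights = [r / total for r in raw]              # assumed normalisation
    means = [dot(w, t) for t in theta_mu]           # Eq. (19)
    sds = [math.exp(dot(w, t)) for t in theta_sd]   # Eq. (21) with M = 1
    return sum(p * normal_pdf(y, m, s) for p, m, s in zip(weights, means, sds))


# W = (intercept, one hypothetical covariate); G = 2 components.
# Every coefficient on the covariate is zero, as under the Step 2 null.
theta_mu = [(0.0, 0.0), (2.0, 0.0)]
theta_pi = [(0.0, 0.0), (0.5, 0.0)]
theta_sd = [(0.0, 0.0), (-0.5, 0.0)]

density_low = mixture_density(1.0, (1.0, 0.5), theta_mu, theta_pi, theta_sd)
density_high = mixture_density(1.0, (1.0, 3.0), theta_mu, theta_pi, theta_sd)
```

When all coefficients attached to a covariate are zero, the conditional density does not change with that covariate, so `density_low` and `density_high` coincide. This invariance is what the joint-zero restriction in Step 2 expresses.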

4 Empirical results and discussion

4.1 Descriptive statistics

The statistics of the production factors are described in Table 2. Upon examining the standard deviation of the variables, it can be observed that Chinese banks in the sample have the largest difference in terms of the volumes of deposits, followed by the outputs loans and securities. This can be attributed to the fact that the Chinese banking industry has a variety of different ownership types and the size of operation among them is substantially different. The difference in the volumes of income generated among the banks in the sample is smaller than the difference in the volumes of outputs generated, indicating that larger banks should focus more on effective and efficient cost management. The smallest difference among the production factors of the banks in the sample is observed for labor. This difference can be mainly attributed to the fact that different numbers of workers are employed for different banks based on the size of the operation. Additionally, the difference can also be explained by the fact that there is a large gap in salary/wage levels between employees and management personnel.

Table 2 Descriptive statistics of inputs and outputs in the production process

4.2 Pairwise correlation and multicollinearity test

We conducted pairwise correlation analysis using the Pearson correlation coefficient to test the relationship between inefficiency and each of the independent variables. Additionally, this correlation coefficient was used to check for the potential issue of multicollinearity among the independent variables. Table 3 presents the results. Two important findings from the table are: (1) the results indicate that risk derived from the agricultural sector and risk from the financial services sector have a negative impact on bank inefficiency, while risk derived from the wholesale sector and Water Conservancy, Environment and Public Facilities have a positive impact. However, in our regression analysis using either bootstrapped truncated regression or Tobit regression, the signs and significance of the independent variables may differ from those obtained from the Pearson correlation coefficient; (2) the coefficients among the independent variables are below 0.8, indicating that the variables used in our analysis are not affected by multicollinearity issues (Tan & Floros, 2012).

Table 3 Pearson correlation coefficients
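The screening rule used above, namely flagging pairs of independent variables whose Pearson correlation reaches 0.8 in absolute value, can be sketched as follows. The variable names and data are made up for illustration and do not reproduce Table 3.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def flag_collinear(columns, threshold=0.8):
    """Return the pairs of variables with |r| at or above the threshold."""
    names = list(columns)
    return [
        (p, q)
        for i, p in enumerate(names)
        for q in names[i + 1:]
        if abs(pearson(columns[p], columns[q])) >= threshold
    ]


# Hypothetical risk indicators for three sectors.
columns = {
    "risk_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "risk_b": [2.0, 4.1, 6.0, 8.2, 10.0],  # nearly proportional to risk_a
    "risk_c": [5.0, 1.0, 4.0, 2.0, 3.0],
}
flagged = flag_collinear(columns)
```

In this example only the pair of nearly proportional series is flagged. A pairwise threshold is a simple screen; it does not detect collinearity that involves three or more variables jointly.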

4.3 Results of the study models

The directions are equal to the average values of labour, physical capital, deposits and income for all sample periods. Therefore, we use the same direction vector for all sample years. Taking the first sample bank (Agricultural Bank of China) as an example, the dynamic network inefficiency of this bank is equal to

$$ \begin{aligned} DNIneff = &\, \alpha_{{}}^{2013} + \alpha_{{}}^{2014} + \alpha_{{}}^{2015} + \alpha_{{}}^{2016} + \alpha_{{}}^{2017} + \alpha_{{}}^{2018} + \alpha_{{}}^{2019} + \alpha_{{}}^{2020} \\ = & \,3.872067 + 3.292366 + 3.517703 + 4.540133 + 4.237997 + 4.185962 + 3.769933 + 3.704772 \\ = &\, 31.12093 \\ \end{aligned} $$
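The aggregation in this example is a plain sum of the yearly scaling factors, which can be reproduced in a few lines. The yearly values are those reported above; the variable names are illustrative.

```python
# Yearly scaling factors reported for the first sample bank.
alphas = {
    2013: 3.872067,
    2014: 3.292366,
    2015: 3.517703,
    2016: 4.540133,
    2017: 4.237997,
    2018: 4.185962,
    2019: 3.769933,
    2020: 3.704772,
}

# Dynamic network inefficiency: objective of model (11) at the optimum.
dn_ineff = sum(alphas.values())

# Year with the largest contribution to the total.
peak_year = max(alphas, key=alphas.get)
```

The sum agrees with the reported total to the displayed precision, and the largest yearly contribution occurs in 2016.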

Figure 2 presents the results on the level of inefficiency, which shows fluctuations during the period from 2013 to 2020. These findings are in line with Antunes et al. (2021), who reported a volatile efficiency trend in Chinese banking after 2014. Notably, the figure illustrates a significant increase in inefficiency in 2016. This was due to the government's removal of restrictions on deposit and loan interest rates after the completion of interest rate liberalization in 2015, which intensified competition in the banking sector. The resulting deterioration in efficiency is consistent with Tan and Floros (2019). The substantial increase in the inefficiency level in 2016 was not revealed by Liu et al., (2020a, 2020b), which proposed a two-stage DEA model based on the meta-frontier boundary and intermediate outputs goal setting to estimate bank efficiency in China. We further notice that there is an improvement in the efficiency level after 2016 (i.e., decrease in the inefficiency level); this can be attributed to the fact that, in 2017, the Chinese banking industry deepened the integration between traditional banking and internet financial platforms, and the use of financial technology in traditional banking has been further embedded in the operation. The use of financial technology in the banking operation reduces bank cost, which is reflected in the improvement in bank efficiency.

Fig. 2 Inefficiency scores between 2013 and 2020

Figure 3 shows that the highest inefficiency and inefficiency volatility were experienced by the state-owned banks. This is in contrast with An et al. (2015), who proposed a slacks-based two-stage DEA model to estimate bank efficiency in China and reported that the level of efficiency of state-owned banks was higher than that of the joint-stock banks. Another study by Liu et al., (2020a, 2020b) reported similar results regarding the superior performance of state-owned banks compared to joint-stock banks and city banks. This is mainly attributed to the fact that they proposed a two-stage meta-frontier DEA network model for measuring the level of efficiency in the Chinese banking industry and the sample covered in their research is smaller compared to ours because foreign banks were not considered by them. We further notice that the highest efficiency was possessed by foreign banks. The figure shows that foreign, city and rural banks were characterised by stable inefficiencies and that, compared to state-owned banks and joint-stock banks, they perform significantly better. This is in contrast with the findings of Shih et al. (2007). The lowest and highest efficiency, occupied by state-owned and foreign banks respectively, is in accordance with Berger et al. (2009).

Fig. 3 Inefficiency scores of different bank types

In order to show the accuracy of our proposed innovative model, we compare the inefficiency scores generated from the model with two alternative models, including the one which does not treat loan loss reserves as a carryover variable as well as the one derived from the single network directional distance function (Footnote 4). Figure 4 shows the results. As we can see from the figure, the model which does not treat loan loss reserves as a carryover variable and the one from the single network directional distance function generate lower inefficiency scores compared to the ones from our proposed approach. In other words, the efficiency scores generated by these two alternative models are inflated. In addition, it is noticed that our proposed method produces inefficiency scores with a higher degree of volatility over the period.

Fig. 4 Comparison of inefficiency scores among the proposed model, the model without treating Loan Loss Reserves as a carryover and single period network directional distance function

In the second-phase regression analysis, we used the bootstrapped truncated regression to investigate the impact on bank efficiency of credit risk (Footnote 5) derived from four economic sectors: the agricultural sector, the wholesale sector, the Water Conservancy, Environment, and Public Facilities management sector, and the Financial Services sector. This is another contribution we make. All the empirical banking studies concentrated on the influence of risk at the bank-level; no attempt has been made to examine the impact of credit risk at the industry-level on bank efficiency. We collected the data regarding the non-performing loan ratios of 25 different economic sectors over the period 2013–2020 from the annual financial statements published by the China Banking Regulatory Commission. The detailed figures can be found in Table 4 (Footnote 6).

Table 4 Non-performing loan ratios of different economic sectors in China over the period 2010–2017

We can see from the table that the level of credit risk derived from allocating credits to international organizations is quite low compared to that of other economic sectors, with non-performing loan ratios for most years being 0. On the other hand, the data clearly indicates that the level of credit risk derived from allocating credits to the agricultural sector is very high. Our aim is to investigate the impact of credit risk at the industry-level on bank efficiency and examine whether the impact of risk at the industry-level on bank efficiency differs between sectors with a higher risk level and those with a lower level of risk. To achieve this, we first need to select economic sectors with low and high risk. We compute the average non-performing loan ratios for each economic sector and compare the values. We select the two economic sectors with the highest average values as the target sample for sectors with a high level of risk, while the two economic sectors with the lowest average values are selected as the target sample for sectors with a low level of risk. After the calculation, we select the agricultural sector and the wholesale sector as the sectors with a high level of risk, with average non-performing loan ratios of more than 3.7 and 4.2, respectively. The Water Conservancy, Environment and Public Facilities management sector and the Financial Services sector have the lowest levels of credit risk, with average non-performing loan ratios of 0.12125 and 0.24, respectively. Therefore, our second-stage model to investigate the impact of industry-level risk on bank efficiency can be expressed as follows:

$${DNIneff}_{it}={\beta }_{0}+{\beta }_{1}{Agriculture}_{t}+{\beta }_{2}{Wholesale}_{t}+{\beta }_{3}{WCEPF}_{t}+{\beta }_{4}{Financial}_{t}+{\varepsilon }_{it}$$

where \({DNIneff}_{it}={\alpha }_{i}^{t}\) is the dynamic network inefficiency derived from the first-stage analysis (Footnote 7); Agriculture, Wholesale, WCEPF and Financial stand for four different industries, namely, the agricultural industry, the wholesale sector, the Water Conservancy, Environment and Public Facilities Management sector and the financial services sector; and i and t represent bank and year, respectively. The results regarding the impact of credit risk derived from these four economic sectors on efficiency are presented in Table 5. We find that credit risk derived from the agricultural sector decreases efficiency. However, we further find that credit risk derived from the wholesale and retail sector improves bank efficiency. We also find that credit risk from the Water Conservancy, Environment and Public Facilities sector increases bank inefficiency. Looking at the coefficients and probability values of the agricultural sector and the Water Conservancy, Environment and Public Facilities sector, we find that the latter is less significant but has a larger coefficient, which means that the latter has a bigger impact on the inefficiency level (i.e. one unit of increase in the risk level results in a larger degree of decline in the efficiency level compared to the agricultural sector). Finally, we find that credit risk derived from allocating credits to the financial services sector is not significant.
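The sector-selection rule that precedes the regression (average each sector's non-performing loan ratio over the sample years, then keep the two highest and the two lowest averages) can be sketched as follows. The sector labels and ratios below are hypothetical placeholders, not the figures in Table 4, and `select_risk_groups` is an illustrative name.

```python
from statistics import mean

# HYPOTHETICAL non-performing loan ratios (%) by sector and year.
# Placeholder numbers for illustration only, NOT the values in Table 4.
npl_ratios = {
    "sector_a": [4.0, 4.4, 4.1],
    "sector_b": [3.5, 3.9, 3.7],
    "sector_c": [1.2, 1.0, 1.4],
    "sector_d": [0.3, 0.2, 0.25],
    "sector_e": [0.1, 0.15, 0.11],
}


def select_risk_groups(ratios, k=2):
    """Return the k sectors with the highest and the k with the lowest
    average non-performing loan ratio, plus all sector averages."""
    averages = {sector: mean(values) for sector, values in ratios.items()}
    # Rank sectors from the riskiest (highest average) to the safest.
    ranked = sorted(averages, key=averages.get, reverse=True)
    return ranked[:k], ranked[-k:], averages


high_risk, low_risk, averages = select_risk_groups(npl_ratios)
print("high-risk sectors:", high_risk)
print("low-risk sectors:", low_risk)
```

The yearly ratios of the four selected sectors would then enter the second-stage equation as the regressors, which vary over time but not across banks.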

Table 5 Empirical results (bootstrapped truncated regression)

4.4 Validity analyses and robustness tests

Tobit regression (Footnote 8) is used to double check the accuracy of our results; the findings are provided in Table 6. The results confirm that credit risk from the agricultural sector decreases bank efficiency, while credit risk from the wholesale and retail sector increases bank efficiency. Our findings confirm that credit risk from the Water Conservancy, Environment, and Public Facilities sector has a larger negative impact on bank efficiency than the agricultural sector, as evidenced by a significantly larger coefficient. Finally, bank efficiency is not affected by credit risk derived from the financial services sector. We use the methods in the previous section and the Bayes factors to decide whether the restricted versions in (12)–(13) are valid. Our results are reported in Table 7.

Table 6 Empirical results (Tobit regression)
Table 7 p-values

The odds against the restrictions of the causal structure A and B, as reflected by the results in Table 7, are overwhelming, which indicates that without deconfounding, there would be an invalid causal structure in the dataset. This is within our expectation because some important variables may be missed in the data generation process. Quadratic deconfounding in Eq. (16) provides the marginal results. Finally, quadratic (dynamic) deconfounding in Eq. (17) provides the results which are largely consistent with a causal structure. Furthermore, we can test the causal interpretation of the truncated regression results in Tables 5 and 6 (bootstrapped truncated regression, and Tobit regression, respectively). Since specifications A, B, and C are rejected, we focus on specification D and we re-estimate the specifications in Tables 5 and 6 by adding the dynamic factor and its square into the specification. Additionally, in this context, we can test the same specification (viz. bootstrapped truncated regression, and Tobit regression) for causality. The results are reported in Table 8.

Table 8 Deconfounding of models (model without considering Loan Loss Reserves and single period network directional distance function)

As the two specifications (the model without considering loan loss reserves as well as the one with a single period directional distance function) are rejected, they do not admit a causal interpretation, so the respective efficiency evaluation models should not be used. To evaluate requirements (b) and (c) (viz. the predictive ability of the DEA models in cross-validating or hold-out samples should be “reasonable”, and the predictive ability of the DEA models in cross-validating or hold-out samples remains invariant under active interventions into the covariates of the model), we proceed as follows: For the specifications in Tables 3 and 4, we compute a pseudo-R-squared as the squared correlation between the actual and the predicted values. Requirement (b) would hold provided the pseudo-R-squared in cross-validating or hold-out samples does not differ significantly from the baseline pseudo-R-squared in the baseline specification (that uses all the data); we denote the ratio of the hold-out pseudo-R-squared to the baseline pseudo-R-squared by \(\tau \). Based on the \(\tau \) values reported in Table 8, which are very close to 1, we see that in alternative sub-samples, the behaviour of the truncated regression and that of the Tobit regression are not dramatically different. For this purpose, we generated 500 randomly selected sub-samples from the original data and split them into an estimation sample consisting of 80% of the observations and a cross-validation sample of approximately 20%.
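The hold-out check behind \(\tau \) can be illustrated with a short sketch. It rests on several assumptions that are not taken from the study: the data are synthetic, ordinary least squares stands in for the truncated and Tobit regressions, the 500 sub-samples are drawn as random 80/20 splits of the full data, and all function names are hypothetical.

```python
import numpy as np


def pseudo_r2(actual, predicted):
    """Pseudo-R-squared: squared correlation of actual and predicted values."""
    return np.corrcoef(actual, predicted)[0, 1] ** 2


def fit_ols(X, y):
    """Least-squares coefficients (ASSUMED stand-in for truncated/Tobit fits)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def tau_ratios(X, y, n_splits=500, train_share=0.8, seed=0):
    """Ratio of hold-out pseudo-R2 to the full-sample baseline, per split."""
    rng = np.random.default_rng(seed)
    n = len(y)
    n_train = int(round(train_share * n))
    # Baseline: fit and evaluate on all the data.
    baseline = pseudo_r2(y, X @ fit_ols(X, y))
    taus = np.empty(n_splits)
    for s in range(n_splits):
        perm = rng.permutation(n)
        train, hold = perm[:n_train], perm[n_train:]
        coef = fit_ols(X[train], y[train])  # estimate on about 80%
        taus[s] = pseudo_r2(y[hold], X[hold] @ coef) / baseline
    return baseline, taus


# Synthetic data for illustration only: an intercept plus four regressors.
rng = np.random.default_rng(42)
n = 400
X = np.column_stack([np.ones(n), rng.normal(size=(n, 4))])
beta = np.array([4.0, 0.5, -0.3, 0.8, 0.0])
y = X @ beta + rng.normal(scale=0.5, size=n)

baseline_r2, taus = tau_ratios(X, y)
print(f"baseline pseudo-R2: {baseline_r2:.3f}")
print(f"mean tau over splits: {taus.mean():.3f}")
```

Values of \(\tau \) that stay close to 1 across splits indicate that out-of-sample fit is about as good as in-sample fit, which is the pattern the text describes for Table 8.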

Another test in this context is to examine whether the parameter estimates change when we consider alternative estimation samples similar to the above. The relevant p-values are reported in Table 8. The relevant test is a \({\chi }^{2}\)-statistic of the form \({\chi }^{2}={d}^{\prime}{V}^{-1}d,\) where \(d={b}_{1}-{b}_{2}\), \({b}_{1}\) corresponds to the parameter estimates from the truncated/Tobit regression in the whole sample, \({b}_{2}\) is the average of the parameter estimates from the truncated/Tobit regression in alternative sub-samples, and \(V={V}_{1}+{V}_{2}\), where \({V}_{1}\) and \({V}_{2}\) are the corresponding covariance matrices. The empirical results show that both regressions (truncated and Tobit) admit a causal interpretation in our context. In addition to testing the internal validity of the adopted model, following Habib and Mourad (2022) and Habib and Kayani (2022), we also verified the external validity of the adopted model by examining the consistency of the results over time (Table 9). The Mann–Whitney U test showed no statistically significant difference in the inefficiency score distribution for the study years 2013–2014 (p = 0.7896), 2014–2015 (p = 0.7822), 2015–2016 (p = 0.3793), 2016–2017 (p = 0.8600), 2017–2018 (p = 0.9886), 2018–2019 (p = 0.3806), and 2019–2020 (p = 0.9084). The Kruskal–Wallis H test also revealed no statistically significant variation in the efficiency score distribution over the study period (p = 0.9413). The Spearman rank correlation between each year was highly significant as well. Therefore, the general distribution of inefficiency scores and the rate of efficient DMUs do not appear to change significantly from one period to another, and the banks ranked as efficient mostly remain the same from one period to another.
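As a minimal sketch of the coefficient-stability statistic \({\chi }^{2}={d}^{\prime}{V}^{-1}d\) described above, the following computes the statistic and a p-value. The coefficient vectors and covariance matrices are hypothetical, and taking the degrees of freedom equal to the number of coefficients is an assumption, since the text does not state them.

```python
import numpy as np
from scipy.stats import chi2


def coefficient_stability_test(b1, b2, V1, V2):
    """Chi-square statistic d' V^{-1} d with d = b1 - b2 and V = V1 + V2.

    Degrees of freedom are ASSUMED to equal the number of coefficients.
    """
    d = np.asarray(b1, dtype=float) - np.asarray(b2, dtype=float)
    V = np.asarray(V1, dtype=float) + np.asarray(V2, dtype=float)
    # Solve V x = d rather than forming the inverse explicitly.
    stat = float(d @ np.linalg.solve(V, d))
    p_value = float(chi2.sf(stat, df=d.size))
    return stat, p_value


# HYPOTHETICAL estimates: whole-sample coefficients (b1), average of the
# sub-sample coefficients (b2), and their covariance matrices.
b1 = [0.50, -0.20]
b2 = [0.48, -0.18]
V1 = np.diag([0.01, 0.02])
V2 = np.diag([0.01, 0.02])

stat, p_value = coefficient_stability_test(b1, b2, V1, V2)
print(f"chi-square = {stat:.4f}, p-value = {p_value:.4f}")
```

A large p-value means the whole-sample and sub-sample estimates cannot be distinguished, which is the outcome the text reports for both regressions.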

Table 9 The distribution variance of inefficiency scores
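The three external-validity checks summarised in Table 9 (Mann–Whitney U between consecutive years, Kruskal–Wallis H across all years, and Spearman rank correlation) can be reproduced in outline with SciPy. The scores below are synthetic and only mimic a panel in which bank rankings are stable; the variable names are illustrative.

```python
import numpy as np
from scipy.stats import kruskal, mannwhitneyu, spearmanr

# SYNTHETIC inefficiency scores: a persistent bank-level component plus
# small yearly noise. These are not the study's data.
rng = np.random.default_rng(0)
n_banks = 40
bank_effect = rng.gamma(shape=2.0, scale=1.5, size=n_banks)
scores = {
    year: bank_effect + rng.normal(scale=0.2, size=n_banks)
    for year in (2013, 2014, 2015)
}

years = sorted(scores)
pairwise = {}
for prev, curr in zip(years, years[1:]):
    # Do two consecutive years share the same score distribution?
    _, u_p = mannwhitneyu(scores[prev], scores[curr], alternative="two-sided")
    # Are banks ranked similarly in the two years?
    rho, rho_p = spearmanr(scores[prev], scores[curr])
    pairwise[(prev, curr)] = {"mw_p": u_p, "rho": rho, "rho_p": rho_p}
    print(f"{prev}-{curr}: Mann-Whitney p = {u_p:.4f}, Spearman rho = {rho:.3f}")

# Joint test across all years.
_, h_p = kruskal(*(scores[y] for y in years))
print(f"Kruskal-Wallis p = {h_p:.4f}")
```

High p-values for the first two tests together with a strong rank correlation correspond to the pattern the text reports: stable score distributions and largely unchanged bank rankings.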

4.5 Discussions

A review of the theoretical literature on efficiency estimation in the banking context shows that a number of studies have incorporated the risk factor in the production process. We contribute to the literature in bank efficiency analysis by proposing a dynamic two-stage network DEA analysis, in which loan loss reserves, one of the main risk indicators in banking, have been treated as a carryover input variable with a desirable characteristic. In the second-stage empirical analysis, existing studies mainly focus on the influence of bank-level risk on bank efficiency. We contribute to the empirical banking literature by focusing on the impact of industry-level risk on bank efficiency.

Our findings indicate that the Chinese banks have undergone ups and downs with regard to the inefficiency levels; the deterioration and improvement in the efficiency level are mainly attributed to the change in competitive conditions and the use of financial technology in banking operations. The impact of competition on efficiency has been documented in the competition-efficiency and competition-inefficiency hypotheses (Konara et al., 2019), while the influence of financial technology on bank efficiency is supported by Ataullah et al. (2004).

Our results are in contrast with Fukuyama and Tan (2022a, 2022b), which reported an improvement in efficiency in the Chinese banking industry between 2013 and 2016. This difference in findings is mainly attributed to the fact that two different methods were adopted. State-owned banks experienced the highest inefficiency and inefficiency volatility, while foreign banks possessed the highest efficiency. The lowest efficiency of state-owned commercial banks is mainly attributed to the strong support they received from the government. In addition, the complicated and bureaucratic procedure of this bank ownership type leads to significant waste of resources, which eventually drags down the level of efficiency. In comparison, foreign banks have a more open, transparent, and simpler operational pattern. Furthermore, foreign banks have more advanced technology in banking operations and risk management, resulting in a more optimal allocation of resources and an improvement in the level of efficiency. Joint-stock banks were found to be less efficient than city and rural banks. This is different from Antunes et al. (2022), which reported much higher volatility in the level of efficiencies for all ownership types, except for state-owned banks. This is mainly attributed to the fact that Antunes et al. (2022) used a basic DEA model, in which the multi-stage production process and the carryover characteristics of the variables are not considered.

In the second-stage analysis, we further evaluate the impact of risk from four industries on efficiency in banking, including the agricultural sector, the wholesale and retail trade sector, the Water Conservancy, Environment and Public Facilities management sector, and the Financial Services sector. The first two have the highest average non-performing loan ratios among all economic sectors, while the latter two have the lowest average non-performing loan ratios. The bootstrapped truncated regression shows that credit risk derived from the agricultural sector decreases bank efficiency. This negative impact of the agricultural industry on bank performance is evidenced and supported by Osei-Assibey and Asenso (2015), who argued that the agricultural sector is highly risky, and banks may be forced to scale back their lending activities in this sector. In China, there has been a specific type of loan tailored to agriculture called "sannong." More specifically, this type of loan focuses on farmers, rural areas, and agricultural-related activities. The characteristics of this type of loan are that the volume is small, and the allocation of this type of loan has a dispersed distribution, which increases bank costs.

We find that credit risk derived from the wholesale and retail sector improves bank efficiency. The positive impact of the wholesale and retail sector on bank efficiency can be explained by the fact that there is a very important role played by this economic sector. As argued by Su et al. (2022), the proportion of economic value added in China from the wholesale and retail trades sector increased from 1.11% in 1992 to 7.81% in 2017. The benefits derived from economies of scale for allocating credits to this economic sector are more than the losses derived from the volumes of non-performing loans. The positive and significant impact of the wholesale sector and retail sector on bank efficiency can also be explained by the fact that this specific economic sector has a significant and positive externality effect. In other words, the development of this sector will significantly improve the development of other economic sectors such as the manufacturing sector, transportation sector, agricultural sector, among others. The benefits banks receive from the development of these economic sectors are more than enough to offset the potential losses from the non-performing loans of the wholesale and retail sector, resulting in an overall improvement in bank efficiency.

We report that credit risk derived from the Water Conservancy, Environment, and Public Facilities management sector decreases bank efficiency. Danye (2020) argued that in China, the credits allocated to the Water Conservancy, Environment, and Public Facilities management sector focus on green loans, while green credits come with higher risk levels, which further result in a decline in bank efficiency (Galan & Tan, 2022). In the Chinese context, one of the main issues with green loans is that this type of lending to specific types of businesses is still underdeveloped and the businesses applying for green loans are short of information disclosure; the resulting increase in asymmetric information will increase bank risk. In addition, the Chinese banking industry now applies different interest rates to normal loans and green loans, with the latter receiving more favourable treatment in the form of a lower interest rate. Considering that the resources available to the banks are fixed, green loans not only generate lower levels of income, but also reduce the volume of “normal loans” that could be allocated by the banks. Because normal loans are the main source of bank income and profits, this would have a significant and negative impact on bank performance, at least in the short term.

Finally, bank efficiency is not affected by credit risk derived from the financial services sector. Li (2022) argued that there is a dual aim for the financial services sector: one is to provide financial services and the other is profit maximization. Because non-banking financial institutions can reduce the level of bank risk by providing credits to individuals or companies with higher levels of risk, the increase in the level of risk for the banks, from the perspective of higher volumes of non-performing loans from this sector, is compensated by the reduction in bank risk, facilitated by credit allocation from the non-banking financial institutions. Comparing the coefficients between the agricultural sector and the Water Conservancy, Environment and Public Facilities management sector, we can conclude that the latter has a bigger impact, as reflected by the larger coefficient.

In order to ensure the robustness of our first-stage efficiency results, as well as the second-stage regression analysis, we have tested the internal and external validity of our proposed models. We tested the causal structures of our proposed network DEA by examining different models, including the proposed model, the model without consideration of loan loss reserves, and the single-period network directional distance function. We confirmed the validity of our proposed approach, which includes loan loss reserves with a carryover characteristic in the production process. Finally, we looked at the equality of coefficients in alternative sub-samples. The results show that the parameters do not differ, which guarantees the validity of our second-stage regression analysis.

Regarding the external validity of our proposed models, we used the Mann–Whitney U test, the Kruskal–Wallis H test, and the Spearman rank correlation. The results show no statistically significant variation in the efficiency score distribution over the study period, and the rate of efficient banks did not change significantly from one period to another. We further find that the banks ranked as efficient remained mostly the same from one period to another. Our current study moves the current body of knowledge forward. For future studies investigating relevant relationships using econometric analysis, the internal validity of the econometric techniques adopted can be confirmed by relevant statistical causal analysis. For future efficiency studies incorporating a two-stage analysis, the validity of the proposed methods should be checked internally and externally.

5 Concluding remarks

5.1 Research conclusions

The current study significantly contributes to the subject area of empirical banking and operational research in banking by proposing a two-stage dynamic DEA model that considers the desirability of loan loss reserves in the banking production process with a carryover role. We validate our proposed production process using causal inference analysis. Empirically, we estimate the impact of industry-level risk on bank efficiency in a second-stage analysis facilitated by bootstrapped truncated regression and Tobit regression. The findings based on the deconfounding of models show that both regressions (truncated and Tobit) admit a causal interpretation in our context, which guarantees the internal validity of our model and results. We also check the external validity of our results through the Mann–Whitney U test and Kruskal–Wallis H test. The findings show that our results are robust and consistent.

Our first-stage efficiency analysis shows that the inefficiencies were volatile in Chinese banking over the period 2013–2020. We further find that the state-owned banks experienced the highest levels of inefficiency and volatility, while foreign banks have the highest level of efficiency among different ownership types. We also find that although the efficiency level of city and rural banks is lower than that of foreign banks, it is still higher than that of joint-stock commercial banks. Our second-stage analysis shows that credit risk derived from the agricultural sector and the Water Conservancy, Environment and Public Facilities management sector decreases bank efficiency, while credit risk derived from the wholesale and retail sector improves bank efficiency.

5.2 Policy implications

Our results provide important and interesting policy recommendations: (1) additional efforts should be made to carefully check the projects related to credits allocated to the agricultural sector, as enhanced risk monitoring, checking, and management will reduce credit risk derived from this economic sector and further improve bank efficiency; (2) the volumes of credits allocated to the wholesale and retail sector should be increased; (3) additional efforts should also be made to focus on closely monitoring, checking, and managing the credits allocated to the Water Conservancy, Environment, and Public Facilities management sector; (4) as indicated by our results, we recommend that special efforts should be made to reduce credit risk derived from the economic sectors with lower levels of credit risk, because this will significantly increase bank efficiency compared to the economic sectors with higher credit risk.

From a theoretical perspective, the benefits enjoyed by large banks from economies of scale and scope do not play a decisive role in improving performance in the Chinese banking industry. Instead, a simpler, more transparent, and straightforward operating procedure, as well as advanced techniques/technology employed in operations and risk management, are of vital importance in improving bank efficiency in China. The literature has well documented that risk has an impact on bank performance/efficiency, while no discussion has yet been initiated to investigate the impact of the level of risk derived from different economic sectors on bank efficiency. Our study provides new discoveries regarding the impact of risk from different economic sectors on bank efficiency. Based on our findings, we argue from a theoretical perspective that risk derived from economic sectors is not necessarily harmful to bank efficiency, although there is a negative impact on bank stability. In particular, we have the novel finding that risk derived from the wholesale and retail sector benefits bank efficiency. This implies that for economic sectors that can generate positive externality to the development of other economic sectors, the benefits received by the banks can cover the potential losses from non-performing loans allocated to these sectors. At an overall level, various DEA models established for measuring the performance of different economic sectors from different perspectives are short of theoretical support in terms of the validity of the proposed models. We provide a pioneering practice of supplementing and supporting the test of validity by using causal inference analysis. In addition, the econometric analysis, which was widely practiced in previous studies through relevant econometric techniques, lacks robustness and accuracy. We provide an example for future studies to include both econometric analysis and statistical analysis for robustness checking.

5.3 Limitations and future directions

Although the current study makes innovations from the modelling perspective as well as the empirical perspective, it still suffers from a few limitations, and future research can be directed mainly to the following areas: (1) The robustness of the results can be further checked by considering the role played by the materiality of investment in the influence of industry-level risk on bank efficiency. In other words, a specific economic sector may have a higher risk and, therefore, banks will have low exposure in that sector. The investigation of this issue would further confirm the validity of our results; (2) The current study investigates the influence of the selected four specific economic sectors; however, there is a potential issue that a bank or some banks in the sample may not lend to one or more of the selected sectors. We tried to address this issue in our study, but the annual financial statements provided by the banks do not disclose enough information about this. Thus, further studies could collect more data through an alternative channel and confirm that all the banks in the sample provided credits to all the selected economic sectors. This will further strengthen the accuracy and robustness of our results. Shen et al. (2022) proposed an interesting nonparametric approach to estimate the shadow pricing of non-performing loans, which can be regarded as the measurement of risk performance in the banking sector. Future studies could use our dataset, together with the nonparametric approach proposed, to evaluate the shadow pricing of loan loss reserves. This would help check the robustness of the level of risk performance in the Chinese banking sector. Finally, Bădin et al. (2019) argued that conditional efficiency measures are needed when the production process does not depend only on the inputs and outputs but may be influenced by external factors and/or environmental variables. Therefore, another area of future research could further build on the existing dynamic two-stage DEA model by incorporating the environmental variables in the modelling framework.