An ensemble machine learning approach for forecasting credit risk of agricultural SMEs’ investments in agriculture 4.0 through supply chain finance

Belhadi, Amine; Kamble, Sachin S.; Mani, Venkatesh; Benkhati, Imane; Touriki, Fatima Ezahra

doi:10.1007/s10479-021-04366-9

Original Research
Published: 09 November 2021


Annals of Operations Research

Abstract

Credit risk is a significant barrier to agriculture 4.0 investments in supply chain finance (SCF), especially for small and medium-sized enterprises (SMEs). It is therefore important for financial service providers (FSPs) to differentiate between low- and high-quality SMEs in order to forecast credit risk accurately. This study proposes a novel hybrid ensemble machine learning approach to forecast the credit risk associated with SMEs’ agriculture 4.0 investments in SCF. Two core approaches were used, i.e., the Rotation Forest algorithm and the Logit Boosting algorithm. Key variables influencing the credit risk of agriculture 4.0 investments in SMEs were identified and evaluated using data collected from 216 agricultural SMEs, 195 Leading Enterprises and 104 FSPs operating in the African agriculture sector. Besides the classical measures of credit risk assessment that do not involve SCF, the findings indicate that the current ratio, financial leverage, profit margin on sales and growth rate of the agricultural SME are the most important variables that SCF actors need to focus on in order to accurately and optimally forecast and alleviate credit risk. The output of our study provides useful guidelines for SMEs, as it highlights the conditions under which they would be seen as creditworthy by FSPs. In addition, this study encourages the wide application of SCF in financing agriculture 4.0 investments. Owing to the model’s performance, credit risk forecasting accuracy is improved, which results in future savings and credit risk mitigation in agriculture 4.0 investments of SMEs in SCF.

1 Introduction

Just as industry has had to go through technological changes, agriculture needs to follow the same path to survive. Agricultural production is mainly challenged by weather conditions, crop diseases, and pest infections, which expose it to high risks. Increased demand, changing consumer patterns, food waste, climate change, and food scarcity are other issues that intensify the need for a drastic change in the agricultural sector (Clercq et al., 2018). To meet these challenges, agricultural actors, especially small and medium-sized enterprises (SMEs), are shifting towards digitalization. This includes robotics (Lottes et al., 2018), artificial intelligence (AI) (Liu et al., 2020), blockchain (Kamble et al., 2020), internet of things (IoT) (Ray, 2017), and big data analytics (BDA) (Liu et al., 2020). Industry 4.0 (I4.0) holds promising opportunities to reshape the agriculture industry and transform it into what is called “Agriculture 4.0” (Clercq et al., 2018; Liu et al., 2020). However, this transformation requires high investments, which hinders SMEs' ability to invest in I4.0.

At the same time, supply chain finance (SCF) has been growing as an innovative method to support SMEs' operations and investments (Yan et al., 2020). Agricultural SMEs can benefit from SCF to obtain funding for their investments in agriculture 4.0, lower financing costs and improve financing efficiency and effectiveness (Xu et al., 2018). This can be achieved by applying for loans. These loan requests are then evaluated at the financial service provider (FSP) level based on a credit risk assessment. At this level, the leading enterprises (LEs) with whom SMEs are involved act as guarantors between agricultural SMEs and FSPs (Zhang et al., 2015). However, the small number of agricultural SMEs that have cooperated in SCF over the past decades, together with the fast-changing and uncertain technological context of the agricultural sector, generates high credit risk in SCF for investments in agriculture 4.0. This hinders the wide implementation of agriculture 4.0 and creates a further risk of propagation through the supply chain (SC) (Wang et al., 2020a, 2020b). Consequently, it is crucial for FSPs to make correct credit loan decisions in SCF for investment, especially in agriculture 4.0.

For this purpose, machine learning (ML) approaches emerge as a strong forecasting method to predict SMEs' credit risk in SCF for financing both operations and investments (Abedin et al., 2020; Zhu et al., 2019). ML can significantly reduce variance and improve accuracy (Zhang & Ma, 2012). Ensemble machine learning (EML) in particular has been acknowledged to provide better forecasting performance than individual ML and to generate better results with lower data requirements (Bi et al., 2019; Shajalal et al., 2021). EML uses multiple learning algorithms to obtain better predictive performance than any of the constituent algorithms alone (Opitz & Maclin, 1999). Nevertheless, very few studies have focused on the use of EML for SMEs’ credit-risk forecasting in SCF for specific investments such as agriculture 4.0. Besides, no comparison between two or more hybrid methods has yet been conducted. This limited use of EML results in a lack of accuracy and diversity in the proposed models, which can seriously affect the forecasting of credit risk in SCF, increase credit risk uncertainty and hinder further investment in agriculture 4.0. Therefore, there is an urgent need to develop a new model to forecast credit risk, as none of the models developed in the literature can deal with this special context.

Thus, this paper aims to fill this research void by addressing the following research objectives.

To show that the hybrid EML approach performs markedly better than individual ML methods in predicting SMEs' credit risk, thereby helping SMEs, FSPs and LEs to accurately assess and mitigate the credit risk associated with SMEs' I4.0 investments.
To provide the optimal set of variables that SMEs, FSPs and LEs must focus on to alleviate credit risks, thus providing the conditions under which SMEs would be seen as creditworthy in the eyes of their LEs and FSPs.

We therefore propose a hybrid EML approach that investigates the combination of three different sets of techniques, i.e., the Gamma Test (GT), the Rotation Forest algorithm (RotF) and the Logit Boosting algorithm (LB). The proposed approach has been tested using data collected from 216 agricultural SMEs, 195 LEs and 104 FSPs operating in the African agricultural market. The experimental study includes a meticulous performance comparison of the proposed EML approach with selected hybrid and simple classifiers to confirm its performance.

The remainder of our paper is structured as follows: Sect. 2 reviews the literature on agricultural SMEs' credit risk and the methods of forecasting SMEs’ credit risk for investments in agriculture 4.0 through SCF. Section 3 explains the construction and theory of the proposed approach. Section 4 presents the numerical application. Section 5 discusses the results of the experiment. Implications of the study are provided in Sect. 6. Finally, conclusions and future research directions are given in Sects. 7 and 8.

2 Theoretical underpinning

2.1 The concept of agriculture 4.0

Agriculture 4.0 integrates the internal and external interacting networks of farming operations to provide digital, autonomous, and real-time information and allow communication between all parties in the agriculture industry (Wolfert et al., 2017). The technologies that enable agriculture 4.0 include robotics, AI, blockchain, BDA, IoT, and 3D food printing. Robots are used to collect vegetables and harvest the good ones through computer vision (Lottes et al., 2018). AI is used in various agricultural applications such as decision support systems, mobile expert systems, and intelligent animal health monitoring (Liu et al., 2020). Traceability and tracking of product movements are ensured with blockchain, which enhances the recall process for contaminated food. Historical data about weather, seed, and soil quality are analyzed using BDA to provide farmers with insights for better decision-making (Clercq et al., 2018). Monitoring the soil’s efficiency, temperature and humidity has become easier with the IoT (Prathibha et al., 2017). In a nutshell, I4.0 can deliver important improvements that turn the agriculture industry's challenges into opportunities.

2.2 Supply chain finance for investment in agriculture 4.0

Lipper et al. (2014) suggested that adapting agricultural systems will require increased upfront investment. Therefore, it is important to identify and credit the “co-benefits” generated through smart agriculture and to gain access to the appropriate funds to absorb the financial risks generated by innovative activities. The research and development phases may be co-financed with private partners, governments, or even universities (Clercq et al., 2018). The role of governments in supporting this shift is strongly emphasized. This assistance can be achieved by offering financial incentives, identifying potential partners and deals to invest in, and providing affordable infrastructure (Clercq et al., 2018). Despite the importance of government support, the food SC needs to consider other alternatives. In this context, SCF emerges as an effective strategy to lower financing costs and improve financing efficiency and effectiveness (Xu et al., 2018).

Hofmann (2005) defines SCF as an approach for two or more organizations in a supply chain, including external service providers, to jointly create value by planning, steering, and controlling the flow of financial resources on an inter-organizational level. While preserving their legal and economic independence, the collaboration partners are committed to sharing relational resources, capabilities, information, and risk on a medium- to long-term contractual basis. Suppliers and buyers can collaborate with financial institutions and high-technology firms to meet capital requirements, benefiting from both lower interest rates and flexible payment methods (Wuttke et al., 2016). The use of SCF techniques such as the cash-to-cash cycle, the weighted average cost of capital (WACC), and payables and receivables has the potential to markedly reduce operational risks and to improve data visibility and the availability and delivery of cash (Lam et al., 2019).

SCF allows financial institutions to improve their risk assessment process when dealing with SMEs. Deakins and Hussain (1994) consider that this assessment is accompanied by high uncertainty due to information distortion and asymmetry, which affects FSPs’ willingness to approve credit loans. In this context, the SCF process guarantees data availability and accuracy, which helps FSPs to accurately estimate credit risk and profitability. Weak food SC players such as agricultural SMEs can benefit from SCF to invest in I4.0, which helps to solve not only liquidity-profitability problems in the short term but also financial and credit risks in the long term (Lam et al., 2019).

2.3 Agriculture 4.0 investment’s credit risk of agricultural SMEs in SCF

Financing investments in agriculture 4.0 in SCF involves three main actors: agricultural SMEs, LEs involved in the SC with the SMEs, and funders represented by banks or other FSPs. Figure 1 depicts the interactions between those actors.

When the agricultural SME is involved with an LE of a high credit level, FSPs are more disposed to consider the SME's request to finance its investment because a large-scale LE pays the receivable accounts with stable profitability. Therefore, LEs are considered guarantors between agricultural SMEs and FSPs (Chi et al., 2017b; Zhang et al., 2015).

Deciding whether or not to finance an SME’s investment in agriculture 4.0 in an SC, and which SME should get priority, is based on evaluating credit risks in SCF. SCF can reduce credit risk; however, it cannot eliminate it entirely (Zhang et al., 2015). Credit risk therefore represents a barrier to supporting agriculture 4.0 investments in SCF, as it can be ‘contagious’ and could propagate through the whole SC (Wang et al., 2020a, 2020b). Credit risk associated with investments in agriculture 4.0 of agricultural SMEs in SCF depends on five factors:

Agriculture development status in the country: The level of agricultural sector development influences the operating status of the agri-food SC. Good development prospects create a large profit space in terms of both profitability and debt-paying ability for agricultural SMEs and LEs (Wang et al., 2020a, 2020b).
Quality and credit conditions of agricultural SMEs: SMEs are usually regarded as riskier enterprises than larger companies since they face a more uncertain and competitive environment (Stiglitz & Weiss, 1981). Thus, it is important for FSPs to distinguish between ‘high’ and ‘low’ quality SMEs (Song et al., 2016). This quality is based on the governance structure, staff skills, financial performance and management expertise (Zhang et al., 2015). Agricultural SMEs of 'high' quality are more likely to repay their loans on time.
Credit conditions of LEs: LEs are the ones with close ties to FSPs such as banks and financial institutions; therefore, their credit records can be easily verified. If the LE has a good financial status and conforms to the FSP's requirements of profitability and solvency, it can repurchase the contract or act as guarantor between the agricultural SME and the FSP. Hence, credit risk can be reduced further (Zhang et al., 2015).
The profitability of agriculture 4.0 investment: Agriculture 4.0 can improve productivity, cost savings, and profitability. I4.0 would enable farmers to achieve a drastic advance in their financial performance and could help them achieve maximum profits. Consequently, the good profitability gained from agriculture 4.0 would reduce credit risk, as agricultural SMEs would have enough resources to repay their loans.
Status of the cooperative relationship in the SC: Under good cooperation between SMEs and LEs, profitability and commitment to repay debts can be improved (Xingli & Liao, 2020). SC collaboration is one of the most widespread formative elements required to build SC visibility and mitigate various risks (Belhadi et al., 2021). Zhang et al. (2015) suggested that trade relationships between the two parties have a significant impact on credit risk. Therefore, it is crucial to maintain stable and durable relationships across the SC.

2.4 Machine learning approaches for forecasting credit risk of SMEs in SCF

ML emerges as a strong forecasting method to predict SMEs' credit risk in SCF for financing both operations and investments (Zhu et al., 2019). Several studies used individual ML to predict credit risk for SMEs in SCF. For instance, Jackson and Wood (2013) constructed a default prediction model using logit regression. Their model was highly efficient compared with other traditional methods used by FSPs, such as the Z″-Score model. Afterwards, models such as Generalized Extreme Value Regression (Calabrese & Osmetti, 2013) and a nonparametric approach based on Random Survival Forests (Fantazzini & Figini, 2009) were used for loan defaults in SMEs and performed much better than the classical logit model. Further, Chen et al. (2010) proposed the Key Mediating Variable model. These models generally require large datasets, fewer variables, and high accuracy for input data (Zhu et al., 2019). In addition, they are inadequate when dealing with nonlinear pattern classification, and they need to assume a certain data distribution (Huang et al., 2004). These conditions are not met here, since few SMEs have cooperated in SCF over the past decades and the technological context of agriculture is fast-changing and uncertain. This limits the efficiency of individual ML techniques in dealing with credit risk forecasting problems for SMEs in SCF.

On the other hand, EML comes in the form of a concrete finite set of individual ML approaches, which generally allows a much more flexible structure to exist within these approaches. It uses multiple learning algorithms to obtain better predictive performance than that obtained from any constituent learning algorithm alone (Opitz & Maclin, 1999). The number of use cases of EML techniques for credit risk forecasting in the context of SMEs is very limited. Even though EML is acknowledged to provide better forecasting performance than individual ML, very few studies have focused on using EML approaches for SMEs’ credit-risk forecasting in SCF for specific investments such as agriculture 4.0 (Zhu et al., 2016, 2017). Among the few, Zhu et al. (2019) predicted the credit risk in SCF by applying a two-stage hybrid model combining Random Subspace and MultiBoosting. Decision Tree (DT) was taken as the base method for the proposed model. However, this hybrid EML approach has several limitations. First, DT is highly sensitive to small changes in data, which can cause instability (Dev & Eden, 2019). Therefore, it cannot be applied in the context of agriculture 4.0 investments since the available data in this context are dynamic and unstable. Second, the specific C4.5 algorithm used generally overfits the training examples when the data are noisy (Singh & Gupta, 2014). Third, limited datasets were used (46 SMEs and seven CEs), which cannot accurately verify the performance of the proposed models. Finally, the study compares the hybrid approach to classical ML methods, with no comparison between two or more hybrid approaches. The previous models considered only SME-oriented and/or general financial influencing factors, and did not consider specific factors such as profitability in the context of agriculture 4.0 investment. Therefore, to accurately predict credit risk for agricultural SMEs in SCF, both diversity and error reduction need to be achieved, especially in the context of agriculture 4.0 projects, which pose a unique and novel challenge given the lack of complete datasets. Additionally, some of these models do not perform feature selection and work with the original set of independent variables, which is less powerful in forecasting than the optimal set of independent variables. Therefore, there is an urgent need to develop a new model to forecast credit risk in this special context.

To our knowledge, this study is among the earliest to deal with forecasting credit risk in SCF of agricultural SMEs for investment in agriculture 4.0 by making use of the power of EML and its strong forecasting attributes. The special context of African countries in this study is one of the main reasons why we used EML. Because SCF is a novel solution that is not widely applied in Africa, it is practically complicated to obtain a complete dataset, especially for SMEs. In this vein, EML emerges as an efficient solution: Zhu et al. (2019) confirm that EML achieves high forecasting performance when dealing with small datasets, since it is able to extract knowledge by training the model.

Hybrid EML approaches in particular perform better and are more accurate, more robust and more diverse than individual ML (IML) or plain EML (Zhu et al., 2016, 2017), which responds to the needs of our study. Moreover, the assessment of credit risk for agriculture 4.0 investments of agricultural SMEs in SCF is still in its infancy; LEs and FSPs are therefore less disposed to finance such risky operations if credit risk is not accurately assessed.

Building on the previous discussion, we propose a hybrid EML approach that investigates the combination of three different sets of techniques, i.e., GT, the RotF algorithm and LB. We use 22 variables that cover the main factors influencing the credit risk of agricultural SMEs in SCF for investment in agriculture 4.0. For each factor, we define the suitable set of variables together with their definition and measurement. In addition, we use a large dataset, which further supports the performance of our model. We also compare two individual classification models, i.e., RotF and LB, alongside two hybrid classification models, AdaBoost (AB)-RotF and Random Forest (RF)-LB, to evaluate the classification performance.

2.5 Proposed ensemble learning model

EML is a group learning approach that makes use of a set of individual, independent and complementary ML methods to solve the same problem in order to yield better accuracy through collective intelligence (Ilk et al., 2020). The common feature of EML approaches is the repeated running of the base ML algorithm on a sample retrieved from the training dataset. After each run of the algorithm, a new classifier is created and appended to the ensemble (Webb & Zheng, 2004). Each classifier of the ensemble classifies in isolation from the other classifiers, and the resulting votes are then combined to produce a single final classification (Shang & Goes, 2020).

In this study, we propose a hybrid EML approach to estimate the credit risk of agricultural SMEs in SCF for investment in agriculture 4.0, illustrated in Fig. 2. The approach includes two main methods, i.e., the instance partitioning method, which aims to mitigate the effect of noisy data, and the attribute partitioning method, which enhances the accuracy of the results and encourages diversity among trees. Combining these two features allows the proposed approach to benefit from the diversity needed to reduce test error (Webb & Zheng, 2004). It also results in better computational efficiency and considerable improvements in performance. Further, the proposed model is well suited to small datasets since it uses the RotF technique. The latter was found to be more robust and stable than extreme learning machines, SVM, and neural networks, especially with small training sets (Chi et al., 2017a; Han et al., 2018; Sagi & Rokach, 2018).

As shown in the figure, the first stage is the pre-processing of the raw data by conducting nominal data encoding, numerical data scaling, and outlier removal, as well as feature selection using non-linear modeling through GT in order to determine the variables with the greatest impact on forecasting performance. Afterwards, the dataset is split into several sub-datasets using bootstrap sampling with replacement in LB. Then, the RotF technique is used to select new sub-datasets from the original sub-datasets. Next, the LB technique is used to train on these new sub-datasets and provide a set of decisions, one from each new sub-dataset. Finally, majority voting aggregates the decision set and yields the final decision. The overall pseudo-code of the proposed RotF-LB is given in the Appendix. In the following subsections, we describe the three main techniques involved in our model, i.e., feature selection using GT, RotF and LB.
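The resampling and voting stages described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation (their pseudo-code is in the Appendix): `bootstrap_sample`, `majority_vote` and `train_threshold_stub` are hypothetical names, and the toy threshold learner merely stands in for the RotF-rotated LB classifier.

```python
import random
from collections import Counter

def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement (one sub-dataset)."""
    return [rng.choice(rows) for _ in rows]

def majority_vote(votes):
    """Return the most frequent label among the ensemble's votes."""
    return Counter(votes).most_common(1)[0][0]

def train_threshold_stub(sample):
    """Hypothetical stand-in base learner: threshold the feature at the sample mean."""
    mean = sum(x for x, _ in sample) / len(sample)
    return lambda x: 1 if x > mean else 0

# Toy data: (feature value, label), where label 1 marks a risky initiative.
rng = random.Random(0)
data = [(0.1, 0), (0.2, 0), (0.3, 0), (0.7, 1), (0.8, 1), (0.9, 1)]

# One base classifier per bootstrap sub-dataset, then a majority vote.
ensemble = [train_threshold_stub(bootstrap_sample(data, rng)) for _ in range(5)]
prediction = majority_vote([clf(0.95) for clf in ensemble])
```

Because every bootstrap mean lies between 0.1 and 0.9, each stand-in classifier labels the query point 0.95 as class 1, so the vote is unanimous here; with a real base learner the votes would typically disagree and the majority would decide.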

2.6 Feature selection using GT

Selecting appropriate independent variables for ML models is important for improving forecasting accuracy and reducing computation time and overfitting (Gouvêa & Gonçalves, 2007). It is commonly performed to increase the predictive model’s interpretability and possibly reduce its cost (Zhu et al., 2019). Feature selection approaches aim to select a set of features by optimizing a criterion over different combinations of inputs. Afterwards, the dependencies between each combination of input features and the corresponding output are computed (Gouvêa & Gonçalves, 2007).

GT has been widely used in feature selection problems. It is a technique for estimating the noise variance, or the minimum mean square error (MSE), that can be achieved without overfitting by any continuous nonlinear model (Sharafati et al., 2020). GT makes it possible to efficiently estimate the part of the variance of the model’s output that cannot be accounted for by any smooth model based on the inputs. Also, GT leads to more accurate input selection than other methods such as mutual information, and it is time-efficient (Noori et al., 2010). The results of the GT are based on the calculation of a statistic called gamma. The calculation is as follows:

Consider ${X}_{i}$ the vector of independent variables (the input) and ${y}_{i}$ the dependent variable (the output), for 1 ≤ i ≤ m.

1. Compute the delta function of ${X}_{i}$ and the gamma function of ${y}_{i}$ using Eqs. (1) and (2).
$$ \delta \left( k \right) = \frac{1}{m}\sum\nolimits_{i = 1}^{m} \left| X_{i,k} - X_{i} \right|^{2} ,\quad k = 1, \ldots ,p $$
(1)
where $X_{i,k}$ is the kth nearest neighbor of $X_{i}$ (1 ≤ i ≤ m).
$$ \gamma \left( k \right) = \frac{1}{2m}\sum\nolimits_{i = 1}^{m} \left| y_{i,k} - y_{i} \right|^{2} ,\quad k = 1, \ldots ,p $$
(2)
where $y_{i,k}$ is the output value corresponding to $X_{i,k}$.
2. Fit the regression line γ = Aδ + Δ to the p points (δ(k), γ(k)), where A is the slope of the regression line and δ and γ are the delta function and the gamma function, respectively.
3. Take the intercept of the regression line, which is equal to the gamma statistic (Δ), and compute the ratio
$$ \vartheta_{ratio} = \frac{\Delta }{\sigma^{2} \left( y \right)} $$
(3)
where $\sigma^{2}(y)$ is the variance of the dependent variable y.

The $\vartheta_{ratio}$ lies within the range [0, 1], where a value close to zero indicates the best prediction performance.
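The three steps can be illustrated with a short NumPy sketch. It is a minimal version under stated assumptions: the function name `gamma_test`, the default of p = 10 neighbours and the brute-force neighbour search are choices made for this example, not details taken from the study.

```python
import numpy as np

def gamma_test(X, y, p=10):
    """Return (Delta, A, v_ratio): intercept, slope, and Delta / var(y)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    m = len(X)
    # Pairwise squared Euclidean distances between the input vectors.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)            # a point is not its own neighbour
    order = np.argsort(d2, axis=1)[:, :p]   # indices of the p nearest neighbours
    rows = np.arange(m)[:, None]
    delta = d2[rows, order].mean(axis=0)                      # Eq. (1)
    gamma = ((y[order] - y[:, None]) ** 2).mean(axis=0) / 2   # Eq. (2)
    A, Delta = np.polyfit(delta, gamma, 1)  # regression line gamma = A*delta + Delta
    return Delta, A, Delta / y.var()        # Eq. (3)

# Noise-free linear output y = 2x: gamma(k) equals 2*delta(k) exactly,
# so the intercept (the noise estimate) should be numerically zero.
X = np.linspace(0.0, 1.0, 200).reshape(-1, 1)
Delta, A, v_ratio = gamma_test(X, 2.0 * X[:, 0])
```

For feature selection, the same function would be evaluated on different subsets of input columns, and the subset with the smallest ratio would be preferred.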

2.7 Rotation forest algorithm

RotF is a popular ensemble classifier generation technique in which the training set for each base classifier is formed by applying a Principal Component Analysis (PCA) transformation to rotate the original attribute axes. RotF is built from independent decision trees, each trained on the complete dataset in a rotated feature space. The feature set is randomly divided into Z subsets, and PCA is applied to each subset to determine the principal components that maintain variability and generate the training samples of the base classifier. Via k axis rotations, the new features for each base classifier are formed. Diversity is ensured by performing feature extraction separately for each base classifier, and accuracy by keeping all principal components and using the whole dataset to train each base classifier (Rodríguez et al., 2006).

Let L be a training set, L = [X Y], where X is the N × n matrix of input attribute values and Y is the N-dimensional vector of class labels. Further, let:

Z be the number of attribute subsets
x be the data point to be classified, described by n features
T be the number of iterations
F be the feature set

For t = 1, 2, …, T

1. Split the attribute set F into Z subsets ${F}_{t,z}$ (z = 1, 2, …, Z).
2. For z = 1, 2, …, Z
  i. Select the columns of X that correspond to the attributes in ${F}_{t,z}$ and compose the submatrix ${X}_{t,z}$.
  ii. From ${X}_{t,z}$ draw a bootstrap sample of objects ${X}_{t,z}^{\prime}$.
  iii. Run PCA on ${X}_{t,z}^{\prime}$ and obtain the matrix ${D}_{t,z}$, which contains the coefficients of the i-th principal component in its i-th column.
3. Organize the resultant coefficient vectors in the rotation matrix ${R}_{t}$.
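One iteration of this procedure, i.e., the construction of a single rotation matrix, could look as follows in NumPy. This is a sketch under assumptions: the function name `rotation_matrix`, the 75% bootstrap fraction (`sample_frac`) and the use of an eigen-decomposition of the covariance matrix for PCA are choices made for the example rather than details given in the text.

```python
import numpy as np

def rotation_matrix(X, n_subsets, rng, sample_frac=0.75):
    """Build one block-structured rotation matrix R for an N x n data matrix X."""
    n = X.shape[1]
    # Step 1: randomly split the feature indices into Z disjoint subsets.
    subsets = np.array_split(rng.permutation(n), n_subsets)
    R = np.zeros((n, n))
    for cols in subsets:
        # Step 2(i)-(ii): restrict to this subset's columns and bootstrap the rows.
        idx = rng.choice(len(X), size=int(sample_frac * len(X)), replace=True)
        sub = X[np.ix_(idx, cols)]
        # Step 2(iii): PCA via eigen-decomposition of the covariance matrix;
        # each column of `vecs` holds the coefficients of one principal component.
        cov = np.atleast_2d(np.cov(sub, rowvar=False))
        _, vecs = np.linalg.eigh(cov)
        # Step 3: place the coefficients at the positions of this feature subset.
        R[np.ix_(cols, cols)] = vecs
    return R

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 6))
R = rotation_matrix(X, n_subsets=3, rng=rng)
X_rotated = X @ R   # training features for one base classifier
```

Since all principal components are kept, each block is orthogonal and so is R: the rotation changes the axes seen by the base classifier without discarding information. Repeating the call T times with fresh random splits gives the T differently rotated training sets.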

2.8 Logitboosting algorithm

Boosting is a gradual additive method that works by sequentially applying a classification algorithm to reweighted versions of the training data and then taking a weighted majority vote of the sequence of classifiers thus produced (Friedman, 2001). However, boosting algorithms are prone to overfitting when dealing with noisy data. Therefore, Friedman (2001) proposed an advanced variant of the AB algorithm, namely LB, a new algorithm that handles random label noise better. LB can greatly reduce training errors and hence results in better generalization. The potential function used by the LB algorithm is convex; therefore, it allows better computational efficiency and results in dramatic improvements in performance.

The details of the LB algorithm are presented below:

Let E be an ensemble of learners and Z the number of boost samples.

1. Start with equal weights ${\omega }_{i}$ = $\frac{1}{N}$, i = 1, …, N, the function F(x) = 0 and the probability estimates p(${x}_{i}$) = 0.5.
2. Repeat for m = 1 to Z
  i. Compute the working responses and weights
  $$ z_{i} = \frac{{y_{i} - p\left( {x_{i} } \right)}}{{p\left( {x_{i} } \right)\left( {1 - p\left( {x_{i} } \right)} \right)}}\,{\text{and}}\,\omega_{i} = p\left( {x_{i} } \right)\left( {1 - p\left( {x_{i} } \right)} \right) $$
  (4)
  ii. Fit the decision function $f_{m}$(x) by a weighted least-squares regression of ${z}_{i}$ on ${x}_{i}$ using weights ${\omega }_{i}$.
  iii. Update both F(x) and p(x)
  $$ F\left( x \right) = F\left( x \right) + 0.5f_{m} \left( x \right)\,{\text{and}}\,p\left( x \right) = \frac{{e^{F\left( x \right)} }}{{e^{F\left( x \right)} + e^{ - F\left( x \right)} }} = (1 + e^{ - 2F\left( x \right)} )^{ - 1} $$
  (5)
3. Output classifier = the most often predicted class of E.
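A compact sketch of these boosting steps for a single feature and binary labels is given below. It rests on several assumptions that are not taken from the study: regression stumps serve as the weighted least-squares base learner, the weights use the standard LogitBoost form p(1 - p), the working responses are clipped to [-4, 4] for numerical stability, and the names `fit_stump` and `logitboost` are hypothetical.

```python
import numpy as np

def fit_stump(x, z, w):
    """Weighted least-squares regression stump on one feature (needs >= 2 distinct x)."""
    best = None
    for thr in np.unique(x)[:-1]:            # candidate splits with both sides non-empty
        left = x <= thr
        cl = np.average(z[left], weights=w[left])
        cr = np.average(z[~left], weights=w[~left])
        err = np.sum(w * (z - np.where(left, cl, cr)) ** 2)
        if best is None or err < best[0]:
            best = (err, thr, cl, cr)
    _, thr, cl, cr = best
    return lambda q: np.where(q <= thr, cl, cr)

def logitboost(x, y, rounds=20):
    """LogitBoost for labels y in {0, 1}; returns a function giving p(y = 1 | x)."""
    F = np.zeros(len(x))
    p = np.full(len(x), 0.5)                 # initial probability estimates
    stumps = []
    for _ in range(rounds):
        w = np.clip(p * (1 - p), 1e-10, None)   # weights (assumed form p(1 - p))
        z = np.clip((y - p) / w, -4, 4)         # working responses, Eq. (4), clipped
        f = fit_stump(x, z, w)
        stumps.append(f)
        F += 0.5 * f(x)                          # Eq. (5): update F(x)
        p = 1.0 / (1.0 + np.exp(-2.0 * F))       # Eq. (5): update p(x)
    return lambda q: 1.0 / (1.0 + np.exp(-2.0 * sum(0.5 * f(q) for f in stumps)))

x = np.array([0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])
predict = logitboost(x, y)
probs = predict(x)   # estimated probability of class 1 for each training point
```

On this separable toy set every round selects the split between 0.4 and 0.6, so the estimated probabilities move steadily towards 0 for the first four points and towards 1 for the last four.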

3 Empirical study: a case study of agricultural SMEs in Africa

3.1 Context and variables

Financing the integration of I4.0 investments for agricultural SMEs in Africa is not well developed. Many institutional and policy-related factors increase credit risk uncertainty, thus influencing the attraction of private investment or even the use of public funds to develop agriculture 4.0 capabilities in Africa (Wang et al., 2020a, 2020b). In view of that, exploring a large set of requests for funding agriculture 4.0 initiatives in the African exchange market should involve many indicators. The dependent variable is the probability of a risky agriculture 4.0 initiative, taken directly from the FSPs; it is assigned the value of 0 or 1, indicating whether the initiative is likely to be risky or not. Furthermore, we select from the literature 22 independent variables related to the different factors presented in Sect. 2.3. Table 1 defines the initial set of independent variables and their factors.

Table 1 Original set of independent variables

3.2 Data collection procedure

Since SCF is considered a novel solution to overcome financial issues and is not yet widely applied in Africa, only a few African SMEs cooperate with LEs and FSPs in making use of SCF for agriculture 4.0 investments. This makes it practically complicated to obtain a complete SCF dataset for investment in agriculture 4.0, especially for SMEs (Belhadi et al., 2019). Therefore, a survey-based study seems to be the best alternative for obtaining the initial dataset to evaluate our proposed EML approach (Belhadi et al., 2020). The selection criteria for the initial sample include three factors. First, we targeted SMEs listed on the Small and Medium Enterprise Board of the African Securities Exchanges Association (ASEA) with International Standard Industrial Classification (ISIC) codes between 01–03 and 10–11, which relate to agricultural activities. Second, we ensured that these SMEs had engaged in SCF for an agriculture 4.0 investment by reviewing their official websites and newspapers. Third, the LEs were selected to be direct partners (buyers or suppliers) of the selected SMEs that have some supremacy in leading the operations of the SC, enjoy strong creditworthiness and have high financial power. Thus, when SMEs have financing requirements, SCF can meet them with the involvement of the LEs (Wuttke et al., 2019).

Based on the above sample selection criteria, we select 618 listed agricultural SMEs, 469 LEs from their direct SC circles alongside 551 FSPs operating in the agricultural sector in Africa. The companies were then contacted via an initial email, attaching a website link to an electronic copy of the survey with a request to participate in the project. Given the multiple reminders and phone follow-ups for the targeted companies, we have obtained a relatively high response rate. Indeed, of the initial sample, 216 questionnaires were returned from agricultural SMEs with a response rate of 34.95%, 195 questionnaires from the LEs with a response rate of 41.57% and 104 questionnaires from the FSPs with a response rate of 18.87%. Data were collected between January and June 2020. The online survey design was such that responses to all the questions were compulsory, without which the respondent was not permitted to make the final submission of the questionnaire. This compulsion eliminated missing values and incomplete responses, making all the questionnaires usable for analysis. However, to account for non-response bias between early and late submissions, we followed a two-step procedure. In the first step, performance bias was tested by pair-wise comparison of the demographic characteristics of the samples using paired sample t-tests. We did not observe a significant difference. In the second step, we compared two categories: the first 30% (equivalent to early respondents) and the last 30% (equivalent to late respondents) using ANOVA analysis on all variables. We found no significant difference between early and late respondents, the latter being taken as a proxy for non-respondents (i.e., p > 0.2). Both tests yield no significant difference between samples. Therefore, we conclude that non-response bias was not a potential concern in this study. It is noteworthy that the respondents of the questionnaire were enterprises’ senior leaders with titles such as financial manager, sales manager, etc.
Collected data have been triangulated with official announcements of the sample companies as well as related articles released in newspapers and business magazines.
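The response rates quoted above follow directly from the contacted and returned counts. A minimal sketch of that arithmetic is given here; the dictionary layout and function name are illustrative assumptions, not part of the study. Note that rounding to two decimals gives 41.58% for the LEs, whereas the reported 41.57% appears to be truncated.

```python
# Sample sizes reported in the text: (contacted, returned) per group.
samples = {
    "SMEs": (618, 216),
    "LEs": (469, 195),
    "FSPs": (551, 104),
}


def response_rate(contacted: int, returned: int) -> float:
    """Return the response rate as a percentage."""
    return 100.0 * returned / contacted


rates = {group: response_rate(*counts) for group, counts in samples.items()}
for group, rate in rates.items():
    print(f"{group}: {rate:.2f}%")
```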

3.3 Performance measurement

The performance of the RotF-LB model is evaluated based on interpretable performance measures (Fayyaz et al., 2020). For comparison purposes, two individual classification models, i.e., RotF (Ren et al., 2017) and LB (Friedman, 2001), alongside two hybrid classification models, i.e., AB-RotF (Rodríguez et al., 2006) and RF-LB (Kotsiantis, 2013), are also evaluated in the experiments. Five evaluation measures are used to evaluate and compare the classification performance: accuracy, precision, recall, F-measure, and the area under the curve (AUC). Table 2 provides calculation details.

Table 2 Evaluation measures


A good forecasting approach should have high classification accuracy, precision, recall, F-measure and AUC. A high classification accuracy means that the model can correctly identify positive samples and negative samples, thereby helping FSPs to separate credit risky agriculture 4.0 investments from non-credit risky agriculture 4.0 investments (Bekhet & Eletter, 2014). Additionally, higher classification precision implies that FSPs can correctly identify non-credit risky agriculture 4.0 investments among all the samples predicted as positive (Zhu et al., 2019). The recall value quantifies the ability to pinpoint non-credit risky agriculture 4.0 investments from all the samples that should have been labeled as non-risky (Liu & Huang, 2020). Moreover, the F-measure is the harmonic mean of the precision rate (i.e., positive predictive value) and the recall rate (i.e., sensitivity). Accordingly, the higher the F-measure is, the better the forecasting performance of the approach. Finally, the AUC measure quantifies the approach's ability to avoid false classification (Zhu et al., 2019).
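Since Table 2 is not reproduced here, the following sketch uses the standard textbook definitions of the five measures, which is an assumption about what the table contains. The positive class (label 1) stands for a non-credit risky investment, as in the discussion above; the function names and the toy labels are hypothetical.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F-measure for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F-measure: harmonic mean of precision and recall.
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


def auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: predicted probabilities thresholded at 0.5.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.7, 0.1]
y_pred = [int(s >= 0.5) for s in scores]
metrics = classification_metrics(y_true, y_pred)
area = auc(y_true, scores)
```

On this toy example one positive and one negative sample are misclassified, so the four threshold-based measures coincide at 0.75, while the AUC, which depends only on the ranking of the scores, is higher.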

3.4 Computational experiments

We conduct two sets of computational experiments. The first experiment set aims to select the most important (optimistic) independent variables $ {\rm A}^{*} = \left\{ {\alpha_{1}^{*} ,\alpha_{2}^{*} ,\alpha_{3}^{*} , \ldots ,\alpha_{m}^{*} } \right\}$ from the original set of variables $ {\rm A} = \left\{ {\alpha_{1} ,\alpha_{2} ,\alpha_{3} , \ldots ,\alpha_{n} } \right\}$. Specifically, we conduct a GT on the original set of variables to select the best set of independent variables. Besides its positive impact on the model accuracy and computational time (Sharafati et al., 2020), selecting appropriate independent variables can inform us and FSPs about which factors are prominent in forecasting credit risk related to agriculture 4.0 investment in SCF (Gouvêa & Gonçalves, 2007). Additionally, agricultural SMEs’ managers could be informed about which factors to focus on to enhance their financing ability and creditworthiness when seeking SCF for their agriculture 4.0 investments (Zhang et al., 2015). Further, in the second experiment set, we tune the main parameters of the models used. To this end, we grid-search approximately 1000 parameter settings for each classifier and use a 10-fold cross-validation method to improve the forecasting performance and lessen the effect of the training set's variability.
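The tuning procedure, a grid search scored by 10-fold cross-validation, can be sketched in plain Python as follows. The parameter grids and classifiers actually used are not listed in the text, so the threshold rule below is only a hypothetical stand-in classifier and the two-parameter grid is illustrative.

```python
import itertools
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_val_accuracy(fit, predict, X, y, params, folds):
    """Mean held-out accuracy of one parameter setting across all folds."""
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        model = fit([X[i] for i in train], [y[i] for i in train], **params)
        correct = sum(predict(model, X[i]) == y[i] for i in held_out)
        scores.append(correct / len(held_out))
    return sum(scores) / len(scores)


def grid_search(fit, predict, X, y, grid, k=10):
    """Evaluate every combination in `grid`; return the best (params, score)."""
    folds = k_fold_indices(len(X), k)
    names = list(grid)
    best = None
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = cross_val_accuracy(fit, predict, X, y, params, folds)
        if best is None or score > best[1]:
            best = (params, score)
    return best


# Hypothetical stand-in classifier: predict 1 when one feature exceeds a threshold.
def fit_threshold(X_train, y_train, column, threshold):
    return (column, threshold)  # nothing is learned; the parameters define the rule


def predict_threshold(model, x):
    column, threshold = model
    return int(x[column] > threshold)


# Synthetic data in which the label depends only on the first feature.
X = [(i / 10, (i * 7 % 13) / 10) for i in range(50)]
y = [int(x[0] > 2.0) for x in X]
grid = {"column": [0, 1], "threshold": [1.0, 2.0, 3.0]}
best_params, best_score = grid_search(fit_threshold, predict_threshold, X, y, grid)
```

Averaging the score over the ten held-out folds is what reduces the dependence of the selected setting on any single train/test split.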

The experimental study is conducted on a PC with a 2.11 GHz Intel i7-8650U CPU and 16 GB of RAM running the Windows 10 operating system. The data mining toolkit used is Anaconda (https://www.anaconda.com/), a conditionally free and open-source distribution of the Python programming language for data science and machine learning applications. The complete set of experiments took about four hours to run. In the following section, we provide a discussion on the experimental procedures and some important experimental results.

4 Results and discussion

4.1 Discussion on feature selection

According to Ransom et al. (2017), the optimistic set of independent variables A* often presents more forecasting power than the set A of the original independent variables. Using nonlinear modeling, GT is applied on the original set A in order to reduce the features and obtain the most powerful optimistic variables for forecasting the credit risk associated with investments in agriculture 4.0 of SMEs. Table 3 presents the ϑ-ratio values of the optimistic set A*, which includes 11 variables.

Table 3 Set of optimistic independent variables according to GT


The GT findings indicate that the three variables with the highest relative importance (ϑ-ratio) are the return on investment and the abnormal return of the agriculture 4.0 investment and the credit rating of the LE. Surprisingly, these variables are found to be more important than the credit conditions variables of the agricultural SME. This highlights the role of SCF-related variables in helping agricultural SMEs get financing access to support their agriculture 4.0 investment. Further, the findings show that the profitability of the investment itself is the main factor in assessing SMEs’ credit risk in SCF for agriculture 4.0 investments. Also, several factors related to the credit conditions of the LE and the agricultural SME are important, such as the credit rating of the LE and the current ratio of the agricultural SME (Zhang et al., 2015). Importantly, the intensity of the contract between the agricultural SME and the LE is found to be critical in assessing the credit risk and promoting SMEs' access to SCF to support their agriculture 4.0 investment. By contrast, no variable belonging to the status of agriculture development in the country is a determinant in assessing credit risk in SCF (Wuttke et al., 2019).

4.2 Discussion on model forecasting performance

The performance of the five classification models, i.e., RotF, LB, AB-RotF, RF-LB, and RotF-LB, is evaluated using the measures presented in Table 2. Two rounds of evaluation have been done, i.e., using the original set of independent variables A and using the optimistic set of independent variables A*. The results of these two rounds of evaluation are reported in Table 4. For more clarity during the discussion, the comparative results are presented in Figs. 3 and 4.

Table 4 Performance evaluation of RotF-LB and other classification models using the optimistic set A* and original set A of independent variables


The performance evaluation results provide prima facie evidence that the optimistic set of independent variables A* enhances the performance of all five classification models. Furthermore, the comparison depicted in Fig. 3 indicates that the proposed hybrid EML approach, i.e., RotF-LB, shows the best forecasting performance using both the original set A and the optimistic set A* of independent variables. Notably, using the simple classifier RotF, alone or combined with AB, contributes to enhancing precision. This is critical for FSPs in SCF to guarantee that the investments identified as non-credit risky are actually non-risky. However, RotF, alone or combined with AB, does not ensure high accuracy. This could be misleading for FSPs, since they might miss several non-risky, and even profitable, agriculture 4.0 investments due to the inability to distinguish risky from non-risky agriculture 4.0 investments. On the other hand, using LB individually or to boost another classifier enhances the accuracy and recall of the forecasting performance. This is of utmost importance for FSPs to draw a clear distinction between risky and non-risky agriculture 4.0 investments. However, LB shows weak precision. Overall, neither RotF nor LB is sufficiently robust by itself, since neither achieves a high F-measure, which combines precision and recall.

Further, the complementarity between RotF and LB alleviates the individual weaknesses of each approach and makes their combination the strongest classification approach on all performance measures among the five forecasting approaches (see Fig. 3). This combination gets stronger with the optimistic set of independent variables (see Fig. 4).

4.3 Discussion on the accumulated local effect (ALE) plots

Identifying the set of the most important (optimistic) independent variables A* influencing the credit risk of agriculture 4.0 investments in agricultural SMEs using the output of GT paves the way for FSPs to disclose the most important factors and variables to manage and evaluate agriculture 4.0 investments of SMEs through SCF. Besides, we have confirmed the robustness and the high relative forecasting power of our proposed hybrid RotF-LB EML approach compared to other approaches. Further in this section, we examine the optimistic set of independent variables A* by exploring the individual effect of these variables on the credit risk of agriculture 4.0 investments. This information is of utmost importance for decision-makers in FSPs to manage credit risk, for agricultural SMEs to enhance creditworthiness and for LEs to reduce the credit risk of joint liability. Given the weaknesses of alternative techniques such as the partial dependence plot and the individual conditional expectation, the Accumulated Local Effect (ALE) technique (Apley, 2016) is adopted as a more reliable one-dimensional plot for a highly correlated space of variables. The aim of using ALE is to visualize the marginal effects of each variable across its distribution. To produce the ALE plots, we have used the “alibi” library in Python. Figures 5, 6 and 7 illustrate the ALEs visualizing the impact of the optimistic set of variables on the probability of non-risky agriculture 4.0 investments. Dashed horizontal lines stand for ALE = 0 for each variable, indicating values that do not influence predictions significantly. The distribution of each variable is illustrated visually on the x-axis, where ticks represent individual data points.
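To make the idea behind an ALE curve concrete, here is a plain-Python sketch of a first-order ALE for one variable. It is not the API of the “alibi” library used in the study; the function name, the quantile binning and the centering rule are simplifying assumptions, and the linear toy model is hypothetical.

```python
def ale_1d(predict, X, feature, n_bins=5):
    """First-order ALE of one feature for a model given as predict(row) -> float.

    Assumes the feature takes at least two distinct values in X.
    """
    values = sorted(row[feature] for row in X)
    n = len(values)
    # Bin edges at empirical quantiles (duplicates removed).
    edges = sorted({values[(k * (n - 1)) // n_bins] for k in range(n_bins + 1)})
    effects, counts = [], []
    for lo, hi in zip(edges[:-1], edges[1:]):
        # Bins are (lo, hi]; the first bin also includes the minimum value.
        in_bin = [row for row in X
                  if lo < row[feature] <= hi
                  or (lo == edges[0] and row[feature] == lo)]
        diffs = []
        for row in in_bin:
            upper, lower = list(row), list(row)
            upper[feature], lower[feature] = hi, lo
            # Local effect: prediction change when only this feature moves
            # from the lower to the upper edge of its own bin.
            diffs.append(predict(upper) - predict(lower))
        effects.append(sum(diffs) / len(diffs) if diffs else 0.0)
        counts.append(len(in_bin))
    # Accumulate the average local effects across bins.
    ale = [0.0]
    for effect in effects:
        ale.append(ale[-1] + effect)
    # Center so that the count-weighted mean effect over the bins is zero.
    mean = sum(c * (a + b) / 2 for c, a, b in zip(counts, ale, ale[1:])) / sum(counts)
    return edges, [a - mean for a in ale]


def toy_model(row):
    """Hypothetical linear scoring function used only for illustration."""
    return 2.0 * row[0] + row[1]


X = [[float(i), float((i * 3) % 7)] for i in range(11)]
edges, ale = ale_1d(toy_model, X, feature=0)
```

Because each sample is only moved within its own bin, the differences are computed on realistic combinations of variable values, which is why ALE is preferred over partial dependence when variables are strongly correlated. After centering, positive values correspond to the ALE > 0 regions discussed in the following subsections.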

4.4 The Impact of quality and credit conditions of agricultural SMEs

The variables related to agricultural SMEs' quality and credit conditions represent the classical measures to assess credit risk on an investment without involving SCF. Figure 5 depicts the ALE plots indicating the impact of variables related to agricultural SMEs' quality and credit conditions on the probability of non-risky agriculture 4.0 investments. According to Fig. 5a, the probability of non-risky agriculture 4.0 investments increases significantly with higher values of the current ratio of agricultural SMEs until it reaches approximately 3.8. Then, the probability of non-risky agriculture 4.0 investments decreases with higher values of the current ratio of the agricultural SME. Interestingly, the agriculture 4.0 investments become non-risky (ALE > 0) once the current ratio of agricultural SMEs exceeds 2.0. However, the agriculture 4.0 investments become risky again (ALE < 0) when the current ratio of the agricultural SME is above 3.8. This pattern is fully consistent with the financial significance and characteristics of this variable. Financial literature indicates clearly that a minimum two-to-one ratio between current assets and current liabilities is required to ensure a high ability to pay short-term debt (Custódio et al., 2013). Therefore, we observed a high probability of non-risky investment at this point. On the other hand, higher values of the current ratio might indicate an excess of liquidity or inventory within the agricultural SME, which signals poor efficiency in managing liquidity (Bhunia et al., 2011). Hence, financing investments in this kind of SME is risky. We conclude that agricultural SMEs should maintain their current ratio between 2.0 and 3.8 to show high creditworthiness in the eyes of FSPs. Moreover, Fig. 5b shows that the probability of non-risky agriculture 4.0 investment decreases with higher financial leverage of the agricultural SME. Agriculture 4.0 investments made by SMEs with a financial leverage exceeding 0.18 are qualified as risky.
Indeed, a high level of long-term debt means that the SME is accumulating external financing rather than mobilizing internal capital to finance its investments in the long term. According to the pecking order theory (Frank & Goyal, 2003), long-term profitability decreases with a high level of long-term debt accumulation. Therefore, agricultural SMEs should keep their financial leverage below 0.18. Furthermore, according to Fig. 5c, the profit margin on sales of the agricultural SME is, in the general trend, positively correlated with the probability of non-risky agriculture 4.0 investment. Naturally, higher profitability of the agricultural SME signifies a high financial ability, which may mitigate credit risks and enhance trust between FSPs and the SME. Therefore, agricultural SMEs should optimize their operations and reduce their production costs prior to engaging in agriculture 4.0 investment. Finally, Fig. 5d indicates an interesting trend. Naturally, a high growth rate of the agricultural SME should involve a faster expansion of the asset management scale and thereby a low risk in financing long-term investments such as agriculture 4.0. Nonetheless, we notice that from a certain value, the probability of non-risky agriculture 4.0 investments decreases. This could be explained by the fact that smaller companies mostly rush to expand their assets in a short period through long-term debt, which generates financial and operational risks (Chen et al., 2020). Hence, agricultural SMEs are recommended to opt for internal capital to expand their assets, even over a longer period.
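The two numeric thresholds stated above for the agricultural SME can be written as a simple screening rule. This is only an illustrative reading of the single-variable ALE plots, not the RotF-LB model; the helper name is hypothetical, and treating the boundary values as acceptable is an assumption, since the thresholds are reported as approximate.

```python
# Approximate thresholds read off the ALE discussion of Fig. 5a and Fig. 5b.
CURRENT_RATIO_RANGE = (2.0, 3.8)
MAX_FINANCIAL_LEVERAGE = 0.18


def sme_credit_flags(current_ratio: float, financial_leverage: float) -> dict:
    """Flag whether each SME indicator lies in the region described as non-risky."""
    low, high = CURRENT_RATIO_RANGE
    return {
        "current_ratio_ok": low <= current_ratio <= high,
        "leverage_ok": financial_leverage <= MAX_FINANCIAL_LEVERAGE,
    }


healthy = sme_credit_flags(current_ratio=2.5, financial_leverage=0.10)
stretched = sme_credit_flags(current_ratio=4.2, financial_leverage=0.25)
```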

4.5 The impact of the credit conditions of the LE and the status of the cooperative relationship

The variables associated with the credit conditions of the LE alongside the status of the cooperative relationship between the LE and the agricultural SME constitute the cutting-edge features brought by SCF to investments’ financing. Figure 6 depicts the ALE plots indicating the impact of variables related to the credit conditions of the LE and the status of the cooperative relationship on the probability of non-risky agriculture 4.0 investments. According to Fig. 6a, the LE rating is positively correlated with the probability of non-risky investment. Generally, from the credit rating level 'fair' of the LE, agriculture 4.0 investments are seen as non-risky (ALE > 0). This finding suggests that agricultural SMEs should work with LEs having at least a 'fair' level of credit rating to improve their creditworthiness in the SCF context. Also, the effect of the financial leverage of the LE depicted in Fig. 6b is similar to that of the financial leverage of the agricultural SME (Fig. 5b). However, the threshold value of the leverage for the LE is found to be approximately 0.28, which is greater than the acceptable value for the agricultural SME. This can be explained by the agency theory (Shapiro, 2005), which suggests that larger firms tend to be more mature and diversified and can absorb financial debt in the long run better than smaller companies. Another interesting feature indicated in Fig. 6c is that the probability of non-risky agriculture 4.0 investment increases significantly with a higher profit margin of the LE. However, once a certain level has been reached, the probability of non-risky investment levels off regardless of any further increase of the profit margin on sales of the LE. Using this insight, agricultural SMEs willing to invest in agriculture 4.0 could effectively select an LE with the minimum required level of profit margin on sales when applying for SCF. In addition, FSPs could use this finding to support their financing decision-making.
Concerning the intensity of the contract between the agricultural SME and the LE, weak relationships (< 5 years) are found to reduce the probability of non-risky agriculture 4.0 investments. Therefore, agricultural SMEs need to build long-term relationships with LEs in their SCs.

4.6 The impact of the profitability of the agriculture 4.0 investment

The overall picture of credit-risk assessment could not be completed without exploring the impact of the variables associated with the profitability of the agriculture 4.0 investment itself (Fig. 7). First, Fig. 7a depicts the impact of the return on investment of agriculture 4.0. A higher value of the investment return indicates high profitability of the investment and a high probability that it is non-risky. We notice that the minimum acceptable rate of return is approximately 40%. Above this value, the agriculture 4.0 investment becomes non-risky. This extends several studies in the literature (e.g., Horváth & Szabó, 2019; Moeuf et al., 2018). This finding implies that SMEs should engage in large-scale I4.0 programs whose return on investment exceeds 40%. FSPs should request strong evidence for agriculture 4.0 projects presenting a return on investment below 40% and weigh up whether these investments are worth the risks associated with them. Second, the abnormal return of agriculture 4.0 investments depicted in Fig. 7b presents a linear effect on the probability of non-risky investment. Indeed, to be acceptable, the investment should at least achieve what has been expected from it (abnormal return = 0). Finally, it is noticeable from Fig. 7c that the profitability index follows a similar trend. In fact, around the value of 1.0 of the profitability index, the agriculture 4.0 investment switches from the risky zone to the non-risky zone. Thus, the motivation for financing agriculture 4.0 investments of SMEs might be questioned if the capital investment cost is too high or the present value of the cash collected from the investment is too low. Therefore, to minimize the financing credit risk, FSPs' decision-makers need to pay more attention to comparing the capital invested with the present value of the cash collected from the agriculture 4.0 investment.
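The comparison between capital invested and the present value of cash collected can be illustrated with the standard definition of the profitability index, i.e., present value of future cash flows divided by capital invested. The text does not state how the index was computed, so this definition, the end-of-year discounting convention, the function names and the figures below are all assumptions for illustration.

```python
def present_value(cash_flows, discount_rate):
    """Discount end-of-year cash flows (year 1, 2, ...) back to today."""
    return sum(cf / (1 + discount_rate) ** year
               for year, cf in enumerate(cash_flows, start=1))


def profitability_index(cash_flows, discount_rate, capital_invested):
    """Present value of the cash collected per unit of capital invested."""
    return present_value(cash_flows, discount_rate) / capital_invested


# Hypothetical project: three yearly inflows of 500 discounted at 10%.
inflows = [500.0, 500.0, 500.0]
pi_cheap = profitability_index(inflows, 0.10, capital_invested=1000.0)
pi_costly = profitability_index(inflows, 0.10, capital_invested=1500.0)
```

With the same inflows, the index is above 1.0 for the cheaper project and below 1.0 for the costlier one, which mirrors the switch between the non-risky and risky zones described for Fig. 7c.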

5 Managerial implications

The extant literature on I4.0 related investments lacks the mechanisms to highlight the factors affecting the assessment of credit risk of SCF to promote the financing, especially for smaller companies to implement such technologies (Zhu et al., 2019). Accordingly, the present study could be a solution for SMEs and startups who mostly face tremendous challenges to get access to funding to finance their I4.0 implementation initiatives. Through the practical example of agriculture 4.0 investments for African SMEs in SCF, our study provides numerous implications for SMEs, FSPs alongside LEs to assess accurately and mitigate the credit risk associated with I4.0 investments of SMEs.

First, the findings of our study extend the traditional lending system mainly based on the credit conditions of the SME to include other crucial SCF factors such as the credit conditions of the LE and the status of the cooperative relationship between the LE and the agricultural SME, alongside the profitability of the agriculture 4.0 investment itself. Moreover, SMEs, FSPs, and LEs are provided with the optimistic set including the most important variables to focus on to assess accurately and alleviate credit risks associated with agriculture 4.0 investments, namely current ratio, financial leverage, profit margin on sales and growth rate of the agricultural SME. Furthermore, our model is well suited to small datasets. RotF has been found to be more robust and stable than extreme ML, SVM, and neural networks, especially with small training sets (Abedin et al., 2019; Han et al., 2018). Thus, our model would be of great usefulness for SMEs, which typically deal with small datasets.

Second, our study clearly highlights the conditions under which the SMEs would be seen as creditworthy in the eyes of their LEs and FSPs. Accordingly, agricultural SMEs should keep their current ratio between 2.0 and 3.8, with financial leverage not exceeding 0.18. Agricultural SMEs are also recommended to increase their profitability and growth through internal capital and minimize long-term debt usage. These insights could redirect the effort of agricultural SMEs in Africa to improve their credit conditions and get access to funding to support their agriculture 4.0 transformation.

Third, the study's findings imply that agricultural SMEs should select strong, creditworthy LEs for their SCF collaboration (Wuttke et al., 2019). This includes LEs with high credit ratings, low financial leverage, and strong profitability in the long term. In addition, agricultural SMEs are advised to establish long-term cooperation with their LEs in order to improve their credit conditions in the SCF context and finance their agriculture 4.0 initiatives (Wuttke et al., 2019). Thus, our findings provide a useful guideline for agricultural SMEs that struggle to invest in I4.0 to support their agricultural operations.

FSPs should be aware of the usefulness and high performance of the proposed EML approach in accurately evaluating the credit risk associated with agriculture 4.0 investments of agricultural SMEs in SCF, which results in improving SC cash flow, reducing potential risks in the entire chain, and making correct loan decisions. Even a slight improvement in forecasting accuracy, such as the one obtained with our model, can generate future savings. This would also enable them to change the way they operate, push them to consider more developed tools and encourage them to build a supportive culture for innovation. The study also implicitly supports the trend of refining popular algorithms to be more efficient and suitable for different cases and of transforming EML into simpler and more comprehensive models while preserving the predictive accuracy of the methods they were derived from. We believe that EML approaches are bound to move financial services toward efficiency, accuracy, and dynamism, not only in loan decision making but also in other applications.

Finally, in the special context of the coronavirus pandemic, economic sectors all over the world were deeply affected, including the agricultural sector. Many agricultural SMEs are currently faced with issues such as high demand uncertainty and changing customer patterns, since many customers and local markets were closed. Therefore, agricultural SMEs risk being penalized by excess stocks and supply disruptions. As a result, FSPs are more hesitant and uncertain about financially supporting investments in agriculture 4.0 projects. Consequently, it is more challenging to make correct credit loan decisions in SCF for investment, especially in agriculture 4.0. Thus, our findings take on particular significance in this special context, as they help mitigate credit risk and credit risk propagation.

Recent evidence shows that the COVID-19 crisis has made digitalization more crucial than ever. Therefore, more I4.0 projects are expected to emerge and are likely to persist in the long term. Hence, our findings also prepare agricultural SMEs, LEs and FSPs to confidently face the upcoming challenges that will accompany these new projects, mainly credit risk forecasting. This research would enable these parties to engage in digitalization projects and create a culture of learning, innovation, resilience and trust.

6 Conclusion, limitations and future research

The present study addresses forecasting credit risk associated with agriculture 4.0 investments of SMEs in the SCF context. In doing so, an initial set of 22 independent variables affecting the credit risk of agriculture 4.0 investments has been established based on five prominent factors, i.e., status of agriculture development in the country, quality and credit conditions of the agricultural SME, credit conditions of the LE, status of the cooperative relationship and the profitability of the agriculture 4.0 investment. This list has been prioritized using the GT algorithm to establish an optimistic set of 11 variables. These variables have been used as an input for a novel hybrid EML approach based on RotF and LB algorithms. Using a dataset collected from 216 agricultural SMEs, 195 LEs and 104 FSPs operating in Africa's agricultural market, the performance of the proposed RotF-LB ensemble approach has been tested. The results showed a good performance of the proposed model and suggested meaningful implications to assist agricultural SMEs, LEs and FSPs in promoting the wide transformation towards agriculture 4.0.

In this study, we recognize certain limitations that open avenues for future research on forecasting credit risk associated with SMEs' I4.0 investments in SCF. First, sample selection and data collection could generate some biases that might alter our findings. Hence, we recommend that future research conduct further experiments to confirm our proposed approach using large datasets from different contexts, particularly with primary data on SMEs in SCF. Second, although the comparison of the forecasting performance of the RotF-LB approach with some individual and EML approaches is beneficial, it may not be adequate. Future studies are encouraged to conduct further comparisons with advanced EML techniques to confirm its performance. In addition, the use of LB can lessen the risk of the curse of dimensionality; however, it cannot eliminate it entirely. Thus, non-linear approaches such as autoencoders and representation learning need to be investigated, with the aim of reducing dimensionality and optimizing accuracy.

References

Abedin, M. Z., Chi, G., Uddin, M. M., Satu, M. S., Khan, M. I., & Hajek, P. (2020). Tax default prediction using feature transformation-based machine learning. IEEE Access, 9, 19864–19881.
Abedin, M. Z., Guotai, C., Moula, F. E., Azad, A. S., & Khan, M. S. U. (2019). Topological applications of multilayer perceptrons and support vector machines in financial decision support systems. International Journal of Finance & Economics, 24(1), 474–507.
Apley, D. W. (2016). Visualizing the effects of predictor variables in black box supervised learning models. arXiv preprint. http://arxiv.org/abs/1612.08468
Bekhet, H. A., & Eletter, S. F. K. (2014). Credit risk assessment model for jordanian commercial banks: Neural scoring approach. Review of Development Finance, 4(1), 20–28.
Belhadi, A., Mani, V., Kamble, S. S., Khan, S. A. R., & Verma, S. (2021). Artificial intelligence-driven innovation for enhancing supply chain resilience and performance under the effect of supply chain dynamism: An empirical investigation. Annals of Operations Research, 1–26.
Belhadi, A., Kamble, S. S., Zkik, K., Cherrafi, A., & Touriki, F. E. (2020). The integrated effect of big data analytics, lean six sigma and green manufacturing on the environmental performance of manufacturing companies: The case of North Africa. Journal of Cleaner Production. https://doi.org/10.1016/j.jclepro.2019.119903
Belhadi, A., Zkik, K., Cherrafi, A., & Sha’ri, M. Y. (2019). Understanding big data analytics for manufacturing processes: Insights from literature review and multiple case studies. Computers and Industrial Engineering, 137, 106099. https://doi.org/10.1016/j.cie.2019.106099
Bhunia, A., Khan, I., & MuKhuti, S. (2011). A study of managing liquidity. Journal of Management Research, 3(2), 1.
Bi, W. L., Hosny, A., Schabath, M. B., Giger, M. L., Birkbak, N. J., Mehrtash, A., Allison, T., Arnaout, O., Abbosh, C., Dunn, I. F., & Mak, R. H. (2019). Artificial intelligence in cancer imaging: clinical challenges and applications. CA: A Cancer Journal for Clinicians, 69(2), 127–157.
Calabrese, R., & Osmetti, S. A. (2013). Modelling small and medium enterprise loan defaults as rare events: The generalized extreme value regression model. Journal of Applied Statistics, 40(6), 1172–1188.
Chen, H., Xu, Y., & Yang, J. (2020). Systematic risk, debt maturity, and the term structure of credit spreads. Journal of Financial Economics.
Chen, X., Wang, X., & Wu, D. D. (2010). Credit risk measurement and early warning of SMEs: An empirical study of listed SMEs in China. Decision Support Systems, 49(3), 301–310.
Chi, G., Abedin, M. Z., & Moula, F. E. (2017). Chinese small business credit scoring: Application of multiple hybrids neural network. International Journal of Database Theory and Application, 10(2), 1–22.
De Clercq, M., Vats, A., & Biel, A. (2018). Agriculture 4.0: The future of farming technology. Proceedings of the World Government Summit, Dubai, UAE, 11–13.
Comerton-Forde, C., Hendershott, T., Jones, C. M., Moulton, P. C., & Seasholes, M. S. (2010). Time variation in liquidity: The role of market-maker inventories and revenues. The Journal of Finance, 65(1), 295–331.
Corallo, A., Latino, M. E., & Menegoli, M. (2018). From Industry 4.0 to agriculture 4.0: A framework to manage product data in agri-food supply chain for voluntary traceability. International Journal of Nutrition and Food Engineering, 12(5), 146–150.
Custódio, C., Ferreira, M. A., & Laureano, L. (2013). Why are US firms using more short-term debt? Journal of Financial Economics, 108(1), 182–212.
Deakins, D., & Hussain, G. (1994). Risk assessment with asymmetric information. International Journal of Bank Marketing.
Dev, V. A., & Eden, M. R. (2019). Gradient boosted decision trees for lithology classification. In Computer Aided Chemical Engineering (Vol. 47, pp. 113–118). Elsevier.
Diamond, D. W., & Rajan, R. G. (2001, June). Banks, short-term debt and financial crises: Theory, policy implications and applications. In Carnegie-Rochester Conference Series on Public Policy (Vol. 54, No. 1, pp. 37–71). North-Holland.
Fantazzini, D., & Silvia, F. (2009). Random survival forests models for SME credit risk measurement. Methodology and Computing in Applied Probability, 11(1), 29–45.
Fayyaz, M. R., Rasouli, M. R., & Amiri, B. (2020). A data-driven and network-aware approach for credit risk prediction in supply chain finance. Emerald Publishing Limited.
Frank, M. Z., & Vidhan, K. G. (2003). Testing the pecking order theory of capital structure. Journal of Financial Economics, 67(2), 217–248.
Friedman, J. H. (2001). Greedy function approximation: a gradient boosting machine. Annals of Statistics, 1189–1232.
Gouvêa, M. A., & Eric, B. G. (2007). Credit risk analysis applying logistic regression, neural networks and genetic algorithms models. In POMS 18th annual conference.
Guotai, C., Mohammad, Z. A., & Fahmida, E. M. (2017). Modeling credit approval data with neural networks: an experimental investigation and optimization. Journal of Business Economics and Management, 18(2), 224–240.
Han, T., Jiang, D., Zhao, Q., Wang, L., & Yin, K. (2018). Comparison of random forest, artificial neural networks and support vector machine for intelligent diagnosis of rotating machinery. Transactions of the Institute of Measurement and Control, 40(8), 2681–2693.
Hofmann, E. (2005). Supply chain finance: some conceptual insights. Beiträge Zu Beschaffung Und Logistik, 203–14.
Horváth, D., & Szabó, R. Z. (2019). Driving forces and barriers of Industry 4.0: Do multinational and small and medium-sized companies have equal opportunities? Technological Forecasting and Social Change, 146, 119–132.
Huang, Z., Chen, H., Hsu, C. J., Chen, W. H., & Wu, S. (2004). Credit rating analysis with support vector machines and neural networks: A market comparative study. Decision Support Systems, 37(4), 543–558.
Ilk, N., Shang, G., & Goes, P. (2020). Improving customer routing in contact centers: An automated triage design based on text analytics. Journal of Operations Management, 66(5), 553–577.
Jackson, R. H. G., & Anthony, W. (2013). The performance of insolvency prediction and credit risk models in the UK: A comparative study. The British Accounting Review, 45(3), 183–202.
Kamble, S. S., Angappa, G., & Rohit, S. (2020). Modeling the blockchain enabled traceability in agriculture supply chain. International Journal of Information Management, 52, 101967.
Kamble, S. S., Angappa, G., Vikas, K., Amine, B., & Cyril, F. (2021). A machine learning based approach for predicting blockchain adoption in supply chain. Technological Forecasting and Social Change, 163, 120465.
Keskin, B. B., Bott, G. J., & Freeman, N. K. (2021). Cracking sex trafficking: Data analysis, pattern recognition, and path prediction. Production and Operations Management, 30(4), 1110–1135.
Kotsiantis, S. (2013). Rotation forest with logitboost. International Journal of Innovative Computing, Information and Control, 9(3), 1087–1094.
Kovács, I., & Husti, I. (2018). The role of digitalization in the agricultural 4.0–how to connect the industry 4.0 to agriculture? Hungarian Agricultural Engineering, 33, 38–42.
Lam, H. K. S., Yuanzhu, Z., Minhao, Z., Yichuan, W., & Andrew, L. (2019). The effect of supply chain finance initiatives on the market value of service providers. International Journal of Production Economics, 216, 227–238.
Li, D., Chen, S., Chiong, R., Wang, L., & Dhakal, S. (2020). Predicting the printed circuit board cycle time of surface-mount-technology production lines using a symbiotic organism search-based support vector regression ensemble. International Journal of Production Research, 1–20.
Li, D. C., Che-Jung, C., Chien-Chih, C., & Wen-Chih, C. (2012). A grey-based fitting coefficient to build a hybrid forecasting model for small data sets. Applied Mathematical Modelling, 36(10), 5101–5108.
Lipper, L., Thornton, P., Campbell, B. M., Baedeker, T., Braimoh, A., Bwalya, M., Caron, P., Cattaneo, A., Garrity, D., Henry, K., & Hottle, R. (2014). Climate-smart agriculture for food security. Nature Climate Change, 4(12), 1068–1072.
Liu, Y., & Lihua, H. (2020). Supply chain finance credit risk assessment using support vector machine-based ensemble improved with noise elimination. International Journal of Distributed Sensor Networks, 16(1), 1550147720903631.
Liu, Y., Ma, X., Shu, L., Hancke, G. P., & Abu-Mahfouz, A. M. (2020). From industry 4.0 to agriculture 4.0: Current status, enabling technologies, and research challenges. IEEE Transactions on Industrial Informatics, 17(6), 4322–4334.
Lottes, P., Behley, J., Milioto, A., & Stachniss, C. (2018). Fully convolutional networks with sequential information for robust crop and weed detection in precision farming. IEEE Robotics and Automation Letters, 3(4), 2870–2877.
Moeuf, A., Pellerin, R., Lamouri, S., Tamayo-Giraldo, S., & Barbaray, R. (2018). The industrial management of SMEs in the era of Industry 4.0. International Journal of Production Research, 56(3), 1118–1136.
Noori, R., Abdulreza, K., & Mohammad, S. S. (2010). Evaluation of PCA and gamma test techniques on ANN operation for weekly solid waste prediction. Journal of Environmental Management, 91(3), 767–771.
Opitz, D., & Maclin, R. (1999). Popular ensemble methods: An empirical study. Journal of Artificial Intelligence Research, 11, 169–198.
Pandey, D., & Agrawal, M. (2014). Carbon footprint estimation in the agriculture sector. In Assessment of Carbon Footprint in Different Industrial Sectors, Volume 1 (pp. 25–47). Springer.
Prathibha, S. R., Anupama, H., & Jyothi, M. P. (2017). IoT based monitoring system in smart agriculture. In 2017 international conference on recent advances in electronics and communication technology (ICRAECT) (pp. 81–84). IEEE.
Qiao, Y., Friederike, M., Xueqing, H., Huayang, Z., & Xihe, P. (2019). The changing role of local government in organic agriculture development in wanzai county, China. Canadian Journal of Development Studies, 40(1), 64–77. https://doi.org/10.1080/02255189.2019.1520693
Ransom, K. M., Nolan, B. T., Traum, J. A., Faunt, C. C., Bell, A. M., Gronberg, J. A., Wheeler, D. C., Rosecrans, C. Z., Jurgens, B., Schwarz, G. E., & Belitz, K. (2017). A hybrid machine learning model to predict and visualize nitrate concentration throughout the central valley aquifer, California, USA. Science of the Total Environment, 601, 1160–1172.
Ray, P. P. (2017). Internet of Things for Smart Agriculture: Technologies, Practices and Future Direction. Journal of Ambient Intelligence and Smart Environments, 9(4), 395–420.
Ren, L., Lin, Z., Lihui, W., Fei, T., & Xudong, C. (2017). Cloud manufacturing: Key characteristics and applications. International Journal of Computer Integrated Manufacturing, 30(6), 501–515.
Rodríguez, J. J., Kuncheva, L. I., & Alonso, C. J. (2006). Rotation forest: A new classifier ensemble method. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(10), 1619–1630. https://doi.org/10.1109/TPAMI.2006.211
Sagi, O., & Rokach, L. (2018). Ensemble learning: A survey. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 8(4), 1–18. https://doi.org/10.1002/widm.1249
Shajalal, M., Hajek, P., & Abedin, M. Z. (2021). Product backorder prediction using deep neural network on imbalanced data. International Journal of Production Research. https://doi.org/10.1080/00207543.2021.1901153
Shapiro, S. P. (2005). Agency theory. Annual Review of Sociology, 31, 263–284.
Sharafati, A., Asadollah, S. B. H. S., & Hosseinzadeh, M. (2020). The potential of new ensemble machine learning models for effluent quality parameters prediction and related uncertainty. Elsevier.
Singh, S., & Priyanka, G. (2014). Comparative study ID3, CART and C4.5 decision tree algorithm: A survey. International Journal of Advanced Information Science and Technology (IJAIST), 27(27), 97–103.
Song, H., Yu, K., Ganguly, A., & Turson, R. (2016). Supply chain network, information sharing and SME credit quality. Emerald Group Publishing Limited.
Stiglitz, J. E., & Andrew, W. (1981). Credit rationing in markets with imperfect information. The American Economic Review, 71(3), 393–410.
Wang, J., Zhao, L., & Huchzermeier, A. (2020a). Operations-finance interface in risk management: Research evolution and opportunities. Production and Operations Management. https://doi.org/10.1111/poms.13269
Wang, Z., Qiang, W., Yin, L., & Chaojie, L. (2020b). Drivers and outcomes of supply chain finance adoption: an empirical investigation in China. International Journal of Production Economics, 220, 107453.
Webb, G. I., & Zijian, Z. (2004). Multistrategy ensemble learning: Reducing error by combining ensemble learning techniques. IEEE Transactions on Knowledge and Data Engineering, 16(8), 980–991.
Wetzel, P., & Erik, H. (2019). Supply chain finance, financial constraints and corporate performance: An explorative network analysis and future research agenda. International Journal of Production Economics, 216(July), 364–383. https://doi.org/10.1016/j.ijpe.2019.07.001
Wolfert, S., Lan, G., Cor, V., & Marc-Jeroen, B. (2017). Big data in smart farming–a review. Agricultural Systems, 153, 69–80.
Wuttke, D. A., Constantin, B., Sebastian, H. H., & Margarita, P. S. (2016). Supply chain finance: optimal introduction and adoption decisions. International Journal of Production Economics, 178, 72–81.
Wuttke, D. A., Rosenzweig, E. D., & Heese, H. S. (2019). An empirical analysis of supply chain finance adoption. Journal of Operations Management, 65(3), 242–261. https://doi.org/10.1002/joom.1023
Xingli, W., & Huchang, L. (2020). Utility-based hybrid fuzzy axiomatic design and its application in supply chain finance decision making with credit risk assessments. Computers in Industry, 114, 103144.
Xu, X., Chen, X., Jia, F., Brown, S., Gong, Y., & Xu, Y. (2018). Supply chain finance: A systematic literature review and bibliometric analysis. International Journal of Production Economics, 204, 160–173.
Yan, N., Xuyu, J., Hechen, Z., & Xun, X. (2020). Loss-averse retailers’ financial offerings to capital-constrained suppliers: Loan vs. investment. International Journal of Production Economics, 227, 107665.
Zambon, I., Cecchini, M., Egidi, G., Saporito, M. G., & Colantoni, A. (2019). Revolution 4.0: Industry vs. agriculture in a future development for SMEs. Processes, 7(1), 36.
Zhang, C., & Yunqian, M. (2012). Ensemble machine learning: Methods and applications. Springer.
Zhang, L., Haiqing, H., & Dan, Z. (2015). A credit risk assessment model based on SVM for small and medium enterprises in supply chain finance. Financial Innovation, 1(1), 14.
Zhang, L., Jia, J., Gui, G., Hao, X., Gao, W., & Wang, M. (2018). Deep learning based improved classification system for designing tomato harvesting robot. IEEE Access, 6, 67940–67950.
Zhu, Y., Chi, X., Bo, S., Gang-Jin, W., & Xin-Guo, Y. (2016). Predicting China’s SME credit risk in supply chain financing by logistic regression, artificial neural network and hybrid models. Sustainability, 8(5), 433.
Zhu, Y., Chi, X., Gang-Jin, W., & Xin-Guo, Y. (2017). Comparison of individual, ensemble and integrated ensemble machine learning methods to predict China’s SME credit risk in supply chain finance. Neural Computing and Applications, 28(1), 41–50.
Zhu, Y., Li, Z., Chi, X., Gang-Jin, W., & Truong, V. N. (2019). Forecasting SMEs’ credit risk in supply chain finance with an enhanced hybrid ensemble machine learning approach. International Journal of Production Economics, 211, 22–33.

Author information

Authors and Affiliations

Cadi Ayyad University, Marrakech, Morocco
Amine Belhadi
EDHEC Business School, Roubaix, France
Sachin S. Kamble
Montpellier Business School, Montpellier, France
Venkatesh Mani
ENSA-Safi, Cadi Ayyad University, Marrakech, Morocco
Imane Benkhati & Fatima Ezahra Touriki

Corresponding author

Correspondence to Venkatesh Mani.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix: RotF-LB algorithm

Let E be an ensemble of learners, initially empty, and Z the number of boost samples.

For j = 1 to Z do

Begin

The input variables are randomly grouped.

For each group of input variables:

Consider a dataset formed by these input variables.
Eliminate from the dataset all examples from a proper subset of the classes.
Eliminate from the dataset a subset of the examples.
Apply PCA to the remaining dataset.
Consider the components of PCA as a new set of variables.

1. Start with equal weights $\omega_{i} = \frac{1}{N}$, i = 1, …, N, function F(x) = 0 and sample probability estimates $p(x_{i}) = 0.5$.
2. Repeat for m = 1 to Z:
  a. Compute the working response and weights
  $$ z_{i} = \frac{y_{i} - p\left( x_{i} \right)}{p\left( x_{i} \right)\left( 1 - p\left( x_{i} \right) \right)} \quad \text{and} \quad \omega_{i} = \frac{p\left( x_{i} \right)}{1 - p\left( x_{i} \right)} $$
  (6)
  b. Fit the decision function $f_{m}(x)$ by a weighted least-squares regression of $z_{i}$ on $x_{i}$ using weights $\omega_{i}$.
  c. Update both F(x) and p(x)
  $$ F\left( x \right) = F\left( x \right) + 0.5 f_{m}\left( x \right) \quad \text{and} \quad p\left( x \right) = \frac{e^{F\left( x \right)}}{e^{F\left( x \right)} + e^{-F\left( x \right)}} = \left( 1 + e^{-2F\left( x \right)} \right)^{-1} $$
  (7)

Output classifier = the most often predicted class of E.

End
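
The LogitBoost inner loop of the appendix (steps 1 and 2, Eqs. 6 and 7) can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions: the rotation (PCA) step is left out; the weights use the standard LogitBoost form p(x)(1 − p(x)), which differs from the expression printed in Eq. (6); the weighted least-squares fit in step b is restricted to a straight line on the single best feature; and the weight floor and the clipping of the working response are numerical safeguards. The function names `weighted_simple_fit`, `fit_logitboost` and `predict_proba` are hypothetical.

```python
import math


def weighted_simple_fit(xs, z, w):
    """Weighted least-squares line z ~ a + b*x for one feature; returns (a, b, sse)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, xs)) / sw
    mz = sum(wi * zi for wi, zi in zip(w, z)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, xs))
    sxz = sum(wi * (xi - mx) * (zi - mz) for wi, xi, zi in zip(w, xs, z))
    b = sxz / sxx if sxx > 0 else 0.0
    a = mz - b * mx
    sse = sum(wi * (zi - a - b * xi) ** 2 for wi, xi, zi in zip(w, xs, z))
    return a, b, sse


def fit_logitboost(X, y, n_rounds=30):
    """Two-class LogitBoost; X is a list of feature rows, y holds 0/1 labels."""
    n, d = len(X), len(X[0])
    F = [0.0] * n   # step 1: F(x) = 0
    p = [0.5] * n   # step 1: p(x_i) = 0.5
    model = []      # one (feature index, intercept, slope) triple per round
    for _ in range(n_rounds):
        # Step a: weights p(1 - p) (assumed standard form, floored) and
        # working response (clipped to [-4, 4] for numerical stability).
        w = [max(pi * (1.0 - pi), 1e-10) for pi in p]
        z = [max(-4.0, min(4.0, (yi - pi) / wi)) for yi, pi, wi in zip(y, p, w)]
        # Step b: weighted least-squares fit of z, keeping the best single feature.
        fits = [(weighted_simple_fit([row[j] for row in X], z, w), j) for j in range(d)]
        (a, b, _), j = min(fits, key=lambda t: t[0][2])
        model.append((j, a, b))
        # Step c: F <- F + 0.5 f_m and p = e^F / (e^F + e^-F) = (1 + tanh F) / 2.
        F = [Fi + 0.5 * (a + b * row[j]) for Fi, row in zip(F, X)]
        p = [0.5 * (1.0 + math.tanh(Fi)) for Fi in F]
    return model


def predict_proba(model, row):
    """Probability of class 1 for one feature row under the fitted additive model."""
    F = sum(0.5 * (a + b * row[j]) for j, a, b in model)
    return 0.5 * (1.0 + math.tanh(F))
```

In a full RotF-LB ensemble, a routine of this kind would presumably be trained once per rotated copy of the data, and the ensemble members' predicted classes would then be combined by majority vote, as the last line of the appendix states.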

About this article

Cite this article

Belhadi, A., Kamble, S.S., Mani, V. et al. An ensemble machine learning approach for forecasting credit risk of agricultural SMEs’ investments in agriculture 4.0 through supply chain finance. Ann Oper Res (2021). https://doi.org/10.1007/s10479-021-04366-9

Accepted: 19 October 2021
Published: 09 November 2021
DOI: https://doi.org/10.1007/s10479-021-04366-9
