1 Introduction

The study of complexity dynamics has attracted growing interest in the recent scientific literature, especially with regard to the financial field. The global financial crisis (2008–2009), the COVID-19 pandemic, and the 2022 Russia–Ukraine war are just some recent examples of exogenous super-shocks that increase both the uncertainty and the complexity of financial markets. Understanding how to potentially reduce financial uncertainty is, on the one hand, a scientific requirement in many different fields (see, for example, Degiannakis et al. 2018) and a practitioner need (Kritzman 1991), and, on the other hand, the expression of a growing urgency (Lesch and Millar 2022).

Unfortunately, in this scenario, traditional problem-solving strategies are not sufficient to guide decision-makers toward optimal financial choices. The complexity of financial markets, in fact, inevitably calls for the use of Decision Support Systems (DSSs, hereafter) (Zopounidis et al. 1997; Boonpeng and Jeatrakul 2016; Banik et al. 2022). The DSS, employed for problems that cannot be treated with the more traditional operations research models, has increasingly become a mixture of different kinds of methodologies and instruments coupled with the machine learning approach (Merkert et al. 2015; Kraus and Feuerriegel 2017; Kumar 2020). For instance, Beraldi et al. (2011) present a decision support system that integrates simulation techniques, to predict future uncertain market conditions, and sophisticated optimization models based on the stochastic programming paradigm.

Among DSSs, the multi-criteria decision analysis (MCDA, hereafter) technique appears to be the most promising one for the purposes of this contribution. In fact, by using multiple criteria/attributes to support decision-making, MCDA helps in dealing with the high level of uncertainty due to the complexity of the issue at hand and to the multiple forms/types of data/information.

Therefore, this paper aims at: (i) introducing a methodological proposal which rests on the MCDA technique, and (ii) applying it to evaluate the quality classes of different financial markets. In particular, from the empirical point of view, we consider the most important exchange indexes related to the principal financial markets, compared across different investor risk profiles in terms of their attitude toward uncertainty.

It is worth noting that MCDA is based on mathematical models and it enables the simultaneous processing of qualitative and quantitative data (Ishizaka et al. 2012). This allows a very close approximation of the more realistic idea that decision-making difficulties are not characterized by a single purpose with perfect knowledge, but rather by a plurality of objectives with partial information.

Therefore, under this perspective, this paper investigates the primary strategies that have been established by theory and practical applications for managing the conceptual representation, the quality evaluation, and the quality assurance purposes, all of which have an impact on the outcomes. The novelty of this contribution concerns the application of a combination of MCDA tools and procedures already present in the existing literature to the context of decision-making processes for financial issues. In other words, this paper proposes a seminal idea of a possible combination of fuzzy approaches (in particular, hesitant fuzzy numbers) and machine learning techniques, as far as classifiers and classification procedures are concerned, in developing expert-based methods to study the dynamics/quality classes of different financial markets.

Pertaining to the latter, a variety of decision-making aids are employed: (i) the AHPSort II, a particular case of fuzzy analytic hierarchy process (AHP), to model the hierarchical structure; (ii) the fuzzy analytic hierarchy process (FAHP), to determine weights in the construction of the matrix of the pairwise comparison (Emrouznejad and Ho 2017); and (iii) the hesitant fuzzy sets (HFS), to more realistically represent the preferences of the decision-makers (Torra 2010).

The remainder of this paper is organized as follows. In Sect. 2, the DSS based on MCDA and fuzzy logic is discussed. Section 3 presents the results obtained by applying the methodological proposal to assign different financial markets to quality classes (high, medium, and low), also taking advantage of financial news. Section 4 shows the new development exploiting machine learning to deal with a Big Data archive of news in the classification of financial market quality. Conclusions follow in Sect. 5.

2 Decision support systems for the quality classes of financial markets

Institutions are increasingly recognizing the financial sector’s transformation due to technological advancement and massive data availability (see, e.g., Pejić Bach et al. 2019; Trelewicz 2017). While roughly ten years ago finance was still known as a small-data research area due to data scarcity, today it is characterized by a high level of data proliferation (especially in terms of volume, velocity and variety) (Fang and Zhang 2016). The latter affects many practical and financial research areas such as portfolio analysis, risk management, retail banking, and credit scoring. Thus, for the financial industry, which generates and stores these vast volumes of data daily, it is crucial to interpret them and create predictive models to support decision-making processes.

In this framework, the majority of issues that arise in the real world, in order to accommodate high levels of complexity (Hasan et al. 2020), call for the simultaneous optimization of multiple goals, many of which are also in direct opposition. The presence of a high level of uncertainty and complexity imposes the use, for instance, of multi-objective evolutionary algorithms (MOEAs) (Von Lücken et al. 2014; Deb 2015) to solve different financial issues (Schlottmann and Seese 2004; Tapia and Coello 2007; Ravi et al. 2017). One of these issues is precisely the need to rank different financial alternatives within priority/quality classes to optimize financial decisions.

Therefore, MOEAs prove to be particularly useful for financial optimization problems, given that they simulate the processes of natural evolution and thus allow the optimal Pareto boundary to be found. In fact, Ciano and Ferrara (2022) propose a three-objective portfolio optimization model and use MOEAs to test the effectiveness of the proposed approach considering five financial markets (HS33, DAX100, FTSE100, S&P100 and Nikkei225).

Assuming a different perspective but still within the same context of analysis, this contribution aims to treat financial uncertainty with a different approach: the use of MCDA methods and tools in a general framework where the optimization is a concrete behavior. To this aim, a novel strategy that incorporates both an MCDA method (Subsection 2.1) and fuzzy logic (Subsection 2.2) into its workings is here proposed to evaluate the quality of different financial markets (based on findings of earlier studies presented in Sect. 3), as summarized in Fig. 3.

2.1 AHP and its evolution

The MCDA is based on parameters that allow the categorization of financial stock markets as representative of the “quality of the market”.

For the issue at hand, which requires the ranking of financial markets to assess their quality class, the MCDA methodology recommends making use of a modification of the Analytic Hierarchy Process (AHP) (Saaty 1980, 1977), rather than of a whole new methodology. In particular, it suggests the use of the AHPSort (Ishizaka et al. 2012).

It is precisely the first approach, the AHP, that will be here analyzed. Among the six MCDA problem formulations (choice, sorting, ranking, description, elimination, and design), AHP has been developed for ranking problems, and it can also be used for choice problems occasionally (Ishizaka et al. 2012). A number of different concepts, such as the decomposition principle, the comparison principle, and the priority synthesis principle, are used as the foundation for the AHP modeling of a selected problem (Saaty 1987).

Although additional MCDA approaches have been developed in sorting problems, the AHP is not suitable for these kinds of challenges. In fact, “sorting is significantly different from ranking or choice and therefore necessitates the employment of specific procedures” (Vetschera et al. 2010). If the first part of this sentence is correct, then we believe that ranking methods can be applied to sorting methods with the right tweaks rather than requiring entire reconstruction (Nemery 2009). As demonstrated in many papers (Zahedi 1986; Shim 1989; Vargas 1990; Saaty and Forman 1992; Forman and Gass 2001; Ho 2008; Liberatore and Nydick 2008; Sipahi and Timor 2010), the AHP has an impressive track record of accomplishments. The pairwise comparison of alternatives and criteria is at the heart of the AHP methodology. This evaluation yields a more accurate conclusion than a direct evaluation, such as that found in the more conventional weighted sum methodology (Millet 1997; Saaty 2005, 2005, 2006; Whitaker 2007).

To fill this gap, the AHPSort has been developed as a variant of AHP for sorting alternatives. The original AHP has been transformed into AHPSort (Ishizaka et al. 2012) and AHPSortII (Miccoli and Ishizaka 2017) to keep all the advantages of the original approach. It allows the evaluation of only two items at a time, which results in information that is more precise (Millet 1997). Moreover, given that the problem is organized in a hierarchy, it is much simpler to comprehend, explain, and find a solution to it. The AHPSort approach also inherits from the AHP a consistency check and a sensitivity analysis, both of which contribute to improving the quality of the conclusion that can be reached.

Consistency is a crucial aspect in a decision-making process. Accordingly, pairwise comparison matrices (PCMs) must be mentioned as one of the most sensitive and discussed aspects in the AHP (and its evolution) literature. It is worth noting that pairwise comparison is a robust and efficient technique for comparing alternatives/criteria. At the same time, it is fundamental in the development of modern decision-making methods. Here, our interest focuses on those tools useful for reinforcing the machine learning approach toward better-performing decision support system platforms. Given that several types of pairwise comparison matrices (such as multiplicative, additive, and fuzzy) have already been analyzed in the literature, evaluating their ability to represent the subjective preferences of decision-makers in socio-economic contexts is promising. PCMs have long been used in psychophysical research to evaluate and compare sensory intensities (Thurstone 1994; Kunc et al. 2016; Wixted and Thompson-Schill 2018). In recent years, they have also gained popularity in decision theory in terms of precision, accuracy, and robustness (Koczkodaj et al. 2016). By comparing two alternatives at a time, they reduce the complexity of a decision-making problem, especially when the set of alternatives is large. Since the representation of preferences is not unique under PCMs, Cavallo et al. (2012, 2014, 2019) proposed a unified approach based on algebraic structures known as Abelian linearly ordered groups (Alo-groups), i.e., commutative groups equipped with an ordering relation. Thus, in this context, it is easy to define a consistency condition (i.e., a cardinal transitivity condition of preferences on triplets of decision elements) such that, if it holds, the decision-maker is considered fully coherent and his/her judgements are not contradictory (for conditions weaker than consistency).
To this aim, further research could reinforce this algebraic structure by exploiting Carnot groups and their properties (see, for further details, Molica Bisci and Ferrara 2016).
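As an illustration of the consistency condition on triplets, the following minimal Python sketch checks the multiplicative case only, where a PCM with entries \(a_{ij}\) is consistent when \(a_{ik}=a_{ij}\cdot a_{jk}\) for all i, j, k. The function names, the tolerance, and the two example matrices are hypothetical and are not taken from the cited works; the more general Alo-group formulation is not covered here.

```python
from itertools import product


def is_reciprocal(pcm, tol=1e-9):
    """Check a_ij * a_ji == 1 for every pair (multiplicative reciprocity)."""
    n = len(pcm)
    return all(abs(pcm[i][j] * pcm[j][i] - 1.0) <= tol
               for i in range(n) for j in range(n))


def is_consistent(pcm, tol=1e-9):
    """Check cardinal transitivity a_ik == a_ij * a_jk on every triplet."""
    n = len(pcm)
    return all(abs(pcm[i][k] - pcm[i][j] * pcm[j][k]) <= tol
               for i, j, k in product(range(n), repeat=3))


# Hypothetical matrix built from the weights (4, 2, 1), with a_ij = w_i / w_j.
consistent_pcm = [[1.0, 2.0, 4.0],
                  [0.5, 1.0, 2.0],
                  [0.25, 0.5, 1.0]]

# Hypothetical matrix that is reciprocal but violates transitivity:
# a_13 = 3 while a_12 * a_23 = 4.
inconsistent_pcm = [[1.0, 2.0, 3.0],
                    [0.5, 1.0, 2.0],
                    [1 / 3, 0.5, 1.0]]

print(is_consistent(consistent_pcm), is_consistent(inconsistent_pcm))
```

In practice, judgements are rarely perfectly consistent, so a tolerance or a consistency index is normally used in place of an exact test.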

AHPSort has undergone additional development and is now capable of being applied to large issues (Miccoli and Ishizaka 2017) as well as to collective decisions (López and Ishizaka 2017).

Fig. 1 Hierarchy with the AHP method

The AHP approach, shown in Fig. 1, is expanded upon here by assigning various alternatives to different priority groups according to the criteria and preferences of the decision-makers. The AHP approach (Saaty and Forman 2003) has been extended to create a new multi-criteria sorting method, first with AHPSort and then with the so-called AHPSortII. The latter proves to be particularly useful in the presence of a large number of different alternatives, given that it simplifies the required pairwise comparisons and sorts the available alternatives into ordered classes (from most to least preferred) (Ishizaka et al. 2020).

As shown in Fig. 2, compared with Fig. 1, when modeling the hierarchy of a DM problem, a new level is added by incorporating the attribution of alternatives to ordered classes, from the most to the least important (AHPSortII). This may be considered as an addition to the modeling of the hierarchy of the decision problem.

The construction of the pairwise comparison matrix, which must obey the axioms of transitivity and proportionality, is the methodological tool on which the analysis is based. Central profiles must be defined to properly distribute the alternatives among the classes. A central profile is an index that collects information about the similarity of the performance of the various options in relation to the class to be assigned. By using limit profiles, which demarcate the various priority categories, it is possible to categorize the options in the way that is most suitable for the task at hand.

A further possible extension of this proposal could be the implementation of the ELECTRE TRI method into this decision-making platform, taking advantage of machine learning support (see, for instance, the modeling proposed by Fattoruso and Barbati 2021). The analysis of the effects on this ongoing decision support system platform is a further field of study (see, for further details, Alvarez et al. 2021; Fattoruso et al. 2023).

2.2 Hesitant fuzzy sets and their role

The second step of our methodological proposal rests on the use of a fuzzy logic (see Zadeh 1988; Kosko and Isaka 1993; Klir and Yuan 1995; Hájek 2013, just to cite a few). In particular, here the alternatives are assigned to the various classes according to the degree to which they belong. This is done in accordance with the principles of fuzzy logic, and it is accomplished by the construction of a priority function which assumes values in the interval [0, 1].

In more detail, the various kinds of priority classes are outlined with the reference labels, and representative limit profiles are then determined for the specific dimensions of every single class. During the process of building the performance matrix, the decision-maker expresses his/her preferences by using the Saaty semantic scale (Saaty 1991, 1994). This scale gives each pair of elements a subjective estimate with respect to the decision criteria, in accordance with the principle of comparison, which is applied through pairwise comparisons. In this empirical application, we focus on the fuzzy scale, which takes as its starting point the presence of a triangular characteristic function with three values, namely: high (H), medium (M), and low (L).

Fig. 2 Hierarchy with the AHPSort II method

After that, the weights that are associated with each criterion are constructed by calculating the geometric mean of the values H, M, and L for each pairwise comparison involving the criteria that are under discussion for the financial issue at hand (see Sect. 3 for further details).
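One possible reading of this step can be sketched in Python as follows. It is an assumption, not the authors' exact procedure: each pairwise comparison is taken to be a triangular triple (L, M, H), which is collapsed to a single value through the geometric mean of its three components, and the criterion weights are then obtained from the normalized geometric means of the rows. The function names and the judgement values are hypothetical.

```python
import math


def geometric_mean(values):
    """Geometric mean of a sequence of positive numbers."""
    return math.prod(values) ** (1.0 / len(values))


def crisp_value(triple):
    """Collapse a triangular triple (L, M, H) into one representative value."""
    low, mid, high = triple
    return geometric_mean([low, mid, high])


def weights_from_fuzzy_pcm(fuzzy_pcm):
    """Row geometric means of the collapsed matrix, normalized to sum to 1.

    The row-wise normalization is an assumption made for this sketch.
    """
    crisp = [[crisp_value(cell) for cell in row] for row in fuzzy_pcm]
    row_means = [geometric_mean(row) for row in crisp]
    total = sum(row_means)
    return [m / total for m in row_means]


# Hypothetical fuzzy judgements for three criteria, written as (L, M, H).
one = (1.0, 1.0, 1.0)
fuzzy_pcm = [
    [one, (1.0, 2.0, 3.0), (3.0, 4.0, 5.0)],
    [(1 / 3, 1 / 2, 1.0), one, (1.0, 2.0, 3.0)],
    [(1 / 5, 1 / 4, 1 / 3), (1 / 3, 1 / 2, 1.0), one],
]

weights = weights_from_fuzzy_pcm(fuzzy_pcm)
print([round(w, 3) for w in weights])
```

With these illustrative judgements, the first criterion receives the largest weight and the third the smallest, which matches the direction of the pairwise preferences entered in the matrix.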

Last but not least, the principle of priority synthesis (Saaty 1983) is adhered to. This principle refers to the clarification of a ranking by defining global and relative priorities through the application of the linear interpolation formula.

The decision-making process begins with a phase that is characterized by the uncertainty and hesitation of decision-makers with respect to their financial preferences. This requires a flexible method that attributes to an element a series of possible values, in terms of degrees of membership, to best represent the preferences of the decision-maker. The method selected here for reaching this goal is the Hesitant Fuzzy Set (HFS). The pioneering works of Torra and Narukawa (2009) and, subsequently, of Torra (2010) are credited as the first contributions to present the HFS idea.

Intuitively, it is possible to think of the HFS as an extension of Zadeh’s fuzzy set theory (Zadeh 1965), which was developed with the scope of capturing uncertainty through the use of specific models. In the latter framework, however, it is difficult to model the hesitation of the DMs on the “right” value to attribute. Since this hesitation can appear while modeling the uncertainty, HFS has seen a boom in its applicability in recent years, both in terms of qualitative (Rodriguez et al. 2011) and quantitative (Qian et al. 2013; Chen et al. 2013; Yu 2013; Zhu et al. 2012) contributions. Nevertheless, the financial applications of HFS are still limited. Bisht and Kumar (2019) proposed the use of HFS for financial time series forecasting, while Deng and Zhang (2023) constructed an HFS environment within which to evaluate the development level of digital inclusive finance. A fuzzy environment has also been selected in Li et al. (2022) to evaluate the supply chain finance credit risk within a multi-criteria decision-making (MCDM) approach.

The increasing interest in HFS is related to the possibility of using these sets to model the uncertainty in both directions. HFSs have been utilized by researchers working on MCDM problems, multi-expert multi-criteria decision-making, evaluation processes (Yu 2013), and clustering techniques.

For instance, Yue et al. (2013) used aggregation operators to determine the most effective manufacturing plan by making use of HFS. As a result of the inability of the parties involved in the decision-making process to convince one another, the vote that the alternative should satisfy this criterion can be represented by a hesitant fuzzy element (HFE) that contains all of the values attributed by the DM.

These kinds of situations occur quite frequently within the context of decision-making processes in general, and in the financial field in particular. For instance, with particular attention to credit risk, He et al. (2016), Shen et al. (2018), and Wen et al. (2019) showed how HFS could reduce the complexity of information processing and represent well the DMs’ hesitancy in the decision process. Thus, HFS allows more stable and reliable results in complex decision-making processes. However, to the best of our knowledge, HFS has never been applied to assess financial market quality while accommodating the psychological behavior and risk preference of DMs.

Therefore, the HFS represents a tool that is used to make decisions in circumstances in which there is uncertainty and hesitation. Further details will be presented in the Toy example proposed in Sect. 3.

In this contribution, an HFS is defined in terms of a function that returns, for each element of the domain, a set of possible membership degrees rather than a single one. More precisely, given a set X, a hesitant fuzzy set A on X is a function that, when applied to X, returns a subset of [0, 1], i.e.,

$$\begin{aligned} A=\{(x, h_{A}(x))|x\in X\}, \end{aligned}$$
(1)

where \(h_{A}(x):X\rightarrow [0,1]\) is a set of values in [0, 1], denoting the possible degrees of membership of each element \(x\in X\) in the set A. For simplicity, \(h_{A}(x)=h\) is called a hesitant fuzzy element (HFE).

When the function \(h_{A}(x)\) returns a null value, i.e., \(h=0\), the degree of membership is zero, and A is called the “empty set”. The latter is not a set without elements; rather, it expresses the fact that all the DMs are opposed to the alternative. When \(h_{A}(x)\) returns a value equal to 1, i.e., \(h=1\), A is labeled the “complete set”. In this case, it is not the set of all possible elements, but it indicates that all DMs agree with the alternative. Finally, if the function returns the whole interval [0, 1], all values between 0 and 1 are possible. As mentioned above, an element can thus receive multiple values, but it can also receive a single value, which can itself be seen as a subset of [0, 1].
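To make the definition concrete, the following minimal Python sketch represents an HFS as a mapping from each element x to its hesitant fuzzy element, i.e., a finite collection of membership degrees in [0, 1]. The data structure, the helper names, and the example values are assumptions chosen for illustration only; the case in which h is the whole interval [0, 1] is not represented.

```python
def make_hfs(assignments):
    """Build an HFS as a dict: element -> sorted tuple of degrees in [0, 1]."""
    hfs = {}
    for x, degrees in assignments.items():
        values = tuple(sorted(set(degrees)))
        if not values or any(not 0.0 <= g <= 1.0 for g in values):
            raise ValueError(f"invalid membership degrees for {x!r}: {degrees}")
        hfs[x] = values
    return hfs


def is_empty_hfe(h):
    """h = {0}: every decision-maker is opposed to the alternative."""
    return h == (0.0,)


def is_complete_hfe(h):
    """h = {1}: every decision-maker agrees with the alternative."""
    return h == (1.0,)


# Hypothetical example: x1 is hesitant, x2 is rejected, x3 is fully accepted.
A = make_hfs({"x1": [0.3, 0.7], "x2": [0.0], "x3": [1.0]})

for x, h in A.items():
    print(x, h, is_empty_hfe(h), is_complete_hfe(h))
```

A plain fuzzy set is recovered as the special case in which every element carries exactly one membership degree.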

Therefore, in the decision-making process, the substantial difference between HFS and traditional sets is that the HFS allows an organization with multiple DMs from different areas to evaluate an alternative by making use of all the values that they attribute to it.

The structure for the suggested methodological approach is summarized in Fig. 3 and applied in Sect. 3 to financial markets and further discussed in Sect. 4. In particular, the three main steps proposed to model decision-making are the following: (i) the AHPSort II to model the hierarchical structure; (ii) the fuzzy analytic hierarchy process (FAHP), to determine weights in the construction of the matrix of the pairwise comparison; and (iii) the hesitant fuzzy sets (HFS), to more realistically represent the preferences of the decision-makers.

Fig. 3 Flow chart for the proposed study

3 The application to financial market for market quality classification

Let us consider the vector X representing the set of financial markets \(X=\{x_i\}_{i=1,\dots ,I}= \{x_{1},x_{2},x_{3},\dots ,x_{I}\}\) and the decision-making criteria \(C= \{c_n\}_{n=1,\dots ,N}=\{c_{1},c_{2},c_{3},\dots , c_{N}\}\). The evaluation of the i-th market on the n-th criterion is denoted by \(c_{n}(x_i)\).

The purpose of this paper is to qualitatively identify three priority classes (\(Q=\{q_p\}_{p=1,\dots ,P}=\{q_1,q_2,\dots ,q_P\}\) with \(P=3\)) of financial markets based on the use of the methodology summarized in Fig. 3. In this embryonic idea, the vector of criteria C is strictly connected with a twin vector \(C^*\) representing different news/information categories, \(C^*= \{c_n^*\}_{n=1,\dots ,N}=\{c_{1}^*,c_{2}^*,c_{3}^*,\dots ,c_{N}^*\}\), gained from the information market through media, online blogs, social networks, and so on. This second vector \(C^*\), which includes the most influential news categories able to condition a great part of the DM processes, could be considered, for the volume, variety, and velocity of the collected documents, a Big Data (Sagiroglu and Sinanc 2013) archive D. One of the great challenges is the assignment of each document in D to the news/information categories (i.e., document classification) related to financial issues and concerning, in this case, the banking, non-banking, governmental, or global dynamics areas. Our idea is deeply connected to that of Dogra et al. (2022).

The huge number of daily news items made available by various online sources, and the storage of this information in a Big Data archive D, is nowadays a crucial component of the management of financial assets. Extracting information to be used in MCDA thus becomes relevant in this field and requires multiclass classification (see, for instance, Aly 2005; Peng et al. 2011; Grandini et al. 2020) and, in particular, multi-text classification (Damaschk et al. 2019; Rennie and Rifkin 2001). The interpretation and forecasting of the movement of stock prices are, in fact, significantly influenced by the occurrence of news events (Shiller et al. 1984; Yermack 1997; Schumaker and Chen 2009; Akita et al. 2016). For instance, in the Indian stock market, an essential and as-yet unsolved problem is the development of a framework for the storage of news articles, the collection of information on certain topics, and the extraction of useful information. When online news portals generate financial news articles about a wide variety of topics at the same time, it might be difficult to classify news items related to a certain category because they have been produced simultaneously. Thus, it is essential, on the one side, to gather and store news articles and, on the other side, to arrange/classify these text documents into the four categories (Dogra et al. 2022).

Here, multiple tests (see Footnote 1) have been conducted to classify the financial articles into one of the four predetermined categories (banking, non-banking, governmental, or global) based on a multiclass text classification approach (Dogra et al. 2022). Further details are also provided in Sect. 4.

It is necessary to define the priority classes \(Q=\{q_1,q_2,\dots ,q_P\}\), ordered and labeled according to the decision problem presented. This allows the classification of each element under investigation at a certain quality level using one of the main MCDA methods, the AHPSort. In this context, the use of the Fuzzy-AHPSort method (Krejčí and Ishizaka 2018) is considered more appropriate to calculate weight criteria. Thus, the proposed combination of the MCDA method (to model the hierarchical structure) with the theory of fuzzy sets (to calculate weight criteria and construct the final decision) has the advantage of better interpreting the “ambiguous” assignments between classes. Furthermore, the approach introduced also provides useful information about the degree of belonging of each alternative to each priority class \(q_p\).

The profiles of each class \(q_p\) can be defined using: (i) the local limiting profiles, namely the minimum performance required on a criterion \(c_n\) to belong to the class \(q_p\); or (ii) the local central profiles, which express the values considered ideal within each class \(q_p\) on criterion \(c_n\).

For this purpose, the limit profiles

$$\begin{aligned} PL= & {} \{q_1(pl_1-pl_2), q_2(pl_2-pl_3),\dots ,\nonumber \\{} & {} q_P(pl_{P}-pl_{P+1})\} \end{aligned}$$

of the priority classes are obtained to establish the lower (\(pl_{p}\)) and upper (\(pl_{p+1}\)) reference values based on each criterion \(c_n\) for the class \(q_p\). To obtain satisfactory results, central profiles \(PC=\{pc_1, pc_2,\dots ,pc_P\}\) are also introduced on criterion \(c_n\).

The constituent elements of the decision problem are compared in pairs, proceeding from the lower level to the higher ones, repeating the process for each level of the hierarchy. The value of each comparison obtained will be a vector composed of three elements \(v_T=\{H,M,L\}\).

The procedure continues with the calculation of the geometric mean of the fuzzy characteristic function of each alternative, criterion, and sub-criterion, until a single representative value v is obtained. To calculate the local priority (\(L_P\)) of the i-th financial market \(x_i\) on criterion \(c_n\), the linear interpolation formula is adopted considering the values of the limit profiles, the parameter \(c_n(x_i)\) (score of alternative \(x_i\) on criterion \(c_n\)), and the representative value v:

$$\begin{aligned} L_P= v+\frac{pl_{p-1} - pl_{p}}{pl_{p} - pl_{p+1}} (c_n(x_i) - v), \end{aligned}$$
(2)

where

  • \(L_P\) is the local priority of the i-th financial market \(x_i\) according to the criterion \(c_n\);

  • \((pl_{p} - pl_{p+1})\) is the limit profile of the priority class \(q_p\);

  • \(c_n(x_i)\) is the parameter that measures the i-th market with respect to the considered criterion \(c_n\).

By aggregating all the local weights determined by the previous step (the priority for the criterion \(c_n\) importance is based on its weight, \(w_n\) according to the AHP eigenvalue method), the global priority value of each market i is obtained:

$$\begin{aligned} G_i= \sum _{n=1}^{N}L_P \cdot w_n. \end{aligned}$$
(3)
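As a purely numerical illustration, Eqs. (2) and (3) can be transcribed literally into Python as follows. The function and argument names are hypothetical, as are all the numbers used: the limit profiles, the representative value v, the scores, and the weights are not taken from the paper's data.

```python
def local_priority(v, score, pl_prev, pl_curr, pl_next):
    """Literal transcription of Eq. (2).

    v        : representative value from the fuzzy geometric mean
    score    : c_n(x_i), the score of market x_i on criterion c_n
    pl_prev, pl_curr, pl_next : pl_{p-1}, pl_p, pl_{p+1}
    """
    denominator = pl_curr - pl_next
    if denominator == 0:
        raise ValueError("pl_p and pl_{p+1} must differ")
    return v + (pl_prev - pl_curr) / denominator * (score - v)


def global_priority(local_priorities, weights):
    """Eq. (3): weighted sum of the local priorities over the N criteria."""
    if len(local_priorities) != len(weights):
        raise ValueError("one weight per criterion is required")
    return sum(lp * w for lp, w in zip(local_priorities, weights))


# Hypothetical inputs for one market and one criterion.
lp_example = local_priority(v=0.5, score=0.8,
                            pl_prev=0.9, pl_curr=0.6, pl_next=0.3)

# Hypothetical local priorities on N = 3 criteria and their weights.
g_example = global_priority([0.8, 0.6, 0.4], [0.5, 0.3, 0.2])

print(round(lp_example, 3), round(g_example, 3))
```

When the limit profiles are equally spaced, as in this example, the ratio in Eq. (2) equals one and the local priority coincides with the score \(c_n(x_i)\).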

The final step is the assignment of each financial market i to one of the three (high, medium, low) priority classes \(Q=\{q_1,q_2,q_3\}\) based on its proximity to the central profile \(pc_p\) as a representative parameter of the value deemed ideal within each class \(q_p\).
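A minimal sketch of this assignment step is given below, assuming that proximity is measured by the absolute difference between the global priority and each central profile. The distance choice, the function name, and the central profile values are assumptions made for illustration.

```python
def assign_class(global_priority_value, central_profiles):
    """Return the label of the class whose central profile is closest.

    Ties are resolved in favor of the class listed first in the mapping.
    """
    return min(central_profiles,
               key=lambda label: abs(global_priority_value
                                     - central_profiles[label]))


# Hypothetical central profiles pc_p for the high, medium, and low classes.
central_profiles = {"H": 0.8, "M": 0.5, "L": 0.2}

# Hypothetical global priorities G_i for three markets.
assignments = {market: assign_class(g, central_profiles)
               for market, g in {"x1": 0.66, "x2": 0.45, "x3": 0.10}.items()}

print(assignments)
```

Other distance measures, or an optimistic/pessimistic rule based on limit profiles, could be substituted without changing the structure of the function.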

The assets (\(i=1,\dots ,I\)) are assessed on the basis of different criteria, which, in this specific case, have been identified in three decision-making criteria (\(N=3\)) such as: i) the trade-off between expected return and reliability \((c_1=a_i)\); ii) the appropriate level of capital to the sub-portfolios \((c_2=d_i)\); and iii) the level of risk \((c_3=r_i)\):

$$\begin{aligned} C=\{a_i,d_i,r_i\}. \end{aligned}$$
(4)

Priority classes are defined and a label is assigned to them. From the analysis carried out, a triple classification of high (H), medium (M) and low (L) quality was determined, respectively:

$$\begin{aligned} Q=\{q_H,q_M,q_L\}, \end{aligned}$$
(5)

by identifying the limit profiles of each class, as also shown in Fig. 4:

$$\begin{aligned} classPL= & {} \{q_H(pl_1 - pl_2),q_M(pl_2-pl_3),\nonumber \\{} & {} q_L(pl_3-pl_4)\}. \end{aligned}$$
(6)
Fig. 4 Priority classes

The pivotal tool is the construction of the matrices of the pairwise comparison (see Tables 1 and 2).

Table 1 Performance matrix: comparison of the three criteria
Table 2 Geometric mean of fuzzy (\(v_T\))

We proceed with the determination of the weights \(w_n= v_T*(v_T)^{-1}\), and normalize the values obtained from the arithmetic mean of \(\{H,M,L\}\). The computation of the pairwise comparison is repeated for each level up to the level of the alternatives. Using the linear interpolation formula, the values of the local priorities follow Eq. (2) for every financial market on criterion \(c_n\), \(n=1,\dots ,3\).

The priorities are then summarized to arrive at the value of the global priorities as reported in Eq. (3).

The next step involves determining the central profile \( pc_p\) as a representative parameter of the value deemed ideal to be established with investors and based on their risk attitude.

The ongoing research on hesitant fuzzy operations and measures is strongly oriented toward the analysis of equal-length processing (Lv et al. 2019), a method that enlarges the original data structure while changing the data information. This is an emerging problem to be solved in the development of hesitant fuzzy sets research. In Lv et al. (2019), the hesitant fuzzy distance measure and similarity measure are studied based on the information feature vector; a hesitant fuzzy network clustering method based on the similarity measure is then given, and the effectiveness of the algorithm is illustrated through a numerical example.

In practice, in a group setting, it is really difficult to determine the membership of an element in a set because of doubts between a few different values. For example, let us consider two DMs discussing the membership degree of x in A. One wants to assign the value 0.3 and the other 0.7, and they cannot persuade each other. Thus, the membership degrees of x in A can be represented by \(\{0.3, 0.7\}\). This is obviously different from the fuzzy number 0.3 (or 0.7) and from the intuitionistic fuzzy number (0.3, 0.7). Therefore, hesitant fuzzy sets can better simulate the hesitant preferences of decision-makers. Since it was put forward, the hesitant fuzzy set has received extensive attention, as already discussed in Sect. 2.

In the toy example proposed here, five financial markets (i.e., \(i=1,\dots ,5\)) have been considered for performing tests of the methodological proposal (see also Chang et al. 2000). In particular, the following financial markets have been selected: \(X=\{HS33, DAX100, FTSE100, SP100, Nikkei225\}\) (see Footnote 2). The data set comprises 291 weeks (2020–2021). These data sets are publicly available from OR-Library (Beasley 1990). We consider three decision-makers (DMs) with different behavioral profiles: DM1 (risk-averse decision-maker), DM2 (risk-tolerant decision-maker), and DM3 (risk-neutral decision-maker).

The hesitant fuzzy numbers in the matrix of Table 3 are determined by combining market values with the possible DM behavior with respect to both risk and uncertainty. For instance, for the Hong Kong stock index (HS33) and the priority class \(q_H\), the values (0.2, 0.3, 0.7) are collected as the evaluation of the market. The first value, 0.2, reflects the personal assessment of this index by DM1. Similarly, 0.3 is the value assigned by DM2 (the risk-tolerant DM), and 0.7 is the value assigned by DM3 (the risk-neutral DM).

Now, the assignment of the five alternatives to the three classes, namely the assignment of financial markets to priority classes, takes place through the construction of the hesitant decision matrix which can assume values in the range [0, 1] as reported in Table 3.

Table 3 Example of Hesitant decision matrix

Each single value is determined as a comparison parameter using the score function whose equation is

$$\begin{aligned} s(h)= \frac{1}{\#h}G_i, \end{aligned}$$
(7)

where #h represents the number of elements in the set h of the hesitant decision matrix. For instance, (0.2, 0.3, 0.7) represents \(H_{1,1}\). The value obtained is compared with the central profile of the class to which it belongs. On the basis of the similarity of the values, the ranking of the financial markets within each of the high-, medium-, and low-priority classes \(Q=\{q_H,q_M,q_L\}\) is identified.
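As a hedged illustration of Eq. (7), the sketch below assumes that \(G_i\) denotes the sum of the membership values contained in h, so that the score reduces to their arithmetic mean (the usual score function for hesitant fuzzy elements). The function name `score` and the validation checks are illustrative assumptions, not part of the original method.

```python
from typing import Sequence


def score(h: Sequence[float]) -> float:
    """Score of a hesitant fuzzy element, read here as the mean of its values.

    Assumption: G_i in Eq. (7) is the sum of the membership values in h.
    """
    if not h:
        raise ValueError("a hesitant fuzzy element needs at least one value")
    if any(not 0.0 <= v <= 1.0 for v in h):
        raise ValueError("membership values must lie in [0, 1]")
    return sum(h) / len(h)


# H_{1,1} from the text: evaluations of HS33 for class q_H by DM1, DM2, DM3.
h_11 = (0.2, 0.3, 0.7)
s_11 = score(h_11)  # (0.2 + 0.3 + 0.7) / 3
```

Under this reading, the score of \(H_{1,1}\) is the single number that would then be compared with the central profile of the class.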

4 New developments by machine learning approach and managerial insights

This section proposes a new variation of the MCDA technique. As explained in Sect. 2, the first step of the AHPSort II method is the decomposition of a decision problem into a hierarchical structure, which can be represented as in Fig. 2. The new procedure, instead, is characterized by a different articulation of the levels of the hierarchy (Fig. 5), resulting in the following structure:

  • first level: definition of the objective/s;

  • second level: identification of the alternatives;

  • intermediate level: determination of the relevant criteria for the analysis;

  • next level: seamless data integration on a cloud;

  • last level: identification of the priority classes.

As shown in Fig. 5, this hierarchical structure differs from that of Fig. 3, as a different analysis perspective is defined and implemented.

Fig. 5 New hierarchical structure integrated with machine learning

The first phase does not change; therefore, the analysis starts with the definition of the goal or goals that the business wants to achieve. After that, it is recommended to identify the different paths that could lead the organization to the first level of success. The next step can be taken only once all the available choice options have been outlined, because it is generally accepted that defining the alternatives first makes it easier to focus on the criteria that are most important for the analysis and to eliminate those that are not. It is useful to insert the trend of these data into a cloud in order to manage a large amount of information, monitor changes over a pre-established time period, and make the data easily accessible to decision-makers. The activities carried out in an organization (which may concern business activities, processes, financial assets, and so on) generate data that must be collected in the analysis reports. The latter represent the performance that these activities are able to achieve. After that, comparison matrices are formed between the parameters specified by the experts, taken as the best performance value that each activity can attain, and the average of the values assumed by each activity (Table 4).

Table 4 Comparison matrices are formed between the parameters specified by the experts and the average of the values assumed by each activity

The value inside the cells \(a_{m,n}\) will be determined by the difference between the mean \({\overline{M}}\) and the established parametric value \(\overline{p_s}\) specified by the expert:

$$\begin{aligned} {\overline{M}}= & {} \frac{1}{n}\sum _{i=1}^{n}x_i, \end{aligned}$$
(8)
$$\begin{aligned} a_{m,n}= & {} \frac{1}{n}\sum _{i=1}^{n,m}x_i-\overline{p_s}:(x_i). \end{aligned}$$
(9)
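A minimal sketch of how one cell of Table 4 could be computed, under the reading stated in the sentence preceding Eqs. (8) and (9): the cell is the difference between the mean of the observed values and the parameter fixed by the expert. The function name `comparison_cell` and the numerical values are hypothetical and serve only as an illustration.

```python
from statistics import fmean
from typing import Sequence


def comparison_cell(values: Sequence[float], expert_parameter: float) -> float:
    """Mean of the observed values (Eq. 8) minus the expert's parameter.

    Assumption: this is the intended reading of Eq. (9).
    """
    return fmean(values) - expert_parameter


# Hypothetical performance values of one activity and a hypothetical expert target.
observed = [0.50, 0.60, 0.70]
a_cell = comparison_cell(observed, expert_parameter=0.80)
```

A negative cell indicates that the average performance falls short of the value the expert regards as the best attainable one.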

This step is repeated for all the levels of the hierarchy. In this revision of the multi-criteria model, the classic pairwise comparison matrix is introduced only to give the criteria a relative weight, allowing DMs to express a subjective opinion about the importance of the different criteria according to their experience. A further variation of the methodology could be implemented in the attribution of the criteria weights: since experts in certain sectors do not have in-depth knowledge of decision support tools, their judgments may contain errors; consequently, their decision-making process must be simple and efficient, to maximize their ability to make decisions. In particular, the weights could be assigned by means of the membership function of fuzzy theory. Consider the following equation:

$$\begin{aligned} A=\{(x, h_{A}(x))|x\in X\}, \end{aligned}$$
(10)

where

  • A is a set of priority elements;

  • \(h_A (x)\) is a set of values in [0, 1], denoting the possible membership degrees of each element \(x\in X\) to the set A.

We introduce an oriented segment with values included in the interval [0, 1], and ask the experts to position the different criteria within it based on the degree of “dominance” of one criterion over the other.

Let us assume that DMs must attribute a relative weight to four evaluation criteria \(C=\{c_1,c_2,c_3,c_4\}\) of different importance. This means that they are asked to position the four elements within the segment based on their importance.

Figure 6 shows a hypothetical attribution of weights, which configures the following membership function:

$$\begin{aligned} h_A(x)=[(c_1/0.2)(c_2/0.7)(c_3/0.9)(c_4/0.5)]. \end{aligned}$$
(11)
Fig. 6 Weight attribution by fuzzy characteristic function

In this second alternative, this string would represent the vector of the criteria weights. The procedure continues with the calculation of the priority vector of the matrices with the eigenvalue method, the calculation of local priorities by linear interpolation, as well as the calculation of fundamental global priorities to classify the predefined alternatives.
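The weight vector of Eq. (11) can be sketched as follows. The text uses the membership degrees directly as criteria weights; the normalization to unit sum shown here is an additional assumption, included only because priority vectors in AHP-type methods are usually normalized. The names `normalized_weights` and `ranking` are illustrative.

```python
from typing import Dict, List


def normalized_weights(membership: Dict[str, float]) -> Dict[str, float]:
    """Rescale membership degrees so that the weights sum to one (assumption)."""
    total = sum(membership.values())
    if total <= 0:
        raise ValueError("at least one membership degree must be positive")
    return {criterion: value / total for criterion, value in membership.items()}


# Positions of the four criteria on the [0, 1] segment, as in Eq. (11).
membership = {"c1": 0.2, "c2": 0.7, "c3": 0.9, "c4": 0.5}
weights = normalized_weights(membership)

# Criteria ordered from most to least important.
ranking: List[str] = sorted(membership, key=membership.get, reverse=True)
```

The ordering of the criteria is unaffected by the normalization, so the experts' positioning on the segment is preserved.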

In this step, we can consider the vector C of news/information categories introduced in Sect. 3. Out of the pool of financial news stories, we are interested in extracting news on the banking industry and on the domains most closely related to it. We believe that the “banking news” of any nation is most correlated with its “governmental news-eve”, which covers news on government initiatives for good governance, state or national elections, and changes in or new developments of governmental policies, and with “global” financial news, which covers global trade, changes in currency and commodity prices, and global sentiment. We therefore face a four-class classification problem: separating banking news and the news most correlated with it, namely governmental and global news, from the complete set of financial news articles. These four categories (banking, non-banking, governmental, and global dynamics) have already been exploited in Sect. 3.

When the goal is to attribute a document to a category, a classification task inevitably arises. Classification remains one of the major machine learning issues, despite the rapid growth of machine learning theories and applications in the last decade. Selecting a suitable machine learning classifier is challenging, but it can allow one to outperform econometric models. In this field, the need for robust classification techniques is urgent, especially given the presence of ill-defined, vague, or imbalanced data. Data are imbalanced when the number of examples representing each class is not equal, which is a frequent issue when classifying news articles, as in this contribution. Common machine learning algorithms for multiclass text classification may introduce biases in the presence of imbalanced data sets (see, for instance, the discussion in Kaur et al. 2019). Moreover, text classification in this field is generally binary, whereas the real-world issues considered here require multiclass text classification, as discussed in Sect. 3.

In this context, fuzzy techniques could represent a solution to increase the performance of machine learning classification algorithms (Dai and Chen 2020; Tabakov et al. 2021), also within a multi-criteria approach (Ye et al. 2020). Moreover, fuzzy logic helps in dealing with attribute redundancy, missing or diffuse values due to noise, and partially missing data (Caballero 2019). The use of HFS in particular appears promising and opens new directions for further research, although it is still little explored. Indeed, prior to this contribution, Li et al. (2009) developed the fuzzy support vector machine, further extended by Ha et al. (2013) with intuitionistic fuzzy numbers and kernel functions.

In line with this research direction, HFS can be used with reference to the following machine learning and/or econometrics models (see Fig. 7), with potential decision-making applications not only in the financial field.

Fig. 7 Machine learning for decision-making

4.1 Management dynamics insights: a sketch

Our decision-making process, presented here as a first attempt, is connected with the financial environment. Its use could, however, be easily extended to management sciences. This is especially true for decision support systems applied to decision-making in sectors/markets characterized by high complexity and/or entropic uncertainty. In this connection, HFS and FAHP make it possible to go beyond the stochastic approach, and they also offer an interesting and challenging alternative to the classic approaches used by practitioners (CEOs, managers, start-uppers, etc.). Further research, which we are currently undertaking, is aimed at providing numerical and computational evidence from Big Data (which are not so easy to arrange).

4.2 Linear regression

The linear regression model is considered one of the fundamental models: first developed as a statistical and econometric model (Krämer and Sonnberger 2012), it is now also used as a supervised machine learning algorithm (Khalil et al. 2022). It studies the relationship between one continuous output/dependent variable and one (simple linear regression) or more (multiple linear regression) input/independent variables, assuming a linear relationship between them. It predicts the continuous output by means of constant slopes, evaluating, according to the following formulation, how the variability of the dependent variable depends on the variability of the independent variables:

$$\begin{aligned} y_i= \beta _0+\beta _1X_{i1}+\beta _2 X_{i2}+\cdots +\beta _k X_{ik}+\varepsilon _i, \end{aligned}$$
(12)

where:

  • \(y_i\) is the i-th (\(i=1,\dots , N\)) observation of the dependent variable;

  • all the X’s represent the independent/input variables;

  • all the \(\beta \)’s are the model parameters to be estimated (via OLS) according to the specification of the error terms;

  • all the \(\varepsilon \)’s represent the random errors of the model. Specific assumptions on the error terms are formulated (Poole and O’Farrell 1971). In particular, errors are assumed to be normally distributed with zero mean and constant variance (homoscedasticity). These assumptions lead to the so-called ordinary least squares (OLS) regression model. They are the expression of both the advantages and the application limits of the OLS model.
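For the special case of Eq. (12) with a single regressor (k = 1), the OLS estimates have a well-known closed form, which the following sketch implements with the Python standard library only. The function name `ols_simple` and the toy data are illustrative assumptions; a multiple regression would require solving the normal equations with a linear algebra library.

```python
from statistics import fmean
from typing import Sequence, Tuple


def ols_simple(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (beta_0, beta_1) minimizing the sum of squared residuals."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    x_bar, y_bar = fmean(x), fmean(y)
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    if s_xx == 0:
        raise ValueError("the regressor must not be constant")
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta_1 = s_xy / s_xx              # slope
    beta_0 = y_bar - beta_1 * x_bar   # intercept
    return beta_0, beta_1


# Toy data lying exactly on the line y = 1 + 2x (no error term).
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]
beta_0, beta_1 = ols_simple(x, y)
```

With noise-free data the estimates recover the intercept and slope of the generating line; with real data they are the values that minimize the squared residuals.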

However, a fuzzy relationship between an output variable and the input variables could be assumed, especially when the phenomenon under study is imprecise. Thus, a hesitant fuzzy environment for the linear regression model could be specified to account for MCDM problems in a hesitant environment. This could represent an interesting alternative approach, suitable when input–output variables are observed as hesitant fuzzy elements (Sultan et al. 2021).

4.3 Random utility models

Describing and learning from individual choice behaviors have become increasingly important in the social sciences, for instance in microeconomics, finance, and marketing. In the presence of mutually exclusive discrete alternatives (i.e., binary, categorical, etc.), well-established random utility models (RUMs) (Marschak 1959) are employed; they are commonly referred to as discrete choice models. The latter provide an interesting extension of the classical theory of utility maximization to choices made among multiple discrete alternatives, with challenging empirical applications and statistical issues. For a complete review of discrete choice models applied to economic fields, refer to Train (2009), while Bayesian parametric and non-parametric extensions of RUMs can be found, for instance, in Carota and Nava (2021).

The microeconomic problem we are dealing with is based on the j-th (\(\forall \, j=1,\dots ,J\)) decision-maker, who selects the \(i^{th}\) choice among a finite set \(C=\{ 1,\dots ,I\}\) of mutually exclusive and exhaustive alternatives, i.e., \(Y_{ji}\), driven by random utility maximization, so that

$$\begin{aligned} \textsf{Pr}(Y_{ji}|C)= & {} \textsf{Pr}\left( {\textsf{U}}_{ji}=\max _{h=1,\ldots ,I}\,{\textsf{U}}_{jh}\Big |C\right) \,\,\,\text{ with } \nonumber \\ {\textsf{U}}_{ji}= & {} {\textbf{x}}_{ji}^{'}\varvec{\beta }+\varepsilon _{ji}. \end{aligned}$$
(13)

In the linear utility function \({\textsf{U}}_{ji}\), \({\textbf{x}}_{ji}\) represents the \(r\times 1\) vector of observed explanatory variables (for individual j and choice i), \(\varvec{\beta }\) is the \(r\times 1\) vector of fixed coefficients, and \(\varepsilon _{ji}\) is an error component. Both \({\textbf{x}}_{ji}\) and \(\varepsilon _{ji}\) can be individual specific for the \(j^{th}\) decision-maker and/or choice specific according to i, characterizing \({\textsf{U}}_{ji}\) in Eq. (13). In all cases, \({\textbf{x}}_{ji}'\varvec{\beta }\) represents the systematic part of the utility function, while \(\varepsilon _{ji}\) is the stochastic one.

The choice of a distribution for the error \(\varepsilon _{ji}\) leads to different econometric models. The advantages of a Gumbel specification of this error component are, for instance, that the difference between two Extreme Value Type I random variables follows a logistic distribution and that the Extreme Value Type I distribution is closed under maximization. The random component of the utility, modeled as a Gumbel distribution, stands for errors in the researcher's ability to represent all the elements that influence the utility of an alternative for the decision-maker. We are interested in i.i.d. Gumbel errors \(\varepsilon _{ji}\), which lead to the multinomial Logit model (MLM) (McFadden and Train 2000), i.e., to a special generalized linear model, where the choice probability in Eq. (13) becomes:

$$\begin{aligned} \textsf{Pr}(Y_{ji}|C)=\frac{\exp \{{\textbf{x}}_{ji}^{'}\varvec{\beta }\}}{\sum _{h=1}^{I}\exp \{{\textbf{x}}_{jh}^{'}\varvec{\beta }\}}. \end{aligned}$$
(14)
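The choice probabilities of Eq. (14) can be computed as in the following sketch. The function name `mnl_probabilities`, the attribute values, and the coefficient vector are hypothetical; subtracting the maximum utility before exponentiating is a standard numerical safeguard that leaves the ratio in Eq. (14) unchanged.

```python
import math
from typing import List, Sequence


def mnl_probabilities(x_j: Sequence[Sequence[float]],
                      beta: Sequence[float]) -> List[float]:
    """Multinomial Logit choice probabilities for one decision-maker j.

    x_j holds one attribute vector x_ji per alternative i; beta is shared.
    """
    utilities = [sum(xk * bk for xk, bk in zip(x_i, beta)) for x_i in x_j]
    u_max = max(utilities)  # shift for numerical stability; ratio is unchanged
    exp_u = [math.exp(u - u_max) for u in utilities]
    total = sum(exp_u)
    return [e / total for e in exp_u]


# Hypothetical data: three alternatives described by two attributes each.
x_j = [[1.0, 0.2], [0.5, 0.9], [0.0, 0.4]]
beta = [0.8, -0.5]
probs = mnl_probabilities(x_j, beta)
```

The alternative with the largest systematic utility \({\textbf{x}}_{ji}'\varvec{\beta }\) receives the largest probability, and the probabilities sum to one over the choice set C.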

While Logit and nested Logit have closed-form expressions for the choice probability, Probit and mixed Logit, which also fall within the class of RUMs, do not. The resulting integral is not in closed form and, thus, numerical simulations are required. In these two cases, the distribution of the error terms is assumed to be multivariate normal for the Probit and i.i.d. extreme value for the mixed Logit (see Train 2009 for further details).

Even if RUMs, and in particular Logit models, are the most widely used classification models in economics, many empirical studies are characterized by small samples and massive uncertain information. The application of RUMs is then more challenging, while HFS could create the right environment to depict the uncertain information required to better model the decision process. HFS, in fact, make it possible to account for complexity and uncertainty in the application of RUMs (see Song et al. 2022, for the Logit case).

Further fuzzy extensions of machine learning algorithms for binary (but not only binary) classification could involve the algorithms whose performance is compared with that of the Logit model via ROC curves in Guerzoni et al. (2021).

4.4 Support vector machine

A support vector machine (SVM), considered one of the classical machine learning techniques, is a computer algorithm that learns by example to assign labels to objects (Boser et al. 1992). It is grounded in statistical learning theory (Cortes and Vapnik 1995), and SVMs support multidomain (classification) applications in a Big Data context. The SVM is characterized by a balanced predictive performance, even in empirical applications with a small sample size. Given its simplicity and flexibility for classification issues, SVM has been widely applied to a variety of economic and financial issues (see, for instance, Trafalis and Ince 2000; Huang et al. 2005; Hua et al. 2007).

SVMs also belong to the class of supervised non-parametric learning techniques, according to the specification of the learning problem. Let us assume that there is an unknown and nonlinear mapping between a high-dimensional input vector x and a scalar output y, i.e., \(y = f(x)\). A distribution-free learning must then be performed, given that the underlying joint probability functions are unknown. The only available information is contained in a training data set (Kecman 2005).

However, SVM involves a high mathematical complexity and is computationally expensive (Suthaharan and Suthaharan 2016). SVMs rest technically on: (i) the separating hyperplane, (ii) the maximum-margin hyperplane, (iii) the soft margin, and (iv) the kernel function (Pisner and Schnyer 2020). SVM uses support vectors to define the margin of the hyperplane. The number of support vectors retained from the original data set depends on the information in the data. SVM uses different kernel functions to map the data into a higher-dimensional space. The most popular kernel functions are the linear, polynomial, radial, and sigmoid kernel functions (Khalil et al. 2022).
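The four kernel functions mentioned above can be written down in a few lines. The sketch below uses their usual textbook forms; the parameter names `gamma`, `coef0`, and `degree` and their default values follow a common software convention and are assumptions, not settings taken from the cited works.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def dot(u: Vector, v: Vector) -> float:
    """Inner product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))


def linear_kernel(u: Vector, v: Vector) -> float:
    return dot(u, v)


def polynomial_kernel(u: Vector, v: Vector, degree: int = 3,
                      gamma: float = 1.0, coef0: float = 1.0) -> float:
    return (gamma * dot(u, v) + coef0) ** degree


def rbf_kernel(u: Vector, v: Vector, gamma: float = 1.0) -> float:
    """Radial (Gaussian) kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)


def sigmoid_kernel(u: Vector, v: Vector, gamma: float = 1.0,
                   coef0: float = 0.0) -> float:
    return math.tanh(gamma * dot(u, v) + coef0)


u, v = [1.0, 2.0], [3.0, 4.0]
k_lin = linear_kernel(u, v)
k_rbf_self = rbf_kernel(u, u)
```

Each function returns the similarity of two observations in the implicit higher-dimensional space, without computing the mapping explicitly.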

The use of fuzzy sets within SVM has only recently been explored (Chen and Wang 2003). Even if Li et al. (2009) developed the fuzzy support vector machine, further extended with intuitionistic fuzzy numbers and kernel functions (Ha et al. 2013), to the best of our knowledge HFS have not yet been employed within SVM. In general, the use of fuzzy sets in this context makes it possible to combine (i) the ability of SVM to work in high-dimensional spaces and (ii) the high interpretability of fuzzy sets. The use of HFS therefore appears to be promising.

4.5 Multicollinearity analysis

Let us consider a multiple linear regression model as specified in Eq. (12). When applying a multivariate regression model, a multicollinearity issue may empirically arise. This happens when two or more independent variables are linearly dependent on one another. There are two forms of multicollinearity: strong and weak. The former is a real violation of the OLS assumptions, while the latter generates inferential and model interpretability issues. Multicollinearity can be considered an interdependency condition almost independent of the relation between X and y. It is in general the effect and the symptom of a poor experimental design (Alin 2010).

The four main symptoms of weak multicollinearity are: (i) large standard errors; (ii) differences in the signs of variable coefficients, with misleading explanations; (iii) high correlations between predictor variables and outcomes; and (iv) large correlation coefficients in relation to explanatory power (Lafi and Kaneene 1992). Given that (weak) multicollinearity could be considered an empirical issue, collecting more data might reduce its effects, but this may not always be feasible, especially in research based on convenience sampling (Schroeder et al. 1990).

Commonly, there are four ways to detect multicollinearity:

  1. the pairwise correlation across independent variables, with a correlation of 0.8 or 0.9 taken as the cut-off indicating a high correlation between two regressors. However, a high correlation does not necessarily imply multicollinearity;

  2. the Variance Inflation Factor (VIF) or Tolerance (TOL) (Neter et al. 2004). VIF is computed as follows:

    $$\begin{aligned}VIF_j=\frac{1}{1-R^2_j},\end{aligned}$$

    where \(R^2_j\) denotes the coefficient of determination for the regression of \(X_j\) on the remaining independent variables. The TOL is the reciprocal of VIF. A value of VIF \(\ge \) 10 indicates multicollinearity;

  3. the eigenvalues from a principal component analysis (PCA). A smaller eigenvalue is the symptom of a higher multicollinearity probability;

  4. the Condition index (CI), based on the eigenvalues, which is the square root of the ratio between the maximum eigenvalue and each eigenvalue. The rule of thumb suggests that a CI between 10 and 30 is associated with moderate multicollinearity, while a CI above 30 is associated with severe multicollinearity.
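Two of the diagnostics listed above can be illustrated with a short sketch. It is restricted, by assumption, to the case of only two regressors, where \(R^2_j\) coincides with the squared pairwise correlation, and it takes the eigenvalues for the condition index as given. The function names and the toy data are illustrative.

```python
import math
from statistics import fmean
from typing import List, Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Sample Pearson correlation between two series of equal length."""
    x_bar, y_bar = fmean(x), fmean(y)
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    s_xx = sum((a - x_bar) ** 2 for a in x)
    s_yy = sum((b - y_bar) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


def vif_two_regressors(x1: Sequence[float], x2: Sequence[float]) -> float:
    """VIF = 1 / (1 - R^2); with two regressors R^2 is the squared correlation."""
    r_squared = pearson_r(x1, x2) ** 2
    if r_squared >= 1.0:
        return math.inf  # perfect (strong) multicollinearity
    return 1.0 / (1.0 - r_squared)


def condition_indices(eigenvalues: Sequence[float]) -> List[float]:
    """Square root of the largest eigenvalue over each eigenvalue."""
    if any(lam <= 0 for lam in eigenvalues):
        raise ValueError("eigenvalues must be strictly positive")
    lam_max = max(eigenvalues)
    return [math.sqrt(lam_max / lam) for lam in eigenvalues]


# Toy regressors constructed to be uncorrelated, so no variance inflation.
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [1.0, -1.0, -1.0, 1.0]
vif = vif_two_regressors(x1, x2)

# Hypothetical eigenvalues of a correlation matrix of two regressors.
ci = condition_indices([1.9, 0.1])
```

The thresholds quoted in the list (VIF of at least 10, CI above 10 or 30) would then be applied to the returned values.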

However, multicollinearity is not only an econometric issue but also an artificial intelligence and machine learning one (Schroeder et al. 1990). Indeed, the collection of big data has been accelerated by technologies in various industries, including genomics and business intelligence; as a result, the number of variables and data points gathered and stored increases significantly. Although this offers the opportunity to model the link between predictors and response variables more accurately, multicollinearity is one of the main issues it creates when evaluating the data.

Financial empirical studies may be affected by this issue, especially because financial forecasting generally takes into account a number of different factors, such as macroeconomic and microeconomic factors, earnings reports, and technological indicators. Multicollinearity may arise because economic events can change the dependencies among variables. Since stock market data are extremely time-variant owing to speculative occurrences, and since forecasting aims at maximizing profit, it is challenging to lower forecast inaccuracy (Iba and Sasaki 1999).

5 Conclusions

This paper represents the first step of ongoing research that aims to connect tools, methods, and approaches such as artificial intelligence, optimization modeling, machine learning techniques, and multi-criteria decision analysis (MCDA), with a view to developing new decision support system (DSS) platforms that help decision-makers with a robust and efficient approach. This work brings into this context of analysis AHPSort II to model the hierarchical structure, FAHP to determine the weights in the construction of the pairwise comparison matrix, and hesitant fuzzy sets (HFS) to better represent the preferences of the decision-makers. All these tools were considered within a machine learning approach, in an attempt to create a new basis that opens up interesting research scenarios, lines, and perspectives. The role of information, and in particular of fake news and disinformation, is a new issue included in this work to reinforce our initial idea of developing a DSS that provides an efficient platform capable of facing the challenges of a future increasingly characterized by complexity and uncertainty.