1 Introduction

Due to the development of society and the increasing complexity of decision-making problems, many companies and organizations involve multiple members in decision-making processes, which is known as group decision-making (GDM). GDM aims to reconcile differences in the preferences articulated by multiple decision makers or experts in order to find an alternative or a subset of alternatives with acceptable group agreement [1, 2]. In recent years, with the rapid development of information technology, the number of participants who can take part in a decision-making activity has increased drastically, and large-scale group decision-making (LSGDM) has emerged as a new type of GDM [3,4,5,6,7,8,9,10,11,12,13,14,15]. Generally, a GDM problem is considered an LSGDM problem when the number of experts engaged in the decision process is no fewer than 20 [16]. Compared with traditional GDM problems, LSGDM problems bring new challenges such as dimension reduction, decision information aggregation, behavior management, cost management, knowledge distribution and information increase [10].

Current studies on LSGDM mainly focus on two aspects: (1) preference clustering and (2) the consensus-reaching process (CRP). Clustering is one of the most important methods for reducing the dimension of participants in LSGDM problems. By utilizing clustering analysis, a large number of individuals are grouped into small subgroups. Many clustering methods have been proposed in recent years to solve LSGDM problems [5, 7, 17,18,19,20,21]. For example, Palomares et al. [17] proposed a clustering algorithm for LSGDM problems using the fuzzy c-means clustering method. Based on an adapted fuzzy c-means, Rodríguez et al. [18] proposed a clustering approach. Wu et al. [5] introduced a changeable clustering method based on the commonly used k-means clustering method. Xu et al. [19] utilized self-organizing maps to classify the large number of individuals into discrete subgroups. By considering experts’ preferences and concerns, Zhang et al. [20] proposed a novel clustering method for LSGDM problems. By considering opinion similarity and trust relationships simultaneously, Du et al. [21] developed a trust-similarity analysis (TSA)-based clustering method to handle the clustering operation in LSGDM problems under a social network environment. In GDM, a group of experts may initially hold very different opinions because of their different attitudes, motivations, and perceptions. Therefore, it is necessary to develop a CRP to help experts reach agreement. In LSGDM, the CRP becomes even more necessary and complicated because opinions among a large number of experts tend to be easily polarized and conflicting [7]. CRP in LSGDM has received increasing attention recently. For example, Quesada et al. [22] developed a weighting method for CRPs of LSGDM, which used uninorm aggregation operators to compute the experts’ weights by considering their behaviors. Rodríguez et al. [18] introduced an LSGDM consensus model, which used hesitant fuzzy sets to fuse subgroups’ opinions so as to retain as much information as possible. Wu et al. [5] developed a consensus model for LSGDM with hesitant fuzzy information, in which the clusters were allowed to change. Xu et al. [19] proposed a consensus model for multi-attribute LSGDM to manage experts’ minority opinions and non-cooperative behaviors. In the past few years, some CRP models have been proposed to deal with LSGDM problems in a linguistic environment [13, 63, 64].

Recently, GDM research from the perspective of social networks has attracted much attention [23,24,25,26,27,28,29]. To resolve LSGDM in the social network environment, a variety of decision-making models based on social networks have been proposed. For example, Liu et al. [30] developed a TR-CDE model for handling LSGDM problems in which intra-group social relationships exist among decision makers. Ding et al. [31] introduced a social network analysis-based conflict relationship investigation process and a conflict degree-based CRP for LSGDM problems. By considering trust as a social behavioral factor, Wu et al. [8] proposed a two-stage trust network partition algorithm to reduce the complexity of LSGDM problems. Tian et al. [32] developed a social network-based decision framework for handling LSGDM problems with incomplete interval type-2 fuzzy information. Ren et al. [33] developed a CRP model to manage minority opinions for LSGDM with social network analysis for micro-grid planning. Chu et al. [34] developed a community detection approach for social networks based on the fuzzy clustering method. Lu et al. [11] proposed a CRP model based on robust optimization to deal with the uncertain unit adjustment cost in LSGDM problems. Li et al. [12] proposed a framework based on a WeChat-like interaction network to deal with manipulative and non-cooperative behaviors in LSGDM problems. It is worth noting that the social network information in these LSGDM models is not derived from the experts’ preference information about the alternatives. In these works, information about the social network is usually assumed to be known in advance or determined by other methods that require additional input information. In addition to the works mentioned above, some new LSGDM models based on social network analysis have been proposed recently [32, 50, 62], which will be reviewed later.

To deal with LSGDM problems, we have to consider the inherent uncertainty and vagueness of decision information. In many GDM problems, individual experts tend to convey their opinions using linguistic information instead of numerical values [35]. Computing with words (CWW) is necessary for managing GDM in a linguistic information environment [36]. In the literature, there are two classic types of CWW models: the semantic model [37, 38] and the symbolic model [39, 40]. These linguistic computation models have proved effective in dealing with linguistic decision problems in which a single linguistic term is used for preference expression. However, in many practical linguistic GDM problems, decision makers may be reluctant to use a single linguistic term to convey their preferences because of time pressure, lack of knowledge and their own inherent vagueness. Consequently, in some complex GDM problems, decision makers prefer to use more flexible linguistic expressions rather than a single linguistic term to articulate their preferences [41]. Rodríguez et al. [42] adopted a context-free grammar-based approach to elicit comparative linguistic expressions (CLEs). To facilitate the elicitation of flexible linguistic expressions in linguistic GDM problems, decision makers can use CLEs, which are close to the natural way of human thinking and reasoning. In the modeling of GDM problems with CLEs, CLEs are often transformed into hesitant fuzzy linguistic term sets (HFLTSs) [42]. It should be emphasized that HFLTSs implicitly assume that the linguistic terms in an HFLTS are equally likely to occur. However, in some real decision-making situations, participants might believe that some linguistic terms are more likely to reflect their preferences than others. To improve the flexibility of preference expression in this case, linguistic distributions have become a popular tool in decision-making [43,44,45,46,47,48,49].

Although the existing studies have made significant contributions to LSGDM problems, we observe that the following research issues still need to be addressed, which motivates this study:

  • In the existing consensus models for LSGDM in a social network environment, information about the social network is usually handled as extra input information for decision analysis. Nevertheless, it is not easy to obtain information about the social relations among decision makers in practice. Another possible way is to establish a more objective social network by mining the preference information of decision makers [50]. However, research in this direction is still insufficient.

  • Although some CRP models for conventional GDM problems can be extended to LSGDM problems, such a simple extension cannot effectively solve the problem of consensus reaching in LSGDM. In fact, the efficiency of consensus building is an important criterion for evaluating the performance of a CRP model in LSGDM. Consequently, it is necessary to develop CRP models with higher efficiency to tackle the new challenges brought by LSGDM.

  • As mentioned earlier, in complex GDM situations, decision makers may prefer to use more flexible linguistic expressions rather than a single linguistic term to convey their opinions. However, a more detailed survey of the literature shows that consensus modeling of LSGDM problems with flexible linguistic expressions has not been adequately considered.

Based on the above analysis, this paper aims to develop an influence-driven consensus model for LSGDM problems with CLEs. The main contributions of this work are summarized as follows:

(1) Develop a novel preference clustering method for LSGDM problems with CLEs. We propose a modified agglomerative hierarchical clustering algorithm to detect subgroups in the group. Moreover, the informative output of the clustering algorithm provides a way to measure the group consensus level. Specifically, cluster consensus is determined based on internal and external consistency, while the group consensus level is derived by identifying the highest-level consensus at the optimal clustering level.

(2) Construct the influence network among experts by mining the preference information about alternatives provided by experts. We propose to build an influence network through determining inter-cluster and intra-cluster influence relationships. It should be emphasized that the influence relationships are determined based on the preference information of experts. Thus, different from most social network group decision-making models, the proposed model does not require input of social relations among participants.

(3) Propose an influence-driven CRP model for LSGDM problems with CLEs. When the group consensus level is unacceptable, we propose a two-stage feedback mechanism to achieve agreement. The mechanism adopts a group adjustment strategy or an individual adjustment strategy depending on the group consensus level, which not only improves the efficiency of consensus reaching but also takes individual preference adjustment into account.

The remainder of this paper is organized as follows. Section 2 introduces some basic concepts and operators. Then, Sect. 3 presents a modified agglomerative hierarchical clustering algorithm for the considered LSGDM problems. Section 4 puts forward the method of constructing the influence network based on the preferences provided by experts. Section 5 designs the influence network-based consensus model for LSGDM with linguistic information. The feasibility and effectiveness of the proposed method are verified in Sect. 6 through a case study on the location decision of an intelligent environmental protection project. Section 7 presents some discussions to illustrate the advantages and weaknesses of the proposal. Finally, we conclude the paper in the last section.

2 Preliminaries

This section reviews some relevant basic knowledge regarding comparative linguistic expressions (CLEs), linguistic distribution assessments (LDAs) and some related preference aggregation operators.

Let \(X = \left\{ {x_{1} ,x_{2} , \ldots ,x_{n} } \right\}\) be a set of alternatives and \(E = \left\{ {e_{1} , \ldots ,e_{m} } \right\}\) be a set of \(m\) decision makers (DMs) or experts in a group. Generally speaking, when the number of DMs \(m \ge 20\), the group can be regarded as a large-scale group. In the proposed model, it is supposed that each DM \(e_{k} \left( {k = 1,2, \ldots ,m} \right)\) provides her/his preferences on \(X\) using CLEs to increase the flexibility in eliciting linguistic judgements. In [42], Rodríguez et al. proposed a context-free grammar \(G_{H} = \left( {V_{N} ,V_{T} ,I,P} \right)\) and extended it to produce CLEs. Moreover, CLEs can be transformed into HFLTSs by utilizing a transformation function \(E_{{G_{H} }}\). By pairwise comparison, a DM’s preferences on \(X\) can be expressed by a preference relation whose entries are CLEs. For the convenience of modeling, preference relations with CLEs are transformed into distribution linguistic preference relations in the model. In the following, let \(D^{k}\) represent the transformed distribution linguistic preference relation of DM \(e_{k} \left( {k = 1,2, \ldots ,m} \right)\). To reduce information loss, LDA is also used to represent the aggregation results of the linguistic preference information of large-scale group members. In the following, the concept of LDA proposed by Zhang et al. is introduced.

Definition 1

(Linguistic distribution assessment, LDA) [43]. Let \(S = \left\{ {s_{0} ,s_{1} , \ldots ,s_{g} } \right\}\) be a predefined linguistic term set. Suppose that \(d = \left\{ {\left( {s_{k} ,\alpha_{k} } \right),k = 0,1, \ldots ,g} \right\}\) satisfies \(s_{k} \in S\), \(\alpha_{k} \ge 0,\;\mathop \sum \limits_{k = 0}^{g} \alpha_{k} = 1\), where \(\alpha_{k}\) denotes the symbolic proportion associated with the linguistic term \(s_{k}\). Then \(d\) is called a linguistic distribution assessment on the linguistic term set \(S\).

Let \({ }d_{1} = \left\{ {\left( {s_{k} ,\alpha_{k}^{1} } \right),k = 0,1, \ldots ,g} \right\}\) and \(d_{2} = \left\{ {\left( {s_{k} ,\alpha_{k}^{2} } \right),k = 0,1, \ldots ,g} \right\}{ }\) be two given LDAs on linguistic term set S. The distance measure between \(d_{1}\) and \(d_{2} { }\) is defined as [51]:

$$ {\text{dis}}\left( {d_{1} ,d_{2} } \right) = \frac{1}{g}\mathop \sum \limits_{k = 0}^{g} \left| {\mathop \sum \limits_{r = 0}^{k} \alpha_{r}^{1} - \mathop \sum \limits_{r = 0}^{k} \alpha_{r}^{2} } \right|. $$
(1)

Further, the similarity measure between \(d_{1}\) and \(d_{2}\) can be expressed as:

$$ {\text{sim}}\left( {d_{1} ,d_{2} } \right) = 1 - {\text{dis}}\left( {d_{1} ,d_{2} } \right). $$
(2)

Note that the distance measure and the similarity measure satisfy \(0 \le {\text{dis}}\left( {d_{1} ,d_{2} } \right) \le 1\) and \(0 \le {\text{sim}}\left( {d_{1} ,d_{2} } \right) \le 1\).
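To make Eqs. (1) and (2) concrete, the following minimal sketch computes the distance and similarity between two LDAs. It assumes, purely for illustration, that an LDA is stored as a list of symbolic proportions \([\alpha_{0} , \ldots ,\alpha_{g}]\); the function names and the sample LDAs are hypothetical and not part of the original model.

```python
from itertools import accumulate


def lda_distance(d1, d2):
    """Eq. (1): distance between two LDAs given as proportion lists [alpha_0, ..., alpha_g]."""
    if len(d1) != len(d2) or len(d1) < 2:
        raise ValueError("both LDAs must use the same term set with g >= 1")
    g = len(d1) - 1
    # Compare the cumulative proportions of the two LDAs term by term.
    gaps = (abs(a - b) for a, b in zip(accumulate(d1), accumulate(d2)))
    return sum(gaps) / g


def lda_similarity(d1, d2):
    """Eq. (2): similarity is one minus the distance."""
    return 1.0 - lda_distance(d1, d2)


# Hypothetical LDAs on a five-term set S = {s_0, ..., s_4}.
d_low = [1.0, 0.0, 0.0, 0.0, 0.0]   # all proportion on s_0
d_high = [0.0, 0.0, 0.0, 0.0, 1.0]  # all proportion on s_4
d_mid = [0.0, 0.5, 0.5, 0.0, 0.0]   # split between s_1 and s_2

print(lda_distance(d_low, d_high))
print(lda_distance(d_low, d_mid))
print(lda_similarity(d_mid, d_mid))
```

In this example the two extreme assessments are at the maximum distance 1, while `d_low` and `d_mid` are at distance 1.5/4 = 0.375, which illustrates that the measure accounts for how far apart the supporting terms lie on the scale.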

Definition 2

(Distribution linguistic preference relation, DLPR) [43] Given a linguistic term set \(S = \left\{ {s_{0} ,s_{1} , \ldots ,s_{g} } \right\}\) and a set of alternatives \(X = \left\{ {x_{1} , \ldots ,x_{n} } \right\}\), a DLPR on \(X\) is defined as \(D \subset X \times X\), \(D = \left( {d_{ij} } \right)_{n \times n}\), where the LDA \(d_{ij} = \left\{ {\left( {s_{k} ,\alpha_{k,ij} } \right),k = 0,1, \ldots ,g} \right\}\) indicates the preference degree of the alternative \(x_{i}\) over \(x_{j}\).

According to Eq. (1), the distance between DLPRs \(D^{h} = \left( {d_{ij}^{h} } \right)_{n \times n}\) and \(D^{k} = \left( {d_{ij}^{k} } \right)_{n \times n}\) of DMs \(e_{h}\) and \(e_{k}\) is defined as:

$$ {\text{dis}}\left( {D^{h} ,D^{k} } \right) = \frac{2}{{n\left( {n - 1} \right)}}\mathop \sum \limits_{i = 1}^{n - 1} \mathop \sum \limits_{j = i + 1}^{n} {\text{dis}}\left( {d_{ij}^{h} ,d_{ij}^{k} } \right). $$
(3)

Accordingly, the similarity of the DLPRs \(D^{h}\) and \(D^{k}\) is computed as:

$$ {\text{sim}}_{hk} = 1 - {\text{dis}}\left( {D^{h} ,D^{k} } \right). $$
(4)

Obviously, \(0 \le {\text{dis}}\left( {D^{h} ,D^{k} } \right) \le 1\) and \(0 \le {\text{sim}}_{hk} \le 1\).
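Eqs. (3) and (4) average the LDA distances over the upper triangle of the two preference relations. A minimal sketch is given below, assuming (for illustration only) that a DLPR is stored as an \(n \times n\) nested list whose upper-triangle entries are proportion lists; the unused entries are set to `None`, and all names and sample values are hypothetical.

```python
from itertools import accumulate


def lda_distance(d1, d2):
    """Eq. (1) on proportion lists [alpha_0, ..., alpha_g]."""
    g = len(d1) - 1
    return sum(abs(a - b) for a, b in zip(accumulate(d1), accumulate(d2))) / g


def dlpr_distance(D_h, D_k):
    """Eq. (3): mean LDA distance over the entries with i < j."""
    n = len(D_h)
    total = sum(
        lda_distance(D_h[i][j], D_k[i][j])
        for i in range(n - 1)
        for j in range(i + 1, n)
    )
    return 2.0 * total / (n * (n - 1))


def dlpr_similarity(D_h, D_k):
    """Eq. (4)."""
    return 1.0 - dlpr_distance(D_h, D_k)


# Hypothetical DLPRs for n = 3 alternatives on S = {s_0, s_1, s_2}.
N = None  # entries that Eq. (3) does not use
D_h = [[N, [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
       [N, N, [0.0, 0.0, 1.0]],
       [N, N, N]]
D_k = [[N, [0.0, 0.0, 1.0], [0.0, 1.0, 0.0]],
       [N, N, [0.0, 0.0, 1.0]],
       [N, N, N]]

print(dlpr_distance(D_h, D_k))
print(dlpr_similarity(D_h, D_k))
```

The two experts here disagree completely on one of the three pairwise comparisons and agree on the other two, so the distance is 1/3 and the similarity is 2/3.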

For completeness, the weighted averaging operator of distribution assessments (DAWA) and the induced ordered weighted averaging (IOWA) operator are recalled below.

Definition 3

(DAWA operator) [43] Let \(S = \left\{ {s_{0} ,s_{1} , \ldots ,s_{g} } \right\}\) be a linguistic term set, and \(d = \left\{ {d_{1} ,d_{2} , \ldots ,d_{q} } \right\}\) be a set of linguistic distribution assessments on \(S\), where \(d_{i} = \left\{ {\left( {s_{k} ,\alpha_{k}^{i} } \right),k = 0,1, \ldots ,g} \right\}\). Let \(w = \left( {w_{1} , \ldots ,w_{q} } \right)^{{\text{T}}}\) be an associated weighting vector that satisfies \(w_{i} > 0\) and \(\sum\nolimits_{i = 1}^{q} {w_{i} = 1}\). The weighted averaging operator of linguistic distribution assessments is computed as:

$$ {\text{DAWA}}\left( {d_{1} , \ldots ,d_{q} } \right) = \left\{ {\left( {s_{k} ,\mathop \sum \limits_{i = 1}^{q} w_{i} \alpha_{k}^{i} } \right),k = 0, \ldots ,g} \right\}. $$
(5)
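As an illustration of Eq. (5), the sketch below aggregates LDAs term by term with a weighting vector. The list-based representation of LDAs, the function name `dawa` and the sample inputs are assumptions made for this example only.

```python
def dawa(ldas, weights):
    """Eq. (5): weighted average of the symbolic proportions, term by term."""
    if len(ldas) != len(weights):
        raise ValueError("one weight per LDA is required")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    n_terms = len(ldas[0])
    return [
        sum(w * d[k] for w, d in zip(weights, ldas))
        for k in range(n_terms)
    ]


# Hypothetical input: two LDAs on S = {s_0, s_1, s_2} with weights 0.6 and 0.4.
ldas = [[1.0, 0.0, 0.0], [0.0, 0.5, 0.5]]
weights = [0.6, 0.4]

aggregated = dawa(ldas, weights)
print(aggregated)
```

Because each input is an LDA and the weights sum to 1, the aggregated proportions again sum to 1, so the result is itself an LDA on \(S\).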

Definition 4

(IOWA operator) [52] An IOWA operator of dimension \(q\) is a function \(\emptyset_{W} :\left( {{\mathbb{R}} \times {\mathbb{R}}} \right)^{q} \to {\mathbb{R}}\), to which a set of weights \(W = \left( {w_{1} , \ldots ,w_{q} } \right)\) is associated, with \(w_{i} \in \left[ {0,1} \right]\) and \(\sum\nolimits_{i = 1}^{q} {w_{i} = 1}\), and which is defined to aggregate a list of \(q\) two-tuples \(\left\{ {\left\langle {u_{1} ,p_{1} } \right\rangle , \ldots ,\left\langle {u_{q} ,p_{q} } \right\rangle } \right\}\) according to the following expression:

$$ \emptyset_{W} \left( {\left\{ {\left\langle {u_{1} ,p_{1} } \right\rangle , \ldots ,\left\langle {u_{q} ,p_{q} } \right\rangle } \right\}} \right) = \mathop \sum \limits_{i = 1}^{q} w_{i} \cdot p_{\sigma \left( i \right)} , $$
(6)

where \(\left\{ {\sigma \left( 1 \right), \ldots ,\sigma \left( q \right)} \right\}\) is a permutation of \(\left\{ {1, \ldots ,q} \right\}\) such that \(u_{\sigma \left( i \right)} \ge u_{{\sigma \left( {i + 1} \right)}} ,\forall i = 1, \ldots ,q - 1\).
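The reordering step in Eq. (6) can be sketched as follows: the arguments \(p_{i}\) are sorted by decreasing order-inducing value \(u_{i}\) and then combined with the weights. The function name, the tuple representation and the sample values are hypothetical; how ties among the \(u_{i}\) are broken is not specified above, so the stable ordering used here is an assumption.

```python
def iowa(pairs, weights):
    """Eq. (6): pairs is a list of (u_i, p_i); weights is (w_1, ..., w_q)."""
    if len(pairs) != len(weights):
        raise ValueError("one weight per two-tuple is required")
    # Order the two-tuples by decreasing u. Python's sort is stable, so tied
    # u values keep their input order (an assumption, not part of Eq. (6)).
    ordered = sorted(pairs, key=lambda up: up[0], reverse=True)
    return sum(w * p for w, (_, p) in zip(weights, ordered))


# Hypothetical two-tuples <u_i, p_i> and weights.
pairs = [(0.2, 10.0), (0.9, 30.0), (0.5, 20.0)]
weights = [0.5, 0.3, 0.2]

result = iowa(pairs, weights)
print(result)
```

Here the induced order of the arguments is 30, 20, 10, so the aggregated value is 0.5·30 + 0.3·20 + 0.2·10 = 23.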

3 Clustering Process and Consensus Measure

One of the main challenges in LSGDM problems is to handle the opinions of a large number of participants simultaneously. To deal with this problem, clustering techniques are often used to classify the initial group of participants into smaller and more manageable subgroups or clusters. In the proposed model, the traditional agglomerative hierarchical clustering (AHC) algorithm is adapted to cluster experts in LSGDM problems with preferences represented by DLPRs. An important benefit of adopting an AHC-based clustering process is that its informative output provides an effective way to measure the group consensus level in the considered problem. Specifically, the group consensus level is defined as the highest cluster consensus index at the optimal clustering level.

As a well-established clustering technique, the AHC algorithm does not need initial partitions to be predetermined. In addition, the AHC algorithm can output very informative descriptions of the potential clustering structure of the data [53]. Because of these advantages, the AHC algorithm has also been applied to analyze GDM problems [54, 55]. For example, guided by the idea of AHC, Chen et al. [55] introduced a novel aggregation method for HFLTS possibility distributions to achieve representative outcomes in terms of the summarization of central tendency. In their work, the AHC algorithm is adapted to develop a possibilistic 2-tuple linguistic pair (P2TLP) clustering framework. Since the experts’ preferences are represented by DLPRs in the considered LSGDM problems, we have to modify the AHC model to make it suitable for these problems. The modification only requires adjusting the distance measure in the algorithm. In detail, the proposed model uses the distance measure between LDAs to calculate the distance between DLPRs, and then develops a clustering procedure using the basic idea of the traditional AHC method. The AHC algorithm based on the distance measure between DLPRs is described in Algorithm 1.

[Figure a: Algorithm 1, the AHC algorithm based on the distance measure between DLPRs]

The hierarchical clustering sequence output by Algorithm 1 can be displayed by a graphic known as dendrogram. Based on the dendrogram, a particular partition of the experts is determined by cutting it horizontally at a particular level. Setting a cutting level is equivalent to choosing the number of clusters. Consequently, the proposed AHC algorithm for LSGDM with DLPRs provides flexibility in determining the number of clusters, which is controlled by the horizontal cutting level.
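The steps of Algorithm 1 are not spelled out in the text here, so the following is only a generic sketch of agglomerative clustering over a precomputed matrix of DLPR distances (Eq. (3)), not the authors' exact procedure. In particular, the average-linkage merge rule, the function name `ahc_partitions` and the sample distance matrix are assumptions made for illustration. The function records the partition obtained for every number of clusters, which corresponds to cutting the dendrogram at each level.

```python
def ahc_partitions(dist):
    """Return {l: partition with l clusters} for l = m, m-1, ..., 1.

    dist is a symmetric m x m matrix of pairwise DLPR distances.
    Average linkage is assumed; the source does not state the linkage rule.
    """
    m = len(dist)
    clusters = [[i] for i in range(m)]
    partitions = {m: [list(c) for c in clusters]}
    while len(clusters) > 1:
        best = None
        # Find the pair of clusters with the smallest average distance.
        for a in range(len(clusters) - 1):
            for b in range(a + 1, len(clusters)):
                link = sum(
                    dist[i][j] for i in clusters[a] for j in clusters[b]
                ) / (len(clusters[a]) * len(clusters[b]))
                if best is None or link < best[0]:
                    best = (link, a, b)
        _, a, b = best
        merged = clusters[a] + clusters[b]
        clusters = [
            c for idx, c in enumerate(clusters) if idx not in (a, b)
        ] + [merged]
        partitions[len(clusters)] = [sorted(c) for c in clusters]
    return partitions


# Hypothetical distances among four experts (indices 0..3).
dist = [
    [0.0, 0.1, 0.8, 0.9],
    [0.1, 0.0, 0.7, 0.8],
    [0.8, 0.7, 0.0, 0.2],
    [0.9, 0.8, 0.2, 0.0],
]

partitions = ahc_partitions(dist)
for level in sorted(partitions, reverse=True):
    print(level, partitions[level])
```

Choosing a cutting level then amounts to picking one key of the returned dictionary; for this toy input, the two-cluster level groups experts 0 and 1 together and experts 2 and 3 together.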

Another important issue related to choosing the number of clusters is the group consensus measure. In our context, we observe that partitions of experts at different cutting levels result in different group consensus levels. Considering that the purpose of the CRP is to achieve a satisfactory group consensus level, it seems reasonable to choose the clustering cutting level that maximizes the consensus level of the group [54]. In the following, we discuss how to measure consensus based on the AHC algorithm for LSGDM with DLPRs.

Let \(L = \left\{ {\alpha {|}\alpha = 2,3, \ldots ,m - 1} \right\}\) be the set of clustering levels \(\alpha\) of the hierarchical clustering output by Algorithm 1. Let \(C_{l} = \left\{ {C_{lk} {|}k = 1,2, \ldots ,l} \right\}\) denote the set of clusters at cutting level \(\alpha = l\). To measure the group consensus, the cluster internal cohesion index \(\delta_{{{\text{int}}}}\) and the cluster external cohesion index \(\delta_{{{\text{ext}}}}\) are introduced. The cluster internal cohesion \(\delta_{{{\text{int}}}}\) is an index that measures the degree of similarity within a cluster. The \(\delta_{{{\text{int}}}}\) of the cluster \(C_{lk}\), \(\delta_{{{\text{int}}}} \left( {C_{lk} } \right)\), is computed as:

$$ \delta_{{{\text{int}}}} \left( {C_{lk} } \right) = \frac{{\mathop \sum \nolimits_{{i \in C_{lk} }} \mathop \sum \nolimits_{{j \in C_{lk} }} {\text{sim}}_{ij} }}{{\left( {\# C_{lk} } \right)^{2} }}. $$
(7)

The cluster external cohesion \(\delta_{{{\text{ext}}}}\) is an index that measures the degree of agreement between one cluster and all other clusters. The \(\delta_{{{\text{ext}}}}\) of the cluster \(C_{lk}\), \(\delta_{{{\text{ext}}}} \left( {C_{lk} } \right)\), is computed as:

$$ \delta_{{{\text{ext}}}} \left( {C_{lk} } \right) = \frac{{\mathop \sum \nolimits_{{i \in C_{lk} }} \mathop \sum \nolimits_{{j \notin C_{lk} }} {\text{sim}}_{ij} }}{{\# C_{lk} \left( {m - \# C_{lk} } \right)}}. $$
(8)

Based on the cluster internal cohesion index \(\delta_{{{\text{int}}}}\) and the cluster external cohesion index \(\delta_{{{\text{ext}}}}\), we introduce consensus contribution degree to measure the contribution of a cluster to the large-scale group consensus. The consensus contribution degree of the cluster \(C_{lk}\), \(\delta_{{{\text{cc}}}} \left( {C_{lk} } \right),\) is computed as:

$$ \delta_{{{\text{cc}}}} \left( {C_{lk} } \right) = \frac{{\# C_{lk} \cdot \delta_{{{\text{int}}}} \left( {C_{lk} } \right)}}{m} + \frac{{\left( {m - \# C_{lk} } \right) \cdot \delta_{{{\text{ext}}}} \left( {C_{lk} } \right)}}{m}. $$
(9)

At the level of \(\alpha = l\), the group consensus level \({\text{GCL}}\left( l \right)\) of the large-scale group is computed as:

$$ {\text{GCL}}\left( l \right) = \frac{1}{l}\mathop \sum \limits_{k = 1}^{l} \delta_{{{\text{cc}}}} \left( {C_{lk} } \right). $$
(10)

Consequently, the optimal \(\alpha\) level can be determined as

$$ r = {\text{argmax}}_{l} \left( {{\text{GCL}}\left( l \right)} \right).{ } $$
(11)

It can be seen that the optimal \(\alpha\) level of clustering is the \(\alpha\) level that maximizes the group consensus level, and the clustering result corresponding to the optimal \(\alpha\) level is the optimal clustering. Obviously, the optimal \(\alpha\) level and the optimal clustering result are dynamic in the CRP.
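The consensus measures in Eqs. (7)–(11) can be sketched as follows, assuming a precomputed similarity matrix \(\left( {{\text{sim}}_{ij} } \right)_{m \times m}\) with unit diagonal and candidate partitions indexed by their number of clusters. The function names, the dictionary layout and the sample data are hypothetical choices for this illustration.

```python
def delta_int(cluster, sim):
    """Eq. (7): mean pairwise similarity inside the cluster (i = j included)."""
    size = len(cluster)
    return sum(sim[i][j] for i in cluster for j in cluster) / size ** 2


def delta_ext(cluster, sim):
    """Eq. (8): mean similarity between cluster members and all outsiders."""
    members = set(cluster)
    outside = [j for j in range(len(sim)) if j not in members]
    if not outside:
        raise ValueError("Eq. (8) is undefined when the cluster is the whole group")
    total = sum(sim[i][j] for i in cluster for j in outside)
    return total / (len(cluster) * len(outside))


def delta_cc(cluster, sim):
    """Eq. (9): size-weighted mix of internal and external cohesion."""
    m, size = len(sim), len(cluster)
    return (size * delta_int(cluster, sim)
            + (m - size) * delta_ext(cluster, sim)) / m


def gcl(partition, sim):
    """Eq. (10): average consensus contribution over the l clusters."""
    return sum(delta_cc(c, sim) for c in partition) / len(partition)


def optimal_level(partitions, sim):
    """Eq. (11): the level l that maximizes GCL(l)."""
    return max(partitions, key=lambda l: gcl(partitions[l], sim))


# Hypothetical similarities among four experts and two candidate levels.
sim = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.2],
    [0.2, 0.3, 1.0, 0.8],
    [0.1, 0.2, 0.8, 1.0],
]
partitions = {2: [[0, 1], [2, 3]], 3: [[0, 1], [2], [3]]}

for level, partition in partitions.items():
    print(level, gcl(partition, sim))
print(optimal_level(partitions, sim))
```

For this toy input, GCL(2) = 0.5625 is slightly larger than GCL(3), so the two-cluster level is selected. In a CRP the similarities change after each feedback round, which is why the optimal level has to be recomputed.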

4 Constructing the Influence Network

As already stated, in the existing models for GDM in a social network environment, information about the social network is usually regarded as extra input information for decision analysis. However, in practice it is hard to obtain information about social relationships beyond the participants’ opinions. Ureña et al. [50] provided another possible way to obtain social network relationships in social network GDM. In their work, they developed a mechanism to infer the relationships among participants directly from their opinions, and built a social network on this basis. Motivated by this idea, for LSGDM problems with CLEs, this paper proposes to construct an influence network based on the preference information provided by experts. Specifically, we first measure the influence of experts and clusters using the preference information, and then construct the influence-based network.

4.1 Influence Measure of Experts

To measure the influence of a single expert in GDM, individual consistency and preference similarity are important factors to be considered [50]. For LSGDM problems in which experts’ opinions are represented by DLPRs, we argue that the certainty of the judgment information in a DLPR is also an important factor in measuring the influence of an expert, because a DLPR is a kind of preference information with high uncertainty. In fact, preference information in the form of an LDA implies the degree of the expert's confidence in a pairwise comparison. Consequently, we assume that individual consistency, preference similarity and certainty of judgment are all positively correlated with the influence of an expert, which guides us to measure the influence of a single expert using the preference information provided by him/her. Keeping this in mind, we develop a consistency-similarity-certainty based influence measure for experts. Next, we discuss how to quantify the above-mentioned factors and how to aggregate the related indices to measure the influence of a single expert.

(1) Measure consistency of DLPRs. Let \(D^{h}\) be the DLPR of the expert \(e_{h}\) and \({\text{ACI}}_{h}\) be the consistency index of \(D^{h}\). Tang et al. [56] defined the consistency of DLPRs and developed a consistency index for DLPRs. The proposed model adopts this consistency index for DLPRs. For DLPR \(D^{h} = \left( {d_{ij}^{h} } \right)_{n \times n}\), \(ACI_{h}\) is computed as

$$ {\text{ACI}}_{h} = 1 - \frac{2}{{n\left( {n - 1} \right)}}{ }\mathop \sum \limits_{1 \le i \le j \le n} \left| {{\text{NS}}\left( {d_{ij}^{h} } \right) - \overline{p}_{ij} } \right|, $$
(12)

where \({\text{NS}}\left( {d_{ij}^{h} } \right){ }\) is the numerical index of \(d_{ij}^{h}\) [57], and \(\overline{P} = \left( {\overline{p}_{ij} } \right)_{n \times n}\) is an additively consistent matrix constructed based on \(P = \left( {{\text{NS}}\left( {d_{ij}^{h} } \right)} \right)_{n \times n}\). Notice that \(0 \le {\text{ACI}}_{h} \le 1\).

(2) Measure preference similarity. For the expert \(e_{h}\), preference similarity refers to the preference similarity between the expert \(e_{h}\) and all the other experts in the group. In Sect. 2, we have given a method to calculate the similarity between DLPRs. Therefore, the average value of preference similarity between the expert \(e_{h}\) and other experts can be used as an index to measure preference similarity. For \(e_{h} \in E\), preference similarity can be calculated as

$$ {\text{sim}}_{h} = \frac{1}{m - 1}\mathop \sum \limits_{k = 1,k \ne h}^{m} {\text{sim}}_{hk} $$
(13)

Obviously, \(0 \le {\text{sim}}_{h} \le 1\). As an impact factor of influence, the higher preference similarity means the greater influence of the expert in the group.

(3) Measure certainty of judgment. Certainty of judgment refers to the accuracy of the expert's judgment. In the LSGDM problems discussed in this paper, the preferences of experts about alternatives are expressed by DLPRs, in which the elements are LDAs. As a kind of fuzzy expression defined on the linguistic term set, LDA has the characteristics of inherent uncertainty. It can be roughly understood that the smaller the number of linguistic terms with positive symbolic proportion, the higher the certainty of the judgment. For DLPR \(D^{h} = \left( {d_{ij}^{h} } \right)_{n \times n}\), let \(A\left( {d_{ij}^{h} } \right)\) denote the number of linguistic terms with positive symbolic proportion in \(d_{ij}^{h}\). For the convenience of calculation, we use the standard 0–1 transformation to convert it into a value on the interval [0,1]:

$$ {\text{CJ}}\left( {d_{ij}^{h} } \right) = \frac{{{\text{max}}_{ij} \left\{ {A\left( {d_{ij}^{h} } \right)} \right\} - A\left( {d_{ij}^{h} } \right)}}{{{\text{max}}_{ij} \left\{ {A\left( {d_{ij}^{h} } \right)} \right\} - {\text{min}}_{ij} \left\{ {A\left( {d_{ij}^{h} } \right)} \right\}}}. $$

For \(e_{h} \in E\), certainty of judgment can be calculated as:

$$ {\text{CJ}}_{h} = \frac{2}{{n\left( {n - 1} \right)}}\mathop \sum \limits_{i = 1}^{n - 1} \mathop \sum \limits_{j = i + 1}^{n} {\text{CJ}}\left( {d_{ij}^{h} } \right). $$
(14)
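A minimal sketch of the certainty-of-judgment index in Eq. (14) is given below. Two points are assumptions of this example rather than statements of the model: the maximum and minimum of \(A\left( {d_{ij}^{h} } \right)\) are taken over the upper-triangle entries (\(i < j\)) of the expert's own DLPR, and the degenerate case in which all entries have the same count (where the 0–1 transformation is 0/0) is reported as an error because it is not defined above. The names and sample data are hypothetical.

```python
def count_positive(lda):
    """A(d): number of linguistic terms with positive symbolic proportion."""
    return sum(1 for alpha in lda if alpha > 0)


def certainty_of_judgment(D):
    """Eq. (14) for one expert's DLPR, stored as an n x n nested list."""
    n = len(D)
    counts = [
        count_positive(D[i][j])
        for i in range(n - 1)
        for j in range(i + 1, n)
    ]
    hi, lo = max(counts), min(counts)
    if hi == lo:
        # Assumption: the source does not define CJ when max equals min.
        raise ValueError("0-1 transformation undefined: all counts are equal")
    # 0-1 transformation of each entry, then the upper-triangle average.
    cj_entries = [(hi - a) / (hi - lo) for a in counts]
    return 2.0 * sum(cj_entries) / (n * (n - 1))


# Hypothetical DLPR for n = 3 alternatives on S = {s_0, s_1, s_2}.
N = None  # entries not used by Eq. (14)
D_h = [[N, [1.0, 0.0, 0.0], [0.5, 0.5, 0.0]],
       [N, N, [0.2, 0.3, 0.5]],
       [N, N, N]]

print(certainty_of_judgment(D_h))
```

The three entries use 1, 2 and 3 terms, so their transformed certainties are 1, 0.5 and 0, and the expert's index is their mean, 0.5.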

Suppose that \(w_{1}^{E} ,w_{2}^{E} ,w_{3}^{E}\) are the weights of individual consistency, preference similarity and certainty of judgment, respectively. Using weighted averaging method, the influence of the expert \(e_{h} \in E\), denoted by \({\text{INF}}_{h}\), can be aggregated as:

$$ {\text{INF}}_{h} = w_{1}^{E} {\text{ ACI}}_{h} + w_{2}^{E} {\text{sim}}_{h} + w_{3}^{E} {\text{CJ}}_{h} . $$
(15)
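Eqs. (13) and (15) can be combined in a short sketch. The consistency index \({\text{ACI}}_{h}\) and the certainty index \({\text{CJ}}_{h}\) are taken as given inputs here; the weights are assumed to sum to 1, which the text implies by calling the aggregation a weighted average but does not state explicitly. All names and numbers are hypothetical.

```python
def preference_similarity(h, sim):
    """Eq. (13): average similarity between expert h and all other experts."""
    m = len(sim)
    return sum(sim[h][k] for k in range(m) if k != h) / (m - 1)


def expert_influence(aci_h, sim_h, cj_h, weights):
    """Eq. (15): weighted average of consistency, similarity and certainty."""
    w1, w2, w3 = weights
    if abs(w1 + w2 + w3 - 1.0) > 1e-9:
        # Assumption: the three weights are normalized to sum to 1.
        raise ValueError("weights must sum to 1")
    return w1 * aci_h + w2 * sim_h + w3 * cj_h


# Hypothetical similarity matrix among four experts.
sim = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.2],
    [0.2, 0.3, 1.0, 0.8],
    [0.1, 0.2, 0.8, 1.0],
]

sim_0 = preference_similarity(0, sim)
# ACI_0 = 0.9 and CJ_0 = 0.5 are made-up values for the example.
inf_0 = expert_influence(aci_h=0.9, sim_h=sim_0, cj_h=0.5, weights=(0.4, 0.3, 0.3))
print(sim_0, inf_0)
```

With these inputs, expert 0 has preference similarity 0.4 and influence 0.4·0.9 + 0.3·0.4 + 0.3·0.5 = 0.63.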

4.2 Influence Measure of Clusters

To construct the influence network from the preferences of participants, the proposed model measures not only the influence of each single expert but also the influence of each cluster as a whole. As in the influence measure of a single expert, preference similarity and certainty of judgment are used as two important indices to measure the influence of a cluster. Since a cluster consists of experts with different opinions, we have reason to believe that the higher the consensus level of the experts in the cluster, the greater the influence of the cluster. Consequently, the consensus level of a cluster is used as the third index to measure the influence of a cluster. Based on the above analysis, we develop a consensus-similarity-certainty based influence measure for clusters. Let \(\alpha = l\) be the optimal clustering level and \(C_{l} = \left\{ {C_{lk} {|}k = 1,2, \ldots, l} \right\}\) be the set of clusters at the optimal level \(l\). Recall that the cluster internal cohesion \(\delta_{{{\text{int}}}}\) is an index that measures the degree of similarity within a cluster. It is therefore also an appropriate index to measure the agreement level among the experts in a cluster. Let \({\text{CL}}\left( {C_{lk} } \right)\) represent the cluster consensus level of the cluster \(C_{lk} \in C_{l}\); then it is defined as:

$$ {\text{CL}}\left( {C_{lk} } \right) = \delta_{{{\text{int}}}} (C_{lk} ). $$
(16)

Preference similarity of a cluster is to measure the degree of similarity between the cluster and all other clusters in opinions. Similarly, the cluster external cohesion index \(\delta_{{{\text{ext}}}}\) can be used to measure the preference similarity of a cluster. The preference similarity of cluster \(C_{lk} \in C_{l}\), \({\text{sim}}(C_{lk} )\) is expressed as:

$$ {\text{sim}}(C_{lk} ) = \delta_{{{\text{ext}}}} (C_{lk} ). $$
(17)

Certainty of judgment of a cluster is defined as the average level of the accuracy of the DLPRs provided by the experts in the cluster. Certainty of judgment of a cluster \(C_{lk} \in C_{l}\), \({\text{CJ}}\left( {C_{lk} } \right)\), is expressed as

$$ {\text{CJ}}\left( {C_{lk} } \right) = \frac{1}{{\# C_{lk} }}\mathop \sum \limits_{{e_{h} \in C_{lk} }} {\text{CJ}}_{h} . $$
(18)

Suppose that \(w_{1}^{C} ,w_{2}^{C} ,w_{3}^{C}\) are the weights of consensus level, preference similarity and certainty of judgment, respectively. Using the weighted averaging method, the influence of cluster \(C_{lk} \in C_{l}\), \({\text{INF}}(C_{lk} )\), can be computed as:

$$ {\text{INF}}\left( {C_{lk} } \right) = w_{1}^{C} {\text{CL}}\left( {C_{lk} } \right) + w_{2}^{C} {\text{sim}}\left( {C_{lk} } \right) + w_{3}^{C} {\text{CJ}}\left( {C_{lk} } \right). $$
(19)
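The cluster-level measure in Eqs. (16)–(19) reuses the cohesion indices of Sect. 3. The sketch below assumes a precomputed similarity matrix and a list of per-expert certainty indices \({\text{CJ}}_{h}\); the function name, the assumption that the weights sum to 1, and the sample numbers are all hypothetical.

```python
def cluster_influence(cluster, sim, cj, weights):
    """Eq. (19) for one cluster, given sim (m x m) and cj = [CJ_1, ..., CJ_m]."""
    size = len(cluster)
    members = set(cluster)
    outside = [j for j in range(len(sim)) if j not in members]
    if not outside:
        raise ValueError("the cluster must not contain the whole group")
    # Eq. (16): cluster consensus level = internal cohesion, Eq. (7).
    cl = sum(sim[i][j] for i in cluster for j in cluster) / size ** 2
    # Eq. (17): cluster preference similarity = external cohesion, Eq. (8).
    ext = sum(sim[i][j] for i in cluster for j in outside) / (size * len(outside))
    # Eq. (18): average certainty of judgment of the cluster members.
    cj_c = sum(cj[h] for h in cluster) / size
    w1, w2, w3 = weights
    if abs(w1 + w2 + w3 - 1.0) > 1e-9:
        # Assumption: the three weights are normalized to sum to 1.
        raise ValueError("weights must sum to 1")
    return w1 * cl + w2 * ext + w3 * cj_c


# Hypothetical data for four experts split into two clusters.
sim = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.2],
    [0.2, 0.3, 1.0, 0.8],
    [0.1, 0.2, 0.8, 1.0],
]
cj = [0.5, 0.7, 0.2, 0.4]

inf_c1 = cluster_influence([0, 1], sim, cj, weights=(0.4, 0.3, 0.3))
print(inf_c1)
```

For the cluster of experts 0 and 1, the three components are 0.95, 0.2 and 0.6, giving an influence of 0.4·0.95 + 0.3·0.2 + 0.3·0.6 = 0.62.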

4.3 Network Construction

On the basis of the influence measurement, this subsection constructs the influence network, which consists of intra-cluster and inter-cluster influence relationships.

From Eqs. (15) and (19), the influence of an expert \(e_{h} \in E\) is denoted as \({\text{INF}}_{h}\) and the influence of a cluster \(C_{lk} \in C_{l}\) is denoted as \({\text{INF}}\left( {C_{lk} } \right)\). According to these influence values, the proposed approach classifies the experts and clusters into different profiles. Given a set of experts \(E = \left\{ {e_{1} , \ldots ,e_{m} } \right\}\), a set of clusters \(C_{l} = \left\{ {C_{lk} {|}k = 1,2, \ldots, l} \right\}\) at level \(l\), a minimum influence threshold \({\text{INF}}_{{{\text{min}}}} \in \left[ {0,1} \right]\) and a maximum influence threshold \({\text{INF}}_{{{\text{max}}}} \in \left[ {0,1} \right]\), where \({\text{INF}}_{{{\text{min}}}} < {\text{INF}}_{{{\text{max}}}}\), the experts (clusters) are classified into the following levels:

(1) High-influence experts (clusters)

An expert \(e_{h} \in E\) is called a high-influence expert when he/she satisfies \({\text{INF}}_{h} > {\text{INF}}_{\max }\).

A cluster \(C_{lk} \in C_{l}\) is called a high-influence cluster when it satisfies \({\text{INF}}(C_{lk} ) > {\text{INF}}_{{{\text{max}}}}\).

(2) Medium-influence experts (clusters)

An expert \(e_{h} \in E\) is called a medium-influence expert when he/she satisfies \({\text{INF}}_{{{\text{min}}}} < {\text{INF}}_{h} \le {\text{INF}}_{{{\text{max}}}}\).

A cluster \(C_{lk} \in C_{l}\) is called a medium-influence cluster when it satisfies \({\text{INF}}_{{{\text{min}}}} < {\text{INF}}(C_{lk} ) \le {\text{INF}}_{{{\text{max}}}}\).

(3) Low-influence experts (clusters)

An expert \(e_{h} \in E\) is called a low-influence expert when he/she satisfies \({\text{INF}}_{h} \le {\text{INF}}_{{{\text{min}}}}\).

A cluster \(C_{lk} \in C_{l}\) is called a low-influence cluster when it satisfies \({\text{INF}}(C_{lk} ) \le {\text{INF}}_{{{\text{min}}}}\).
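The three-level classification above can be sketched in a few lines of Python. The function name `influence_level` and the string labels are hypothetical names chosen for illustration; the boundary handling follows the inequalities stated above (strictly greater than \({\text{INF}}_{{{\text{max}}}}\) for high, less than or equal to \({\text{INF}}_{{{\text{min}}}}\) for low).

```python
def influence_level(inf, inf_min, inf_max):
    """Classify an expert or a cluster by its influence value.

    Returns "high", "medium" or "low" according to the thresholds
    INF_min < INF_max, both assumed to lie in [0, 1].
    """
    if not 0.0 <= inf_min < inf_max <= 1.0:
        raise ValueError("thresholds must satisfy 0 <= INF_min < INF_max <= 1")
    if inf > inf_max:
        return "high"
    if inf > inf_min:
        return "medium"
    return "low"


# Example call with illustrative thresholds.
level = influence_level(0.70, inf_min=0.65, inf_max=0.75)
```

The same function applies to experts and clusters, since both are classified against the same pair of thresholds.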

As mentioned earlier, our proposal models the consensus reaching problem on an influence network that is inferred from the experts’ opinions. In the proposed model, the feedback mechanism is guided by intra-cluster and inter-cluster influence relationships. To construct the influence network, we consider the influence relationships between experts in the same cluster and the influence relationships between clusters. It is assumed that an expert only receives recommendations from experts with the same or a higher level of influence. Additionally, low-influence experts never provide recommendations. For experts with the same level of influence, the direction of the influence relationship depends on the individual consistency of the provided DLPRs. More specifically, the expert with higher consistency provides the recommendation, while the expert with lower consistency receives it. The influence relationships between clusters are constructed similarly, except that the direction of the relationship between clusters with the same level of influence depends on the cluster consensus level. Let \({\text{IJ}}\) represent the set of all pairs of experts with an influence relationship, i.e., \({\text{IJ}} = \{ \left( {i,j} \right)|e_{i} {\text{ has influence on }}e_{j} \}\).

Sallaberry et al. stated in [58] that people in social networks are more likely to interact and communicate with people with similar opinions. Following this basic idea, this paper develops a similarity-based influence network to spread experts’ preferences with the purpose of reaching a consensus solution. In essence, a network is a directed graph \(G = \left( {N,R} \right)\), where \(N\) is a set of nodes and \(R\) is a set of directed edges that connect nodes. In the proposed similarity-based influence network, there are \(m\) nodes, each of which corresponds to an expert in the LSGDM problem. The set of directed edges \(R\) is characterized by an adjacency matrix.

Let \(M = \left( {m_{ij} } \right)_{m \times m}\) be the adjacency matrix, where \(m_{ij}\) represents the influence of expert \(e_{i}\) on \(e_{j}\). According to the above-mentioned influence relationships and the similarity \({\text{sim}}_{ij}\) between experts \(e_{i}\) and \(e_{j}\), \(m_{ij}\) can be defined as

$$ m_{ij} = \left\{ {\begin{array}{*{20}l} {{\text{sim}}_{ij} ,} & {{\text{sim}}_{ij} \ge \delta_{{{\text{sim}}}} \wedge \left( {i,j} \right) \in {\text{IJ}};} \\ {0,} & {{\text{otherwise}},} \\ \end{array} } \right. $$
(20)

where \(\delta_{{{\text{sim}}}}\) is the similarity threshold.

Equation (20) ensures that experts with too low similarity are not connected.

Similarly, we can determine the adjacency matrix \(M^{C} = \left( {m_{ij}^{C} } \right)_{l \times l}\) between clusters.
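The construction of the expert-level network can be sketched as follows, assuming that influence levels, consistency values and cluster memberships have already been computed. The names `influence_pairs` and `adjacency`, the zero-based indexing and the rule that two experts with identical level and identical consistency are left unconnected are assumptions of this sketch; the text above does not specify how such ties are handled.

```python
LEVEL_RANK = {"low": 0, "medium": 1, "high": 2}


def influence_pairs(levels, consistency, cluster_of):
    """Build IJ = {(i, j) | e_i has influence on e_j} within clusters.

    levels      -- list of "low" / "medium" / "high" per expert
    consistency -- individual consistency of each expert's DLPR
    cluster_of  -- cluster label of each expert
    """
    pairs = set()
    m = len(levels)
    for i in range(m):
        if levels[i] == "low":
            continue  # low-influence experts never provide recommendations
        for j in range(m):
            if i == j or cluster_of[i] != cluster_of[j]:
                continue
            ri, rj = LEVEL_RANK[levels[i]], LEVEL_RANK[levels[j]]
            # Higher level influences lower; same level is decided by consistency.
            if ri > rj or (ri == rj and consistency[i] > consistency[j]):
                pairs.add((i, j))
    return pairs


def adjacency(sim, pairs, delta_sim):
    """Eq. (20): keep sim_ij only for pairs in IJ that reach the threshold."""
    m = len(sim)
    return [
        [sim[i][j] if (i, j) in pairs and sim[i][j] >= delta_sim else 0.0
         for j in range(m)]
        for i in range(m)
    ]


# Hypothetical three-expert example in a single cluster.
levels = ["high", "medium", "low"]
pairs = influence_pairs(levels, [0.9, 0.8, 0.7], ["A", "A", "A"])
sim = [[1.0, 0.9, 0.5], [0.9, 1.0, 0.7], [0.5, 0.7, 1.0]]
M = adjacency(sim, pairs, delta_sim=0.65)
```

In this hypothetical example the first expert influences the other two in principle, but the edge to the third expert is dropped because their similarity of 0.5 is below the threshold. The cluster-level matrix \(M^{C}\) could be built in the same way, with the cluster consensus level replacing individual consistency.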

5 Consensus Model Guided by Influence Relationship

5.1 Two-Stage Feedback Mechanism Based on Influence Network

This subsection focuses on the feedback mechanism to build the CRP model for LSGDM problems. Generally speaking, the CRP is a dynamic process that adjusts the preferences of group members to achieve collective agreement. In the proposed model, the influence relationship is considered as the guiding relationship for generating preference recommendations. To improve the efficiency of consensus reaching in LSGDM, we introduce a two-stage consensus model based on influence network. When the group consensus level is significantly lower than the satisfactory consensus level (the first stage), the feedback mechanism adopts cluster adjustment strategy which means that all experts in the cluster with the lowest consensus contribution degree need to adjust preferences synchronously. Otherwise, when the group consensus level is higher but has not reached the satisfactory consensus level (the second stage), the mechanism adopts individual adjustment strategy which means only the expert with the most deviated opinion from the group needs to modify his/her opinion.

If the preference of an expert or cluster in the influence network needs to be adjusted, a preference recommendation should be generated based on the preferences of the related experts or clusters. In our model, the recommendation for an expert or cluster is obtained by aggregating the preferences of the nodes connected to it by means of an influence induced ordered weighted averaging operator (INF-IOWA), which allocates more importance to the nodes that present higher influence. To generate preference recommendations, we introduce the following influence-based IOWA operator.

Definition 5

(INF-IOWA operator). Let \(E_{q} = \left\{ {e_{1} , \ldots ,e_{q} } \right\}\) be a set of experts. These experts provide their preferences about a set of alternatives, \(X = \left\{ {x_{1} ,x_{2} , \ldots ,x_{n} } \right\}\), using distribution linguistic preference relations (DLPRs), \(D = \left\{ {D^{1} , \ldots ,D^{q} } \right\}\). An influence-based IOWA operator (INF-IOWA) of dimension \(q\), \(\emptyset_{W}^{{{\text{INF}}}}\), is an IOWA operator whose set of order inducing values is the set of influence values, \(\left\{ {{\text{INF}}_{1} , \ldots ,{\text{INF}}_{q} } \right\}\), associated with the set of experts.

Let \(D\) represent the aggregated DLPR. Therefore, \(D\) can be denoted as:

$$ D = \emptyset_{W}^{{{\text{INF}}}} \left( {\left\{ {\left\langle {{\text{INF}}_{1} ,D^{1} } \right\rangle , \ldots ,\left\langle {{\text{INF}}_{q} ,D^{q} } \right\rangle } \right\}} \right) $$
(21)

More specifically, denote \(D^{h} = \left( {d_{ij}^{h} } \right)_{n \times n} \left( {h = 1, \ldots ,q} \right)\), where \(d_{ij}^{h} = \left\{ {\left( {s_{k} ,\alpha_{k}^{ij,h} } \right),k = 0,1, \ldots ,g} \right\}\) and denote \(D = \left( {d_{ij} } \right)_{n \times n}\), where \(d_{ij} = \left\{ {\left( {s_{k} ,\alpha_{k}^{ij} } \right),k = 0,1, \ldots ,g} \right\}\). Then, \(d_{ij}\) is computed as follows:

$$ d_{ij} = \left\{ {\left( {s_{k} ,\mathop \sum \limits_{h = 1}^{q} w_{h} \alpha_{k}^{ij,\sigma \left( h \right)} } \right),k = 0, \ldots ,g} \right\}, $$
(22)

where \(\left\{ {\sigma \left( 1 \right),\sigma \left( 2 \right), \ldots ,\sigma \left( q \right)} \right\}\) is a permutation of \(\left\{ {1,2, \ldots ,q} \right\}\) such that \({\text{INF}}_{{\sigma \left( {h - 1} \right)}} \ge {\text{INF}}_{\sigma \left( h \right)}\), \(w_{{\sigma \left( {h - 1} \right)}} \ge w_{\sigma \left( h \right)}\) (\(\forall h = 2, \ldots ,q\)) with \(\sum\nolimits_{h = 1}^{q} {w_{h} = 1}\). Based on the influence of the experts and the linguistic quantifier, we can allocate different importance weights to different experts, which are computed in the following way:

$$ w_{h} = Q\left( {\frac{{\mathop \sum \nolimits_{k = 1}^{h} INF_{\sigma \left( k \right)} }}{{\mathop \sum \nolimits_{k = 1}^{q} INF_{k} }}} \right) - Q\left( {\frac{{\mathop \sum \nolimits_{k = 1}^{h - 1} INF_{\sigma \left( k \right)} }}{{\mathop \sum \nolimits_{k = 1}^{q} INF_{k} }}} \right), $$
(23)

where \(Q\) is a Basic Unit-interval Monotone function [52]: \(\left[ {0,1} \right] \to \left[ {0,1} \right]\) such that \(Q\left( 0 \right) = 0\), \(Q\left( 1 \right) = 1\) and if \(x > y\) then \(Q\left( x \right) \ge Q\left( y \right)\).
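To make Eqs. (22) and (23) concrete, the following Python sketch computes the INF-IOWA weights and the aggregated DLPR. The function names, the nested-list layout `dlprs[h][i][j][k]` for \(\alpha_{k}^{ij,h}\), and the default quantifier \(Q(x) = x^{1/2}\) are assumptions made for illustration; any function \(Q\) with the properties stated above could be substituted.

```python
def inf_iowa_weights(influences, q=lambda x: x ** 0.5):
    """Eq. (23): weights induced by influence values and a BUM function Q.

    Returns (order, weights), where order[h] is sigma(h+1) as a zero-based
    index and weights[h] is the weight of the expert in position h.
    """
    # sigma: experts sorted by decreasing influence (stable for ties)
    order = sorted(range(len(influences)), key=lambda h: -influences[h])
    total = sum(influences)
    weights, cumulative, previous = [], 0.0, q(0.0)
    for idx in order:
        cumulative += influences[idx]
        current = q(min(cumulative / total, 1.0))  # clamp rounding error
        weights.append(current - previous)
        previous = current
    return order, weights


def inf_iowa(influences, dlprs, q=lambda x: x ** 0.5):
    """Eq. (22): aggregate DLPRs given as dlprs[h][i][j][k]."""
    order, weights = inf_iowa_weights(influences, q)
    n = len(dlprs[0])
    terms = len(dlprs[0][0][0])
    return [
        [
            [sum(w * dlprs[h][i][j][k] for w, h in zip(weights, order))
             for k in range(terms)]
            for j in range(n)
        ]
        for i in range(n)
    ]


# Hypothetical example: two experts, one matrix cell, two linguistic terms.
order, weights = inf_iowa_weights([0.25, 0.75], q=lambda x: x)
aggregated = inf_iowa([0.25, 0.75], [[[[1.0, 0.0]]], [[[0.0, 1.0]]]],
                      q=lambda x: x)
```

With the identity quantifier \(Q(x) = x\), each weight reduces to the expert's share of the total influence. Because \(Q(0) = 0\) and \(Q(1) = 1\), the weights sum to one for any admissible \(Q\), so the aggregated symbolic proportions still form a distribution.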

Let \({\text{GCL}}\) represent the group consensus level in the CRP. A consensus threshold \(\underline{{{\text{GCL}}}}\) and a satisfactory consensus level \(\overline{{{\text{GCL}}}}\) are given, with \(\underline{{{\text{GCL}}}} < \overline{{{\text{GCL}}}}\), where \(\underline{{{\text{GCL}}}}\) is the critical value for the stage division.

(1) The first stage (\({\text{GCL}} < \underline{{{\text{GCL}}}}\)): cluster adjustment

To begin with, we must identify the cluster that needs to adjust its preference at each iteration. Since the purpose of the CRP is to improve the group consensus level, it is reasonable to choose the cluster with the lowest contribution to \({\text{GCL}}\) to adjust its preference. Denote the cluster with the lowest contribution to \({\text{GCL}}\) at the current iteration as \(C_{{lk_{0} }}\). The cluster \(C_{{lk_{0} }}\) can be determined by the following equation:

$$ \delta_{{{\text{cc}}}} \left( {C_{{lk_{0} }} } \right) = \mathop {\min }\limits_{k = 1, \ldots ,l} \left\{ {\delta_{cc} \left( {C_{lk} } \right)} \right\}. $$
(24)

Then, the clusters that have influence on the cluster \(C_{{lk_{0} }}\) should also be determined according to the influence network. Let \({\text{CS}}_{{k_{0} }}\) be the set of clusters connected to the cluster \(C_{{lk_{0} }}\). The set \({\text{CS}}_{{k_{0} }}\) can be determined as:

$$ {\text{CS}}_{{k_{0} }} = \left\{ {C_{lh} {|}m_{{hk_{0} }}^{C} \ne 0,{ }h = 1, \ldots ,l} \right\}{ }. $$
(25)

For convenience, let \(I\left( {k_{0} } \right) \subseteq \left\{ {1,2, \ldots ,l} \right\}\) be the subscript set of h satisfying \(m_{{hk_{0} }}^{C} \ne 0\left( {{ }h = 1, \ldots ,l} \right)\). Then, we can denote \({\text{CS}}_{{k_{0} }}\) as \({\text{CS}}_{{k_{0} }} = \left\{ {C_{lh} {|}h \in I\left( {k_{0} } \right)} \right\}\).

It is assumed that, when preference adjustment is needed, an expert is willing to refer to the opinions of the experts who have influence on him/her. Consequently, the preferences of the clusters in \({\text{CS}}_{{k_{0} }}\) are an important reference for generating the recommendation for the cluster \(C_{{lk_{0} }}\). Specifically, the recommendation can be obtained by aggregating the preferences of the clusters in \({\text{CS}}_{{k_{0} }}\). Further, for a cluster \(C_{lh} \in {\text{CS}}_{{k_{0} }}\), the preference of \(C_{lh}\) can be obtained by aggregating the preferences of the experts in \(C_{lh}\). Preference fusion is conducted using the INF-IOWA operator, as described below.

For \(C_{lh} \left( {h \in I\left( {k_{0} } \right)} \right)\), denote \(C_{lh} = \left\{ {e_{lh,1} , \ldots ,e_{{lh,m_{lh} }} } \right\}\), where \(m_{lh}\) represents the number of experts in \(C_{lh}\). Let \(D^{lh,i}\) and \({\text{INF}}_{lh,i}\) be the DLPR and the influence of the expert \(e_{lh,i} \left( {i = 1, \ldots ,m_{lh} } \right)\), respectively. Let \(D_{lh}^{{{\text{cluster}}}}\) represent the cluster preference of \(C_{lh} \left( {h \in I\left( {k_{0} } \right)} \right)\). Then, \(D_{lh}^{{{\text{cluster}}}} \left( {h \in I\left( {k_{0} } \right)} \right)\) is computed as:

$$ D_{lh}^{{{\text{cluster}}}} = \emptyset_{W}^{{{\text{INF}}}} \left( {\left\{ {\left\langle {{\text{INF}}_{lh,i} ,D^{{lh,{\text{i}}}} } \right\rangle |i = 1, \ldots ,m_{lh} } \right\}} \right). $$
(26)

Then, the recommendation for the cluster \(C_{{lk_{0} }}\), denoted as \(D^{{ref,k_{0} }}\), is computed as:

$$ D^{{{\text{ref}},k_{0} }} = \emptyset_{W}^{{{\text{INF}}}} \left( {\left\{ {\left\langle {{\text{INF}}\left( {C_{lh} } \right),D_{lh}^{{{\text{cluster}}}} } \right\rangle |h \in I\left( {k_{0} } \right)} \right\}} \right). $$
(27)

As mentioned earlier, the cluster adjustment strategy means that all experts in the cluster \(C_{{lk_{0} }}\) need to adjust preferences synchronously. \(\forall e_{i} \in C_{{lk_{0} }}\), the adjusted preference \(\overline{D}^{i}\) is calculated as follows:

$$ \overline{D}^{i} = \beta D^{i} + \left( {1 - \beta } \right) \cdot D^{{ref,k_{0} }} ,{ } $$
(28)

where the scalar \(\beta\) is the adjustment parameter. For each expert in the cluster \(C_{{lk_{0} }}\), a new DLPR is obtained as the weighted sum of the current DLPR and the recommendation at each iteration.
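The update in Eq. (28) is an element-wise combination of two DLPRs, which can be sketched as follows. The function name `adjust_dlpr`, the nested-list layout `dlpr[i][j][k]` for the symbolic proportions and the restriction \(\beta \in [0,1]\) are assumptions of this sketch.

```python
def adjust_dlpr(current, recommendation, beta):
    """Eq. (28): beta * current + (1 - beta) * recommendation, element-wise.

    current, recommendation -- DLPRs stored as dlpr[i][j][k]
    beta                    -- share of the current opinion that is retained
    """
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta is assumed to lie in [0, 1]")
    return [
        [
            [beta * a + (1.0 - beta) * r for a, r in zip(cell_c, cell_r)]
            for cell_c, cell_r in zip(row_c, row_r)
        ]
        for row_c, row_r in zip(current, recommendation)
    ]


# Hypothetical single-cell example with two linguistic terms.
adjusted = adjust_dlpr([[[1.0, 0.0]]], [[[0.0, 1.0]]], beta=0.7)
```

Since the update is a convex combination for \(\beta \in [0,1]\), each adjusted cell remains a distribution whenever both inputs are distributions. In the cluster adjustment strategy, the same recommendation would be passed for every expert in \(C_{{lk_{0} }}\).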

(2) The second stage (\(\underline{{{\text{GCL}}}} \le {\text{GCL}} < \overline{{{\text{GCL}}}}\)): individual adjustment

When \({\text{GCL}}\) has reached \(\underline{{{\text{GCL}}}}\) but is still lower than \(\overline{{{\text{GCL}}}}\), the feedback mechanism adopts the individual adjustment strategy. In this situation, the expert who needs to adjust his/her preference is the one with the lowest consensus contribution. Here, we identify this expert by searching in the cluster with the lowest consensus contribution degree, i.e., \(C_{{lk_{0} }}\). For \(e_{i} \in C_{{lk_{0} }}\), the consensus degree of the expert \(e_{i}\), denoted by \({\text{CD}}_{i}\), can be computed as:

$$ {\text{CD}}_{i} = \frac{{\mathop \sum \nolimits_{{e_{j} \in C_{{lk_{0} }} }} {\text{sim}}_{ij} }}{m - 1} + \frac{{\mathop \sum \nolimits_{{e_{j} \notin C_{{lk_{0} }} }} {\text{sim}}_{ij} }}{m - 1}. $$
(29)

Based on the consensus degree defined by Eq. (29), the expert with the lowest consensus contribution, denoted by \(e_{{i_{0} }}\), can be determined by

$$ {\text{CD}}_{{i_{0} }} = \mathop {\min }\limits_{{e_{i} \in C_{{lk_{0} }} }} \left\{ {{\text{CD}}_{i} } \right\}. $$
(30)
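Equations (29) and (30) can be sketched in Python as follows. Because both terms of Eq. (29) share the denominator \(m - 1\), they are combined here into a single sum over all other experts; excluding the self-similarity \({\text{sim}}_{ii}\) from that sum is an assumption of this sketch, suggested by the divisor \(m - 1\). The function names and the zero-based indices are hypothetical.

```python
def consensus_degree(i, sim):
    """Eq. (29): average similarity of expert i to all other experts."""
    m = len(sim)
    return sum(sim[i][j] for j in range(m) if j != i) / (m - 1)


def most_deviated_expert(cluster, sim):
    """Eq. (30): member of the cluster with the lowest consensus degree."""
    return min(cluster, key=lambda i: consensus_degree(i, sim))


# Hypothetical similarity matrix for three experts.
sim = [[1.0, 0.8, 0.6], [0.8, 1.0, 0.4], [0.6, 0.4, 1.0]]
cd = [consensus_degree(i, sim) for i in range(3)]
i0 = most_deviated_expert([0, 1], sim)
```

In this hypothetical example the consensus degrees are 0.7, 0.6 and 0.5. If the selected cluster contains only the first two experts, the second one is chosen even though the third has a lower value, because the search is restricted to \(C_{{lk_{0} }}\).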

Next, the experts that have influence on the expert \(e_{{i_{0} }}\) should also be determined according to the influence network. Let \({\text{ES}}_{{i_{0} }}\) be the set of experts in the cluster \(C_{{lk_{0} }}\) that have influence on the expert \(e_{{i_{0} }}\). The set \({\text{ES}}_{{i_{0} }}\) is determined as

$$ {\text{ES}}_{{i_{0} }} = \left\{ {\left. {e_{i} } \right|e_{i} \in C_{{lk_{0} }} \wedge \left( {m_{{ii_{0} }} \ne 0} \right)} \right\}. $$
(31)

For convenience, let \(I\left( {i_{0} } \right)\) be the subscript set of \(i\) satisfying \(e_{i} \in C_{{lk_{0} }} \wedge \left( {m_{{ii_{0} }} \ne 0} \right)\). Then, the set \({\text{ES}}_{{i_{0} }}\) can be also denoted as \({\text{ES}}_{{i_{0} }} = \left\{ {e_{i} {|}i \in I\left( {i_{0} } \right)} \right\}\). Using the INF-IOWA operator, we can obtain the preference recommendation for the expert \(e_{{i_{0} }}\) as:

$$ D^{{{\text{ref}},i_{0} }} = \emptyset_{W}^{{I{\text{NF}}}} \left( {\left\{ {\left\langle {{\text{INF}}_{h} ,D^{h} } \right\rangle |h \in I\left( {i_{0} } \right)} \right\}} \right), $$
(32)

where \({\text{INF}}_{h} {\text{ and }}D^{h}\) represent the influence and the DLPR of the expert \(e_{h} \left( {h \in I\left( {i_{0} } \right)} \right)\) respectively.

For the expert \(e_{{i_{0} }}\), the adjusted preference \(\overline{D}^{{i_{0} }}\) is calculated as follows:

$$ \overline{D}^{{i_{0} }} = \beta_{{i_{0} }} D^{{i_{0} }} + \left( {1 - \beta_{{i_{0} }} } \right) \cdot D^{{ref,i_{0} }} ,{ } $$
(33)

where \(D^{{i_{0} }}\) is the DLPR of the expert \(e_{{i_{0} }}\) before adjustment and \(\beta_{{i_{0} }}\) is the adjustment parameter.

In the first stage of preference adjustment, the initial consensus level of the large group is relatively low. Consequently, adopting the cluster adjustment strategy is conducive to improving the efficiency of consensus reaching. In the second stage, the group consensus level has reached a high level (but not a satisfactory level). However, there may be some experts whose opinions are far away from the opinions of the group. In this situation, the individual adjustment strategy can conduct personalized preference adjustment. In Eqs. (28) and (33), the adjustment parameter \(\beta \left( {\beta_{{i_{0} }} } \right)\) indicates the extent to which an expert retains the current opinion. Generally, a larger \(\beta \left( {\beta_{{i_{0} }} } \right)\) means more preference protection, and a smaller \(\beta \left( {\beta_{{i_{0} }} } \right)\) is conducive to accelerating the consensus reaching. Thus, there is often a tradeoff between accelerating the consensus reaching and preserving the initial preferences of the experts.
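The stage division of the feedback mechanism can be summarized by a small dispatch function. The function name `adjustment_strategy` and the returned labels are hypothetical; the comparison operators follow the stage conditions \({\text{GCL}} < \underline{{{\text{GCL}}}}\) and \(\underline{{{\text{GCL}}}} \le {\text{GCL}} < \overline{{{\text{GCL}}}}\) stated above.

```python
def adjustment_strategy(gcl, gcl_lower, gcl_upper):
    """Choose the action for the current group consensus level.

    gcl_lower -- consensus threshold for the stage division
    gcl_upper -- satisfactory consensus level
    """
    if not gcl_lower < gcl_upper:
        raise ValueError("the stage threshold must be below the satisfactory level")
    if gcl < gcl_lower:
        return "cluster adjustment"      # first stage
    if gcl < gcl_upper:
        return "individual adjustment"   # second stage
    return "selection"                   # consensus reached


# Example call with illustrative values.
action = adjustment_strategy(0.60, gcl_lower=0.85, gcl_upper=0.90)
```

A CRP loop would call this function once per iteration, recompute the clustering and \({\text{GCL}}\) after each adjustment, and stop when the selection branch is reached.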

5.2 Selection process

If the group consensus level meets the condition \(\mathrm{GCL}\ge \overline{\mathrm{GCL}}\), group consensus is considered to have been reached among the experts. In this situation, a high-quality (i.e., low-disagreement) large group decision can be made using the selection process. To obtain the best alternative, we propose to use the INF-IOWA operator to calculate the overall evaluation of the alternatives. In the following, let \({C}_{l}=\left\{{C}_{lk}|k=1,2,\dots ,l\right\}\) denote the set of clusters at the optimal cutting level \({\alpha }_{l}\) when the group consensus reaches the satisfactory level. The general process for alternative selection consists of the following steps.

Step 1: Aggregating experts’ evaluations into cluster’s evaluation.

For the cluster \(C_{lk} \in C_{l}\), we aggregate the assessment information of experts on alternatives into a cluster’s assessment using the INF-IOWA operator. Let \(D_{lk}^{{{\text{cluster}}}}\) represent the cluster preference of \(C_{lk} \left( {k = 1,2, \ldots, l} \right)\). Without loss of generality, denote \(C_{lk} = \left\{ {e_{lk,1} , \ldots ,e_{{lk,m_{lk} }} } \right\}\), where \(m_{lk}\) represents the number of experts in \(C_{lk}\). Then, \(D_{lk}^{{{\text{cluster}}}} \left( {k = 1,2, \ldots, l} \right)\) is computed as:

$$ D_{lk}^{{{\text{cluster}}}} = \emptyset_{W}^{{I{\text{NF}}}} \left( {\left\{ {\left\langle {{\text{INF}}_{lk,i} ,D^{{lk,{\text{i}}}} } \right\rangle |i = 1, \ldots ,m_{lk} } \right\}} \right), $$
(34)

where \({\text{INF}}_{lk,i}\) and \(D^{{lk,{\text{i}}}}\) are the influence and DLPR of the expert \(e_{lk,i}\) in \(C_{lk}\).

Step 2: Aggregating clusters’ evaluations into the large group’s evaluation.

Based on the clusters’ preferences \(D_{l1}^{{{\text{cluster}}}} , \ldots ,D_{ll}^{{{\text{cluster}}}}\), we can obtain the large group’s evaluation \(D^{{{\text{LG}}}}\) using the INF-IOWA operator.

$$ D^{{{\text{LG}}}} = \emptyset_{W}^{{{\text{INF}}}} \left( {\left\{ {\left\langle {I{\text{NF}}\left( {C_{lk} } \right),D_{lk}^{{{\text{cluster}}}} } \right\rangle |k = 1, \ldots ,l} \right\}} \right). $$
(35)

Step 3: Ranking alternatives

Based on the aggregated preference relation \(D^{{{\text{LG}}}} = \left( {d_{ij}^{{{\text{LG}}}} } \right)_{n \times n}\), the weighted average operator of LDAs in Eq. (5) is used as the evaluation operator for the alternatives, with \(w_{i} = \frac{1}{n}\). Let \(v = \left( {v_{1} ,v_{2} , \ldots ,v_{n} } \right)^{{\text{T}}}\) be the vector of overall evaluation values of the alternatives in the set \(X\). Then the overall evaluation of the alternative \(x_{i}\) is calculated as follows:

$$ v_{i} = {\text{NS}}\left( {{\text{DAWA}}\left( {d_{i1}^{{{\text{LG}}}} , \ldots ,d_{in}^{{{\text{LG}}}} } \right)} \right). $$
(36)

Let \(v_{{{\text{opt}}}} = \max \left( {v_{1} ,v_{2} , \ldots ,v_{n} } \right)\). The alternative \(x_{{{\text{opt}}}}\) corresponding to \(v_{{{\text{opt}}}}\) is then the best alternative in the set \(X\).

The framework of the proposed approach is presented in Fig. 1.

Fig. 1 The framework of the proposed model

6 Case Analysis

Ecological civilization construction has become an essential strategy for resolving China’s severe resource and environmental issues. In recent years, the Chinese government has attached great importance to environmental protection and pollution prevention and control, and has repeatedly put forward the requirement of actively exploring new patterns of environmental management. With the rapid development of new generation of information technology, smart environmental protection construction has become an important means to promote environmental management innovation and improve environmental management performance. In this section, the feasibility and effectiveness of the proposed model are verified by analyzing the location decision-making problem of smart environmental protection project in a city in China.

As a pilot city for smart environmental protection, a city in Central China decided to promote the construction of a smart environmental protection project in its urban area. The location decision is a key issue in the construction of the project and has attracted extensive public attention. After comprehensively considering construction requirements, supporting facilities, environmental impact and other factors, the management department of the project selected four locations in the city as alternative construction sites. To improve the scientific rigor, transparency and social consensus of the site selection decision, the management department invited 20 experts to form a decision-making group that will jointly make the decision. This group is composed of experts in fields relevant to the project and some representative members of the public. Due to the large scale of the group and the different backgrounds of the participants, it is difficult for the group to reach a consistent location decision. To improve the consensus level of the decision, this paper applies the proposed influence-based consensus model to this problem. Let \(X=\left\{{x}_{1},{x}_{2},{x}_{3},{x}_{4}\right\}\) denote the set of candidate sites, and \(E = \left\{ {e_{1} ,e_{2} , \ldots ,e_{20} } \right\}\) be the set of 20 decision-making participants. The linguistic term set is \(S = \left\{ {s_{0} :{\text{very poor,}}\;s_{1} :{\text{generally poor,}}\;s_{2} :{\text{poor,}}\;s_{3} :{\text{medium,}}\;s_{4} :{\text{good,}}\;s_{5} :{\text{generally good,}}\;s_{6} :{\text{very good}}} \right\}\). The participants’ preference information about the alternatives, given as comparative linguistic expressions, is collected by questionnaire surveys. The DLPR provided by the expert \(e_{1}\) on the set \(X\) is shown in Table 1. Due to space limitations, the preferences of the other experts are not listed in the paper.

Table 1 DLPR provided by expert \({{\varvec{e}}}_{1}\)

We use the proposed consensus model based on the influence network to solve the site selection problem of the smart environmental protection project. In this case study, the consensus threshold for the stage division is set to \(\underline{{{\text{GCL}}}} = 0.85\) and the satisfactory consensus level to \(\overline{{{\text{GCL}}}} = 0.90\). For the first stage of the CRP, the thresholds for the division of influence levels are \({\text{INF}}_{{{\text{min}}}} = 0.65,\;{\text{INF}}_{{{\text{max}}}} = 0.75\), and the similarity threshold of clusters is \(\delta_{{{\text{sim}}}} = 0.65\). After the iterations in the first stage, the group consensus level and preference similarity are significantly improved. Consequently, we use different parameters in the second stage, namely \({\text{INF}}_{{{\text{min}}}} = 0.7,\;{\text{INF}}_{{{\text{max}}}} = 0.8\), and \(\delta_{{{\text{sim}}}} = 0.8\). There is a preference adjustment parameter \(\beta \left( {\beta_{{i_{0} }} } \right)\) in both stages. For simplicity, the same parameter value is used in this example, i.e., \(\beta = \beta_{{i_{0} }} = 0.7\). To investigate the impact of the adjustment parameter on consensus reaching, a sensitivity analysis is conducted later.

Firstly, the modified AHC method is applied to detect subgroups in the large-scale group of experts. After the AHC procedure for LSGDM with DLPRs given in Algorithm 1 is carried out, the clustering outcomes are visualized by a dendrogram (Fig. 2). As shown in Fig. 2, if the dendrogram is cut at different heights, the experts are partitioned into different clusters. For example, if the dendrogram is cut at height 0.37, the experts are grouped into four clusters: \(({e}_{4},{e}_{15},{e}_{18})\), \(({e}_{2},{e}_{3},{e}_{7},{e}_{8},{e}_{9},{e}_{10},{e}_{11},{e}_{13},{e}_{20})\), \(({e}_{1},{e}_{6})\) and \(({e}_{5},{e}_{12},{e}_{14},{e}_{16},{e}_{17},{e}_{19})\). Given the set of \(\alpha \) levels \(\left\{2,3,\dots ,19\right\}\), the cluster internal cohesion index \({\delta }_{\mathrm{int}}\), the cluster external cohesion index \({\delta }_{\mathrm{ext}}\), the consensus contribution degree \({\delta }_{\mathrm{cc}}\) and the group consensus level \(\mathrm{GCL}\) are calculated. Table 2 provides the related results for \(\alpha =2,3,4,5,6\). According to the results, the maximum initial consensus level is 0.6374, and the optimal \(\alpha \) level is \(\alpha =2\). Since the current \(\mathrm{GCL}\) is less than the consensus threshold \(\underline{\mathrm{GCL}}\) (0.6374 < 0.85), the CRP must be carried out, and the cluster adjustment strategy should be adopted in the feedback mechanism.

Fig. 2 Dendrogram generated by Algorithm 1 at the first iteration

Table 2 Initial clustering results and group consensus level

Secondly, the CRP is carried out to improve the group consensus level. In the following, we only take the first iteration as an example to give the calculation details. According to the CRP proposed in Sect. 5, the first step is to identify the cluster with the lowest consensus contribution at each iteration.

As mentioned above, the optimal level is \(\alpha =2\) and the experts are grouped into two clusters:

$${C}_{21}=\left\{{e}_{5},{e}_{12},{e}_{14},{e}_{16},{e}_{17},{e}_{19}\right\}$$

and

\({C}_{22}=\left\{{e}_{1},{e}_{2},{e}_{3},{e}_{4},{e}_{6},{e}_{7},{e}_{8},{e}_{9},{e}_{10},{e}_{11},{e}_{13},{e}_{15},{e}_{18},{e}_{20}\right\}\).

As shown in Table 2, we can observe that \({C}_{21}\) is the cluster with the lowest consensus contribution. According to the cluster adjustment strategy, experts in cluster \({C}_{21}\) should change their preferences synchronously.

To generate the recommendation, we construct the influence network. The influences of the two clusters \({C}_{21}\) and \({C}_{22}\) calculated by Eq. (19) are 0.7126 and 0.6948, which means that both are medium-influence clusters. Further, the similarity between the clusters \({C}_{21}\) and \({C}_{22}\) is computed as \(0.7739\). According to the cluster consensus level, we can determine that the cluster \({C}_{22}\) provides the recommendation for the cluster \({C}_{21}\). The influences of the experts in \({C}_{22}\) calculated by Eq. (15) are {0.8055, 0.8199, 0.8226, 0.7853, 0.7999, 0.8105, 0.7939, 0.8065, 0.8051, 0.7924, 0.8147, 0.7875, 0.7852, 0.7978}. Using the method in Sect. 4, we can obtain the influence network. From Eq. (26), the preference of the cluster \({C}_{22}\) can be obtained, and the preferences of the experts in \({C}_{21}\) can be adjusted using Eq. (28). To save space, the details are not listed here. After 10 iterations in the first stage, the group consensus level \((\mathrm{GCL}=0.8519)\) reached the threshold \((\underline{\mathrm{GCL}}=0.85)\).

In the second stage, the optimal level is \(\alpha =3\) and the experts are grouped into three clusters:

$${C}_{31}=\left\{{e}_{1},{e}_{2},{e}_{3},{e}_{4},{e}_{5},{e}_{6},{e}_{7},{e}_{9},{e}_{11}\right\},$$
$${C}_{32}=\left\{{e}_{8},{e}_{10},{e}_{17},{e}_{18},{e}_{19},{e}_{20}\right\}$$

and

$${C}_{33}=\left\{{e}_{12},{e}_{13},{e}_{14},{e}_{15},{e}_{16}\right\}.$$

Using Eq. (9), we obtain the following cluster consensus levels for \({C}_{31}\), \({C}_{32}\) and \({C}_{33}\): 0.8581, 0.8493 and 0.8484, respectively. We can see that \({C}_{33}\) is the cluster with the lowest consensus contribution to the group consensus. Further, by applying Eq. (29), the expert \({e}_{16}\) is identified as the one who should modify his/her preference. To generate the recommendation, we determine the experts in \({C}_{33}\) that have influence on the expert \({e}_{16}\). Using Eqs. (4), (15) and (20), we can determine that the experts \({e}_{12},{e}_{13},{e}_{14},{e}_{15}\) have influence on the expert \({e}_{16}\). Using the INF-IOWA operator, we obtain the preference recommendation for the expert \({e}_{16}\). The adjusted preference is then obtained using Eq. (33).

Figure 3 shows the clustering dendrogram after the group consensus level has reached the satisfactory level. The large-scale group is classified into five clusters, i.e., \({C}_{51}=\left\{{e}_{1},{e}_{2},{e}_{3},{e}_{4},{e}_{5}\right\}\), \({C}_{52}=\left\{{e}_{6},{e}_{7}\right\}\), \({C}_{53}=\left\{{e}_{8},{e}_{9},{e}_{10},{e}_{11}\right\}\), \({C}_{54}=\left\{{e}_{12},{e}_{13},{e}_{14},{e}_{15}\right\}\), \({C}_{55}=\left\{{e}_{16},{e}_{17},{e}_{18},{e}_{19},{e}_{20}\right\}\). The selection process can then be conducted to determine the best alternative. We obtain the cluster preferences using Eq. (34) and aggregate them into the group preference using Eq. (35). According to Eq. (36), the weighted average operator is used to calculate the overall evaluation values of the four alternatives: \({v}_{1}=0.5524\), \({v}_{2}=0.3508\), \({v}_{3}=0.2373\), \({v}_{4}=0.4610\). These values give the ranking \({x}_{1}\succ {x}_{4}\succ {x}_{2}\succ {x}_{3}\). Therefore, alternative \({x}_{1}\) can be submitted to the decision maker as the best alternative.
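The final ranking step can be checked with a short Python sketch that sorts the reported overall evaluation values in decreasing order. The dictionary keys and the function name `rank_alternatives` are hypothetical; the numerical values are those reported above.

```python
def rank_alternatives(scores):
    """Return alternative labels sorted by decreasing overall evaluation."""
    return sorted(scores, key=scores.get, reverse=True)


# Overall evaluation values reported in the case study.
scores = {"x1": 0.5524, "x2": 0.3508, "x3": 0.2373, "x4": 0.4610}
ranking = rank_alternatives(scores)
best = ranking[0]
```

The first element of the returned list corresponds to \(x_{{{\text{opt}}}}\) in the selection process.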

Fig. 3 Dendrogram at the last iteration

Since the adjustment parameter is an important parameter in the CRP, a sensitivity analysis of this parameter is conducted in the numerical analysis. We run the numerical example for the adjustment parameter values \(\beta ={\beta }_{{i}_{0}}=0.5, 0.6, 0.7, 0.8\), respectively. Figure 4 shows the impact of the parameter \(\beta ({\beta }_{{i}_{0}})\) on the evolution of the group consensus level. The following conclusions can be drawn from Fig. 4. Firstly, the evolution of the group consensus level has a prominent stage characteristic. When the group consensus level is lower than \(\underline{\mathrm{GCL}}\) (the first stage), the group consensus level improves noticeably faster. The main reason is that different preference adjustment strategies are adopted in the different stages of the CRP. Secondly, generally speaking, the smaller the preference adjustment parameter, the faster the consensus improves. From Fig. 4, we can see that when the adjustment parameter \(\beta ({\beta }_{{i}_{0}})\) is set to 0.5, 0.6, 0.7 and 0.8, it takes 9, 11, 10 and 20 iterations, respectively, for the group consensus level to reach the threshold \(\underline{\mathrm{GCL}}\). Moreover, it takes 22, 29, 41 and 63 iterations, respectively, for the group consensus level to reach the satisfactory level \(\overline{\mathrm{GCL}}\). Thirdly, the structure of the influence network is also a factor affecting the convergence of group consensus. Although a smaller preference adjustment parameter generally leads to faster consensus improvement, the evolution curves of the group consensus level for \(\beta =0.6\) and \(\beta =0.7\) intersect at some points. The specific characteristics of the influence network are believed to be an important reason for this phenomenon.

Fig. 4 Evolution of the group consensus level with different adjustment parameters

7 Discussions: advantages and limitations

Advantages. We identify several advantages of the proposed model by comparing it with closely related works.

(1). Comparison with [59,60,61] from the perspective of preference clustering. The preference clustering process plays an important role in LSGDM. Zhang et al. [59] developed a preference clustering approach based on broad first searching neighbors for LSGDM with CLEs. In this clustering algorithm, one of the most important inputs is the similarity degree matrix among decision makers, which is calculated from the associated fuzzy preference relations. For LSGDM with hesitant fuzzy linguistic information, Zhong et al. [60] presented a clustering method integrating correlation and consensus to divide the large group of experts into several clusters. Zheng et al. [61] proposed a bi-objective clustering algorithm based on a group consensus degree indicator and a group information entropy indicator to divide the experts into different clusters, considering the similarity relationship and the quality of evaluation information simultaneously. In our work, the conventional AHC algorithm is adapted to detect subgroups in LSGDM problems with CLEs. Compared with these works, the proposed clustering algorithm uses an effective distance measure between LDAs to calculate the information distance. More importantly, based on the informative output of the proposed clustering algorithm, an effective index can be constructed to measure the group consensus level.

(2). Comparison with [24, 32, 50, 62] from the perspective of social network construction. Pérez et al. [24] presented a model that gathers the experts’ initial opinions and provides a framework to represent the influence of a given expert over the other(s). Tian et al. [32] developed a novel SNA-based decision framework for addressing LSGDM problems with incomplete interval type-2 fuzzy information. Gai et al. [62] proposed a joint feedback strategy framework that helps large-scale group decision makers reach an agreement by combining social network context and feedback behavior. In these works, information about the social network is usually handled as extra input information for the decision analysis. In contrast, our model does not need to collect extra information about the social network in advance, but directly uses the preference information about alternatives to construct the influence network. As mentioned earlier, the influence network construction in our work is inspired by Ureña et al. [50]. Unlike Ureña et al. [50], we incorporate certainty of judgment into the influence measurement, because a DLPR is a kind of preference information with high uncertainty. We design a consistency-similarity-certainty based influence measure to model the influence of experts. Meanwhile, a consensus-similarity-certainty based influence measure is developed to model the influence of clusters.

(3). Comparison with [13, 63, 64] from the perspective of CRP. Compared with CRP in GDM with a few experts, CRP in LSGDM is more important and necessary because opinions among a large number of participants tend to be easily polarized and conflicting. In recent years, many proposals have been introduced to handle the CRP in LSGDM problems with linguistic information. Rodríguez et al. [13] introduced a new cohesion measure for HFLTS that measures cluster cohesiveness to drive the consensus process and thus reduces the impact of internal disagreements arising in majority-driven CRPs. Further, the measure is integrated into a new cohesion-driven CRP approach for LSGDM to deal with CLEs. Gou et al. [63] proposed a CRP model for LSGDM with double hierarchy hesitant fuzzy linguistic preference relations. To support the CRP, they also proposed a similarity degree-based clustering method, a double hierarchy information entropy-based weight-determining method and the corresponding consensus measures. Xiao et al. [64] developed a framework to address personalized individual semantics and consensus in LSGDM with linguistic distribution preference relations. They devised a two-stage consensus-reaching model to manage individual consistency and group consensus, which seeks to minimize the preference information loss. Compared with these CRP models in the linguistic information context, we establish a two-stage feedback mechanism guided by the influence network for the CRP, which adopts the cluster adjustment strategy or the individual adjustment strategy depending on the consensus level. The proposed mechanism not only improves the efficiency of the CRP, but also takes individual preference adjustment into account.

Limitations. We also identify some weaknesses of the proposed model, which need to be overcome in future research.

(1). For CLEs provided by experts, the proposed model transforms them into LDAs with equal proportions. Although this transformation can simplify the analysis, it is not necessarily the most reasonable transformation. We argue that it would be worth introducing the linguistic distribution-based optimization approach for transforming CLEs into LDAs [59].

(2). In LSGDM dealing with CWW, words mean different things to different people. In recent years, personalized individual semantics (PIS) have attracted extensive attention in GDM due to their influence on the final decision in linguistic contexts [65,66,67,68]. However, the PIS issue is not considered in our model. We argue that it would be very interesting in the future to address PIS and consensus in LSGDM problems with complex linguistic information.
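
The equal-proportion transformation mentioned in limitation (1) can be illustrated with a small sketch. It assumes a CLE of the form "between \({s}_{l}\) and \({s}_{u}\)" over an integer-indexed linguistic term set, and it represents the resulting LDA as a mapping from term index to symbolic proportion. The function name `cle_to_lda` and this data layout are illustrative assumptions, not the paper's notation.

```python
from fractions import Fraction


def cle_to_lda(lower, upper):
    """Spread a CLE 'between s_lower and s_upper' evenly over the covered terms.

    Hypothetical helper: returns a dict mapping each covered term index to
    its symbolic proportion, with all proportions equal and summing to 1.
    """
    if lower > upper:
        raise ValueError("lower index must not exceed upper index")
    terms = range(lower, upper + 1)
    share = Fraction(1, len(terms))
    return {k: share for k in terms}


# "between s2 and s4" becomes {(s2, 1/3), (s3, 1/3), (s4, 1/3)}
lda = cle_to_lda(2, 4)
for index, proportion in lda.items():
    print(f"s{index}: {proportion}")
```

An optimization-based transformation, as suggested in limitation (1), would replace the fixed `share` with proportions chosen by a model, which is exactly the degree of freedom the equal-proportion rule gives up.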

8 Conclusion

This paper proposes an influence-driven consensus model for LSGDM problems with comparative linguistic expressions. Firstly, an agglomerative hierarchical clustering algorithm is designed for LSGDM with DLPRs. The developed algorithm uses the distance measure between linguistic distribution assessments to calculate the distance between DLPRs. More importantly, the proposed clustering method can determine the optimal clustering level by considering the measurement of the group consensus level. Secondly, we propose a novel method for constructing an influence network among participants in LSGDM. In most existing decision models for LSGDM in a social network environment, information about the social network is usually treated as extra input information for the decision analysis. In contrast, the influence network proposed in this paper is inferred from the opinions of the participants. Thirdly, an influence-driven CRP model is proposed for LSGDM problems. In the proposed CRP, the feedback mechanism is designed by combining the cluster adjustment strategy and the individual adjustment strategy, which not only improves the efficiency of consensus reaching, but also takes the individual adjustment of preferences into account.

In the future, we plan to develop an optimization-based approach to deal with personalized individual semantics (PIS) and CRP modeling in LSGDM with complex linguistic information, considering the fact that words mean different things to different people. Also, the GDM process in practice involves not only mathematical issues but also psychological factors (such as non-cooperative behaviors). Therefore, it would be interesting to consider the psychological factors of decision makers in the proposed LSGDM framework.