Primary node election based on probabilistic linguistic term set with confidence interval in the PBFT consensus mechanism for blockchain

This study proposes a primary node election method based on probabilistic linguistic term set (PLTS) for the practical Byzantine fault tolerance (PBFT) consensus mechanism to effectively enhance the efficiency of reaching consensus. Specifically, a novel concept of the probabilistic linguistic term set with a confidence interval (PLTS-CI) is presented to express the uncertain complex voting information of nodes during primary node election. Then, a novel score function based on the exponential semantic value and confidence approximation value for the PLTS-CI, called Score-ESCA, is used to solve the problems of comparing different nodes with various voting attitudes. This method helps select the node with the highest score by utilizing complex decision attitudes, making it an accurate primary node election solution. Furthermore, the feasibility of our proposed method is proved by both theoretical analysis and experimental evaluations.

The PBFT consensus mechanism has shown great potential in enhancing the efficiency of reaching consensus owing to the existence of the primary node [32,33]. The primary node plays a critical role in generating blocks in turn in the PBFT consensus mechanism. Only one primary node exists in each round of consensus, and its identity is confirmed before the consensus process [23,34]. The primary node has the authority to sort and broadcast transactions [35][36][37]. Hence, the election of the primary node can pose threats to a distributed blockchain system. For instance, an attacker may prevent a vulnerable primary node from sending broadcast messages, or a malicious node may be elected as the primary node [38,39]. Either case will cause the consensus process of the blockchain network to stall.
As the quantity of blockchain nodes increases, the security challenges of the PBFT consensus mechanism become more important [40][41][42]. Some studies have aimed to overcome these security threats. Lao et al. [43] proposed a location-based and scalable PBFT consensus mechanism, called Geographic-PBFT, for IoT-blockchain applications. Li et al. [21] proposed a double-layer PBFT protocol that can accommodate more faulty nodes. However, these methods focus only on optimizing the scalability when nodes communicate frequently. The consensus security issues caused by a malicious primary node are not considered. As the primary node is chosen based on the time at which it joins the network [44], the probability of a malicious node being elected as the primary node is relatively high. Consequently, the fault tolerance of blockchain will be compromised. Xu et al. [45] proposed a concurrent PBFT consensus method, called C-PBFT. After classifying the peers in the supply chain into several clusters, the trusted primary node is elected by reputation assessment. Wang et al. [46] proposed an improved credit-based PBFT consensus algorithm (CPBFT), in which the probability of a node being elected as the primary node is affected by its past behavior. However, the iterative calculation of credit values and the consensus stage of a primary node incur a relatively large cost. Li et al. [47] proposed that the primary node of the PBFT consensus mechanism be elected according to a voting strategy. This can reduce the probability of a malicious node being elected as the primary node. However, the voting options available to nodes in this strategy do not reflect the complex decision situations that actually arise when voting attitudes are expressed.
Fuzzy set theory was therefore introduced as a tool to cope with uncertainty in the decision-making process by considering many different perspectives [48]. Karaşan et al. [48] utilized hesitant fuzzy Z-numbers to consider both the preciseness of the data and the hesitancy of the experts when evaluating the risks of blockchain technology. However, that work focuses on blockchain risk evaluation rather than node selection. Xu et al. [49] proposed a selection method for agent nodes in the DPoS consensus mechanism based on vague sets, similar to how human elections are held. However, there still may be a situation in which the rankings of the nodes are indistinguishable. Liu et al. [50] pointed out that the probabilistic linguistic term set (PLTS) can be used to represent uncertain voting information when selecting delegates in the DPoS consensus mechanism. However, the primary node election involves multiple factors, also called attributes, which usually have different importance weights. These improvements consider only a single attribute of nodes one-sidedly, ignoring other aspects of complex voting attitudes and sensitive information about human emotions. Therefore, given these difficulties in primary node election, the security of the PBFT consensus mechanism still cannot be guaranteed effectively.
Throughout the paper, we focus on primary node election in the PBFT consensus mechanism. To represent more complicated voting attitudes, a novel PLTS with a confidence interval (PLTS-CI) is proposed to deal with the uncertain voting information in the PBFT consensus mechanism. Then the primary node election process is formulated as a multiple attribute decision-making (MADM) problem to find the most suitable primary node by synthetically evaluating the values of multiple attributes of all nodes. The proposed consensus mechanism can select a credible primary node according to the voting before the implementation of consensus, which reduces the probability of malicious nodes participating in block generation in the blockchain. Over multiple rounds of consensus, a node whose score is always high can be quickly identified as the primary node. Similarly, the nodes that always have low scores can be identified as malicious nodes and removed, making the blockchain more stable. Therefore, the probability of malicious nodes being elected as the primary node is greatly reduced, and the security of the proposed consensus mechanism is more effectively guaranteed than in previous literature. In summary, the major contributions of this article are as follows:
1. The new concept of PLTS-CI is proposed to express the uncertain complex voting preference information of nodes.
2. By analyzing the subscript deviation of PLTS-CI, a novel exponential semantic value of the linguistic term is defined.
3. A confidence approximation value in PLTS-CI is proposed to enhance the degree of evaluation certainty of the decision-makers.
4. The proposed score function based on exponential semantic value and confidence approximation value in PLTS-CI (Score-ESCA) can better distinguish the PLTS-CIs used to indicate the evaluation of nodes.
5. Furthermore, the Score-ESCA method is incorporated into an improved technique for order preference by similarity to ideal solution (TOPSIS) method for primary node election in the PBFT consensus mechanism.
The rest of this paper is organized as follows: the basic definitions relevant to the PBFT consensus mechanism, linguistic term set, PLTS, and framework of MADM are reviewed in the next section. The following section puts forward the novel concept of PLTS-CI and develops a novel score function called Score-ESCA; the primary node election process is then formalized as a probabilistic linguistic MADM problem. The section after that presents a primary node election example using our methodology. Theoretical analysis and experimental analysis are then given to demonstrate the advantages of our proposed method. Finally, the last section gives the conclusions of this paper.

Preliminaries
In this section, we briefly review the basic knowledge of the PBFT, linguistic term set (LTS), PLTS, and framework of MADM.

PBFT consensus mechanism
PBFT is a state machine replication mechanism that correctly survives Byzantine faults in asynchronous networks [51]. The client and consensus nodes work together to complete the consensus process. The consensus nodes are divided into one primary node and several replica nodes:
1. Primary node: Before the consensus process begins, the primary node is responsible for receiving a certain number of transactions from the client and multicasting them sequentially to other replica nodes. In particular, the primary node is voted on by valid consensus nodes.
2. Replica nodes: Replica nodes execute in the order specified by the primary node, and the consistent content of blocks is guaranteed by ensuring that requests are executed in a consistent order.
In the PBFT consensus mechanism, the client issues a request to the primary node. After the primary and replica nodes agree upon the request, it is decided whether the request can be executed. As shown in Fig. 1, the specific consensus process consists of the following steps:
Step 1: before implementing the PBFT consensus mechanism, all consensus nodes are equally likely to be elected as the primary node. After the primary node is initialized, a new-view message is sent to synchronize the data of all nodes.
Step 2: the client sends a request to activate the service operation on the primary node. After the request validation succeeds, the primary node broadcasts the request and sends pre-prepare messages to all replicas.
Step 3: after a valid replica node verifies the correctness of the message, it broadcasts the prepare message to all nodes and enters the prepare stage.
Step 4: the replica node collects the prepare messages for the request. When more than 2f prepare messages are counted (f is the number of tolerable Byzantine nodes), the node enters the commit phase and broadcasts the commit message.
Step 5: the node counts the number of received commit messages. When more than 2f commit messages are counted, this means that most of the nodes have reached a consensus. The node then writes the data, caches the last request of the client, and reports it back to the client.
Step 6: if the client receives f + 1 identical reply messages, it means that the request initiated by the client has reached network consensus. Otherwise, the client needs to determine whether to resend the request to the primary node.
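The message thresholds in Steps 4 to 6 can be illustrated with a minimal Python sketch. It is not an implementation of the protocol: the function names are hypothetical, messages are reduced to counts, and the relation n >= 3f + 1 between the number of consensus nodes n and the number of tolerable Byzantine nodes f is the standard PBFT bound, which the steps above do not state explicitly.

```python
from collections import Counter


def max_faulty(n: int) -> int:
    """Largest f satisfying the standard PBFT bound n >= 3f + 1 (assumption)."""
    if n < 1:
        raise ValueError("need at least one consensus node")
    return (n - 1) // 3


def quorum_reached(message_count: int, f: int) -> bool:
    """Steps 4 and 5: a phase completes once more than 2f messages are counted."""
    return message_count > 2 * f


def client_accepts(replies: list[str], f: int) -> bool:
    """Step 6: the client accepts once f + 1 identical replies have arrived."""
    if not replies:
        return False
    _, count = Counter(replies).most_common(1)[0]
    return count >= f + 1
```

With four consensus nodes, for example, `max_faulty(4)` gives f = 1, so a replica needs three prepare (or commit) messages to move on, and the client needs two identical replies.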

Additive linguistic term sets
The LTS is a set of linguistic variables whose terms are in a natural language. Decision-makers can use the LTS to elaborate their views on a subject. It consists of two parts: a linguistic description operator and semantics. The LTS is the modeling basis of the PLTS. An additive LTS can be defined as follows [52]:
$$S = \{ s_\rho \mid \rho = 0, 1, \ldots, \delta \},$$
where $s_\rho$ denotes a linguistic variable. In particular, $\delta$ is a positive integer and $\delta + 1$ denotes the cardinality of $S$. $S$ has the following characteristics:
1. If $\alpha > \beta$, then $s_\alpha > s_\beta$;
2. The negation operator is defined as follows: $\mathrm{neg}(s_\alpha) = s_\beta$, where $\alpha + \beta = \delta$;
3. The maximum operator is defined as follows: if $s_\alpha \ge s_\beta$, then $\max(s_\alpha, s_\beta) = s_\alpha$;
4. The minimum operator is defined as follows: if $s_\beta \ge s_\alpha$, then $\min(s_\alpha, s_\beta) = s_\alpha$.
Example 2.1 Let us suppose that some decision-makers evaluate the memory of a node, and their attitudes vary from "very high" to "very low"; then an LTS $S_1$ can be described as follows:
$$S_1 = \{ s_0 = \text{very low},\ s_1 = \text{low},\ s_2 = \text{slightly low},\ s_3 = \text{medium},\ s_4 = \text{slightly high},\ s_5 = \text{high},\ s_6 = \text{very high} \},$$
where $\delta = 6$ and the cardinality of $S_1$ is 7.
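As a small illustration, the operators of an additive LTS can be sketched in Python by working on the subscripts alone. The list `S1` and the function names are hypothetical; the seven labels are the ones the paper later uses for the memory attribute.

```python
# Seven-term LTS for the memory attribute; index rho stands for s_rho.
S1 = ["very low", "low", "slightly low", "medium",
      "slightly high", "high", "very high"]
DELTA = len(S1) - 1  # delta = 6, so the cardinality is delta + 1 = 7


def neg(alpha: int, delta: int = DELTA) -> int:
    """Negation: neg(s_alpha) = s_beta with alpha + beta = delta."""
    if not 0 <= alpha <= delta:
        raise ValueError("subscript outside the term set")
    return delta - alpha


def lts_max(alpha: int, beta: int) -> int:
    """Maximum operator: returns the subscript of the larger term."""
    return max(alpha, beta)


def lts_min(alpha: int, beta: int) -> int:
    """Minimum operator: returns the subscript of the smaller term."""
    return min(alpha, beta)
```

For instance, `S1[neg(5)]` is "low", the negation of "high", and the middle term "medium" is its own negation.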

Probabilistic linguistic term set
Pang et al. [53] proposed a PLTS comprising several linguistic terms with probabilities. The PLTS embodies the fuzziness and hesitation regarding decision information and contains the probability information of decision information [54]. The mathematical expression of a PLTS can be given as follows:
Definition 2.1 [53]. Let $S = \{ s_\rho \mid \rho = 0, 1, \ldots, \delta \}$ be an LTS; then a PLTS is defined as
$$L(p) = \Big\{ L^{(k)}(p^{(k)}) \ \Big|\ L^{(k)} \in S,\ p^{(k)} \ge 0,\ k = 1, 2, \ldots, \#L(p),\ \sum_{k=1}^{\#L(p)} p^{(k)} \le 1 \Big\},$$
where $L^{(k)}(p^{(k)})$ represents the linguistic term $L^{(k)}$ attached to the matching probability $p^{(k)}$, and $\#L(p)$ denotes the number of different linguistic terms in $L(p)$. Due to the complexity or specialization of the evaluation problem and the fuzziness of human cognition, the evaluation information that decision-makers provide is usually incomplete, so the sum of the probabilities of all possible linguistic terms in a PLTS may be less than 1. To eliminate this partial ignorance, the PLTS can be normalized by rescaling each probability [53]:
$$\dot{p}^{(k)} = \frac{p^{(k)}}{\sum_{j=1}^{\#L(p)} p^{(j)}}, \quad k = 1, 2, \ldots, \#L(p).$$
In addition, the numbers of linguistic terms in different PLTSs are usually different, which increases the complexity of the decision calculation. Thus, PLTSs with fewer linguistic terms are usually extended so that different PLTSs have equal numbers of elements [53]. Let $L_i(p) = \{ L_i^{(k)}(p_i^{(k)}) \mid k = 1, 2, \ldots, \#L_i(p) \}$, $i = 1, 2$, be two PLTSs. If $\#L_1(p) > \#L_2(p)$, then we append $\#L_1(p) - \#L_2(p)$ linguistic terms to $L_2(p)$. Each added linguistic term is the smallest term in $L_2(p)$, and its probability is zero.

Definition 2.3
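The normalization of $L_1(p)$ and $L_2(p)$ can be divided into two steps: (1) if the probabilities in $L_i(p)$ sum to less than 1, they are rescaled so that they sum to 1; (2) if $\#L_1(p) \ne \#L_2(p)$, linguistic terms with zero probability are appended to the PLTS with fewer terms, as described above.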
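The two normalization steps can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: a PLTS is represented, by assumption, as a list of `(subscript, probability)` pairs, the probabilities are rescaled by dividing by their sum, and the function names are hypothetical.

```python
PLTS = list[tuple[int, float]]  # (subscript rho of s_rho, probability)


def normalize_probabilities(plts: PLTS) -> PLTS:
    """Step 1: rescale the probabilities so that they sum to 1."""
    total = sum(p for _, p in plts)
    if total <= 0:
        raise ValueError("a PLTS needs positive total probability")
    return [(term, p / total) for term, p in plts]


def pad(plts: PLTS, length: int) -> PLTS:
    """Step 2: append the smallest term with probability zero until `length`."""
    smallest = min(term for term, _ in plts)
    return plts + [(smallest, 0.0)] * (length - len(plts))


def normalize_pair(l1: PLTS, l2: PLTS) -> tuple[PLTS, PLTS]:
    """Apply both steps so that the two PLTSs become comparable."""
    l1, l2 = normalize_probabilities(l1), normalize_probabilities(l2)
    length = max(len(l1), len(l2))
    return pad(l1, length), pad(l2, length)
```

For example, normalizing `[(4, 0.4), (5, 0.4)]` together with `[(3, 0.5)]` rescales the first to probabilities 0.5 and 0.5, and turns the second into `[(3, 1.0), (3, 0.0)]`.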

Multiple attribute decision-making analysis
The goal of an MADM problem is to rank several selected alternatives [55][56][57]. The basic framework of the MADM analysis process can be summarized as follows: 3. Rank the alternatives N 1 , N 2 , …, N m by processing the above decision matrix.

Primary node election based on PLTS-CI
In this section, the novel concept of the PLTS-CI is proposed. Afterward, to compare several PLTS-CIs, a novel score function called Score-ESCA is presented. Then, an attribute weight calculation method based on the relationship among the decision data is improved; on this basis, an improved TOPSIS method for primary node election based on Score-ESCA for MADM is developed.

The concept of the PLTS-CI
In this subsection, the primary node election process of the PBFT consensus mechanism in blockchain is formalized as a decision-making problem that can be described by the PLTS-CI.
In the PBFT consensus mechanism, the consensus nodes are computers in the blockchain network, which are responsible for transaction ordering and block composition [58][59][60]. The PLTS is an important tool for representing complex decision information that can express the evaluation information of nodes. The primary node election process of the PBFT consensus mechanism is considered a decision-making process. In particular, linguistic terms are regarded as the voting options that the decision-makers use to express the evaluation information of different alternative nodes.
However, for complex blockchain network scenarios, nodes equipped with similar hardware configurations do not necessarily receive the same evaluation information. For instance, although the memory of the current node may be evaluated as "very high", the decision-makers may still be unable to commit fully to this judgment because the linguistic term may not match their opinion exactly. Therefore, the confidence degree of the decision-makers regarding the linguistic terms is an important part of the evaluation information. In many cases, however, decision-makers find it difficult to give a single firm assessment of their confidence in a linguistic term. Hence, the confidence interval is introduced to express the evaluation confidence information of decision-makers. For example, when a decision-maker considers the memory of a node to be "high", the evaluation information can be expressed as $(s_5, [0.5, 0.6])$ by using the linguistic terms in Example 2.1. Note that the confidence interval $[0.5, 0.6]$ represents the confidence degree of the decision-maker in the voting attitude that the node memory is "high".
Therefore, to better express the decision-making information of decision-makers, the probabilistic linguistic term set with a confidence interval (PLTS-CI) is defined. This linguistic term set not only contains multiple linguistic terms with probability information but also gives the evaluation confidence information in the form of interval values. We obtain the following definition.
Definition 3.1 Let $S = \{ s_\rho \mid \rho = 0, 1, \ldots, \delta \}$ be an LTS; then a PLTS-CI can be defined as:
$$Q(pI) = \Big\{ Q^{(k)}\big(p^{(k)}, [I_k^L, I_k^R]\big) \ \Big|\ Q^{(k)} \in S,\ p^{(k)} \ge 0,\ k = 1, 2, \ldots, \#Q(pI),\ \sum_{k=1}^{\#Q(pI)} p^{(k)} \le 1 \Big\},$$
where $[I_k^L, I_k^R]$ denotes the confidence interval of linguistic term $Q^{(k)}$, and $[I_k^L, I_k^R]$ is a nonempty subinterval of $[0, 1]$. In particular, $I_k^L$ and $I_k^R$ denote the lower limit and upper limit of the linguistic confidence interval, respectively. $Q^{(k)}(p^{(k)}, [I_k^L, I_k^R])$ represents the linguistic term $Q^{(k)}$ attached to the matching probability $p^{(k)}$ and confidence interval $[I_k^L, I_k^R]$. $\#Q(pI)$ denotes the number of different linguistic terms in $Q(pI)$.
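One possible in-memory representation of a PLTS-CI is sketched below in Python. The class and function names are hypothetical, and the validation rules assume that the probability constraints of an ordinary PLTS carry over to the PLTS-CI, together with the requirement that each confidence interval lies within [0, 1].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    term: int       # subscript rho of the linguistic term s_rho
    prob: float     # probability p^(k)
    ci_low: float   # lower confidence limit I_k^L
    ci_high: float  # upper confidence limit I_k^R


def validate(plts_ci: list[Element], delta: int) -> None:
    """Raise ValueError if the elements violate the assumed constraints."""
    for e in plts_ci:
        if not 0 <= e.term <= delta:
            raise ValueError("term outside the LTS")
        if e.prob < 0:
            raise ValueError("negative probability")
        if not 0 <= e.ci_low <= e.ci_high <= 1:
            raise ValueError("confidence interval must lie within [0, 1]")
    if sum(e.prob for e in plts_ci) > 1 + 1e-9:
        raise ValueError("probabilities sum to more than 1")


# Memory judged "high" (s_5) with confidence [0.5, 0.6], as in the text.
# The probability 1.0 is an assumption: the text gives only term and interval.
example = [Element(term=5, prob=1.0, ci_low=0.5, ci_high=0.6)]
validate(example, delta=6)
```

Keeping the interval as two separate fields makes it straightforward to pass the limits to whichever score function is applied afterwards.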

The score function of the PLTS-CI
In this subsection, the exponential semantic value and confidence approximation value for considering the emotional sensitivities and confidence degree of decision-makers are developed respectively, based on which a novel Score-ESCA function to measure whether a node can be elected as a primary node is proposed.
In the PLTS-CI, the evaluation information of the nodes is shaped both by the influence of human emotion on the linguistic terms and by the confidence degree of the decision-makers in the linguistic terms they use. Therefore, the linguistic terms of the decision-makers are critically important in expressing the decision information when electing the primary node.
To intuitively reflect the impact of emotional sensitivity on linguistic terms, an exponential semantic value is developed as follows. The LTS is the source of the information expression of the primary node evaluation decision among the m alternative nodes in the PBFT consensus mechanism. Valid votes give opinions regarding the node $N_i$ ($i = 1, 2, \ldots, m$) under consideration. Let $S = \{ s_\rho \mid \rho = 0, 1, \ldots, \delta \}$ be an additive LTS used to describe the evaluation information of node $N_i$. In addition, $s_{\delta/2}$ represents the middle linguistic term used to describe $N_i$, and the rest of the linguistic terms are evenly distributed on either side of it. Thus, if experts' psychological distances between any adjacent linguistic terms in the LTS are equal, then the differences in their corresponding semantic values are also equal. For example, $S_1$ in Example 2.1 is used to determine whether $N_i$ can be elected as the primary node. The evaluations "$s_4$: slightly high", "$s_5$: high" and "$s_6$: very high" are three adjacent linguistic terms. When human emotion is not a factor, the semantic value difference between "$s_4$: slightly high" and "$s_5$: high" is equal to the semantic value difference between "$s_5$: high" and "$s_6$: very high".
However, the emotions of decision-makers can greatly affect the linguistic terms in decision problems. In practical decision-making, the larger the subscript of the linguistic term used to evaluate a node is, the more psychologically sensitive the decision-makers are. For example, when evaluating the pros and cons of nodes, "slightly high" and "high" are two adjacent levels, and "high" and "very high" are also two adjacent levels. Intuitively, the difference between the latter pair is greater than the difference between the former pair. Thus, the semantic value difference between the latter pair should be greater than that between the former pair.
To consider experts' psychology, some linguistic scale functions (LSFs) [61][62][63] have been proposed to transform each linguistic term into its corresponding semantic value. Let $S = \{ s_\rho \mid \rho = 0, 1, \ldots, \delta \}$ be an LTS and $v_\rho$ be the semantic value of linguistic term $s_\rho$; then the LSF is defined as a mapping function $F: s_\rho \to v_\rho$. The simplest relationship between the linguistic term and its corresponding semantic value is a directly proportional function, under which adjacent terms are separated by equal semantic distances. The exponential function, by contrast, is a monotonically increasing function for which equal increments of the independent variable produce unequal increments of the dependent variable. This is similar to human emotion when evaluating the merits of node $N_i$. When the evaluation information of node $N_i$ is closer to the best rank, human psychology is more sensitive, which means it is more difficult to recognize this evaluation. As the linguistic term approaches "very high", the semantic value difference between two adjacent linguistic terms increases. Therefore, we propose a novel exponential semantic value for PLTS-CIs to reflect human psychological behavior, as shown in Eq. (3). The relationship between the linguistic term and its corresponding semantic value is shown in Fig. 2.
Let $Q(pI)$ be a PLTS-CI; then the exponential semantic value $D(s_\rho)$ is defined as in Eq. (3). It has the following characteristics: (2) The exponential semantic value $D(s_\rho)$ is monotonically increasing in the domain.
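The qualitative difference between a proportional scale and an exponential one can be seen in a short Python sketch. The exponential form used here, (e^(ρ/δ) − 1)/(e − 1), is a hypothetical stand-in chosen only because it is monotonically increasing with growing gaps; it should not be read as the paper's Eq. (3). The linear form ρ/δ is likewise an assumed example of a directly proportional function.

```python
import math


def linear_value(rho: int, delta: int) -> float:
    """Assumed proportional scale: equal gaps between adjacent terms."""
    return rho / delta


def exponential_value(rho: int, delta: int) -> float:
    """Hypothetical exponential scale on [0, 1]; NOT the paper's Eq. (3)."""
    return (math.exp(rho / delta) - 1) / (math.e - 1)


def gaps(values: list[float]) -> list[float]:
    """Differences between the semantic values of adjacent terms."""
    return [b - a for a, b in zip(values, values[1:])]


delta = 6
lin = [linear_value(r, delta) for r in range(delta + 1)]
exp = [exponential_value(r, delta) for r in range(delta + 1)]
lin_gaps, exp_gaps = gaps(lin), gaps(exp)
```

Under this stand-in, every entry of `lin_gaps` is the same, whereas `exp_gaps` grows with the subscript, so the step from "high" to "very high" is larger than the step from "slightly high" to "high".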
The confidence degree of decision-makers is another factor affecting the primary node election. Therefore, we propose a confidence approximation value to convert the confidence interval values into real values, as shown in Eq. (4).
Let $Q(pI)$ be a PLTS-CI; then the confidence approximation value $C(Q^{(k)})$ is defined as in Eq. (4). It has the following characteristic: The value of confidence is evaluated by the Euclidean distance between the interval value $[I_k^L, I_k^R]$ and $[1, 1]$. For a linguistic term $Q^{(k)}$, if $C(Q^{(k)})$ is closer to 1, then the evaluation information of the linguistic term is more effective and the decision results are more reliable.
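A minimal Python sketch of a confidence value built on the Euclidean distance to [1, 1] is given below. The exact form, one minus the distance divided by √2 so that the result stays in [0, 1], is an assumption made for illustration; the paper's Eq. (4) may normalize differently. The function name is hypothetical.

```python
import math


def confidence_approximation(ci_low: float, ci_high: float) -> float:
    """Assumed form: 1 minus the normalized Euclidean distance to [1, 1]."""
    if not 0 <= ci_low <= ci_high <= 1:
        raise ValueError("expected 0 <= ci_low <= ci_high <= 1")
    # Dividing the squared distance by 2 keeps the result within [0, 1].
    distance = math.sqrt(((1 - ci_low) ** 2 + (1 - ci_high) ** 2) / 2)
    return 1 - distance
```

Under this assumed form, the interval [1, 1] maps to 1, the interval [0, 0] maps to 0, and an interval such as [0.8, 0.9] yields a higher value than [0.5, 0.6], in line with the reading that values closer to 1 indicate more reliable evaluations.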
In this paper, to compare several PLTS-CIs, a novel score function for PLTS-CIs, called Score-ESCA, is defined based on the exponential semantic value in Eq. (3) and the confidence approximation value in Eq. (4), as shown in Eq. (5).
Let $Q(pI)$ be a PLTS-CI. Then, the score of $Q(pI)$ is given by Eq. (5), where $s_i$ is the linguistic term corresponding to $Q^{(k)}$ in the PLTS-CI $Q(pI)$.
Next, a comparison method for PLTS-CIs based on the exponential semantic value and confidence approximation value is presented.
Let $Q_1(pI)$ and $Q_2(pI)$ be two PLTS-CIs. Then the comparison between the PLTS-CIs can be given as follows: 1.

Attribute weights calculation based on Score-ESCA
To select a primary node in the PBFT consensus mechanism, it is assumed that the decision-makers evaluate n attributes of m nodes. Thus, the finite alternative set is $N = \{N_1, N_2, \ldots, N_m\}$, and the attribute set is $A = \{A_1, A_2, \ldots, A_n\}$. The weight of each attribute is an important part of the integration of ideas, which directly affects the final decision. However, the weights of the n attributes, $w = (w_1, w_2, \ldots, w_n)$, are completely unknown. Hence, we will discuss the weight-determining method based on Score-ESCA for each attribute.
Let $Q = [Q_{ij}(pI)]_{m \times n}$ be a decision matrix based on PLTS-CIs, where each $Q_{ij}(pI)$ is a PLTS-CI. Then, the optimal weight $w = (w_1, w_2, \ldots, w_n)$ can be obtained by Eq. (6), where $s_\rho$ is the linguistic term corresponding to $Q^{(k)}$ in the PLTS-CI $Q(pI)$.

TOPSIS method based on Score-ESCA
The TOPSIS method is one of the most widely used methods in MADM; it ranks alternatives by their closeness to an ideal solution. The positive ideal solution (PIS) and negative ideal solution (NIS) among the alternatives are first obtained. Then, the distances between each alternative and the PIS and NIS are calculated to obtain the closeness coefficient of each alternative node. Finally, the alternatives are ranked and selected according to the closeness coefficient.
According to the above discussion and analysis in "The score function of the PLTS-CI", an improved TOPSIS method based on Score-ESCA for MADM is proposed to process the primary node election problem in the PBFT consensus mechanism. Its implementation process is shown in Algorithm 1.

Algorithm 1
Input: the original linguistic decision matrices containing voting information.
Output: the ranking of alternative nodes.
Step 1: using the voting information of the original linguistic decision matrices, a normalized multiple attribute decision matrix of alternative nodes is constructed.
Step 2: the attribute weight set $w = (w_1, w_2, \ldots, w_n)$ is calculated by Eq. (6).
Step 3: the PIS and NIS are obtained by applying Definitions 3.7 and 3.8.
Step 4: the distance $d(N_i, Q(pI)^+)$ between alternative node $N_i$ and the PIS is calculated by Eq. (7). The smaller the value $d(N_i, Q(pI)^+)$ is, the better the alternative node $N_i$ will be. Then, we obtain the minimum distance $d_{\min}(N_i, Q(pI)^+)$ over all alternative nodes.
Step 5: the distance $d(N_i, Q(pI)^-)$ between alternative node $N_i$ and the NIS is calculated by Eq. (9). The larger the value $d(N_i, Q(pI)^-)$ is, the better the alternative node $N_i$ will be. Then, we obtain the maximum distance $d_{\max}(N_i, Q(pI)^-)$ by Eq. (10).
Step 6: the closeness coefficient $CI(N_i)$ for all alternative nodes $N_i$ ($i = 1, 2, \ldots, m$) is calculated by Eq. (11).
Step 7: the alternative nodes $N_i$ ($i = 1, 2, \ldots, m$) are ranked by their closeness coefficients.
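The ranking stage of Algorithm 1 (Steps 3 to 7) can be illustrated with a simplified Python sketch. It rests on several assumptions: each PLTS-CI has already been reduced to a real-valued score, so the input is an m × n matrix of floats; distances are weighted Euclidean distances; and the closeness coefficient is the classic TOPSIS ratio d⁻/(d⁺ + d⁻), which may differ from the paper's Eq. (11). The function name `topsis_rank` is hypothetical.

```python
import math


def topsis_rank(scores: list[list[float]], weights: list[float]):
    """Rank alternatives (rows) by classic TOPSIS relative closeness."""
    n_attr = len(weights)
    # Step 3: the PIS takes the best score per attribute, the NIS the worst.
    pis = [max(row[j] for row in scores) for j in range(n_attr)]
    nis = [min(row[j] for row in scores) for j in range(n_attr)]

    def dist(row: list[float], ref: list[float]) -> float:
        # Steps 4 and 5: weighted Euclidean distance (an assumed form).
        return math.sqrt(sum(w * (x - r) ** 2
                             for w, x, r in zip(weights, row, ref)))

    closeness = []
    for row in scores:
        d_pos, d_neg = dist(row, pis), dist(row, nis)
        total = d_pos + d_neg
        # Step 6: classic closeness, not necessarily the paper's Eq. (11).
        closeness.append(d_neg / total if total > 0 else 0.0)

    # Step 7: indices of the alternatives, best first.
    order = sorted(range(len(scores)), key=lambda i: closeness[i], reverse=True)
    return order, closeness
```

Called with the illustrative scores `[[0.2, 0.3], [0.9, 0.8], [0.5, 0.5]]` and equal weights, the second alternative coincides with the PIS and is ranked first, while the first coincides with the NIS and is ranked last.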

Decision-making process for primary node election
In this section, we utilize the method to solve a real primary node election problem in the PBFT consensus mechanism.

Decision problem description
Any node in the blockchain can be the primary node in the PBFT consensus mechanism. If the primary node is maliciously attacked, it may cause the view to change frequently, posing a security risk to the system [47]. Different decision-makers may make different subjective judgments according to the attributes of the node. To select the most appropriate node from among multiple nodes, four typical attributes are employed: (1) $A_1$: bandwidth; (2) $A_2$: I/O; (3) $A_3$: CPU; and (4) $A_4$: memory [64][65][66]. Suppose that the attribute weights are completely unknown. The voting nodes use the LTS $S = \{s_0, s_1, s_2, s_3, s_4, s_5, s_6\}$ to describe the above four attributes. For attribute $A_1$, the linguistic terms are $s_0$ "very small", $s_1$ "small", $s_2$ "slightly small", $s_3$ "medium", $s_4$ "slightly large", $s_5$ "large", and $s_6$ "very large". For the attributes ($A_2$, $A_3$, $A_4$), the linguistic terms are $s_0$ "very low", $s_1$ "low", $s_2$ "slightly low", $s_3$ "medium", $s_4$ "slightly high", $s_5$ "high", and $s_6$ "very high". In particular, an expert who gives no opinion is counted as casting a "medium" vote. Furthermore, there are nine alternative nodes to be considered as the primary node at present. To the best of our knowledge, evaluation information in most of the literature comes from the decision-making of invited experts [53-55, 57, 67, 68]. Therefore, before implementing the PBFT consensus mechanism, seven experts are invited to evaluate the four qualitative criteria of the nine nodes. It is assumed that the opinions given by the experts are trustworthy and authoritative. The original linguistic decision matrices are shown in Tables 2, 3, 4, 5, 6, 7 and 8.

Decision-making process
In this subsection, the novel TOPSIS method based on Score-ESCA is used for primary node election in the PBFT consensus mechanism as follows.
Output: the ranking of the nine alternative nodes.
Step 1: using the original linguistic decision matrices in Tables 2, 3, 4, 5, 6, 7 and 8, a normalized decision matrix is constructed, as shown in Table 9.
Step 2: the weights of the four attributes $A_1$, $A_2$, $A_3$, and $A_4$ are calculated by Eq. (6).
Step 3: the PIS $Q(pI)^+$ and the NIS $Q(pI)^-$ are computed from Table 9.
Step 4: the distances between the alternative nodes $N_i$ ($i = 1, 2, \ldots, 9$) and the PIS are computed by Eq. (7).
Step 5: the distances between the alternative nodes $N_i$ ($i = 1, 2, \ldots, 9$) and the NIS are computed by Eq. (9). Then, by Eq. (10), we obtain $d_{\max}(N_i, Q(pI)^-) = 0.0486$.
Step 6: the closeness coefficients $CI(N_i)$ for the nine alternative nodes $N_i$ are calculated by Eq. (11).
Step 7: the nine alternative nodes are ranked by their closeness coefficients. Through the improved PBFT consensus mechanism we proposed, $N_3$ is at the top and should be selected as the primary node.

Comparative analysis
In this section, we compare the proposed node election method with the methods in [30,47,49,53] to demonstrate its effectiveness. Each method is applied to the same decision data as in Table 9 to obtain a ranking of the alternative nodes, and the resulting rankings are then compared and analyzed.

Comparison with the classical decision method
The classical primary node election method in the PBFT consensus mechanism considers only the number of votes in support [30]. That is, only votes that recognize the attribute of the node as "high" or "large" are counted. To compare the classical PBFT consensus mechanism with our improved PBFT consensus mechanism based on PLTSs, we take the decision data in Table 9 and count the "for", "abstention" and "against" votes. The statistics are presented in Table 10 with reference to the decision matrix. In the classical PBFT consensus mechanism, only the number of "for" votes is considered in selecting the primary node. From the resulting ranking of the nine alternative nodes, we easily find that $N_3$ is the top node and that many nodes have the same number of votes during the decision-making process.

Comparison with the election strategy
Li et al. [47] proposed a voting-based primary node election strategy in the PBFT consensus mechanism. In the voting period, the primary node is selected from among the alternative nodes. All experts express their choice regarding the alternative nodes as support, abstention, or opposition. The votes are denoted by $v_{ji} \in \{1, 0, -1\}$, where j indexes an expert casting a valid vote. $v_{ji} = 1$ means a supportive vote from expert j for node i, $v_{ji} = -1$ means an opposing vote, and $v_{ji} = 0$ means abstention.
Considering that Li et al. set up only three voting options, to use the decision data in Table 9 in the primary node election strategy, votes for "very high", "high" and "slightly high" are regarded as "for". We treat "medium" votes as "abstention". Votes for "very low", "low" and "slightly low" are considered "against". The voting results under the election strategy proposed by Li et al. are shown in Table 11. According to this voting strategy, the votes received by each node are added; that is, a supportive vote is recorded as 1 point, an abstaining vote is recorded as 0 points, and an opposing vote is recorded as − 1 point. Finally, the nine nodes obtain their respective voting results. From the resulting ranking of the nine alternative nodes, we easily find that $N_7$ is the top node and that some nodes have the same number of votes during the decision-making process.
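The conversion from seven linguistic terms to the three voting options, followed by the point tally, can be sketched in Python as follows. The function names are hypothetical, and the two vote lists are made-up illustrations rather than data from Table 9.

```python
def to_ternary(subscript: int, middle: int = 3) -> int:
    """Map a vote s_0..s_6 to for (+1), abstention (0) or against (-1)."""
    if subscript > middle:
        return 1
    if subscript < middle:
        return -1
    return 0


def tally(votes: list[int]) -> int:
    """Sum the ternary votes that one node received."""
    return sum(to_ternary(v) for v in votes)


# Hypothetical votes from seven experts for two nodes (not from Table 9).
node_a = [6, 5, 3, 1, 4, 3, 2]
node_b = [4, 4, 3, 2, 5, 3, 0]
```

Both hypothetical nodes end up with a tally of 1 even though node_a received stronger support ("very high", "high") than node_b, which illustrates how collapsing seven terms into three options can leave nodes tied.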

Comparison with the voting method based on vague set
Xu et al. [49] improved the DPoS consensus mechanism in blockchain based on a vague set. Considering the selection of agent nodes in the DPoS consensus mechanism as a decision process, Xu et al. added opposing and abstaining votes so that the election more closely resembles voting in the real world.
Similar to the above, to substitute the decision data in Table 9 into the agent node election method, votes for "very high", "high" and "slightly high" are regarded as "for". We treat "medium" votes as "abstention". Votes for "very low", "low" and "slightly low" are considered "against".
Thus, according to the selection method of agent nodes proposed by Xu et al. [49], the statistics of the nine nodes are shown in Table 12. Specifically, $[t_A(N_i), 1 - f_A(N_i)]$ is the vague value of a node $N_i$. Here, $t_A(N_i)$ denotes the proportion of favorable votes, and $f_A(N_i)$ denotes the proportion of opposing votes. The fuzzy value of the node is then calculated from its vague value according to the formula given by Xu et al. [49]. The corresponding vague value and fuzzy value are thus calculated for each node. From the resulting ranking of the nine alternative nodes, we easily find that $N_7$ is the top node.
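The vague value of a node can be computed from its vote counts as in the following Python sketch. The function name and the vote counts are hypothetical, and the sketch stops at the vague value because the formula for the fuzzy value is not reproduced in this section.

```python
def vague_value(votes_for: int, votes_abstain: int,
                votes_against: int) -> tuple[float, float]:
    """Return [t_A, 1 - f_A] from the vote counts of one node."""
    total = votes_for + votes_abstain + votes_against
    if total == 0:
        raise ValueError("no votes cast")
    t = votes_for / total       # proportion of favorable votes
    f = votes_against / total   # proportion of opposing votes
    return t, 1 - f


# Hypothetical counts: 4 for, 1 abstention, 2 against.
low, high = vague_value(4, 1, 2)
```

The width of the interval, `high - low`, equals the proportion of abstentions, so a node with many abstaining votes has a wider and therefore less decisive vague value.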

Comparison with the extended TOPSIS method
We next use the extended TOPSIS method proposed by Pang et al. [53] to solve our decision problem. Note that the evaluation confidence information of the decision-maker is not considered.
Step 1. The weight values of the four attributes are calculated.
Step 3. The distances between the alternative nodes $N_i$ ($i = 1, 2, \ldots, 9$) and the PIS are calculated.
Step 4. The distances between the alternative nodes $N_i$ ($i = 1, 2, \ldots, 9$) and the NIS are calculated.
Step 5. The closeness coefficient $CI(N_i)$ for the nine alternative nodes $N_i$ ($i = 1, 2, \ldots, 9$) is calculated.
Step 6. The ranking result of the nine alternative nodes is obtained.
As shown in Fig. 3, the storage capacity is of greatest concern for the nodes when selecting a primary node in the PBFT consensus mechanism, and the I/O is least important. The ordering of the attribute weights obtained by the proposed method is the same as that obtained by the method of Pang et al. [53]. This validates the effectiveness of the proposed improved TOPSIS method. In addition, since the decision confidence of the decision-maker is considered during the decision-making process and is expressed in the form of the confidence interval, the attribute weights calculated by the proposed TOPSIS method are more clearly separated, which supports the accuracy of the proposed TOPSIS method.

Analysis and experimental discussion
According to the above analysis, a summary of the ranking results obtained by the five methods is shown in Table 13. Table 13 shows that our proposed method selects N3 as the most suitable primary node, since it takes into account both the number of term sets [49,50] used to express the voting options and the attribute weights [53,54]. N3 is also ranked first by the classical method [30] and is ranked second by the methods of Li et al. [47] and Xu et al. [49]. In the decision method proposed by Pang et al. [53], N3 is ranked third. The specific reasons are analyzed below.
The number of voting options greatly influences the ranking results of the primary node election. The classical method [30], Li et al. [47], and Xu et al. [49] consider at most three fuzzy voting attitudes. In the classical PBFT consensus mechanism [30], the node with the highest number of supporting votes is selected as the primary node. The decision methods proposed by Li et al. [47] and Xu et al. [49] consider only three decision options, namely, favoring, abstaining, and opposing. Compared with the classical primary node selection method, their methods extend a single voting option to multiple voting options, but they remain limited.
The different attribute values of nodes also affect the ranking results. The classical method [30], Li et al. [47], and Xu et al. [49] have the drawback that they cannot consider the weight ratio of the node attributes. As Table 13 shows, the classical method [30] and Li et al. [47] cannot distinguish some nodes, which receive the same rank. Therefore, the primary node selected by these methods is not accurate.
In addition, Pang et al. [53] consider node attributes and seven voting attitudes. Their method can therefore perform a multidimensional selection of the optimal primary node, and its ranking result is more accurate than those of the classical method [30], Li et al. [47], and Xu et al. [49]. The distribution of the linguistic terms also influences the ranking results of the TOPSIS method. However, in the method proposed by Pang et al. [53], the semantic value differences between adjacent linguistic terms are all equal, which cannot reflect the psychological preferences of decision-makers and lowers the accuracy and credibility of the decision results.
Our proposed method uses the exponential semantic value of the linguistic terms and the confidence interval to consider both various decision options and node attributes. Since the existing linguistic terms are uniformly distributed, we propose modifying the semantic value difference between adjacent linguistic terms to reflect the difference in the decision-making attitudes of nodes. In addition, the concept of the confidence interval is introduced to describe the evaluation information of decision-makers more accurately. Therefore, compared with other primary node election methods [30,47,49,53], our method is more reasonable.
A comparison of the characteristics of the six methods with those of our proposed method is shown in Table 14. Our proposed method can overcome the weaknesses of the classical method and those of Li et al. [47], Xu et al. [49], and Liu et al. [50]. Relative to these previous methods, our proposed method expresses the decision-makers' voting information adequately and comprehensively by increasing the number of voting options and attributes. Compared to the method proposed by Pang et al. [53], the main characteristics of our proposed method are that it not only flexibly expresses different distributions of the linguistic terms but also considers the psychological behaviors of decision-makers.
To verify the proposed method, a simulation experiment platform that provides the specific conditions for applying it is needed. Hyperledger Fabric is a blockchain development platform that is supported by pluggable modules [69]. It is used to implement our improved PBFT consensus mechanism. Hyperledger Caliper, an open-source performance benchmark framework, performs performance tests on blockchain-based solutions [70]. The specific experimental environment is shown in Table 15.
First, the consensus reaching rate of our primary node election is verified. As can be seen from Table 13, each of the nodes N_i (i = 1, 2, …, 9) has a probability of being selected as the primary node. Therefore, the nine nodes are set as the primary node in turn in the PBFT consensus mechanism for experimental verification. Whether consensus is reached is determined by testing whether the blocks can be generated. We vary the number of transactions from 10 to 500 and use Hyperledger Caliper for the performance evaluation. The experimental verification results are shown in Fig. 4.
Fig. 4 shows that node N3 has the highest probability of generating blocks among the nine nodes. The main reason is that malicious nodes deliberately filter transactions in the PBFT consensus mechanism when packaging transactions and storing them in blocks [44,71]. Since only the choice of primary node differs between runs in our experiment, a block cannot be generated successfully if the primary node behaves maliciously. This means that the success rate of reaching consensus is highest when N3 is chosen as the primary node in our improved PBFT consensus mechanism.

Rankings of the methods: classical method [30], Li et al.'s method [47], Xu et al.'s method [49], Pang et al.'s method [53], and our proposed method.
Then the improved Byzantine fault tolerance rate is evaluated. According to the classical PBFT consensus mechanism [51], to ensure that all nodes in the network can reach consensus, the number of nodes n must satisfy n ≥ 3f + 1, where f is the number of Byzantine nodes. To tolerate a maximum of 4 Byzantine nodes, a total of 13 nodes are set up in our experiment. The improved PBFT consensus mechanism in this paper is then verified experimentally for the cases in which the number of Byzantine nodes ranges from 0 to 6.
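The bound n ≥ 3f + 1 quoted above can be checked with a few lines of Python. This is only an arithmetic sketch of the condition. The function names `max_byzantine_nodes` and `can_reach_consensus` are hypothetical and are not part of Hyperledger Fabric or Caliper.

```python
def max_byzantine_nodes(n: int) -> int:
    """Largest f that satisfies n >= 3f + 1 for a network of n nodes."""
    if n < 1:
        raise ValueError("the network needs at least one node")
    return (n - 1) // 3


def can_reach_consensus(n: int, f: int) -> bool:
    """True if n nodes can tolerate f Byzantine nodes under n >= 3f + 1."""
    return n >= 3 * f + 1


print(max_byzantine_nodes(13))  # 4

# Walk through the Byzantine node counts considered in the experiment.
for f in range(0, 7):
    print(f, can_reach_consensus(13, f))
```

With 13 nodes the condition holds for f = 0 to 4 and fails for f = 5 and f = 6, which is the threshold the throughput experiment is expected to expose.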
Transactions per second (TPS) refers to the ratio of the total number of transactions that are ultimately successfully stored in the blockchain to the elapsed time, and it is an important index for measuring the concurrency capability of the system. The higher the throughput, the more efficient the consensus mechanism is and the more transactions it can process.
It can be seen from Fig. 5 that when the number of Byzantine nodes is f ≤ 4, all nodes in the network can reach a consensus. The consensus performance of the blockchain system is reduced once Byzantine nodes exist in the PBFT consensus mechanism. The experiments also show that throughput drops slightly as the total number of Byzantine nodes increases. In addition, when the number of Byzantine nodes is f ≥ 5, the throughput is 0, indicating that the blockchain system cannot reach a consensus at this point. Therefore, our improved PBFT consensus mechanism meets the fault tolerance condition n ≥ 3f + 1.
The experiment also shows that, for the same number of Byzantine nodes, the throughput of the blockchain system is higher with N3 as the primary node. That is, node N3 processes transactions faster than the other nodes and is more efficient in the consensus reaching process. Since the throughput of the blockchain system is closely related to the performance indicators of the server, including CPU, memory, I/O, etc. [69,70,72], the performance of node N3 is the best, which is consistent with our election results. Therefore, our proposed method is better and more efficient than the other methods for addressing the problem of primary node election in the PBFT consensus mechanism.
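The TPS definition used in this evaluation amounts to a simple ratio, sketched below in Python. The function name `throughput_tps` and the numbers are hypothetical. They are not measurements from Fig. 5, and in practice Hyperledger Caliper reports throughput itself.

```python
def throughput_tps(committed_tx: int, elapsed_seconds: float) -> float:
    """Transactions successfully stored on chain divided by elapsed time."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return committed_tx / elapsed_seconds


# Hypothetical run: 500 transactions committed in 4 seconds.
print(throughput_tps(500, 4.0))  # 125.0

# If consensus is never reached, no transaction is committed and TPS is 0.
print(throughput_tps(0, 4.0))  # 0.0
```

Only committed transactions count in the numerator, so a run in which the primary node blocks consensus yields zero throughput regardless of how many transactions were submitted.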
Moreover, applying the proposed primary node election method to other scenarios is theoretically feasible, for example the replica placement algorithm in the Spark environment. The challenge of the replica placement algorithm is to select an appropriate node from multiple nodes to store replica data, which can effectively improve the performance of data reading in Spark. Node selection is a comprehensive evaluation process, as many performance attributes of the nodes must be considered. Therefore, similar to the primary node election in the PBFT consensus mechanism, node selection in the replica placement algorithm can be formalized as a decision-making process. Specifically, assume that one of nine nodes in Spark will be selected to store replica data. We first define five attributes to evaluate the performance of the nine nodes: bandwidth, I/O, CPU, memory, and disk size. Adopting the TOPSIS method based on the Score-ESCA function proposed in this paper, multiple experts are invited to express their voting attitudes toward the nodes with respect to the different attributes using an LTS. Finally, the comprehensive score values of the nodes are calculated, and the node with the highest score value is selected in the replica placement algorithm. From the theoretical perspective, the proposed primary node election method is thus applicable to other scenarios.

Conclusions
The PLTS is an important means of solving complex decision problems in a fuzzy linguistic environment. In this paper, we formulate primary node selection in the PBFT consensus mechanism as an MADM problem by establishing a decision analysis process. To address the MADM problem of primary node selection, we introduce the definition of a PLTS-CI to obtain the group decision information. Subsequently, using the novel exponential semantic value and confidence approximation value, we propose a novel score function, called Score-ESCA. It considers more information about human psychological sensitivity than existing score functions. An improved TOPSIS method based on the novel Score-ESCA function is proposed and applied to solve the problem of primary node election in the PBFT consensus mechanism. Finally, the proposed method is compared with previous node selection methods to verify its effectiveness. In future research, we will focus on achieving greater applicability for the proposed method when the subscript-symmetric LTS is used.