1 Introduction

Data aggregation from multiple sources has become increasingly prevalent in applications such as sensor fusion [8] and crowdsourcing [20]. In such aggregation contexts, the fuzzy integral (FI), which is specified with respect to a fuzzy measure (FM) [9], is often used to capture the importance of information arising from different combinations of sources. Generally, FMs are defined by experts or generated through algorithms, such as the Sugeno \(\lambda \)-measure [19] and the Decomposable measure [7], which leverage the ‘worth’ of the singletons (individual sources), a.k.a. the densities. Another approach to generating FMs is optimization, where an FM is tuned with respect to the behaviour of an aggregation function such as the FI and training data [2, 4]. If training data or information on the densities is limited or missing, specifying an FM is a challenging task, even though such a situation arises often, for example in aggregating crowdsourced data. To deal with this, Wagner and Anderson [21] first extracted FMs directly from the input data (the evidence) by analyzing and extracting key properties such as agreement and specificity. Later, Havens et al. [11, 12] introduced more data-driven FMs, which refined the established agreement FM, in particular by leveraging a generic similarity measure (SM) to extract the property of ‘agreement’ amongst evidence from combinations of sources. This paper focuses on a recently introduced SM, the bidirectional subsethood based SM [14, 15], which has been shown to address a number of limitations of common existing SMs such as Jaccard [13] and Dice [6], and explores the impact of its use in conjunction with agreement-based FMs.

So far, three agreement-based FMs have been proposed: the FM of Agreement (AG) [21], the FM of Generalized Accord (GenA) [12], and the Additive Measure of Agreement (AA) [11]. The AG FM captures source agreement using the intersection operation, which considers only the overlap amongst multi-source data without tracking changes in their cardinality/size. This limitation of the intersection operation causes the AG FM to generate the same agreement, and thus worth, for very different subsets of sources. Figure 1 shows such a situation for interval-valued data with the AG FM. On the other hand, using the Jaccard or Dice SM with the GenA and AA FMs to estimate source agreement makes the resulting FM susceptible to the limitations of these measures, in particular aliasing, i.e., returning the same similarity for very different sets of intervals [14, 15]. Figure 2 presents such a case, where the GenA and AA FMs produce identical agreement values, and thus worth, for different sets.

Fig. 1. Example highlighting the behaviour of the AG FM [21], where \(\overline{U}_2\) and \(\overline{U}_3\) capture the union of the intersections of all two-source and three-source combinations as per (8).

Fig. 2. Two different interval-valued sets \(\overline{h}=[\overline{h}_1,\overline{h}_2]\) and \(\overline{r}=[\overline{r}_1,\overline{r}_2]\) with equal Jaccard similarity of 0.33 and Dice similarity of 0.50 respectively. Clearly, the intervals within \(\overline{h}\) and \(\overline{r}\) do not appear to be in equal agreement with each other.

Given this context, this paper focuses on developing a new instance of an agreement FM to avoid the limitations of the existing ones. The proposed FM leverages the bidirectional subsethood based SM [14, 15] to minimize aliasing in the inter-source agreement and worth calculation. The proposed FM is designed following the concept of the GenA FM [12], and considers both cases where sources are overlapping (some agreement) or non-overlapping (no agreement). When sources are non-overlapping, the proposed FM in combination with the FI gracefully degrades to an average operator, whereas existing agreement FMs are not designed to deal with such cases. Beyond developing this FM, this paper also demonstrates its behaviour against the existing agreement FMs in aggregating interval datasets when used in combination with the Choquet FI (CFI) [5].

The paper is structured as follows: Sect. 2 reviews FMs and FIs along with a brief discussion of subsethood and the bidirectional subsethood based SM [14, 15]. Section 3 discusses existing agreement FMs. Section 4 develops a new instance of the agreement-based FM exploiting the bidirectional subsethood based SM. Section 5 demonstrates the behaviour of the proposed FM against the existing agreement FMs in aggregating interval-valued datasets when used with an FI for both synthetic and real-world datasets. Finally, Sect. 6 concludes the paper with suggestions and future work (Table 1).

Table 1. Acronyms and notation

2 Background

This section initially reviews FMs and FIs and then provides a short discussion on subsethood and the new bidirectional subsethood based SM [14, 15].

2.1 Fuzzy Measures

FMs are hierarchical weighting structures (lattices) that capture the worth of all subsets in a set of sources, including that of the singletons, also referred to as the densities. Mathematically, an FM g defined on a finite set of sources \(X=\{x_{1},...,x_{n}\}\) is a function \(g:2^X\rightarrow [0,1]\) satisfying the properties [9]:

  • (P1) \(g(\emptyset )=0\) and \(g(X)=1\) (Boundedness)

  • (P2) If \(a \subseteq b \subseteq X\) then \(g(a) \le g(b) \le g(X)\) (Monotonicity)

Here, g(a) is the worth of a subset a of X. Property (P1) states that the worth of the empty set (\(\emptyset \)) is 0 and the worth of the universal set (X) is 1. We note that the worth of the universal set is not always required to be 1, but this convention is adopted here. Property (P2) expresses the monotonicity of g, stating that if a is a subset of b (\(a\subseteq b\)), the worth of a is smaller than or equal to the worth of b. There is a third property for continuous FMs, which is not applicable to discrete FMs as used in this paper and in most practical applications.
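As an illustration of (P1) and (P2), the following minimal Python sketch checks both properties for a discrete FM stored as a dictionary from subsets to worths. The dictionary representation, the function names, and the two example measures are assumptions made for this sketch and are not part of the original formulation.

```python
from itertools import combinations

def powerset(sources):
    """Yield every subset of `sources` as a frozenset."""
    items = list(sources)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

def is_fuzzy_measure(g, sources, tol=1e-12):
    """Check (P1) boundedness and (P2) monotonicity for g: frozenset -> float."""
    full = frozenset(sources)
    # (P1): g(empty set) = 0 and g(X) = 1
    if abs(g[frozenset()]) > tol or abs(g[full] - 1.0) > tol:
        return False
    # (P2): adding a single source must never decrease the worth;
    # by transitivity this covers every pair a, b with a contained in b.
    for a in powerset(sources):
        for x in full - a:
            if g[a] > g[a | {x}] + tol:
                return False
    return True

sources = ["x1", "x2", "x3"]
# Hypothetical additive measure: every source is worth 1/3.
additive = {a: len(a) / 3 for a in powerset(sources)}
# Same measure with one entry lowered so that monotonicity fails.
broken = dict(additive)
broken[frozenset({"x1", "x2"})] = 0.1  # smaller than g({x1}) = 1/3

print(is_fuzzy_measure(additive, sources))  # True
print(is_fuzzy_measure(broken, sources))    # False
```

Checking only single-source extensions keeps the test at \(n2^n\) comparisons rather than comparing all pairs of subsets.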

In practice, FMs are defined in various ways: they may be specified by experts, or derived by algorithms or optimization based on existing data and in conjunction with an aggregation function such as the FI; for more details, please see [11, 22]. This paper focuses only on algorithmically derived FMs leveraging the evidence data arising from multiple sources. Section 3 reviews such FMs that are derived from the concept of source agreement.

2.2 Fuzzy Integrals

FIs have been efficiently used as powerful non-linear aggregation operators in evidence fusion [3, 9]. They aggregate multi-source data (evidence) by combining it with the worth information of all subsets of sources (captured by an FM). Two well-known FIs are the Sugeno FI (SFI) [19] and the Choquet FI (CFI) [5]. In practice, discrete SFI and CFI are commonly used [17] and in this paper, we focus on the discrete CFI as it is most popular for evidence aggregation.

Let \(h:X\rightarrow [0,\infty )\) be a real-valued function that represents the evidence from each source. The discrete CFI is defined as

$$\begin{aligned} \int _{CFI} h \circ g = CFI_g(h)=\sum _{i=1}^n h(x_{\pi (i)})[g(A_i )- g(A_{i-1})], \end{aligned}$$
(1)

where \(\pi \) is a permutation of X arranged like \(h(x_{\pi (1)})\ge h(x_{\pi (2)})\ge \) ... \(\ge h(x_{\pi (n)})\). \(A_i=\{x_{\pi (1)}, x_{\pi (2)},..., x_{\pi (i)}\}\) is a subset of sources. g is the FM where \(g(A_i)\) is the worth of the subset \(A_i\) with \(g(A_0)=0\).
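The computation in (1) can be sketched in a few lines of Python. This is an illustrative sketch only: the dictionary-based representation of h and g and the two-source example values are assumptions, not data from this paper.

```python
def choquet_integral(h, g):
    """Discrete Choquet integral following Eq. (1).

    h: dict mapping each source to its (non-negative) evidence value.
    g: dict mapping frozensets of sources to their worth.
    """
    # Permutation pi: sort sources so that h(x_pi(1)) >= ... >= h(x_pi(n)).
    order = sorted(h, key=h.get, reverse=True)
    total = 0.0
    prev_worth = 0.0           # g(A_0) = 0
    subset = frozenset()       # A_0 is the empty set
    for x in order:
        subset = subset | {x}  # A_i = A_{i-1} plus the next source
        worth = g[subset]
        total += h[x] * (worth - prev_worth)
        prev_worth = worth
    return total

# Hypothetical two-source example.
h = {"a": 0.9, "b": 0.3}
g = {
    frozenset(): 0.0,
    frozenset({"a"}): 0.6,
    frozenset({"b"}): 0.5,
    frozenset({"a", "b"}): 1.0,
}
result = choquet_integral(h, g)
print(round(result, 4))  # 0.9 * 0.6 + 0.3 * (1.0 - 0.6) = 0.66
```

Only the n nested subsets \(A_1,...,A_n\) along the sorted order are looked up, so a single evaluation never touches the rest of the lattice.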

In most cases, the multi-source data h is provided in a numeric form. However, in some applications h is better represented by interval-valued or fuzzy set-valued data. Considering this, FIs have been generalized for non-numeric evidence [1, 10, 16]. Let \(\overline{h}:X\rightarrow I(\mathbb {R})\) be a set of interval-valued data where \(I(\mathbb {R})\) is the set of all closed intervals over the real numbers and \(\overline{h}_i=\overline{h}(x_i)=[h_i^-,h_i^+]\) be the ith interval (where \(h_i^-\) and \(h_i^+\) are the left and right endpoints respectively). Following the notation in [12], the CFI on \(\overline{h}\) is defined as

$$\begin{aligned} \int _{CFI} \overline{h} \circ g=CFI_g(\overline{h})=[CFI_g(h^- ),CFI_g(h^+)], \end{aligned}$$
(2)

where the output \(CFI_g(\overline{h})\) is itself interval-valued [7]. In other words, the CFI for interval-valued data is computed by applying the CFI for the numeric case of the left and right interval endpoints separately. Please see [11, 12, 21] for more detail about the interval aggregation using the FM and the CFI.
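A minimal sketch of (2) follows, assuming intervals are stored as (left, right) tuples. The helper names and the small example FM are hypothetical. Note that the left and right endpoints are sorted independently, so the two calls may traverse different chains of subsets.

```python
def choquet_integral(h, g):
    """Discrete Choquet integral of Eq. (1) for a dict h and an FM dict g."""
    order = sorted(h, key=h.get, reverse=True)
    total, prev_worth, subset = 0.0, 0.0, frozenset()
    for x in order:
        subset = subset | {x}
        total += h[x] * (g[subset] - prev_worth)
        prev_worth = g[subset]
    return total

def interval_choquet(h_bar, g):
    """Interval-valued Choquet integral of Eq. (2): endpoints are fused separately."""
    left = {x: iv[0] for x, iv in h_bar.items()}
    right = {x: iv[1] for x, iv in h_bar.items()}
    return (choquet_integral(left, g), choquet_integral(right, g))

# Hypothetical example: two sources with nested intervals.
h_bar = {"a": (1.0, 4.0), "b": (2.0, 3.0)}
g = {
    frozenset(): 0.0,
    frozenset({"a"}): 0.3,
    frozenset({"b"}): 0.5,
    frozenset({"a", "b"}): 1.0,
}
lo, hi = interval_choquet(h_bar, g)
# Left endpoints are ordered b, a; right endpoints are ordered a, b.
print(round(lo, 4), round(hi, 4))  # 1.5 3.3
```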

2.3 Subsethood

The subsethood between two sets a and b is a relation, indicating the degree to which a is a subset of b [18]. It is defined as

$$\begin{aligned} S_h\left( a,b\right) =\frac{\left| a\cap b\right| }{\left| a\right| }, \end{aligned}$$
(3)

where \(\left| a\cap b\right| \) is the cardinality of the intersection of a and b, and \(\left| a\right| \) is the cardinality of a. It is always bounded on the interval [0, 1], where 1 means that a is a subset of b (\(a\subseteq b\)) and 0 means that a and b are disjoint (\(a\cap b=\emptyset \)).

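Similarly, the degree of subsethood of two intervals \(\overline{a}\) and \(\overline{b}\) can be defined as

$$\begin{aligned} S_h\left( \overline{a},\overline{b}\right) =\frac{\left| \overline{a}\cap \overline{b}\right| }{\left| \overline{a}\right| }, \end{aligned}$$
(4)

where \(\left| \overline{a}\cap \overline{b}\right| \) is the size of the intersection between \(\overline{a}\) and \(\overline{b}\) and \(\left| \overline{a}\right| \ne 0\).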
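The interval subsethood can be sketched as follows, taking the size of a closed interval to be its width. The tuple representation and function names are assumptions made for illustration.

```python
def width(iv):
    """Size of a closed interval given as a (left, right) tuple."""
    return iv[1] - iv[0]

def overlap(a, b):
    """Size of the intersection of two closed intervals (0 if they do not overlap)."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))

def subsethood(a, b):
    """Degree to which interval a is a subset of interval b."""
    if width(a) == 0:
        raise ValueError("subsethood is undefined when interval a has zero width")
    return overlap(a, b) / width(a)

print(subsethood((2, 3), (1, 4)))  # 1.0, since (2, 3) lies inside (1, 4)
print(subsethood((1, 4), (2, 3)))  # one third of (1, 4) is covered by (2, 3)
print(subsethood((0, 1), (2, 3)))  # 0.0, the intervals are disjoint
```

The first two calls show that the relation is not symmetric, which is what motivates combining both directions in the similarity measure of Sect. 2.4.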

2.4 Bidirectional Subsethood Based Similarity Measure

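A new SM was introduced in [14, 15] which uses the reciprocal subsethoods of intervals to capture their similarity. For two intervals \(\overline{a}\) and \(\overline{b}\), this measure is

$$\begin{aligned} S_{S_h}\left( \overline{a},\overline{b}\right) =\bigstar \left( S_h(\overline{a},\overline{b}),S_h(\overline{b},\overline{a})\right) , \end{aligned}$$
(5)

where \(\bigstar \) is a t-norm. We can rewrite (5) using the definition of \(S_h\) at (4) as

$$\begin{aligned} S_{S_h}\left( \overline{a},\overline{b}\right) =\bigstar \left( \frac{\left| \overline{a}\cap \overline{b}\right| }{\left| \overline{a}\right| },\frac{\left| \overline{a}\cap \overline{b}\right| }{\left| \overline{b}\right| }\right) . \end{aligned}$$
(6)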
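To make the contrast with the Jaccard SM concrete, the sketch below evaluates both measures on two hypothetical pairs of intervals. The pairs are chosen for illustration and are not taken from the figures of this paper. The minimum t-norm is assumed, and the Jaccard SM is taken as the intersection size divided by the union size.

```python
def width(iv):
    return iv[1] - iv[0]

def overlap(a, b):
    """Size of the intersection of two closed intervals (0 if they do not overlap)."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))

def jaccard(a, b):
    """Jaccard similarity of two intervals: |a ∩ b| / |a ∪ b|."""
    inter = overlap(a, b)
    return inter / (width(a) + width(b) - inter)

def bidirectional_similarity(a, b, t_norm=min):
    """t-norm of the two reciprocal subsethoods of a and b."""
    inter = overlap(a, b)
    return t_norm(inter / width(a), inter / width(b))

pair_a = ((0, 1), (0, 3))  # a narrow interval nested in a wide one
pair_b = ((0, 2), (1, 3))  # two equally wide, partially overlapping intervals

for name, (a, b) in (("pair_a", pair_a), ("pair_b", pair_b)):
    print(name, round(jaccard(a, b), 3), round(bidirectional_similarity(a, b), 3))
# pair_a 0.333 0.333
# pair_b 0.333 0.5
```

Both pairs receive the same Jaccard similarity, whereas the bidirectional subsethood based SM separates them.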

3 Existing Agreement Fuzzy Measures

Here, we briefly recapture the AG [21], GenA [12], and AA [11] FMs with respect to a set of intervals \(\overline{h}=\{\overline{h}_1,\overline{h}_2,...,\overline{h}_n\}\) arising from n individual sources.

3.1 Fuzzy Measure of Agreement

Wagner and Anderson [21] proposed the AG FM by extracting it from the interval-valued data with no prior knowledge about sources. The AG FM is defined as

$$\begin{aligned} \tilde{g}^{AG}(\overline{A}_0)=0, \qquad \tilde{g}^{AG}(\overline{A}_i)=\sum _{K=1}^{i}\left| \overline{U}_K(\overline{A}_i)\right| z_K, \quad 1\le i\le n, \end{aligned}$$
(7)

where \(\overline{A}_i=\{\overline{h}_{\pi (1)},\overline{h}_{\pi (2)},...,\overline{h}_{\pi (i)}\}\) is the permuted set of intervals with \(\overline{A}_0=\emptyset \), \(z_i=\frac{i}{n}\) and |.| refers to the cardinality/size of the interval. Here, \(\overline{U}_K(\overline{A}_i)\) unites the intersections of the K-tuples in \(\overline{A}_i \subseteq \overline{h}\) as defined in (8) [11, 12].

$$\begin{aligned}&\overline{U}_K(\overline{A}_i)\quad = \bigcup _{k_1=1}^{i-K+1}\bigcup _{k_2=k_1+1}^{i-K+2}...\bigcup _{k_K=k_{K-1}+1}^{i}(\overline{h}_{\pi (k_1)} \cap \overline{h}_{\pi (k_2)}\cap ...\cap \overline{h}_{\pi (k_K)}) \end{aligned}$$
(8)

Further, the \(\tilde{g}^{AG}(\overline{A}_i)\) is normalized by \(\tilde{g}^{AG}(\overline{h})\) to satisfy the property of the FM, i.e., \(g^{AG}(\overline{A}_i) = \frac{\tilde{g}^{AG}(\overline{A}_i)}{\tilde{g}^{AG}(\overline{h})}.\)
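The set \(\overline{U}_K(\overline{A}_i)\) in (8) can be computed by brute force over all K-tuples, as in the following sketch. The function names and the three example intervals are hypothetical. The sketch covers only (8) and the size of the resulting union, not the full AG FM.

```python
from itertools import combinations

def union_of_k_intersections(intervals, K):
    """Eq. (8): union of the intersections of all K-tuples, as disjoint pieces."""
    pieces = []
    for combo in combinations(intervals, K):
        lo = max(iv[0] for iv in combo)
        hi = min(iv[1] for iv in combo)
        if lo <= hi:                      # the K intervals share a common part
            pieces.append((lo, hi))
    merged = []
    for lo, hi in sorted(pieces):         # merge overlapping or touching pieces
        if merged and lo <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged

def size(pieces):
    """Total length of a list of disjoint intervals."""
    return sum(hi - lo for lo, hi in pieces)

A = [(0, 4), (2, 6), (5, 8)]
for K in (1, 2, 3):
    U = union_of_k_intersections(A, K)
    print(K, U, size(U))
# 1 [(0, 8)] 8
# 2 [(2, 4), (5, 6)] 3
# 3 [] 0
```

Because only the size of the overlap enters, two quite different sets of intervals can yield the same values of \(|\overline{U}_K|\), which is the limitation of the AG FM discussed in Sect. 1.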

3.2 Additive Measure of Agreement

Havens et al. [11] proposed the AA FM in order to alleviate the asymmetry issue of agreement FMs. This FM uses an SM to determine the source agreement. The AA FM is expressed in (9).

$$\begin{aligned} \tilde{g}^{AA}(\overline{A}_i) = \tilde{g}^{AA}(\overline{A}_{i-1}) + \sum _{\begin{array}{c} j=1 \\ j\ne i \end{array}} ^{n}S^p(\overline{h}_j,\overline{h}_{\pi (i)}), i=[n], p \ge 0 \end{aligned}$$
(9)

where p is a tuning parameter and S is the SM. Further, \(\tilde{g}^{AA}(\overline{A}_i)\) is normalized by \(\tilde{g}^{AA}(\overline{A}_n)\), i.e., \(g^{AA}(\overline{A}_i) = \frac{\tilde{g}^{AA}(\overline{A}_i)}{\tilde{g}^{AA}(\overline{A}_n)}.\)
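Since (9) is additive, the worth of any subset is the sum of per-source scores, each being the summed similarity of that source to all other sources. The sketch below makes this explicit, using the Jaccard SM as S. Reading the inner sum of (9) as running over all sources other than the one being added is an assumption of this sketch, as are the function names and example intervals.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard similarity of two closed intervals with non-zero width."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union

def additive_agreement_fm(intervals, p=1.0, sim=jaccard):
    """Normalized AA FM as a dict: frozenset of source indices -> worth."""
    n = len(intervals)
    # Score of source i: summed similarity (raised to p) to every other source.
    score = [
        sum(sim(intervals[j], intervals[i]) ** p for j in range(n) if j != i)
        for i in range(n)
    ]
    total = sum(score)
    if total == 0:
        # No pair of sources overlaps, so the normalization is undefined.
        raise ValueError("AA FM cannot be normalized without any agreement")
    return {
        frozenset(s): sum(score[i] for i in s) / total
        for r in range(n + 1)
        for s in combinations(range(n), r)
    }

h_bar = [(0, 2), (1, 3), (2.5, 4)]
g_aa = additive_agreement_fm(h_bar)
print([round(g_aa[frozenset({i})], 3) for i in range(3)])  # [0.333, 0.5, 0.167]
```

The middle interval overlaps both neighbours and therefore receives the largest density. The explicit error for the zero-agreement case reflects that the normalization would otherwise divide by zero.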

3.3 Fuzzy Measure of Generalized Accord

Havens et al. [12] proposed the GenA FM leveraging a generic SM to estimate the agreement (accord) of subsets of sources. The GenA FM is defined as

$$\begin{aligned} g^{GenA}(\overline{A}_i)=\alpha _{\overline{h}}\sum _{K=2}^{i}S_K(\overline{A}_i), \end{aligned}$$
(10)

where \(\overline{A}_i=\{\overline{h}_{\pi (1)},\overline{h}_{\pi (2)},...,\overline{h}_{\pi (i)}\}\) is the permuted set of intervals with \(\overline{A}_0=\emptyset \), and \(S_K(\overline{A}_i)\) is defined in (11).

$$\begin{aligned} S_K(\overline{A}_i)= \left( {\begin{array}{c}n\\ K\end{array}}\right) ^{-1}\sum _{k_1=1}^{i-K+1}\sum _{k_2=k_1+1}^{i-K+2}...\sum _{k_K=k_{K-1}+1}^{i} S(\{\overline{h}_{\pi (k_1)}, \overline{h}_{\pi (k_2)},...,\overline{h}_{\pi (k_K)}\}) \end{aligned}$$
(11)

Here, \(\left( {\begin{array}{c}n\\ K\end{array}}\right) \) is the number of possible K-tuples in \(\overline{h}\) and S is the SM. The quantity \(S_K(\overline{A}_i)\) is the sum of similarities of the K-tuples in \(\overline{A}_i \subseteq \overline{h}\), weighted by \(\left( {\begin{array}{c}n\\ K\end{array}}\right) ^{-1}\). Further, the constant \(\alpha _{\overline{h}}\) is defined in (12) so that \(g^{GenA}(\overline{h})=1\).

$$\begin{aligned} \alpha _{\overline{h}} = \left( \sum _{K=2}^n S_K(\bar{A}_n)\right) ^{-1} \end{aligned}$$
(12)

In [11, 12], the GenA and AA FMs are explored with respect to popular SMs (within (11) and (9)). As detailed in [14, 15], however, the Jaccard and Dice SMs are liable to aliasing, which causes the GenA and AA FMs to generate the same worth for very different subsets of sources; this in turn affects the quality of the overall aggregation. To avoid this, in the next section we leverage the recently introduced bidirectional subsethood based SM (which minimizes aliasing) to design a new instance of the GenA FM.

4 A New Instance of the Agreement Fuzzy Measure Based on Bidirectional Subsethood

Here, we develop a new instance of agreement FM following the concept of the GenA FM and exploit the new bidirectional subsethood based SM for computing the source agreement. As the new SM minimizes aliasing, it helps the proposed FM avoid generating the same agreement and worth for different subsets of sources. This section first defines the subsethood for a set of intervals. Then, the new SM at (5) is revisited to enable it to compute similarity for a set of intervals. Finally, the new instance of agreement FM involving the new SM is introduced.

4.1 Defining Subsethood for a Set of Intervals

The subsethood of an interval \(\overline{h}_r\) with respect to a set of intervals \(\overline{A}_{i}\subseteq \overline{h}\) is defined as the mean of its subsethood in each interval \(\overline{h}_t\) in \(\overline{A}_{i}\). It is expressed as

$$\begin{aligned} S_h(\overline{h}_r,\overline{A}_{i}) = \frac{1}{|\overline{A}_{i}|}\sum \limits _{\overline{h}_t\in \overline{A}_{i}}S_h(\overline{h}_r,\overline{h}_t)= \frac{1}{|\overline{A}_{i}|}\sum \limits _{\overline{h}_t\in \overline{A}_{i}}\frac{|\overline{h}_r\cap \overline{h}_t|}{|\overline{h}_r|}, \end{aligned}$$
(13)

where \(S_h(\overline{h}_r,\overline{A}_{i})\in [0,1]\), such that \(S_h(\overline{h}_r,\overline{A}_{i})=1\) when \(\overline{h}_r \subseteq \overline{h}_t\) for all \(\overline{h}_t\in \overline{A}_{i}\), and \(S_h(\overline{h}_r,\overline{A}_{i})=0\) when \(|\overline{h}_r \cap \overline{h}_t|=0\) for all \(\overline{h}_t\in \overline{A}_{i}\).
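Equation (13) amounts to averaging pairwise subsethoods, as the short sketch below illustrates. The function names and the example intervals are assumptions made for illustration.

```python
def overlap(a, b):
    """Size of the intersection of two closed intervals (0 if they do not overlap)."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))

def subsethood(a, b):
    """Degree to which interval a is a subset of interval b."""
    return overlap(a, b) / (a[1] - a[0])

def set_subsethood(h_r, others):
    """Eq. (13): mean subsethood of h_r in each interval of `others`."""
    return sum(subsethood(h_r, h_t) for h_t in others) / len(others)

h_r = (3, 10)
others = [(0, 3), (1, 6)]
# h_r shares no length with (0, 3) and 3 of its 7 units with (1, 6).
print(round(set_subsethood(h_r, others), 4))  # (0 + 3/7) / 2 = 0.2143
```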

4.2 Defining Bidirectional Subsethood Based Similarity Measure for a Set of Intervals

The bidirectional subsethood based SM, \(S_{S_h}\) for \(\overline{h}\) is the t-norm (\(\bigstar \)) of their reciprocal subsethoods, i.e.,

$$\begin{aligned} \begin{aligned} S_{S_h}\left( \overline{h}\right)&= \bigstar \left( S_h(\overline{h}_1,\{\overline{h}_2,...,\overline{h}_n\}),...,S_h(\overline{h}_n,\{\overline{h}_1,...,\overline{h}_{n-1}\})\right) \\&=\bigstar \left( S_h(\overline{h}_1,\overline{h}\backslash \overline{h}_1),...,S_h(\overline{h}_n,\overline{h}\backslash \overline{h}_n)\right) \end{aligned} \end{aligned}$$
(14)

where \(\overline{h}\backslash \overline{h}_i\) is the nonempty subset of intervals excluding \(\overline{h}_i\), \(i\in \{1,...,n\}\). In this paper, we use the minimum t-norm (\(\bigstar \)) as it is the most common in practice.
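A minimal sketch of (14) is given below. The t-norm is passed as a binary function and folded over the n reciprocal subsethoods, with the minimum as the default. The function names and example intervals are hypothetical.

```python
from functools import reduce
import operator

def overlap(a, b):
    """Size of the intersection of two closed intervals (0 if they do not overlap)."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))

def set_subsethood(h_r, others):
    """Eq. (13): mean subsethood of h_r in each interval of `others`."""
    return sum(overlap(h_r, h_t) for h_t in others) / (len(others) * (h_r[1] - h_r[0]))

def set_similarity(intervals, t_norm=min):
    """Eq. (14): t-norm of each interval's subsethood in the remaining intervals."""
    intervals = list(intervals)
    if len(intervals) < 2:
        raise ValueError("at least two intervals are required")
    values = [
        set_subsethood(h_i, intervals[:i] + intervals[i + 1:])
        for i, h_i in enumerate(intervals)
    ]
    return reduce(t_norm, values)

h_bar = [(3, 10), (0, 3), (1, 6)]
print(round(set_similarity(h_bar), 4))                # minimum t-norm: 0.2143
print(round(set_similarity(h_bar, operator.mul), 4))  # product t-norm: 0.0357
print(set_similarity([(0, 2), (1, 3)]))               # two intervals: 0.5
```

For two intervals the function reduces to the pairwise measure of Sect. 2.4.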

4.3 Bidirectional Subsethood Based Agreement Fuzzy Measure

Consider again the set of n intervals, \(\overline{h}\). For any nonempty subset \(\overline{A}_i\subseteq \overline{h}\), \(1\le i\le n\), the new FM \(\tilde{g}^{AS_h}\) using the new SM (14) is defined as follows (it is later normalized to a proper FM, \(g^{AS_h}\)):

$$\begin{aligned} \tilde{g}^{AS_h}(\overline{A}_0)&= 0,\end{aligned}$$
(15a)
$$\begin{aligned} \tilde{g}^{AS_h}(\overline{A}_1)&= \left( {\begin{array}{c}n\\ 1\end{array}}\right) ^{-1}\times \sum _{k_1=1}^{1}S_{S_h}\left( \overline{h}_{k_1},\overline{h}_{k_1}\right) =\frac{1}{n},\end{aligned}$$
(15b)
$$\begin{aligned} \tilde{g}^{AS_h}(\overline{A}_i)&=i \times \tilde{g}^{AS_h}(\overline{A}_1)+\left( {\begin{array}{c}n\\ 2\end{array}}\right) ^{-1}\sum _{k_1=1}^{i-1}\sum _{k_2=k_1+1}^{i}S_{S_h}\left( \overline{h}_{k_1},\overline{h}_{k_2}\right) +...\\ {}&+\left( {\begin{array}{c}n\\ i\end{array}}\right) ^{-1}S_{S_h}\left( \overline{h}_1,...,\overline{h}_{i}\right) ,\nonumber \end{aligned}$$
(15c)

where \(\overline{A}_0=\emptyset \), \(\overline{A}_1\) is a singleton subset, and \(\overline{A}_i\) is a non-singleton subset with i sources, \(1<i\le n\). \(\left( {\begin{array}{c}n\\ K\end{array}}\right) \) is the total number of K-tuples in the set \(\overline{h}\), where \(1\le K\le n\). (15a) is the worth of \(\overline{A}_0\), which is always 0. (15b) is the worth of \(\overline{A}_1\), which is the similarity of a single interval to itself (i.e., 1), weighted by \(\left( {\begin{array}{c}n\\ 1\end{array}}\right) ^{-1}\). (15c) is the worth of \(\overline{A}_i\), which is the sum of the similarities of all K-tuples in \(\overline{A}_i\), \(1\le K\le i\), weighted by \(\left( {\begin{array}{c}n\\ K\end{array}}\right) ^{-1}\).

Remark 1

(15b) captures the worth of singleton subsets (\(\overline{A}_1\)) which is, \(\tilde{g}^{AS_h}(\overline{A}_1)\) = \(\frac{1}{n}\), where \(n=|\overline{h}|\). For a non-singleton subset consisting of all disagreeing sources, the inclusion of the worth of the singleton subsets in (15c) enables it to generate the worth information for this set.

Following [11, 12], (15c) is rewritten as follows,

$$\begin{aligned} \tilde{g}^{AS_h}(\overline{A}_i) = \frac{i}{n}+\sum _{K=2}^i\left[ \left( {\begin{array}{c}n\\ K\end{array}}\right) ^{-1} Z_K(\overline{A}_i)\right] , \text { }i\ge 1, \end{aligned}$$
(16)

where the first part of (16) is the sum of the worth of all singletons in \(\overline{A}_i\). The second part is the sum of the similarities of all K-tuples in \(\overline{A}_i\) (\(K\ge 2\)), weighted by \(\left( {\begin{array}{c}n\\ K\end{array}}\right) ^{-1}\). \(Z_K(\overline{A}_i)\) captures the cumulative similarity for all K-tuples in \(\overline{A}_i\) (\(K\ge 2\)) using (14) and is defined in (17).

$$\begin{aligned} Z_K\left( \overline{A}_i\right) =\sum \nolimits _{{k_1=1}}^{{i-K+1}}\sum \nolimits _{{\begin{array}{c} k_2=\atop k_1+1 \end{array}}}^{{i-K+2}}...\sum \nolimits _{{\begin{array}{c} k_K=\atop k_{K-1}+1 \end{array}}}^{{i}} \bigstar \left( S_h(\overline{h}_{k_1 },\overline{A}_i\backslash \overline{h}_{k_1}),..., S_h(\overline{h}_{k_K},\overline{A}_i\backslash \overline{h}_{k_K})\right) \end{aligned}$$
(17)

Finally, \(\tilde{g}^{AS_h}(\overline{A}_i)\) is normalized by \(\tilde{g}^{AS_h}(\overline{h})\) in (18) so that \(g^{AS_h}(\overline{A}_i)\le 1\) and \(g^{AS_h}(\overline{h})=1\), which maintains the bounded property of the FM.

$$\begin{aligned} g^{AS_h}(\overline{A}_i) = \frac{\tilde{g}^{AS_h}(\overline{A}_i)}{\tilde{g}^{AS_h}(\overline{h})},\text { } 1\le i\le n. \end{aligned}$$
(18)
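The construction in (15)–(18) can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation. The similarity of each K-tuple is computed within the tuple itself, as in (14) and (15c), with the minimum t-norm. The three intervals are inferred from the endpoint values quoted in Example 2 and are therefore an assumption about the dataset \(\overline{r}\).

```python
from itertools import combinations
from math import comb

def overlap(a, b):
    """Size of the intersection of two closed intervals (0 if they do not overlap)."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))

def tuple_similarity(tup):
    """Eq. (14) with the minimum t-norm for a list of at least two intervals."""
    scores = []
    for i, h_i in enumerate(tup):
        rest = tup[:i] + tup[i + 1:]
        # Eq. (13): mean subsethood of h_i in the other intervals of the tuple.
        scores.append(sum(overlap(h_i, h_t) for h_t in rest)
                      / (len(rest) * (h_i[1] - h_i[0])))
    return min(scores)

def agreement_fm(intervals):
    """Normalized FM as a dict: frozenset of source indices -> worth."""
    n = len(intervals)
    # Similarity of every K-tuple of sources, K >= 2, keyed by sorted index tuple.
    sim = {t: tuple_similarity([intervals[i] for i in t])
           for K in range(2, n + 1) for t in combinations(range(n), K)}

    def raw_worth(subset):
        worth = len(subset) / n                      # singleton part of Eq. (16)
        for K in range(2, len(subset) + 1):          # agreement part of Eq. (16)
            worth += sum(sim[t] for t in combinations(subset, K)) / comb(n, K)
        return worth

    full = raw_worth(tuple(range(n)))                # normalizer of Eq. (18)
    return {frozenset(s): raw_worth(s) / full
            for r in range(n + 1) for s in combinations(range(n), r)}

# Assumed intervals for sources 1, 2, 3 (stored at indices 0, 1, 2).
r_bar = [(3, 10), (0, 3), (1, 6)]
g = agreement_fm(r_bar)
print(round(g[frozenset({0})], 2), round(g[frozenset({0, 2})], 2))  # 0.22 0.54
```

Under these assumptions the rounded worths of \(\{1\}\) and \(\{1,3\}\) agree with the lattice values 0.22 and 0.54 used in Example 2. The enumeration visits every tuple of every subset and is therefore exponential in n, which is acceptable only for a small number of sources.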

In the following, Example 1 demonstrates that, unlike the \(g^{GenA}\) and \(g^{AA}\) FMs, the new instance of the agreement FM, \(g^{AS_h}\), avoids generating the same agreement and worth for different sets of sources. In addition, Example 2 presents the interval aggregation using the \(g^{AS_h}\) FM and the CFI.

Fig. 3. Example showing that the \(g^{AS_h}\) FM avoids generating the same FM lattice for different subsets of sources. Any subset \(\{\overline{h}_1,\overline{h}_2\}\) or \(\{\overline{r}_1,\overline{r}_2\}\) is presented as \(\{1,2\}\).

Example 1: Consider two interval-valued datasets, \(\overline{h}\) and \(\overline{r}\), as shown in Fig. 3. Their corresponding FM lattices using the \(g^{AS_h}\), \(g^{AG}\), \(g^{GenA}\), and \(g^{AA}\) FMs are also shown in Fig. 3 (we skip showing the FM values for \(\emptyset \) and \(\overline{h}\)). Due to aliasing of the Jaccard SM, both the \(g^{GenA}\) and \(g^{AA}\) FMs generate the same FM lattices for these sets, whereas the \(g^{AS_h}\) and \(g^{AG}\) FMs generate distinct FM lattices.

Example 2: Consider the interval-valued dataset \(\overline{r}\) in Fig. 3(b) and its corresponding \(g^{AS_h}\) FM lattice in Fig. 3(d). Using (1), the aggregation of the left interval endpoints is \(CFI_g(r^-)=3\times [g^{AS_h}(\{1\})-g^{AS_h}(\emptyset )]+1\times [g^{AS_h}(\{1,3\})-g^{AS_h}(\{1\})]+0\times [g^{AS_h}(\{1,2,3\})-g^{AS_h}(\{1,3\})]=3\times [0.22-0]+1\times [0.54-0.22]+0\times [1-0.54]=0.98\). Similarly, the aggregation of the right interval endpoints is \(CFI_g(r^+)=10\times [0.22-0]+6\times [0.54-0.22]+3\times [1-0.54]=5.5\). Finally, using (2), the interval aggregation is \(CFI_g(\overline{r})=[CFI_g(r^-),CFI_g(r^+)]=[0.98,5.5]\).
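The arithmetic of Example 2 can be checked with a short script. The lattice values 0.22, 0.54 and 1 are taken from the example. The assignment of endpoints to sources (source 1 as [3, 10], source 2 as [0, 3], source 3 as [1, 6]) is inferred from the order of the terms in the example and is an assumption, as are the function and variable names.

```python
def choquet_integral(h, g):
    """Discrete Choquet integral of Eq. (1) for a dict h and an FM dict g."""
    order = sorted(h, key=h.get, reverse=True)
    total, prev_worth, subset = 0.0, 0.0, frozenset()
    for x in order:
        subset = subset | {x}
        total += h[x] * (g[subset] - prev_worth)
        prev_worth = g[subset]
    return total

# Only the lattice entries visited by the sorted order are needed.
g = {
    frozenset({1}): 0.22,
    frozenset({1, 3}): 0.54,
    frozenset({1, 2, 3}): 1.0,
}
r_bar = {1: (3, 10), 2: (0, 3), 3: (1, 6)}  # assumed endpoint assignment

left = choquet_integral({x: iv[0] for x, iv in r_bar.items()}, g)
right = choquet_integral({x: iv[1] for x, iv in r_bar.items()}, g)
print(round(left, 2), round(right, 2))  # 0.98 5.5
```

Both endpoint sets are sorted in the same source order here (1, 3, 2), so the same chain of subsets is used for the left and right endpoints.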

Fig. 4. Comparison of aggregation results from the CFI with the \(AS_h\), AG, GenA and AA FMs for four different interval sets.

5 Demonstration

This section demonstrates the behaviour of the new FM against the AG, GenA, and AA FMs for two synthetic datasets and a real-world example. For convenience, the new instance of agreement FM is denoted as \(AS_h\) and the CFI is used throughout. Further, the Jaccard SM is used for the GenA and AA FMs, and AVG represents the arithmetic mean of the left and right endpoints of the intervals respectively. In all experiments, we follow the assumption that no worth information of sources is available (e.g. as in crowdsourcing). If there was such information, it could be captured and a meta-measure could be created (see [21]).

5.1 Demonstration with Synthetic Dataset-1

Figure 4 shows four examples of synthetic datasets together with aggregated results based on the CFI using the \(AS_h\), AG, GenA, and AA FMs.

(1) The interval-valued set-I shown in Fig. 4(a) consists of three smaller intervals \(\overline{h}_4\), \(\overline{h}_5\), and \(\overline{h}_6\) that agree completely and three larger intervals \(\overline{h}_1\), \(\overline{h}_2\) and \(\overline{h}_3\) agreeing to a certain degree. The aggregation results (Fig. 4(a)) show that the AG FM gives importance only to the subset of larger intervals, whereas the GenA and AA FMs are influenced by the subset of smaller intervals as they agree totally. However, the \(AS_h\) FM not only gives more importance to the subset of smaller intervals having a complete agreement, but also considers other subsets, \(\{\overline{h}_1,\overline{h}_3\}\) and \(\{\overline{h}_2,\overline{h}_3\}\) with agreement to a certain degree.

(2) For the interval-valued set-II shown in Fig. 4(b), there are three intervals \(\overline{h}_1\), \(\overline{h}_2\) and \(\overline{h}_3\) having higher agreement than three other intervals \(\overline{h}_4\), \(\overline{h}_5\) and \(\overline{h}_6\). Here, the AG FM is greatly influenced by the subset \(\{\overline{h}_1,\overline{h}_2,\overline{h}_3\}\), whereas the GenA, AA, and \(AS_h\) FMs show more balanced aggregation by considering the two subsets (\(\{\overline{h}_1,\overline{h}_2,\overline{h}_3\}\) and \(\{\overline{h}_4,\overline{h}_5,\overline{h}_6\}\)) when used with the CFI.

(3) The interval-valued set-III shown in Fig. 4(c) includes three intervals that agree with each other completely and three that wholly disagree. Here, the AG, GenA and AA FMs are completely influenced by the subset of agreeing intervals, i.e., \(\{\overline{h}_4,\overline{h}_5,\overline{h}_6\}\). Like the other FMs, the \(AS_h\) FM shows the influence of the subset \(\{\overline{h}_4,\overline{h}_5,\overline{h}_6\}\); at the same time, it also considers the disagreeing singletons \(\overline{h}_1\), \(\overline{h}_2\), and \(\overline{h}_3\).

(4) The interval-valued set-IV shown in Fig. 4(d) consists of five intervals, none of which overlap. In this situation, the AG, GenA, and AA FMs are not designed to generate the worth information for the subsets of sources and hence do not provide an aggregation when combined with the CFI. In contrast, the \(AS_h\) FM, by its construction, assigns worth to all singletons, which is later normalized by \(\tilde{g}^{AS_h}(\overline{h})\). Even though there is no agreement amongst the sources regarding their intervals, the \(AS_h\) FM can still estimate the worth of other subsets by utilizing the worth of the singletons. Table 2 shows the normalized worth of all subsets of intervals for dataset-IV (in Fig. 4(d)) using the \(AS_h\) FM, where all intervals are in complete disagreement. Intuitively, when there is no overlap between intervals and all intervals are unique, all sources should be treated with equal worth and the aggregation should be equal to the average. In Fig. 4(d), only the \(AS_h\) FM with the CFI generates the aggregation results accordingly (i.e., performs like an average operator).
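The degradation to an average can be verified numerically. With no overlap, every similarity term in (16) is 0, so the worth of a subset with i sources is \(\frac{i}{n}\) and the normalizer in (18) equals 1. The sketch below builds this FM directly and compares the CFI with the arithmetic mean of the endpoints. The five intervals are hypothetical and are not those of Fig. 4(d).

```python
from itertools import combinations
from statistics import mean

def choquet_integral(values, g):
    """Discrete Choquet integral of Eq. (1) for a list of values indexed by source."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    total, prev_worth, subset = 0.0, 0.0, frozenset()
    for i in order:
        subset = subset | {i}
        total += values[i] * (g[subset] - prev_worth)
        prev_worth = g[subset]
    return total

# Five mutually disjoint intervals (hypothetical data).
intervals = [(0, 1), (2, 3), (4, 6), (7, 8), (9, 12)]
n = len(intervals)

# All K-tuple similarities are 0, so Eq. (16) leaves g(A) = |A| / n.
g = {frozenset(s): len(s) / n
     for r in range(n + 1) for s in combinations(range(n), r)}

lefts = [iv[0] for iv in intervals]
rights = [iv[1] for iv in intervals]
fused = (choquet_integral(lefts, g), choquet_integral(rights, g))
average = (mean(lefts), mean(rights))
print([round(v, 6) for v in fused], average)  # both are [4.4, 6.0] up to rounding
```

Each source contributes with weight \(\frac{1}{n}\) regardless of the sorting order, which is why the CFI coincides with the AVG operator in this case.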

Table 2. The normalized worth of subsets of intervals using the \(AS_h\) FM (\(g^{AS_h}\))

5.2 Demonstration with Synthetic Dataset-2

Here, we investigate how the FMs in combination with the CFI behave in producing the aggregation result when the overlap between intervals is gradually decreased. Five different sets of two intervals \(\overline{h}_1\) and \(\overline{h}_2\) are considered in Fig. 5(a) with \(100\%\), \(75\%\), \(50\%\), \(25\%\), and \(0\%\) overlap respectively. Note that \(\overline{h}_1\) is set to [0, 1] in all five sets, while \(\overline{h}_2\) is altered depending on the \(\%\) of overlap. Figure 5(b) shows that all FMs (used with the CFI) aggregate the intervals identically (i.e., to [0, 1]) when \(100\%\) overlap exists. However, as the overlap degrades, the AG and GenA FMs continue to produce the same aggregation (i.e., [0, 1]), whereas the AA and \(AS_h\) FMs follow the overlap degradation and aggregate the intervals accordingly. Finally, when the intervals are in complete disagreement (i.e., \(0\%\) overlap), the \(AS_h\) FM with the CFI performs like an average operator, whereas the other FMs do not support aggregation.

5.3 A Real-World Example

This experiment uses the outcomes of different ageing methods (Pubic Symphysis (PS), Auricular Surface (AS), Ectocranial Suture-Vault (ESV), and Ectocranial Suture-Lateral Anterior (ESLA)) to estimate the age-at-death of an individual skeleton [3], which is useful for forensic and biological anthropologists. Each method provides an estimated age range for the individual skeleton. As the worth of the ageing methods is unknown, our aim here is to fuse their estimated age ranges directly to obtain a combined view of the skeletal age-at-death. In this aggregation experiment, the more intuitive aggregation outcome is likely to be a narrow age range capturing the actual age-at-death. Figure 6 presents the estimated age range of each ageing method for three individual skeletons together with their true chronological age-at-death. Figure 6 also shows the aggregation results for all agreement FMs when used with the CFI. The results reveal that the \(g^{AS_h}\) FM specifies the age range more narrowly (while also capturing the true chronological age-at-death) compared to the other agreement-based FMs. While this is only one example and not an extensive study, it demonstrates the potential robustness of the aggregation outcome of the proposed agreement FM.

Fig. 5. (a) Five sets of interval-valued data with degrading interval overlap. (b) Aggregation results of the AG, GenA, AA and \(AS_h\) FMs with the CFI for Fig. 5(a).

Fig. 6. Aggregation of the estimated age ranges of four different ageing methods using the agreement FMs with the CFI. The vertical line shows the chronological age-at-death.

6 Conclusions

As the agreement calculation of agreement FMs is affected by the limitations of popular SMs, this paper has developed a new instance of an evidence-driven agreement FM for interval-valued datasets, building on the structure of the GenA FM and leveraging a recently introduced SM [14, 15] to better capture the inter-source agreement and worth estimation. Further, the proposed FM is designed to deal with cases where no agreement exists amongst the evidence arising from sources. Here, in combination with the CFI, it gracefully degrades to an average operator, whereas existing agreement FMs are not designed to deal with such instances. The behaviour of this FM has been compared with existing agreement FMs by aggregating both synthetic and real interval-valued data in combination with the CFI, showing that it provides robust and qualitatively superior outcomes in agreement-based data aggregation. In future, we will experiment with this new instance of the agreement FM in combination with the FI for aggregating fuzzy set-valued data. In addition, we will extend this FM to address the asymmetry issue noted in [11].