Introduction

In recent years, multi-label classification has occupied an important position in the fields of artificial intelligence and machine learning; it has attracted the attention of more and more scholars, and a series of multi-label classification methods have been proposed [1,2,3,4,5]. In traditional classification learning, each sample has only one category label, namely single-label learning [6, 7]. However, in actual applications, most samples may belong to multiple category labels at the same time, which is named multi-label learning [8,9,10]. Multi-label data contain a large number of features, some of which may be irrelevant or redundant, leading to problems such as high computational cost, over-fitting, low classification performance of multi-label learning algorithms and a long classification learning process. Therefore, dimension reduction of multi-label data is a focus of current research. Feature selection is one of the most common dimensionality reduction methods for analyzing high-dimensional multi-label data; it aims to eliminate redundant and irrelevant features in the classification learning task and extract useful information [11,12,13].

With the increasing availability of multi-label data, in which an instance is related to multiple labels, a great number of feature selection methods for multi-label learning have been developed to reduce dimensionality and improve learning performance [14,15,16,17]. These methods can commonly be divided into three categories: filter [18,19,20], wrapper [21, 22] and embedded [23] methods. The filter method is independent of the specific learner and has lower computational cost and stronger generalization ability; therefore, our proposed method focuses on the filter strategy. The evaluation criteria commonly used in filter methods include information measures [24,25,26,27,28,29,30,31,32], dependency measures [33,34,35,36,37,38], distance measures [39,40,41,42] and consistency measures [43, 44].

Rough set theory is a well-known method for dealing with uncertain data; it requires no prior information other than the data themselves, so it has been widely used in feature selection [45]. However, traditional rough set theory is based on the equivalence relation and is only suitable for discrete data. To solve this problem, some scholars have extended the rough set model. For example, the neighborhood rough sets model (NRS), the most common model for numerical data, replaces the equivalence relation with the neighborhood relation. Duan et al. [46] defined the lower approximation and dependency of NRS in multi-label learning and proposed a multi-label feature selection algorithm based on the neighborhood rough sets model (MNRS). Unfortunately, NRS cannot deal effectively with the fuzziness of data. Lin et al. [47] therefore used different fuzzy relations to construct a multi-label fuzzy rough sets model (MFRS), which estimates the similarity between samples under different labels, directly evaluates the attributes of multi-label data, solves the problem of low separability of fuzzy similarity and defines a dependency function. However, fuzzy rough sets (FRS) are sensitive to noise; noisy data affect the calculation of the fuzzy lower approximation and limit their practical application [48]. To solve the above problems, the fuzzy neighborhood rough sets model (FNRS) was designed. Wang et al. [49] combined NRS with FRS and proposed a feature selection algorithm based on FNRS that uses dependency to select the feature subset. Chen et al. [48] designed a multi-label attribute reduction method based on variable-precision FNRS, which uses parameterized fuzzy neighborhood granules to define the fuzzy decision and decision classes and calculates the importance of features with a dependency measure; however, reduction based on the positive region does not take into account the influence of the uncertain information in the upper approximation on the importance of an attribute. Inspired by these observations, this paper designs a multi-label feature selection method based on FNRS into which the approximation accuracy is introduced.

In recent decades, multi-label feature selection methods have been developed from two views. The first is the algebra view based on approximation accuracy, which considers the effect of features on the labels through the change of approximation accuracy and thereby confirms whether these features can be eliminated. For instance, Liang et al. [17] presented the selection of the optimal number of particles in the multi-grain and multi-label decision table, which makes positive-region reduction more suitable for multi-label datasets. Li et al. [35] designed a robust MFRS using kernelized information and obtained a lower approximation. The second is the information view based on information entropy, which considers the influence of features on the decision subset through information entropy and decides whether these features can be eliminated. For example, Lin et al. [25] designed a multi-label feature selection method based on neighborhood mutual information, extended neighborhood information entropy to multi-label data and introduced three new measurement methods. Li et al. [29] developed a multi-label feature selection method based on information gain, which measures the correlation between features and labels. Xu et al. [24] proposed a fuzzy neighborhood conditional entropy for feature selection. Inspired by these contributions, we design a novel fuzzy neighborhood conditional entropy to judge whether to exclude features in multi-label data. However, each of these methods alone cannot provide an accurate and comprehensive assessment of the importance of features from different perspectives. Therefore, Sun et al. [39] developed a multi-label feature selection method that combines neighborhood mutual information with approximation accuracy in multi-label neighborhood decision systems, and this combination of the two views achieved great classification performance. Combining the above contributions, this paper proposes a multi-label feature selection method that combines the fuzzy neighborhood conditional entropy with the approximation accuracy to evaluate the importance of features from both views. The major contributions of this article can be briefly described as follows:

  • Considering that the similarity of samples is also affected by 0-valued labels, the average value of the decision under different labels is calculated as the fuzzy decision. The concepts of fuzzy neighborhood upper approximation, lower approximation and fuzzy neighborhood approximation accuracy are proposed, which improves the integrity of the multi-label fuzzy neighborhood decision system.

  • By improving the single-label fuzzy neighborhood entropy, this work proposes the definitions of fuzzy neighborhood information entropy, fuzzy neighborhood joint entropy and fuzzy neighborhood conditional entropy for multi-label data, and discusses their related properties and proofs.

  • Combining the approximate accuracy of fuzzy neighborhood under the view of algebra with the fuzzy neighborhood conditional entropy under the view of information theory, a mixed measure method is proposed to evaluate the correlation between feature subset and label set in the multi-label fuzzy neighborhood decision system. Finally, a forward multi-label feature selection algorithm based on fuzzy neighborhood rough sets is designed for multi-label classification.

The remainder of this paper is structured as follows. The next section briefly introduces the related knowledge of NRS, MNRS and FNRS. In the subsequent section, the fuzzy neighborhood rough set model, fuzzy neighborhood conditional entropy and hybrid measure are introduced. The multi-label feature selection algorithm is designed in the next section. Then the experimental results are provided. Finally, the conclusions of our research are provided in the last section.

Related knowledge

Classical neighborhood rough sets

Suppose there exists a neighborhood decision system, which can be simplified as NDS \(=<U,A\bigcup D,V,\varDelta ,\delta>\), where \(U=\{{{x}_{1}},{{x}_{2}},\ldots , {{x}_{n}}\}\) is a nonempty set of samples; \(A=\{{{a}_{1}},{{a}_{2}},\ldots , {{a}_{m}}\}\) is a set of features; D is the decision class of samples; \(V={{\bigcup }}_{a\in A}\,{{V}_{a}}\), where \({{V}_{a}}\) is the value domain of feature a; \(\varDelta \) indicates a distance function; and \(\delta (0\le \delta \le 1)\) is a neighborhood radius. \(\varDelta \) satisfies the following properties [50]:

  1. (1)

    \(\forall {{x}_{1}},{{x}_{2}}\in U,\varDelta ({{x}_{1}},{{x}_{2}})\ge 0, \)where \(\varDelta ({{x}_{1}},{{x}_{2}})=0\) if and only if \({{x}_{1}}={{x}_{2}}\);

  2. (2)

    \(\forall {{x}_{1}},{{x}_{2}}\in U,\varDelta ({{x}_{1}},{{x}_{2}})=\varDelta ({{x}_{2}},{{x}_{1}})\);

  3. (3)

    \(\forall {{x}_{1}},{{x}_{2}},{{x}_{3}}\in U,\varDelta ({{x}_{1}},{{x}_{3}})\le \varDelta ({{x}_{1}},{{x}_{2}})+\varDelta ({{x}_{2}},{{x}_{3}})\).

Then \(\left\langle U,\varDelta \right\rangle \) is called metric space, in general, the distance in the metric space can be expressed as

$$\begin{aligned}\varDelta ({{x}_{i}},{{x}_{j}})={{\left( \sum \limits _{a=1}^{m}{{{\left| {{x}_{ia}}-{{x}_{ja}} \right| }^{p}}}\right) }^{1/p}}, \end{aligned}$$

when \(p=1\), \(\varDelta \) is the Manhattan distance; when \(p=2\), \(\varDelta \) is the Euclidean distance; when \(p\rightarrow \infty \), \(\varDelta ({{x}_{i}},{{x}_{j}})={\max }_{a}\,\left| {{x}_{ia}}-{{x}_{ja}} \right| \) (the Chebyshev distance).

Given the nonempty metric space \(<U,\varDelta>\), for \(\forall B\subseteq A\), \({{\delta }_{B}}(x)=\{y\left| x,y\in U,\varDelta (x,y)\le \delta , \right. \delta \ge 0\}\) [46], where \(\varDelta (x,y)\) measures the distance between x and y; \({{\delta }_{B}}(x)\) is also called the neighborhood granule of x under B.
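
To make this construction concrete, the following is a minimal NumPy sketch of the neighborhood granule \({{\delta }_{B}}(x)\) under the Minkowski distance given above; the array X, the feature subset B and the radius used here are illustrative assumptions rather than data from this paper.

```python
import numpy as np

def minkowski(x, y, p=2):
    """Minkowski distance between two feature vectors (p=1 Manhattan, p=2 Euclidean)."""
    return np.sum(np.abs(x - y) ** p) ** (1.0 / p)

def neighborhood_granule(X, i, B, delta, p=2):
    """Indices of samples whose distance to sample i, restricted to features B, is <= delta."""
    xi = X[i, B]
    return [j for j in range(X.shape[0]) if minkowski(xi, X[j, B], p) <= delta]

# Illustrative data: 5 samples described by 3 numerical features in [0, 1].
X = np.array([[0.1, 0.4, 0.9],
              [0.2, 0.5, 0.8],
              [0.9, 0.1, 0.3],
              [0.15, 0.45, 0.85],
              [0.8, 0.2, 0.4]])
print(neighborhood_granule(X, 0, B=[0, 1, 2], delta=0.3))  # neighborhood of x_1
```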

Multi-label neighborhood rough sets

Suppose there exists a multi-label neighborhood decision system which can be abbreviated to MNDS \(=<U,A\bigcup D,\delta>\), for \(\forall B\subseteq A\), \(D=\{{{d}_{1}},{{d}_{2}},\ldots ,{{d}_{t}}\}\), \({{D}_{i}}=\{{{d}_{j}}|{{d}_{j}}({{x}_{i}})=1,{{d}_{j}}\in D\}\) represents the related label set of \({{x}_{i}}\), and \({{D}^{\text {j}}}=\{{{x}_{i}}|{{d}_{j}}({{x}_{i}})=1,{{x}_{i}}\in U\}\) denotes a set of samples with the label \({{d}_{j}}\). Then the upper approximation and lower approximation of the neighborhood rough sets of D with respect to B are defined [46], respectively, as

$$\begin{aligned} \overline{{{N}_{B}}}D= & {} \left\{ {{x}_{i}}\left| \forall {{d}_{j}} \right. \in {{D}_{i}},{{\delta }_{B}}({{x}_{i}})\bigcap {{D}^{j}}\ne \varnothing ,{{x}_{i}}\in U\right\} , \end{aligned}$$
(1)
$$\begin{aligned} \underline{{{N}_{B}}}D= & {} \{{{x}_{i}}\left| \forall {{d}_{j}}\in {{D}_{i}},{{\delta }_{B}} \right. ({{x}_{i}})\subseteq {{D}^{j}},{{x}_{i}}\in U\}. \end{aligned}$$
(2)

Then, for \(\forall B\subseteq A\), the neighborhood entropy of \({{x}_{i}}\in U\) is expressed [25] as

$$\begin{aligned} N{{E}_{B}}({{x}_{i}})=-\log \frac{\left| {{\delta }_{B}}({{x}_{i}}) \right| }{\left| U \right| }. \end{aligned}$$
(3)

Fuzzy neighborhood rough sets

Suppose there exists a fuzzy neighborhood decision system, abbreviated as FNDS \(=<U,A\bigcup D,\delta>\), where \(U=\{{{x}_{1}},{{x}_{2}},\ldots ,{{x}_{n}}\}\) is the nonempty set of samples and A is the set of features, with \(\forall B\subseteq A\). The fuzzy binary relation \({{R}_{B}}\) is derived from B [49]. For \(\forall x,y\in U\), \({{R}_{B}}(x,y)\) is called a fuzzy similarity relation between samples x and y under the feature set B when it satisfies the following conditions:

  1. (1)

    Reflexivity: \({{R}_{B}}(x,x)=1,\forall x\in U\);

  2. (2)

    Symmetry: \({{R}_{B}}(x,y)={{R}_{B}}(y,x),\forall x,y\in U\).

Then \({{R}_{B}}\) is also known as the fuzzy similarity relation.

Suppose there exists FNDS \( =<U,A\bigcup D,\delta>\) with \(\forall B\subseteq A\), \(\forall a\in B\) and \(\forall x,y\in U\). The fuzzy similarity matrix is \({{\left[ x \right] }_{a}}(y)={{R}_{a}}(x,y)\), where \({{R}_{a}}\) is the fuzzy similarity relation induced by \(a\in B\); then \({{R}_{B}}=\bigcap \nolimits _{a\in B}{{{R}_{a}}}\). The fuzzy similarity matrix of x with respect to B over U is defined [24] as

$$\begin{aligned} {{[x]}_{B}}(y)=\underset{a\in B}{\mathop {\min }}\,({{\left[ x \right] }_{a}}(y)), y\in U. \end{aligned}$$

Given FNDS \(=<U,A\bigcup D,\delta>\) with \(\forall B\subseteq A\), \({U}/{D}=\{{{D}_{1}},{{D}_{2}},\ldots {{D}_{\text {r}}}\}\), for \(\forall x,y\in U\), the parameterized fuzzy neighborhood information granule is constructed as follows:

$$\begin{aligned} {FN_{B}}(x)=\left[ x \right] _{B}^{\delta }(y)=\left\{ \begin{matrix} {{R}_{B}}(x,y), {{R}_{B}}(x,y)\ge \delta \\ 0,\quad {{R}_{B}}(x,y)<\delta \\ \end{matrix} \right. , \end{aligned}$$
(4)

where \(\delta \) is called the fuzzy neighborhood radius and satisfies \(0\le \delta \le 1\). The fuzzy neighborhood of \(\forall x\in U\) can be determined by fuzzy similarity relation \({{R}_{B}}\) and neighborhood radius \(\delta \).
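
As an illustration, the sketch below builds the fuzzy similarity relation attribute by attribute with \(R_a(x,y)=1-|x_a-y_a|\) (the relation used later in Example 2, Eq. (20)), takes the minimum over the attributes in B, and then cuts it with the radius \(\delta \) as in Eq. (4). The data and the function names are assumptions made only for illustration.

```python
import numpy as np

def fuzzy_similarity_matrix(X, B):
    """Fuzzy similarity R_B(x, y) = min_a (1 - |x_a - y_a|) over the features a in B,
    assuming X is normalized into [0, 1] (cf. Eq. (20))."""
    n = X.shape[0]
    R = np.ones((n, n))
    for a in B:
        R = np.minimum(R, 1.0 - np.abs(X[:, a][:, None] - X[:, a][None, :]))
    return R

def fuzzy_neighborhood_granule(R, i, delta):
    """Parameterized fuzzy neighborhood granule of Eq. (4): keep R_B(x_i, y) when it
    reaches the radius delta, set it to 0 otherwise."""
    row = R[i].copy()
    row[row < delta] = 0.0
    return row

# Illustrative data: 4 samples, 2 normalized features.
X = np.array([[0.1, 0.8], [0.2, 0.7], [0.9, 0.1], [0.15, 0.75]])
R = fuzzy_similarity_matrix(X, B=[0, 1])
print(fuzzy_neighborhood_granule(R, 0, delta=0.8))
```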

Let FNDS \(=<U,A\bigcup D,\delta>\) be a fuzzy neighborhood decision system, \({U}/{D}=\{{{D}_{1}},{{D}_{2}},\cdots {{D}_{\text {r}}}\}\), for \(\forall B\subseteq A\), the upper and lower approximations of D with respect to B are expressed, respectively, as

$$\begin{aligned} \overline{FN_{B}^{\delta }}({{D}_{j}})= & {} \left\{ x\in U\left| {FN_{B}} \right. (x)\bigcap {{D}_{j}}\ne \varnothing \right\} , \end{aligned}$$
(5)
$$\begin{aligned} \underline{FN_{B}^{\delta }}({{D}_{j}})= & {} \{x\in U\left| {FN_{B}}(x) \right. \subseteq {{D}_{j}}\}.\end{aligned}$$
(6)

For \(\forall B\subseteq A\), the fuzzy neighborhood approximation accuracy of D with respect to B is described as

$$\begin{aligned} AP_{B}^{\delta }=\frac{\left| \underline{FN_{B}^{\delta }}({{D}_{j}}) \right| }{\left| \overline{FN_{B}^{\delta }}({{D}_{j}}) \right| }. \end{aligned}$$
(7)

Proposed method

In this section, we improve the multi-label fuzzy neighborhood rough set model based on the relevant basic knowledge introduced in the previous section. First, the parameterized fuzzy similarity relation is used to calculate the fuzzy neighborhood granule. Because a sample in multi-label data may belong to multiple labels at the same time, the multi-label fuzzy decision is obtained by averaging the values over multiple labels, which is different from the single-label fuzzy decision. Second, the fuzzy neighborhood approximation accuracy is introduced to account for the uncertain information in the upper approximation. Then the fuzzy neighborhood conditional entropy for multi-label data is proposed. Finally, the fuzzy neighborhood approximation accuracy and the fuzzy neighborhood conditional entropy are combined into a mixed measure, and the relevant proofs are given.

Multi-label fuzzy neighborhood approximation accuracy and fuzzy decision

Definition 1

A multi-label fuzzy neighborhood decision system can be denoted as MFNDS \(=<U,A\bigcup D,T,\delta>\), where \(U=\{{{x}_{1}},{{x}_{2}},\ldots ,{{x}_{n}}\}\) is a nonempty finite set of samples; \(A=\{{{a}_{1}},{{a}_{2}},\ldots ,{{a}_{m}}\}\) indicates a set of features; \(D=\{{{d}_{1}},{{d}_{2}},\ldots ,{{d}_{t}}\}\) represents a set of labels; and \(T=\{({{x}_{i}},A({{x}_{i}}),D({{x}_{i}}))|{{x}_{i}}\in U\}\). For \(\forall {{x}_{i}}\in U\), \(A({{x}_{i}})=({{a}_{1}}({{x}_{i}}),{{a}_{2}}({{x}_{i}}),\ldots ,{{a}_{m}}({{x}_{i}}))\), where \({{a}_{m}}({{x}_{i}})\) is the value of the sample \({{x}_{i}}\) on the feature \({{a}_{m}}\), and \(D({{x}_{i}})=({{d}_{1}}({{x}_{i}}),{{d}_{2}}({{x}_{i}}),\ldots ,{{d}_{t}}({{x}_{i}}))\), where \({{d}_{j}}({{x}_{i}})\in \{0,1\}\) indicates whether the sample \({{x}_{i}}\) contains the label \({{d}_{j}}\): if \({{x}_{i}}\) contains the label \({{d}_{j}}\), then \({{d}_{j}}({{x}_{i}})=1\); otherwise, \({{d}_{j}}({{x}_{i}})=0\).

Definition 2

Given MFNDS \(=<U,A\bigcup D,T,\delta>\), let \(\{D_{0}^{1},D_{1}^{1},D_{0}^{2},D_{1}^{2},\ldots ,D_{1}^{t}\}\) denote a label determined coverage of U, then the parameterized fuzzy decision is constructed as follows:

$$\begin{aligned} {\tilde{D}}_{p}^{\text {j}}(x)=\frac{\left| {{[x]}_{A}}(y)\bigcap D_{p}^{j} \right| }{\left| {{[x]}_{A}}(y) \right| }, \end{aligned}$$
(8)

where \(D_{p}^{j}\) represents a sample set which is p in the column of the label \({{d}_{j}}\), \(j=1,2,\ldots ,t\), \(p=0,1\).

$$\begin{aligned} {\tilde{D}}_{p}^{j}=\{{\tilde{D}}_{p}^{j}({{x}_{1}}),{\tilde{D}}_{p}^{j}({{x}_{2}}),\ldots ,{\tilde{D}}_{p}^{j}({{x}_{n}})\},\end{aligned}$$
(9)

where \({\tilde{D}}_{\text {p}}^{j}({x}_{i})\) is the fuzzy membership degree of \({x}_{i}\) with respect to \({D}_{\text {p}}^{j}\); \({\tilde{D}}_{\text {p}}^{j}\) is the fuzzy set of the equivalence decision class of the samples.

$$\begin{aligned}&{{{\tilde{D}}}_{p}}({{x}_{i}})=\frac{1}{t}\sum \limits _{j=1}^{t}{{\tilde{D}}_{p}^{j}}({{x}_{i}}), \end{aligned}$$
(10)
$$\begin{aligned}&{{{\tilde{D}}}_{p}}=\{{{{\tilde{D}}}_{p}}({{x}_{1}}),{{{\tilde{D}}}_{p}}({{x}_{2}}),\cdots ,{{{\tilde{D}}}_{p}}({{x}_{n}})\},\end{aligned}$$
(11)

where \({{{\tilde{D}}}_{p}}({{x}_{i}})\) is the fuzzy membership degree of the sample \({x}_{i}\) with respect to the decision value p.

$$\begin{aligned} {\tilde{D}}=\{{{{\tilde{D}}}_{0}}^{T},{{{\tilde{D}}}_{1}}^{T}\}, \end{aligned}$$
(12)

where \(\{{{{\tilde{D}}}_{0}},{{{\tilde{D}}}_{1}}\}\) is the fuzzy decision of the samples induced by D.

Definition 3

[49] Let \({F}'\) and \({R}'\) be two fuzzy sets; the inclusion degree between \({F}'\) and \({R}'\) can be defined as

$$\begin{aligned} P({F}',{R}')=\frac{\left| {F}'\bigcap {R}' \right| }{\left| U \right| },\end{aligned}$$
(13)

where \(P({F}',{R}')\) represents the inclusion degree of fuzzy set \({F}'\) in fuzzy set \({R}'\), \(\left| {F}'\bigcap {R}' \right| \) represents the number of samples whose membership degree of fuzzy set \({F}'\) is not greater than that of fuzzy set \({R}'\).

Example 1

Given a set \(U=\{{{x}_{1}},{{x}_{2}},\ldots ,{{x}_{6}}\}\), \({F}'\) and \({R}'\) are two fuzzy sets defined on U, which represent the membership degrees of the samples, respectively, as follows:

$$\begin{aligned}{F}'=\left\{ \frac{0.7}{{{x}_{1}}},\frac{0.9}{{{x}_{2}}},\frac{0.4}{{{x}_{3}}},\frac{0.3}{{{x}_{4}}},\frac{0.6}{{{x}_{5}}},\frac{0.5}{{{x}_{6}}}\right\} ,\\{R}'=\left\{ \frac{0.5}{{{x}_{1}}},\frac{0.9}{{{x}_{2}}},\frac{0.7}{{{x}_{3}}},\frac{0.6}{{{x}_{4}}},\frac{0.3}{{{x}_{5}}},\frac{0.4}{{{x}_{6}}}\right\} .\end{aligned}$$

So, we can get

$$\begin{aligned}&\left| {F}'\bigcap {R}' \right| =\left| {{x}_{2}},{{x}_{3}},{{x}_{4}} \right| =3,\\&\left| {R}'\bigcap {F}' \right| =\left| {{x}_{1}},{{x}_{2}},{{x}_{5}},{{x}_{6}} \right| =4.\end{aligned}$$
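
The counting in Definition 3 can be checked directly with a few lines of NumPy; the sketch below reproduces the two cardinalities of Example 1 (the function names are illustrative).

```python
import numpy as np

# Membership degrees of x_1, ..., x_6 from Example 1.
F = np.array([0.7, 0.9, 0.4, 0.3, 0.6, 0.5])
R = np.array([0.5, 0.9, 0.7, 0.6, 0.3, 0.4])

def fuzzy_cap_cardinality(F, R):
    """|F' ∩ R'|: number of samples whose membership in F' is not greater than in R'."""
    return int(np.sum(F <= R))

def inclusion_degree(F, R):
    """Inclusion degree P(F', R') = |F' ∩ R'| / |U| of Definition 3."""
    return fuzzy_cap_cardinality(F, R) / len(F)

print(fuzzy_cap_cardinality(F, R), fuzzy_cap_cardinality(R, F))  # 3 and 4, as in Example 1
print(inclusion_degree(F, R))                                    # 3/6 = 0.5
```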

Definition 4

Given MFNDS \(=<U,A\bigcup D,\delta>\) with \(\forall B\subseteq A\), \(D=\{{{d}_{1}},{{d}_{2}},\ldots ,{{d}_{t}}\}\) represents a set of labels; \(\delta \) is called the fuzzy neighborhood radius and satisfies \(0\le \delta \le 1\). For \(\forall x,y\in U\), the parameterized fuzzy neighborhood information granule is constructed as follows:

$$\begin{aligned} {{\delta }_{B}}(x)=\left[ x \right] _{B}^{\delta }(y)=\left\{ \begin{matrix} {{R}_{B}}(x,y), {{R}_{B}}(x,y)\ge 1-\delta \\ 0,\quad {{R}_{B}}(x,y)< 1-\delta , \\ \end{matrix} \right. ,\end{aligned}$$
(14)

where \({{R}_{B}}\) is the fuzzy similarity relation induced by B on U, when \({{B}_{1}}\subseteq {{B}_{2}}\), \({{R}_{{{B}_{2}}}}\subseteq {{R}_{{{B}_{1}}}}\); when \({{\delta }_{1}}\le {{\delta }_{2}}\), for \(\forall x\in U\), \({\left[ x \right] _{B}^{{\delta }_{1}}}\subseteq {\left[ x \right] _{B}^{{\delta }_{2}}}\).

Definition 5

Given \(MFNDS=<U,A\bigcup D,\delta>\) with \(\forall B\subseteq A\), \(\delta \) is the fuzzy neighborhood radius and \(\left\{ {{{{\tilde{D}}}}_{0}},{{{{\tilde{D}}}}_{1}} \right\} \) is the fuzzy decision of samples induced by D. The upper and lower approximations of the fuzzy neighborhood of D with respect to B are defined, respectively, as

$$\begin{aligned} \overline{R_{B}^{\delta }}(D)=\left\{ \overline{R_{B}^{\delta }}({{{\tilde{D}}}_{1}}),\overline{R_{B}^{\delta }}({{{\tilde{D}}}_{2}}),\ldots \overline{R_{B}^{\delta }}({{{\tilde{D}}}_{p}})\right\} ,\end{aligned}$$
(15)
$$\begin{aligned} \underline{R_{B}^{\delta }}(D)=\left\{ \underline{R_{B}^{\delta }}({{{\tilde{D}}}_{1}}),\underline{R_{B}^{\delta }}({{{\tilde{D}}}_{2}}),\ldots \underline{R_{B}^{\delta }}({{{\tilde{D}}}_{p}})\right\} ,\end{aligned}$$
(16)

where

$$\begin{aligned}&\overline{R_{B}^{\delta }}({{{\tilde{D}}}_{p}})=\left\{ x\in U\left| P({{\delta }_{B}}(x),{{{{\tilde{D}}}}_{p}}) \right. >\beta \right\} ,0\le \beta <0.5,\end{aligned}$$
(17)
$$\begin{aligned}&\underline{R_{B}^{\delta }}({{{\tilde{D}}}_{p}})=\left\{ x\in U\left| P({{\delta }_{B}}(x),{{{{\tilde{D}}}}_{p}}) \right. \ge \alpha \right\} ,0.5\le \alpha \le 1.\end{aligned}$$
(18)

Definition 6

Given MFNDS \(=<U,A\bigcup D,\delta>\) with \(\forall B\subseteq A\), \(\delta \) is called the fuzzy neighborhood radius; \(\left\{ {{{{\tilde{D}}}}_{0}},{{{{\tilde{D}}}}_{1}} \right\} \) is the fuzzy decision of samples induced by D; \({{R}_{B}}\) is the fuzzy similarity relation induced by B on U. The fuzzy neighborhood approximation accuracy is defined as

$$\begin{aligned} \alpha _{B}^{\delta }(D)=\frac{\sum \nolimits _{p=1}^{r}{\left| \underline{R_{B}^{\delta }}({{{{\tilde{D}}}}_{p}}) \right| }}{\sum \nolimits _{p=1}^{r}{\left| \overline{R_{B}^{\delta }}({{{{\tilde{D}}}}_{p}}) \right| }},\end{aligned}$$
(19)

where \(\left| \cdot \right| \) represents the cardinality of the set. Since \(\left| \underline{R_{B}^{\delta }}({{{{\tilde{D}}}}_{p}}) \right| \le \left| \overline{R_{B}^{\delta }}({{{{\tilde{D}}}}_{p}}) \right| \), we have \(0\le \alpha _{B}^{\delta }(D)\le 1\).
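
A minimal sketch of Eqs. (17)–(19) is given below, assuming that each granule and each fuzzy decision class is stored as a vector of membership degrees over U and that the inclusion degree follows the counting rule of Definition 3; the thresholds and all names are illustrative assumptions.

```python
import numpy as np

def inclusion_degree(granule, decision):
    """P(delta_B(x), D~_p) of Definition 3: fraction of samples whose membership in
    the granule is not greater than their membership in the fuzzy decision."""
    return np.sum(granule <= decision) / granule.size

def approximation_accuracy(granules, fuzzy_decisions, alpha=0.6, beta=0.4):
    """Fuzzy neighborhood approximation accuracy of Definition 6 (Eq. (19)),
    built from the upper and lower approximations of Eqs. (17) and (18)."""
    lower = upper = 0
    for D_p in fuzzy_decisions:                 # the fuzzy decision classes D~_0, D~_1
        for g in granules:                      # one granule delta_B(x) per sample x
            p = inclusion_degree(g, D_p)
            lower += p >= alpha                 # x is in the lower approximation, Eq. (18)
            upper += p > beta                   # x is in the upper approximation, Eq. (17)
    return lower / upper if upper else 0.0
```

The default thresholds are placeholders chosen within the ranges allowed by Definition 5, i.e., \(0.5\le \alpha \le 1\) and \(0\le \beta <0.5\).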

Property 1

Given MFNDS \(=<U,A\bigcup D,\delta>\) with \(\forall B\subseteq A\), \({{\delta }_{1}}\) and \({{\delta }_{2}}\) are two fuzzy neighborhood radii, if \({{\delta }_{1}}\le {{\delta }_{2}}\), then \(\alpha _{B}^{{{\delta }_{2}}}(D)\le \alpha _{B}^{{{\delta }_{1}}}(D)\).

Proof

For \(\forall x\in U\), according to Definition 4, the fuzzy neighborhood information granules satisfy \({\left[ x \right] _{B}^{{\delta }_{1}}}\subseteq {\left[ x \right] _{B}^{{\delta }_{2}}}\); then \(\underline{R_{B}^{{{\delta }_{2}}}}({{{\tilde{D}}}_{p}})\subseteq \underline{R_{B}^{{{\delta }_{1}}}}({{{\tilde{D}}}_{p}})\) and \(\overline{R_{B}^{{{\delta }_{1}}}}({{{\tilde{D}}}_{p}})\subseteq \overline{R_{B}^{{{\delta }_{2}}}}({{{\tilde{D}}}_{p}})\), so \(\alpha _{B}^{{{\delta }_{2}}}(D)\le \alpha _{B}^{{{\delta }_{1}}}(D)\). \(\square \)

Property 2

Given MFNDS \(=<U,A\bigcup D,\delta>\) with \(\forall B\subseteq A\), \(\delta \) is a fuzzy neighborhood radius, if \({{B}_{1}}\subseteq {{B}_{2}}\), we can get the property: \(\alpha _{{{B}_{1}}}^{\delta }(D)\le \alpha _{{{B}_{2}}}^{\delta }(D)\).

Proof

Since \({{B}_{1}}\subseteq {{B}_{2}}\), the fuzzy neighborhood granules satisfy \(\left[ x \right] _{{{B}_{2}}}^{\delta }\subseteq \left[ x \right] _{{{B}_{1}}}^{\delta }\); then, according to Definitions 5 and 6, we have \(\underline{R_{{{B}_{1}}}^{\delta }}({{{\tilde{D}}}_{p}})\subseteq \underline{R_{{{B}_{2}}}^{\delta }}({{{\tilde{D}}}_{p}})\) and \(\overline{R_{{{B}_{2}}}^{\delta }}({{{\tilde{D}}}_{p}})\subseteq \overline{R_{{{B}_{1}}}^{\delta }}({{{\tilde{D}}}_{p}})\). Therefore, \(\alpha _{{{B}_{1}}}^{\delta }(D)\le \alpha _{{{B}_{2}}}^{\delta }(D)\) holds. \(\square \)

Example 2

Given a multi-label decision table MDT \(=<U,A\bigcup D>\) displayed in Table 1, where \(U=\{{{x}_{1}},{{x}_{2}},{{x}_{3}},{{x}_{4}},{{x}_{5}},{{x}_{6}}\}\) represents the sample set, \(A=\{{{a}_{1}},{{a}_{2}},{{a}_{3}}\}\) is the feature set, \(D=\{{{d}_{1}},{{d}_{2}},{{d}_{3}}\}\) is the label set, and \({{R}_{A}}\) is the fuzzy similarity relation induced by A; let the value of the fuzzy neighborhood radius be 0.

Table 1 A multi-label decision table

The data in Table 1 are normalized according to [24] so that the values lie within the range [0, 1]. The fuzzy similarity relation \({{R}_{{{a}_{k}}}}\) between the samples \({{x}_{i}}\) and \({{x}_{j}}\) with respect to the attribute \({{a}_{k}}\) is calculated by

$$\begin{aligned} {{R}_{{{a}_{k}}}}=1-\left| {{x}_{ik}}-{{x}_{jk}} \right| ,\end{aligned}$$
(20)

where \({{a}_{k}}\in A\), \(k=1,2,3\), \({{x}_{i}},{{x}_{j}}\in U\), \(i=1,2,3,4,5,6\), \(j=1,2,3,4,5,6\). So, we can obtain the fuzzy similarity matrix \({{\left[ x \right] }_{{{a}_{k}}}}\left( y \right) \) about the attribute \({{a}_{k}}\), and \({{\left[ x \right] }_{A}}(y)={\min }_{{{a}_{k}}\in A}\,\left( {{\left[ x \right] }_{{{a}_{k}}}}(y) \right) \). Because the fuzzy similarity relation \({{R}_{{{a}_{k}}}}\) satisfies the reflexivity, \({{R}_{{{a}_{k}}}}=1\) when \(i=j\), then we can get

$$\begin{aligned}&{{\left[ x \right] }_{A}}(y)\\&\quad =\left[ \begin{matrix} 1 \quad &{}\quad 0.2857 \quad &{}\quad 0\quad &{}\quad 0.7\quad &{}\quad 0.2857 \quad &{}\quad 0.2857 \\ 0.2857\quad &{}\quad 1\quad &{}\quad 0.3429\quad &{} \quad 0\quad &{}\quad 0 \quad &{}\quad 0.1 \\ 0\quad &{} \quad 0.3429 \quad &{} \quad 1\quad &{}\quad 0 \quad &{}\quad 0.6571 \quad &{}\quad 0.1 \\ 0.7 \quad &{}\quad 0\quad &{}\quad 0 \quad &{}\quad 1\quad &{}\quad 0\quad &{}\quad 0.2857 \\ 0.2857 \quad &{}\quad 0\quad &{}\quad 0.6571\quad &{} \quad 0 &{}\quad 1 \quad &{}\quad 0.1 \\ 0.2857 \quad &{} \quad 0.1 \quad &{}\quad 0.1 \quad &{}\quad 0.2857 \quad &{}\quad 0.1\quad &{}\quad 1 \\ \end{matrix}\right] .\end{aligned}$$

The decision classes under the labels \({{d}_{1}},{{d}_{2}},{{d}_{3}}\) are as follows:

$$\begin{aligned} \begin{aligned} D_{1}^{1}&=\{{{x}_{2}},{{x}_{4}},{{x}_{5}}\},&D_{0}^{1}&=\{{{x}_{1}},{{x}_{3}},{{x}_{6}}\};\\ D_{1}^{2}&=\{{{x}_{1}},{{x}_{4}},{{x}_{5}},{{x}_{6}}\},&D_{0}^{2}&=\{{{x}_{2}},{{x}_{3}}\};\\ D_{1}^{3}&=\{{{x}_{2}},{{x}_{3}},{{x}_{6}}\},&D_{0}^{3}&=\{{{x}_{1}},{{x}_{4}},{{x}_{5}}\};\\ \end{aligned} \end{aligned}$$

where \(D_{p}^{j}\) represents the set of samples whose value is p under the label \({{d}_{j}}\), with \(j=1,2,3\) and \(p=0,1\). According to Definition 2, we can obtain

$$\begin{aligned} \begin{aligned} {\tilde{D}}_{0}^{1}&=\{{\tilde{D}}_{0}^{1}({{x}_{1}}),{\tilde{D}}_{0}^{1}({{x}_{2}}),{\tilde{D}}_{0}^{1}({{x}_{3}}),{\tilde{D}}_{0}^{1}({{x}_{4}}),{\tilde{D}}_{0}^{1}({{x}_{5}}),{\tilde{D}}_{0}^{1}({{x}_{6}})\}\\&=\{0.5028,0.4215,0.5238,0.4964,0.5105,0.7405\},\\ {\tilde{D}}_{1}^{1}&=\{{\tilde{D}}_{1}^{1}({{x}_{1}}),{\tilde{D}}_{1}^{1}({{x}_{2}}),{\tilde{D}}_{1}^{1}({{x}_{3}}),{\tilde{D}}_{1}^{1}({{x}_{4}}),{\tilde{D}}_{1}^{1}({{x}_{5}}),{\tilde{D}}_{1}^{1}({{x}_{6}})\}\\&=\{0.4972,0.5785,0.4762,0.5036,0.4895,0.2595\};\\ {\tilde{D}}_{0}^{2}&=\{{\tilde{D}}_{0}^{2}({{x}_{1}}),{\tilde{D}}_{0}^{2}({{x}_{2}}),{\tilde{D}}_{0}^{2}({{x}_{3}}),{\tilde{D}}_{0}^{2}({{x}_{4}}),{\tilde{D}}_{0}^{2}({{x}_{5}}),{\tilde{D}}_{0}^{2}({{x}_{6}})\}\\&=\{0.1117,0.7769,0.6395,0,0.3217,0.1069\},\\ {\tilde{D}}_{1}^{2}&=\{{\tilde{D}}_{1}^{2}({{x}_{1}}),{\tilde{D}}_{1}^{2}({{x}_{2}}),{\tilde{D}}_{1}^{2}({{x}_{3}}),{\tilde{D}}_{1}^{2}({{x}_{4}}),{\tilde{D}}_{1}^{2}({{x}_{5}}),{\tilde{D}}_{1}^{2}({{x}_{6}})\}\\&=\{0.8883,0.2231,0.3605,1,0.6783,0.8931\};\\ {\tilde{D}}_{0}^{3}&=\{{\tilde{D}}_{0}^{3}({{x}_{1}}),{\tilde{D}}_{0}^{3}({{x}_{2}}),{\tilde{D}}_{0}^{3}({{x}_{3}}),{\tilde{D}}_{0}^{3}({{x}_{4}}),{\tilde{D}}_{0}^{3}({{x}_{5}}),{\tilde{D}}_{0}^{3}({{x}_{6}})\}\\&=\{0.7765,0.1653,0.3129,0.8561,0.6294,0.3588\},\\ {\tilde{D}}_{1}^{3}&=\{{\tilde{D}}_{1}^{3}({{x}_{1}}),{\tilde{D}}_{1}^{3}({{x}_{2}}),{\tilde{D}}_{1}^{3}({{x}_{3}}),{\tilde{D}}_{1}^{3}({{x}_{4}}),{\tilde{D}}_{1}^{3}({{x}_{5}}),{\tilde{D}}_{1}^{3}({{x}_{6}})\}\\&=\{0.2235,0.8347,0.6871,0.1439,0.3706,0.6412\}.\\ \end{aligned} \end{aligned}$$

Then we can get

$$\begin{aligned} \begin{aligned} {{{{\tilde{D}}}}_{0}}&=\frac{1}{3}\sum \limits _{j=1}^{3}{{\tilde{D}}_{0}^{j}}\\&=\{0.4637,0.4546,0.4921,0.4508,0.4872,0.4021\};\\ {{{{\tilde{D}}}}_{1}}&=\frac{1}{3}\sum \limits _{j=1}^{3}{{\tilde{D}}_{1}^{j}}\\&=\{ 0.5363,0.5454,0.5079,0.5492,0.5128,0.5979\}. \\ \end{aligned} \end{aligned}$$

From the above, we can derive that \({{{\tilde{D}}}_{0}}(x)+{{{\tilde{D}}}_{1}}(x)=1\), so the eventual fuzzy decision of entire label space is

$$\begin{aligned}{\tilde{D}}=\{{\tilde{D}}_{0}^{T},{\tilde{D}}_{1}^{T}\}=\left[ \begin{matrix} 0.4637 \quad &{}\quad 0.5363 \\ 0.4546 \quad &{}\quad 0.5454 \\ 0.4921 \quad &{}\quad 0.5079 \\ 0.4508 \quad &{}\quad 0.5492 \\ 0.4872 \quad &{}\quad 0.5128 \\ 0.4021 \quad &{}\quad 0.5979 \\ \end{matrix} \right] .\end{aligned}$$
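
The fuzzy decision of Example 2 can be reproduced from the fuzzy similarity matrix \({{\left[ x \right] }_{A}}(y)\) and the decision classes listed above. In the sketch below the fuzzy cardinalities of Eq. (8) are read as sums of membership degrees, which matches the numbers reported in this example; the function name and the array layout are illustrative assumptions.

```python
import numpy as np

# Fuzzy similarity matrix [x]_A(y) of Example 2.
RA = np.array([
    [1,      0.2857, 0,      0.7,    0.2857, 0.2857],
    [0.2857, 1,      0.3429, 0,      0,      0.1   ],
    [0,      0.3429, 1,      0,      0.6571, 0.1   ],
    [0.7,    0,      0,      1,      0,      0.2857],
    [0.2857, 0,      0.6571, 0,      1,      0.1   ],
    [0.2857, 0.1,    0.1,    0.2857, 0.1,    1     ]])

# Binary label matrix: rows are x_1..x_6, columns are d_1, d_2, d_3 (from the D_p^j above).
Y = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 0, 1],
              [1, 1, 0],
              [1, 1, 0],
              [0, 1, 1]])

def fuzzy_decision(RA, Y):
    """Fuzzy decision {D~_0, D~_1} of Definition 2 (Eqs. (8)-(12)); the fuzzy
    cardinalities are read as sums of membership degrees."""
    n, t = Y.shape
    D = np.zeros((n, 2))
    for p in (0, 1):
        for j in range(t):
            mask = (Y[:, j] == p).astype(float)          # crisp decision class D_p^j
            D[:, p] += (RA @ mask) / RA.sum(axis=1)      # D~_p^j(x) of Eq. (8), all samples at once
        D[:, p] /= t                                     # average over the labels, Eq. (10)
    return D

print(np.round(fuzzy_decision(RA, Y), 4))                # reproduces the matrix D~ above
```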

Multi-label fuzzy neighborhood conditional entropy

Definition 7

Suppose there exists MFNDS \(=<U,A\bigcup D,\delta>\) with \(\forall B\subseteq A\), \(\delta \) is the neighborhood radius, then fuzzy neighborhood entropy of B is defined as

$$\begin{aligned} {{E}_{fn}}(B)=-\frac{1}{\left| U \right| }\sum \limits _{i=1}^{\left| U \right| }{{{\log }_{2}}\frac{\left| {{\delta }_{B}}({{x}_{i}}) \right| }{\left| U \right| }}, \end{aligned}$$
(21)

where \(\left| {{\delta }_{B}}({{x}_{i}}) \right| \) represents the number of nonzero values in the fuzzy neighborhood granule of the object \({{x}_{i}}\), and \(\frac{\left| {{\delta }_{B}}({{x}_{i}}) \right| }{\left| U \right| }\) represents the proportion of \({{\delta }_{B}}({{x}_{i}})\) in U.

Definition 8

Suppose there exists MFNDS \(=<U,A\bigcup D,\delta>\) with \(\forall {{B}_{1}},{{B}_{2}}\subseteq A\), \({{\delta }_{{{B}_{1}}}}(x)\) and \({{\delta }_{{{B}_{2}}}}(x)\) are fuzzy neighborhood granules, then the fuzzy neighborhood joint entropy of \({{B}_{1}}\) and \({{B}_{2}}\) is defined as

$$\begin{aligned} {{E}_{fn}}({{B}_{1}},{{B}_{2}})=-\frac{1}{\left| U \right| }\sum \limits _{i=1}^{\left| U \right| }{{{\log }_{2}}\frac{\left| {{\delta }_{{{B}_{1}}\bigcup {{B}_{2}}}}({{x}_{i}}) \right| }{\left| U \right| }}. \end{aligned}$$
(22)

Definition 9

Suppose there exists MFNDS \(=<U,A\bigcup D,\delta>\) with \(\forall {{B}_{1}},{{B}_{2}}\subseteq A\), \({{\delta }_{{{B}_{1}}}}(x)\) and \({{\delta }_{{{B}_{2}}}}(x)\) are fuzzy neighborhood granules, then the fuzzy neighborhood conditional entropy of \({{B}_{1}}\) and \({{B}_{2}}\) is defined as

$$\begin{aligned} {{E}_{fn}}({{B}_{1}}\left| {{B}_{2}} \right. )=-\frac{1}{\left| U \right| }\sum \limits _{i=1}^{\left| U \right| }{{{\log }_{2}}\frac{\left| {{\delta }_{{{B}_{1}}\bigcup {{B}_{2}}}}({{x}_{i}}) \right| }{\left| {{\delta }_{{{B}_{2}}}}({{x}_{i}}) \right| }}.\end{aligned}$$
(23)
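
A compact sketch of Definitions 7–9 is given below, assuming that the granules of a feature subset are stored as the rows of an \(n\times n\) matrix and that the granule of \({{B}_{1}}\bigcup {{B}_{2}}\) is obtained as the elementwise minimum of the two granule matrices (since \({{R}_{{{B}_{1}}\bigcup {{B}_{2}}}}={{R}_{{{B}_{1}}}}\bigcap {{R}_{{{B}_{2}}}}\)); the function names are illustrative.

```python
import numpy as np

def nonzero_card(granules):
    """|delta_B(x_i)|: number of nonzero values in each fuzzy neighborhood granule."""
    return np.count_nonzero(granules, axis=1)

def fn_entropy(G_B):
    """Fuzzy neighborhood entropy of Definition 7 (Eq. (21)); G_B holds the granules
    of the subset B as the rows of an n x n matrix."""
    n = G_B.shape[0]
    return -np.mean(np.log2(nonzero_card(G_B) / n))

def fn_joint_entropy(G_B1, G_B2):
    """Fuzzy neighborhood joint entropy of Definition 8 (Eq. (22)); the granule of
    B1 ∪ B2 is taken as the elementwise minimum of the two granule matrices."""
    n = G_B1.shape[0]
    return -np.mean(np.log2(nonzero_card(np.minimum(G_B1, G_B2)) / n))

def fn_conditional_entropy(G_B1, G_B2):
    """Fuzzy neighborhood conditional entropy of Definition 9 (Eq. (23))."""
    joint = nonzero_card(np.minimum(G_B1, G_B2))
    return -np.mean(np.log2(joint / nonzero_card(G_B2)))

# Property 3 can be checked numerically: E(B1|B2) = E(B1, B2) - E(B2).
G1 = np.array([[1.0, 0.9, 0.0], [0.9, 1.0, 0.0], [0.0, 0.0, 1.0]])
G2 = np.array([[1.0, 0.0, 0.9], [0.0, 1.0, 0.0], [0.9, 0.0, 1.0]])
print(np.isclose(fn_conditional_entropy(G1, G2), fn_joint_entropy(G1, G2) - fn_entropy(G2)))
```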

Property 3

Suppose there exists MFNDS \(=<U,A\bigcup D,\delta>\) with \(\forall {{B}_{1}},{{B}_{2}}\subseteq A\), where \({{\delta }_{{{B}_{1}}}}(x)\) and \({{\delta }_{{{B}_{2}}}}(x)\) are fuzzy neighborhood granules. The following property holds:

$$\begin{aligned}{{E}_{fn}}({{B}_{1}}\left| {{B}_{2}} \right. )={{E}_{fn}}({{B}_{1}},{{B}_{2}})-{{E}_{fn}}({{B}_{2}}).\end{aligned}$$

Proof

According to Definitions 7 and 8, it can be proved that

$$\begin{aligned} \begin{aligned}&{{E}_{fn}}({{B}_{1}},{{B}_{2}})-{{E}_{fn}}({{B}_{2}}) \\&\quad =-\frac{1}{\left| U \right| }\sum \limits _{i=1}^{\left| U \right| }{{{\log }_{2}}\frac{\left| {{\delta }_{{{B}_{1}}\bigcup {{B}_{2}}}}({{x}_{i}}) \right| }{\left| U \right| }}+\frac{1}{\left| U \right| }\sum \limits _{i=1}^{\left| U \right| }{{{\log }_{2}}\frac{\left| {{\delta }_{{{B}_{2}}}}({{x}_{i}}) \right| }{\left| U \right| }} \\&\quad =-\frac{1}{\left| U \right| }\sum \limits _{i=1}^{\left| U \right| }{{{\log }_{2}}}\left( \frac{\left| {{\delta }_{{{B}_{1}}\bigcup {{B}_{2}}}}({{x}_{i}}) \right| }{\left| U \right| }\cdot \frac{\left| U \right| }{\left| {{\delta }_{{{B}_{2}}}}({{x}_{i}}) \right| } \right) \\&\quad =-\frac{1}{\left| U \right| }\sum \limits _{i=1}^{\left| U \right| }{{{\log }_{2}}\frac{\left| {{\delta }_{{{B}_{1}}\bigcup {{B}_{2}}}}({{x}_{i}}) \right| }{\left| {{\delta }_{{{B}_{2}}}}({{x}_{i}}) \right| }}. \\ \end{aligned} \end{aligned}$$

Then, from Definition 9, it follows that \({{E}_{fn}}({{B}_{1}}\left| {{B}_{2}} \right. )={{E}_{fn}}({{B}_{1}},{{B}_{2}})-{{E}_{fn}}({{B}_{2}})\). \(\square \)

Definition 10

Suppose there exists MFNDS \(=<U,A\bigcup D,\delta>\) with \(\forall B\subseteq A\), \({{\delta }_{B}}(x)\) is the fuzzy neighborhood granule, \({\tilde{D}}=\{{{{\tilde{D}}}_{0}},{{{\tilde{D}}}_{1}}\}\) is a fuzzy decision, then the conditional entropy of decision attribute set D on feature subset B is defined as

$$\begin{aligned}&{{E}_{fn}}(D\left| B \right. )\nonumber \\&\quad =-\frac{1}{\left| U \right| }\sum \limits _{i=1}^{\left| U \right| }{\sum \limits _{p=0}^{1}{{{\log }_{2}}\frac{\left| {{\delta }_{B\bigcup {{{{\tilde{D}}}}_{p}}}}({{x}_{i}}) \right| }{\left| {{\delta }_{B}}({{x}_{i}}) \right| }}}\nonumber \\&\quad =-\frac{1}{\left| U \right| }\sum \limits _{i=1}^{\left| U \right| }{\sum \limits _{p=0}^{1}{{{\log }_{2}}\frac{\left| {{\delta }_{B}}({{x}_{i}})\bigcap {{{{\tilde{D}}}}_{p}}({{x}_{i}}) \right| }{\left| {{\delta }_{B}}({{x}_{i}}) \right| }}}, \end{aligned}$$
(24)

where \(\left| {{\delta }_{B}}({{x}_{i}}) \right| \) represents the number of nonzero values in the fuzzy neighborhood granule of the object \({{x}_{i}}\), and \(\left| {{\delta }_{B}}({{x}_{i}})\bigcap {{{{\tilde{D}}}}_{p}} \right| \) represents the number of nonzero values of samples whose membership degree in \({{\delta }_{B}}({{x}_{i}})\) is not greater than that in \({{{\tilde{D}}}_{p}}\).

Feature selection from only the algebraic viewpoint or only the information viewpoint is limited: from the algebraic viewpoint, a feature subset selected under the information-theoretic definitions may still contain redundant features; from the information-theoretic viewpoint, for a feature subset selected under the algebraic definitions, the conditional entropy may still change. Therefore, we combine the approximation accuracy from the algebraic viewpoint with the conditional entropy from information theory to calculate the importance degree of candidate features.

Definition 11

Given MFNDS \(=<U,A\bigcup D,\delta>\) with \(\forall B\subseteq A\), \({{\delta }_{B}}(x)\) is the fuzzy neighborhood granule, \({\tilde{D}}=\{{{{\tilde{D}}}_{0}},{{{\tilde{D}}}_{1}}\}\) is a fuzzy decision, then the mixed measure based on the approximate accuracy of the fuzzy neighborhood and the conditional entropy of the fuzzy neighborhood is defined as

$$\begin{aligned}&E{{M}_{fn}}(D\left| B \right. )\nonumber \\&\quad =\alpha _{B}^{\delta }(D)\cdot {{E}_{fn}}(D\left| B \right. )\nonumber \\&\quad =-\frac{\alpha _{B}^{\delta }(D)}{\left| U \right| }\sum \limits _{\text {i}=1}^{\left| U \right| }{\sum \limits _{p=0}^{1}{{{\log }_{2}}\frac{\left| {{\delta }_{B}}({{x}_{i}})\bigcap {{{{\tilde{D}}}}_{p}}({{x}_{i}}) \right| }{\left| {{\delta }_{B}}({{x}_{i}}) \right| }}}. \end{aligned}$$
(25)
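
The sketch below combines Definitions 10 and 11, reading \(\left| {{\delta }_{B}}({{x}_{i}})\bigcap {{{{\tilde{D}}}}_{p}} \right| \) with the counting rule described after Eq. (24); the guard against an empty intersection, the array layout and the function names are assumptions made for illustration.

```python
import numpy as np

def fn_conditional_entropy_D(granules_B, D_fuzzy):
    """E_fn(D|B) of Definition 10 (Eq. (24)). granules_B is an n x n matrix whose
    i-th row is delta_B(x_i); D_fuzzy is an n x 2 matrix whose columns are D~_0 and
    D~_1. |delta_B(x_i) ∩ D~_p| is read, following Definition 3, as the number of
    nonzero memberships of the granule that do not exceed the value of D~_p."""
    n = granules_B.shape[0]
    total = 0.0
    for i in range(n):
        g = granules_B[i]
        card = np.count_nonzero(g)
        for p in range(D_fuzzy.shape[1]):
            cap = np.count_nonzero((g > 0) & (g <= D_fuzzy[:, p]))
            total -= np.log2(max(cap, 1) / card)    # guard: treat an empty intersection as 1
    return total / n

def mixed_measure(approx_accuracy, granules_B, D_fuzzy):
    """EM_fn(D|B) = alpha_B^delta(D) * E_fn(D|B) of Definition 11 (Eq. (25))."""
    return approx_accuracy * fn_conditional_entropy_D(granules_B, D_fuzzy)

# Tiny illustrative call: 3 samples, granules over U and a fuzzy decision matrix.
G = np.array([[1.0, 0.8, 0.0], [0.8, 1.0, 0.0], [0.0, 0.0, 1.0]])
D = np.array([[0.4, 0.6], [0.5, 0.5], [0.3, 0.7]])
print(mixed_measure(0.75, G, D))
```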

Property 4

Given MFNDS \(=<U,A\bigcup D,\delta>\) with \(\forall B\subseteq A\), \({{\delta }_{B}}(x)\) is the fuzzy neighborhood granule, \({\tilde{D}}=\{{{{\tilde{D}}}_{0}}, {{{\tilde{D}}}_{1}}\}\) is a fuzzy decision, then \(E{{M}_{fn}}(D\left| B \right. )\ge 0\).

Proof

Assume \(E{{M}_{fn}}(D\left| B \right. )<0\); then there exists a term with \({{\log }_{2}}\frac{\left| {{\delta }_{B}}({{x}_{i}})\bigcap {{{{\tilde{D}}}}_{p}}({{x}_{i}}) \right| }{\left| {{\delta }_{B}}({{x}_{i}}) \right| }>0\), i.e., \(\frac{\left| {{\delta }_{B}}({{x}_{i}})\bigcap {{{{\tilde{D}}}}_{p}}({{x}_{i}}) \right| }{\left| {{\delta }_{B}}({{x}_{i}}) \right| }>1\); therefore, \(\left| {{\delta }_{B}}({{x}_{i}}) \right| <\left| {{\delta }_{B}}({{x}_{i}}) \right. \bigcap \left. {{{{\tilde{D}}}}_{p}}({{x}_{i}}) \right| \), which obviously cannot hold. So, \(\frac{\left| {{\delta }_{B}}({{x}_{i}})\bigcap {{{{\tilde{D}}}}_{p}}({{x}_{i}}) \right| }{\left| {{\delta }_{B}}({{x}_{i}}) \right| }\le 1\), that is, \({{\log }_{2}}\frac{\left| {{\delta }_{B}}({{x}_{i}})\bigcap {{{{\tilde{D}}}}_{p}}({{x}_{i}}) \right| }{\left| {{\delta }_{B}}({{x}_{i}}) \right| }\le 0\). Therefore, \(E{{M}_{fn}}(\left. D \right| B)\ge 0\). \(\square \)

Property 5

Let MFNDS \(=<U,A\bigcup D,\delta>\) be a multi-label fuzzy neighborhood decision system. For \(\forall {{B}_{1}},{{B}_{2}}\subseteq A\), if \({{B}_{1}} \subseteq {{B}_{2}}\), then by Definition 4, \({{\delta }_{{{B}_{2}}}}(x)\subseteq {{\delta }_{{{B}_{1}}}}(x)\) and \(E{{M}_{fn}}(D\left| {{B}_{1}} \right. )\ge E{{M}_{fn}}(D\left| {{B}_{2}} \right. )\), with equality if and only if \({{\delta }_{{{B}_{1}}}}(x)={{\delta }_{{{B}_{2}}}}(x)\).

Proof

The derivation in Eq. (26) shows that \(E{{M}_{fn}}(D\left| {{B}_{1}} \right. )\ge E{{M}_{fn}}(D\left| {{B}_{2}} \right. )\) holds.

$$\begin{aligned}&E{{M}_{fn}}(D\left| {{B}_{1}} \right. )-E{{M}_{fn}}(D\left| {{B}_{2}} \right. )\nonumber \\&\quad =\left( -\frac{\alpha _{{{B}_{1}}}^{\delta }(D)}{\left| U \right| }\sum \limits _{\text {i}=1}^{\left| U \right| }{\sum \limits _{p=0}^{1}{{{\log }_{2}}\frac{\left| {{\delta }_{{{B}_{1}}}}({{x}_{i}})\bigcap {{{{\tilde{D}}}}_{p}}({{x}_{i}}) \right| }{\left| {{\delta }_{{{B}_{1}}}}({{x}_{i}}) \right| }}} \right) \nonumber \\&\qquad -\left( -\frac{\alpha _{{{B}_{2}}}^{\delta }(D)}{\left| U \right| }\sum \limits _{\text {i}=1}^{\left| U \right| }{\sum \limits _{p=0}^{1}{{{\log }_{2}}\frac{\left| {{\delta }_{{{B}_{2}}}}({{x}_{i}})\bigcap {{{{\tilde{D}}}}_{p}}({{x}_{i}}) \right| }{\left| {{\delta }_{{{B}_{2}}}}({{x}_{i}}) \right| }}} \right) \nonumber \\&\quad =-\frac{1}{\left| U \right| }\sum \limits _{i=1}^{\left| U \right| }\sum \limits _{p=0}^{1}\left( \alpha _{{{B}_{1}}}^{\delta }(D){{\log }_{2}}\left( \frac{\left| {{\delta }_{{{B}_{1}}}}({{x}_{i}})\bigcap {{{{\tilde{D}}}}_{p}}({{x}_{i}}) \right| }{\left| {{\delta }_{{{B}_{1}}}}({{x}_{i}}) \right| } \right) \right. \nonumber \\&\qquad \left. -\alpha _{{{B}_{2}}}^{\delta }(D){{\log }_{2}}\left( \frac{\left| {{\delta }_{{{B}_{2}}}}({{x}_{i}})\bigcap {{{{\tilde{D}}}}_{p}}({{x}_{i}}) \right| }{\left| {{\delta }_{{{B}_{2}}}}({{x}_{i}}) \right| } \right) \right) \nonumber \\&\quad =-\frac{1}{\left| U \right| }\sum \limits _{i=1}^{\left| U \right| }\sum \limits _{p=0}^{1}\left( {{\log }_{2}}{{\left( \frac{\left| {{\delta }_{{{B}_{1}}}}({{x}_{i}})\bigcap {{{{\tilde{D}}}}_{p}}({{x}_{i}}) \right| }{\left| {{\delta }_{{{B}_{1}}}}({{x}_{i}}) \right| } \right) }^{\alpha _{{{B}_{1}}}^{\delta }(D)}}\right. \nonumber \\&\qquad \left. +{{\log }_{2}}{{\left( \frac{\left| {{\delta }_{{{B}_{2}}}}({{x}_{i}}) \right| }{\left| {{\delta }_{{{B}_{2}}}}({{x}_{i}})\bigcap {{{{\tilde{D}}}}_{p}}({{x}_{i}}) \right| } \right) }^{\alpha _{{{B}_{2}}}^{\delta }(D)}} \right) \nonumber \\&\quad =-\frac{1}{\left| U \right| }\sum \limits _{i=1}^{\left| U \right| }\sum \limits _{p=0}^{1}{{\log }_{2}}\nonumber \\&\qquad \times \left( \frac{{{\left| {{\delta }_{{{B}_{1}}}}({{x}_{i}})\bigcap {{{{\tilde{D}}}}_{p}}({{x}_{i}}) \right| }^{\alpha _{{{B}_{1}}}^{\delta }(D)}}}{{{\left| {{\delta }_{{{B}_{1}}}}({{x}_{i}}) \right| }^{\alpha _{{{B}_{1}}}^{\delta }(D)}}}\cdot \frac{{{\left| {{\delta }_{{{B}_{2}}}}({{x}_{i}}) \right| }^{\alpha _{{{B}_{2}}}^{\delta }(D)}}}{{{\left| {{\delta }_{{{B}_{2}}}}({{x}_{i}})\bigcap {{{{\tilde{D}}}}_{p}}({{x}_{i}}) \right| }^{\alpha _{{{B}_{2}}}^{\delta }(D)}}} \right) \nonumber \\&\quad =-\frac{1}{\left| U \right| }\sum \limits _{i=1}^{\left| U \right| }{\sum \limits _{p=0}^{1}{{{\log }_{2}}}}\frac{{{\left| {{\delta }_{{{B}_{1}}}}({{x}_{i}})\bigcap {{{{\tilde{D}}}}_{p}}({{x}_{i}}) \right| }^{\alpha _{{{B}_{1}}}^{\delta }(D)}}\cdot {{\left| {{\delta }_{{{B}_{2}}}}({{x}_{i}}) \right| }^{\alpha _{{{B}_{2}}}^{\delta }(D)}}}{{{\left| {{\delta }_{{{B}_{1}}}}({{x}_{i}}) \right| }^{\alpha _{{{B}_{1}}}^{\delta }(D)}}\cdot {{\left| {{\delta }_{{{B}_{2}}}}({{x}_{i}})\bigcap {{{{\tilde{D}}}}_{p}}({{x}_{i}}) \right| }^{\alpha _{{{B}_{2}}}^{\delta }(D)}}} \nonumber \\&\quad \ge -\frac{1}{\left| U \right| }\sum \limits _{i=1}^{\left| U \right| }{\sum \limits _{p=0}^{1}{{{\log }_{2}}}}\frac{{{\left| {{\delta }_{{{B}_{1}}}}({{x}_{i}})\bigcap {{{{\tilde{D}}}}_{p}}({{x}_{i}}) \right| }^{\alpha _{{{B}_{1}}}^{\delta }(D)}}\cdot {{\left| {{\delta }_{{{B}_{2}}}}({{x}_{i}}) \right| }^{\alpha _{{{B}_{2}}}^{\delta }(D)}}}{{{\left| {{\delta }_{{{B}_{2}}}}({{x}_{i}}) 
\right| }^{\alpha _{{{B}_{2}}}^{\delta }(D)}}\cdot {{\left| {{\delta }_{{{B}_{2}}}}({{x}_{i}})\bigcap {{{{\tilde{D}}}}_{p}}({{x}_{i}}) \right| }^{\alpha _{{{B}_{2}}}^{\delta }(D)}}} \nonumber \\&\quad \ge 0. \end{aligned}$$
(26)

\(\square \)

Property 6

Given MFNDS \(=<U,A\bigcup D,\delta>\) with \(\forall B\subseteq A\), \(\forall a\in B\), if \(E{{M}_{fn}}(D\left| B \right. -\{a\})=E{{M}_{fn}}(D\left| B \right. )\), then the feature a is unnecessary.

Proof

Assume there exists \(a\in B\) that satisfies \(E{{M}_{fn}}(D\left| B \right. -\{a\})=E{{M}_{fn}}(D\left| B \right. )\) while the feature a is necessary. Then \({{\delta }_{B-\{a\}}}(x)\ne {{\delta }_{B}}(x)\) and \(B-\{a\}\subseteq B\); according to Property 5, \(E{{M}_{fn}}(D\left| B \right. -\{a\})>E{{M}_{fn}}(D\left| B \right. )\), which contradicts the assumption. So, for \(\forall a\in B\), if \( E{{M}_{fn}}(D\left| {} \right. B-\{a\})=E{{M}_{fn}}(D\left| B \right. )\), then the feature a is unnecessary. \(\square \)

Definition 12

Given MFNDS \(=<U,A\bigcup D,\delta>\) with \(\forall B\subseteq A\), B is called a reduction of A in the multi-label fuzzy neighborhood decision system, relative to the decision D, when it satisfies:

  1. (1)

    \(E{{M}_{fn}}(D\left| B \right. )=E{{M}_{fn}}(D\left| A \right. )\);

  2. (2)

    \(\forall a\in B,E{{M}_{fn}}(D\left| B \right. -\{a\})>E{{M}_{fn}}(D\left| B \right. )\).

Definition 13

Given MFNDS \(=<U,A\bigcup D,\delta>\) with \(\forall B\subseteq A\), the importance of a feature \(a\in B\) relative to D is expressed as

$$\begin{aligned} SIG(a,B,D)=E{{M}_{fn}}(D\left| B-\{a\} \right. )-E{{M}_{fn}}(D\left| B \right. ).\end{aligned}$$
(27)

To obtain a reduced subset, the two preconditions in Definition 12 must be met. However, there are many redundant and irrelevant features in multi-label datasets, and searching for the minimum reduced subset is an NP-complete problem. Therefore, we set a threshold \(\lambda \) to control subset selection before choosing the final feature subset. If the difference of the mixed measure between the current feature subset and the original feature set is not greater than \(\lambda \), a relatively approximate reduced subset Red is selected, which shall meet the following requirement:

$$\begin{aligned} E{{M}_{fn}}(D\left| Red \right. )-E{{M}_{fn}}(D\left| A \right. )\le \lambda . \end{aligned}$$
(28)

Then the importance of a feature \(R\in A-Red\) relative to D is expressed as

$$\begin{aligned}&SIG\left( R,Red,D \right) \nonumber \\&\quad =E{{M}_{fn}}(D\left| Red \right. )-E{{M}_{fn}}(D\left| Red \right. \bigcup \{R\}). \end{aligned}$$
(29)

Remark 1

Sun et al. [51] considered that the upper and lower approximations of rough sets belong to the viewpoint of algebraic theory, while information entropy and its extensions belong to the viewpoint of information theory. Definition 6 gives the fuzzy neighborhood approximation accuracy \(\alpha _{B}^{\delta }(D)\) from the algebraic point of view, and Definition 10 gives the conditional entropy \({{E}_{fn}}(D\left| B \right. )\) of the fuzzy decision \({\tilde{D}}\) with respect to the feature subset B from the information-theoretic point of view. Therefore, Definition 11 measures the uncertainty of the multi-label fuzzy neighborhood decision system from both the algebraic view and the information view.

Multi-label feature selection algorithm based on fuzzy neighborhood rough sets

According to the relevant definitions in the previous section, this paper constructs a multi-label feature selection algorithm based on fuzzy neighborhood rough sets. To clearly understand the proposed algorithm, the process of feature selection for multi-label classification is described by the framework shown in Fig. 1.

Algorithm 1 MFSFN: multi-label feature selection based on fuzzy neighborhood rough sets
Fig. 1 The framework of the proposed multi-label feature selection algorithm

In Algorithm 1, a multi-label feature selection algorithm (MFSFN) based on fuzzy neighborhood rough sets is proposed. Assume that the multi-label fuzzy neighborhood decision system contains n samples, m features and t labels with \(\left| D \right| \) decision classes. The time complexity of calculating the fuzzy similarity relation is \({\text {O}}(\frac{1}{2}{{n}^{2}}m)\), which is the basis for the calculation of the fuzzy decision with complexity \(O(tn\left| D \right| )\) in Steps 4–6; the time complexity of calculating the approximation accuracy is \({\text {O}}(nm\left| D \right| )\) in Step 7, and the time complexity of calculating the fuzzy neighborhood conditional entropy is \({\text {O}}(nm\left| D \right| )\) in Steps 8–9. In Steps 11–24, assuming the size of the selected subset is r, the time complexity is \({\text {O}}(mr\left| D \right| )\). Therefore, the worst-case time complexity of MFSFN is approximately \({\text {O}}(\frac{1}{2}{{n}^{2}}m+tn\left| D \right| +nm\left| D \right| +mr\left| D \right| )\). Since the number of decision classes in the proposed algorithm is constant, \(\left| D \right| =2\), the total computational time complexity of Algorithm 1 is \({\text {O}}(\frac{1}{2}{{n}^{2}}m)\).
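
The following Python sketch outlines the forward greedy search of Algorithm 1 as described by the complexity analysis above: it builds the per-feature fuzzy similarity relations, derives the fuzzy decision of Definition 2, evaluates candidate features with the mixed measure of Eq. (25) and stops with the rule of Eq. (28). It is a simplified reading of the procedure, not the authors' implementation; all names and default parameter values are illustrative.

```python
import numpy as np

def mfsfn(X, Y, delta=0.15, lam=0.65, alpha=0.6, beta=0.4):
    """Sketch of the forward search in Algorithm 1 (MFSFN). X is an n x m feature
    matrix normalized into [0, 1], Y is an n x t binary label matrix; delta is the
    fuzzy neighborhood radius, lam the threshold of Eq. (28), alpha and beta the
    thresholds of Definition 5. All names and defaults are illustrative."""
    n, m = X.shape
    # Per-feature fuzzy similarity R_a(x, y) = 1 - |x_a - y_a| (Eq. (20)).
    R = np.array([1.0 - np.abs(X[:, a][:, None] - X[:, a][None, :]) for a in range(m)])
    # Fuzzy decision {D~_0, D~_1} induced by the whole feature set (Definition 2).
    RA = R.min(axis=0)
    D = np.stack([np.mean([(RA @ (Y[:, j] == p).astype(float)) / RA.sum(axis=1)
                           for j in range(Y.shape[1])], axis=0) for p in (0, 1)], axis=1)

    def em(B):                                             # mixed measure EM_fn(D|B), Eq. (25)
        RB = R[B].min(axis=0)                              # fuzzy similarity under the subset B
        G = np.where(RB >= 1.0 - delta, RB, 0.0)           # granules of Definition 4
        card = np.count_nonzero(G, axis=1)
        P = np.stack([(G <= D[:, p]).mean(axis=1) for p in (0, 1)], axis=1)  # inclusion degrees
        low, up = (P >= alpha).sum(), (P > beta).sum()
        acc = low / up if up else 0.0                      # approximation accuracy, Eq. (19)
        cap = np.stack([np.count_nonzero((G > 0) & (G <= D[:, p]), axis=1)
                        for p in (0, 1)], axis=1)
        ent = np.mean(np.sum(-np.log2(np.maximum(cap, 1) / card[:, None]), axis=1))  # Eq. (24)
        return acc * ent

    em_all, red, rest = em(list(range(m))), [], list(range(m))
    while rest:
        best_em, best = min((em(red + [a]), a) for a in rest)  # feature with maximal SIG, Eq. (29)
        red.append(best)
        rest.remove(best)
        if best_em - em_all <= lam:                            # stopping rule, Eq. (28)
            break
    return red

# Illustrative usage on random data.
rng = np.random.default_rng(0)
print(mfsfn(rng.random((20, 8)), rng.integers(0, 2, (20, 3))))
```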

Experimental results and analysis

Experimental preparation

The main goal of feature selection is to select fewer features while achieving higher classification performance. To prove the validity and classification performance of our method, we select ten multi-label datasets from four different fields from http://mulan.sourceforge.net/datasets.html and http://www.uco.es/kdis/mllresources/. The Flags dataset contains details of some countries and their flags; Cal500 is a music dataset composed of 502 songs; Emotions is about music fragments that can evoke emotions; Scene stores pattern information for a series of scenes; Yeast contains biological information about gene microarray data and phylogenetic spectra; the BBC and Guardian datasets include 654 news articles covering 416 distinct news stories; the Gnegative, Plant and Virus datasets are used to predict the subcellular locations of proteins according to their sequences, where Gnegative stores 1392 sequences for Gram-negative bacterial species, Plant contains 978 sequences for plant species, and Virus contains 207 sequences for virus species. The basic information of these datasets, including the size of the sample set, the dimensionality of the attribute set, the cardinality of the label set and the domains of the ten multi-label datasets, is demonstrated in Table 2, where LC\((D)=\frac{1}{n}\sum \nolimits _{i=1}^{n}{\sum \nolimits _{j=1}^{t}{\left[ {{d}_{j}}({{x}_{i}})=+1 \right] }}\) is the cardinality of the labels, LD\((D)=\frac{1}{nt}\sum \nolimits _{i=1}^{n}{\sum \nolimits _{j=1}^{t}{[{{d}_{j}}({{x}_{i}})=+1]}}\) is the density of the labels, and \([{{d}_{j}}({{x}_{i}})=+1]\) denotes that the sample \({{x}_{i}}\) is associated with the label \({{d}_{j}}\): when \(\left[ {{d}_{j}}({{x}_{i}})=+1 \right] \) holds, \(\left[ \cdot \right] \) equals 1; otherwise it is 0 [52].
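
For reference, label cardinality and label density can be computed directly from the binary label matrix; the short sketch below uses an illustrative label matrix.

```python
import numpy as np

def label_cardinality(Y):
    """LC(D): average number of relevant labels per sample."""
    return Y.sum(axis=1).mean()

def label_density(Y):
    """LD(D): label cardinality normalized by the number of labels t."""
    return label_cardinality(Y) / Y.shape[1]

# Illustrative binary label matrix for 4 samples and 3 labels.
Y = np.array([[1, 0, 1], [0, 1, 0], [1, 1, 1], [0, 0, 1]])
print(label_cardinality(Y), label_density(Y))   # 1.75 and 0.5833...
```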

All the following experiments were performed using MATLAB R2016b on Windows 10 with an Intel(R) Core(TM) i5-8500 CPU at 3.00 GHz and 16.00 GB of memory. Two classifiers, MLKNN [52] and MLFE [53], are used to evaluate the classification performance of MFSFN. The smoothing factor is set to 1, and the number of nearest neighbors K is set to 10 in MLKNN and MLFE [54]. We select several common evaluation indexes of multi-label classification to evaluate the classification performance of our proposed method in multi-label learning, including the number of selected features (N), average precision (AP), coverage (CV), Hamming loss (HL), one error (OE), ranking loss (RL), macro-averaging F1 (MacF1) and micro-averaging F1 (MicF1) [25, 36, 40, 54]; each of these indexes measures a different aspect of the classification performance. The higher the values of AP, MacF1 and MicF1 are, the better the classification performance is, and the lower the values of CV, OE, RL and HL are, the better the classification performance is. In the following experimental results, "\(\uparrow \)" represents "the larger the better", and "\(\downarrow \)" represents "the smaller the better". Numbers in bold indicate that the corresponding algorithm is better than the other algorithms on the corresponding index.

Table 2 Description of the ten multi-label datasets

Parameter discussion

Since the parameters \(\delta \) and \(\lambda \) impact the classification performance of MFSFN, to obtain the best classification results, this subsection demonstrates the influence of the parameters on the feature selection results. The parameter \(\delta \) represents the fuzzy neighborhood radius, and the parameter \(\lambda \) is the threshold that controls the selection of the feature subset. In this paper, we set the variation range of \(\delta \) to [0, 0.5] with a step size of 0.05, and the variation range of \(\lambda \) to [0, 1] with a step size of 0.05. As shown in Figs. 2 and 3, the X-axis refers to the neighborhood radius \(\delta \), and the Y-axis refers to the threshold \(\lambda \) that controls the selection of the feature subset. We use the Scene dataset to demonstrate the training process of our proposed algorithm MFSFN, that is, the selection of the parameters \(\delta \) and \(\lambda \) under the two classifiers MLKNN and MLFE. Finally, the most appropriate parameters selected for each multi-label dataset are shown in Tables 3 and 4.

The purpose of the first portion is to analyze the change of the evaluation indexes with the parameters under the classifier MLKNN. Figure 2 illustrates the change of each evaluation index with the parameters on the Scene dataset. For the Scene dataset, when \(\delta =0.15\) and \(\lambda =0.65\), the five evaluation indexes AP, CV, RL, OE and N are the most appropriate. Therefore, in the following, \(\delta =0.15\) and \(\lambda =0.65\) are taken as the best parameters on the Scene dataset. The same process is used to obtain the best parameters of the other nine datasets in Table 2. The parameter values and evaluation index values are displayed in Table 3.

Fig. 2 Variation of each evaluation index with parameters \(\delta \) and \(\lambda \) on the Scene dataset

Table 3 The evaluation results of the ten datasets under classifier MLKNN

The second portion of this subsection analyzes the change of the evaluation indexes with the parameters under the classifier MLFE. Figure 3 demonstrates the change of each evaluation index with the parameters on the Scene dataset. For the Scene dataset, when \(\delta =0.05\) and \(\lambda =1\), the eight evaluation indexes N, AP, HL, CV, OE, RL, MacF1 and MicF1 reach their optimal values. Therefore, in the following, \(\delta =0.05\) and \(\lambda =1\) are taken as the best parameters on the Scene dataset; the same procedure is used to obtain the best parameters of the other nine datasets in Table 2. The parameter values and each evaluation index value are shown in Table 4.

Fig. 3 Variation of each evaluation index with parameters \(\delta \) and \(\lambda \) on the Scene dataset

Table 4 The evaluation results of the ten datasets under classifier MLFE

Comparison results of methods under MLKNN

This subsection presents the comparison results of our proposed method against related algorithms under MLKNN. First, our improved algorithm is compared on the Scene dataset with eight state-of-the-art multi-label feature selection algorithms, including MLNB [55], MDDMspc [56], MDDMproj [56], PMU [57], RF-ML [58], MFNMIopt [25], MFNMIneu [25] and MFNMIpes [25], in terms of AP, CV, HL and RL, using the experimental techniques and results provided in [25], where \(\mu \) is set to 0.5 in MDDMspc. The parameters \(\delta \) and \(\lambda \) of MFSFN in this experiment take the optimal parameter values in Table 3. Table 5 shows the experimental result of comparing MFSFN with the other eight algorithms on the Scene dataset. The AP value of MFSFN is optimal and is 0.0117 higher than that of MFNMIopt. On the CV index, MFSFN achieves the lowest value among all compared algorithms and is 0.0292 lower than MDDMspc. The RL value of MFSFN is lower than those of seven other algorithms, and MFSFN is 0.0043 lower than MLNB. In terms of HL, MFSFN ranks 2nd among the compared algorithms on the Scene dataset and is only 0.0002 higher than MFNMIopt, but MFSFN has obvious advantages over MFNMIopt on the indexes AP, CV and RL. Obviously, for the Scene dataset, MFSFN achieves better results on each evaluation indicator compared with the other eight algorithms, and the validity of the selected parameters \(\delta \) and \(\lambda \) is proved.

This part of the subsection adopts the classifier MLKNN and proves the validity of MFSFN in terms of N, AP, OE, CV and HL. Our method is compared with ParetoFS [59], ELA-CHI [60], PPT-CHI [61] and MUCO [62] on the Scene and Yeast datasets, using the experimental techniques and results in reference [59], as shown in Tables 6 and 7.

According to the experimental results in Table 6, the AP index of the proposed algorithm yields the most competitive performance among the five algorithms. On the CV index, MFSFN has obvious advantages over the other algorithms and is 0.6526 lower than ELA-CHI. On the OE index, the proposed method achieves better performance than the other algorithms and is 0.2649 lower than ELA-CHI. On the HL index, the proposed algorithm obtains better results than the other algorithms, and MFSFN is 0.0934 lower than MUCO. The fewest features are selected by ParetoFS, but our proposed method performs considerably better than ParetoFS in terms of AP, CV, HL and OE. In Table 7, we can observe that the AP of the proposed algorithm has obvious advantages over the other algorithms on the Yeast dataset; MFSFN is at least 0.0023 and at most 0.0248 larger than the other algorithms. For CV, MFSFN achieves superior performance to the other algorithms except ParetoFS and ranks 2nd, but MFSFN performs better than ParetoFS in terms of AP, OE and HL. On the whole, our proposed algorithm MFSFN has better classification performance than the other algorithms on the Scene and Yeast datasets, and the validity of the selected parameters \(\delta \) and \(\lambda \) is proved.

Table 5 The comparative of evaluation results among nine methods on the Scene dataset
Table 6 The comparative of evaluation results among five methods on the Scene dataset
Table 7 The comparative of evaluation results among five methods on the Yeast dataset

Then seven multi-label datasets, Flags, Yeast, Plant, Gnegative, Virus, BBC and Guardian, are selected from Table 2, and a series of experiments is carried out to compare the proposed algorithm MFSFN with six advanced related algorithms, including RF-ML, PMU, MDDMproj, MDDMspc, FSRD [63] and MFSMR [20]. The experimental techniques and results in reference [63] are used, and in reference [20] the number of missing labels is set to 0. The classification results in terms of AP, CV, OE and RL are demonstrated in Tables 8, 9, 10 and 11.

In Table 8, the AP index of MFSFN shows obvious advantages compared with the other algorithms on most of the datasets; it exhibits superior performance on four datasets, Flags, Yeast, Plant and Virus, while the highest value on the Guardian dataset is achieved by FSRD and the highest values on Gnegative and BBC are achieved by MFSMR; the other four algorithms do not achieve optimal performance on any dataset. As an example, with respect to the Flags dataset, the AP value of MFSFN is 0.8357, which compares favorably with the other six algorithms: 0.8093 for MDDMproj, 0.8226 for MDDMspc, 0.7970 for PMU, 0.8148 for RF-ML, 0.8288 for FSRD and 0.8182 for MFSMR. It is evident that the proposed method has an advantage over the other methods.

From Table 9, on the CV index, MFSFN has obvious advantages over the other algorithms on the Yeast and Plant datasets. On the Gnegative dataset, MFSFN is inferior to MFSMR and MDDMproj but has obvious advantages over the other four algorithms. On the Virus dataset, the CV of MFSFN is 1.2530, which is in close proximity to the lowest CV value, 1.2417, achieved by FSRD, showing that our method is competitive with the other methods. Additionally, MDDMproj, PMU and RF-ML do not outperform the other algorithms on any dataset. The CV value of MFSFN on the Guardian dataset is slightly inferior to that of FSRD and ranks 2nd. In short, our proposed method is superior to the other algorithms in most cases.

As seen from Table 10, on the Yeast and Plant datasets, the RL of the proposed method is obviously better than that of the other algorithms. On the Virus dataset, MFSFN is 0.0186 lower than MDDMspc and 0.0031 lower than RF-ML. On the Gnegative and Guardian datasets, the RL value of MFSFN ranks second. It is clear that where MFSFN ranks second it is slightly inferior to FSRD or MFSMR, but better than the other five algorithms.

As shown in Table 11, the OE index of MFSFN exhibits superior performance against the other algorithms on three datasets: Flags, Plant and Virus. On the Yeast dataset, the best OE is achieved by FSRD, and our method is only 0.0072 larger than FSRD. On the BBC dataset, MFSFN is 0.268 larger than the lowest value, which is achieved by MFSMR, and ranks second. On the Guardian dataset, the proposed algorithm is slightly inferior to MDDMspc and RF-ML, but MFSFN is about 0.037 lower than PMU. On the whole, our proposed method is fairly competitive with the other algorithms. Comprehensive analysis of Tables 8, 9, 10 and 11 shows that our algorithm has higher classification performance than the other algorithms in terms of AP, CV, RL and OE.

Table 8 AP (\(\uparrow \)) index of the seven methods on the seven datasets
Table 9 CV (\(\downarrow \)) index of the seven methods on the seven datasets
Table 10 RL (\(\downarrow \)) index of the seven methods on the seven datasets
Table 11 OE (\(\downarrow \)) index of the seven methods on the seven datasets

To verify the validity and stability of the proposed algorithm MFSFN, experimental comparisons for multi-label classification on the selected features are carried out by fivefold cross-validation. Four multi-label datasets from different fields are selected from Table 2: Yeast, Emotions, Scene and Cal500. The proposed algorithm MFSFN is compared with MUCO, MDDMproj, MDDMspc, PMU, MFS-KA [64] and RFNMIFS [39] on these four datasets in terms of AP, CV, OE, RL and HL, using the experimental techniques and results in the literature [39]. The classification results are demonstrated in Tables 12, 13, 14, 15 and 16.

From Table 12, the AP index of MFSFN apparently outperforms the other algorithms on the four datasets Yeast, Emotions, Cal500 and Scene; as an example, with respect to the Scene dataset, the maximum value of MFSFN is 0.0099 lower than that of MDDMspc, while the minimum value of MFSFN is 0.0941 higher than that of MDDMspc. Thus, MFSFN obtains better classification performance than the other algorithms on AP. As can be seen from Table 13, the CV value of MFSFN has a significant advantage over the other algorithms on three datasets: Yeast, Emotions and Scene. On the Cal500 dataset, MFSFN is 0.0917 higher than the minimum value of RFNMIFS, but the maximum value of MFSFN is 0.6717 lower than that of RFNMIFS, so MFSFN is more stable than RFNMIFS. As shown in Table 14, for the Yeast, Emotions and Scene datasets, MFSFN achieves the lowest mean OE values. On the Cal500 dataset, the lowest value of RFNMIFS is 0.0143 lower than that of MFSFN, but the highest value of RFNMIFS is 0.0047 higher than that of MFSFN, which indicates that the stability of MFSFN is stronger than that of the other algorithms. It can be seen from Table 15 that the RL of MFSFN is significantly better than that of the other six algorithms and obtains satisfactory results on the four datasets. From Table 16, the HL of MFSFN is better than that of the other algorithms on the four datasets Yeast, Scene, Emotions and Cal500. These results show that our algorithm can not only eliminate the redundant features of the four datasets, but also achieve better performance than the other six algorithms in terms of AP, CV, OE, RL and HL.
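The ranking-based and example-based metrics used in these comparisons can be computed directly from a classifier's score matrix inside each cross-validation fold. The following Python snippet is a minimal sketch, not the authors' implementation, of two of them, One Error (OE) and Hamming Loss (HL); the function and variable names are illustrative assumptions.

```python
# Minimal sketch (not the authors' code) of two of the evaluation metrics used
# above, computed from a score matrix produced by any classifier such as MLKNN.
import numpy as np

def one_error(scores, Y):
    """OE: fraction of samples whose top-ranked label is not relevant (lower is better)."""
    top = np.argmax(scores, axis=1)                      # index of the highest-scored label
    return float(np.mean(Y[np.arange(len(Y)), top] == 0))

def hamming_loss(Y_pred, Y):
    """HL: fraction of misclassified instance-label pairs (lower is better)."""
    return float(np.mean(Y_pred != Y))

# toy usage: 4 samples, 3 labels
Y = np.array([[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]])
scores = np.array([[0.9, 0.2, 0.7], [0.1, 0.8, 0.3], [0.6, 0.7, 0.2], [0.3, 0.6, 0.9]])
print(one_error(scores, Y))                              # 0.0: every top-ranked label is relevant
print(hamming_loss((scores > 0.5).astype(int), Y))       # 1/12: one wrong instance-label pair
```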

Table 12 AP (\(\uparrow \)) index of the seven methods on the four datasets
Table 13 CV (\(\downarrow \)) index of the seven methods on the four datasets
Table 14 OE (\(\downarrow \)) index of the seven methods on the four datasets
Table 15 RL (\(\downarrow \)) index of the seven methods on the four datasets
Table 16 HL (\(\downarrow \)) index of the seven methods on the four datasets

Comparison results of methods under MLFE

This subsection illustrates the performance of the proposed method by comparing it with other methods under the MLFE classifier. Three datasets are selected from Table 2: Flags, Yeast and Scene. MFSFN is compared with six state-of-the-art multi-label feature selection methods, PCT-CHI2 [19], CSFS [65], SFUS [66], Avg.CHI [67], MCLS [54] and RFNMIFS, on these three multi-label datasets. MFSFN is tested in terms of AP, CV, OE, RL, MacF1 and MicF1, using the experimental techniques and results in reference [39], as shown in Tables 17, 18 and 19. MFSFN prevails over the other algorithms in the optimal mean values of each evaluation index.

It can be seen from Table 17 that the six metrics of MFSFN are better than those of the other algorithms on the Flags dataset. The CV, MacF1 and MicF1 values of MFSFN have obvious advantages over the other six algorithms. On the RL index, MFSFN is 0.0172 higher than the lowest value of RFNMIFS and 0.0112 lower than its highest value. On the whole, MFSFN has better classification performance on the Flags dataset. From Table 18, the AP, MacF1 and MicF1 indexes of MFSFN are better than those of the other algorithms on the Yeast dataset; on CV, MFSFN is 0.0176 larger than the lowest value of RFNMIFS and 0.2832 lower than its highest value, so MFSFN has better performance for CV. For the OE index, MFSFN is 0.0056 higher than the lowest value of RFNMIFS but 0.0164 lower than the highest value; in short, MFSFN still has advantages in the OE measurement. In terms of RL, MFSFN is 0.0010 higher than the lowest value of RFNMIFS but 0.0216 lower than its highest value, so MFSFN is more stable than the other algorithms. It can be seen from Table 19 that the six indicators of MFSFN are significantly better than those of the other algorithms on the Scene dataset. Based on the above analysis of the classification results on the three datasets under the MLFE classifier, the MFSFN algorithm can not only effectively eliminate the redundant features of the three datasets, but also achieves higher classification performance than the other algorithms.

Table 17 Classification results of the seven methods on the Flags dataset
Table 18 Classification results of the seven methods on the Yeast dataset
Table 19 Classification results of the seven methods on the Scene dataset
Fig. 4 Comparison of the seven methods with Bonferroni–Dunn test under MLKNN

Statistical analysis

To systematically analyze the classification performance of MFSFN and intuitively display the statistical performance of each evaluation index under the various comparison algorithms, the Friedman statistical test [68] and the Bonferroni–Dunn test [69] are used in this section. The Friedman test is formulated as follows

$$\begin{aligned} \chi _{F}^{2}= & {} \frac{12T}{M(M+1)}\left( \sum \limits _{i=1}^{M}{R_{i}^{2}}-\frac{M{{(M+1)}^{2}}}{4} \right) , \end{aligned}$$
(30)
$$\begin{aligned} {{F}_{F}}= & {} \frac{(T-1)\chi _{F}^{2}}{T(M-1)-\chi _{F}^{2}},\end{aligned}$$
(31)

where M and T are the numbers of methods and datasets, respectively, and \({{R}_{i}}\) is the average rank of the \(i\)-th method over all datasets. In the Bonferroni–Dunn test, the average rank difference between methods is calculated to evaluate whether there are significant differences between them. The critical difference is expressed as follows

$$\begin{aligned} {{\left( CD \right) }_{\alpha }}={{q}_{\alpha }}\sqrt{\frac{M(M+1)}{6T}}, \end{aligned}$$
(32)

where \({{q}_{\alpha }}\) indicates the critical tabulated value of the test, and \(\alpha \) represents the significance level. Following the statistical tests in references [36, 70], the mean rank over all datasets is obtained by averaging the ranks on each metric: the optimal value under each index is assigned rank 1, the second best rank 2, and so on. A CD diagram is used to visually display the relationship between MFSFN and the other algorithms: the average rank of each method is drawn along an axis on which the rank value increases from left to right. MFSFN and a compared algorithm are linked with a thick line if their mean rank difference is within one critical difference, indicating that there is no significant difference between them; otherwise, any algorithm that is not connected is considered markedly different from the others.
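As a minimal illustration (not the authors' implementation; the function names are assumptions), the following Python sketch evaluates Eqs. (30)–(32) and reproduces the critical-difference values quoted in the remainder of this section.

```python
# Minimal sketch (assumed helpers, not the authors' code) of Eqs. (30)-(32).
# avg_ranks holds the mean rank R_i of each of the M methods over T datasets.
import numpy as np

def friedman_F(avg_ranks, T):
    """Return (chi2_F, F_F) as in Eqs. (30) and (31)."""
    M = len(avg_ranks)
    chi2 = 12 * T / (M * (M + 1)) * (np.sum(np.square(avg_ranks)) - M * (M + 1) ** 2 / 4)
    F = (T - 1) * chi2 / (T * (M - 1) - chi2)
    return chi2, F

def critical_difference(q_alpha, M, T):
    """Bonferroni-Dunn critical difference, Eq. (32)."""
    return q_alpha * np.sqrt(M * (M + 1) / (6 * T))

# reproduce the CD values quoted below (q_0.1 = 2.394 for M = 7 methods)
print(round(critical_difference(2.394, M=7, T=7), 4))    # 2.7644 (Tables 8-11)
print(round(critical_difference(2.394, M=7, T=4), 4))    # 3.6569 (Tables 12-16)
print(round(critical_difference(2.394, M=7, T=3), 4))    # 4.2226 (Tables 17-19)
```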

From the classification results in Tables 8, 9, 10 and 11, we obtain the average rankings of MFSFN and the six comparison algorithms in terms of AP, CV, RL and OE under the MLKNN classifier, and the corresponding \({{F}_{F}}\) values are demonstrated in Table 20. At the significance level \(\alpha =0.1\), each indicator rejects the null hypothesis that the seven algorithms have the same performance under the Friedman test. With \({{q}_{\alpha }}= 2.394\), CD = 2.7644 (\(M=7, T=7\)). The comparison of the seven algorithms by the Bonferroni–Dunn test is demonstrated in Fig. 4, from which it can be seen that MFSFN is significantly better than the other algorithms on the AP and OE evaluation indicators. From Fig. 4a, for the AP metric, the proposed algorithm has obvious advantages over PMU, MDDMspc, MDDMproj and RF-ML, while there is no significant difference between MFSFN and the algorithms FSRD and MFSMR. It can be seen from Fig. 4b that, for the CV index, there is no significant difference between MFSFN and MFSMR, MFSFN has obvious advantages over MDDMproj, PMU, RF-ML and MDDMspc, and there is no definite evidence of prominent differences among MDDMproj, RF-ML, MDDMspc and PMU. As shown in Fig. 4c, for the RL index, the algorithms MFSFN, FSRD and MFSMR are significantly better than MDDMproj, RF-ML, MDDMspc and PMU, and there is no consistent evidence of significant differences among MDDMproj, PMU, RF-ML and MDDMspc. From Fig. 4d, the OE index of MFSFN is distinctly better than that of the other algorithms, and the distinction among the performance of FSRD, MFSMR, RF-ML, PMU and MDDMspc is insignificant. To sum up, the proposed algorithm has more excellent classification performance than the other algorithms.

Table 20 Values of the four evaluation indexes with \({{F}_{F}}\) under MLKNN
Table 21 Values of the five evaluation indexes with \({{F}_{F}}\) under MLKNN
Fig. 5 Comparison of the seven methods with Bonferroni–Dunn test under MLKNN

From the classification results illustrated in Tables 12, 13, 14, 15 and 16, we obtain the average rankings of the proposed method and the six comparison algorithms in terms of AP, CV, HL, OE and RL under the MLKNN classifier, and the corresponding \({{F}_{F}}\) values are displayed in Table 21. At the significance level \(\alpha =0.1\), each indicator rejects the null hypothesis that the seven algorithms have the same performance under the Friedman test. With \({{q}_{\alpha }}= 2.394\), CD = 3.6569 (\(M=7, T=4\)). The comparison of the seven algorithms by the Bonferroni–Dunn test is demonstrated in Fig. 5, from which it can be seen that MFSFN performs better than or comparably to the other algorithms on each index. Fig. 5a illustrates that, in terms of AP, MFSFN is significantly better than the four algorithms PMU, MDDMspc, MDDMproj and MUCO and obtains comparable results to MFS-KA and RFNMIFS. As can be seen from Fig. 5b, d, the CV and RL of MFSFN outperform PMU, MUCO and MDDMproj and are comparable to MDDMspc, MFS-KA and RFNMIFS, and there is no full evidence of significant differences among RFNMIFS, MFS-KA, MDDMspc, MDDMproj, MUCO and PMU. As can be seen from Fig. 5c, for the OE index, MFSFN is comparable to RFNMIFS, MFS-KA and MDDMspc and significantly better than the remaining algorithms, and there is no concrete evidence of significant differences among MFS-KA, MDDMspc, PMU, MDDMproj and MUCO. It can be seen from Fig. 5e that the HL index of MFSFN is significantly better than that of MDDMspc, MDDMproj, MUCO and PMU. In general, MFSFN has strong classification performance compared with the other algorithms under the MLKNN classifier.

The classification results in Tables 17, 18 and 19 are statistically tested under the MLFE classifier, and the \({{F}_{F}}\) values of the six metrics are listed in Table 22. When \(\alpha =0.1\) and \({{q}_{\alpha }}=2.394\), CD = 4.2226 (\(M=7, T=3\)). The test results are demonstrated in Fig. 6. As shown in Fig. 6a, for the AP index, MFSFN performs better than PCT-CHI2 and CSFS and is comparable to RFNMIFS, MCLS, SFUS and Avg.CHI. As can be seen from Fig. 6b, e, there is not enough evidence to establish significant differences among MFSFN, RFNMIFS, MCLS, SFUS, CSFS and Avg.CHI in terms of CV and MacF1, and MFSFN is significantly superior to PCT-CHI2. As can be seen from Fig. 6c, there is no obvious difference between MFSFN and RFNMIFS, MCLS, SFUS and CSFS on the OE index, and MFSFN is superior to Avg.CHI and PCT-CHI2. As can be seen from Fig. 6d, for the RL index, MFSFN is comparable to RFNMIFS, MCLS, SFUS and PCT-CHI2 and performs better than Avg.CHI and CSFS. As can be seen from Fig. 6f, in terms of MicF1, MFSFN is comparable to RFNMIFS, MCLS, SFUS, PCT-CHI2 and Avg.CHI and is significantly superior to CSFS. Therefore, under the MLFE classifier, the algorithm MFSFN has more excellent performance than the other algorithms in general.

Table 22 Values of the six evaluation indexes with \({{F}_{F}}\) under the MLFE
Fig. 6 Comparison of the seven methods with Bonferroni–Dunn test under MLFE

Conclusion

In this article, a multi-label feature selection method based on fuzzy neighborhood rough sets was proposed by combining the information view with the algebraic view, which achieves high classification performance in the multi-label fuzzy neighborhood decision system. First, a new multi-label fuzzy neighborhood rough set model was proposed by combining NRS with FRS. Second, the fuzzy similarity matrix was obtained by computing the similarity between samples under different condition attributes, a new multi-label fuzzy decision was proposed, and the fuzzy neighborhood approximation accuracy was defined. Then, the fuzzy neighborhood conditional entropy was introduced according to the concept of information entropy in information theory, and a hybrid metric was designed by combining the fuzzy neighborhood approximation accuracy with the fuzzy neighborhood conditional entropy to measure the importance of each attribute. Finally, a multi-label feature selection method based on fuzzy neighborhood rough sets was developed, and a novel forward search algorithm for multi-label feature selection was provided. A series of experiments on ten multi-label datasets verifies the effectiveness of the proposed algorithm in multi-label classification. In our future work, we will seek multi-label feature selection methods with higher classification performance and more efficient search strategies.