1 Introduction

Deep learning has been studied since the 1940s and has recently attracted attention with advances in the technology and hardware for artificial neural networks [1]. Artificial neural networks that can handle large amounts of data and solve complex problems are also in demand. Deep neural networks (DNNs), which increase the number of hidden layers in artificial neural networks, have emerged to meet this requirement [2].

Convolutional neural networks (CNNs) are a representative type of DNN and are used in various fields, such as image classification [3, 4], face recognition [5], and video processing [6], as well as in safety–critical systems [7,8,9], such as those of autonomous vehicles, in which robustness is very important [10,11,12].

Some CNN models can classify images with higher accuracy than humans on specific datasets, but problems remain to be solved before they can operate in the real world. In particular, the models should be tolerant to untrained data [13, 38]. However, a common assumption in deep learning is that the training dataset contains all types of data that exist in real-world operating environments [14, 15]. This assumption is easily violated in practice, and when untrained data are input into a CNN, they are classified as one of the trained classes with high confidence [16].

In real-world operating environments, the performance of CNNs can be degraded by data of untrained types, which limits their operability [17]. Misclassification, particularly in safety–critical systems based on artificial intelligence, can lead to catastrophic consequences [18]. For example, autonomous vehicles can misclassify untrained elements in environments such as deserts or countryside where lanes and traffic signals do not exist, countries with different traffic signal systems, and roads with newly added signs. This can have a detrimental effect on human life, property, etc. Thus, studies have been conducted to solve these problems [19,20,21].

Previous studies use information from the data or the model to identify data of types that were not used to train DNNs. Data-based studies analyze the characteristics of the data using the Euclidean distance [22] or extreme value theory [23]. Although data-based studies can identify data of untrained types at a lower cost than model-based studies, their accuracy is low. Model-based studies identify data of new untrained types by adding prototypes [16, 24] or by adding or modifying extra structures or modules in the model [13, 40]. Model-based studies increase the computational cost and require more resources owing to the increased number of parameters in the model, which can affect model performance.

In the field of neuroscience, studies have been conducted to characterize the representation of brain regions [25, 26, 39]. Those studies have identified brain regions that perceive stimuli by providing input signals through sensory organs and characterizing the information recognized by specific brain regions. The studies conducted representational similarity analysis between the input signals and the brain regions and showed that if the input signals are similar, the brain regions that perceive stimuli are similar. The structure of the DNN is inspired by the human brain and works similarly to the human brain system [27,28,29]. Therefore, it is expected that a relation exists between the type of data input to the DNN and the specific region of the DNN. This study defines concepts and methods and conducts experiments to answer the following research questions.

  • RQ1 (Characteristic): Does a set of neurons that play a major role in each class exist? Can a set of neurons identify class characteristics?

  • RQ2 (Similarity): Can the similarity between neuron sets be used as a criterion to identify data from trained and untrained classes?

Inspired by neuroscience research, this study identifies the class neuron cluster (CNC), a set of neurons that are activated based on the class of input data, and analyzes the relation between the class of input data and the CNC. Based on the analysis results, we confirm whether the activated neurons are similar if the characteristics of the classes are similar. Furthermore, we identify data types that the model has not trained on using the CNCs.

First, we measure the criteria for identifying the data of classes that the model has not trained on, using the sets of neurons activated by the training and validation data. Next, based on these criteria, we identify the data from the untrained classes.

The method proposed in this study uses only activated neuron information without adding or modifying the model structure; this has the advantages of low-computational cost and no impact on the classification performance of the model.

The main contributions of this study are as follows:

  • We propose a CNC, which is a set of neurons used to identify a specific class. We show that the CNC recognizes class characteristics by conducting a similarity analysis between CNCs for the CIFAR-10 and STL-10 datasets and ResNet models.

  • We propose a new model-based method for identifying untrained class data using the CNC similarity.

  • We conduct experiments on public datasets (MNIST, CIFAR-10, and STL-10) to demonstrate the feasibility and effectiveness of the untrained class data identification method.

Section 2 describes some related work on the identification of the data of classes that the model has not trained on. Section 3 details the definition of the CNC, a set of neurons that are activated based on the class of input data, the identification of the CNC, and the analysis of the similarities between CNCs. Section 4 describes how to identify untrained class data. Section 5 presents the datasets, models, and configurations used in the experiments for identifying untrained class data. Section 6 presents and analyzes the experimental results. Section 7 provides the conclusions and the scope for future work. All variables and acronyms used in this paper are listed in Appendix 1.

2 Related work

The purpose of this study is to identify untrained class data. Therefore, it differs from studies that identify modified data or outliers of known classes based on generative adversarial networks (GANs) and adversarial attack methods. Model training speed, performance improvement, and pruning for weight reduction are not addressed in this study.

Studies on the identification of untrained class data can be classified into two categories: data-based and model-based studies. Among data-based studies, Mendes et al. [22] proposed an open-set nearest neighbor method that identifies untrained class data based on the Euclidean distance, depending on whether the label of the training sample closest to the input matches the label of the second-closest training sample. The method identifies trained and untrained class data with a high accuracy of approximately 80% or more in experiments on simple datasets; however, its accuracy on complex datasets, such as Caltech-256, falls below approximately 50%.

Zhang and Patel [23] proposed a sparse representation-based open-set recognition method that identifies untrained class data by analyzing the difference between the input data and the training data using extreme value theory. The method identifies trained and untrained class data with a high accuracy of approximately 90% or more in experiments on simple datasets; however, its accuracy on complex datasets, including Caltech-256, is approximately 68–80%.

Similar to [23], Yu et al. [43] proposed an open-set fault diagnosis method that identifies untrained class data by analyzing the data with extreme value theory. On industrial datasets, the proposed method increased the identification accuracy by approximately 10%.

Among model-based studies, Yang et al. [16] proposed generalized convolutional prototype learning, which trains CNN models with a prototype loss by adding prototype parameters. This method increases the classification accuracy by reducing intra-class distances and increasing inter-class distances based on the Euclidean distance. In addition, it improves the robustness of the CNN by adding a prototype for each new class and training the CNN to classify it. The method increases the accuracy by approximately 0.2% in the experiment on the MNIST dataset and by approximately 0.5% in the experiment on the CIFAR-10 dataset.

Similar to Yang et al. [16], Wang et al. [24] increased the classification accuracy by reducing intra-class distances and increasing inter-class distances using a CNN-based prototype ensemble technique. They identified untrained class data by calculating the novelty of each class using an open-world nearest mean classifier. The method identifies untrained class data with an accuracy of approximately 80% in the experiment on the simple Fashion-MNIST dataset and approximately 60% in the experiment on the complex CINIC dataset.

Gao et al. [41] proposed Convolutional open-world multi-task image Stream classifier with Intrinsic Similarity Metrics (CSIM), which identifies untrained class data based on similarity metrics. The method identified untrained class data with approximately 70% and less than 50% accuracy in experiments on MNIST and CIFAR-10 datasets, respectively.

Prototype-based methods, such as those of Yang et al. [16] and Wang et al. [24], require the addition of prototypes for the untrained class as parameters to the model, and an increase in the number of parameters increases the computational cost and required resources. These methods, including Gao et al. [41], require regular updating of model parameters to enhance the model’s untrained class data identification performance.

Zhou et al. [44] identified untrained class data using the learning-to-classify-with-incremental-new-class method based on entropy and probability. When data are input, if the combined entropy and probability score is greater than a threshold, the data are classified as an untrained class; the accuracy is approximately 2% higher than that of the comparative methods.

Ma et al. [45] proposed a method for identifying untrained class data in a generative adversarial network when the discriminator's score is greater than a threshold. The proposed method's accuracy was approximately 5% higher than that of the comparative methods.

Yoshihashi et al. [13] proposed classification-reconstruction learning for open-set recognition, a method for probabilistically identifying untrained class data using deep hierarchical reconstruction nets, designed around an OpenMax classifier, a modification of softmax. The method improves the F1-score by approximately 0.6 in the experiments on the modified ImageNet and LSUN datasets.

Dudi and Rajesh [40] proposed the shark smell-based whale optimization algorithm (SS-WOA), which identifies untrained class data based on the activation function values and classification costs of the CNN model. The method derives a threshold from the activation function values when data are input and identifies the input data as untrained class data when the classification cost is less than the threshold. The method improves the classification accuracy by approximately 0.57–4% compared with other classification models in the experiment on plant-leaf datasets. The methods proposed in [13] and [40] increase the computational cost because separate modules (an extra layer and algorithm) are added to the CNN model.

The untrained class data identification method proposed in this study differs from data-based methods in that it uses model information. In addition, it does not modify the model, such as a loss function or classifier, or add new modules to identify untrained class data; it uses only the activation information of the neurons in the trained model. Thus, because the method is model-independent, it has the advantages of low computational cost and no influence on the classification performance of the model, unlike previous model-based methods.

3 Class neuron cluster

Here, we define the CNC as a set of neurons that are activated by the input data of a particular type (class). Then, we identify the CNC for each class of input data and analyze the relation between the class of input data and the CNC.

3.1 Definition

Neurons recognize the characteristics of the input data and transmit the recognized information to the next layer [30]. The CNC is a set of neurons that recognize the characteristics of the data of a particular class, and these neurons are activated when the data of a particular class are input. Before defining the CNC, we define a function that determines whether a neuron is activated as in Definition 1.

Definition 1

Neuron activation function.

  • A set of neurons: \(N = \left\{ {n_{1} , n_{2} , \ldots , n_{l} } \right\}\)

  • A set of classes: \(S = \left\{ {c_{1} , c_{2} , \ldots , c_{m} } \right\}\)

  • A set of data in class \(c\): \(D_{c} = \left\{ {d_{1} , d_{2} , \ldots , d_{n} } \right\}\)

For input data \(d\), if the activation value of the neuron exceeds 0, neuron \(n\) is determined to be activated by data \(d\). For activation function \(f\), the function used to determine the activation of the neuron is expressed in Eq. (1).

$${\text{active}}\left( {n,\,d} \right) = \left\{ {\begin{array}{*{20}l} {1,} \hfill & {{\text{if}}~f\left( {n,\,d} \right) > 0} \hfill \\ {0,} \hfill & {{\text{else}}} \hfill \\ \end{array} } \right.$$
(1)

The CNC is a set of neurons that are activated more than a certain rate by the data of a particular class. Therefore, the neuron activation ratio is defined as in Definition 2 based on the neuron activation function in Definition 1.

Definition 2

Neuron activation ratio.

The activation ratio of neuron \(n\) for dataset \(D_{c}\) of a particular class \(c\) is the ratio of data that activates the neuron among the total data of the dataset, and it is expressed in Eq. (2).

$${\text{active}}\_{\text{ratio}}\left( {n, \,D_{c} } \right) = \frac{{{\Sigma }_{{d \in D_{c} }} {\text{active}}\left( {n, \,d} \right)}}{{\left| {D_{c} } \right|}}$$
(2)

Based on Definitions 1 and 2, Definition 3 defines the CNC, a set of neurons that are activated at more than a certain rate when the data of a particular class are input.

Definition 3

Class neuron cluster.

The \({\text{CNC}}_{c}\) of a particular class \(c\) is a set of neurons whose neuron activation ratio is greater than or equal to the threshold for dataset \(D_{c}\) and is expressed as Eq. (3).

$${\text{CNC}}_{c} = \{ n \in N|{\text{active}}\_{\text{ratio}}\left( {n, \,D_{c} } \right) \ge {\text{threshold}}\}$$
(3)

For example, as depicted in Fig. 1, when the number of data of class \(c\) is 100 and the threshold is set at 0.5, neurons activated more than 50 times are identified as the CNC of class \(c\).

Fig. 1 Example of a CNC
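To make Definitions 1–3 concrete, the following is a minimal NumPy sketch of CNC identification, assuming the neuron activation values have already been collected into a (num_data, num_neurons) array; the array and function names are illustrative, not from the original implementation.

```python
import numpy as np

def identify_cnc(activations: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Identify the CNC of one class from pre-computed activation values.

    `activations[i, j]` is assumed to hold f(n_j, d_i), the activation value
    of neuron n_j for input d_i of a single class c.
    """
    # Eq. (1): a neuron is active for an input if its activation value > 0.
    active = activations > 0
    # Eq. (2): fraction of the class data that activates each neuron.
    active_ratio = active.mean(axis=0)
    # Eq. (3): the CNC is the set of neuron indices whose ratio >= threshold.
    return np.flatnonzero(active_ratio >= threshold)
```

Matching the example in Fig. 1, with 100 samples of class \(c\) and `threshold=0.5`, a neuron enters the CNC only if it is activated by at least 50 of the samples.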

3.2 Class neuron cluster similarity analysis

To verify that the previously defined CNC recognizes the characteristics of the input data, we identify CNCs for each class of input data and analyze the similarities between CNCs.

3.2.1 Class neuron cluster similarity analysis approach

A CNC is a set of neurons used to recognize the characteristics of data. Therefore, when the CNC is measured for each class, the CNCs of classes with similar characteristics are expected to have many activated neurons in common. Conversely, the CNCs of classes with only slightly similar characteristics are expected to have fewer commonly activated neurons. To confirm this, we analyze the similarity between CNCs. Figure 2 shows an overview of the CNC similarity analysis.

Fig. 2 Overview of the CNC similarity analysis

Let \(M_{S}\) denote the model trained on the training data \(D_{S}^{{{\text{tr}}}}\) composed of \(m\) classes (\(S = \left\{ {c_{1} , c_{2} , \ldots , c_{m} } \right\}\)). We identify the CNC for each class by inputting \(D_{S}^{{{\text{tr}}}}\) into \(M_{S}\). Then, we measure the similarities between CNCs and analyze them by representing them as a similarity matrix. The method for measuring the similarity between CNCs is described in Definition 4.

Definition 4

Similarity measurement.

The similarity between two clusters, \({\text{CNC}}_{{c_{i} }}\) and \({\text{CNC}}_{{c_{j} }}\), is measured as expressed in Eq. (4).

$${\text{CNC}}_{{c_{i} }} \oplus {\text{CNC}}_{{c_{j} }} = \frac{{\left| {{\text{CNC}}_{{c_{i} }} \cap {\text{CNC}}_{{c_{j} }} } \right|}}{{\left| {{\text{CNC}}_{{c_{i} }} \cup {\text{CNC}}_{{c_{j} }} } \right|}}$$
(4)

As described in Definition 4, the similarity between two CNCs is the proportion of commonly activated neurons (i.e., the Jaccard index). We measure all pairwise similarities between the identified CNCs and express them as a similarity matrix.
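As a sketch, Eq. (4) is the Jaccard index over sets of neuron indices. Assuming CNCs are represented as the index arrays returned by the `identify_cnc` sketch above, the similarity and the full matrix can be computed as follows (names are illustrative):

```python
import numpy as np

def cnc_similarity(cnc_i: np.ndarray, cnc_j: np.ndarray) -> float:
    """Eq. (4): Jaccard similarity between two CNCs (sets of neuron indices)."""
    a, b = set(cnc_i.tolist()), set(cnc_j.tolist())
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def similarity_matrix(cncs: list) -> np.ndarray:
    """Pairwise similarities between the CNCs of all m classes (Figs. 3 and 4)."""
    m = len(cncs)
    sim = np.zeros((m, m))
    for i in range(m):
        for j in range(m):
            sim[i, j] = cnc_similarity(cncs[i], cncs[j])
    return sim
```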

The CNC similarity analysis is conducted on the CIFAR-10 [31] and STL-10 [32] datasets with ResNet models. CIFAR-10 and STL-10 each comprise 10 classes (six living-being and four object classes); CIFAR-10 has 5,000 training data per class, and STL-10 has 500.

For the CIFAR-10 dataset, we analyze the similarity over the 24,576 neurons in the last six layers of the ResNet-20 model; for the STL-10 dataset, over the 368,640 neurons in the last 10 layers of the ResNet-32 model. The reason for targeting the latter layers is explained in Sect. 5.

For the analysis, we identify CNCs by inputting 500 training data for each class into the model, measure all similarities between the identified CNCs, and express them as a similarity matrix.

3.2.2 Class neuron cluster similarity analysis results

When the threshold of the neuron activation ratio is set at 0.5, the similarity matrices representing the results of the CNC similarity analysis for the 10 classes of CIFAR-10 and STL-10 are presented in Figs. 3 and 4, respectively.

Fig. 3 Similarity matrix between the CNCs of CIFAR-10

Fig. 4 Similarity matrix between the CNCs of STL-10

The analysis results on the CIFAR-10 dataset show that the similarities between living-being–living-being pairs and between object–object pairs are higher than the similarities between living-being–object pairs. In the living-being–living-being pairs, the cat–dog pair has the highest similarity, and in the object–object pairs, the ship–airplane pair has the highest similarity. The pair with the lowest similarity is the horse–ship pair, which is a living-being–object pair.

The analysis results on the STL-10 dataset, similar to those on the CIFAR-10 dataset, show that the similarities between living-being–living-being pairs and object–object pairs are higher than those between living-being–object pairs. In the living-being–living-being pairs, the dog–monkey pair has the highest similarity, and in the object–object pairs, the ship–airplane pair has the highest similarity. The pair with the lowest similarity is the horse–airplane pair, which is a living-being–object pair.

The similarity analysis results indicate that the characteristics of classes with high similarity between CNCs are more similar than those with low similarity, and the CNC is a set of neurons that recognizes the data characteristics of a particular class.

As described in Definition 3, CNC identification depends on the threshold. When the threshold is small, even neurons activated by only a small fraction of the data are included, and the CNC is identified as coarse-grained. A coarse-grained CNC can flexibly identify data of various shapes; however, the data may be mistaken for a different class. Conversely, when the threshold is large, the CNC is identified as fine-grained, which may fail to recognize data of various shapes and result in poor flexibility. We conducted a similarity analysis based on the thresholds on the CIFAR-10 dataset and ResNet-20 model, and the analysis results are described in Appendix 2.

4 Untrained class data identification approach

The identification of untrained class data consists of two steps: measuring the model identifiability and identifying the untrained class data. Figure 5 shows a flowchart of the untrained class data identification approach.

Fig. 5 Flowchart of the untrained class data identification approach

The first step measures the model identifiability based on the training and validation data. The second step measures the class max similarity based on the unknown class data and training data and identifies the untrained class data by comparing this similarity with the model identifiability. Each step is described in detail in the following subsections.

4.1 Measure of model identifiability

Model identifiability is a criterion for determining whether arbitrary data input to the model belong to classes that the model was not trained on; it is measured based on class identifiability. Figure 6 shows an overview of the model identifiability measurement.

Fig. 6 Overview of the model identifiability measurement

We denote the training data excluding the data of class \(c_{h}\) as \(D_{{\overline{{c_{h} }} }}^{{{\text{tr}}}}\), the validation data excluding the data of \(c_{h}\) as \(D_{{\overline{{c_{h} }} }}^{{{\text{va}}}}\), and the model trained on the training data \(D_{{\overline{{c_{h} }} }}^{{{\text{tr}}}}\) as \(M_{{\overline{{c_{h} }} }}\). First, we input the training data \(D_{{\overline{{c_{h} }} }}^{{{\text{tr}}}}\) and validation data \(D_{{\overline{{c_{h} }} }}^{{{\text{va}}}}\) into model \(M_{{\overline{{c_{h} }} }}\) and identify the CNCs for the training data and validation data, respectively. The validation data are selected from the original training data and do not overlap with the training data used for CNC identification.

Next, we measure the similarity between the CNCs of the training and validation data for each class. This similarity is used to identify each class. Class identifiability is a class-level criterion for determining whether arbitrary data input to the model belong to a class that the model was trained on. Class identifiability is described in Definition 5.

Definition 5

Class identifiability.

The identifiability of a particular class \(c_{k}\) is the similarity between the \({\text{CNC}}_{{c_{k} }}^{{{\text{tr}}}}\) identified by inputting the training data, and the \({\text{CNC}}_{{c_{k} }}^{{{\text{va}}}}\) identified by inputting the validation data, as expressed in Eq. (5).

$${\text{identifiability}}_{{c_{k} }} = {\text{CNC}}_{{c_{k} }}^{{{\text{tr}}}} \oplus {\text{CNC}}_{{c_{k} }}^{{{\text{va}}}}$$
(5)

Model identifiability is a criterion for determining whether the input data are the data of the trained class at the model level. As expressed in Definition 6, it is measured as the minimum value among the class identifiabilities.

Definition 6

Model identifiability.

The identifiability of model \(M_{s}\) trained on \(S\) composed of \(m\) classes is the minimum value among the identifiabilities of \(m\) classes, as expressed in Eq. (6).

$${\text{identifiability}}_{{M_{S} }} = {\text{min}}\left( {\left\{ {{\text{identifiability}}_{{c_{i} }} } \right\}_{{i = 1}}^{m} } \right)$$
(6)
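Under the same assumptions as the earlier sketches (CNCs as neuron-index arrays and `cnc_similarity` implementing Eq. (4)), Definitions 5 and 6 reduce to a few lines; the function names are illustrative.

```python
def class_identifiability(cnc_tr, cnc_va) -> float:
    """Eq. (5): similarity between a class's training- and validation-data CNCs."""
    return cnc_similarity(cnc_tr, cnc_va)

def model_identifiability(cncs_tr, cncs_va) -> float:
    """Eq. (6): minimum class identifiability over the m trained classes."""
    return min(class_identifiability(tr, va)
               for tr, va in zip(cncs_tr, cncs_va))
```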

4.2 Identification of untrained class data

Untrained class data are identified by measuring the similarity between the CNC identified based on the input data and the CNC identified based on the training data and comparing the similarity with the model identifiability. Figure 7 shows an overview of untrained class data identification.

Fig. 7 Overview of untrained class data identification

Because a small number of data may not be sufficient for identifying neuron clusters, we augment the unknown class data \(D_{{c_{h} }}^{{{\text{un}}}}\) with rotation, zoom, brightness, and shift techniques [42]. The augmented data are then used for \({\text{CNC}}_{{c_{h} }}^{{{\text{un}}}}\) identification. The unknown class data are selected from the test data.
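The exact augmentation parameters are not reported in this paper; the sketch below shows one plausible setup using the Keras `ImageDataGenerator` API, where the ranges and the `factor` parameter are assumptions for illustration only.

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Illustrative ranges; [42] and the experiments may use different values.
augmenter = ImageDataGenerator(
    rotation_range=15,            # rotation
    zoom_range=0.1,               # zoom
    brightness_range=(0.8, 1.2),  # brightness
    width_shift_range=0.1,        # horizontal shift
    height_shift_range=0.1,       # vertical shift
)

def augment(unknown_data: np.ndarray, factor: int = 10) -> np.ndarray:
    """Expand a small unknown-class sample before CNC identification."""
    flow = augmenter.flow(unknown_data, batch_size=len(unknown_data), shuffle=False)
    return np.concatenate([next(flow) for _ in range(factor)], axis=0)
```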

We measure the class max similarity to determine whether the unknown class data belong to an untrained class. The class max similarity is the maximum of the similarities between \({\text{CNC}}_{{c_{h} }}^{{{\text{un}}}}\) and the CNCs based on the training data, as defined in Definition 7.

Definition 7

Class max similarity.

For the \({\text{CNC}}_{{c_{h} }}^{{{\text{un}}}}\) based on class \(c_{h}\), the max similarity of \(c_{h}\) is the maximum of the similarities measured against the CNCs of the \(m\) trained classes and is expressed in Eq. (7).

$${\text{similarity}}_{{c_{h} }}^{{{\text{max}}}} = {\text{max}}\left( {\left\{ {{\text{CNC}}_{{c_{i} }}^{{{\text{tr}}}} \oplus {\text{CNC}}_{{c_{h} }}^{{{\text{un}}}} } \right\}_{i = 1}^{m} } \right)$$
(7)

The untrained class data are identified by comparing the model identifiability measured in the previous step to the unknown class maximum similarity. The method for identifying the untrained class data is described in Definition 8.

Definition 8

Untrained class data identification.

If the unknown class max similarity, \({\text{similarity}}_{{c_{h} }}^{{{\text{max}}}}\) based on class \(c_{h}\), is less than the model identifiability, \({\text{identifiability}}_{M}\), \(c_{h}\) is identified as an untrained class, as expressed in Eq. (8).

$$c_{h} = \left\{ {\begin{array}{*{20}l} {{\text{untrained~class}},~} \hfill & {{\text{if}}~{\text{similarity}}_{{c_{h} }}^{{{\text{max}}}} < {\text{identifiability}}_{M} } \hfill \\ {{\text{trained~class}},} \hfill & {{\text{else}}} \hfill \\ \end{array} } \right.$$
(8)

If the unknown class max similarity is less than the model identifiability, the characteristics of the data of the unknown class are considered to be dissimilar to the characteristics of the data of all classes that the model is trained on; therefore, the data of the unknown class are identified as untrained class data.
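Combining Eqs. (7) and (8), the decision rule is a short function given the previous sketches; again, the names are illustrative rather than from the original implementation.

```python
def is_untrained_class(cnc_un, cncs_tr, model_identifiability_value: float) -> bool:
    """Eqs. (7) and (8): the unknown class is flagged as untrained when its
    max similarity to every trained-class CNC is below the model identifiability."""
    max_similarity = max(cnc_similarity(cnc_tr, cnc_un) for cnc_tr in cncs_tr)
    return max_similarity < model_identifiability_value
```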

To verify the performance of the untrained class data identification method, we define the class identification accuracy based on the accuracy used in existing classification tests [37]. The class identification accuracy is measured from the identification results obtained by comparing the model identifiability and the class max similarity. It measures how well untrained class data are identified as untrained and trained class data as trained. The class identification accuracy is described in Definition 9.

Definition 9

Class identification accuracy.

The class identification accuracy is the ratio of correctly classified trained and untrained class data to the total data, as expressed in Eq. (9).

$${\text{Class Identification Accuracy}} = \frac{{\text{TP + TN}}}{{{\text{TP}} + {\text{TN}} + {\text{FP}} + {\text{FN}}}}$$
(9)
  • TP: Identify untrained class data as an untrained class

  • TN: Identify trained class data as a trained class

  • FP: Identify trained class data as an untrained class

  • FN: Identify untrained class data as a trained class

To analyze the performance of the untrained class data identification method in detail, we measure the sensitivity, specificity, false-positive rate, and false-negative rate. Each performance metric is as follows.

$${\text{Sensitivity}} = \frac{{{\text{TP}}}}{{{\text{TP}} + {\text{FN}}}}$$
(10)
$${\text{Specificity}} = \frac{{{\text{TN}}}}{{{\text{TN}} + {\text{FP}}}}$$
(11)
$${\text{FNR}} \left( {\text{False negative rate}} \right) = \frac{{{\text{FN}}}}{{{\text{TP}} + {\text{FN}}}}$$
(12)
$${\text{FPR}} \left( {\text{False positive rate}} \right) = \frac{{{\text{FP}}}}{{{\text{TN}} + {\text{FP}}}}$$
(13)
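For reference, the metrics in Eqs. (9)–(13) can be computed directly from the four counts; the helper below is a straightforward sketch, assuming no denominator is zero.

```python
def identification_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Eqs. (9)-(13): class identification accuracy and related metrics."""
    return {
        "accuracy":    (tp + tn) / (tp + tn + fp + fn),  # Eq. (9)
        "sensitivity": tp / (tp + fn),                   # Eq. (10)
        "specificity": tn / (tn + fp),                   # Eq. (11)
        "fnr":         fn / (tp + fn),                   # Eq. (12)
        "fpr":         fp / (tn + fp),                   # Eq. (13)
    }
```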

5 Experimental design and settings

Here, we describe the experimental environments and methods used to measure model identifiability and identify the untrained class data.

5.1 Design for measuring model identifiability

Untrained class data are identified on the MNIST [33], CIFAR-10 [31], and STL-10 [32] datasets using ResNet [3], a CNN model. Table 1 presents the experimental datasets, models, and the number of neurons to be measured.

Table 1 Experimental datasets and models

MNIST is a 28 × 28 pixel image dataset comprising handwritten digits from 0 to 9, and CIFAR-10 is a 32 × 32 pixel image dataset comprising six living-being classes (bird, cat, dog, deer, frog, and horse) and four object classes (airplane, automobile, ship, and truck). Similar to CIFAR-10, STL-10 is a 96 × 96 pixel image dataset comprising six living-being classes (bird, cat, dog, deer, horse, and monkey) and four object classes (airplane, car, ship, and truck).

The MNIST and CIFAR-10 datasets, which have a small image size, employ the ResNet-20 model, whereas the STL-10 dataset, which has a relatively large image size, employs the ResNet-32 model.

CNNs store more abstracted information as the layers deepen [34,35,36]. Neurons in the front layers identify lines and corners, whereas neurons in the rear layers identify objects based on the information from neurons in the front layers. Therefore, identifying the CNC, which is a set of neurons that recognize the data characteristics of a particular class, on the rear layers is advantageous. We conduct experiments on neurons in layers corresponding to the last 31% of the activation layers of the model. We experiment on 24,576 neurons in six layers with a size of 8 × 8 × 64 for the ResNet-20 model, and 368,640 neurons in 10 layers with a size of 24 × 24 × 64 for the ResNet-32 model.
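The extraction code is not published with this paper; the sketch below shows one way to collect the rear-layer activations with tf.keras, assuming the ResNet implementation exposes its ReLUs as `Activation` layers (the function names are illustrative).

```python
import numpy as np
import tensorflow as tf

def rear_activation_model(model: tf.keras.Model,
                          fraction: float = 0.31) -> tf.keras.Model:
    """Expose the outputs of roughly the last 31% of activation layers."""
    act_layers = [layer for layer in model.layers
                  if isinstance(layer, tf.keras.layers.Activation)]
    k = max(1, round(len(act_layers) * fraction))
    return tf.keras.Model(inputs=model.input,
                          outputs=[layer.output for layer in act_layers[-k:]])

def collect_activations(act_model: tf.keras.Model,
                        data: np.ndarray) -> np.ndarray:
    """Flatten the per-layer feature maps into one (num_data, num_neurons)
    array, e.g., six layers of 8 x 8 x 64 = 24,576 neurons for ResNet-20."""
    outputs = act_model.predict(data)
    if not isinstance(outputs, list):
        outputs = [outputs]
    return np.concatenate([o.reshape(len(data), -1) for o in outputs], axis=1)
```

The resulting array can be fed directly to the `identify_cnc` sketch from Sect. 3.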

The numbers of training and validation data used for CNC identification for each dataset are listed in Table 2. We use 500 training data for each class in the MNIST and CIFAR-10 datasets and 300 training data for each class in the STL-10 dataset. The image size of STL-10 is three times larger than those of MNIST and CIFAR-10; hence, the STL-10 dataset requires more experiment time than the MNIST and CIFAR-10 datasets, and a smaller number of training data are used. On the MNIST and CIFAR-10 datasets, there are 500 validation data for each class, and on the STL-10 dataset, there are 200 validation data for each class.

Table 2 Number of training/validation data used for CNC identification

Based on the CNCs identified using the training and validation data, we measure the model identifiability. To identify untrained class data, we train a model \(M_{{\overline{{c_{x} }} }}\) on all classes in the dataset except one specific class \(c_{x}\). Because each class in the dataset is excluded once, if \(m\) classes are present, a total of \(m\) models are created (\(M_{{\overline{{c_{1} }} }} ,{ }M_{{\overline{{c_{2} }} }} ,{ } \ldots ,{ }M_{{\overline{{c_{m} }} }}\)).

5.2 Design for identification of untrained class data

We identify untrained class data from the unknown class data, which consist of data from nine trained classes and one untrained class. The model should correctly identify the trained class data as trained and the untrained class data as untrained. The number of unknown class data is listed in Table 3.

Table 3 Number of unknown class data

The control parameters affecting this experiment are the number of data used for CNC identification and the activation rate threshold, which is the CNC identification criterion. Therefore, we conduct experiments by changing the values of the parameters. Each parameter is described as follows.

  • The number of data used for CNC identification.

  • The identification of a CNC, which is a set of neurons that recognize the characteristics of data, is unreliable with only one data point. Therefore, we compare the results based on the number of data used for CNC identification.

  • The models used in the experiment are trained on only nine out of 10 classes, matching the unknown class data configuration. We conducted experiments on 10 models by excluding each class once and repeated them five times with different numbers of unknown class data used for CNC identification, thereby performing 50 experiments for each dataset.

  • Activation rate threshold.

  • As described in Sect. 3, if the threshold is changed, the CNC is identified differently, and the result of untrained class data identification may differ. Therefore, we conduct the experiment by changing the threshold used for CNC identification to 0.3, 0.5, and 0.7. The experimental results mainly describe the case in which the threshold is set at 0.5; the results for thresholds of 0.3 and 0.7 are described in detail in the Appendices.

5.3 Configurations

All experiments were conducted on a server with a Windows 10 Pro OS, AMD Ryzen 7 2700X CPU, 64.0 GB RAM, and an NVIDIA GeForce RTX 2080 Ti GPU. The MNIST and CIFAR-10 datasets are obtained from Keras v2.6.0, and the STL-10 dataset and ResNet model are available online.

6 Experimental results

Here, we describe the results of the measurement of model identifiability and identification of untrained class data based on model identifiability.

6.1 Results of the model identifiability measurement

To measure the model identifiability, we identified CNCs by using the training and validation data of the MNIST, CIFAR-10, and STL-10 datasets. Figure 8 presents a box plot depicting the CNC size distribution identified by inputting the training data into 10 models (\(M_{{\overline{{{\text{airplane}}}} }}\), \(M_{{\overline{{{\text{automobile}}}} }}\), \(M_{{\overline{{{\text{bird}}}} }}\), \(M_{{\overline{{{\text{cat}}}} }}\), \(M_{{\overline{{{\text{deer}}}} }}\), \(M_{{\overline{{{\text{dog}}}} }}\), \(M_{{\overline{{{\text{frog}}}} }}\), \(M_{{\overline{{{\text{horse}}}} }}\), \(M_{{\overline{{{\text{ship}}}} }}\), \(M_{{\overline{{{\text{truck}}}} }}\)) trained with nine classes, with the exception of one class from the training data of the CIFAR-10 dataset.

Fig. 8 Box plot for CNC size distribution based on the training data of CIFAR-10

The CNC size distribution for each class is measured using the nine models trained on that class. For example, the CNC size distribution of the airplane class comprises the CNC sizes of the airplane class measured from the models (\(M_{{\overline{{{\text{automobile}}}} }}\), \(M_{{\overline{{{\text{bird}}}} }}\), \(M_{{\overline{{{\text{cat}}}} }}\), \(M_{{\overline{{{\text{deer}}}} }}\), \(M_{{\overline{{{\text{dog}}}} }}\), \(M_{{\overline{{{\text{frog}}}} }}\), \(M_{{\overline{{{\text{horse}}}} }}\), \(M_{{\overline{{{\text{ship}}}} }}\), \(M_{{\overline{{{\text{truck}}}} }}\)) trained on the airplane data.

The CNCs are identified in sizes of approximately 3,700–4,800, accounting for approximately 15.1–19.5% of the total number of 24,576 neurons. The smallest CNC has a size of 3,785: the bird-class CNC (\({\text{CNC}}_{{{\text{bird}}}}^{{{\text{tr}}}}\)) of the model without the frog-class (\(M_{{\overline{{{\text{frog}}}} }}\)), and the largest CNC has a size of 4,727: the truck-class CNC (\({\text{CNC}}_{{{\text{truck}}}}^{{{\text{tr}}}}\)) of the model without the automobile-class (\(M_{{\overline{{{\text{automobile}}}} }}\)). Figure 9 displays a box plot depicting the CNC size distribution identified by inputting the validation data of the CIFAR-10 dataset.

Fig. 9 Box plot for CNC size distribution based on the validation data of CIFAR-10

The CNC size distribution based on the validation data is similar to that of the training data. The CNCs are identified in sizes of approximately 3,700–4,800. The smallest and largest CNCs are the bird-class CNC (\({\text{CNC}}_{{{\text{bird}}}}^{{{\text{va}}}}\)) of the model without the frog class (\(M_{{\overline{{{\text{frog}}}} }}\)) and the truck-class CNC (\({\text{CNC}}_{{{\text{truck}}}}^{{{\text{va}}}}\)) of the model without the automobile class (\(M_{{\overline{{{\text{automobile}}}} }}\)), the same as for the training data.

The CNC sizes of the living-being classes (bird, cat, deer, dog, frog, and horse) are found to be smaller than the CNC sizes of the object classes (airplane, automobile, ship, and truck). This is because the data from living-being classes with complex characteristics activate various neurons. As a result, the number of neurons with a neuron activation rate greater than the threshold becomes small. For example, the CNC size of a bird with many curves and various shapes is identified to be smaller than that of a truck with many straight lines and simple shapes. The minimum, average, and maximum sizes of the CNCs for the training and validation data for each dataset are listed in Table 4.

Table 4 CNC min/average/max size of training and validation data

Similar to the CIFAR-10 dataset, the MNIST and STL-10 datasets have similar size distributions of CNCs for the training and validation data. The CNC size of MNIST is approximately 1,600–2,500, which is approximately 6.5–10.2% of the total 24,576 neurons, and the CNC size of STL-10 is approximately 98,000–135,000, which is approximately 26.6–36.6% of the total 368,640 neurons.

The more complex the dataset, the more neurons are identified as clusters. The CNC of the STL-10 dataset with a large image size (96 × 96 pixel) is the largest, and the CNC of the MNIST dataset with a small image size (28 × 28 pixel) is the smallest.

Conversely, in the case of the MNIST dataset, more than 90% of neurons are not used as clusters. In the CIFAR-10 dataset, approximately 80% or more, and in the STL-10 dataset, approximately 64% or more neurons are not used as clusters.

When the threshold is set at 0.3, the size of the CNC is larger than when the threshold is set at 0.5. This is because the threshold is small; therefore, even neurons activated by a smaller fraction of the data are included in the cluster. When the threshold is set at 0.7, the size of the CNC is the smallest. This is because only highly activated neurons are included in the cluster as the threshold increases. The min/average/max CNC sizes per dataset for thresholds of 0.3 and 0.7 are described in Appendix 3.

The model identifiabilities measured using CNCs for the training and validation data for each dataset are listed in Table 5. Ten models are created for each dataset excluding one class. The average model identifiabilities of MNIST, CIFAR-10, and STL-10 are 0.656, 0.721, and 0.646, respectively.

Table 5 Model identifiability for each dataset

The identifiability measured for each model is used as a criterion for identifying the untrained class data. For example, for the MNIST dataset, the model without the zero class (\(M_{{\overline{{{\text{zero}}}} }}\)) identifies the unknown class data as untrained class data if the maximum similarity between the training data and the unknown class data is less than 0.670, which is the model identifiability.

Model identifiability decreases as the threshold increases. This is because the larger the threshold, the smaller the CNC; thus, fewer neurons overlap between the CNCs of the training and validation data. The model identifiability for each dataset when the thresholds are set at 0.3 and 0.7 is described in Appendix 4.

6.2 Results of identification of untrained class data

To identify the untrained class data, we measured the class identification accuracy by inputting unknown class data into the model trained on part of the entire training data. The MNIST, CIFAR-10, and STL-10 datasets all consist of 10 classes, and we divided the unknown class data into nine trained class datasets and one untrained class dataset. Figure 10 displays the class identification accuracy of the dataset based on the number of unknown class data.

Fig. 10 Class identification accuracy for the datasets based on the number of unknown class data

The relatively simple MNIST and CIFAR-10 datasets have higher class identification accuracy than the complex STL-10 dataset. The class identification accuracy in experiments with 10 data on the STL-10 dataset is 85%, and when more than 20 data are used, the accuracy is greater than 90% on all datasets. The more data used for CNC identification, the more accurately the CNC captures the class characteristics, and the higher the class identification accuracy. Figure 11 displays the identification accuracy for each class based on the number of unknown class data in the MNIST dataset.

Fig. 11 Class identification accuracy on MNIST based on the number of unknown class data

In the experiment on the MNIST dataset, the ‘five’ class has the lowest identification accuracy when the number of unknown class data is ten, and the ‘one,’ ‘seven,’ and ‘nine’ classes have less than 90% accuracy when the number of data is 50 or less. We measured the sensitivity, specificity, FNR, and FPR to analyze the performance of the class data identification in detail. Table 6 lists the performance metrics on the MNIST dataset based on the number of unknown class data.

Table 6 Performance metrics on MNIST based on the number of unknown class data

The sensitivity, which is the ratio of correctly classified untrained class data, is 100%; therefore, all untrained class data are correctly classified. Conversely, the specificity, which is the proportion of correctly classified trained class data, is greater than 91%, and some of the trained class data are incorrectly classified. Consequently, the FPR, which is the proportion of incorrectly classified trained class data, is less than 9%.

When the threshold used for CNC identification is set at 0.3, most of the untrained class data are incorrectly classified as trained class data. Conversely, when the threshold is set at 0.7, trained class data are incorrectly classified as untrained class data. The results of measuring the performance metrics on the MNIST dataset for thresholds of 0.3 and 0.7 are described in Appendix 5. Figure 12 presents the identification accuracy for each class based on the number of unknown class data in the CIFAR-10 dataset.

Fig. 12 Class identification accuracy on CIFAR-10 based on the number of unknown class data

In the experiment on the CIFAR-10 dataset, the ‘automobile’ class has the lowest identification accuracy when the number of unknown class data is ten; in all other cases, the accuracy is greater than 90%. Table 7 lists the other performance metrics on the CIFAR-10 dataset based on the number of unknown class data.

Table 7 Performance metrics on CIFAR-10 based on the number of unknown class data

Because some untrained class data are incorrectly classified, the sensitivity is 60–80%, with the lowest value occurring when the number of unknown class data is 30. In contrast, the majority of the trained class data are correctly classified, with a specificity greater than 92%. Overall, the larger the amount of data, the better the performance.

Similar to the MNIST dataset, in the experiment on the CIFAR-10 dataset, when the threshold used for CNC identification is set at 0.3, most of the untrained class data are incorrectly classified. In contrast, some trained class data are misclassified when the threshold is set at 0.7. The results of measuring the performance metrics on the CIFAR-10 dataset based on the threshold are described in Appendix 5. Figure 13 presents the identification accuracy for each class based on the number of unknown class data in the STL-10 dataset.

Fig. 13 Class identification accuracy on STL-10 based on the number of unknown class data

In the experiment on the STL-10 dataset, the identification accuracy of the ‘bird’ class is 0 when the number of unknown class data is ten and 60% when the number of unknown class data is twenty. In all other cases, the accuracy is greater than 90%. Table 8 lists the other performance metrics on the STL-10 dataset based on the number of unknown class data.

Table 8 Performance metrics on STL-10 based on the number of unknown class data

Approximately half of the untrained class data are incorrectly classified, with a sensitivity of 50–60%. This is because untrained class data are identified as class data with similar characteristics. For example, a model trained on a dog but not a cat would classify a cat as a dog. The majority of the trained class data are correctly classified, with the specificity greater than 87%. The results of measuring the performance metrics on the STL-10 dataset based on the threshold are described in Appendix 5.

To examine the feasibility and effectiveness of CNC, we compare its performance with those of CSIM [41] and CPE [24], state-of-the-art CNN-based untrained class data identification methods. The comparisons measure the FNR and FPR on the MNIST and CIFAR-10 datasets. Tables 9 and 10 list the performance metrics of the methods for each dataset.

Table 9 Performance metric of methods on MNIST
Table 10 Performance metric of methods on CIFAR-10

The FNR of CNC is mostly lower than that of the other methods, but its standard error is greater. This is because the untrained class data identification performance varies from class to class. CNC identifies all untrained class data on the MNIST dataset, but on the CIFAR-10 dataset, the cat and dog classes are identified as different classes.

On the CIFAR-10 dataset, the FPR of CNC is lower than that of the other methods when the number of unknown class data is greater than 20; in other cases, however, it is higher.

CSIM and CPE identify the class of data when 300 data are input and store data predicted to be untrained class data in the buffer. Then, when the amount of data accumulated in the buffer reaches 1,000, the model trains on the data in the buffer. Therefore, the more data that are accumulated in the model, the better the model's performance.

CNC identifies untrained data using only the model's information without training the model. The performance of CNC depends on the amount of data used to identify neuron clusters. While CSIM and CPE require 300 data for class identification and 1,000 data for model training, CNC has comparable or better performance in identifying untrained class data with only 40 or more data.

Factors that greatly affect class identification accuracy are the number of data, data characteristics, and cluster identification threshold. If sufficient data are used to identify the CNC that recognizes the data characteristics, the class identification accuracy can be increased to 100%.

The class identification accuracy decreases as the data’s characteristics become more complex. The STL-10 dataset, which has more complex characteristics than the MNIST and CIFAR-10 datasets, has lower class identification accuracy when the same amount of data is used.

A low threshold does not accurately identify untrained class data, resulting in a low sensitivity. A high threshold does not accurately identify the trained class data, resulting in a low specificity. Although the threshold can be set differently depending on the model or target dataset, based on the experimental results of this study, we recommend setting it to 0.5.

Classification of untrained class data is beyond the model's capability; the model cannot classify them, and the performance of the model decreases as the number of untrained class data increases. For example, if 50 out of 100 data are untrained class data, the accuracy of the model cannot exceed 50%. Therefore, untrained class data must be identified so that they are not input into the model.

7 Conclusion

In this study, we proposed a method for identifying the data of classes that a DNN has not been trained on. We identified the CNC, a set of neurons that are activated based on the class of the input data, and confirmed that the CNC is a set of neurons that recognize the characteristics of the data by conducting a similarity analysis among the CNCs. In addition, experiments on ResNet models and public datasets showed the feasibility and effectiveness of the untrained class data identification method based on CNC similarity. In future research, we plan to use CNCs to study methods for identifying trained class data that are generated by GANs or mutated by adversarial attacks.