Abstract
In fault detection systems, the massive amount of data gathered over the life cycle of equipment is often used to learn models or classifiers that aim at diagnosing different kinds of errors or failures. Within this huge quantity of information, some features (or sets of features) are more correlated with one kind of failure than with another. The presence of irrelevant features may degrade the performance of the classifier, so feature selection is a key step in improving a detection system. We propose in this paper an algorithm named STRASS, which aims at detecting the features relevant for classification. In certain cases, when a strong correlation exists between some features and the associated class, conventional feature selection algorithms fail to select the most relevant features. To cope with this problem, STRASS uses k-way correlation between features and the class to select relevant features. To assess the performance of STRASS, we apply it to simulated data collected from the Tennessee Eastman chemical plant simulator. The Tennessee Eastman process (TEP) has been used in many fault detection studies, and three specific faults are not well discriminated by conventional algorithms. The results obtained by STRASS are compared to those obtained with reference feature selection algorithms. We show that the features selected by STRASS always improve the performance of a classifier compared to the whole set of original features, and that the resulting classification is better than with most of the other feature selection algorithms.
References
Agrawal R, Ghosh S, Imielinski T, Iyer B, Swami A (1992) An interval classifier for database mining applications. In: Proceedings of the 18th International Conference on Very Large Data Bases (VLDB '92), Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, pp 560–573
Almuallim H, Dietterich TG (1991) Learning with many irrelevant features. In: Proceedings of the Ninth National Conference on Artificial Intelligence, AAAI Press, pp 547–552
Almuallim H, Dietterich TG (1994) Learning boolean concepts in the presence of many irrelevant features. Artif Intell 69:279–305
Bache K, Lichman M (2013) UCI machine learning repository
Blum AL, Langley P (1997) Selection of relevant features and examples in machine learning. Artif Intell 97:245–271
Casillas J, Cordón O, del Jesus MJ, Herrera F (2000) Genetic feature selection in a fuzzy rule-based classification system learning process for high-dimensional problems
Casimir R, Boutleux E, Clerc G, Yahoui A (2006) The use of features selection and nearest neighbors rule for faults diagnostic in induction motors. Eng Appl Artif Intell 19(2):169–177
Chebel-Morello B, Michaut D, Baptiste P (2001) A knowledge discovery process for a flexible manufacturing system. In: Proceedings of the 8th IEEE International Conference on Emerging Technologies and Factory Automation, vol 1, pp 651–658
Chiang LH, Kotanchek ME, Kordon AK (2004) Fault diagnosis based on Fisher discriminant analysis and support vector machines. Comput Chem Eng 28(8):1389–1401
Cui P, Li J, Wang G (2008) Improved kernel principal component analysis for fault detection. Expert Syst Appl 34(2):1210–1219
Dash M, Liu H, Motoda H (2000) Consistency based feature selection. In: Terano T, Liu H, Chen A (eds) Knowledge Discovery and Data Mining, Current Issues and New Applications. Lecture Notes in Computer Science, vol 1805. Springer, Berlin, pp 98–109
Downs J, Vogel E (1993) A plant-wide industrial process control problem. Comput Chem Eng 17(3):245–255
Guyon I, Weston J, Barnhill S, Vapnik V (2002) Gene selection for cancer classification using support vector machines. Mach Learn 46(1-3):389–422
Hall M, Frank E, Holmes G, Pfahringer B, Reutemann P, Witten IH (2009) The WEKA data mining software: an update. SIGKDD Explor Newsl 11(1):10–18
Hall MA (2000) Correlation-based feature selection for discrete and numeric class machine learning. In: Proceedings of the 17th International Conference on Machine Learning, Morgan Kaufmann, pp 359–366
Jack L, Nandi A (2000) Genetic algorithms for feature selection in machine condition monitoring with vibration signals. IEE Proc Vis Image Signal Process 147(3):205–212
Kira K, Rendell LA (1992) The feature selection problem: traditional methods and a new algorithm. In: Proceedings of the Tenth National Conference on Artificial Intelligence, AAAI Press, AAAI’92, pp 129–134
Kononenko I (1994) Estimating attributes: analysis and extensions of RELIEF. In: Proceedings of the European Conference on Machine Learning (ECML-94), Springer-Verlag, pp 171–182
Kononenko I, Simec E, Robnik-Sikonja M (1997) Overcoming the myopia of inductive learning algorithms with ReliefF. Appl Intell 7(1):39–55
Langley P, Sage S (1997) Scaling to domains with irrelevant features. In: Computational Learning Theory and Natural Learning Systems: Volume IV. MIT Press, Cambridge, MA, USA, pp 51–63
Lanzi PL (1997) Fast feature selection with genetic algorithms: a filter approach. In: Proceedings of the IEEE International Conference on Evolutionary Computation, pp 537–540
Liu H, Motoda H (1998) Feature Selection for Knowledge Discovery and Data Mining. Kluwer Academic Publishers, Norwell MA, USA
Liu H, Yu L (2005) Toward integrating feature selection algorithms for classification and clustering. IEEE Trans Knowl Data Eng 17(4):491–502
Marcotorchino F (1984) Utilisation des comparaisons par paires en statistique des contingences. Centre scientifique IBM Paris Etudes F-069, F-071, F-081
Michaut D (1999) Filtering and variable selection in learning processes. PhD thesis, Université de Franche-Comté
Narendra PM, Fukunaga K (1977) A branch and bound algorithm for feature subset selection. IEEE Trans Comput 26(9):917–922
Noruzi Nashalji M, Aliyari Shoorehdeli M, Teshnehlab M (2010) Fault detection of the Tennessee Eastman process using improved PCA and neural classifier. In: Gao XZ, Gaspar-Cunha A, Köppen M, Schaefer G, Wang J (eds) Soft Computing in Industrial Applications. Springer
Peng H, Long F, Ding C (2005) Feature selection based on mutual information: criteria of max-dependency, max-relevance, and min-redundancy. IEEE Trans Pattern Anal Mach Intell 27:1226–1238
Ricker NL (1996) Decentralized control of the Tennessee Eastman challenge process. J Process Control 6(4):205–221
Riverol C, Carosi C (2008) Integration of fault diagnosis based on case-based reasoning principles in brewing. Sens & Instrumen Food Qual 2(1):15–20
Senoussi H, Chebel-Morello B (2008) A new contextual based feature selection. In: Proceedings of the IEEE International Joint Conference on Neural Networks (IJCNN 2008, IEEE World Congress on Computational Intelligence), pp 1265–1272
Sugumaran V, Muralidharan V, Ramachandran K (2007) Feature selection using decision tree and classification through proximal support vector machine for fault diagnostics of roller bearing. Mech Syst Signal Process 21(2):930–942
Thrun S, Bala J, Bloedorn E, Bratko I, Cestnik B, Cheng J, Jong KD, Dzeroski S, Hamann R, Kaufman K, Keller S, Kononenko I, Kreuziger J, Michalski R, Mitchell T, Pachowicz P, Roger B, Vafaie H, de Velde WV, Wenzel W, Wnek J, Zhang J (1991) The MONK's problems: a performance comparison of different learning algorithms. Tech. Rep. CMU-CS-91-197, Carnegie Mellon University, Computer Science Department, Pittsburgh, PA
Tibshirani R (1996) Regression shrinkage and selection via the lasso. J R Stat Soc Ser B 58(1):267–288
Torkkola K, Venkatesan S, Liu H (2004) Sensor selection for maneuver classification. In: Proceedings of the 7th IEEE International ITSC Conference
Tyan CY, Wang PP, Bahler DR (1996) An application on intelligent control using neural network and fuzzy logic. Neurocomputing 12(4):345–363
Verron S, Tiplica T, Kobi A (2008) Fault detection and identification with a new feature selection based on mutual information. J Process Control 18(5):479–490
Wang L, Yu J (2005) Fault feature selection based on modified binary PSO with mutation and its application in chemical process fault diagnosis. In: Wang L, Chen K, Ong Y (eds) Advances in Natural Computation. Lecture Notes in Computer Science, vol 3612. Springer, Berlin, Heidelberg, pp 832–840
Widodo A, Yang BS (2007) Application of nonlinear feature extraction and support vector machines for fault diagnosis of induction motors. Expert Syst Appl 33(1):241–250
Yang BS, Widodo A (2008) Support vector machine for machine fault diagnosis and prognosis. J Syst Des Dyn 2:12–23
Yu L, Liu H (2004) Efficient feature selection via analysis of relevance and redundancy. J Mach Learn Res 5:1205–1224
Zou H, Hastie T (2005) Regularization and variable selection via the elastic net. J R Stat Soc Ser B 67:301–320
Appendices
Appendix A: The Tennessee Eastman Process
The Tennessee Eastman Process (TEP) is a chemical process created by the Eastman Chemical Company to provide a realistic industrial test case for evaluating process control and monitoring methods [12]. This process was simulated in Matlab by Ricker [29]. The simulator was used to generate overlapping data sets to evaluate the classification performance. Figure 5 shows a flow sheet of the TEP. There are four unit operations: an exothermic two-phase reactor, a flash separator, a reboiler stripper, and a recycle compressor. The TEP produces two products (G and H) and one undesired by-product (F) from four reactants (A, C, D and E). The process has 12 input variables and 41 output variables. Only 52 variables are taken into account in this problem because one of the input variables (the reactor agitator speed) is constant. The system has fifteen types of identified faults. In this paper, we consider only three of them: faults 4, 9 and 11, described in Table 8.
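Since the only discarded variable is constant, the reduction from 53 recorded variables to 52 can be reproduced generically by dropping zero-variance columns. The sketch below is illustrative only; the function name and the array layout (samples × variables) are assumptions, not part of the original simulator:

```python
import numpy as np

def drop_constant_columns(X):
    """Remove zero-variance columns (e.g. the constant reactor agitator speed
    in the TEP data); return the reduced matrix and the dropped column indices."""
    keep = X.std(axis=0) > 0          # True for columns that actually vary
    return X[:, keep], np.flatnonzero(~keep)
```

Applied to the raw TEP measurements, this would leave the 52 variables considered in this paper.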
Appendix B: Synthetic data
We describe in this appendix the synthetic data sets used in this paper for simulation purposes. The LED display domain data set is available on the UCI data set repository [4].
The MONK’s problems [33] are composed of three target concepts:
MONK-1: (x₁ = x₂) ∨ (x₃ = 1)
MONK-2: exactly two of {x₁ = 1, x₂ = 1, …, x₆ = 1}
MONK-3: (x₅ = 3 ∧ x₄ = 1) ∨ (x₅ ≠ 4 ∧ x₂ ≠ 3)
The BOOL data set is built from a function of six Boolean features giving a Boolean class, for instance y_class = (x₁ ⊕ x₂) ∨ (x₃ ∧ x₄) ∨ (x₅ ∧ x₆). Six other randomly generated Boolean features are added to these features.
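A generator for this data set can be sketched as follows; the function name and the i.i.d. uniform sampling of the Boolean features are assumptions made for illustration, since the paper does not specify the sampling scheme:

```python
import random

def make_bool_dataset(n_samples, seed=0):
    """BOOL data set sketch: 6 informative Boolean features x1..x6 define the
    class y = (x1 XOR x2) OR (x3 AND x4) OR (x5 AND x6); 6 randomly generated
    Boolean features x7..x12 are appended as irrelevant noise."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_samples):
        x = [rng.randint(0, 1) for _ in range(12)]  # 6 informative + 6 random
        X.append(x)
        y.append((x[0] ^ x[1]) | (x[2] & x[3]) | (x[4] & x[5]))
    return X, y
```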
The Parity data set is built from a function of three Boolean features, y_class = x₁ ⊕ x₂ ⊕ x₃. Seven randomly generated Boolean features are added. This data set is particularly interesting because no relevant feature taken separately can be distinguished from the irrelevant ones.
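This property can be checked directly: over the full truth table, the mutual information between any single relevant feature and the parity class is zero, while the three features jointly determine it completely. A standard-library sketch (the helper below is illustrative, not part of STRASS):

```python
from collections import Counter
from itertools import product
from math import log2

def mutual_information(pairs):
    """I(X;Y) in bits, estimated from a list of (x, y) samples."""
    n = len(pairs)
    pxy = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    return sum(c / n * log2((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())

# Full truth table of the Parity concept y = x1 XOR x2 XOR x3.
table = [((x1, x2, x3), x1 ^ x2 ^ x3) for x1, x2, x3 in product((0, 1), repeat=3)]

single = mutual_information([(x[0], y) for x, y in table])  # one feature alone
joint = mutual_information([(x, y) for x, y in table])      # all three together
print(single, joint)  # → 0.0 1.0
```

A single feature carries 0 bits about the class, whereas the three relevant features together carry the full 1 bit, which is why per-feature relevance scores fail here.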
The Parity2 data set is the Parity data set to which two redundant features are added: x₁₁ = x₁ and x₁₂ = x₂. This data set makes it possible to test the ability of the algorithms to deal with redundant features.
The Coral data set is composed of six binary features x₁ to x₆, among which x₅ is irrelevant and x₆ is correlated at 75 % with the class y_class = (x₁ ∧ x₂) ∨ (x₃ ∧ x₄).
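One plausible way to realize the stated 75 % correlation is to copy the class into x₆ and flip it with probability 0.25; the generator below is a hypothetical sketch under that assumption (the function name and sampling scheme are not specified in the paper):

```python
import random

def make_coral(n_samples, seed=0):
    """Coral data set sketch: x1..x4 define the class, x5 is irrelevant noise,
    and x6 equals the class except for a 25% chance of being flipped."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_samples):
        x1, x2, x3, x4, x5 = (rng.randint(0, 1) for _ in range(5))
        y = (x1 & x2) | (x3 & x4)
        x6 = y if rng.random() < 0.75 else 1 - y
        rows.append(((x1, x2, x3, x4, x5, x6), y))
    return rows
```

With a large enough sample, x₆ agrees with the class on roughly 75 % of the rows, which is the trap this data set sets for purely pairwise selection criteria.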
Agrawal’s functions are a series of classification functions of increasing complexity that use nine features to classify people into different groups. More details can be found in [1].
Chebel-Morello, B., Malinowski, S. & Senoussi, H. Feature selection for fault detection systems: application to the Tennessee Eastman process. Appl Intell 44, 111–122 (2016). https://doi.org/10.1007/s10489-015-0694-6