Abstract
Imperfect information inevitably appears in real situations for a variety of reasons. Although efforts have been made to incorporate imperfect data into classification techniques, many limitations remain as to the types of data, uncertainty, and imprecision that can be handled. In this paper, we present a Fuzzy Random Forest ensemble for classification and show its ability to handle imperfect data in both the learning and the classification phases. We then describe the types of imperfect data it supports and devise an augmented ensemble that can operate with other types of imperfect data: crisp, missing, probabilistic uncertainty, and imprecise (fuzzy and crisp) values. Additionally, we perform experiments with imperfect datasets created for this purpose and with datasets used in other papers to show the advantage of being able to express the true nature of imperfect information.
References
Ahn H, Moon H, Fazzari J, Lim N, Chen J, Kodell R (2007) Classification by ensembles from random partitions of high dimensional data. Comput Stat Data Anal 51:6166–6179
Asuncion A, Newman DJ (2007) UCI Machine Learning Repository. University of California, School of Information and Computer Science, Irvine, CA. http://www.ics.uci.edu/mlearn/MLRepository.html
Bonissone PP (1997) Approximate reasoning systems: handling uncertainty and imprecision in information systems. In: Motro, A, Smets, Ph (eds) Uncertainty management in information systems: from needs to solutions. Kluwer Academic Publishers, Dordrecht, pp 369–395
Bonissone PP, Cadenas JM, Garrido MC, Díaz-Valladares RA (2010) A fuzzy random forest. Int J Approx Reason 51(7):729–747
Casillas J, Sánchez L (2006) Knowledge extraction from fuzzy data for estimating consumer behavior models. In: Proceedings of IEEE conference on Fuzzy Systems, Vancouver, BC, Canada, pp 164–170
Coppi R, Gil MA, Kiers HAL (2006) The fuzzy approach to statistical analysis. Comput Stat Data Anal 51:1–14
Dubois D, Prade H (1988) Possibility theory: an approach to computerized processing of uncertainty. Plenum Press, New York
Dubois D, Guyonnet D (2011) Risk-informed decision-making in the presence of epistemic uncertainty. Int J Gen Syst 40(2):145–167
Duda RO, Hart PE, Stork DG (2001) Pattern classification. John Wiley and Sons, Inc, New York
Fernández A, del Jesus MJ, Herrera F (2009) Hierarchical fuzzy rule based classification systems with genetic rule selection for imbalanced data-sets. Int J Approx Reason 50(3):561–577
García S, Fernández A, Luengo J, Herrera F (2009) A study of statistical techniques and performance measures for genetics-based machine learning: accuracy and interpretability. Soft Comput 13(10):959–977
Garrido MC, Cadenas JM, Bonissone PP (2010) A classification and regression technique to handle heterogeneous and imperfect information. Soft Comput 14(11):1165–1185
Hernández J, Ramírez MJ, Ferri C (2004) Introducción a la Minería de Datos. Pearson-Prentice Hall, Englewood Cliffs
Janikow CZ (1996) Exemplar learning in fuzzy decision trees. In: Proceedings of the FUZZ-IEEE, New Orleans, USA, pp 1500–1505
Janikow CZ (1998) Fuzzy decision trees: issues and methods. IEEE Trans Man Syst Cybern 28:1–14
Langseth H, Nielsen TD, Rumí R, Salmerón A (2009) Inference in hybrid Bayesian networks. Reliab Eng Syst Saf 94:1499–1509
Mackay DJC (2003) Information theory, inference and learning algorithms. Cambridge University Press, Cambridge
McLachlan GJ, Krishnan T (1997) The EM algorithm and extensions. Wiley Series in Probability and Statistics, New York
Mitra S, Pal SK (1995) Fuzzy multi-layer perceptron, inferencing and rule generation. IEEE Trans Neural Netw 6(1):51–63
Opitz D, Maclin R (1999) Popular ensemble methods: an empirical study. J Artif Intell Res 11:169–198
Otero A, Otero J, Sánchez L, Villar JR (2006) Longest path estimation from inherently fuzzy data acquired with GPS using genetic algorithms. In: Proceedings on International Symposium on Evolving Fuzzy Systems, Lancaster, UK, pp 300–305
Palacios AM, Sánchez L, Couso I (2009) Extending a simple genetic cooperative–competitive learning fuzzy classifier to low quality datasets. Evol Intell 2:73–84
Palacios AM, Sánchez L, Couso I (2010) Diagnosis of dyslexia with low quality data with genetic fuzzy systems. Int J Approx Reason 51:993–1009
Palacios AM, Sánchez L, Couso I (2011) Linguistic cost-sensitive learning of genetic fuzzy classifiers for imprecise data. Int J Approx Reason 52:841–862
Quaeghebeur E, Cooman G (2005) Imprecise probability models for inference in exponential families. In: 4th international symposium on imprecise probabilities and their applications, Pittsburgh, Pennsylvania, pp 287–296
Quinlan JR (1993) C4.5: programs for machine learning. The Morgan Kaufmann Series in Machine Learning. Morgan Kaufmann Publishers, San Mateo
Ruiz A, López de Teruel PE, Garrido MC (1998) Probabilistic inference from arbitrary uncertainty using mixtures of factorized generalized Gaussians. J Artif Intell Res 9:167–217
Sánchez L, Suárez MR, Villar JR, Couso I (2008) Mutual information-based feature selection and partition design in fuzzy rule-based classifiers from vague data. Int J Approx Reason 49(3):607–622
Sánchez L, Couso I, Casillas J (2009) Genetic learning of fuzzy rules based on low quality data. Fuzzy Sets Syst 160:2524–2552
Witten IH, Frank E (2000) Data mining. Morgan Kaufmann Publishers, San Francisco
Acknowledgments
Supported by the project TIN2008-06872-C04-03 of the MICINN of Spain and the European Fund for Regional Development. Thanks also to the Funding Program for Research Groups of Excellence (code 04552/GERM/06) granted by the "Fundación Séneca", Murcia, Spain. R. Martínez is supported by the FPI scholarship program of the "Fundación Séneca". Thanks to Luciano Sánchez and Ana Palacios for their help in creating the extended boxplots.
Appendix: Combination methods
In this appendix, we use the notation defined in Sect. 3.2. For each strategy of the fuzzy classifier module in the FRF ensemble described in Sect. 3.2, several functions Faggre1_1, Faggre1_2 and Faggre2 can be defined. The complete set of functions is described in Bonissone et al. (2010); here, the methods used in this paper are described in detail.
1.1 Non-trainable methods
In these combination methods, a transformation is applied to the matrix L_FRF in Step 2 of Algorithms 3 and 4 so that each reached leaf assigns a simple vote to its majority class.
Within this group we define the following method, with two versions depending on the strategy used:

- Simple Majority vote:

- Strategy 1 → method SM1
The function Faggre11 in Algorithm 3 is defined as
$$ Faggre1_1(t,i,L\_FRF)=\left\{ \begin{array}{ll} 1 & {\rm if}\ i=\arg\displaystyle\max_{j, j=1,\ldots,I} \left\{ \sum_{n=1}^{N_t} L\_FRF{-}mod_{t,n,j} \right\}\\ 0 & {\rm otherwise} \\ \end{array}\right. $$

In this method, each tree t assigns a simple vote to the most voted class among the N_t leaves reached by example e in the tree.
The function Faggre12 in Algorithm 3 is defined as
$$ Faggre1_2(i,T\_FRF)=\sum_{t=1}^{T} T\_FRF_{t,i} $$

- Strategy 2 → method SM2
For Strategy 2, it is necessary to define the function Faggre2 combining information from all leaves reached in the ensemble by example e. Thus, the function Faggre2 in Algorithm 4 is defined as
$$ Faggre2(i,L\_FRF)= \sum_{t=1}^{T} \sum_{n=1}^{N_t} L\_FRF{-}mod_{t,n,i} $$
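As an illustrative sketch (not the paper's implementation), the two simple-majority versions can be written as follows, assuming each tree's reached leaves are given as an (N_t × I) array of class supports:

```python
import numpy as np

def leaf_majority_votes(L_FRF):
    """Step-2 transformation: each reached leaf casts one simple vote
    for its majority class (the argmax of its support vector)."""
    L_mod = []
    for tree in L_FRF:                        # tree: (N_t, I) array of leaf supports
        votes = np.zeros_like(tree)
        for n, leaf in enumerate(tree):
            votes[n, int(np.argmax(leaf))] = 1.0
        L_mod.append(votes)
    return L_mod

def sm1(L_FRF):
    """SM1: each tree votes for its most voted class (Faggre1_1),
    then tree votes are summed over the ensemble (Faggre1_2)."""
    L_mod = leaf_majority_votes(L_FRF)
    T_FRF = np.zeros(L_mod[0].shape[1])
    for votes in L_mod:
        T_FRF[int(np.argmax(votes.sum(axis=0)))] += 1.0
    return int(np.argmax(T_FRF))

def sm2(L_FRF):
    """SM2: all leaf votes in the ensemble are summed directly (Faggre2)."""
    L_mod = leaf_majority_votes(L_FRF)
    support = sum(votes.sum(axis=0) for votes in L_mod)
    return int(np.argmax(support))
```

The data layout (one array per tree, holding only the leaves reached by example e) is our assumption; the paper stores the same information in the matrix L_FRF.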
1.2 Trainable explicitly dependent methods
- Majority vote Weighted by Leaf:
In this combination method, a transformation is applied to the matrix L_FRF in Step 2 of Algorithms 3 and 4 so that each leaf reached assigns a weighted vote to the majority class. The vote is weighted by the degree of satisfaction with which example e reaches the leaf.
$$ \begin{aligned} {\rm For}\,t &= 1,\ldots,T \\ &{\rm For}\, n = 1,\ldots, N\\ & \quad {\rm For}\,i = 1,\ldots, I\\ &\qquad L\_FRF {-}mod_{t,n,i} = \left\{ \begin{array}{ll} \chi_{t,n}(e) & {\rm if}\ i=\arg\displaystyle\max_{j, j=1,\ldots,I} \{L\_FRF_{t,n,j}\}\\ 0 & {\rm otherwise} \\ \end{array}\right. \end{aligned} $$

Again, we have two versions according to the strategy used.
- Strategy 1 → method MWL1
The functions Faggre11 and Faggre12 are defined as
$$ Faggre1_1(t,i,L\_FRF)=\left\{ \begin{array}{ll} 1 & {\rm if}\ i=\arg\displaystyle\max_{j, j=1,\ldots,I} \left\{ \sum_{n=1}^{N_t} L\_FRF{-}mod_{t,n,j} \right\}\\ 0 & {\rm otherwise} \\ \end{array}\right. $$

$$ Faggre1_2(i,T\_FRF)=\sum_{t=1}^{T} T\_FRF_{t,i} $$

- Strategy 2 → method MWL2
The function Faggre2 is defined as
$$ Faggre2(i,L\_FRF)= \sum_{t=1}^{T} \sum_{n=1}^{N_t} L\_FRF{-}mod_{t,n,i} $$

- Majority vote Weighted by Leaf and by Tree:
Again, in this combination method, a transformation is applied to the matrix L_FRF in Step 2 of Algorithms 3 and 4 so that each leaf reached assigns a weighted vote to the majority class. The vote is weighted by the degree of satisfaction with which example e reaches the leaf.
$$ \begin{aligned} {\rm For}\,t &= 1,\ldots,T \\ &{\rm For}\, n = 1,\ldots, N\\ & \quad {\rm For}\,i = 1,\ldots, I\\ &\qquad L\_FRF {-}mod_{t,n,i} = \left\{ \begin{array}{ll} \chi_{t,n}(e) & {\rm if}\ i=\arg\displaystyle\max_{j, j=1,\ldots,I} \{L\_FRF_{t,n,j}\}\\ 0 & {\rm otherwise} \\ \end{array}\right. \end{aligned} $$

In addition, this method introduces a weight for each tree, obtained by testing each individual tree with the OOB dataset. Let \(\overline{p}=(p_1,p_2,\ldots,p_T)\) be the vector with the weights assigned to each tree. Each p_t is obtained as \(\frac{N\_success\_OOB_t}{size\_OOB_t}\), where N_success_OOB_t is the number of examples from the OOB dataset used to test the tth tree that are classified correctly, and size_OOB_t is the total number of examples in this dataset.
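The OOB-based tree weights p_t can be sketched as follows (a minimal sketch; the callable-per-tree representation and the (x, class) pair layout of each OOB_t set are our assumptions, not the paper's implementation):

```python
def oob_tree_weights(trees, oob_sets):
    """p_t = N_success_OOB_t / size_OOB_t: the fraction of the tree's own
    out-of-bag examples that it classifies correctly."""
    weights = []
    for predict, oob in zip(trees, oob_sets):
        n_success = sum(1 for x, y in oob if predict(x) == y)
        weights.append(n_success / len(oob))
    return weights
```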
- Strategy 1 → method MWLT1
The function Faggre11 is defined as:
$$ Faggre1_1(t,i,L\_FRF)=\left\{ \begin{array}{ll} 1 & {\hbox{if}}\ i=\arg\displaystyle\max_{j, j=1,\ldots,I} \left\{ \sum_{n=1}^{N_t} L\_FRF{-}mod_{t,n,j} \right\}\\ 0 & {\hbox{otherwise}} \\ \end{array}\right. $$

Vector \(\overline{p}\) is used in the definition of function Faggre1_2:
$$ Faggre1_2(i,T\_FRF)=\sum_{t=1}^{T} p_t \cdot T\_FRF_{t,i} $$

- Strategy 2 → method MWLT2
The vector of weights \(\overline{p}\) is applied to Strategy 2.
$$ Faggre2(i,L\_FRF)=\sum_{t=1}^{T} p_t \sum_{n=1}^{N_t} L\_FRF{-}mod_{t,n,i} $$

- Minimum Weighted by Leaf and by membership Function:
In this combination method, a transformation is applied to the matrix L_FRF in Step 2 of Algorithm 3.
$$ \begin{aligned} {\rm For}\,t &= 1,\ldots,T \\ &{\rm For}\, n = 1,\ldots, N\\ & \quad {\rm For}\,i = 1,\ldots, I\\ &\qquad L\_FRF {-}mod_{t,n,i} = \chi_{t,n}(e) \times \frac{E_i}{E_n} \end{aligned} $$

- Strategy 1 → method MIWLF1
The function Faggre11 is defined as:
$$ Faggre1_1(t,i,L\_FRF)=\left\{ \begin{array}{ll} 1 & {\hbox{if}} \ i=\displaystyle\arg\max_{j, j=1,\ldots,I}\left\{ \min(L\_FRF{-}mod_{t,1,j},L\_FRF{-}mod_{t,2,j},\right.\\ & \phantom{{\hbox{if}} \ i=\displaystyle\arg\max_{j, j=1,\ldots,I}}\left.\ldots ,L\_FRF{-}mod_{t,N_t,j})\right\} \\ 0 & {\hbox{otherwise}} \\ \end{array}\right. $$

The function Faggre1_2 incorporates the weighting, defined by a fuzzy membership function, for each tree:
$$ Faggre1_2(i,T\_FRF)=\sum_{t=1}^{T} \mu_{pond}\left(\frac{errors_{(OOB_t)}}{size_{(OOB_t)}}\right) \cdot T\_FRF_{t,i} $$

The membership function \(\mu_{pond}(x)\) is defined as:
$$ \mu_{pond}(x)=\left\{\begin{array}{ll} 1 & 0 \leq x \leq (pmin+marg)\\ \frac{(pmax+marg)-x}{(pmax-pmin)} & (pmin+marg)\leq x \leq (pmax+marg)\\ 0 & (pmax+marg) \leq x \\ \end{array}\right. $$where
- pmax is the maximum error rate over the trees of the FRF ensemble (\(pmax=\max_{t=1,\ldots,T}\left\{\frac{errors_{(OOB_t)}}{size_{(OOB_t)}}\right\}\)). The error rate of a tree t is obtained as \(\frac{errors_{(OOB_t)}}{size_{(OOB_t)}}\), where errors_(OOB_t) is the number of classification errors of tree t (using the OOB_t dataset as test set) and size_(OOB_t) is the cardinality of the OOB_t dataset. As indicated above, the OOB_t examples are not used to build tree t, so they constitute an independent sample with which to test it; we can therefore measure the goodness of a tree t by the number of errors it makes when classifying the set of examples OOB_t;
- pmin is the minimum error rate over the trees of the FRF ensemble; and
- \(marg=\frac{pmax-pmin}{4}\)
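The membership function μ_pond defined above transcribes directly to code (a sketch; pmin and pmax are passed in rather than recomputed from the ensemble):

```python
def mu_pond(x, pmin, pmax):
    """Fuzzy weight of a tree as a function of its OOB error rate x:
    full weight up to pmin+marg, linear decrease, zero from pmax+marg on."""
    marg = (pmax - pmin) / 4.0
    if x <= pmin + marg:
        return 1.0
    if x <= pmax + marg:
        return ((pmax + marg) - x) / (pmax - pmin)
    return 0.0
```

Note that the two linear pieces meet continuously: at x = pmin+marg the ramp evaluates to ((pmax+marg)-(pmin+marg))/(pmax-pmin) = 1, and at x = pmax+marg it reaches 0.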
1.3 Trainable implicitly dependent methods
Within this group we define the following method:

- Minimum Weighted by membership Function: In this combination method, no transformation is applied to the matrix L_FRF in Step 2 of Algorithm 3.
- Strategy 1 → method MIWF1
The function Faggre11 is defined as
$$ Faggre1_1(t,i,L\_FRF)=\left\{ \begin{array}{ll} 1 & {\rm if}\ i=\displaystyle\arg\max_{j, j=1,\ldots,I}\left\{ \min(L\_FRF_{t,1,j},L\_FRF_{t,2,j}, \right. \\ & \phantom{{\rm if}\ i=\displaystyle\arg\max_{j, j=1,\ldots,I}}\left.\ldots ,L\_FRF_{t,N_t,j})\right\} \\ 0 & {\rm otherwise} \\ \end{array}\right. $$

The function Faggre1_2 incorporates the weighting defined by the previous fuzzy membership function for each tree:
$$ Faggre1_2(i,T\_FRF)=\sum_{t=1}^{T} \mu_{pond}\left(\frac{errors_{(OOB_t)}}{size_{(OOB_t)}}\right) \cdot T\_FRF_{t,i} $$
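The minimum-based combination can be sketched as follows (tree weights are passed in precomputed, e.g. as μ_pond values of the OOB error rates; the per-tree array layout is our assumption):

```python
import numpy as np

def minimum_weighted(L_FRF, tree_weights):
    """Sketch of the minimum-based Strategy 1 methods: tree t votes for the
    class with the largest minimum support over its reached leaves
    (Faggre1_1); tree votes are then combined with the membership-function
    weights (Faggre1_2)."""
    T_FRF = np.zeros(L_FRF[0].shape[1])
    for tree, w in zip(L_FRF, tree_weights):
        min_support = tree.min(axis=0)   # per-class minimum over reached leaves
        T_FRF[int(np.argmax(min_support))] += w
    return int(np.argmax(T_FRF))
```

For MIWF1 the arrays hold the raw supports L_FRF; for MIWLF1 they would hold the transformed values L_FRF-mod.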
Cadenas, J.M., Garrido, M.C., Martínez, R. et al. Extending information processing in a Fuzzy Random Forest ensemble. Soft Comput 16, 845–861 (2012). https://doi.org/10.1007/s00500-011-0777-1