
Investigation of Random Subspace and Random Forest Regression Models Using Data with Injected Noise

Conference paper — Knowledge Engineering, Machine Learning and Lattice Computing with Applications (KES 2012)

Abstract

Ensemble machine learning methods based on the random subspace and random forest techniques, employing genetic fuzzy rule-based systems as base learners, were developed in the Matlab environment. The methods were applied to a real-world regression problem: predicting the prices of residential premises from historical data on sales/purchase transactions. The accuracy of the ensembles generated by the proposed methods was compared with that of bagging, repeated holdout, and repeated cross-validation models. The tests were performed for four levels of noise injected into the benchmark datasets. The results were analysed using a statistical methodology comprising nonparametric tests followed by post-hoc procedures designed especially for multiple N×N comparisons.
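The experimental setup described above can be sketched as follows. This is a minimal, hypothetical illustration only: the paper's base learners are genetic fuzzy rule-based systems implemented in Matlab, whereas here plain decision trees stand in for them, and the synthetic data merely mimics the role of the real-estate transaction records. The function names (`inject_noise`, `fit_random_subspace`, `predict_ensemble`) and the noise levels are assumptions, not taken from the paper.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def inject_noise(y, level, rng):
    """Perturb the target with Gaussian noise scaled to `level` * std(y)."""
    return y + rng.normal(0.0, level * y.std(), size=y.shape)

def fit_random_subspace(X, y, n_estimators=20, subspace_frac=0.5, rng=None):
    """Random subspace ensemble: each base model is trained on a random
    subset of the features (here a decision tree stands in for the paper's
    genetic fuzzy rule-based systems)."""
    rng = rng if rng is not None else np.random.default_rng()
    k = max(1, int(subspace_frac * X.shape[1]))
    models = []
    for _ in range(n_estimators):
        feats = rng.choice(X.shape[1], size=k, replace=False)
        tree = DecisionTreeRegressor(random_state=0).fit(X[:, feats], y)
        models.append((feats, tree))
    return models

def predict_ensemble(models, X):
    """Combine the base models by averaging their predictions."""
    return np.mean([m.predict(X[:, f]) for f, m in models], axis=0)

rng = np.random.default_rng(42)
X = rng.normal(size=(300, 10))        # stand-in for transaction features
y_clean = X @ rng.normal(size=10)     # stand-in for premise prices

for level in (0.0, 0.05, 0.10, 0.20):  # four illustrative noise levels
    y_noisy = inject_noise(y_clean, level, rng)
    ens = fit_random_subspace(X, y_noisy, rng=rng)
    mse = np.mean((predict_ensemble(ens, X) - y_clean) ** 2)
    print(f"noise={level:.2f}  MSE vs clean target: {mse:.3f}")
```

The statistical analysis mentioned in the abstract would then be run over the per-dataset errors of all competing methods, e.g. a Friedman-type nonparametric test (available as `scipy.stats.friedmanchisquare`) followed by N×N post-hoc procedures.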




Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lasota, T., Telec, Z., Trawiński, B., Trawiński, G. (2013). Investigation of Random Subspace and Random Forest Regression Models Using Data with Injected Noise. In: Graña, M., Toro, C., Howlett, R.J., Jain, L.C. (eds) Knowledge Engineering, Machine Learning and Lattice Computing with Applications. KES 2012. Lecture Notes in Computer Science, vol 7828. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37343-5_1

  • DOI: https://doi.org/10.1007/978-3-642-37343-5_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-37342-8

  • Online ISBN: 978-3-642-37343-5

  • eBook Packages: Computer Science (R0)
