Parallel GEP Ensemble for Classifying Big Datasets

  • Conference paper
Computational Collective Intelligence (ICCCI 2018)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11056)

Abstract

The paper describes a GEP-based (gene expression programming) ensemble classifier constructed using the stacked generalization concept. The classifier has been implemented with a view to enabling parallel processing, using Spark and SWIM, an open-source genetic programming library. It has been validated in computational experiments carried out on benchmark datasets.
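
As a rough illustration of the stacked generalization concept named in the abstract, the sketch below trains level-0 base learners on cross-validation folds, collects their out-of-fold predictions as meta-features, and fits a level-1 combiner on those predictions. It is not the authors' method: the single-feature threshold rules stand in for GEP-induced expression trees, the accuracy-weighted vote stands in for whatever meta-learner the paper uses, and all names (`train_stump`, `fit_vote_weights`, `fit_stacked`, `ROWS`, `LABELS`) and the toy data are hypothetical. Neither Spark nor SWIM is used here.

```python
from typing import Callable, List, Sequence

Row = Sequence[float]
Classifier = Callable[[Row], int]


def train_stump(rows: List[Row], labels: List[int], feature: int) -> Classifier:
    """Stand-in base learner (NOT GEP): best threshold rule on one feature."""
    best = (-1, 0.0, 1)  # (correct count, threshold, polarity)
    for threshold in sorted({r[feature] for r in rows}):
        for polarity in (1, -1):
            correct = sum(
                int((polarity * (r[feature] - threshold) >= 0) == bool(y))
                for r, y in zip(rows, labels)
            )
            if correct > best[0]:
                best = (correct, threshold, polarity)
    _, threshold, polarity = best
    return lambda r: int(polarity * (r[feature] - threshold) >= 0)


def fit_vote_weights(meta_rows: List[List[int]], labels: List[int]) -> List[float]:
    """Stand-in meta-learner: weight each base learner by 2 * accuracy - 1."""
    weights = []
    for f in range(len(meta_rows[0])):
        hits = sum(int(m[f] == y) for m, y in zip(meta_rows, labels))
        weights.append(2.0 * hits / len(labels) - 1.0)
    return weights


def fit_stacked(rows: List[Row], labels: List[int], n_folds: int = 2) -> Classifier:
    n_features = len(rows[0])
    # Level 0: each row is predicted by learners that never saw it in training.
    meta_rows = [[0] * n_features for _ in rows]
    for fold in range(n_folds):
        train_idx = [i for i in range(len(rows)) if i % n_folds != fold]
        test_idx = [i for i in range(len(rows)) if i % n_folds == fold]
        train_rows = [rows[i] for i in train_idx]
        train_labels = [labels[i] for i in train_idx]
        for f in range(n_features):
            clf = train_stump(train_rows, train_labels, f)
            for i in test_idx:
                meta_rows[i][f] = clf(rows[i])
    # Level 1: fit the combiner on the out-of-fold predictions.
    weights = fit_vote_weights(meta_rows, labels)
    # Refit the base learners on all data for use at prediction time.
    base = [train_stump(rows, labels, f) for f in range(n_features)]

    def predict(row: Row) -> int:
        score = sum(w * (2 * clf(row) - 1) for w, clf in zip(weights, base))
        return int(score > 0)

    return predict


# Toy data (hypothetical): feature 0 separates the classes, feature 1 is noise.
ROWS = [[0.1, 0.5], [0.2, 0.9], [0.9, 0.4], [0.8, 0.1],
        [0.3, 0.3], [0.4, 0.6], [0.7, 0.8], [0.6, 0.2]]
LABELS = [0, 0, 1, 1, 0, 0, 1, 1]

model = fit_stacked(ROWS, LABELS)
print(model([0.05, 0.5]), model([0.95, 0.5]))  # → 0 1
```

The per-fold, per-learner training steps in the level-0 loop are independent of one another, which is the kind of work a framework such as Spark could distribute across workers; how the paper actually partitions the computation is not stated in the abstract.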

Author information

Corresponding author

Correspondence to Izabela Wierzbowska.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Jȩdrzejowicz, J., Jȩdrzejowicz, P., Wierzbowska, I. (2018). Parallel GEP Ensemble for Classifying Big Datasets. In: Nguyen, N., Pimenidis, E., Khan, Z., Trawiński, B. (eds) Computational Collective Intelligence. ICCCI 2018. Lecture Notes in Computer Science, vol 11056. Springer, Cham. https://doi.org/10.1007/978-3-319-98446-9_22

  • DOI: https://doi.org/10.1007/978-3-319-98446-9_22

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-98445-2

  • Online ISBN: 978-3-319-98446-9

  • eBook Packages: Computer Science, Computer Science (R0)
