Classification of Pathologies on Medical Images Using the Algorithm of Random Forest of Optimal-Complexity Trees

  • NEW MEANS OF CYBERNETICS, INFORMATICS, COMPUTER ENGINEERING, AND SYSTEMS ANALYSIS
Cybernetics and Systems Analysis

The authors propose an approach to constructing classifiers in the Random Forest family of algorithms. A genetic algorithm determines the optimal combination and composition of the feature ensembles used to build the forest's trees. The principles of the group method of data handling are used to optimize the structure of the trees. The tree voting procedure in the forest is optimized using the analytic hierarchy process. Examples of applying the proposed algorithm to the detection of pathologies in medical images are provided, together with classification results compared against other known methods.
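The abstract does not spell out how the analytic hierarchy process (AHP) enters the voting step, so the following is only a minimal sketch of one plausible reading: each tree receives a weight derived from an AHP pairwise comparison matrix, and the forest decision is the class with the largest total weight. The helper names, the quality scores, and the rule for building the comparison matrix (ratio of scores) are assumptions made for illustration and are not taken from the paper. The priority vector is computed with the row geometric-mean approximation rather than the principal eigenvector; the two coincide for a perfectly consistent matrix such as the one built here.

```python
import math


def ahp_weights(pairwise):
    """Approximate the AHP priority vector by normalized row geometric means."""
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def pairwise_from_scores(scores):
    """Hypothetical comparison rule: entry (i, j) is score_i / score_j."""
    return [[si / sj for sj in scores] for si in scores]


def weighted_vote(predictions, weights):
    """Return the class label whose supporting trees carry the most weight."""
    totals = {}
    for label, w in zip(predictions, weights):
        totals[label] = totals.get(label, 0.0) + w
    return max(totals, key=totals.get)


# Assumed validation quality scores of three trees (illustrative numbers only).
tree_scores = [0.9, 0.4, 0.4]
weights = ahp_weights(pairwise_from_scores(tree_scores))

# Votes of the three trees for one image.
votes = ["pathology", "normal", "normal"]
decision = weighted_vote(votes, weights)
print(weights, decision)
```

With these assumed scores the single strong tree outweighs the two weaker ones, so the weighted decision differs from a plain majority vote. This is the kind of effect a tuned voting procedure is meant to capture.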

Author information

Corresponding author

Correspondence to V. Babenko.

Additional information

Translated from Kibernetyka ta Systemnyi Analiz, No. 2, March–April, 2023, pp. 190–202.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Babenko, V., Nastenko, I., Pavlov, V. et al. Classification of Pathologies on Medical Images Using the Algorithm of Random Forest of Optimal-Complexity Trees. Cybern Syst Anal 59, 346–358 (2023). https://doi.org/10.1007/s10559-023-00569-z

  • DOI: https://doi.org/10.1007/s10559-023-00569-z
