
Predicting Seminal Quality via Imbalanced Learning with Evolutionary Safe-Level Synthetic Minority Over-Sampling Technique

Cognitive Computation

Abstract

Seminal quality has fallen dramatically over the past two decades. Research indicates that environmental factors, health status, and lifestyle habits may contribute to the decline. Predicting seminal quality is useful in the early diagnosis of infertile patients. Recently, artificial intelligence (AI) technologies have been applied to the study of male fertility potential. As is common in many real-world applications of cognitive computation, seminal quality prediction faces the problem of class imbalance, and conventional algorithms are often biased towards the majority class. In this paper, an evolutionary safe-level synthetic minority over-sampling technique (ESLSMOTE) is proposed; it synthesizes minority instances along the same line with different weight degrees, called safe levels. The seminal profile of an individual from the fertility dataset is predicted via three classification methods combined with ESLSMOTE. Important indicators, such as accuracy, precision, recall, the receiver operating characteristic (ROC) curve, and F1-score, are used to evaluate the performance of the classifiers with ESLSMOTE under a tenfold cross-validation scheme. The experimental results show that the proposed ESLSMOTE significantly improves the accuracy of the back-propagation neural network, adaptive boosting (AdaBoost), and support vector machine classifiers. The highest area under the ROC curve (97.2%) is achieved by the ESLSMOTE-AdaBoost model. The results also indicate that the ESLSMOTE-based classifiers outperform current state-of-the-art methods for predicting seminal quality in terms of accuracy and area under the ROC curve. The ESLSMOTE-based classifiers are therefore capable of predicting seminal quality with high accuracy.
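The abstract does not spell out how a safe level steers the interpolation, so the following is only a minimal sketch of the safe-level idea as it is commonly described for the earlier, non-evolutionary Safe-Level-SMOTE. It is not the authors' ESLSMOTE, and the evolutionary component is not modelled at all. The function names (`safe_level`, `gap_range`, `synthesize`), the parameter `k`, and the convention that label `1` marks the minority class are all assumptions made for illustration.

```python
import math
import random


def safe_level(idx, X, y, k, minority=1):
    """Count minority-class points among the k nearest neighbours of X[idx].

    Hypothetical helper: the neighbourhood is taken over the whole dataset.
    """
    others = [j for j in range(len(X)) if j != idx]
    others.sort(key=lambda j: math.dist(X[idx], X[j]))
    return sum(1 for j in others[:k] if y[j] == minority)


def gap_range(sl_p, sl_n):
    """Return the interval the interpolation weight is drawn from.

    sl_p and sl_n are the safe levels of a minority point p and of its
    chosen neighbour n. None means no synthetic point is generated.
    """
    if sl_p == 0 and sl_n == 0:
        return None                  # both points look like noise: skip
    if sl_n == 0:
        return (0.0, 0.0)            # neighbour looks like noise: duplicate p
    ratio = sl_p / sl_n
    if ratio == 1:
        return (0.0, 1.0)            # equally safe: anywhere on the segment
    if ratio > 1:
        return (0.0, 1.0 / ratio)    # p is safer: stay closer to p
    return (1.0 - ratio, 1.0)        # n is safer: stay closer to n


def synthesize(p, n, sl_p, sl_n, rng):
    """Create one synthetic point on the segment from p to n, or None."""
    interval = gap_range(sl_p, sl_n)
    if interval is None:
        return None
    gap = rng.uniform(*interval)
    return [a + gap * (b - a) for a, b in zip(p, n)]
```

In this sketch a synthetic instance always lies on the line between `p` and `n`; the safe levels only decide which part of that line is sampled, so new points are pushed towards the endpoint surrounded by more minority neighbours.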



Funding

This study was funded by the National Natural Science Foundation of China (Grant No. 61702353), the Natural Science Foundation of Jiangsu Province (Grant No. BK20160355), the Science and Technology Project of Ministry of Housing and Urban-Rural Development (Grant No. 2016-K1-019), the Suzhou Science and Technology Project (Grant No. SYG201603), and RIBDA research project at XJTLU (Grant No. RIBDA2018-IRP2).

Author information

Corresponding author

Correspondence to Jieming Ma.

Additional information

Ethical approval

This article does not contain any studies with human participants performed by any of the authors.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Ma, J., Afolabi, D.O., Ren, J. et al. Predicting Seminal Quality via Imbalanced Learning with Evolutionary Safe-Level Synthetic Minority Over-Sampling Technique. Cogn Comput 13, 833–844 (2021). https://doi.org/10.1007/s12559-019-09657-9


  • DOI: https://doi.org/10.1007/s12559-019-09657-9
