Effective large for gestational age prediction using machine learning techniques with monitoring biochemical indicators

  • Faheem Akhtar
  • Jianqiang Li
  • Muhammad Azeem
  • Shi Chen
  • Hui Pan
  • Qing Wang
  • Ji-Jiang Yang (corresponding author)


A newborn whose birth weight is above the 90th percentile for its gestational age is termed large for gestational age (LGA). LGA infants suffer serious complications during and after the antepartum period because the condition is not identified early. Earlier recognition of LGA infants could slow progression and prevent further complications. In medical science, the prevention and mitigation of disease require the examination of biochemical indicators. Machine learning has evolved into a tool that can predict LGA infants from the most deterministic characteristics. This study aims to identify the most deterministic biochemical indicators for LGA prediction with minimal computational overhead. To the best of our knowledge, this is the first study to identify the most deterministic risk factors associated with LGA and to develop an LGA prediction model using machine learning techniques. To develop an efficient LGA prediction model, we conducted three groups of experiments that considered basic machine learning methods, feature selection, and imbalanced data, respectively. Support vector machine, logistic regression, Naive Bayes, and random forest classifiers were trained using tenfold cross-validation on an LGA dataset. We selected precision and area under the curve as performance evaluation metrics. Information gain, an entropy-based feature selection method, was adopted to rank features. In the last group of experiments, we introduced an ensemble data imbalance technique. In each group of experiments, the support vector machine performed best among the classifiers, producing the highest prediction precision score of 85%. All of the classifiers performed best with the subset of thirty ranked features, which validates the applied method for recognizing the most deterministic risk factors associated with LGA prediction.
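The entropy-based ranking mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature names and data below are made up for illustration, and the sketch assumes features that are already discrete (continuous biochemical indicators would first need to be binned, a step the abstract does not describe).

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(feature_values, labels):
    """Reduction in label entropy obtained by splitting on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average entropy of the labels within each feature value.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(columns, labels):
    """Return feature names sorted by information gain, highest first."""
    gains = {name: information_gain(values, labels)
             for name, values in columns.items()}
    return sorted(gains, key=gains.get, reverse=True)


# Toy, made-up data: 1 = large for gestational age (LGA), 0 = not LGA.
# The feature names are hypothetical and do not come from the study.
labels = [1, 1, 0, 0, 0, 0]
columns = {
    "high_glucose": [1, 1, 0, 0, 0, 0],  # separates the classes perfectly
    "high_bmi": [1, 0, 1, 0, 0, 0],      # weakly informative
    "noise": [0, 1, 0, 1, 0, 1],         # same class mix in both groups
}
ranking = rank_features(columns, labels)
print(ranking)
```

Keeping only the first k names of such a ranking gives a reduced feature subset that each classifier can then be trained on.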


Keywords: Large for gestational age; Feature selection; Machine learning; Risk factors; Prediction model; Data imbalance; Ensemble technique



This work is supported by the National Key Research and Development Program of China (project No. 2017YFB1400803).



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. Faculty of Information Technology, Beijing University of Technology, Beijing, China
  2. Department of Computer Science, Sukkur IBA University, Sukkur, Pakistan
  3. Department of Endocrinology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
  4. Tsinghua National Laboratory for Information Science and Technology, Tsinghua University, Beijing, China
