
A machine learning autism classification based on logistic regression analysis


Autistic Spectrum Disorder (ASD) is a neurodevelopmental condition associated with significant healthcare costs; early diagnosis could substantially reduce these. The economic impact of autism reveals an urgent need for easily implemented and effective screening methods. Time-efficient ASD screening is therefore imperative, both to help health professionals and to inform individuals whether they should pursue formal clinical diagnosis. At present, very few autism screening datasets are available, and most of them are genetic in nature. We propose a new machine learning framework for autism screening of adults and adolescents, using datasets that contain vital features, and perform predictive analysis using logistic regression to reveal important information related to autism screening. We also perform an in-depth feature analysis on the two datasets using information gain (IG) and chi-square testing (CHI) to determine the influential features that can be utilized in screening for autism. The results show that machine learning was able to generate classification systems with acceptable performance in terms of sensitivity, specificity, and accuracy, among other measures.
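The feature analysis and evaluation measures named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy item responses `item_a` and `item_b`, and the toy predictions are all hypothetical, and the code assumes discrete (here binary) screening items and a binary class label. It computes information gain and the Pearson chi-square statistic for a feature against the class, then sensitivity, specificity, and accuracy for a set of predictions.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature, labels):
    """Reduction in label entropy obtained by splitting on a discrete feature."""
    n = len(labels)
    remainder = 0.0
    for value in set(feature):
        subset = [y for x, y in zip(feature, labels) if x == value]
        remainder += (len(subset) / n) * entropy(subset)
    return entropy(labels) - remainder


def chi_square(feature, labels):
    """Pearson chi-square statistic of the feature/label contingency table."""
    n = len(labels)
    observed = Counter(zip(feature, labels))
    feature_totals = Counter(feature)
    label_totals = Counter(labels)
    stat = 0.0
    for x, fx in feature_totals.items():
        for y, fy in label_totals.items():
            expected = fx * fy / n
            stat += (observed[(x, y)] - expected) ** 2 / expected
    return stat


def screening_metrics(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity, accuracy) for binary predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / len(pairs)
    return sensitivity, specificity, accuracy


# Hypothetical toy data: 1 = ASD traits flagged, 0 = not flagged.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
item_a = [1, 1, 1, 1, 0, 0, 0, 0]  # item that tracks the class exactly
item_b = [1, 0, 1, 0, 1, 0, 1, 0]  # item unrelated to the class
predictions = [1, 1, 1, 0, 0, 0, 1, 0]  # made-up classifier output

for name, item in (("item_a", item_a), ("item_b", item_b)):
    print(name, information_gain(item, labels), chi_square(item, labels))
print(screening_metrics(labels, predictions))
```

Under both scores, an item that separates the classes ranks above one that does not, which is the sense in which IG and CHI identify influential features. Sensitivity and specificity are reported separately because, in screening, missed cases and false alarms carry different costs.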






Corresponding author

Correspondence to Fadi Thabtah.


About this article


Cite this article

Thabtah, F., Abdelhamid, N. & Peebles, D. A machine learning autism classification based on logistic regression analysis. Health Inf Sci Syst 7, 12 (2019).



  • Autism spectrum disorder
  • Classification
  • Clinical decision making
  • Data mining
  • Feature analysis
  • Machine learning
  • Sensitivity
  • Specificity