Abstract
Autism Spectrum Disorder (ASD) is a group of polygenic developmental and neurobiological disorders characterised by a variety of developmental delays in social interaction. In recent years, computational methods using gene expression data have proved effective in predicting ASD at an early stage. The choice of feature selection method directly affects the prediction performance of these ASD prognosis methods. With advances in computational methods and the rapid growth of high-dimensional ASD gene expression data, there is a need to examine how well different computational techniques predict ASD. In this paper, we review and compare 22 feature selection methods for predicting ASD from gene expression data. The methods are categorised into traditional methods (14 methods) and network-based methods (8 methods). The experimental results show that the network-based methods generally outperform the traditional feature selection methods on all three accuracy measures: AUC (area under the curve), F1-score, and Matthews Correlation Coefficient.
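For readers unfamiliar with the three accuracy measures named above, the following is a minimal sketch of how each can be computed for a binary classifier. It is an illustration only, not the authors' evaluation code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for this example.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def f1_score(y_true, y_pred):
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0."""
    tp, _, fp, fn = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def matthews_corrcoef(y_true, y_pred):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def auc(y_true, scores):
    """Area under the ROC curve via pairwise ranking.

    Equals the fraction of (positive, negative) pairs in which the
    positive sample receives the higher score; ties count as half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (made up for illustration): 1 = ASD, 0 = control.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
# Hypothetical 0.5 threshold turns scores into hard predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print("AUC:", auc(y_true, scores))
print("F1 :", f1_score(y_true, y_pred))
print("MCC:", matthews_corrcoef(y_true, y_pred))
```

AUC is computed from the continuous scores, whereas F1-score and MCC depend on the thresholded predictions, so the three measures can rank the same set of methods differently.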
Ethics declarations
This work has been supported by the National Natural Science Foundation of China (61702069, 61963001), the Yunnan Fundamental Research Projects (202001AT070024), the NHMRC Grant (1123042), and the Australian Research Council Discovery Grant (DP170101306).
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Zhang, J., Nguyen, T., Truong, B., Liu, L., Li, J., Le, T.D. (2020). Computational Methods for Predicting Autism Spectrum Disorder from Gene Expression Data. In: Yang, X., Wang, CD., Islam, M.S., Zhang, Z. (eds) Advanced Data Mining and Applications. ADMA 2020. Lecture Notes in Computer Science, vol 12447. Springer, Cham. https://doi.org/10.1007/978-3-030-65390-3_31
DOI: https://doi.org/10.1007/978-3-030-65390-3_31
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-65389-7
Online ISBN: 978-3-030-65390-3
eBook Packages: Computer Science, Computer Science (R0)