Identification of important gene signatures in schizophrenia through feature fusion and genetic algorithm

Mammalian Genome

Abstract

Schizophrenia is a debilitating psychiatric disorder that can significantly affect a patient’s quality of life and lead to permanent brain damage. Although medical research has identified certain genetic risk factors, the specific pathogenesis of the disorder remains unclear. Research employing magnetic resonance imaging is common, but few studies have worked at the gene level with expression profiles that cover a large number of screened genes. Moreover, the high dimensionality of genetic data makes accurate modeling difficult, and traditional gene screening techniques are inadequate for determining the precise number of key genes associated with schizophrenia. To address these challenges, this study presents a novel feature selection strategy that combines heuristic feature fusion with a multi-objective optimization genetic algorithm. The goal is to improve classification performance and identify the key gene subset for schizophrenia diagnostics. The approach integrates a filter-based feature selection method, which reduces data dimensionality, with a multi-objective optimization genetic algorithm, which improves classification. By combining filter and wrapper methods, the strategy leverages their respective strengths, leading to superior classification accuracy and a more efficient selection of relevant genes. The approach significantly improved classification results on 11 of the 14 relevant datasets, and its performance on the remaining three datasets is comparable to that of existing methods. Furthermore, visual and enrichment analyses have confirmed the practicality of the proposed method as a promising tool for the early detection of schizophrenia.
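The abstract describes a two-stage design: a filter that shrinks the gene set, followed by a multi-objective genetic algorithm that acts as a wrapper. The following is only a minimal sketch of that general pattern, not the authors' implementation. The scoring rule, the nearest-centroid classifier, the ranking scheme, the synthetic data, and every function name and parameter below are assumptions chosen for illustration. The two objectives assumed here are leave-one-out accuracy (to maximize) and subset size (to minimize).

```python
import random
from statistics import fmean, pstdev

def filter_scores(X, y):
    """Filter stage: absolute class-mean difference scaled by within-class spread."""
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, label in zip(X, y) if label == 0]
        b = [row[j] for row, label in zip(X, y) if label == 1]
        scores.append(abs(fmean(a) - fmean(b)) / (pstdev(a) + pstdev(b) + 1e-9))
    return scores

def top_k(scores, k):
    """Indices of the k highest-scoring features."""
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]

def loo_accuracy(X, y, features):
    """Wrapper objective: leave-one-out accuracy of a nearest-centroid classifier."""
    correct = 0
    for i in range(len(X)):
        dist = {}
        for label in (0, 1):
            rows = [X[r] for r in range(len(X)) if r != i and y[r] == label]
            dist[label] = sum((X[i][j] - fmean(row[j] for row in rows)) ** 2
                              for j in features)
        correct += min(dist, key=dist.get) == y[i]
    return correct / len(X)

def dominates(p, q):
    """p and q are (accuracy, subset size): higher accuracy and smaller size win."""
    return p[0] >= q[0] and p[1] <= q[1] and p != q

def evolve(X, y, candidates, pop_size=20, generations=15, seed=0):
    """Tiny elitist multi-objective GA over bit masks of the candidate features."""
    rng = random.Random(seed)
    n = len(candidates)
    cache = {}

    def objectives(mask):
        if mask not in cache:
            chosen = [c for c, bit in zip(candidates, mask) if bit]
            # An empty subset gets size n + 1 so that every real subset dominates it.
            cache[mask] = ((loo_accuracy(X, y, chosen), len(chosen))
                           if chosen else (0.0, n + 1))
        return cache[mask]

    def n_dominators(mask, pool):
        return sum(dominates(objectives(o), objectives(mask)) for o in pool)

    pop = [tuple(rng.randint(0, 1) for _ in range(n)) for _ in range(pop_size)]
    for _ in range(generations):
        children = []
        for _ in range(pop_size):
            p1, p2 = rng.sample(pop, 2)
            # Uniform crossover followed by bit-flip mutation.
            child = [a if rng.random() < 0.5 else b for a, b in zip(p1, p2)]
            children.append(tuple(bit ^ (rng.random() < 1 / n) for bit in child))
        pool = sorted(set(pop + children))  # deduplicate in a deterministic order
        # Keep the masks dominated by the fewest others (rank 0 = non-dominated).
        pop = sorted(pool, key=lambda m: (n_dominators(m, pool),
                                          -objectives(m)[0],
                                          objectives(m)[1]))[:pop_size]
    front = [m for m in pop if n_dominators(m, pop) == 0]
    return [([c for c, bit in zip(candidates, m) if bit], objectives(m))
            for m in front]

# Synthetic demo (hypothetical data): 24 samples, 40 "genes", genes 0-2 carry signal.
data_rng = random.Random(1)
y = [i % 2 for i in range(24)]
X = [[data_rng.gauss(0, 1) + (3.0 * label if j < 3 else 0.0) for j in range(40)]
     for label in y]
candidates = top_k(filter_scores(X, y), k=6)
front = evolve(X, y, candidates)
for genes, (acc, size) in front:
    print(f"genes={genes} accuracy={acc:.2f} size={size}")
```

The result is a set of non-dominated trade-offs rather than a single subset, which is the usual reason for a multi-objective formulation when the number of key genes is not known in advance. Note that this sketch scores the filter on all samples before cross-validation, so its accuracies are optimistic; an honest estimate would repeat the filter inside each held-out split.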


Data availability

The data underlying this article are available in the publicly accessible Gene Expression Omnibus (GEO) database. All datasets were derived from sources in the public domain:

GSE12649: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE12649.

GSE12654: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE12654.

GSE12679: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE12679.

GSE17612: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE17612.

GSE21138: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE21138.

GSE21935: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE21935.

GSE26927: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE26927.

GSE35974: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE35974.

GSE35977: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE35977.

GSE35978: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE35978.

GSE53987: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE53987.

GSE62191: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE62191.

GSE87610: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE87610.

GSE93987: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE93987.
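All fourteen links above share one pattern: the GEO accession-viewer address with the series accession passed as the `acc` query parameter. As a small convenience sketch (the names `GEO_QUERY`, `ACCESSIONS`, and `geo_url` are hypothetical and not part of the article), the list can be regenerated from the accession IDs alone:

```python
from urllib.parse import urlencode

# Base address of the GEO accession viewer, as used in the links listed above.
GEO_QUERY = "https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi"

# The fourteen series accessions named in the data availability statement.
ACCESSIONS = [
    "GSE12649", "GSE12654", "GSE12679", "GSE17612", "GSE21138",
    "GSE21935", "GSE26927", "GSE35974", "GSE35977", "GSE35978",
    "GSE53987", "GSE62191", "GSE87610", "GSE93987",
]

def geo_url(accession):
    """Build the accession-viewer URL for one GEO series ID such as 'GSE12649'."""
    if not (accession.startswith("GSE") and accession[3:].isdigit()):
        raise ValueError(f"not a GEO series accession: {accession!r}")
    return f"{GEO_QUERY}?{urlencode({'acc': accession})}"

geo_urls = {acc: geo_url(acc) for acc in ACCESSIONS}
for acc, url in geo_urls.items():
    print(f"{acc}: {url}")
```

The sketch only builds addresses and performs no network access; downloading and parsing the expression matrices is left to whichever GEO client the reader prefers.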

Funding

The work was jointly supported by the Zhejiang Provincial Natural Science Foundation of China (No. LY21F020017), the National Natural Science Foundation of China (No. U20A20386), the Science and Technology Program of Zhejiang Province (Nos. 2022C03043 and 2022C01016), and the Guangdong Basic and Applied Basic Research Foundation (No. 2022A1515110570).

Author information

Contributions

ZC: Methodology, software, visualization, writing – original draft. RG: Investigation, methodology, writing – review & editing, funding acquisition, project administration. CW: Methodology, formal analysis, supervision, funding acquisition. AE: Formal analysis, investigation, data curation, writing – review & editing. XF: Formal analysis, writing – review & editing, visualization. WM: Conceptualization, writing – review & editing. FQ: Conceptualization, formal analysis, writing – review & editing, funding acquisition. GJ: Data curation, writing – review & editing. XF: Investigation, data curation, writing – review & editing.

Corresponding author

Correspondence to Ruiquan Ge.

Ethics declarations

Competing interests

The authors declare no competing interests.

Ethical approval 

Not applicable.

Consent to participate

Not applicable.

Consent for publication

Not applicable.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary material 1 (DOCX 1433.3 kb)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Chen, Z., Ge, R., Wang, C. et al. Identification of important gene signatures in schizophrenia through feature fusion and genetic algorithm. Mamm Genome (2024). https://doi.org/10.1007/s00335-024-10034-7

  • DOI: https://doi.org/10.1007/s00335-024-10034-7
