Pattern analysis of genetics and genomics: a survey of the state-of-art
Abstract
The continual improvement and falling cost of complete human genome sequencing have led to rapid adoption of genetic and genomic information at both research institutions and clinics. Biologists are taking the first steps toward understanding the locations and functions of all the genes and regulatory sites in the genomes of various organisms. As these researchers determine the nucleotide sequence of large stretches of the human genome, they are generating very large volumes of sequence data. Direct laboratory investigation of these data is expensive and difficult, which makes computational techniques essential. The field of pattern analysis, which aims to build computer algorithms that improve with experience, has the potential to enable computers to assist humans in the analysis of large, complex genetic and genomic data sets. Here, an overview is presented of pattern analysis techniques for the study of genome sequencing data sets, as well as proteomic, epigenetic and metabolomic data. These techniques comprise data pre-processing, feature extraction and selection, classification and clustering. The aim of this survey is to present considerations and recurring challenges in the application of pattern analysis methods, including discriminative and generative modeling approaches, and to discuss future research directions of these methods for the analysis of genomic and genetic data sets.
Keywords
Genomic; Genetic; Pattern analysis; Pre-processing; Feature selection; Classification; Clustering
- 157.Medvedovic M, Sivaganesan S (2002) Bayesian infinite mixture model based clustering of gene expression profiles. Bioinformatics 18(9):1194–1206CrossRefGoogle Scholar
- 158.Mehrotra P (2016) Biosensors and their applications–a review. Journal of oral biology and craniofacial research 6(2):153–159CrossRefGoogle Scholar
- 159.Melo ALDA, Soccol VT, Soccol CR (2016) Bacillus thuringiensis: mechanism of action, resistance, and new applications: a review. Crit Rev Biotechnol 36(2):317–326CrossRefGoogle Scholar
- 160.Meng J, Zhang J, Luan Y (2015) Gene selection integrated with biological knowledge for plant stress response using neighborhood system and rough set theory. IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB) 12(2):433–444CrossRefGoogle Scholar
- 161.Min X, Wang H, Yang Z, Ge S, Zhang J, Shao N (2015) Relevant component locally linear embedding dimensionality reduction for gene expression data analysis. Metallurgical & Mining Industry 4:186–194Google Scholar
- 162.Moorthy K, Saberi Mohamad M, Deris S (2014) A review on missing value imputation algorithms for microarray gene expression data. Curr Bioinforma 9(1):18–22CrossRefGoogle Scholar
- 163.Murray SN, Walsh BP, Kelliher D, O'Sullivan DTJ (2014) Multi-variable optimization of thermal energy efficiency retrofitting of buildings using static modelling and genetic algorithms–a case study. Build Environ 75:98–107CrossRefGoogle Scholar
- 164.National Research Council. (1988). Mapping and sequencing the human genome. National Academies PressGoogle Scholar
- 165.Newton MA, Kendziorski CM, Richmond CS, Blattner FR, Tsui KW (2001) On differential variability of expression ratios: improving statistical inference about gene expression changes from microarray data. J Comput Biol 8(1):37–52CrossRefGoogle Scholar
- 166.Nilsson J (2006) Nonlinear dimensionality reduction of gene expression data. Centre for Mathematical Sciences, Lund UniversityGoogle Scholar
- 167.Nimmy SF, Sarowar MG, Dey N, Ashour AS, Santosh KC (2018) Investigation of DNA discontinuity for detecting tuberculosis. Journal of Ambient Intelligence and Humanized Computing 1–15Google Scholar
- 168.Njeunje FON, Czaja W, Benedetto JJ (2014) Linear and Non-linear Dimension Reduction Applied to Gene Expression Data of Cancer Tissue SamplesGoogle Scholar
- 169.Oba S, Sato MA, Takemasa I, Monden M, Matsubara KI, Ishii S (2003) A Bayesian missing value estimation method for gene expression profile data. Bioinformatics 19(16):2088–2096CrossRefGoogle Scholar
- 170.Oghabian A, Kilpinen S, Hautaniemi S, Czeizler E (2014) Biclustering methods: biological relevance and application in gene expression analysis. PLoS One 9(3):e90801CrossRefGoogle Scholar
- 171.Ogutu JO, Schulz-Streeck T, Piepho HP (2012) Genomic selection using regularized linear regression models: ridge regression, lasso, elastic net and their extensions. BMC Proc 6(2):1–6Google Scholar
- 172.Orsenigo C, Vercellis C (2013) Dimensionality reduction via isomap with lock-step and elastic measures for time series gene expression classification. In European Conference on Evolutionary Computation, Machine Learning and Data Mining in Bioinformatics Springer (Berlin, Heidelberg) 92–103Google Scholar
- 173.Ott J, Wang J, Leal SM (2015) Genetic linkage analysis in the age of whole-genome sequencing. Nat Rev Genet 16(5):275–284CrossRefGoogle Scholar
- 174.Palmer OMP, Rogers G, Yende S, Angus DC, Clermont G, Langston MA (2018) Graph theoretical analysis of genome-scale data: examination of gene activation occurring in the setting of community-acquired pneumonia. Shock 50(1):53–59CrossRefGoogle Scholar
- 175.Pan M, Zhang J (2018) Quantile normalization for combining gene-expression datasets. Biotechnology & Biotechnological Equipment 32(3):751–758MathSciNetCrossRefGoogle Scholar
- 176.Paradis E, Gosselin T, Goudet J, Jombart T, Schliep K (2017) Linking genomics and population genetics with R. Mol Ecol Resour 17(1):54–66CrossRefGoogle Scholar
- 177.Parikshak NN, Swarup V, Belgard TG, Irimia M, Ramaswami G, Gandal MJ, Harti C, Leppa V, Ubieta LT, Huang J, Lowe JK, Blencowe BJ, Horvath S, Geschwind DH (2016) Genome-wide changes in lncRNA, splicing, and regional gene expression patterns in autism. Nature 540(7633):423–427CrossRefGoogle Scholar
- 178.Parmigiani G, Garrett ES, Irizarry RA, Zeger SL (2003) The analysis of gene expression data: an overview of methods and software. In The analysis of gene expression data Springer (New York, NY) 1–45Google Scholar
- 179.Parry RM, Jones W, Stokes TH, Phan JH, Moffitt RA, Fang H, Shi L, Oberthuer A, Fischer M, Tong W, Wang MD (2010) K-nearest neighbor models for microarray gene expression analysis and clinical outcome prediction. The pharmacogenomics journal 10(4):292–309CrossRefGoogle Scholar
- 180.Perkins AD, Langston MA (2009) Threshold selection in gene co-expression networks using spectral graph theory techniques. In BMC bioinformatics 10 (11): S4Google Scholar
- 181.Petralia F, Wang P, Yang J, Tu Z (2015) Integrative random forest for gene regulatory network inference. Bioinformatics 31(12):197–205CrossRefGoogle Scholar
- 182.Pickett JA, Khan ZR (2016) Plant volatile-mediated signalling and its application in agriculture: successes and challenges. New Phytol 212(4):856–870CrossRefGoogle Scholar
- 183.Pillati M, Viroli C (2005) Locally linear embedding for nonlinear dimension reduction in classification problems: an application to gene expression data. Statistica 65(1):61–71MathSciNetzbMATHGoogle Scholar
- 184.Pillati M, Viroli C (2005) Supervised locally linear embedding for classification: an application to gene expression data analysis. In Proceedings of 29th Annual Conference of the German Classification Society 15–18Google Scholar
- 185.Prabhakaran S, Azizi E, Carr A, Pe’er D (2016) Dirichlet process mixture model for correcting technical variation in single-cell gene expression data. In International Conference on Machine Learning 1070–1079Google Scholar
- 186.Qiu X, Wu H, Hu R (2013) The impact of quantile and rank normalization procedures on the testing power of gene differential expression analysis. BMC bioinformatics 14(1):1–10CrossRefGoogle Scholar
- 187.Quang D, Xie X (2016) DanQ: a hybrid convolutional and recurrent deep neural network for quantifying the function of DNA sequences. Nucleic Acids Res 44(11):1–6CrossRefGoogle Scholar
- 188.Rajan K (2015) Materials informatics: the materials “gene” and big data. Annu Rev Mater Res 45:153–169CrossRefGoogle Scholar
- 189.Rajan K (2015) Materials informatics: the materials “gene” and big data. Annu Rev Mater Res 45:153–169CrossRefGoogle Scholar
- 190.Ramalho JS, Tolmachova T, Hume AN, McGuigan A, Gregory-Evans CY, Huxley C, Seabra MC (2001) Chromosomal mapping, gene structure and characterization of the human and murine RAB27B gene. BMC Genet 2(1)Google Scholar
- 191.Ray SS, Ganivada A, Pal SK (2016) A granular self-organizing map for clustering and gene selection in microarray data. IEEE transactions on neural networks and learning systems 27(9):1890–1906MathSciNetCrossRefGoogle Scholar
- 192.Reverter F, Vegas E, Oller JM (2014) Kernel-PCA data integration with enhanced interpretability. BMC Syst Biol 8(2):1–9Google Scholar
- 193.Ritchie ME, Silver J, Oshlack A, Holmes M, Diyagama D, Holloway A, Smyth GK (2007) A comparison of background correction methods for two-colour microarrays. Bioinformatics 23(20):2700–2707CrossRefGoogle Scholar
- 194.Ritchie MD, Holzinger ER, Li R, Pendergrass SA, Kim D (2015) Methods of integrating data to uncover genotype–phenotype interactions. Nat Rev Genet 16(2):85–97CrossRefGoogle Scholar
- 195.Robinson MD, McCarthy DJ, Smyth GK (2010) edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26(1):139–140CrossRefGoogle Scholar
- 196.Rocke DM, Durbin B (2003) Approximate variance-stabilizing transformations for gene-expression microarray data. Bioinformatics 19(8):966–972CrossRefGoogle Scholar
- 197.Rodríguez-Rodríguez J, Sevilla A, Martínez-Bazán C, Gordillo JM (2015) Generation of microbubbles with applications to industry and medicine. Annu Rev Fluid Mech 47:405–429MathSciNetCrossRefGoogle Scholar
- 198.Roffler GH, Schwartz MK, Pilgrim KL, Talbot SL, Sage GK, Adams LG, Luikart G (2016) Identification of landscape features influencing gene flow: how useful are habitat selection models? Evol Appl 9(6):805–817CrossRefGoogle Scholar
- 199.Romualdi C, Campanaro S, Campagna D, Celegato B, Cannata N, Toppo S, Valle G, Lanfranchi G (2003) Pattern recognition in gene expression profiling using DNA array: a comparative study of different statistical methods applied to cancer classification. Hum Mol Genet 12(8):823–836CrossRefGoogle Scholar
- 200.Ruiz R, Riquelme JC, Aguilar-Ruiz JS (2006) Incremental wrapper-based gene selection from microarray data for cancer classification. Pattern Recogn 39(12):2383–2392CrossRefGoogle Scholar
- 201.Rupp R, Mucha S, Larroque H, McEwan J, Conington J (2016) Genomic application in sheep and goat breeding. Animal Frontiers 6(1):39–44CrossRefGoogle Scholar
- 202.Ryman N (2006) Chifish: a computer program testing for genetic heterogeneity at multiple loci using chi-square and Fisher's exact test. Mol Ecol Notes 6(1):285–287CrossRefGoogle Scholar
- 203.Saelens W, Cannoodt R, Saeys Y (2018) A comprehensive evaluation of module detection methods for gene expression data. Nat Commun 9(1):1–12CrossRefGoogle Scholar
- 204.Saghir H, Megherbi DB (2013) An efficient comparative machine learning-based metagenomics binning technique via using Random forest. In IEEE International Conference on Computational Intelligence and Virtual Environments for Measurement Systems and Applications (CIVEMSA) 191–196Google Scholar
- 205.Salleh AHM, Mohamad MS, Deris S, Omatu S, Fdez-Riverola F, Corchado JM (2015) Gene knockout identification for metabolite production improvement using a hybrid of genetic ant colony optimization and flux balance analysis. Biotechnol Bioprocess Eng 20(4):685–693CrossRefGoogle Scholar
- 206.Saul LK, Weinberger KQ, Ham JH, Sha F, Lee DD (2006) Spectral methods for dimensionality reduction. Semisupervised learning:293–308Google Scholar
- 207.Schmitt P, Mandel J, Guedj M (2015) A comparison of six methods for missing data imputation. Journal of Biometrics & Biostatistics 6(1):1–6Google Scholar
- 208.Seno A, Kasai T, Ikeda M, Vaidyanath A, Masuda J, Mizutani A, Murakami H, Ishikawa T, Seno M (2016) Characterization of gene expression patterns among artificially developed cancer stem cells using spherical self-organizing map. Cancer informatics 15, CIN-S39839Google Scholar
- 209.Sewer A, Gubian S, Kogel U, Veljkovic E, Han W, Hengstermann A, Peitsch MC, Hoeng J (2014) Assessment of a novel multi-array normalization method based on spike-in control probes suitable for microRNA datasets with global decreases in expression. BMC research notes 7(1):1–18CrossRefGoogle Scholar
- 210.Shabani M, Borry P (2015) Challenges of web-based personal genomic data sharing. Life sciences, society and policy 11(1):1–13CrossRefGoogle Scholar
- 211.Shamir R, Sharan R (2002) Algorithmic approaches to clustering gene expression data. Current Topics in Computational Molecular Biology 269Google Scholar
- 212.Sharbaf FV, Mosafer S, Moattar MH (2016) A hybrid gene selection approach for microarray data classification using cellular learning automata and ant colony optimization. Genomics 107(6):231–238CrossRefGoogle Scholar
- 213.Shehu A, De Jong KA (2014) Evolutionary search algorithms for protein modeling: from de novo structure prediction to comprehensive maps of functionally-relevant structures of protein chains and assemblies. In Proceedings of the ACM Companion Publication of the 2014 Annual Conference on Genetic and Evolutionary Computation 839–856Google Scholar
- 214.Sherlock G (2000) Analysis of large-scale gene expression data. Curr Opin Immunol 12(2):201–205CrossRefGoogle Scholar
- 215.Shimada K, Nakamura M, Ishida E, Higuchi T, Yamamoto H, Tsujikawa K, Konishi N (2008) Prostate cancer antigen-1 contributes to cell survival and invasion though discoidin receptor 1 in human prostate cancer. Cancer Sci 99(1):39–45Google Scholar
- 216.Shreem SS, Abdullah S, Nazri MZA (2014) Hybridising harmony search with a Markov blanket for gene selection problems. Inf Sci 258:108–121MathSciNetCrossRefGoogle Scholar
- 217.Simerska P, Moyle PM, Toth I (2011) Modern lipid-, carbohydrate-, and peptide-based delivery systems for peptide, vaccine, and gene products. Med Res Rev 31(4):520–547CrossRefGoogle Scholar
- 218.Simko I (2016) High-resolution DNA melting analysis in plant research. Trends Plant Sci 21(6):528–537CrossRefGoogle Scholar
- 219.Singh D, al e (2002) Gene expression correlates of clinical prostate cancer behavior. Cancer Cell 1(2):203–209CrossRefGoogle Scholar
- 220.Slonim DK (2002) From patterns to pathways: gene expression data analysis comes of age. Nat Genet 32:502–508CrossRefGoogle Scholar
- 221.Southern EM (1992) Genome mapping: cDNA approaches. Curr Opin Genet Dev 2(3):412–416CrossRefGoogle Scholar
- 222.Steiner L, Hopp L, Wirth H, Galle J, Binder H, Prohaska SJ, Rohlf T (2012) A global genome segmentation method for exploration of epigenetic patterns. PLoS One 7(10)Google Scholar
- 223.Sun S, Peng Q, Shakoor A (2014) A kernel-based multivariate feature selection method for microarray data classification. PloS one 9(7)Google Scholar
- 224.Tabakhi S, Najafi A, Ranjbar R, Moradi P (2015) Gene selection for microarray data classification using a novel ant colony optimization. Neurocomputing 168:1024–1036CrossRefGoogle Scholar
- 225.Tan AC, Gilbert D (2003) Ensemble machine learning on gene expression data for cancer classificationGoogle Scholar
- 226.Tang EK, Suganthan PN, Yao X (2006) Gene selection algorithms for microarray data based on least squares support vector machine. BMC bioinformatics 7(1):95CrossRefGoogle Scholar
- 227.Tang H, Jiang X, Wang X, Wang S, Sofia H, Fox D, Lauter K, Malin B, Telenti A, Xiong L, Ohno-Machado L (2016) Protecting genomic data analytics in the cloud: state of the art and opportunities. BMC Med Genet 9(1):1–9Google Scholar
- 228.Tibshirani R, Hastie T, Narasimhan B, Chu G (2002) Diagnosis of multiple cancer types by shrunken centroids of gene expression. Proc Natl Acad Sci 99(10):6567–6572CrossRefGoogle Scholar
- 229.Tran LH, Tran LH (2017) Applications of (SPARSE)-PCA and LAPLACIAN EIGENMAPS to biological network inference problem using gene expression data. International Journal of Advances in Soft Computing & Its Applications 9(2):45–62Google Scholar
- 230.Troyanskaya O, Cantor M, Sherlock G, Brown P, Hastie T, Tibshirani R, Botstein D, Altman RB (2001) Missing value estimation methods for DNA microarrays. Bioinformatics 17(6):520–525CrossRefGoogle Scholar
- 231.Tuikkala J, Elo LL, Nevalainen OS, Aittokallio T (2008) Missing value imputation improves clustering and interpretation of gene expression microarray data. BMC bioinformatics 9(1):1–14CrossRefGoogle Scholar
- 232.Tutz G, Ramzan S (2015) Improved methods for the imputation of missing data by nearest neighbor methods. Computational Statistics & Data Analysis 90:84–99MathSciNetzbMATHCrossRefGoogle Scholar
- 233.Uğuz H (2011) A two-stage feature selection method for text categorization by using information gain, principal component analysis and genetic algorithm. Knowl-Based Syst 24(7):1024–1032CrossRefGoogle Scholar
- 234.van Dijk D, Nainys J, Sharma R, Kathail P, Carr AJ, Moon KR, Mazutis L, Wolf G, Krishnaswamy S, Pe'er D (2017) MAGIC: A diffusion-based imputation method reveals gene-gene interactions in single-cell RNA-sequencing data. BioRxiv Google Scholar
- 235.Venna J, Peltonen J, Nybo K, Aidos H, Kaski S (2010) Information retrieval perspective to nonlinear dimensionality reduction for data visualization. J Mach Learn Res 11(Feb):451–490MathSciNetzbMATHGoogle Scholar
- 236.Vepakomma P, Elgammal A (2016) A fast algorithm for manifold learning by posing it as a symmetric diagonally dominant linear system. Appl Comput Harmon Anal 40(3):622–628MathSciNetzbMATHCrossRefGoogle Scholar
- 237.Vidaki A, Johansson C, Giangasparo F, Court DS (2017) Differentially methylated embryonal Fyn-associated substrate (EFS) gene as a blood-specific epigenetic marker and its potential application in forensic casework. Forensic Science International: Genetics 29:165–173CrossRefGoogle Scholar
- 238.Vohradsky J (2001) Neural network model of gene expression. FASEB J 15(3):846–854CrossRefGoogle Scholar
- 239.Wang H, van der Laan MJ (2011) Dimension reduction with gene expression data using targeted variable importance measurement. BMC bioinformatics 12(1):1–12CrossRefGoogle Scholar
- 240.Wang Z, Li G, Robinson RW, Huang X (2016) UniBic: sequential row-based biclustering algorithm for analysis of gene expression data. Sci Rep 6:1–10CrossRefGoogle Scholar
- 241.Wang A, An N, Yang J, Chen G, Li L, Alterovitz G (2017) Wrapper-based gene selection with Markov blanket. Comput Biol Med 81:11–23CrossRefGoogle Scholar
- 242.Westcott SL, Schloss PD (2015) De novo clustering methods outperform reference-based methods for assigning 16S rRNA gene sequences to operational taxonomic units. PeerJ 3:e1487CrossRefGoogle Scholar
- 243.Willems E, Leyns L, Vandesompele J (2008) Standardization of real-time PCR gene expression data from independent biological replicates. Anal Biochem 379(1):127–129CrossRefGoogle Scholar
- 244.Wilson A, Fenton B, Malloch G, Boag B, Hubbard S, Begg G (2016) Urbanisation versus agriculture: a comparison of local genetic diversity and gene flow between wood mouse Apodemus sylvaticus populations in human-modified landscapes. Ecography 39(1):87–97CrossRefGoogle Scholar
- 245.Wong MH, Mutch DM, McNicholas PD (2017) Two-way learning with one-way supervision for gene expression data. BMC bioinformatics 18(1):150CrossRefGoogle Scholar
- 246.Xu Y, Olman V, Xu D (2002) Clustering gene expression data using a graph-theoretic approach: an application of minimum spanning trees. Bioinformatics 18(4):536–545CrossRefGoogle Scholar
- 247.Xu R, Damelin S, Wunsch DC (2007) Applications of diffusion maps in gene expression data-based cancer diagnosis analysis. In IEEE 29th annual international conference of Engineering in medicine and biology society 4613–4616Google Scholar
- 248.Xu J, Mu H, Wang Y, Huang F (2018) Feature genes selection using supervised locally linear embedding and correlation coefficient for microarray classification. Computational and mathematical methods in medicine 2018(5490513):1–11Google Scholar
- 249.Xuan P, Guo MZ, Wang J, Wang CY, Liu XY, Liu Y (2011) Genetic algorithm-based efficient feature selection for classification of pre-miRNAs. Genet Mol Res 10(2):588–603CrossRefGoogle Scholar
- 250.Yang YH, Buckley MJ, Dudoit S, Speed TP (2002) Comparison of methods for image analysis on cDNA microarray data. J Comput Graph Stat 11(1):108–136MathSciNetCrossRefGoogle Scholar
- 251.Yang Y, Xie B, Yan J (2014) Application of next-generation sequencing technology in forensic science. Genomics, proteomics & bioinformatics 12(5):190–197CrossRefGoogle Scholar
- 252.Ye J, Li T, Xiong T, Janardan R (2004) Using uncorrelated discriminant analysis for tissue classification with gene expression data. IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB) 1(4):181–190CrossRefGoogle Scholar
- 253.Yeung KY, Haynor DR, Ruzzo WL (2001) Validating clustering for gene expression data. Bioinformatics 17(4):309–318CrossRefGoogle Scholar
- 254.Yeung KY, Fraley C, Murua A, Raftery AE, Ruzzo WL (2001) Model-based clustering and data transformations for gene expression data. Bioinformatics 17(10):977–987CrossRefGoogle Scholar
- 255.Yu Z, Wong HS, Wang H (2007) Graph-based consensus clustering for class discovery from gene expression data. Bioinformatics 23(21):2888–2896CrossRefGoogle Scholar
- 256.Yuan B, Zhang C, Shao X (2015) A late acceptance hill-climbing algorithm for balancing two-sided assembly lines with multiple constraints. J Intell Manuf 26(1):159–168CrossRefGoogle Scholar
- 257.Zamani-Dahaj SA, Okasha M, Kosakowski J, Higgs PG (2016) Estimating the frequency of horizontal gene transfer using phylogenetic models of gene gain and loss. Mol Biol Evol 33(7):1843–1857CrossRefGoogle Scholar
- 258.Zeng T, Li R, Mukkamala R, Ye J, Ji S (2015) Deep convolutional neural networks for annotating gene expression patterns in the mouse brain. BMC bioinformatics 16(1):1–10CrossRefGoogle Scholar
- 259.Zhang S, Chen S, Li W, Guo X, Zhao P, Xu J, Chen Y, Pan Q, Liu X, Lu H, Wang Y, Pei D, Esteban MA (2011) Rescue of ATP7B function in hepatocyte-like cells from Wilson's disease induced pluripotent stem cells using gene therapy or the chaperone drug curcumin. Hum Mol Genet 20(16):3176–3187CrossRefGoogle Scholar
- 260.Zhang L, Qian L, Ding C, Zhou W, Li F (2015) Similarity-balanced discriminant neighbor embedding and its application to cancer classification based on gene expression data. Comput Biol Med 64:236–245CrossRefGoogle Scholar
- 261.Zhu Z, Ong YS, Dash M (2007) Markov blanket-embedded genetic algorithm for gene selection. Pattern Recogn 40(11):3236–3248zbMATHCrossRefGoogle Scholar
- 262.Zou Q, Zeng J, Cao L, Ji R (2016) A novel features ranking metric with application to scalable visual and bioinformatics data classification. Neurocomputing 173:346–354CrossRefGoogle Scholar