Abstract
Studying original gene expression datasets is one of the essential methods for analyzing biological processes. Many platforms have been developed for this kind of study, such as GSEA and the online gene list analysis portal Metascape. However, these well-known platforms are sometimes not friendly enough for inexperienced users, for the following reasons. First, many biological experiments have only three replicates, which makes classical statistical methods inefficient and inaccurate. Second, different experiments can produce different gene expression profiles, and standard methods for identifying differentially expressed genes still have room for improvement. Third, many platforms work only for specific experimental conditions under their default parameters, and users cannot easily set parameters for their own studies. In this study, we designed a comprehensive and flexible gene expression data analysis tool, in which six novel differentially expressed gene identification methods and three functional enrichment analysis methods are proposed. Most parameters can be set easily by users, and a variety of algorithms can be selected according to the user's own study design. Experiments show that our platform provides an effective way to perform gene set series analysis and performs well in both practicality and convenience.
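To make the two analysis stages named in the abstract concrete, the following is a minimal sketch of a generic baseline pipeline: a fold-change filter over three replicates per condition, followed by a one-sided hypergeometric over-representation test. It is not one of the six identification methods or three enrichment methods proposed in the paper, whose details are not given here. The function names, the pseudocount, the cutoff of 1.0, the gene names, and the toy expression values are all assumptions made for illustration.

```python
import math
from statistics import mean


def log2_fold_change(control, treated, pseudocount=1.0):
    """log2 ratio of mean treated expression to mean control expression.

    The pseudocount (an assumption of this sketch) keeps the ratio finite
    when a gene has zero expression in one condition.
    """
    return math.log2((mean(treated) + pseudocount) / (mean(control) + pseudocount))


def enrichment_p_value(n_background, n_in_term, n_selected, n_overlap):
    """Upper-tail hypergeometric probability P(X >= n_overlap).

    X is the number of term-annotated genes among n_selected genes drawn
    without replacement from n_background genes, n_in_term of which carry
    the annotation.
    """
    total = math.comb(n_background, n_selected)
    upper = min(n_in_term, n_selected)
    tail = sum(
        math.comb(n_in_term, k) * math.comb(n_background - n_in_term, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical toy data: gene -> (control replicates, treated replicates).
expression = {
    "geneA": ([10.0, 12.0, 11.0], [45.0, 50.0, 46.0]),
    "geneB": ([20.0, 19.0, 21.0], [21.0, 20.0, 22.0]),
    "geneC": ([30.0, 33.0, 30.0], [7.0, 6.0, 8.0]),
    "geneD": ([5.0, 6.0, 4.0], [5.0, 5.0, 6.0]),
}
term_genes = {"geneA", "geneC"}  # hypothetical functional annotation

# Keep genes whose absolute log2 fold change reaches the (assumed) cutoff.
selected = {
    gene
    for gene, (control, treated) in expression.items()
    if abs(log2_fold_change(control, treated)) >= 1.0
}

# Test whether the annotated genes are over-represented among the selected ones.
p_value = enrichment_p_value(
    n_background=len(expression),
    n_in_term=len(term_genes),
    n_selected=len(selected),
    n_overlap=len(selected & term_genes),
)
print(sorted(selected), p_value)
```

With only three replicates per condition, a pure fold-change cutoff like this ignores within-group variance, which is one reason small-sample experiments motivate alternative identification methods.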
Acknowledgement
This work was supported by the National Natural Science Foundation of China under Grant Nos. 61972320, 61772426, 61702161, 61702420, 61702421, and 61602386, the Education and Teaching Reform Research Project of Northwestern Polytechnical University under Grant No. 2020JGY23, the Fundamental Research Funds for the Central Universities under Grant No. 3102019DX1003, the Key Research and Development and Promotion Program of Henan Province of China under Grant No. 182102210213, the Key Research Fund for Higher Education of Henan Province of China under Grant No. 18A520003, and the Top International University Visiting Program for Outstanding Young Scholars of Northwestern Polytechnical University.
Copyright information
© 2020 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Chen, B., Wang, C., Gao, L., Shang, X. (2020). A Flexible and Comprehensive Platform for Analyzing Gene Expression Data. In: Han, H., Wei, T., Liu, W., Han, F. (eds) Recent Advances in Data Science. IDMB 2019. Communications in Computer and Information Science, vol 1099. Springer, Singapore. https://doi.org/10.1007/978-981-15-8760-3_12
DOI: https://doi.org/10.1007/978-981-15-8760-3_12
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-8759-7
Online ISBN: 978-981-15-8760-3
eBook Packages: Computer Science, Computer Science (R0)