MetaboNexus is an interactive metabolomics data analysis platform that integrates pre-processing of raw peak data with in-depth statistical analysis and metabolite identity search. It is designed as a desktop application, so large files need not be uploaded to web servers. This can speed up data analysis because server queries and queues are avoided, and it keeps confidential clinical data on a local computer. With MetaboNexus, users can progress from data pre-processing through multi- and univariate analysis to metabolite identity search of significant molecular features, integrating the critical steps of metabolite biomarker discovery in one tool. Data exploration can first be performed using principal component analysis, while prediction and variable importance can be calculated using partial least squares-discriminant analysis and Random Forest. After identifying putative features from multi- and univariate analyses (e.g. t test, ANOVA, Mann–Whitney U test and Kruskal–Wallis test), users can determine the molecular identity of these features. To assist data interpretation, MetaboNexus also automatically generates graphical outputs, such as score plots, diagnostic plots, boxplots, receiver operating characteristic plots and heatmaps. The metabolite search function matches the mass spectrometric peak data against three major metabolite repositories, namely HMDB, MassBank and METLIN, using a comprehensive range of molecular adducts. Biological pathways can also be searched within MetaboNexus. MetaboNexus is available with an installation guide and tutorial at http://www.sph.nus.edu.sg/index.php/research-services/research-centres/ceohr/metabonexus, and runs on the Windows operating system, XP and later (preferably 64-bit).
In summary, MetaboNexus is a desktop-based platform that integrates the entire data analysis workflow and further provides putative identities for mass spectrometric data peaks by matching them to databases.
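The adduct-based matching mentioned above can be illustrated with a minimal sketch. It is not MetaboNexus's own implementation: the function name `match_peak`, the three-entry candidate table, the short adduct list and the 10 ppm tolerance are all assumptions made for illustration. The idea is to subtract each adduct's mass shift from the observed m/z and compare the resulting neutral mass with database masses within a ppm tolerance.

```python
from typing import Dict, List, Tuple

# Mass shifts (Da) for a few singly charged adducts.
# Illustrative subset only; a real search would use a much longer list.
ADDUCT_SHIFTS: Dict[str, float] = {
    "[M+H]+": 1.007276,
    "[M+Na]+": 22.989218,
    "[M-H]-": -1.007276,
}

# Hypothetical mini-database: name -> neutral monoisotopic mass (Da).
# A real search would query repositories such as HMDB, MassBank or METLIN.
CANDIDATES: Dict[str, float] = {
    "D-Glucose": 180.06339,
    "Caffeine": 194.08038,
    "L-Alanine": 89.04768,
}


def match_peak(
    mz: float,
    candidates: Dict[str, float],
    adducts: Dict[str, float],
    tol_ppm: float = 10.0,
) -> List[Tuple[str, str, float]]:
    """Return (name, adduct, ppm error) for every candidate within tolerance."""
    hits = []
    for adduct, shift in adducts.items():
        # Neutral mass implied by this adduct (singly charged ions assumed).
        neutral = mz - shift
        for name, mass in candidates.items():
            ppm = (neutral - mass) / mass * 1e6
            if abs(ppm) <= tol_ppm:
                hits.append((name, adduct, ppm))
    # Best (smallest absolute error) matches first.
    return sorted(hits, key=lambda h: abs(h[2]))


# Example: a peak observed at m/z 181.0707.
hits = match_peak(181.0707, CANDIDATES, ADDUCT_SHIFTS)
for name, adduct, ppm in hits:
    print(f"{name} as {adduct}: {ppm:+.2f} ppm")
```

In practice the adduct list would be restricted to the polarity of the acquisition mode, and every hit remains a putative identity that needs confirmation, for example by retention time or fragmentation spectra.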
Keywords: Metabolomics; Raw data pre-processing; Statistical analysis; Software; Mass spectrometry; Feature selection
SM Huang was supported by an NUS Research Scholarship. We thank the Singapore National Medical Research Council (NMRC 1242/2009 WBS R-608-000-034-213) and the NUS Environmental Research Institute (NERI) for the support of this work.