Abstract
Despite the rapid development of statistical data processing, computing capabilities, chromatography–mass spectrometry, and omics technologies (those based on genomics, transcriptomics, proteomics, and metabolomics) in recent decades, no unified protocol for untargeted profiling has emerged. Systematic errors reduce the reproducibility and reliability of the results and hinder the consolidation and analysis of data from large-scale multiday experiments. We propose an algorithm for omics profiling that identifies potential markers in samples of complex composition, and present a case study of urine samples from different clinical groups of patients. Profiling was carried out by liquid chromatography–mass spectrometry. The markers were selected using multivariate analysis methods, including machine learning and feature selection. The approach was tested on an independent dataset by clustering and projection onto principal components.
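The final testing step, projecting an independent dataset onto principal components, can be illustrated with a minimal sketch. This is not the authors' implementation: the intensity values, the three "selected features", and all function names below are hypothetical, and the first principal component is estimated by simple power iteration using only the Python standard library. The point it shows is that the independent samples are centered and projected with the means and loading learned from the training set alone.

```python
import math

def column_means(rows):
    """Mean of each feature (column) across samples (rows)."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def center(rows, means):
    """Subtract the given feature means from every sample."""
    return [[x - m for x, m in zip(row, means)] for row in rows]

def covariance(centered):
    """Sample covariance matrix of already-centered data."""
    n, p = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
            for i in range(p)]

def first_component(cov, iterations=200):
    """Estimate the leading eigenvector of cov by power iteration.

    Assumes the first feature has non-zero variance, so that the start
    vector is not orthogonal to the leading eigenvector.
    """
    p = len(cov)
    v = [1.0] + [0.0] * (p - 1)
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def project(rows, means, loading):
    """Score samples on one component using training means and loading."""
    return [sum(x * l for x, l in zip(row, loading))
            for row in center(rows, means)]

# Hypothetical log-intensities of three selected features.
# The first three training samples mimic one clinical group, the last three another.
train = [
    [10.1, 5.0, 7.2],
    [9.8, 5.2, 7.0],
    [10.3, 4.9, 7.1],
    [6.1, 8.9, 7.0],
    [5.9, 9.2, 7.2],
    [6.2, 9.0, 6.9],
]
# Hypothetical independent samples, one resembling each group.
test = [
    [10.0, 5.1, 7.1],
    [6.0, 9.1, 7.0],
]

means = column_means(train)
loading = first_component(covariance(center(train, means)))
scores = project(test, means, loading)
print(scores)
```

If the selected features carry group information, independent samples from different groups should fall on opposite sides of the first component, as they do for these toy values. A real workflow would use an established PCA implementation and a proper clustering method rather than this hand-rolled version.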
Funding
This work was supported by the Russian Foundation for Basic Research (grant no.: Aspiranty 19-33-90071).
Ethics declarations
The authors declare that there is no conflict of interest.
Cite this article
Plyushchenko, I.V., Shakhmatov, D.G. & Rodin, I.A. Algorithm of Combining Chromatography–Mass Spectrometry Untargeted Profiling and Multivariate Analysis for Identification of Marker Substances in Samples of Complex Composition. Inorg Mater 57, 1397–1403 (2021). https://doi.org/10.1134/S0020168521140089
DOI: https://doi.org/10.1134/S0020168521140089