Big Data in Modern Chemical Analysis

  • REVIEWS
Journal of Analytical Chemistry

Abstract

This review covers scientific publications on the acquisition and use of big data in modern analytical chemistry. Such data are characterized by considerable volume, flow, and variety. They are generated and processed in the analysis of biological and other samples by chromatography and mass spectrometry. Big data obtained by these techniques enable multianalyte sample analysis, although the performance of detection, identification, and quantification is not satisfactory for all analytes. The use of simple analytical systems can also lead to the accumulation of large volumes of data. A huge body of information is held in large chemical databases, which are indispensable in non-target analysis. Candidates for identification are selected with regard to the prevalence (citation rate) of chemicals, and identification relies on reference mass spectral libraries. Methods of data processing, analysis, and presentation (statistics, chemometrics) evolve as the volume of information grows. The technical characteristics of computers and their networks are improving at an accelerating rate, which creates potential for new methods of data analysis and opens new possibilities for interlaboratory cooperation.
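
The candidate selection and library matching mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure: the function names, the `top_n` cutoff, the dictionary layout, and the use of plain cosine similarity as the match score are all assumptions made for illustration, and any compound names or citation counts passed in are hypothetical.

```python
import math


def cosine_similarity(spec_a, spec_b):
    """Cosine similarity of two spectra given as {m/z: intensity} dicts."""
    shared = set(spec_a) & set(spec_b)
    dot = sum(spec_a[mz] * spec_b[mz] for mz in shared)
    norm_a = math.sqrt(sum(v * v for v in spec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in spec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty spectrum matches nothing
    return dot / (norm_a * norm_b)


def rank_candidates(candidates, query_spectrum, library, top_n=2):
    """Preselect the top_n most cited candidates, then sort them by spectral match.

    candidates: list of {"name": str, "citations": int} (hypothetical layout)
    library:    {name: {m/z: intensity}} reference spectra (hypothetical layout)
    """
    # Step 1: prevalence filter, keeping the most frequently cited chemicals.
    preselected = sorted(
        candidates, key=lambda c: c["citations"], reverse=True
    )[:top_n]
    # Step 2: score each remaining candidate against its reference spectrum.
    scored = [
        (c["name"], cosine_similarity(query_spectrum, library.get(c["name"], {})))
        for c in preselected
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

In this sketch the prevalence filter runs first, so a rarely cited compound is dropped even when its reference spectrum would match well. That ordering is a design choice of the example, not something stated in the abstract.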

Notes

  1. “Information” and “data” are often treated as synonyms. More precisely, “data” are the “raw material”, and “information” is processed data (see, for example, [2]).

REFERENCES

  1. Eckschlager, K. and Danzer, K., Information Theory in Analytical Chemistry, New York: Wiley, 1994.

  2. Harris, J., Data, Information, and Knowledge Management. http://www.ocdqblog.com/home/data-information-and-knowledge-management.html. Accessed October 1, 2018.

  3. Williams, A.J. and Pence, H.E., Chem. Int., 2017, vol. 39, no. 3, p. 9.

  4. May, J.C. and McLean, J.A., Annu. Rev. Anal. Chem., 2016, vol. 9, p. 387.

  5. Szymańska, E., Anal. Chim. Acta, 2018, vol. 1028, p. 1.

  6. Tauler, R. and Parastar, H., Angew. Chem., Int. Ed. Engl., 2018. https://doi.org/10.1002/anie.201801134

  7. Kalidindi, S.R. and De Graef, M., Annu. Rev. Mater. Res., 2015, vol. 45, p. 171.

  8. Andreu-Perez, J., Poon, C.C., Merrifield, R.D., Wong, S.T., and Yang, G.Z., IEEE J. Biomed. Health Inf., 2015, vol. 19, no. 4, p. 1193.

  9. Pence, H.E. and Williams, A.J., J. Chem. Educ., 2016, vol. 93, no. 3, p. 504.

  10. Chiang, L., Lu, B., and Castillo, I., Annu. Rev. Chem. Biomol. Eng., 2017, vol. 8, p. 63.

  11. Haug, K., Salek, R.M., and Steinbeck, C., Curr. Opin. Chem. Biol., 2017, vol. 36, p. 58.

  12. Pluskal, T. and Yanagida, M., Cold Spring Harbor Protocols, 2016, vol. 2016, no. 12, p. 1044.

  13. Veselkov, K., Sleeman, J., Claude, E., Vissers, J.P., Galea, D., Mroz, A., Laponogov, I., Towers, M., Tong, R., Mirnezami, R., Takats, Z., Nicholson, J., and Langridge, J.I., Sci. Rep., 2018, vol. 8, no. 1, p. 4053.

  14. Wright, D.A., US Patent 8935101, 2015.

  15. Michalski, A., Cox, J., and Mann, M., J. Proteome Res., 2011, vol. 10, no. 4, p. 1785.

  16. Apte, J.S., Messier, K.P., Gani, S., Brauer, M., Kirchstetter, T.W., Lunden, M.M., Marshall, J.D., Portier, C.J., Vermeulen, R.C.H., and Hamburg, S.P., Environ. Sci. Technol., 2017, vol. 51, no. 12, p. 6999.

  17. Bandodkar, A.J., Jeerapan, I., and Wang, J., ACS Sens., 2016, vol. 1, no. 5, p. 464.

  18. Koydemir, H.C. and Ozcan, A., Annu. Rev. Anal. Chem., 2018, vol. 11, no. 1, p. 127.

  19. Mil’man, B.L. and Zhurkovich, I.K., Analitika (Analytics), 2017, no. 5, p. 30.

  20. CAS content. http://www.cas.org/about/cas-content. Accessed July 31, 2019.

  21. Substance identity in REACH. EU Final Report, 2016. https://op.europa.eu/en/publication-detail/-/publication/b31a7b23-b544-11e7-837e-01aa75ed71a1/language-en. Accessed October 2, 2018.

  22. PubChem. https://pubchem.ncbi.nlm.nih.gov/search. Accessed July 31, 2019.

  23. ChemSpider. http://www.chemspider.com. Accessed July 31, 2019.

  24. ZINC15. http://zinc15.docking.org. Accessed July 31, 2019.

  25. Milman, B.L. and Zhurkovich, I.K., TrAC, Trends Anal. Chem., 2017, vol. 97, p. 179.

  26. Milman, B.L. and Kovrizhnych, M.A., Fresenius’ J. Anal. Chem., 2000, vol. 367, no. 7, p. 629.

  27. Milman, B.L., Anal. Chem., 2002, vol. 74, no. 7, p. 1484.

  28. Milman, B.L., J. Chem. Inf. Model., 2005, vol. 45, no. 5, p. 1153.

  29. Mil’man, B.L., Vvedenie v khimicheskuyu identifikatsiyu (Introduction to Chemical Identification), St. Petersburg: VVM, 2008.

  30. Milman, B.L., Chemical Identification and Its Quality Assurance, Berlin: Springer, 2011.

  31. How many proteins exist in human body? http://www.innovateus.net/health/how-many-proteins-exist-human-body. Accessed October 2, 2018.

  32. Mass spectral libraries (NIST 17 and Wiley libraries). http://www.sisweb.com/software/ms/wiley.htm. Accessed October 2, 2018.

  33. Guijas, C., Montenegro-Burke, J.R., Domingo-Almenara, X., Palermo, A., Warth, B., Hermann, G., Koellensperger, G., Huan, T., Uritboonthai, W., Aisporna, A.E., Wolan, D.W., Spilker, M.E., Benton, H.P., and Siuzdak, G., Anal. Chem., 2018, vol. 90, no. 5, p. 3156.

  34. The Global Natural Product Social Molecular Networking (GNPS). https://gnps.ucsd.edu/ProteoSAFe/gnpslibrary.jsp?library=all. Accessed October 3, 2018.

  35. MONA, MassBank of North America. http://mona.fiehnlab.ucdavis.edu. Accessed November 17, 2018.

  36. MassBank. https://massbank.eu/MassBank. Accessed October 3, 2018.

  37. Spectral Database for Organic Compounds. https://sdbs.db.aist.go.jp/sdbs/cgi-bin/cre_index.cgi. Accessed November 17, 2018.

  38. HighChem Spectral Tree. http://www.highchem.com/index.php/81-massfrontier. Accessed November 17, 2018.

  39. PeptideAtlas Overview. http://www.peptideatlas.org/overview.php. Accessed October 3, 2018.

  40. X!HUNTER Annotated Spectrum Library. http://thegpm.org/HUNTER/index.html. Accessed October 3, 2018.

  41. Griss, J., Foster, J.M., Hermjakob, H., and Vizcaino, J.A., Nat. Methods, 2013, vol. 10, no. 2, p. 95.

  42. NIST Libraries of Peptide Tandem Mass Spectra. https://chemdata.nist.gov/dokuwiki/doku.php?id=peptidew:start. Accessed November 17, 2018.

  43. Kind, T., Tsugawa, H., Cajka, T., Ma, Y., Lai, Z., Mehta, S.S., Wohlgemuth, G., Barupal, D.K., Showalter, M.R., Arita, M., and Fiehn, O., Mass Spectrom. Rev., 2017, vol. 37, no. 4, p. 513.

  44. Blaženović, I., Kind, T., Ji, J., and Fiehn, O., Metabolites, 2018, vol. 8, no. 2, p. 31.

  45. Dumancas, G.G., Bello, G.A., Hughes, J., Murimi, R., Viswanath, L.C., Orndorff, C.O., Dumancas, G.F., and Dell, J.D., in Handbook of Research on Big Data Storage and Visualization Techniques, Segall, R. and Cook, J., Eds., Hershey, PA: IGI Global, 2018, p. 873. https://doi.org/10.4018/978-1-5225-3142-5.ch030

  46. Dubrov, A.M., Mkhitaryan, V.S., and Troshin, L.I., Mnogomernye statisticheskie metody (Multivariate Statistical Methods), Moscow: Finansy i statistika, 1998.

  47. Krallinger, M., Rabal, O., Lourenço, A., Oyarzabal, J., and Valencia, A., Chem. Rev., 2017, vol. 117, no. 12, p. 7673.

  48. Postma, G.J. and Kateman, G., J. Chem. Inf. Comput. Sci., 1993, vol. 33, no. 3, p. 350.

  49. Schneider, N., Lowe, D.M., Sayle, R.A., Tarselli, M.A., and Landrum, G.A., J. Med. Chem., 2016, vol. 59, no. 9, p. 4385.

  50. Milman, B.L., Gostev, V.V., and Dmitriev, A.V., J. Anal. Chem., 2018, vol. 73, no. 13, p. 1217.

  51. Milman, B.L. and Zhurkovich, I.K., Mass Spectrom. Lett., 2018, vol. 9, no. 3, p. 73.

  52. Sample size calculator. http://www.surveysystem.com/sscalc.htm. Accessed October 3, 2018.

  53. Nazipova, N.N., Isaev, E.A., Kornilov, V.V., Pervukhin, D.V., Morozova, A.A., Gorbunov, A.A., and Ustinin, M.N., Mat. Biol. Bioinf., 2017, vol. 12, no. 1, p. 102.

  54. Alyass, A., Turcotte, M., and Meyre, D., BMC Med. Genomics, 2015, vol. 8, no. 1, p. 33.

  55. Schymanski, E.L., Ruttkies, C., Krauss, M., Brouard, C., Kind, T., Dührkop, K., Allen, F., Vaniya, A., Verdegem, D., Böcker, S., Rousu, J., Shen, H., Tsugawa, H., Sajed, T., Fiehn, O., Ghesquière, B., and Neumann, S., J. Cheminf., 2017, vol. 9, p. 22.

  56. Blaženović, I., Kind, T., Torbašinović, H., Obrenović, S., Mehta, S.S., Tsugawa, H., Wermuth, T., Schauer, N., Jahn, M., Biedendieck, R., Jahn, D., and Fiehn, O., J. Cheminf., 2017, vol. 9, p. 32.

  57. Blaženović, I., Kind, T., Sa, M.R., Ji, J., Vaniya, A., Wancewicz, B., Roberts, B.S., Torbašinović, H., Lee, T., Mehta, S.S., Showalter, M.R., Song, H., Kwok, J., Jahn, D., Kim, J., and Fiehn, O., Anal. Chem., 2019, vol. 91, no. 3, p. 2155.

  58. Dasenaki, M.E., Bletsou, A.A., Koulis, G.A., and Thomaidis, N.S., J. Agric. Food Chem., 2015, vol. 63, no. 18, p. 4493.

  59. Robert, C., Gillard, N., Brasseur, P.Y., Pierret, G., Ralet, N., Dubois, M., and Delahaut, P., Food Addit. Contam., Part A, 2013, vol. 30, no. 3, p. 443.

  60. Malachová, A., Sulyok, M., Beltrán, E., Berthiller, F., and Krska, R., J. Chromatogr. A, 2014, vol. 1362, p. 145.

  61. Dzuman, Z., Zachariasova, M., Veprikova, Z., Godula, M., and Hajslova, J., Anal. Chim. Acta, 2015, vol. 863, p. 29.

  62. Pérez-Ortega, P., Lara-Ortega, F.J., García-Reyes, J.F., Gilbert-López, B., Trojanowicz, M., and Molina-Díaz, A., Talanta, 2016, vol. 160, p. 704.

  63. Fu, Y., Zhou, Z., Kong, H., Lu, X., Zhao, X., Chen, Y., Chen, J., Wu, Z., Xu, Z., Zhao, C., and Xu, G., Anal. Chem., 2016, vol. 88, no. 17, p. 8870.

  64. Gago-Ferrero, P., Borova, V., Dasenaki, M.E., and Thomaidis, N.S., Anal. Bioanal. Chem., 2015, vol. 407, no. 15, p. 4287.

Author information

Correspondence to B. L. Milman or I. K. Zhurkovich.

Additional information

Translated by E. Rykova

About this article

Cite this article

Milman, B.L., Zhurkovich, I.K. Big Data in Modern Chemical Analysis. J Anal Chem 75, 443–452 (2020). https://doi.org/10.1134/S1061934820020124

  • DOI: https://doi.org/10.1134/S1061934820020124
