
The intensive use of digital data in modern natural science

  • Information Analysis
  • Automatic Documentation and Mathematical Linguistics

Abstract

Common approaches and technologies for storing and processing digital data in various disciplines are analyzed. It is shown that, regardless of the specific subject area, working with large data sets obtained from experiments or modeling requires similar methodological support, involving data curation, metadata support, and annotation of data genesis and quality. The interdisciplinary field called “The properties of materials and substances” is analyzed as an example of a discipline that actively applies digital data. New approaches to integrating data with heterogeneous properties are investigated; these approaches take into account structural variations in the data by class of substance, state of the sample, experimental conditions, and other factors.



Author information


Correspondence to A. O. Erkimbaev.

Additional information

Original Russian Text © A.O. Erkimbaev, V.Yu. Zitserman, G.A. Kobzev, 2017, published in Nauchno-Tekhnicheskaya Informatsiya, Seriya 2: Informatsionnye Protsessy i Sistemy, 2017, No. 9, pp. 9–22.

About this article


Cite this article

Erkimbaev, A.O., Zitserman, V.Y. & Kobzev, G.A. The intensive use of digital data in modern natural science. Autom. Doc. Math. Linguist. 51, 201–213 (2017). https://doi.org/10.3103/S0005105517050028


  • DOI: https://doi.org/10.3103/S0005105517050028
