Materials Data Infrastructure: A Case Study of the Citrination Platform to Examine Data Import, Storage, and Access
This paper presents considerations for the design of a materials data infrastructure, including the import of structured and unstructured data, the storage of that data for archival and retrieval, and access to that data through programmatic and graphical interfaces. In particular, it examines the technology choices available for such an infrastructure, the benefits and drawbacks of those technologies, and their impact on the experience of the system's users. The Citrination platform serves as an example of a materials data infrastructure, and the architectural choices made in building it are discussed.
Keywords: Relational Database, Complex Query, Unstructured Data, Query Response Time, Inorganic Crystal Structure Database