Enabling modern data discovery for atmospheric measurements

Abstract

The Atmospheric Radiation Measurement (ARM) user facility is a US Department of Energy Office of Science user facility that is managed and operated through a collaborative effort led by nine US Department of Energy national laboratories. The ARM Data Center, located at Oak Ridge National Laboratory, is responsible for the timely collection, processing, and delivery of data products to the scientific community. The ARM Data Center holds more than 11,000 data products, including metadata collected from field campaigns, instruments, value-added products, and principal investigator–contributed data. These data sets are checked for successful transfer. For most data, the transfer is carried out automatically over the network; however, some of the largest data sets and some of the most remote sites require manual shipping of hard disks. Both the data and the metadata are processed into an ARM-standardized structure in the Network Common Data Form, a self-describing binary format supported by many compatible software tools. Once processed, the data are cataloged, stored in the ARM Data Archive, and made discoverable by associating them with characterizing metadata, such as location and measurement classification. These metadata enable powerful search capabilities through the ARM Data Center Data Discovery interface. This paper discusses how the discovery system was redesigned around user requirements and how the data are distributed to the scientific community.
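
As a rough illustration of how metadata such as location and measurement classification can drive discovery, the following minimal Python sketch filters a small in-memory catalog by the facets a user supplies. It is not the ARM Data Discovery implementation: the `ProductRecord` type, its field names, the `search` function, and the sample records are all hypothetical, and a production system would normally delegate this filtering to a search engine rather than a list comprehension.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProductRecord:
    """Hypothetical catalog entry; the fields are illustrative, not ARM's schema."""
    name: str
    site: str          # location facet
    measurement: str   # measurement-classification facet


def search(catalog, site=None, measurement=None):
    """Return records matching every facet that was supplied (None means 'any')."""
    return [
        record for record in catalog
        if (site is None or record.site == site)
        and (measurement is None or record.measurement == measurement)
    ]


# Made-up records, for illustration only.
CATALOG = [
    ProductRecord("product_a", "site_1", "aerosol"),
    ProductRecord("product_b", "site_1", "radiation"),
    ProductRecord("product_c", "site_2", "aerosol"),
]

matches = search(CATALOG, site="site_1", measurement="aerosol")
print([record.name for record in matches])
```

Each facet narrows the result set independently, so omitting a facet simply leaves that dimension unconstrained.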


Acknowledgements

This research was supported by the Atmospheric Radiation Measurement (ARM) user facility, a U.S. Department of Energy (DOE) Office of Science user facility managed by the Office of Biological and Environmental Research Program. This manuscript has been authored by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 with the US Department of Energy. The US government retains and the publisher, by accepting the article for publication, acknowledges that the US government retains a nonexclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this manuscript, or allow others to do so, for US government purposes. DOE will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (https://www.energy.gov/downloads/doe-public-access-plan).

Author information

Corresponding author

Correspondence to Kavya Guntupally.

Additional information

Communicated by H. Babaie.

About this article

Cite this article

Guntupally, K., Dumas, K., Prakash, G. et al. Enabling modern data discovery for atmospheric measurements. Earth Sci Inform (2021). https://doi.org/10.1007/s12145-021-00635-0

Keywords

  • ARM data center
  • Metadata
  • Data archive
  • FAIR data
  • Metadata management
  • Data search