Abstract
Severe meteorological events increasingly highlight the importance of fast and accurate weather forecasting. Various Numerical Weather Prediction (NWP) models are run worldwide, on either a local or a global scale, to predict future weather. However, NWP models typically take hours to finish a complete run, depending on the input parameters and the size of the forecast domain. Provenance information is of central importance for detecting unexpected events that may develop during model execution and for taking necessary action as early as possible. In addition, the need to share scientific data and results among researchers highlights the importance of data quality and reliability. In this study, we develop a framework for tracking the Weather Research and Forecasting (WRF) model and for generating, storing, and analyzing provenance data. We develop a machine-learning-based log parser so that the proposed system is dynamic and can adapt to different data and rules. The proposed system makes numerical weather forecast workflows easier to manage and understand by providing provenance graphs. By analyzing these graphs, faulty situations that may occur during the execution of WRF can be traced to their root causes. The proposed system has been evaluated and shown to perform well even under a high-frequency provenance information flow.
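To illustrate what tracing a fault to its root cause through a provenance graph can look like, the following is a minimal sketch and not the implementation described in the paper. It assumes a W3C PROV-style graph in which each edge points from an effect to its cause. The edge list, the run identifiers (`wrf_run`, `real_run`, `metgrid_run`), and the helper names `build_graph` and `trace_root_causes` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict, deque

# Hypothetical provenance edges, written as (effect, relation, cause).
# The relation names follow W3C PROV; the node names are illustrative
# and are not taken from the paper.
EDGES = [
    ("wrfout_d01", "wasGeneratedBy", "wrf_run"),
    ("wrf_run", "used", "wrfinput_d01"),
    ("wrf_run", "used", "wrfbdy_d01"),
    ("wrfinput_d01", "wasGeneratedBy", "real_run"),
    ("wrfbdy_d01", "wasGeneratedBy", "real_run"),
    ("real_run", "used", "met_em_d01"),
    ("met_em_d01", "wasGeneratedBy", "metgrid_run"),
]


def build_graph(edges):
    """Map each node to the list of nodes it directly depends on."""
    graph = defaultdict(list)
    for effect, _relation, cause in edges:
        graph[effect].append(cause)
    return graph


def trace_root_causes(graph, node):
    """Walk backwards from node (breadth-first) and return the
    ancestors that have no recorded cause of their own."""
    seen = {node}
    queue = deque([node])
    roots = []
    while queue:
        current = queue.popleft()
        causes = graph.get(current, [])
        if not causes and current != node:
            roots.append(current)
        for cause in causes:
            if cause not in seen:
                seen.add(cause)
                queue.append(cause)
    return roots


if __name__ == "__main__":
    graph = build_graph(EDGES)
    # Starting from a suspect output, list the earliest recorded steps.
    print(trace_root_causes(graph, "wrfout_d01"))
```

In this sketch, a suspect output file is the starting point and the traversal follows dependency edges backwards until it reaches nodes with no recorded cause. A real system would also carry timestamps and log-derived attributes on each node, which this sketch omits.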
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Tufek, A., Aktas, M.S. (2021). On the Provenance Extraction Techniques from Large Scale Log Files: A Case Study for the Numerical Weather Prediction Models. In: Balis, B., et al. Euro-Par 2020: Parallel Processing Workshops. Euro-Par 2020. Lecture Notes in Computer Science, vol 12480. Springer, Cham. https://doi.org/10.1007/978-3-030-71593-9_20
DOI: https://doi.org/10.1007/978-3-030-71593-9_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-71592-2
Online ISBN: 978-3-030-71593-9
eBook Packages: Computer Science, Computer Science (R0)