Abstract
The amount of available data is changing the style of research in geo-scientific domains and thus the requirements for spatial processing systems. To support data-driven research and exploratory workflows, we propose the Visualization, Analysis & Transformation system (VAT). We first identify ten fundamental requirements, which range from support for spatial data types, through low-latency computations, to visualization techniques. Based on these requirements, we evaluate state-of-the-art systems from the domains of spatial frameworks, GIS, workflow systems, scientific databases and Big Data solutions. The goal of the VAT system is to overcome the identified limitations through a holistic approach to raster and vector data, demand-driven and tiled processing, and the efficient use of heterogeneous hardware architectures. A first comparison with other systems shows the validity of our approach.
References
Aji A, Wang F, Vo H et al (2013) Hadoop-GIS: a High Performance Spatial Data Warehousing System over MapReduce. Proceedings VLDB Endowment 6(11):1009–1020
Andrienko G, Andrienko N, Demsar U, Dransch D, Dykes J, Fabrikant SI, Jern M, Kraak MJ, Schumann H, Tominski C (2010) Space, time and visual analytics. Int J Geogr Inf Sci 24(10):1577–1600
Authmann C, Beilschmidt C, Drönner J, Mattig M, Seeger B (2015) Rethinking Spatial Processing in Data-Intensive Science. BTW 2015 Workshop
Bach K, Schäfer D, Enke N et al (2012) A comparative evaluation of technical solutions for long-term data repositories in integrative biodiversity research. Ecological Informatics 11:16–24
Baumann P, Dehmel A, Furtado P et al (1998) The Multidimensional Database System RasDaMan. In: ACM SIGMOD Conference '98, pp. 575–577. ACM
Baumgärtner L, Strack C, Hossbach B, Seidemann M, Seeger B, Freisleben B (2015) Complex Event Processing for Reactive Security Monitoring in Virtualized Computer Systems. The 9th ACM International Conference on Distributed Event-Based Systems
Bose R, Frew J (2005) Lineage retrieval for scientific data processing: a survey. ACM Computing Surveys (CSUR) 37(1):1–28
Brown PG (2010) Overview of SciDB: large scale array storage, processing and analysis. In: Proc. of the ACM SIGMOD Conference, pp. 963–968
Corvalan C, Hales S, McMichael AJ (2005) Ecosystems and human well-being: health synthesis. World Health Organization
Cuevas-Vicenttín V, Dey S et al (2012) Scientific Workflows and Provenance: Introduction and Research Opportunities. Datenbank-Spektrum 12(3):193–203
Dean J, Ghemawat S (2008) MapReduce. Comm. of the ACM 51(1):107
Diepenbroek M, Glöckner F, Grobe P et al (2014) Towards an Integrated Biodiversity and Ecological Research Data Management and Archiving Platform: The German Federation for the Curation of Biological Data (GFBio). GI: Informatik 2014 – Big Data Komplexität meistern
Garcia-Molina H, Ullman JD, Widom J (2000) Database system implementation, vol. 654. Prentice Hall Upper Saddle River, NJ
Hey AJ, Tansley S, Tolle KM et al (2009) The Fourth Paradigm: Data-Intensive Scientific Discovery, vol. 1. Microsoft Research Redmond, WA
Hijmans RJ, Cameron SE, Parra JL et al (2005) Very high resolution interpolated climate surfaces for global land areas. Int J Climatol 25(15):1965–1978
Jetz W, McPherson JM, Guralnick RP (2012) Integrating biodiversity distribution knowledge: toward a global map of life. Trends Ecol & Evol 27(3):151–159
Kini A, Emanuele R (2014) Geotrellis: Adding Geospatial Capabilities to Spark. Spark Summit 2014
Kyriazis G (2012) Heterogeneous system architecture: a technical review. AMD Fusion Developer Summit
Lam C (2010) Hadoop in Action, 1st edn. Manning Pub., Greenwich, CT, USA
Manyika J, Chui M, Brown B et al (2011) Big data: the next frontier for innovation, competition, and productivity. McKinsey Global Institute
McLennan M, Clark S, Deelman E, Rynge M, Vahi K, McKenna F, Kearney D, Song C (2013) Bringing scientific workflow to the masses via Pegasus and HUBzero. Proceedings of the 5th International Workshop on Science Gateways 13 p. 14
Open Geospatial Consortium Inc. (2011) OpenGIS Implementation Standard for Geographic information - Simple feature access - Part 1: Common architecture
Vitolo C, Elkhatib Y, Reusser D et al (2015) Web technologies for environmental Big Data. Environmental Modelling & Software 63:185–198
Warmerdam F (2008) The geospatial data abstraction library. Open Source Approaches in Spatial Data Handling. Springer, Berlin Heidelberg pp. 87–104
Zaharia M, Chowdhury M, Das T, Dave A, Ma J, McCauley M, Franklin MJ, Shenker S, Stoica I (2012) Resilient distributed datasets: a fault-tolerant abstraction for in-memory cluster computing. Proceedings of the 9th USENIX conference on Networked Systems Design and Implementation. USENIX Association
Acknowledgement
This work has been supported by the Deutsche Forschungsgemeinschaft (DFG) under grant no. SE 553/7-1 (GFBio) and by the Bundesministerium für Bildung und Forschung (BMBF) under grant no. 01LL1301 (IDESSA).
Additional information
This is an extended version of the paper “Rethinking Spatial Processing in Data-Intensive Science” [3] selected for the special DASP issue Best Workshop Papers of BTW 2015.
Cite this article
Authmann, C., Beilschmidt, C., Drönner, J. et al. VAT: A System for Visualizing, Analyzing and Transforming Spatial Data in Science. Datenbank Spektrum 15, 175–184 (2015). https://doi.org/10.1007/s13222-015-0197-y
DOI: https://doi.org/10.1007/s13222-015-0197-y