VAT: A System for Visualizing, Analyzing and Transforming Spatial Data in Science

Abstract

The amount of data now available is changing the style of research in geo-scientific domains and, with it, the requirements for spatial processing systems. To support data-driven research and exploratory workflows, we propose the Visualization, Analysis & Transformation system (VAT). We first identify ten fundamental requirements, ranging from support for spatial data types through low-latency computation to visualization techniques. Against these requirements we evaluate state-of-the-art systems from the domains of spatial frameworks, GIS, workflow systems, scientific databases and Big Data solutions. The VAT system aims to overcome the limitations identified in this evaluation through a holistic approach to raster and vector data, demand-driven and tiled processing, and efficient use of heterogeneous hardware architectures. A first comparison with other systems supports the validity of our approach.

Notes

  1. www.gfbio.org
  2. www.idessa.org
  3. modis.gsfc.nasa.gov
  4. www.r-project.org
  5. geos.osgeo.org
  6. www.qgis.org
  7. earthengine.google.org
  8. species.mol.org
  9. www.postgis.net
  10. www.taverna.org.uk
  11. www.biodiversitycatalogue.org
  12. www.kepler-project.org
  13. www.dataone.org
  14. www.opengeospatial.org
  15. www.khronos.org/opencl
  16. www.vividsolutions.com/jts
  17. www.catsg.org
  18. ec.europa.eu/jrc/en/scientific-tool/global-land-cover
  19. www.gbif.org
  20. grass.osgeo.org
  21. pywps.wald.intevation.org
  22. github.com/geotrellis/geotrellis
  23. accumulo.apache.org

References

  1. Aji A, Wang F, Vo H et al (2013) Hadoop-GIS: a High Performance Spatial Data Warehousing System over MapReduce. Proceedings VLDB Endowment 6(11):1009–1020

  2. Andrienko G, Andrienko N, Demsar U, Dransch D, Dykes J, Fabrikant SI, Jern M, Kraak MJ, Schumann H, Tominski C (2010) Space, time and visual analytics. Int J Geogr Inf Sci 24(10):1577–1600

  3. Authmann C, Beilschmidt C, Drönner J, Mattig M, Seeger B (2015) Rethinking Spatial Processing in Data-Intensive Science. BTW 2015 Workshop

  4. Bach K, Schäfer D, Enke N et al (2012) A comparative evaluation of technical solutions for long-term data repositories in integrative biodiversity research. Ecological Informatics 11:16–24

  5. Baumann P, Dehmel A, Furtado P et al (1998) The Multidimensional Database System RasDaMan. In: ACM SIGMOD Conference '98, pp. 575–577. ACM

  6. Baumgärtner L, Strack C, Hossbach B, Seidemann M, Seeger B, Freisleben B (2015) Complex Event Processing for Reactive Security Monitoring in Virtualized Computer Systems. The 9th ACM International Conference on Distributed Event-Based Systems

  7. Bose R, Frew J (2005) Lineage retrieval for scientific data processing: a survey. ACM Computing Surveys (CSUR) 37(1):1–28

  8. Brown PG (2010) Overview of SciDB: large scale array storage, processing and analysis. In: Proc. of the ACM SIGMOD Conference, pp. 963–968

  9. Corvalan C, Hales S, McMichael AJ (2005) Ecosystems and human well-being: health synthesis. World Health Organization

  10. Cuevas-Vicenttín V, Dey S et al (2012) Scientific Workflows and Provenance: Introduction and Research Opportunities. Datenbank-Spektrum 12(3):193–203

  11. Dean J, Ghemawat S (2008) MapReduce. Comm. of the ACM 51(1):107

  12. Diepenbroek M, Glöckner F, Grobe P et al (2014) Towards an Integrated Biodiversity and Ecological Research Data Management and Archiving Platform: The German Federation for the Curation of Biological Data (GFBio). GI: Informatik 2014 – Big Data Komplexität meistern

  13. Garcia-Molina H, Ullman JD, Widom J (2000) Database system implementation, vol. 654. Prentice Hall Upper Saddle River, NJ

  14. Hey AJ, Tansley S, Tolle KM et al (2009) The Fourth Paradigm: Data-Intensive Scientific Discovery, vol. 1. Microsoft Research Redmond, WA

  15. Hijmans RJ, Cameron SE, Parra JL et al (2005) Very high resolution interpolated climate surfaces for global land areas. Int J Climatol 25(15):1965–1978

  16. Jetz W, McPherson JM, Guralnick RP (2012) Integrating biodiversity distribution knowledge: toward a global map of life. Trends Ecol & Evol 27(3):151–159

  17. Kini A, Emanuele R (2014) Geotrellis: Adding Geospatial Capabilities to Spark. Spark Summit 2014

  18. Kyriazis G (2012) Heterogeneous system architecture: a technical review. AMD Fusion Developer Summit

  19. Lam C (2010) Hadoop in Action, 1st edn. Manning Pub., Greenwich, CT, USA

  20. Manyika J, Chui M, Brown B et al (2011) Big data: the next frontier for innovation, competition, and productivity. McKinsey Global Institute

  21. McLennan M, Clark S, Deelman E, Rynge M, Vahi K, McKenna F, Kearney D, Song C (2013) Bringing scientific workflow to the masses via Pegasus and HUBzero. Proceedings of the 5th International Workshop on Science Gateways 13 p. 14

  22. Open Geospatial Consortium Inc. (2011) OpenGIS Implementation Standard for Geographic information - Simple feature access - Part 1: Common architecture

  23. Vitolo C, Elkhatib Y, Reusser D et al (2015) Web technologies for environmental Big Data. Environmental Modelling & Software 63:185–198

  24. Warmerdam F (2008) The geospatial data abstraction library. Open Source Approaches in Spatial Data Handling. Springer, Berlin Heidelberg, pp. 87–104

  25. Zaharia M, Chowdhury M, Das T, Dave A, Ma J, McCauley M, Franklin MJ, Shenker S, Stoica I (2012) Resilient distributed datasets: a fault-tolerant abstraction for in-memory cluster computing. Proceedings of the 9th USENIX conference on Networked Systems Design and Implementation. USENIX Association

Acknowledgement

This work has been supported by the Deutsche Forschungsgemeinschaft (DFG) under grant no. SE 553/7-1 (GFBio) and by the Bundesministerium für Bildung und Forschung (BMBF) under grant no. 01LL1301 (IDESSA).

Author information

Correspondence to Michael Mattig.

Additional information

This is an extended version of the paper “Rethinking Spatial Processing in Data-Intensive Science” [3], which was selected for the special DASP issue Best Workshop Papers of BTW 2015.

About this article

Cite this article

Authmann, C., Beilschmidt, C., Drönner, J. et al. VAT: A System for Visualizing, Analyzing and Transforming Spatial Data in Science. Datenbank Spektrum 15, 175–184 (2015). https://doi.org/10.1007/s13222-015-0197-y

Keywords

  • Geographic Information System
  • Coordinate Reference System
  • Raster Data
  • Geographic Information System Application
  • Resilient Distributed Dataset