Open-Source GIS Libraries
Open source GIS libraries provide basic functionality for specific tasks in both open source and commercial GIS software. Libraries are helper software that offers services to independent GIS applications, thus enabling code and data to be shared in a modular way. If an application provides an additional abstraction layer, the underlying libraries can be exchanged if needed, e.g., for reasons of performance, accuracy or functionality. In GIS, some basic functionality is required by many derived and specialized software packages and tools. Rather than implementing the same functionality again and again, specialized libraries can provide it, giving derived software a head start instead of requiring everything to be written from scratch. Typical cases where libraries are traditionally used are graphics and GIS format support and conversion, reprojection support, computational geometry operations, topology operations, and more.
The goal of this entry is not to provide a complete list of available open source GIS libraries, which would quickly be outdated, but to describe the availability and functionality of some of the more popular ones. GDAL/OGR is a GIS and image format access and conversion library and a suite of utilities (GDAL is responsible for raster data access and OGR for vector data access). PROJ is a reprojection library. The Java Topology Suite (JTS) and GEOS both provide geometry engines for computational geometry and topological queries. The Java Conflation Suite (JCS) provides functionality and tools for combining, integrating and improving data from various data sources. GPSBabel enables the reading, writing and conversion of various GPS formats. Many open source GIS libraries are published under a permissive license, such as a variation of the MIT License or the LGPL, which also allows commercial use of the libraries without forcing a company to release the full source code of dependent applications.
Naturally, since multiple projects are discussed, each project has its own history.
The history of GDAL/OGR dates back to 1993, when Frank Warmerdam (the main programmer of GDAL/OGR) started to work at PCI Geomatics (QGIS Community 2005). Warmerdam initially worked at PCI on a predecessor library of GDAL/OGR called GDB (Generic Database). In 1998 he became an independent consultant in Open Source GIS, filling the empty niche for a library to read, write, convert and manipulate geospatial raster and vector data. GDAL is an abbreviation for Geospatial Data Abstraction Library. The name OGR has historic roots (OpenGIS Simple Features Reference Implementation), though it has lost its meaning today since the library is no longer a reference implementation. In 2006, GDAL/OGR joined the OSGeo (Open Source Geospatial) foundation as a founding member and began transitioning to a more community-oriented governance model. Today, GDAL/OGR is used by most open source and a lot of commercial GIS software. While there are some additional contributors, Warmerdam is still the primary developer of the library. Currently, GDAL supports more than 60 raster formats and OGR supports more than 25 vector formats and spatial database providers. However, not every operation is supported by each format driver.
The history of PROJ.4 (Evenden 2004) started in the late 1970s, when Gerald Evenden was involved in the development of map plotting software at the Atlantic Geology branch of USGS. At that time, two separate software packages had been developed: MAPGEN for map plotting and PROJ for projection calculations. This separation was a wise decision because projection calculations are useful in many workflows, not just for making maps. Additionally, at that time, other available projection libraries were either not up to the task or impossible to integrate into other software. In the 1980s, the software packages went through several iterations in different programming languages until ending up as a C library. With the advent of commercial graphics software, MAPGEN was abandoned, while PROJ was developed further. PROJ was originally designed for US datums and projection systems. A newer library, libproj4, concentrates on projections, handles international projections and does not deal with datum conversions. Libproj4 is a new development and is not directly compatible with the old PROJ system or with the PROJ.4 distribution, which is more widely used today. PROJ.4 is now maintained by Frank Warmerdam, but is still mostly based on Gerald Evenden’s work.
The first version of the Java Topology Suite (JTS) was released in February 2002, developed by Vivid Solutions and sponsored by several Canadian and British Columbian governmental institutions. As the name already reveals, the software is written in Java. Updates occurred at regular intervals, enhancing functionality, improving performance and providing bug fixes. The latest release, 1.8, dates from December 2006. The Java Conflation Suite builds on the Java Topology Suite and ties into the JUMP user interface. Its first release dates from June 2003, with a compatibility release in November 2003 to work with the new version of JUMP. GEOS (Geometry Engine Open Source) is a C++ port of the Java Topology Suite. Release 1.0 dates from November 2003 and the latest release is from February 2007. GEOS is developed by Refractions Research, the same company responsible for developing PostGIS and other Open Source GIS tools.
Version 1.0 of GPSBabel was released by Robert Lipe in October 2002. The first Windows GUI release dates from December 2002 and an OSX GUI was contributed in February 2004. The latest release, 1.3.2, is from November 2006. GPSBabel currently supports about one hundred different GPS dialects and versions and is available for almost any platform.
GDAL is a library to access raster data formats and presents a single abstraction model for all the formats to an application. GDAL is written in C++, but also offers bindings for C and Python. GDAL is licensed with an MIT style open source license. This license basically excludes any warranty and states that everything can be done with the software (apart from removing the copyright disclaimer) (GDAL 2007).
The GDAL abstraction model is based on the representation of a dataset (GDALDataset) and the representation of a raster band (GDALRasterBand). GDALDataset contains information about the georeferencing of the raster and references to the raster bands. GDALRasterBand provides the method GDALRasterBand::RasterIO (GDALRasterIO in the C API) to access data. This method loads the raster values into a memory buffer where they can be accessed by the program using GDAL. Two important parameters of this method are the buffer sizes in x- and y-direction, which determine the resolution at which the requested region is delivered. Note that, used with different arguments, the same method can also be used for writing data.
A raster band can have associated overviews (also called pyramids). In fact, overviews are also GDALRasterBands, but with a lower resolution than the original band. This makes queries at lower resolutions much faster. Overviews can be created for all bands in a dataset with the method GDALDataset::BuildOverviews, which takes a list of decimation factors for the requested overviews.
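The idea of decimation factors can be illustrated without GDAL itself. The following is a minimal sketch in plain Python, assuming the simplest possible resampling (keeping every n-th cell); the function names build_overview and build_overviews are hypothetical and are not part of the GDAL API, and GDAL offers other resampling methods as well.

```python
def build_overview(band, factor):
    """Return a lower-resolution copy of `band` (a list of rows).

    Keeps every `factor`-th row and column, i.e. nearest-neighbour
    style subsampling. This is an illustrative assumption, not the
    resampling GDAL necessarily applies.
    """
    if factor < 1:
        raise ValueError("decimation factor must be >= 1")
    return [row[::factor] for row in band[::factor]]


def build_overviews(band, factors):
    """Build one overview per decimation factor, keyed by factor."""
    return {factor: build_overview(band, factor) for factor in factors}


# An 8 x 8 toy band whose cell value encodes its position (row * 10 + column).
band = [[r * 10 + c for c in range(8)] for r in range(8)]

overviews = build_overviews(band, [2, 4])
for factor, overview in overviews.items():
    print(f"factor {factor}: {len(overview)} x {len(overview[0])} cells")
```

A request for a coarse rendering can then be answered from the smallest overview that still satisfies the requested resolution, instead of reading the full-resolution band.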
The code of the OGR library is not dependent on GDAL. However, it resides in the same source tree and is compiled into the same binary library (libgdal). OGR supports access to a large number of vector data formats, including GML, ESRI shapefiles, GRASS and PostgreSQL (OGR 2007). Sequential reading of the data is provided for all supported formats. Further operations may be available depending on the data format; OGR drivers are able to report their capabilities at runtime. Such capabilities include random read, sequential write, random write, delete feature, fast spatial filter, fast feature count and fast query of the layer extent. OGR is also able to create new empty data sources for some formats.
The most important objects in OGR are the layers (OGRLayer), the features (OGRFeature) and the geometries (OGRGeometry). A sequential read of the features in a layer can be done by using the functions OGRLayer::ResetReading() and then repeatedly OGRLayer::GetNextFeature(). OGRFeature contains the attribute values and a reference to the feature geometry. OGRGeometry is an abstract base class, implemented by specific subclasses for the representation of point, multipoint, line, multiline, polygon, multipolygon.
Java Topology Suite (JTS)
The Java Topology Suite (JTS) is a library (API) for 2D spatial functions and predicates. It implements the spatial functions and predicates defined in the OGC Simple Features for SQL specification. JTS provides a complete, consistent and robust implementation of fundamental spatial algorithms that is fast enough for production use. As the name indicates, it is written in Java. It is released under the LGPL (GNU Lesser General Public License).
Buffer (positive and negative offsets)
Additionally, the relate operator is supported which works according to the Dimensionally Extended 9 Intersection Matrix (DE-9IM).
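How a DE-9IM result is tested against a pattern can be sketched in a few lines. The example below is plain Python and does not use JTS; it assumes the standard DE-9IM notation, in which a matrix is written as nine characters from F, 0, 1, 2 and a pattern may additionally use T (any non-empty intersection) and * (don't care). The function name matches_de9im and the sample matrices are illustrative.

```python
def matches_de9im(matrix, pattern):
    """Check a 9-character DE-9IM matrix against a 9-character pattern.

    Matrix cells: 'F' (empty) or '0', '1', '2' (dimension of intersection).
    Pattern cells: the same symbols plus 'T' (non-empty) and '*' (any).
    """
    if len(matrix) != 9 or len(pattern) != 9:
        raise ValueError("matrix and pattern must have 9 characters")
    for m, p in zip(matrix, pattern):
        if m not in "F012":
            raise ValueError(f"invalid matrix symbol: {m!r}")
        if p == "*":
            continue
        if p == "T":
            if m == "F":
                return False
        elif p in "F012":
            if m != p:
                return False
        else:
            raise ValueError(f"invalid pattern symbol: {p!r}")
    return True


# Matrix of two partially overlapping polygons and of two disjoint polygons.
overlapping = "212101212"
disjoint = "FF2FF1212"

print(matches_de9im(overlapping, "T*T***T**"))  # area/area "overlaps" pattern
print(matches_de9im(overlapping, "T*F**F***"))  # "within" pattern
print(matches_de9im(disjoint, "FF*FF****"))     # "disjoint" pattern
```

Named predicates such as overlaps, within or disjoint can thus be viewed as fixed patterns applied to the matrix that the relate operator computes.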
GEOS is a C++ port of the Java Topology Suite (JTS) and also offers bindings for C and Python. Even though GEOS itself is written in C++, it is recommended to use the C interface, as it is supposed to remain stable, while the C++ interface may differ between versions (GEOS 2007).
A geometric feature is represented in GEOS by the abstract class geos::Geometry, from which representations for the simple features are derived (see Fig. 1). The functionality of geos::Geometry includes the calculation of convex hulls, centroids and buffers as well as the intersection, union and difference of two features. A geos::Geometry object may be created from vertex coordinates by using the methods of geos::GeometryFactory. GEOS also provides the possibility to import geometries from Well Known Text (WKT) and Well Known Binary (WKB) format.
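To give an impression of one of the operations mentioned above, the following sketch computes a convex hull in plain Python and writes the result as WKT. It uses Andrew's monotone chain algorithm, which is chosen here for brevity and is not claimed to be the algorithm GEOS implements; the function names are hypothetical and the code does not call GEOS.

```python
def cross(o, a, b):
    """Z component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Return the hull vertices in counter-clockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        # Drop the last vertex while it does not make a left turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other one.
    return lower[:-1] + upper[:-1]


def to_wkt_polygon(ring):
    """Serialize an open ring of (x, y) tuples as a closed WKT polygon."""
    closed = ring + ring[:1]
    coords = ", ".join(f"{x} {y}" for x, y in closed)
    return f"POLYGON (({coords}))"


points = [(0, 0), (4, 0), (4, 3), (0, 3), (2, 1), (1, 2)]
hull = convex_hull(points)
print(to_wkt_polygon(hull))
```

The two interior points are dropped and the remaining rectangle is emitted as a WKT polygon, the same text format that GEOS can import and export.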
Java Conflation Suite (JCS)
The Java Conflation Suite (JCS) (Vivid Solutions 2006) is an API/library and a set of interactive tools that assist in conflating (merging) and cleaning vector data sets. It is part of the JUMP GIS framework and builds upon the Java Topology Suite. It was released under the terms of the GPL (GNU General Public License). During conflation, it helps with coverage cleaning, coverage alignment and road network matching. JCS also assists in quality assurance of vector data sets by detecting and visualizing errors and offering both automatic and manual cleaning functions. All detection and cleaning mechanisms are accessible either programmatically or interactively.
detect and clean gaps
detect and clean overlaps
detect and fix boundary alignment errors (horizontal conflation)
detect and fix coverage alignment errors (vertical conflation), one data set is the reference data set
road network matching: JCS provides algorithms to establish node and edge matchings between two road networks
precision reduction: reduce excessive precision to lower precision. This step may produce incorrect vector topology, which can be fixed in an extra step (see above)
geometry difference detection: find geometric differences of two versions of the same data set. JCS supports either exact matching or matching with a distance tolerance
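Why precision reduction may damage topology can be shown with a small plain-Python sketch. It assumes the simplest form of precision reduction, snapping each coordinate to a regular grid, and then looks for consecutive vertices that have collapsed onto each other. The function names and the sample ring are hypothetical, and this is not the JCS implementation.

```python
def reduce_precision(ring, grid):
    """Snap every vertex of `ring` to the nearest multiple of `grid`."""
    return [(round(x / grid) * grid, round(y / grid) * grid) for x, y in ring]


def collapsed_vertices(ring):
    """Return indices of vertices identical to their predecessor."""
    return [i for i in range(1, len(ring)) if ring[i] == ring[i - 1]]


# A closed ring with two vertices that lie very close together.
ring = [
    (0.2, 0.1),
    (10.4, 0.3),
    (10.3, 0.4),
    (10.1, 9.8),
    (0.1, 9.9),
    (0.2, 0.1),
]

snapped = reduce_precision(ring, 1.0)
print(snapped)
print("collapsed at indices:", collapsed_vertices(snapped))
```

After snapping, the second and third vertices coincide, so the ring contains a zero-length segment. Defects of this kind are what a subsequent cleaning step has to detect and repair.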
GPSBabel (GPSBabel Community 2007) helps to flatten the Tower of Babel that GPS device and software manufacturers introduced with their proprietary GPS data formats. Despite the fact that standard GPS data formats (like GPX) exist and GPS data is not particularly complex, a considerable amount of GPS data is still stored and exchanged in proprietary and incompatible data formats. GPSBabel can translate waypoints, routes, tracks and, partially, signal quality (PDOP) information between over a hundred different data formats and versions. Additionally, it can filter, interpolate and rearrange (sort) data and batch process many files in one run. Strictly speaking, GPSBabel is not a library but a command line tool. Because it runs from the command line, it can easily be embedded in a workflow pipeline controlled by a shell script, batch file or any programming language, which also makes it suitable for web server usage. However, there are also GUI tools available for the occasional user. Like most open source software, GPSBabel works on almost any platform (including Linux/Unix, MacOSX and Windows). GPSBabel is also used by Google Earth and the geocaching community. It is licensed under the GNU General Public License (GPL).
GPSBabel supports powerful filter options. It can detect and remove duplicate coordinates, reduce the number of points by using a simplify operator, and filter data by geometry (within the radius of a given point, inside a given polygon, within the distance of a given arc). The simplify option is particularly useful for devices with limited storage capabilities. The algorithm tries to preserve the original geometry as well as it can, honoring the specified parameters. A PDOP filter allows one to filter out unreliable points with a bad PDOP value. This filter works separately for HDOP (horizontal precision) and VDOP (vertical precision) values. The rearranging options allow sorting by time, shortname, name, description and geocache id. The interpolation functions allow one to add points at regular time and distance intervals. Finally, there are conversion and manipulation functions, such as shifting all timestamps, splitting and merging tracks and files, converting waypoints to tracks and converting tracks to waypoints.
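Two of the filters described above, duplicate removal and the radius filter, can be sketched in plain Python as follows. The sketch assumes points given as (latitude, longitude) tuples in degrees and a spherical Earth with a mean radius of 6,371 km; the function names and sample coordinates are hypothetical, and the code does not reproduce GPSBabel's own implementation or options.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, spherical approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def remove_duplicates(points):
    """Keep the first occurrence of each coordinate pair, preserving order."""
    seen = set()
    unique = []
    for point in points:
        if point not in seen:
            seen.add(point)
            unique.append(point)
    return unique


def within_radius(points, center, radius_m):
    """Keep only the points at most `radius_m` metres away from `center`."""
    return [
        p for p in points
        if haversine_m(p[0], p[1], center[0], center[1]) <= radius_m
    ]


waypoints = [(47.0, 8.0), (47.0, 8.0), (47.001, 8.0), (48.0, 8.0)]
unique = remove_duplicates(waypoints)
nearby = within_radius(unique, center=(47.0, 8.0), radius_m=200.0)
print(len(waypoints), "->", len(unique), "->", len(nearby))
```

Chaining the two functions mirrors the way several filters can be applied one after another in a single processing run.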
Base Functionality for Derived GIS Software
The functionality of the libraries can be used to build a GIS on top of them. With this approach, a GIS program can be built by providing a graphical user interface that translates user requests into calls to library functions. An example of such a program is QGIS (QGIS Community 2007). QGIS uses GDAL/OGR for reading from and writing to data sources, PROJ.4 for reprojecting vector layers on the fly and GEOS for intersection tests between geometries and (selection) rectangles.
Format Conversion, Import/Export, Data Access
GDAL and OGR provide, in addition to the library, command line tools for data access and format conversion. For instance, ogrinfo prints out information about vector data sources and ogr2ogr converts between vector formats. gdal_translate is the tool to convert raster formats and gdalwarp reprojects raster datasets. gdal_rasterize converts vector files to raster files through rasterization, while rgb2pct and pct2rgb allow conversion between 24 bit raster files and paletted 8 bit raster files.
Data Management and Administration
As mentioned in the previous paragraph, various command line utilities exist to retrieve information on data sets, convert formats, and reproject data. gdaladdo is a utility to add overview pyramids with lower resolution to a high resolution raster data set. gdaltindex and ogrtindex create index files documenting the footprints of raster and vector data sets, respectively, for quick retrieval, which is particularly important for map servers. gdal_merge builds mosaics from different raster data sets. gdal_contour can create vector contours from a raster based digital terrain model. The Java Conflation Suite assists in merging vector data sets. GPSBabel helps to manage and process GPS data. The JTS and GEOS libraries can help to select, filter and manipulate data based on geometry or spatial relationships.
Merging of Vector Data Sets, Consistency Checks, Quality Assurance
As mentioned in the JCS section, the Java Conflation Suite can assist with the merging of vector (feature) data sets and helps to check data quality and consistency. It can detect and visualize errors and consistency problems and offers both automatic and manual (interactive) cleaning tools. It is accessible to developers through APIs or to end-users through a graphical user interface, utilizing the JUMP GIS framework.
Building Blocks for More Complex Work Flows and Web Applications
As many libraries also provide commandline based utilities, it is easy to write batch files, shell scripts or other software that utilizes the data management and administration capabilities of the libraries dealt with in this entry. They can also be used as components for web GIS and web cartography in combination with server side scripting or programming languages.
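A scripted workflow of this kind might look like the following Python sketch, which assembles calls to ogr2ogr and gdalwarp and, by default, only prints them. The file names, the helper function names and the dry-run convention are assumptions made for illustration; the argument order follows the documented usage of the two tools (ogr2ogr takes the destination before the source, gdalwarp the source before the destination).

```python
import shutil
import subprocess


def ogr2ogr_cmd(src, dst, driver):
    """Argument list for converting a vector data source with ogr2ogr."""
    return ["ogr2ogr", "-f", driver, dst, src]


def gdalwarp_cmd(src, dst, target_srs):
    """Argument list for reprojecting a raster data set with gdalwarp."""
    return ["gdalwarp", "-t_srs", target_srs, src, dst]


def run_pipeline(commands, dry_run=True):
    """Run the commands in order; print them if dry_run or tool is missing."""
    for cmd in commands:
        if dry_run or shutil.which(cmd[0]) is None:
            print("would run:", " ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


# Hypothetical input and output files.
pipeline = [
    ogr2ogr_cmd("roads.shp", "roads.gml", "GML"),
    gdalwarp_cmd("dem.tif", "dem_wgs84.tif", "EPSG:4326"),
]
run_pipeline(pipeline, dry_run=True)
```

Passing dry_run=False executes the commands, provided the GDAL/OGR utilities are installed and on the search path.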
GDAL/OGR will provide more language bindings, featuring support for Java and .NET access to the libraries. Additionally, Warmerdam will work on a thread safe version of the libraries. Obviously, users will demand more drivers targeting data formats not yet supported. Unifying the architecture of GDAL and OGR is also on the agenda. Future versions of PROJ.4 aim to provide better documentation, improved datum shifting support and better APIs for other software to build upon PROJ.4. The Java Topology Suite and its C++ port GEOS are, functionality wise, almost complete. Future releases will concentrate on bug fixes, performance and compatibility issues. The functionality of future versions of the Java Conflation Suite will depend on functionality ordered by customers or research projects. The future is thus hard to predict. It would be desirable for JCS to support additional topology rules. GPSBabel obviously will support upcoming GPS data formats and improve support for new visualization tools such as Google Earth and other software projects. As with any other open source project, the future is difficult to predict. New functionality will be added if there is a need for it, if a client pays for development or if a programmer invests time and interest to implement it. Possible improvements include ports to other platforms, improved GUIs and support for additional data formats.
- Evenden G (2004) Brief history of PROJ. http://lists.maptools.org/pipermail/proj/2004-March/001117.html. Accessed 17 Feb 2007
- GDAL (2007) GDAL – Geospatial Data Abstraction Library. http://www.gdal.org. Accessed 22 Feb 2007
- GEOS (2007) GEOS homepage. http://geos.refractions.net. Accessed 22 Feb 2007
- GPSBabel Community (2007) GPSBabel: convert, upload, download data from GPS and Map programs. http://www.gpsbabel.org/. Accessed 20 Feb 2007
- QGIS Community (2005) QGIS Community – Interview with Frank Warmerdam. http://qgis.org/content/view/58/44/. Accessed 17 Feb 2007
- QGIS Community (2007) QGIS Community – Home. http://qgis.org. Accessed 22 Feb 2007
- OGR (2007) OGR Vector Formats. http://www.gdal.org/ogr/ogr_formats.html. Accessed 22 Feb 2007
- Vivid Solutions (2006) Java Conflation Suite (JCS). http://www.vividsolutions.com/JCS/. Accessed 20 Feb 2007