Underground utilities are critical components of the massive utility networks that provide basic services to society. It is estimated that the total length of underground utilities in the US, including water, sewer, gas, electrical, and telecom lines, exceeds 35 million miles. The largest single threat to the safety of underground utilities is excavation (National Transportation Safety Board (NTSB) (2000) 1998; National Transportation Safety Board (NTSB) 1997). In the US, underground utilities are hit or damaged by excavation every 60 seconds (Spurgin et al. 2009; Common Ground Alliance (CGA) 2010).

Besides occurring frequently, a utility strike during excavation often has disastrous consequences in terms of service disruption, property damage, deaths, and serious injuries (Felt 2007; Nelson and Daly 1998; Doctor et al. 1995). For instance, the natural gas pipeline rupture and subsequent explosion caused by excavation in St. Cloud, Minnesota on December 11, 1998 caused four fatal injuries, one serious injury, and 10 minor injuries, and destroyed six buildings (National Transportation Safety Board (NTSB) (2000) 1998; National Transportation Safety Board (NTSB) 1997). In 2007, an excavation strike on a high-pressure gas main in Cary, North Carolina resulted in a 100-foot-high fireball that burned for nearly six hours and forced the evacuation of nearby residents and the closing of major roads (WRAL archives 2011). The Office of Pipeline Safety’s Pipeline and Hazardous Materials Safety Administration (PHMSA) reported a total of 2770 serious incidents from 2001 to 2010, of which nearly 20% (544 incidents) were excavation related and caused a total of 37 fatalities, 152 injuries, and $200 million in property damage (PHMSA 2011).

Despite the implementation of the 811 One-Call System, which requires excavation contractors to call the state One-Call center that, in turn, informs utility owners to mark utility locations with spray paint or flags, excavation remains the single largest cause of pipeline accidents. For instance, a UNCC (Utility Notification Center of Colorado (UNCC) 2005) study reported that 55.7% of the 9,371 incidents in Colorado in 2005 occurred even though the excavators followed the One-Call procedure. These incidents occur for two primary reasons: (1) reliable data regarding the true location of underground utilities is missing or incomplete, i.e., utilities are often NOT at the locations that the records specify (Sterling et al. 2009); and (2) uncertainty in the utility location is not communicated to excavator operators in real time to help them objectively perceive the digging machine’s position relative to the buried utilities (Sterling et al. 2009; CGER - Commission on Geosciences, Environment and Resources 2000).

The authors have created a framework that synergizes geospatial informatics and construction informatics (Figure 1) to visualize and monitor the interaction between buried utilities and excavation implements to overcome these limitations. Recognizing the pivotal effects of the positional accuracy/uncertainty of buried utilities to downstream visualization and proximity analysis, the newly created framework is intended to be uncertainty-aware. This paper presents the technical approach in achieving this intention by modeling the uncertainty of geospatial utility data, quantifying the parameters of the uncertainty model based on data lineage, visualizing both the geospatial utility data and its associated uncertainty in a Geospatial Augmented Reality (GAR) environment, and analyzing the proximity between buried utilities and the digging implements. Based on this analysis, appropriate quantitative warnings can be issued and together with visual displays, be brought to excavation operators in real-time. The newly created framework is expected to contribute to safe urban excavation and the improvement of construction productivity.

Figure 1. Computational framework for uncertainty-aware visualization and proximity monitoring in urban excavation.

The remainder of this paper is organized as follows. The authors first review the current practice and related studies. Following this review, the authors describe the technical details of their methodology in modeling the positional uncertainty of geospatial utility data, monitoring the excavator movement, and synergizing them into a geospatial AR environment for real-time visualization and proximity analysis. The authors then illustrate the test and validation of the newly developed framework. Finally, the authors draw their conclusions and point out future research directions.

Related studies

This section reviews related studies in modeling geospatial underground utilities and their associated positional uncertainties, the methods of locating invisible underground utilities, and emerging Augmented Reality-based approaches to visualizing buried, invisible utilities in their spatial context. The current practice of the 811 One-Call procedure and its limitations are also reviewed.

Geographic information systems (GIS) for utility data

Geographic information science (GISci) is the discipline that focuses on understanding the world by describing, analyzing, and explaining human relationships with the earth (Huxhold 1991). GIS is an information system built upon GISci to manage, analyze, and report spatial data, describing phenomena above, on, and underneath the earth’s surface. A GIS is both a database system to manage spatial and non-spatial data and a set of spatial operators for working on spatial relations (Poku and Arditi 2006). Integration of spatial and non-spatial information on a database platform, registration of locations to the real world coordinates, and spatial analytical capabilities are the distinguishing merits of GIS, leading to the proliferation of its applications in civil engineering (Poku and Arditi 2006;Miles and Ho 1999).

GIS allows a utility owner to have a complete utility inventory stored in a single repository that is easy to update and extract (Corbley 2007). Since its emergence, GIS has been steadily replacing paper-based as-built drawings and digital Computer Aided Design and Drafting (CADD) drawings to inventory and manage utility data. It has become the de facto tool of choice for utility owners for creating, organizing and managing geospatial utility information (Sipes 2007). Many utility owners have gone through the transition from paper maps to CADD files or GIS databases via digitization (Cypas et al. 2006).

GIS has historically been two-dimensional (2D), modeling the geometries of objects as georeferenced points, polylines, and polygons. Buried utilities are predominantly modeled as 2D polylines in GIS, with no vertical location information. Some utilities might have their buried depths (e.g. “depth of cover”) stored as attributes, and their vertical location might be derived by consulting the reference surface. However, utility depths are rarely referenced to a recognized elevation datum (Federal Highway Administration (FHWA) 1999; Anspach 1995), and any change in the reference surface makes the buried depth a very unreliable source for deriving vertical utility locations.

Locating utilities in the field

When utilities are first installed, their locations are captured in as-built drawings that vary in terms of information richness, positional accuracy, and storage format (e.g. paper-based versus digital). The advancement in tracking technologies such as Global Positioning Systems (GPS) and Radio Frequency Identification (RFID) has greatly facilitated the collection of accurate utility locations for new installations in both horizontal and vertical dimensions (Dziadak et al. 2008;North 2010).

After utilities are installed and covered, the procedure of locating them in the field typically starts with the as-built drawings. Geophysical surveys based on a variety of sensing and locating techniques might be performed to locate buried utilities with good accuracies at the levels of A and B, based on the generally accepted definitions of quality levels in Subsurface Utility Engineering (SUE) (Stevens and Anspach 1993;Lew 1996; American Society of Civil Engineers (ASCE) 2002). Locating techniques include radio frequency (RF) detection techniques, electromagnetic techniques, magnetic methods, vacuum extraction, ground penetrating radar (GPR), and terrain conductivity (Anspach 1995). A number of studies have been conducted to apply the GPR technique in detecting buried utilities (Hereth et al. 2006; National Research Council (NRC) 2000;Butler 2001;Lanka et al. 2001). GPS, though not a detecting technique, has been frequently combined with locating techniques to register the location of detected utilities to real-world spatial referencing system and thus, forms the foundation of integrating with GIS to automate the inventory and update of utility locations in GIS and guide future field location of utilities (Common Ground Alliance (CGA) 2010;Ellis et al. 2009;Manacorda et al. 2007;Bakhtar 2006;Ishikawa et al. 2006;Goldstein 1997).

Uncertainty in geospatial data

Uncertainty has been a major issue in GIS for many years (Heuvelink and Burrough 2002; Goodchild 1998), and is one of the top ten research priorities in GISci (Cobb et al. 2000). Uncertainty can be generally defined as the discrepancy between what a database indicates and what actually exists in the real world (Goodchild 1998) and can be described in terms of positional inaccuracy, errors, vagueness, ambiguity, fuzziness, scale, and sampling (Goodchild 1998; Cobb et al. 2000; Fisher 1999). The literature on the topic makes use of a range of terms related to uncertainty (Crosetto and Tarantola 2001; Duckham et al. 2001), including quality, accuracy, reliability, error, ignorance, precision, clearness, distinctiveness, etc. (Foody 2003). A large number of research articles have been published in the proceedings of two international conference series, the Symposia on Spatial Accuracy Assessment and the Symposia on Spatial Data Quality.

Current practice uses metadata (i.e. data about data) standards such as ISO 19113 (ISO 2001), ISO 19115 (ISO 2003), and Federal Geographic Data Committee (FGDC) (Federal Geographic Data Committee (FGDC) 1998) to measure and record the quality of spatial data upon a common platform (Fisher et al. 2010). Spatial data quality is generally measured in terms of lineage (e.g. data collection methods and data sources), accuracy, consistency, and completeness as the assessment of results (Fisher et al. 2010). The concept of “fitness for use,” i.e. how well a certain data set meets the needs of an application, has also been proposed as the overall measure of the quality/certainty of geospatial data (Fisher et al. 2010; Devillers et al. 2010).

While metadata provides a common platform for recording and communicating data uncertainty information, it does not provide methods to handle and mitigate the inherent uncertainty in geospatial data. The current practice of GIS is deterministic. All visualization and analyses are performed as if the underlying GIS data were correct, without any incompleteness, errors, or inaccuracy. To cope with this limitation, Burrough (Burrough 1992) discussed an “intelligent GIS” that benefits from available metadata to support the use of uncertain data. Unwin (Unwin 1995) introduced the concept of “error-sensitive GIS” for error management such as data verification and validation, visualization of errors/uncertainties, and simulation and sensitivity analysis to obtain a range of potential results as well as to associate a sense of credibility with each scenario. Duckham and McCreadie (Duckham and McCreadie 2002, 1999) proposed the concept of “error-aware GIS” as an extension to “error-sensitive GIS” by adding techniques to understand errors and integrate errors into decision-making. Devillers et al. (Devillers et al. 2005) designed a prototype multidimensional database system to assist users in assessing the fitness for use of geospatial data, considering errors and error propagation issues.

Error modeling for linear objects

For linear, geospatial utility data, uncertainty is mostly concerned with the positional discrepancy between the records-indicated object locations and their real world locations and thus, uncertainty might be interchangeable with positional error/inaccuracy. Accuracy assessment and error modeling of linear objects have been active research topics. A number of studies have proposed and tested several error models for geospatial linear objects such as roads, utilities, and streams (Mozas and Ariza 2011;Shi and Liu 2000;Goodchild and Hunter 1997;Caspary and Scheuring 1993;Perkal 1956).

Utility lines in GIS are typically modeled as straight line segments that connect two end points. In 2D, the uncertainty of a straight line can be captured as an uncertainty epsilon band that encloses the “true” location of the utility centerline (Mozas and Ariza 2011). This concept was initially proposed by Perkal (Perkal 1956), who used an epsilon band, the 2D space enclosed by two parallel lines that are also tangents to circular errors at the ending points, as the probability range for a line. The epsilon band has been discussed frequently in the literature (Blakemore 1984; Aspinall and Pearson 1995) and has been implemented in 2D GIS in various algorithms in the form of a tolerance (Goodchild and Hunter 1997). Caspary and Scheuring (Caspary and Scheuring 1993) and Shi and Liu (Shi and Liu 2000) suggested that intermediate points on a line have smaller errors than the end points and thus, the error band will be “slim” in the middle, leading to a genetic band, or a G-band. Probabilities can be determined for G-bands with various sizes to model the uncertainty in lines (Heuvelink et al. 2007; Wu and Liu 2008). Goodchild and Hunter (Goodchild and Hunter 1997) pointed out that epsilon was often interpreted in a deterministic sense as the minimum buffer width that enclosed the true location of the objects under assessment, and was therefore very sensitive to outliers. They proposed a simple buffering approach to evaluate the positional accuracy of linear objects by simultaneously referring to the buffer width and the percentage of lines within this buffer (Goodchild and Hunter 1997). The main limitations of current error modeling for linear objects are (1) being 2D, (2) being deterministic (i.e. the epsilon band is determined so as to enclose the true location and, given the existence of outliers, the band width is unreasonably large), and (3) the lack of a method to estimate the most probable location of linear objects given their recorded location and associated positional uncertainty, the reverse procedure of accuracy assessment.

Augmented reality (AR) for utility visualization

Relatively recent advances in AR have enabled the visualization of invisible, buried utilities in the context of their real-world surroundings (Kamat 2003; Behzadan 2008; Roberts et al. 2002a). In an AR environment, buried utilities can be visualized as floating lines at the correct locations relative to background images and photos. If three-dimensional (3D) reference data is available, the buried utilities can be offset downward to compose an interactive 3D display (Talmaki and Kamat 2012). The positional uncertainties associated with geospatial utility data might be visualized as 3D buffers/“halos” to provide an uncertainty-aware visualization and proximity analysis (Talmaki et al. 2012).

Current practice of the One-call system

Recognizing excavation damage as the largest single cause of pipeline accidents and associated deaths and injuries, the National Transportation Safety Board (NTSB) initiated the development of the One-Call (811) notification system in the 1970s (National Transportation Safety Board (NTSB) 1998). Federal law requires the locations of all buried utilities in the vicinity of an excavation area to be pre-marked before digging. Excavation contractors are required to contact state-wide one-call agencies 48 to 72 hours prior to the start of operations. One-call agencies, in turn, contact their member companies with the location of the excavation site. If the member companies determine an overlap between the job site and the location of their utility lines, they mark the location of the utility lines using spray paint, flags, stakes, or any combination of these.

The markings are typically made by referring to data from as-built drawings. An obvious limitation of this one-call procedure is that the markings are typically the very first things to be removed when excavation starts. The excavation operators then have to rely on their memory to estimate the utility location, and on their imagination to compose mental images of the proximity of the digging implement to utilities and of how the excavation operation might interact with utilities. This error-prone procedure is, in turn, the main reason for the large number of utility strikes that occur even after the one-call procedure is followed.

The overarching goal of this study is to prevent unintended collisions between a digging implement and buried utilities via uncertainty-aware visualization and proximity analysis of excavation operations in a virtual environment. The premises for the research study presented in this paper are that (1) increased spatial awareness of excavator operators (e.g. being able to “see” the buried utilities and the movement of the excavator bucket and judge the proximity between utilities and the bucket) leads to improved excavation safety, and (2) such an increase in spatial awareness can be achieved through the combination of visual perception and analytical proximity monitoring of excavation operations. Considering the pivotal role of data quality and the tremendous uncertainty in utility locations, the research hypothesis is that uncertainty, particularly spatial uncertainty, can be brought into the decision-making process in an easy-to-understand manner for both visualization and analytical proximity monitoring to prevent unintended collisions and increase excavation safety.


To achieve the aforementioned goal, the authors designed a framework (Figure 1) to synergistically incorporate inherent uncertainties associated with geospatial utility data into a geospatial AR environment for error-aware visualization and proximity monitoring. This section describes the technical details of modeling error/uncertainty in geospatial utility data and quantifying the error model by linking accuracy to data lineage. This section also presents an error-aware utility data model expressed in the Unified Modeling Language (UML) and its implementation as an Extensible Markup Language (XML) (Cypas et al. 2006) model for data transfer and sharing with downstream visualization and analysis, chosen for its flexibility, extensibility, and compatibility with open-source requirements. The mechanism of proximity monitoring and uncertainty-aware reasoning is also provided as the basis for appropriate warning messages.

Uncertainty modeling of geospatial utility data

The authors modeled utility lines as 3D straight line segments connected at turning points (Talmaki et al. 2012). In practice, the 3D locations of turning points are first obtained and 3D straight lines can be constructed by connecting these 3D turning points. Of all positional accuracy/error models, the most related are those that apply to points and lines. Uncertainty, when expressed as probabilities associated with geospatial extents that contain the “true” location of that object, can be derived from the recorded utility location and its positional accuracy.

Figure 2 illustrates the 3D uncertainty model that was designed specifically for geospatial utility points. Probabilities were introduced to represent uncertainties. The assumptions for this model include: (1) the positional error consists of a systematic error and a random error, (2) the random error is normally distributed in all dimensions, (3) random errors in the X, Y, and Z directions are independent, and (4) horizontal errors do not differ between the X and Y dimensions. The systematic error can be remedied via a translation in a 3D coordinate system. The random error in the Z dimension follows a normal distribution and probability ranges can be determined as linear ranges. The random errors in the X and Y dimensions are combined into a horizontal radial error calculated as $\sqrt{e_x^2 + e_y^2}$, where $e_x$ and $e_y$ are the random errors in the X and Y dimensions, respectively. The horizontal radial error can be treated as a normal distribution (Greenwalt and Shultz 1968) or a $\chi^2$ distribution with two degrees of freedom (Caspary and Scheuring 1993). Probability circles can then be determined as pairs of a radius in the unit of standard deviation (SD) and a probability expressed as a percentage.
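The pairing of a radius in SD units with a probability can be sketched numerically. The following is a minimal illustration, not the authors' implementation: it assumes the χ² treatment with two degrees of freedom for the squared, SD-normalized horizontal radial error and a zero-mean normal vertical error, and the function names are hypothetical.

```python
import math
from statistics import NormalDist


def circle_radius_sd(p: float) -> float:
    """Radius (in SD units) of the horizontal circle that encloses probability p.

    Assumes equal, independent normal errors in X and Y, so that
    P(r <= k * SD) = 1 - exp(-k**2 / 2)  (chi-square with 2 degrees of freedom).
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.sqrt(-2.0 * math.log(1.0 - p))


def linear_half_range_sd(p: float) -> float:
    """Half-width (in SD units) of the symmetric vertical range with probability p."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return NormalDist().inv_cdf(0.5 + p / 2.0)


if __name__ == "__main__":
    for p in (0.50, 0.90):
        print(f"{p:.0%}: circle radius = {circle_radius_sd(p):.4f} SD, "
              f"vertical half-range = {linear_half_range_sd(p):.4f} SD")
```

Under these assumptions the 50% and 90% circles have radii of roughly 1.18 SD and 2.15 SD, which shows why a probability circle is naturally stored as a (radius, probability) pair.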

Figure 2. Error and uncertainty model for points.

The horizontal probability circles and the vertical linear probability ranges can be combined into 3D probability ellipsoids/spheres (Greenwalt and Shultz 1968) or probability cylinders. A probability cylinder is constructed by taking a probability circle and extruding it vertically. The resulting probability is the product of the corresponding circle probability and the linear probability. For instance, the horizontal 50% circle is extruded to reach the vertical 90% linear probability, resulting in a 45% probability cylinder. Unlike a probability ellipsoid or circle, many different cylinders can have the same probability, making implementation impractical. A random simulation was conducted, and its results confirmed that 3D ellipsoids closely follow the point clouds; ellipsoids were therefore chosen to model the uncertainty of utility turning points.
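The product rule for probability cylinders, and the non-uniqueness that makes them impractical, can be checked with a short sketch. It relies on the same assumptions as the model (independent horizontal and vertical errors, χ² with two degrees of freedom horizontally, normal vertically); the function names are illustrative only.

```python
import math
from statistics import NormalDist


def circle_probability(radius_sd: float) -> float:
    """Probability enclosed by a horizontal circle of the given radius (SD units)."""
    return 1.0 - math.exp(-radius_sd ** 2 / 2.0)


def linear_probability(half_range_sd: float) -> float:
    """Probability enclosed by a symmetric vertical range of the given half-width."""
    return 2.0 * NormalDist().cdf(half_range_sd) - 1.0


def cylinder_probability(radius_sd: float, half_height_sd: float) -> float:
    """Independent horizontal and vertical errors: the probabilities multiply."""
    return circle_probability(radius_sd) * linear_probability(half_height_sd)


# Two differently shaped cylinders that enclose the same 45% probability.
r50 = math.sqrt(-2.0 * math.log(0.5))   # 50% circle radius
r90 = math.sqrt(-2.0 * math.log(0.1))   # 90% circle radius
h50 = NormalDist().inv_cdf(0.75)        # 50% vertical half-range
h90 = NormalDist().inv_cdf(0.95)        # 90% vertical half-range

narrow_tall = cylinder_probability(r50, h90)   # 50% circle x 90% range
wide_short = cylinder_probability(r90, h50)    # 90% circle x 50% range

if __name__ == "__main__":
    print(f"narrow/tall: {narrow_tall:.3f}, wide/short: {wide_short:.3f}")
```

Both cylinders evaluate to a probability of 0.45 although their dimensions differ, which is the ambiguity the text refers to.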

In this study, utility lines were modeled as straight line segments that connect two end points. In 2D, the uncertainty of a straight line is captured as an uncertainty epsilon band (Mozas and Ariza 2011). The epsilon band took a shape that is “slim” (Shi and Liu 2000;Caspary and Scheuring 1993) in the middle to indicate the positional error at turning points is larger than that at intermediate points on the line. Such a band model in 2D is commonly referred to as a genetic band, or G-band. Since this study extended the 2D G-band into 3D and introduced probabilities to represent uncertainties, the resulting uncertainty model for utility lines was named 3D Probability G-band. Figure 3 illustrates the uncertainty model for utility lines by extending the epsilon concept from 2D into 3D and incorporating probabilities to represent uncertainty. Only one probability band (the outermost surface) is shown in Figure 3. The 3D space enclosed by this boundary represents the space that encloses the true location of the utility line at a particular probability. The shape and the size of the band can be described via a mathematical function (Equation (1)):

$$r_{x,\,p\%} = f(x,\ p\%,\ \text{data lineage}), \qquad (1)$$

where $r_{x,\,p\%}$ represents the radius of the p% probability band at location x along the centerline.

Figure 3. Uncertainty model for lines.

The function f indicates that the radius depends on the location along the centerline, the probability of interest, and the data lineage. The determining effect of the data lineage on uncertainty models is described in detail in the section Construction of uncertainty models for geospatial utility data. When the locations of utility lines are collected directly, such as with GPR, the 3D Probability G-band can be simplified into 3D Probability Bands that can be derived via the 3D buffering approach. Rather than being “slim” in the middle, such a probability band is of uniform size along the line.
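The paper does not give a closed form for f, so the following is only one plausible instance, offered as a sketch. It assumes that the two end points carry independent errors with the same per-axis SD, that an intermediate point is a linear interpolation of the end points, and that the cross-sectional error is scaled with the two-degree-of-freedom χ² factor. The parameter `t` (fractional position x divided by segment length) and both function names are hypothetical.

```python
import math


def probability_scale(p: float) -> float:
    """Scale factor k such that a radius of k * SD encloses probability p
    (assumption: two-dimensional cross-sectional error, chi-square with 2 dof)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.sqrt(-2.0 * math.log(1.0 - p))


def gband_radius(t: float, p: float, endpoint_sd: float) -> float:
    """Radius of the p probability G-band at fractional position t in [0, 1].

    A point at t is (1 - t) * A + t * B; with independent end-point errors of
    equal SD, its SD is endpoint_sd * sqrt((1 - t)**2 + t**2), smallest at t = 0.5.
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    sd_at_t = endpoint_sd * math.sqrt((1.0 - t) ** 2 + t ** 2)
    return probability_scale(p) * sd_at_t


def uniform_band_radius(p: float, line_sd: float) -> float:
    """Uniform probability band for directly surveyed lines (no slimming)."""
    return probability_scale(p) * line_sd


if __name__ == "__main__":
    for t in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(f"t = {t:.2f}: 90% radius = {gband_radius(t, 0.90, 0.10):.4f} m")
```

With these assumptions the band radius at the midpoint is about 71% of the radius at the end points, reproducing the "slim in the middle" shape, while `uniform_band_radius` corresponds to the simplified band of constant size.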

Validation of uncertainty models

Monte Carlo simulations were performed to validate both the point and line error models. Figure 4 illustrates the distributions for end points and the centerline. The Monte Carlo simulations generated random 3D locations for end points. Corresponding end points were connected to form centerlines. The 3D ranges of the 50% and 90% probabilities are highlighted in Figure 4, where the white line indicates the 3D extent that has a 50% probability of enclosing the true utility location and the black line indicates the 3D extent that has a 90% probability of enclosing the true utility location. Figure 4 clearly illustrates that the point distribution follows a spherical shape and that the 3D Probability G-bands for the line are “slim” in the middle. Such 3D ranges (i.e. 3D Probability G-bands) compose the 3D uncertainty models for utility centerlines in this study.
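A simulation of this kind can be sketched in a few lines. This is an illustrative reconstruction rather than the authors' experiment: the segment geometry, sample size, seed, and the choice to measure only the lateral (Y) spread are all assumptions made for the example.

```python
import random
import statistics


def simulate_lateral_spread(n: int = 20000, sd: float = 1.0, seed: int = 42):
    """Perturb both end points of a segment with independent normal errors and
    return the lateral (Y) standard deviation at an end point and at the midpoint."""
    rng = random.Random(seed)
    a_true = (0.0, 0.0, 0.0)     # hypothetical recorded end points
    b_true = (10.0, 0.0, 0.0)
    end_y, mid_y = [], []
    for _ in range(n):
        a = [c + rng.gauss(0.0, sd) for c in a_true]
        b = [c + rng.gauss(0.0, sd) for c in b_true]
        # Connect the simulated end points and sample the centerline midpoint.
        mid = [(ai + bi) / 2.0 for ai, bi in zip(a, b)]
        end_y.append(a[1])
        mid_y.append(mid[1])
    return statistics.pstdev(end_y), statistics.pstdev(mid_y)


if __name__ == "__main__":
    end_sd, mid_sd = simulate_lateral_spread()
    print(f"end-point SD = {end_sd:.3f}, midpoint SD = {mid_sd:.3f}, "
          f"ratio = {mid_sd / end_sd:.3f}")
```

The midpoint spread comes out noticeably smaller than the end-point spread (theoretically by a factor of 1/√2 under these assumptions), consistent with the slimming of the band described above.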

Figure 4. Uncertainty model for lines.

Construction of uncertainty models for geospatial utility data

The authors constructed uncertainty models by quantifying the 3D Probability bands in pairs of probability and band size. The probability expressed in percentage refers to the probability of a specific 3D geospatial volume determined by a particular band size enclosing the “true” location. The band size was further detailed via a mathematical function that describes the shape and extent of the 3D G-band. The function might be simplified into a cylinder function when describing a 3D Probability band that does not “slim” in the middle along the line.

In implementation, the probability bands for centerlines must be transformed so that they apply to utility circumferences. This is because GIS models 3D utility pipes/lines as lines that correspond to utility centerlines. Figure 5 illustrates this transformation for utilities with a circular cross-section. The size of the transformed band is the sum of the pipe radius and the model band size. This transformation works on the cross-sectional dimension and is straightforward. The size of a probability range, whether linear (1D), radial (2D), or a 3D G-band, is quantified in the unit of standard deviation (SD), a commonly used statistical index for positional accuracy. Such a mechanism establishes a direct link between positional accuracy models (typically described via the deviation of the true location from the observed location) and uncertainty models (typically expressed as the probability of being true, i.e. the probability of enclosing the true utility location). Creating uncertainty models via 3D probability ellipsoids/spheres and probability G-bands for utility points and lines facilitated the adoption of the proximity (buffer) process of GIS and formed the foundation for visualizing and analyzing positional uncertainties in utility lines and points.
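As a small worked example of the centerline-to-circumference transformation, the sketch below converts a band size in SD units to metres and adds the pipe radius. The function name and the numeric values are hypothetical and serve only to illustrate the arithmetic.

```python
def circumference_band_radius(pipe_radius_m: float,
                              band_size_sd: float,
                              sd_m: float) -> float:
    """Radius of the probability band measured from the centerline once the
    physical pipe is accounted for: pipe radius + (band size in SD) * SD."""
    if min(pipe_radius_m, band_size_sd, sd_m) < 0.0:
        raise ValueError("all inputs must be non-negative")
    return pipe_radius_m + band_size_sd * sd_m


if __name__ == "__main__":
    # Hypothetical values: a 0.30 m diameter pipe, a band of 2 SD, SD = 0.10 m.
    print(circumference_band_radius(pipe_radius_m=0.15, band_size_sd=2.0, sd_m=0.10))
```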

Figure 5. Transformation to circumferences.

The premise for this dynamic approach for model construction is the linkage between data lineage and positional uncertainty, i.e. the genesis and the process to collect and interpret data determines the positional accuracy and uncertainty, a widely accepted scenario in Subsurface Utility Engineering (SUE) (Sterling et al. 2009) and GIS. In order to estimate the positional accuracy associated with the data lineage, utility data sources are first semantically categorized into four Level 1 quality groups according to (American Society of Civil Engineers (ASCE) 2002), detailed in Table 1. The level 1 groups are then detailed to Level 2 groups based on spatial measurement technologies. For instance, group A can be categorized based on subsequent measurement method and equipment. The use of real time kinematic (RTK) GPS could result in a positional accuracy of 6 mm (horizontal) and 12 to 18 mm (vertical). Similarly, Groups B, C, and D are further categorized based on geophysical methods used; density and spatial coverage, and the methods used to survey and record above-ground utility features; and original data collection methods; respectively. Since data lineage is now categorized based on spatial measurement technologies that in turn, determine the achievable levels of accuracy, the correlation between data lineage and positional accuracy is logically established. Consequently, uncertainty models, following the link between accuracy models and uncertainty models, can then be constructed to quantify uncertainty into probability 3D G-bands.

Table 1 Quality Groups of Utility Location Data

The dynamic nature of the uncertainty model construction approach is highlighted in Figure 6 that also illustrates the system architecture as well as the flow chart in managing the uncertainties of geospatial utilities data in a synergetic geospatial AR environment. The system is composed of three modules. The Uncertainty Modeling Module contains accuracy models, uncertainty models, the link between data lineage and accuracy, and the link between accuracy and uncertainty. The Open Source <UML> & <XML> Module includes data models employed to facilitate the management and sharing of geospatial utility data with downstream engineering applications. The Visualization and Proximity Analysis Module focuses on visualizing invisible utilities and the excavator movement, and analyzing the proximity between a digging implement (e.g. the excavator) and buried utilities, given inherent uncertainties in the utility location and excavator movement. The workflow (indicated by arrows in Figure 6) is such that: (1) utility data is enhanced with data lineage information (based on an expanded version of Table 1), (2) based on the links between data lineage and accuracy, and between accuracy and uncertainty, utility data is enhanced with uncertainty information, e.g. probability bands, and (3) uncertainty-aware utility data feeds into the geospatial AR for visualization and proximity analysis. Throughout the workflow, probability bands are derived rather than stored as properties of the corresponding utilities. This dynamic nature allowed easy update of the uncertainty model as well as extension and expansion. Any changes/updates in the uncertainty model and/or data lineage are automatically updated during future data utilization as utility data is extracted. For instance, the data lineage of a utility might change from Quality Group B to A. Simply updating the data lineage information automatically takes care of the uncertainty bands. 
Similarly, when the uncertainty model of a specific type of data lineage is changed, the uncertainty bands for all utilities that have that particular type of data lineage are automatically updated. Such a design is similar to the CASCADE rules in databases to ensure data integrity and consistency (Date 2000).
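The idea of deriving bands on demand instead of storing them can be illustrated with a lookup keyed by data lineage. This is a minimal sketch under stated assumptions: the lineage keys and all SD values are hypothetical placeholders, except that the 6 mm horizontal figure for RTK GPS is taken from the text and treated here as an SD; the χ² scaling is likewise an assumption of the example.

```python
import math
from dataclasses import dataclass

# Hypothetical lineage-to-accuracy table (horizontal SD in metres).
# Only the 0.006 m RTK GPS value comes from the text; the rest are placeholders.
LINEAGE_SD_M = {
    "A/RTK-GPS": 0.006,
    "B/GPR": 0.15,
    "C/surface-survey": 0.50,
    "D/records": 1.00,
}


@dataclass
class Utility:
    name: str
    lineage: str
    pipe_radius_m: float

    def band_radius_m(self, p: float) -> float:
        """Derive the p probability band at query time; nothing is cached,
        so a change in lineage or in the table takes effect immediately."""
        sd = LINEAGE_SD_M[self.lineage]
        k = math.sqrt(-2.0 * math.log(1.0 - p))  # assumed chi-square (2 dof) scaling
        return self.pipe_radius_m + k * sd


if __name__ == "__main__":
    main = Utility("gas main", "B/GPR", pipe_radius_m=0.10)
    print(f"Group B: {main.band_radius_m(0.90):.3f} m")
    main.lineage = "A/RTK-GPS"           # the utility is re-surveyed
    print(f"Group A: {main.band_radius_m(0.90):.3f} m")
```

Because the band is computed from the current lineage each time, upgrading a utility from Quality Group B to A, or editing one entry of the table, propagates to every subsequent query in the manner of a cascading update.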

Figure 6. The dynamic procedure in the construction of uncertainty model.

Uncertainty-aware geospatial data model for utilities

To facilitate the sharing of uncertainty-aware geospatial utility data in downstream applications, the authors created an uncertainty-aware utility data model, illustrated in Figure 7. A solid box represents a class. A line represents an association between two classes. Special symbols such as diamonds and triangles are added to lines to indicate specific associations. For instance, a solid diamond represents composition – the relationship of “owns”; a hollow diamond represents aggregation – the relationship of “has”; and a hollow triangle represents generalization/inheritance – the relationship of “is a”. In this model, the utility network is composed of utility lines, which are straight line segments spatially constructed by connecting ending points, or utility vertices. The uncertainty model is specialized into a linear uncertainty model for utility lines and an ellipsoid model for utility vertices. The linear uncertainty model is further specialized into a 3D G-band model and a 3D cylinder model to suit different application needs. All uncertainty models are quantitatively associated (i.e. probabilities and band sizes are derived) with their corresponding utility objects (e.g. utility lines and vertices) by referring to the linkage model. Such a model echoes the dynamic nature highlighted in Figure 6.
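The class relationships described above can be mirrored in code to make the generalization and composition links concrete. The sketch below is an interpretation of the prose, not a transcription of Figure 7; all class and attribute names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class UncertaintyModel:
    probability: float                      # e.g. 0.90 for a 90% band


@dataclass
class EllipsoidModel(UncertaintyModel):     # "is a" model for utility vertices
    semi_axes_sd: Tuple[float, float, float] = (1.0, 1.0, 1.0)


@dataclass
class LinearUncertaintyModel(UncertaintyModel):   # "is a" model for utility lines
    pass


@dataclass
class GBandModel(LinearUncertaintyModel):   # band that slims toward the middle
    end_radius_sd: float = 1.0


@dataclass
class CylinderModel(LinearUncertaintyModel):  # band of uniform radius
    radius_sd: float = 1.0


@dataclass
class UtilityVertex:
    x: float
    y: float
    z: float
    lineage: str                            # key into a linkage model


@dataclass
class UtilityLine:
    vertices: List[UtilityVertex]           # segments connect consecutive vertices
    lineage: str


@dataclass
class UtilityNetwork:
    lines: List[UtilityLine] = field(default_factory=list)  # composition ("owns")
```

Note that the utility classes hold only lineage information; the uncertainty model instances would be produced from that lineage by a linkage step, in keeping with the derived-not-stored design.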

Figure 7

The uncertainty-aware utilities data model.

The authors explored XML-based data schemas such as the Geography Markup Language (GML) (Burggraf 2006) and CityGML, an extension to GML for the built environment (Kolbe et al. 2008), and created an XML-based model (Figure 8) as an implementation of the uncertainty-aware utilities data model in Figure 7. The newly created XML schema is an open source data model that is flexible and extensible to facilitate modeling, sharing, exchange, and integration of geospatial data. In the newly created model, each utility line is composed of straight line segments that are fixed in space by their two end points. A vertex is only needed where the utility changes direction, e.g. at a bend. All spatial and non-spatial lineage information is attached to utility vertices and lines so that probability 3D-bands can be constructed dynamically for uncertainty-aware visualization and proximity monitoring in excavation, and for the sharing of uncertainty-aware data in downstream applications.

Figure 8

XML schema for uncertainty-aware geospatial utility data.

Geospatial augmented reality environment for visualization and proximity analysis

In order to visualize uncertainty-aware geospatial data in Augmented Reality, the data must first be converted to a format suitable for 3D graphical visualization. More importantly, the geodata accuracy and its uncertainty must be graphically characterized and displayed for it to be useful in excavator operation and control. This research developed methods to characterize utility geodata in terms of its lineage and accuracy. Together, the accuracy and lineage help characterize the errors of a geodata set and the reliability that can be associated with its source. This information can be usefully exploited during excavation by displaying not only the expected locations of utilities to an operator, but also the degree of uncertainty (or “buffer”) associated with the expected locations in the form of a “halo”. In a 2D projection, the buffer calculated by interpreting the geodata’s lineage and accuracy is represented as a “band” whose width represents the uncertainty associated with the utility’s location. In a 3D projection, the buffer is identified by increasing the diameter of the cylindrical geometry representing the utility line.

The proximity analysis included three scenarios, illustrated in Figure 9, with increasing capability to handle positional uncertainty. Figure 9(a) illustrates a deterministic proximity measure, i.e. the closest distance between an underground utility and the digging implement (e.g. the excavator bucket), assuming both the utility pipe and the excavator bucket are at their “true” locations with no positional uncertainty. Figure 9(b) introduces the positional uncertainty in the geospatial utility data, but the excavator bucket is still deterministic. Figure 9(c) introduces the positional uncertainty in the excavator bucket in addition to the positional uncertainty in the utility location. For illustration purposes, both Figure 9(b) and (c) are presented in 2D.

Figure 9

Interpretation of resulting uncertainty in proximity analysis; (a) Deterministic proximity measure; (b) Proximity measure incorporating uncertainty in pipe location; (c) Proximity measure incorporating uncertainty in pipe location and bucket movement.

The original positional uncertainty “propagates” into the proximity, reflected as a probability associated with the resulting 3D distance. However, it is inappropriate to interpret the propagated uncertainty as “an x% probability of proximity of d.” Rather, it shall be interpreted as “an x% probability that the proximity is at least d.” Table 2 provides interpretations of Figure 9(b) scenarios. Note that 40% is exactly (100%-60%) and 15% is exactly (100%-85%). In implementation, interpretations, together with the corresponding probability bands, could be displayed to an operator as warning messages when both the probability and the proximity thresholds are reached.
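The “at least d” interpretation can be made concrete with a short sketch. It assumes, for illustration only, that each probability band is given as a pair of band size (in meters) and probability (in integer percent), and that the bucket is treated as a deterministic point at a known distance from the recorded utility centerline, as in Figure 9(b). The function name and the band sizes in the example are hypothetical.

```python
def interpret_proximity(center_distance, bands):
    """Interpret proximity under utility-location uncertainty.

    center_distance: distance (m) from the bucket to the recorded centerline.
    bands: iterable of (band_size_m, probability_pct) pairs.

    Returns a list of (probability_pct, min_distance, complement_pct), read as:
    "there is a probability_pct% probability that the proximity is at least
    min_distance", and hence at most complement_pct% that it is smaller.
    """
    results = []
    for size, pct in sorted(bands):
        # If the true utility lies within `size` of the recorded location,
        # it cannot be closer to the bucket than center_distance - size.
        min_distance = max(center_distance - size, 0.0)
        results.append((pct, min_distance, 100 - pct))
    return results


# Bucket 2.0 m from the recorded centerline; hypothetical 60% and 85% bands.
statements = interpret_proximity(2.0, [(0.5, 60), (1.0, 85)])
# statements == [(60, 1.5, 40), (85, 1.0, 15)]
```

The complements in the last column correspond to the 40% (100%-60%) and 15% (100%-85%) figures noted above.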

Table 2 Interpretation of System Level Uncertainty in Proximity

When the uncertainty in the bucket location is also introduced, as in Figure 9(c), similar analyses can be carried out with corresponding interpretations. The difference is that instead of a single 3D geometry representing the bucket, the analysis applies to a series of 3D geometries representing the probability bands of the bucket. Also, the probability to be used is the product of the probabilities in the corresponding utility and excavator bucket bands. For instance, assuming that the bucket band in Figure 10 has a 60% probability, one interpretation is that there is at most a 6% ((100%-85%)*(100%-60%)) probability that the excavator WILL hit the utility. Compared to underground utilities, the uncertainty associated with the bucket is significantly smaller, because most of the time the bucket is above ground or within the line of sight. Though it could be outside the operator’s view (e.g. blocked by the terrain or cabin), sensing technologies such as GPS, video cameras, and accelerometers can accurately track its location and movement. Thus, the uncertainties associated with the bucket might be ignored in visualization and proximity analysis, as illustrated in Figure 9(a) and Figure 9(b). When the utility data is acquired via highly accurate sensing technologies such as RTK GPS, with an uncertainty of the same magnitude as that of the bucket, the uncertainty associated with the bucket matters and should be incorporated in the analysis, as illustrated in Figure 9(c).
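The arithmetic in the example above can be reproduced with a few lines of code. The function name is hypothetical, and the sketch simply restates the product rule from the text using integer percentages; it is not a general collision-probability model.

```python
def max_hit_probability_pct(utility_band_pct, bucket_band_pct):
    """Upper bound (in percent) on the hit probability, following the text:
    the product of the complements of the utility and bucket band probabilities.
    """
    return (100 - utility_band_pct) * (100 - bucket_band_pct) / 100


# 85% utility band and 60% bucket band, as in the example above.
bound = max_hit_probability_pct(85, 60)
```

With an 85% utility band and a 60% bucket band, the complements are 15% and 40%, and their product is the 6% bound quoted in the text.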

Figure 10

Digital field location and visualization of error-aware geospatial utility data.

Results and discussion

The authors validated the developed geospatial uncertainty models in a controlled environment. Monte Carlo simulations for validating the uncertainty models for points and lines were described in an earlier section and therefore, will not be repeated here. The technical feasibility of the proposed geospatial AR ideas was evaluated by implementing a visualization and proximity analysis framework designed to visualize “error-aware” subsurface utilities during ongoing excavation operations for improved context awareness and accident avoidance.

The data used in the experiments was provided by DTE Energy, which is the largest provider of electricity and gas in southeast Michigan and consequently owns significant underground distribution assets. The first step in pre-processing the data was assigning a specific lineage to the chosen geospatial data sets. A data set assumed to have been recorded following a Ground-Penetrating Radar (GPR) survey was selected for subsequent analysis. The uncertainty model was chosen next. Given the data lineage, a cylindrical uncertainty model was chosen with quantitative error magnitudes in both the horizontal and vertical directions. The error-characterized data set was then archived in the developed XML schema.

The current practice of excavation damage prevention followed by state one-call centers, and the limitations in current practice were noted in an earlier section. Geospatial Augmented Reality can help to accurately visualize a proposed excavation area and digitize the located underground utilities, thus helping bridge the communication gaps among excavation contractors, one-call centers, utility owners, field locators and excavator operators. First, a contractor can issue a “visual” ticket to the one-call center and utility owners by superimposing a semi-transparent layer above a proposed excavation area. This in turn enables dispatched locators to come to the field, “see” the proposed excavation area (Figure 10), and precisely mark the surveyed area.

The following procedure was adopted to interpret error-aware geospatial data files and build conduit (e.g. pipe) models in the augmented space. First, the spatial and attribute information of pipelines was extracted by parsing the data model. For example, the geographical location of pipelines is recorded under the Geometry element as “LineString”. A cursor was designed to iterate through the XML file, locate “LineString” elements, and extract the geographical locations. Second, consecutive vertices within one “LineString” were converted from geographical coordinates to local coordinates in order to improve computational efficiency during the registration routine. The first vertex on the line string was chosen as the origin of the local coordinate system, and the local coordinates of the remaining vertices were determined by calculating the relative 3D vector between each of them and the first vertex, using the Vincenty algorithm (Vincenty 1975). In order to save memory, a unit cylinder was shared by all pipe segments as the primitive geometry upon which the transformation matrix was built.
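The first two steps can be sketched as follows. The XML fragment is a hypothetical stand-in for the developed schema (only the “Geometry” and “LineString” element names come from the text; the `coordinates` layout of longitude, latitude, and elevation is an assumption). The local conversion uses a simple equirectangular approximation on a spherical Earth in place of the Vincenty algorithm, which is adequate only for short segments and is used here purely to keep the example brief.

```python
import math
import xml.etree.ElementTree as ET

# Hypothetical sample; the real schema is the one shown in Figure 8.
SAMPLE = """
<UtilityLine id="G-1">
  <Geometry>
    <LineString>
      <coordinates>-83.0500,42.3300,-1.2 -83.0490,42.3300,-1.2 -83.0490,42.3310,-1.5</coordinates>
    </LineString>
  </Geometry>
</UtilityLine>
"""

EARTH_RADIUS_M = 6371000.0  # mean spherical radius (simplifying assumption)


def read_line_strings(xml_text):
    """Return one list of (lon, lat, elevation) vertices per LineString."""
    root = ET.fromstring(xml_text)
    lines = []
    for element in root.iter("LineString"):
        text = element.findtext("coordinates")
        vertices = [tuple(float(v) for v in triple.split(","))
                    for triple in text.split()]
        lines.append(vertices)
    return lines


def to_local(vertices):
    """Convert vertices to (east, north, up) meters relative to the first one.

    Equirectangular approximation, NOT the Vincenty algorithm used in the paper.
    """
    lon0, lat0, h0 = vertices[0]
    cos_lat0 = math.cos(math.radians(lat0))
    local = []
    for lon, lat, h in vertices:
        east = math.radians(lon - lon0) * EARTH_RADIUS_M * cos_lat0
        north = math.radians(lat - lat0) * EARTH_RADIUS_M
        local.append((east, north, h - h0))
    return local


lines = read_line_strings(SAMPLE)
local = to_local(lines[0])
```

Working in local coordinates keeps the numbers small and lets the subsequent scaling, rotation, and translation be done with ordinary Cartesian vector arithmetic.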

Third, the primitive cylinder geometry was scaled, rotated, and translated to the correct size, attitude, and position. For simplicity, the normalized vector between two successive vertices is referred to as the pipeline vector. The primitive cylinder was first scaled along the X- and Y-axes by the radius of the true pipeline, and along the Z-axis by the distance between the two successive vertices. The scaled cylinder was then rotated about the axis formed by the cross product of vector <0, 0, 1> and the pipeline vector, by the angle between the two vectors, i.e. the arccosine of their dot product. Finally, the center of the rotated cylinder was translated to the midpoint between the two successive vertices. This step was applied to each pair of successive vertices to extract the complete geospatial data set.
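A minimal sketch of this scale-rotate-translate step is given below. It assumes a unit cylinder of radius 1 and height 1, centered at the origin with its axis along Z; this convention, the function name, and the use of Rodrigues' rotation formula instead of an explicit 4x4 matrix are choices made for illustration, not details taken from the implementation described here.

```python
import math


def segment_transform(p0, p1, radius):
    """Return a function mapping unit-cylinder points onto the pipe segment p0-p1."""
    d = [b - a for a, b in zip(p0, p1)]
    length = math.sqrt(sum(c * c for c in d))
    v = [c / length for c in d]  # pipeline vector (normalized)

    # Rotation axis = <0, 0, 1> x v; rotation angle = arccos(<0, 0, 1> . v).
    axis = [-v[1], v[0], 0.0]
    axis_norm = math.hypot(axis[0], axis[1])
    angle = math.acos(max(-1.0, min(1.0, v[2])))
    if axis_norm < 1e-12:
        # Segment parallel to Z: the cross product vanishes, so any
        # axis perpendicular to Z serves (angle is 0 or pi).
        k = [1.0, 0.0, 0.0]
    else:
        k = [a / axis_norm for a in axis]
    mid = [(a + b) / 2 for a, b in zip(p0, p1)]
    cos_a, sin_a = math.cos(angle), math.sin(angle)

    def apply(point):
        # 1) Scale: X and Y by the pipe radius, Z by the segment length.
        s = [point[0] * radius, point[1] * radius, point[2] * length]
        # 2) Rotate with Rodrigues' formula.
        k_cross_s = [k[1] * s[2] - k[2] * s[1],
                     k[2] * s[0] - k[0] * s[2],
                     k[0] * s[1] - k[1] * s[0]]
        k_dot_s = sum(a * b for a, b in zip(k, s))
        r = [s[i] * cos_a + k_cross_s[i] * sin_a + k[i] * k_dot_s * (1 - cos_a)
             for i in range(3)]
        # 3) Translate the cylinder center to the segment midpoint.
        return tuple(r[i] + mid[i] for i in range(3))

    return apply


place = segment_transform((0.0, 0.0, 0.0), (3.0, 4.0, 0.0), radius=0.1)
start = place((0.0, 0.0, -0.5))  # bottom cap center, lands on the first vertex
end = place((0.0, 0.0, 0.5))     # top cap center, lands on the second vertex
```

Because only the transformation differs between segments, a single unit-cylinder mesh can be reused for every pipe segment, which is the memory saving mentioned in the procedure.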

The extracted data is stored digitally and contains information about the utility type and geometry, as well as the computed buffer zones indicating the geospatial uncertainty predicted by the developed models. Upon extraction and conversion of the geospatial data into 3D models, excavator operators are able to persistently visualize what utilities lie buried in a digging machine’s vicinity, and consciously avoid accidental utility strikes as excavation progresses by estimating the evolving distance between the digging machine and vicinal utilities (Figure 11).

Figure 11

Visualization of excavator proximity to expected buried utility locations.


This paper described a dynamic approach to incorporating the uncertainties associated with buried utilities data into a geospatial AR system for real-time visualization and proximity analysis. This research modeled uncertainties of buried utilities data as probability bands, each described by a band size paired with the probability that the 3D space constructed by “buffering” at that band size encloses the “true” location of the utility. Given the research hypothesis that the positional uncertainty of utilities data depends on data lineage, e.g. the genesis and processes used to collect and interpret data, the positional uncertainty of utilities data was derived in real time by referring to the data lineage model. Consequently, not only the 3D shapes and locations of utility lines and vertices but also the associated uncertainties could be visualized, as 3D probability bands, in a geospatial AR environment. This newly created approach is expected to contribute to safety in urban excavation via the integration of geoinformatics and construction informatics in real time in an uncertainty-aware manner.

A framework, a generic data model, and a sample XML implementation of the data model were developed and tested in this study. The impacts of the uncertainties in the utilities data on proximity analysis, e.g. analyzing the closeness between a digging implement and the underground utilities, were also discussed. A method was also developed for analyzing the proximity and interpreting the results in the context of uncertainties that could come from both the utilities and the excavator movement. Visualizing the uncertainty associated with utility location data and appropriately interpreting the resulting proximity were found to be key elements in any visual and analytical guidance furnished to excavator operators or field personnel to prevent utility strikes. It was found that uncertainty-aware, geospatial AR was the enabling technology to bring the interaction between a digging implement and the buried utilities to the excavator operator both visually and analytically (with appropriate interpretations). It was also found that having a practical object-oriented, open access, and uncertainty-aware utility data model was critical to the sharing of utilities data and its inherent uncertainties with downstream applications. Such a data model could also serve as the base for management and sharing of uncertainty-aware utilities data from a life cycle perspective.

The authors’ future goals in this uncertainty-aware, geospatial AR direction are to create advanced algorithms for qualifying and quantifying uncertainty, and to further integrate technologies such as GPR and machine control and guidance (MAC).