Bayesian Network Integration with GIS
Synonyms
Definition
A Bayesian network (BN) is a graphical-mathematical construct used to probabilistically model processes that include interdependent variables, decisions affecting those variables, and costs associated with the decisions and the states of the variables. BNs are inherently system representations and, as such, are often used to model environmental processes. Because of this, there is a natural connection between certain BNs and GIS. A BN is represented as a directed acyclic graph with nodes (representing variables, costs, and decisions) and arcs (directed lines representing conditional probabilistic dependencies between the nodes). A BN can be used for prediction or analysis of real-world problems and complex natural systems where statistical correlations can be found between variables or approximated using expert opinion. BNs have a wide range of applications for aiding decision-making in areas such as medicine, engineering, natural resources, and decision management. BNs can be used to model geospatially interdependent variables as well as conditional dependencies between geospatial layers. Additionally, BNs have been found to be useful and highly efficient in performing image classification on remotely sensed data.
Historical Background
Originally described by Pearl (1988), BNs have been used extensively in medicine and computer science (Heckerman 1997). In recent years, BNs have been applied in spatially explicit environmental management studies. Examples include the Neuse Estuary Bayesian ecological response network (Borsuk and Reckhow 2000), Baltic salmon management (Varis and Kuikka 1996), climate change impacts on Finnish watersheds (Kuikka and Varis 1997), the Interior Columbia Basin Ecosystem Management Project (Lee and Bradshaw 1998), and waterbody eutrophication (Haas 1998). As illustrated in these studies, a BN graph structures a problem such that it is visually interpretable by stakeholders and decision-makers while serving as an efficient means for evaluating the probable outcomes of management decisions on selected variables.
Both BNs and GIS can be used to represent spatially explicit, probabilistically connected environmental and other systems; however, the integration of the two techniques has only been explored relatively recently. BN integration with GIS typically takes one of four distinct forms: (1) BN-based layer combination (i.e., probabilistic map algebra), as demonstrated in Taylor (2003); (2) BN-based classification, as demonstrated in Stassopoulou et al. (1998); (3) the use of BNs for intelligent, spatially oriented data retrieval, as demonstrated in Walker et al. (2004) and Walker et al. (2005); and (4) GIS-based BN decision support system (DSS) frameworks, where BN nodes are spatially represented in a GIS framework, as presented by Ames et al. (2005).
Scientific Fundamentals
The Umbrella model can be interpreted as follows: if it is raining, there is a higher probability that the forecast will predict it will rain. In reverse, through the Bayesian network “backward propagation of evidence,” if the forecast predicts rain, it can be inferred that there is a higher chance that rain will actually occur. The link between “Forecast” and “Take Umbrella” indicates that the “Take Umbrella” decision is based largely on the observed forecast. Finally, the link to the “Satisfaction” utility node from both “Take Umbrella” and “Weather” captures the relative gains in satisfaction derived from every combination of states of the BN variables.
Bayesian networks are governed by two mathematical techniques: conditional probability and Bayes’ theorem.
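In standard notation these two relations can be written as follows. The symbols A and B are generic events introduced here only for illustration, and P(B) > 0 is assumed:

```latex
\begin{align}
P(A \mid B) &= \frac{P(A, B)}{P(B)} \\
P(A \mid B) &= \frac{P(B \mid A)\, P(A)}{P(B)}
\end{align}
```

The second line (Bayes' theorem) follows from the first by writing the joint probability as P(A, B) = P(B | A) P(A).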
The conditional probability inversion represented here allows for the powerful technique of Bayesian inference, for which BNs are particularly well suited. In the Umbrella model, inferring a higher probability of rain given a rainy forecast is an example application of Bayes’ theorem.
Associated with each node in the BN is a conditional probability table (CPT). Each nature node (state variable) includes a CPT that stores the probability distribution for the possible states of the variable given every combination of the states of its parent nodes (if any). These probability distributions can be assigned by frequency analysis of the variables, by expert opinion based on observation or experience, or by setting them to some “prior” distribution based on observations of equivalent systems.
Probability of rain (CPT for the Weather node)

Weather:    No rain    Rain
            70%        30%
Forecast probability conditioned on rain (rows: Weather; columns: Forecast)

Weather    Sunny    Cloudy    Rainy
No rain    70%      20%       10%
Rain       15%      25%       60%
Satisfaction utility conditioned on rain and the “Take Umbrella” decision

Weather    Take Umbrella    Satisfaction
No rain    Take             20 units
No rain    Do not take      100 units
Rain       Take             70 units
Rain       Do not take      0 units
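With no forecast observed, the expected satisfaction under the prior weather probabilities is 0.70 × 20 + 0.30 × 70 = 35 units for taking the umbrella and 0.70 × 100 + 0.30 × 0 = 70 units for leaving it at home. The higher satisfaction is therefore predicted for leaving the umbrella at home, which provides an example of how a simple BN analysis can aid the decision-making process. While the Umbrella BN presented here is quite simple and not particularly spatially explicit, it serves as a generic BN example. Specific application of BNs in GIS is presented in the following section.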
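The Umbrella calculation can also be sketched in a few lines of Python. This is a minimal illustration and not the output of any particular BN package. The probabilities and utilities are copied from the three tables above, while the function names and the state labels ("take", "leave", "no_rain", and so on) are assumptions introduced for this example.

```python
# Umbrella model: all numbers are taken from the three tables above.
P_WEATHER = {"no_rain": 0.70, "rain": 0.30}

# P(forecast | weather)
P_FORECAST = {
    "no_rain": {"sunny": 0.70, "cloudy": 0.20, "rainy": 0.10},
    "rain": {"sunny": 0.15, "cloudy": 0.25, "rainy": 0.60},
}

# Satisfaction utility for each (weather, decision) pair
UTILITY = {
    ("no_rain", "take"): 20,
    ("no_rain", "leave"): 100,
    ("rain", "take"): 70,
    ("rain", "leave"): 0,
}


def posterior_weather(forecast=None):
    """Return P(weather | forecast) via Bayes' theorem, or the prior if no forecast."""
    if forecast is None:
        return dict(P_WEATHER)
    joint = {w: P_FORECAST[w][forecast] * P_WEATHER[w] for w in P_WEATHER}
    evidence = sum(joint.values())  # P(forecast)
    return {w: p / evidence for w, p in joint.items()}


def expected_utility(decision, forecast=None):
    """Expected satisfaction of a decision under the current belief about weather."""
    belief = posterior_weather(forecast)
    return sum(belief[w] * UTILITY[(w, decision)] for w in belief)


def best_decision(forecast=None):
    """Pick the decision with the highest expected satisfaction."""
    return max(("take", "leave"), key=lambda d: expected_utility(d, forecast))


for observed in (None, "sunny", "cloudy", "rainy"):
    print(observed, best_decision(observed))
```

With no forecast the sketch reproduces the 35 versus 70 unit comparison. Once a rainy forecast is entered as evidence, the probability of rain rises from 0.30 to 0.18 / 0.25 = 0.72 and taking the umbrella becomes the preferred decision.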
Key Applications
As discussed above, integrating GIS and BNs is useful for any BN that has spatial components, whether the aim is to display a spatially oriented BN, to use GIS functionality as input to a BN, or to form a BN from GIS analysis. The applications of such integration are therefore limited only by the presence of that spatial association. The watershed management BN mentioned above is one example, but other types of BNs may also benefit from this form of integration. For instance, many ecological, sociological, and geological studies that might benefit from a BN also have strong spatial associations. Traffic analysis BNs often have very clear spatial associations, and BNs that characterize the spread of disease in epidemiology would likely have them as well.
The four forms of integration introduced earlier are the following:

Probabilistic map algebra

Image classification

Automated data query and retrieval

Spatial representation of BN nodes
A brief explanation of the scientific fundamentals of each of these uses is presented here.
Probabilistic Map Algebra
Probabilistic map algebra involves the use of a BN as the combinatorial function applied on a cell-by-cell basis when combining raster layers. For example, consider the ecological habitat models described by Taylor (2003). There, several geospatial raster data sets were derived representing proximity zones for human-caused landscape disturbances associated with the development of roads, wells, and pipelines. Additional data layers representing known habitat for each of several threatened and endangered species were also developed and overlaid on the disturbance layers. Next, a BN was constructed representing the probability of habitat risk conditioned on both human disturbance and habitat locations. CPTs in this BN were derived from interviews with acknowledged ecological experts in the region. Finally, this BN was applied on a cell-by-cell basis throughout the study area, resulting in a risk probability map of the region for each species of interest.
The use of BNs in this kind of probabilistic map algebra is currently hindered only by the lack of specialized tools to support the analysis. However, the concept holds significant promise as an alternative to the more traditional GIS-based “indicator analysis” where each layer is reclassified to represent an arbitrary index and then summed to give a final metric (often on a 1 to 100 scale of either suitability or unsuitability). Indeed, the BN approach results in a more interpretable probability map. For example, such an analysis could be used to generate a map of the probability of landslide conditioned on slope, wetness, vegetation, etc. Certainly a map that indicates percent chance of landslide could be more informative for decision-makers than an indicator model that simply displays the sum of some number of reclassified indicators.
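The cell-by-cell idea can be sketched as follows. This is a simplified illustration that assumes every input cell is observed with certainty, so that BN inference reduces to a CPT lookup per cell. The CPT values, the two small rasters, and the function name bn_map_algebra are all hypothetical and are not taken from Taylor (2003).

```python
# Hypothetical CPT: P(habitat at risk | disturbance zone, habitat present).
# These probabilities are illustrative placeholders, not published values.
RISK_CPT = {
    ("near", True): 0.80,
    ("near", False): 0.05,
    ("far", True): 0.20,
    ("far", False): 0.01,
}

# Two small co-registered rasters (lists of rows), invented for illustration.
disturbance = [
    ["near", "near", "far"],
    ["near", "far", "far"],
]
habitat = [
    [True, False, True],
    [True, True, False],
]


def bn_map_algebra(cpt, *layers):
    """Use the CPT as the cell-by-cell combination function over aligned rasters."""
    rows = len(layers[0])
    cols = len(layers[0][0])
    for layer in layers:
        if len(layer) != rows or any(len(row) != cols for row in layer):
            raise ValueError("all layers must share the same grid dimensions")
    # The parent states at each cell form the key into the CPT.
    return [
        [cpt[tuple(layer[r][c] for layer in layers)] for c in range(cols)]
        for r in range(rows)
    ]


risk_map = bn_map_algebra(RISK_CPT, disturbance, habitat)
for row in risk_map:
    print(" ".join(f"{p:.2f}" for p in row))
```

The output raster holds a probability in every cell rather than a summed index, which is the interpretability advantage described above. A fuller treatment would pass uncertain or missing cell values to a BN inference engine instead of a direct lookup.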
Image Classification
In the previous examples, BN CPTs are derived from historical data or information from experts. However, many BN applications make use of the concept of Bayesian learning as a means of automatically estimating probabilities from existing data. BN learning involves a formal automated process of “creating” and “pruning” the BN node-arc structure based on rules intended to maximize the amount of unique information represented by the BN CPTs. In a GIS context, BN learning algorithms have been extensively applied to image classification problems. Image classification using a BN requires the identification of a set of input layers (typically multispectral or hyperspectral bands) from which a known set of objects or classifications are to be identified.
Learning data sets include both input and output layers where output layers clearly indicate features of the required classes (e.g., polygons indicating known land cover types). A BN learning algorithm applied to such a data set will produce an optimal (in BN terms) model for predicting land cover or other classification schemes at a given raster cell based on the input layers. The application of the final BN model to predict land cover or other classifications at an unknown point is similar to the probabilistic map algebra described previously.
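As a rough illustration of learning CPTs from labelled cells, the sketch below fixes the simplest possible BN structure, a naive Bayes classifier in which the class node is the only parent of each band node, and estimates its CPTs by smoothed frequency counts. It does not perform the structure creation and pruning described above. The training cells, the "low"/"high" binning of band values, and the smoothing constant alpha are all assumptions made for the example.

```python
from collections import Counter, defaultdict

# Hypothetical training cells: (band 1 level, band 2 level) -> known land cover.
# Band values are assumed to be pre-binned into "low"/"high"; all data are invented.
TRAINING = [
    (("low", "high"), "forest"),
    (("low", "high"), "forest"),
    (("low", "low"), "water"),
    (("low", "low"), "water"),
    (("high", "low"), "urban"),
    (("high", "high"), "urban"),
    (("low", "high"), "forest"),
    (("high", "low"), "urban"),
]
LEVELS = ("low", "high")


def learn_cpts(samples, alpha=1.0):
    """Estimate P(class) and P(band_i | class) from smoothed frequency counts."""
    class_counts = Counter(label for _, label in samples)
    band_counts = defaultdict(Counter)  # (band index, class) -> level counts
    for bands, label in samples:
        for i, level in enumerate(bands):
            band_counts[(i, label)][level] += 1
    total = len(samples)
    prior = {c: n / total for c, n in class_counts.items()}
    cpts = {
        key: {
            lv: (cnt[lv] + alpha) / (class_counts[key[1]] + alpha * len(LEVELS))
            for lv in LEVELS
        }
        for key, cnt in band_counts.items()
    }
    return prior, cpts


def classify(bands, prior, cpts):
    """Return the posterior P(class | bands) for a single raster cell."""
    scores = {}
    for c, p in prior.items():
        for i, level in enumerate(bands):
            p *= cpts[(i, c)][level]
        scores[c] = p
    norm = sum(scores.values())
    return {c: s / norm for c, s in scores.items()}


prior, cpts = learn_cpts(TRAINING)
posterior = classify(("low", "high"), prior, cpts)
predicted = max(posterior, key=posterior.get)
print(predicted, round(posterior[predicted], 3))
```

Applying classify to every cell of the input layers yields a per-class probability map, in the same cell-by-cell manner as probabilistic map algebra.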
Automated Data Query and Retrieval
When BNs are applied to automated query and retrieval of geospatial data sets, the goal is typically to use expert knowledge to define the CPTs that govern which data layers are loaded for visualization and analysis. Using this approach in a dynamic web-based mapping system, one could develop a BN for the display of layers using a CPT that indicates the probability that the layer is important, given the presence or absence of other layers or features within layers at the current view extents. Such a tool would supplant the typical approach, which is to activate or deactivate layers based strictly on “zoom level.” For example, consider a military GIS mapping system used to identify proposed targets. A BN-based data retrieval system could significantly optimize data transfer and bandwidth usage by only showing specific high-resolution imagery when the probability of needing that data is raised due to the presence of other features which indicate a higher likelihood of the presence of the specific target.
BN-based data query and retrieval systems can also benefit from Bayesian learning capabilities by updating CPTs with new information or evidence observed during the use of the BN. For example, if a user continually views several data sets simultaneously at a particular zoom level or in a specific zone, this increases the probability that those data sets are interrelated and should result in modified CPTs representing those conditional relationships.
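One way to picture both ideas, probability-driven layer loading and CPT updating from usage, is the small sketch below. It models a single hypothetical "load high-resolution imagery" node with one parent and keeps each CPT row as a pair of counts so that it can be updated from observed use. The starting counts, the 0.5 threshold, and all function names are assumptions introduced for illustration and are not drawn from Walker et al.

```python
# Hypothetical CPT for a "load high-resolution imagery" node, conditioned on
# whether an indicator feature is present in the current view extent.
# Each row is stored as counts [needed, not needed]; the starting counts
# stand in for expert-supplied probabilities.
counts = {
    True: [8, 2],   # indicator feature present
    False: [1, 9],  # indicator feature absent
}
LOAD_THRESHOLD = 0.5  # assumed cut-off, chosen only for illustration


def p_needed(feature_present):
    """P(layer needed | feature_present), read from the count-based CPT row."""
    needed, not_needed = counts[feature_present]
    return needed / (needed + not_needed)


def should_load(feature_present):
    """Load the layer only when it is probable enough that it will be needed."""
    return p_needed(feature_present) >= LOAD_THRESHOLD


def record_usage(feature_present, user_viewed_layer):
    """Update the matching CPT row with one new observation of user behaviour."""
    counts[feature_present][0 if user_viewed_layer else 1] += 1


print(should_load(True), should_load(False))
# Simulate a user who repeatedly opens the layer even without the indicator.
for _ in range(10):
    record_usage(False, True)
print(should_load(False), p_needed(False))
```

Storing counts rather than fixed probabilities lets expert judgment act as a starting point that observed use gradually outweighs. A real system would condition on several parents and view properties rather than a single flag.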
Spatial Representation of BN Nodes
Future Directions
It is expected that research and development of tools for the integration of GIS and BNs will continue in both academia and industry. New advancements in each of the application areas described are occurring on a regular basis and represent an active and interesting study area for many GIS analysts and users.
References
 Ames DP, Neilson BT, Stevens DK, Lall U (2005) Using Bayesian networks to model watershed management decisions: an East Canyon Creek case study. J Hydroinform 7:267–282
 Borsuk ME, Reckhow KH (2000) Summary description of the Neuse estuary Bayesian ecological response network (NeuBERN). http://www2.ncsu.edu/ncsu/CIL/WRRI/neuseltm.html. Accessed 26 Dec 2001
 Haas TC (1998) Modeling waterbody eutrophication with a Bayesian belief network. Working paper, School of Business Administration, University of Wisconsin, Milwaukee
 Heckerman D (1997) Bayesian networks for data mining. Data Mining Knowl Discov 1:79–119
 Kuikka S, Varis O (1997) Uncertainties of climate change impacts in Finnish watersheds: a Bayesian network analysis of expert knowledge. Boreal Environ Res 2:109–128
 Lee DC, Bradshaw GA (1998) Making monitoring work for managers: thoughts on a conceptual framework for improved monitoring within broadscale ecosystem management. http://icebmp.gov/spatial/lee_monitor/preface.html. Accessed 26 Dec 2001
 MapWindow Open Source Team (2007) MapWindow GIS 4.3 open source software. http://www.mapwindow.org/. Accessed 06 Feb 2007
 Pearl J (1988) Probabilistic reasoning in intelligent systems: networks of plausible inference. Morgan Kaufmann, San Francisco
 Shachter R, Peot M (1992) Decision making using probabilistic inference methods. In: Proceedings of the eighth conference on uncertainty in artificial intelligence, Stanford, pp 275–283
 Stassopoulou A, Petrou M, Kittler J (1998) Application of a Bayesian network in a GIS based decision making system. Int J Geograph Inf Sci 12(1):23–45
 Taylor KJ (2003) Bayesian belief networks: a conceptual approach to assessing risk to habitat. Utah State University, Logan
 Varis O, Kuikka S (1996) An influence diagram approach to Baltic salmon management. In: Proceedings of the conference on decision analysis for public policy in Europe, INFORMS decision analysis society, Atlanta
 Walker A, Pham B, Maeder A (2004) A Bayesian framework for automated dataset retrieval. In: Geographic information systems. 10th International Multimedia Modelling Conference (MMM), Brisbane, p 138
 Walker A, Pham B, Moody M (2005) Spatial Bayesian learning algorithms for geographic information retrieval. In: Proceedings 13th annual ACM international workshop on geographic information systems, Bremen, pp 105–114
Recommended Reading
 Ames DP (2002) Bayesian decision networks for watershed management. Utah State University, Logan
 Norsys Software Corp (2006) Netica Bayesian belief network software. Acquired from http://www.norsys.com/
 Stassopoulou A, Caelli T (2000) Building detection using Bayesian networks. Int J Pattern Recognit Artif Intell 14(6):715–733