Journal of Intelligent Information Systems, Volume 25, Issue 3, pp 293–332

A Framework for Management of Semistructured Probabilistic Data


DOI: 10.1007/s10844-005-0197-8

Cite this article as:
Zhao, W., Dekhtyar, A. & Goldsmith, J. J Intell Inf Syst (2005) 25: 293. doi:10.1007/s10844-005-0197-8


This paper describes the theoretical framework and implementation of a database management system for storing and manipulating diverse probability distributions of discrete random variables with finite domains, and associated information. A formal Semistructured Probabilistic Object (SPO) data model and a Semistructured Probabilistic Query Algebra (SP-algebra) are proposed. The SP-algebra supports standard database queries as well as some specific to probabilities, such as conditionalization and marginalization. Thus, the Semistructured Probabilistic Database may be used as a backend to any application that involves the management of large quantities of probabilistic information, such as building stochastic models. The implementation uses XML encoding of SPOs to facilitate communication with diverse applications. The database management system has been implemented on top of a relational DBMS. The translation of SP-algebra queries into relational queries is discussed here, and the results of initial experiments evaluating the system are reported.
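
To make the two probability-specific operations named above concrete, the following is a minimal sketch of marginalization and conditionalization on a small joint distribution over discrete variables with finite domains. It is an illustration under stated assumptions, not the paper's SPO model or SP-algebra: the table layout, the variable names, the probabilities, and the functions `marginalize` and `conditionalize` are all hypothetical, and conditionalization is taken here in the textbook sense of restricting to one value and renormalizing.

```python
from collections import defaultdict

# Hypothetical toy data: a joint distribution stored as a tuple of variable
# names plus a mapping from value assignments to probabilities.
VARS = ("weather", "traffic")
JOINT = {
    ("sun", "light"): 0.4,
    ("sun", "heavy"): 0.1,
    ("rain", "light"): 0.2,
    ("rain", "heavy"): 0.3,
}


def marginalize(variables, table, keep):
    """Sum out every variable that is not listed in `keep`."""
    idx = [variables.index(v) for v in keep]
    out = defaultdict(float)
    for assignment, p in table.items():
        out[tuple(assignment[i] for i in idx)] += p
    return tuple(keep), dict(out)


def conditionalize(variables, table, var, value):
    """Distribution over the remaining variables given var == value."""
    pos = variables.index(var)
    rest = tuple(v for v in variables if v != var)
    # Keep only the rows consistent with the conditioning event.
    selected = {a: p for a, p in table.items() if a[pos] == value}
    total = sum(selected.values())
    if total == 0:
        raise ValueError("conditioning event has probability zero")
    # Drop the conditioned variable and renormalize.
    out = {
        tuple(x for i, x in enumerate(a) if i != pos): p / total
        for a, p in selected.items()
    }
    return rest, out


if __name__ == "__main__":
    print(marginalize(VARS, JOINT, ["weather"]))
    print(conditionalize(VARS, JOINT, "weather", "rain"))
```

In this sketch both operations return a new table of the same shape, so they can be chained with each other in the way algebra operators usually compose.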

probabilistic databases, query algebras, data models, semistructured data

Copyright information

© Springer Science+Business Media, Inc. 2005

Authors and Affiliations

  1. Department of Computer Science, University of New Mexico, Albuquerque, USA
  2. Department of Computer Science, University of Kentucky, Lexington, USA