Probability Theory as Logic: Data Assimilation for Multiple Source Reconstruction

Pure and Applied Geophysics

Abstract

Probability theory as logic (or Bayesian probability theory) is a rational inferential methodology that provides a natural and logically consistent framework for source reconstruction. This methodology fully utilizes the information provided by a limited number of noisy concentration data obtained from a network of sensors and combines it in a consistent manner with the available prior knowledge (mathematical representation of relevant physical laws), hence providing a rigorous basis for the assimilation of this data into models of atmospheric dispersion for the purpose of contaminant source reconstruction. This paper addresses the application of this framework to the reconstruction of contaminant source distributions consisting of an unknown number of localized sources, using concentration measurements obtained from a sensor array. To this purpose, Bayesian probability theory is used to formulate the full joint posterior probability density function for the parameters of the unknown source distribution. A simulated annealing algorithm, applied in conjunction with a reversible-jump Markov chain Monte Carlo technique, is used to draw random samples of source distribution models from the posterior probability density function. The methodology is validated against a real (full-scale) atmospheric dispersion experiment involving a multiple point source release.

Notes

  1. Localized sources with time-varying emission rates can be accommodated by selecting the source atoms in the structure-based representation of Eq. 5 to have the following form: \(Q_k \delta({\mathbf{x}}-{\mathbf{x}}_{s,k})\left[{\mathcal H}(t-T_b^k)-{\mathcal H}(t-T_e^k)\right]\) (\(k=1,2,\ldots, N_{\rm s}\)), where \({\mathcal H}(\cdot)\) denotes the Heaviside unit step function, and \(T_b^k\) and \(T_e^k\) are the activation and deactivation times of the \(k\)-th source atom, respectively.

  2. Actually, the source–receptor relationship considered explicitly here is a special case of Eq. 1 involving the steady continuous emission of a contaminant from a sustained source into a statistically stationary atmosphere.

  3. The binomial distribution \(B(p^*; N_{\rm s,min}, N_{\rm s,max})\) has a probability distribution function defined as follows:

    $$ p(N_{\rm s}|I)={\frac{(N_{{\rm s},{\rm max}}-N_{{\rm s},{\rm min}})!}{(N_{\rm s}-N_{{\rm s},{\rm min}})!(N_{{\rm s},{\rm max}}-N_{\rm s})!}} p^{*(N_{\rm s}-N_{{\rm s},{\rm min}})} (1-p^*)^{N_{{\rm s},{\rm max}}-N_{\rm s}}, $$

    for \(N_{\rm s} = N_{\rm s,min},N_{\rm s,min}+1, \ldots , N_{\rm s,max}.\) Note that in this definition, the standard form of the binomial distribution has been offset by the minimum number of source atoms \(N_{\rm s,min}\). If the expected degree of complexity in the source molecule is unknown a priori, then the assignment of the maximum number of source atoms can be omitted if the binomial distribution for \(p(N_{\rm s}|I)\) is replaced by a discrete probability distribution having support over the natural numbers \({{\mathbb{N}}}\) (e.g., Poisson distribution, geometric distribution).

  4. More specifically, a Bernoulli-uniform mixture model for \(Q_k\) has the following form:

    $$ p(Q_k|I) = (1-\gamma)\delta(Q_k) + \gamma {{\mathbb{I}}}_{(0,Q_{\rm max})}(Q_k)\big/Q_{\rm max}, $$

    for \(k=1,2,\ldots,N_{\rm s}\), where \({{\mathbb{I}}_A(x)}\) is the indicator function for set \(A\), with \({{\mathbb{I}}_A(x) = 1}\) if \(x \in A\) and \({{\mathbb{I}}_A(x) = 0}\) if \(x \notin A\).

  5. The forward increments of the marked particle position and velocity are defined as \(\hbox{d}{\mathbf{X}}(t) \equiv {\mathbf{X}}(t+\hbox{d}t)-{\mathbf{X}}(t)\) and \(\hbox{d}{\mathbf{U}}(t) \equiv {\mathbf{U}}(t+\hbox{d}t)-{\mathbf{U}}(t), \) respectively, with dt > 0.

  6. A Wiener process \(W_i\) is a stochastic process which has independent increments with \({\hbox{d}W_i(t+\hbox{d}t) \equiv W_i(t+\hbox{d}t) - W_i(t) \sim {\mathcal{N}}(0,\hbox{d}t)}\) where \({{\mathcal{N}}(\mu,\sigma^2)}\) denotes a Gaussian distribution with mean \(\mu\) and variance \(\sigma^2\).

  7. In this paper, we use the term “sampler” to refer to any algorithm that has been designed to draw samples from the posterior PDF of the source parameters.

  8. A Markov chain is a discrete-time random process with the Markov property (viz., the next state in the chain depends only on the current state and not on any of the past states).

  9. The prior distribution \(p(\Uptheta|I)\) is composed of standard probability distributions for which independent sampling is easy.

  10. This simply implies that \(\{\Uptheta_k(\lambda)\}_{k=1}^{N_{\rm mem}}\) can be interpreted as an ensemble of source molecules drawn from \(p_\lambda(\Uptheta|{\mathbf{D}},I).\)

  11. The finite value of \(N_{\rm s,max} = 8\) chosen here assumes that the user knows a priori that the unknown source distribution consists of a (small) number of discrete point sources. Furthermore, the choice of \(p^* = 1/7\) for \(p(N_{\rm s}|I)\) implies that the expected a priori number of sources in the domain is \(\langle N_{\rm s} \rangle = N_{\rm s,min} + (N_{\rm s,max}-N_{\rm s,min})p^* = 2\).

  12. The marginal posterior distribution \(p(N_{\rm s}|{\mathbf{D}},I)\), which is required for the inference of the number of sources, \(N_{\rm s}\), is estimated as follows:

    $$ {\hat p}(N_{\rm s}|{{\mathbf{D}}},I)=\frac{1}{N_*}\#\left\{t: N_{\rm s}^{(t)} = N_{\rm s}\right\}=\frac{1}{N_*}\sum_{t=1}^{N_*} {\mathbb{I}}_{N_{\rm s}}(N_{\rm s}^{(t)}), $$

    where \(N_*\) is the number of samples (source molecules) drawn from the Markov chain. Note that this estimate counts the number of source molecules with exactly \(N_{\rm s}\) source atoms and normalizes the count by \(N_*\).

  13. The algorithm was applied with other values for \(p^* \in (0,1)\) with essentially no change in the results with respect to both the best estimates of the source parameters and the uncertainty quantification for these estimates.

  14. A \(p\)% highest posterior density (HPD) interval encloses a source parameter with \(p\)% probability, and is constructed so that the probability density function at every point inside the interval is larger than at any point outside it. This interval can be used as an uncertainty specification for a source parameter.
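
As an illustration of the offset binomial prior in Note 3, the following minimal Python sketch evaluates \(p(N_{\rm s}|I)\) and checks the prior mean quoted in Note 11. The function and variable names are hypothetical. The value \(N_{\rm s,min} = 1\) is not stated in the notes; it is inferred here from the stated mean of 2 together with \(N_{\rm s,max} = 8\) and \(p^* = 1/7\).

```python
from math import comb


def offset_binomial_pmf(n_s, p_star, n_min, n_max):
    """Prior p(N_s | I): a binomial on (N_s - n_min) with (n_max - n_min) trials."""
    if not n_min <= n_s <= n_max:
        return 0.0  # no prior mass outside the allowed range
    trials = n_max - n_min
    k = n_s - n_min
    return comb(trials, k) * p_star**k * (1.0 - p_star) ** (trials - k)


# N_s,max = 8 and p* = 1/7 are the values given in Note 11.
# N_s,min = 1 is an assumption inferred from the stated prior mean of 2.
N_MIN, N_MAX, P_STAR = 1, 8, 1.0 / 7.0

prior = {n: offset_binomial_pmf(n, P_STAR, N_MIN, N_MAX)
         for n in range(N_MIN, N_MAX + 1)}
prior_total = sum(prior.values())                   # should be 1
prior_mean = sum(n * p for n, p in prior.items())   # should be 2
```

With these values the mean follows directly from \(N_{\rm s,min} + (N_{\rm s,max}-N_{\rm s,min})p^* = 1 + 7 \times (1/7) = 2\).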
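
The Bernoulli-uniform mixture prior of Note 4 can be sampled in two steps: first decide whether the source atom is switched off, then draw a uniform emission rate if it is not. The sketch below is only illustrative. The values of \(\gamma\) and \(Q_{\rm max}\) are hypothetical, since the notes do not give them, and the helper name is an assumption.

```python
import random


def sample_emission_rate(gamma, q_max, rng):
    """Draw Q_k: exactly 0 with probability 1 - gamma, else uniform up to q_max."""
    if rng.random() >= gamma:
        return 0.0  # delta(Q_k) component: the source atom emits nothing
    # Uniform component; random.uniform samples the closed interval [0, q_max],
    # which differs from the open interval (0, q_max) only on a negligible set.
    return rng.uniform(0.0, q_max)


rng = random.Random(0)
GAMMA, Q_MAX = 0.5, 10.0  # hypothetical values, for illustration only
draws = [sample_emission_rate(GAMMA, Q_MAX, rng) for _ in range(10_000)]
fraction_off = sum(q == 0.0 for q in draws) / len(draws)  # close to 1 - GAMMA
```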
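
The defining property of the Wiener process in Note 6, independent increments distributed as \({\mathcal{N}}(0,\hbox{d}t)\), translates directly into a simulation recipe: accumulate Gaussian increments whose standard deviation is \(\sqrt{\hbox{d}t}\). The sketch below assumes the usual convention \(W(0) = 0\), which the note does not state, and uses hypothetical step counts.

```python
import math
import random


def wiener_path(t_end, n_steps, rng):
    """Simulate W(t) on [0, t_end] by summing independent N(0, dt) increments."""
    dt = t_end / n_steps
    w = [0.0]  # assumed starting value W(0) = 0
    for _ in range(n_steps):
        # Variance dt corresponds to a standard deviation of sqrt(dt).
        w.append(w[-1] + rng.gauss(0.0, math.sqrt(dt)))
    return w


rng = random.Random(1)
T_END, N_STEPS, N_PATHS = 1.0, 100, 2000  # hypothetical settings
endpoints = [wiener_path(T_END, N_STEPS, rng)[-1] for _ in range(N_PATHS)]
mean_end = sum(endpoints) / N_PATHS
var_end = sum((x - mean_end) ** 2 for x in endpoints) / (N_PATHS - 1)
```

Because the increments are independent, their variances add, so the sample variance of the endpoints should be close to `T_END`.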
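
The estimator in Note 12 is a relative-frequency count over the sampled source molecules. A minimal sketch follows, assuming the chain output has already been reduced to a list holding the number of source atoms in each sample. The list below is made-up illustrative data, not a result from the paper.

```python
from collections import Counter


def estimate_ns_posterior(ns_samples):
    """Relative frequency of each N_s value among the sampled source molecules."""
    n_star = len(ns_samples)  # N_* in Note 12
    counts = Counter(ns_samples)
    return {n_s: count / n_star for n_s, count in sorted(counts.items())}


# Hypothetical chain output: number of source atoms in each sampled molecule.
ns_chain = [4, 4, 3, 4, 5, 4, 4, 3, 4, 4]
posterior_ns = estimate_ns_posterior(ns_chain)
# Most probable number of sources under the estimated marginal posterior.
map_ns = max(posterior_ns, key=posterior_ns.get)
```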
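
One common way to approximate the HPD interval of Note 14 from posterior samples is to take the shortest interval that contains the required fraction of the samples. This matches the HPD definition only when the marginal posterior is unimodal, and it is not necessarily the procedure used in the paper. The Gaussian draws below are a stand-in for real posterior samples.

```python
import math
import random


def hpd_interval(samples, prob):
    """Shortest interval holding a fraction `prob` of the samples.

    For a unimodal density this approximates the HPD interval.
    """
    xs = sorted(samples)
    n = len(xs)
    k = max(1, math.ceil(prob * n))  # number of samples inside the interval
    # Slide a window of k consecutive sorted samples and keep the narrowest.
    best = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[best], xs[best + k - 1]


rng = random.Random(2)
draws = [rng.gauss(0.0, 1.0) for _ in range(20_000)]  # stand-in posterior samples
lower, upper = hpd_interval(draws, 0.95)
```

For a standard Gaussian the 95% HPD interval is approximately \((-1.96, 1.96)\), so the returned bounds should be close to those values.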

Acknowledgments

This work has been partially supported by the Chemical Biological Radiological-Nuclear and Explosives Research and Technology Initiative (CRTI) program under project number CRTI-07-0196TD.

Author information

Corresponding author

Correspondence to Eugene Yee.

About this article

Cite this article

Yee, E. Probability Theory as Logic: Data Assimilation for Multiple Source Reconstruction. Pure Appl. Geophys. 169, 499–517 (2012). https://doi.org/10.1007/s00024-011-0384-1

  • DOI: https://doi.org/10.1007/s00024-011-0384-1
