
Detecting parametric objects in large scenes by Monte Carlo sampling

International Journal of Computer Vision

Abstract

Point processes constitute a natural extension of Markov random fields (MRFs), designed to handle parametric objects. They have proven efficient and competitive for tackling object extraction problems in vision. Simulating these stochastic models is, however, a difficult task. The performance of existing samplers is limited in terms of computation time and convergence stability, especially on large scenes. We propose a new sampling procedure based on a Monte Carlo formalism. Our algorithm exploits the Markovian property of point processes to perform the sampling in parallel. This procedure is embedded into a data-driven mechanism so that the points are distributed in the scene according to spatial information extracted from the input data. The performance of the sampler is analyzed through a set of experiments on various object detection problems in large scenes, including comparisons with existing algorithms. The sampler is also tested as an optimization algorithm for MRF-based labeling problems.

Notes

  1. GCO C++ library (http://vision.csd.uwo.ca/code/).

References

  • Baddeley, A. J., & Lieshout, M. V. (1993). Stochastic geometry models in high-level vision. Journal of Applied Statistics, 20(5–6), 231–256.

  • Benchmark, (2013). Datasets, results and evaluation tools. http://www-sop.inria.fr/members/Florent.Lafarge/benchmark/evaluation.html.

  • Besag, J. E. (1986). On the statistical analysis of dirty pictures. Journal of the Royal Statistical Society, 48(3), 259–302.

  • Boykov, Y., Veksler, O., & Zabih, R. (2001). Fast approximate energy minimization via graph cuts. Pattern Analysis and Machine Intelligence, 23(11), 1222–1239.

  • Byrd, J., Jarvis, S., & Bhalerao, A. (2010). On the parallelisation of MCMC-based image processing. IEEE International Symposium on Parallel and Distributed Processing. Atlanta, US.

  • Chai, D., Forstner, W., & Lafarge, F. (2013). Recovering line-networks in images by junction-point processes. Computer Vision and Pattern Recognition, Portland.

  • Chai, D., Forstner, W., & Yang, M. Y. (2012). Combine Markov random fields and marked point processes to extract buildings from remotely sensed images. International Society for Photogrammetry and Remote Sensing Congress. Melbourne, Australia.

  • Descombes, X. (2011). Stochastic geometry for image analysis. Oxford: Wiley.

  • Descombes, X., Minlos, R., & Zhizhina, E. (2009). Object extraction using a stochastic birth-and-death dynamics in continuum. Journal of Mathematical Imaging and Vision, 33(3), 347–359.

  • Earl, D., & Deem, M. (2005). Parallel tempering: Theory, applications, and new perspectives. Physical Chemistry Chemical Physics, 7(23), 3910–3916.

  • Ge, W., & Collins, R. (2009). Marked point processes for crowd counting. Computer Vision and Pattern Recognition. Miami.

  • Gonzalez, J., Low, Y., Gretton, A., & Guestrin, C. (2011). Parallel Gibbs sampling: From colored fields to thin junction trees. Journal of Machine Learning Research, 15, 324–332.

  • Green, P. (1995). Reversible jump Markov chain Monte Carlo computation and Bayesian model determination. Biometrika, 82(4), 711–732.

  • Grenander, U., & Miller, M. (1994). Representations of knowledge in complex systems. Journal of the Royal Statistical Society, 56(4), 549–603.

  • Han, F., Tu, Z. W., & Zhu, S. (2004). Range image segmentation by an effective jump-diffusion method. Pattern Analysis and Machine Intelligence, 26(9), 1138–1153.

  • Harkness, M., & Green, P. (2000). Parallel chains, delayed rejection and reversible jump MCMC for object recognition. British Machine Vision Conference. Bristol, United Kingdom.

  • Hastings, W. (1970). Monte Carlo sampling methods using Markov chains and their applications. Biometrika, 57(1), 97–109.

  • Lacoste, C., Descombes, X., & Zerubia, J. (2005). Point processes for unsupervised line network extraction in remote sensing. Pattern Analysis and Machine Intelligence, 27(10), 1568–1579.

  • Lafarge, F., Gimel’farb, G., & Descombes, X. (2010). Geometric feature extraction by a multi-marked point process. Pattern Analysis and Machine Intelligence, 32(9), 1597–1609.

  • Lafarge, F., & Mallet, C. (2012). Creating large-scale city models from 3d-point clouds: A robust approach with hybrid representation. International Journal of Computer Vision, 99(1), 69–85.

  • Lehmussola, A., Ruusuvuori, P., Selinummi, J., Huttunen, H., & Yli-Harja, O. (2007). Computational framework for simulating fluorescence microscope images with cell populations. IEEE Transactions on Medical Imaging, 26(7), 1010–1016.

  • Lempitsky, V., & Zisserman, A. (2010). Learning to count objects in images. Conference on Neural Information Processing Systems. Vancouver, Canada.

  • Li, S. (2001). Markov random field modeling in image analysis. Berlin: Springer.

  • Lieshout, M. V. (2008). Depth map calculation for a variable number of moving objects using Markov sequential object processes. Pattern Analysis and Machine Intelligence, 30(7), 1308–1312.

  • Liu, J. (2001). Monte Carlo strategies in scientific computing. New York: Springer.

  • Mallet, C., Lafarge, F., Roux, M., Soergel, U., Bretar, F., & Heipke, C. (2010). A marked point process for modeling lidar waveforms. IEEE Transactions on Image Processing, 19(12), 3204–3221.

  • Nguyen, H.-G., Fablet, R., & Bouchet, J. (2010). Spatial statistics of visual keypoints for texture recognition. European Conference on Computer Vision. Heraklion, Greece.

  • Ortner, M., Descombes, X., & Zerubia, J. (2008). A marked point process of rectangles and segments for automatic analysis of digital elevation models. Pattern Analysis and Machine Intelligence, 30(1), 105–119.

  • Rochery, M., Jermyn, I., & Zerubia, J. (2006). Higher order active contours. International Journal of Computer Vision, 69(3), 335–351.

  • Salamon, P., Sibani, P., & Frost, R. (2002). Facts, Conjectures, and Improvements for Simulated Annealing. Philadelphia: SIAM Monographs on Mathematical Modeling and Computation.

  • Srivastava, A., Grenander, U., Jensen, G., & Miller, M. (2002). Jump-Diffusion Markov processes on orthogonal groups for object pose estimation. Journal of Statistical Planning and Inference, 103(1–2), 15–27.

  • Stoica, R. S., Martinez, V., & Saar, E. (2007). A three dimensional object point process for detection of cosmic filaments. Journal of the Royal Statistical Society, 56(4), 459.

  • Sun, K., Sang, N., & Zhang, T. (2007). Marked point process for vascular tree extraction on angiogram. Energy Minimization Methods in Computer Vision and Pattern Recognition. Ezhou, China.

  • Szeliski, R., Zabih, R., Scharstein, D., Veksler, O., Kolmogorov, V., Agarwala, A., et al. (2008). Comparative study of energy minimization methods for Markov random fields with smoothness-based priors. Pattern Analysis and Machine Intelligence, 30(6), 1068.

  • Tu, Z., & Zhu, S. (2002). Image segmentation by data-driven Markov chain Monte Carlo. Pattern Analysis and Machine Intelligence, 24(5), 657–673.

  • Utasi, A., & Benedek, C. (2011). A 3-D marked point process model for multi-view people detection. Conference on Computer Vision and Pattern Recognition. Colorado Springs, US.

  • Verdie, Y., & Lafarge, F. (2012). Efficient Monte Carlo sampler for detecting parametric objects in large scenes. European Conference on Computer Vision. Firenze, Italy.

  • Weiss, Y., & Freeman, W. (2001). On the optimality of solutions of the max-product belief propagation algorithm in arbitrary graphs. IEEE Transactions on Information Theory, 47(2), 736–744.

  • Zhu, S., Guo, C., Wang, Y., & Xu, Z. (2005). What are textons? International Journal of Computer Vision, 62(1–2), 121–143.

Acknowledgments

This work was partially funded by the European Research Council (ERC Starting Grant “Robust Geometry Processing”, Grant agreement 257474). The authors thank A. Lehmussola, V. Lempitsky, H. Bischof, R. Ehrich, the French Mapping Agency (IGN), the Tour du Valat, and the BRGM for providing the datasets, as well as the reviewers for their valuable comments.

Author information

Correspondence to Florent Lafarge.

Appendices

Appendix 1: Population counting model

Let \(x\) denote a configuration of ellipses whose centers of mass are contained in the compact set \(K\) supporting the input image (see Fig. 19). The energy follows the form specified by Eq. 5. The unitary data term \(D(x_i)\) and the pairwise potential \(V(x_i,x_j)\) are given by:

$$D(x_i) = \begin{cases} 1 - \dfrac{d(x_i)}{d_0} & \text{if } d(x_i) < d_0\\ \exp\left(\dfrac{d_0 - d(x_i)}{d_0}\right) - 1 & \text{otherwise} \end{cases}$$
(20)
$$V(x_i,x_j) = \beta \, \frac{A(x_i \cap x_j)}{\min\left(A(x_i),A(x_j)\right)}$$
(21)

where

  • \(d(x_i)\) represents the Bhattacharyya distance between the radiometry inside and outside the object \(x_i\):

    $$d(x_i) = \frac{(m_{in}-m_{out})^2}{4(\sigma_{in}^2+\sigma_{out}^2)} - \frac{1}{2} \ln\left( \frac{2\,\sigma_{in}\sigma_{out}}{\sigma_{in}^2+\sigma_{out}^2}\right)$$
    (22)

    where \(m_{in}\) and \(\sigma _{in}\) (respectively \(m_{out}\) and \(\sigma _{out}\)) are the intensity mean and standard deviation in \(S_{in}\) (respectively in \(S_{out}\)).

  • \(d_0\) is a coefficient fixing the sensitivity of the object fitting. The higher the value of \(d_0\), the more selective the object fitting. In particular, \(d_0\) has to be high when the input images are corrupted by a significant amount of noise.

  • \(A(x_i)\) is the area of object \(x_i\).

  • \(\beta \) is a coefficient weighting the non-overlapping constraint with respect to the data term.

Note that in practice a basic morphological dilation is used to roughly extract the class of interest from the image of birds when creating the space-partitioning tree.
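
For concreteness, here is a minimal C++ sketch of the terms above (Eqs. 20–22). It assumes the intensity statistics over \(S_{in}\) and \(S_{out}\), as well as the ellipse areas and their intersection area, are computed elsewhere by the geometry layer; the function and parameter names are illustrative and not taken from the authors' implementation.

```cpp
#include <algorithm>
#include <cmath>

// Bhattacharyya distance between the radiometry inside and outside an
// object (Eq. 22), from precomputed intensity means and standard deviations.
double bhattacharyya(double m_in, double s_in, double m_out, double s_out) {
    const double v = s_in * s_in + s_out * s_out;
    return (m_in - m_out) * (m_in - m_out) / (4.0 * v)
         - 0.5 * std::log(2.0 * s_in * s_out / v);
}

// Unitary data term D(x_i) of Eq. 20, controlled by the sensitivity d0.
double data_term(double d, double d0) {
    return (d < d0) ? 1.0 - d / d0
                    : std::exp((d0 - d) / d0) - 1.0;
}

// Pairwise overlap potential of Eq. 21; the intersection area is assumed
// to be computed elsewhere (e.g. by rasterizing the two ellipses).
double overlap_potential(double area_i, double area_j,
                         double area_intersection, double beta) {
    return beta * area_intersection / std::min(area_i, area_j);
}
```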

Fig. 19 Objects and their parameters for the various presented models. (Left) Ellipses and (middle) line-segments are defined by a 2D point \(p \in K\) (the center of mass of the object) and some marks. These additional parameters are the semi-major axis \(b\), the semi-minor axis \(a\), and the angle \(\theta \) for an ellipse, and the semi-length \(b\), the semi-width \(a\), the orientation \(\theta \), and the anchor length \(c\) for a line-segment. The inside (respectively bordering) volume of the object is denoted by \(S_{in}\) (respectively \(S_{out}\)). The anchors are denoted by \(A_1\) and \(A_2\). (Right) 3D trees are defined by a 3D point \(p \in K\) (the center of mass of the object), a type \(t \in \{\)conoidal, ellipsoidal, semi-ellipsoidal\(\}\) illustrated in Fig. 20, and three additional parameters: the canopy height \(a\), the trunk height \(b\), and the canopy diameter \(c\). The cylindrical volume \(\mathcal C x_i\) represents the attraction space of object \(x_i\), in which the input points are used to measure the quality of this object.

Appendix 2: Line-network extraction model

A line-segment is defined by five parameters, including the 2D point corresponding to the center of mass of the object (Fig. 19). Similarly to the population counting model detailed in Appendix 1, the fitting quality with respect to the data is based on the Bhattacharyya distance: the unitary data term \(D(x_i)\) of the energy is given by Eq. 20. The potential \(V(x_i,x_j)\) penalizes strong object overlaps (see Eq. 21), but also takes into account a connection interaction in order to favor the linking of the line-segments. The potential term is thus given by:

$$V(x_i,x_j) = \beta_1 \frac{A(x_i \cap x_j)}{\min\left(A(x_i),A(x_j)\right)} + \mathbf{1}_{x_i \sim_{nc} x_j}\, \beta_2\, f(x_i,x_j)$$
(23)

where

  • \(\beta _1\) and \(\beta _2\) are two coefficients weighting respectively the non-overlapping and connection constraints with respect to the data term.

  • \(\sim _{nc}\) is the non-connection relationship between two objects. \(x_i \sim _{nc} x_j\) if the anchor areas of \(x_i\) and \(x_j\) (see Fig. 19) do not overlap.

  • \(\mathbf 1 _{condition}\) is the indicator function, returning one when the condition holds and zero otherwise.

  • \(f(x_i,x_j)\) is a symmetric function weighting the penalization of two non-connected objects \(x_i\) and \(x_j\) with respect to their average fitting quality. The function \(f\) is introduced to slightly relax the connection constraint when the two objects are of very good quality.

As for the bird counting problem, a basic morphological dilation is used to roughly extract the class of interest from the aerial image shown in Fig. 14: the pixels corresponding to the road class are relatively bright compared to the background. The segmentation is obviously not optimal, but it is sufficient to create an efficient space-partitioning tree.
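
As an illustration, the following C++ sketch evaluates the pairwise potential of Eq. 23, assuming the intersection area, the anchor-overlap test, and the per-segment fitting qualities are provided by the rest of the pipeline. The concrete form of \(f\) below is a hypothetical choice, not the exact function used in the paper.

```cpp
#include <algorithm>

// Pairwise potential of Eq. 23 for two line-segments x_i and x_j.
// area_i, area_j, area_intersection and the anchor-overlap test are assumed
// to be computed by the geometry layer; q_i and q_j stand for the fitting
// quality of each segment (e.g. derived from the data term of Eq. 20).
double pairwise_potential(double area_i, double area_j,
                          double area_intersection,
                          bool anchors_overlap,  // do the anchor areas meet?
                          double q_i, double q_j,
                          double beta1, double beta2) {
    // Non-overlapping constraint, identical in form to Eq. 21.
    double v = beta1 * area_intersection / std::min(area_i, area_j);

    // Connection term: only active when x_i ~_nc x_j, i.e. when the anchor
    // areas do not overlap. The symmetric relaxation f(x_i, x_j) used here
    // is a hypothetical choice that softens the penalty for well-fitted
    // segments (qualities assumed in [0, 1]).
    if (!anchors_overlap) {
        const double f = 1.0 - 0.5 * (q_i + q_j);
        v += beta2 * f;
    }
    return v;
}
```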

Appendix 3: Tree recognition model formulation

Let \(x\) represent a configuration of 3D tree models from the template library described in Fig. 20. The center of mass \(p\) of a tree is contained in the compact set \(K\) supporting the 3D bounding box of the input point cloud (Fig. 19). We denote by \(\partial x_i\) the surface of the object \(x_i\), and by \(\mathcal C x_i\) the cylindrical volume with a vertical axis passing through the center of mass of \(x_i\), in which the input points are considered to measure the quality of \(x_i\). The unitary data term \(D(x_i)\) and the pairwise potential \(V(x_i,x_j)\) are given by:

$$D(x_i) = \frac{1}{|\mathcal C x_i|} \prod_{p_c \in \mathcal C x_i} \gamma\left( d(p_c, \partial x_i)\right)$$
(24)
$$V(x_i,x_j) = \beta_1 V_{overlap}(x_i,x_j) + \beta_2 V_{competition}(x_i,x_j)$$
(25)

where

  • \(|\mathcal C x_i|\) is a coefficient normalizing the unitary data term with respect to the number of input points contained in \(\mathcal C x_i\).

  • \(d(p_c, \partial x_i)\) is a distance measuring the coherence of the point \(p_c\) with respect to the object surface \(\partial x_i\). \(d\) is not the traditional orthogonal point-to-surface distance because, as real trees do not describe ellipsoidal/conoidal shapes, the input points are not homogeneously distributed on the object surface. Here, \(d\) combines the planimetric distance, i.e. the Euclidean distance measured after projection onto the plane \(z=0\), with the altimetric variation, such that points outside the object are penalized more strongly than points inside. Note that \(d\) is invariant under rotation about the Z-axis.

  • \(\gamma (\cdot) \in [-1,1]\) is a strictly increasing quality function.

  • \(V_{overlap}\) is the pairwise potential penalizing strong overlapping between two objects, and given by:

    $$\begin{aligned} V_{overlap}(x_i,x_j) = \frac{A(x_i \cap x_j)}{\min (A(x_i),A(x_j))} \end{aligned}$$
    (26)

    where \(A(x_i)\) is the area of the object \(x_i\) projected onto the plane of equation \(z=0\).

  • \(V_{competition}\) is the pairwise potential favoring a similar tree type \(t\) in a local neighborhood:

    $$\begin{aligned} V_{competition}(x_i,x_j) = \mathbf 1 _{t_i \ne t_j} \end{aligned}$$
    (27)

    where \(\mathbf 1 _{(\cdot)}\) is the indicator function.

  • \(\beta _1\) and \(\beta _2\) are two coefficients weighting respectively the non-overlapping constraint and the competition term with respect to the data term.

In order to roughly extract the class of interest from the point clouds, the scatter descriptor proposed by Lafarge and Mallet (2012) is used to identify the points which potentially correspond to trees.
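
The pairwise terms of Eqs. 25–27 are simple enough to sketch directly. The C++ fragment below assumes the projected areas and their intersection are precomputed; the type enumeration mirrors the template library of Fig. 20, and the names are illustrative only.

```cpp
#include <algorithm>

// Canopy types from the template library of Fig. 20.
enum class TreeType { Conoidal, Ellipsoidal, SemiEllipsoidal };

// Overlap potential of Eq. 26; the areas are those of the two tree models
// projected onto the plane z = 0, assumed precomputed.
double v_overlap(double area_i, double area_j, double area_intersection) {
    return area_intersection / std::min(area_i, area_j);
}

// Competition potential of Eq. 27: penalizes neighboring trees of
// different types so as to favor locally homogeneous species.
double v_competition(TreeType t_i, TreeType t_j) {
    return (t_i != t_j) ? 1.0 : 0.0;
}

// Full pairwise potential of Eq. 25.
double pairwise_potential(double area_i, double area_j,
                          double area_intersection,
                          TreeType t_i, TreeType t_j,
                          double beta1, double beta2) {
    return beta1 * v_overlap(area_i, area_j, area_intersection)
         + beta2 * v_competition(t_i, t_j);
}
```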

Fig. 20 Library of tree models: the objects are specified by a 3D point (center of mass, illustrated by a red dot) and additional parameters (blue arrows), including the canopy type, whose shape can be conoidal (e.g. pine or fir), ellipsoidal (e.g. poplar or tilia), or semi-ellipsoidal (e.g. oak or maple).

Cite this article

Verdié, Y., Lafarge, F. Detecting parametric objects in large scenes by Monte Carlo sampling. Int J Comput Vis 106, 57–75 (2014). https://doi.org/10.1007/s11263-013-0641-0
