Thin structures retrieval using anisotropic neighborhoods of superpixels: application to shape-from-focus

Ribal, Christophe; Le Hégarat-Mascle, Sylvie; Lermé, Nicolas

doi:10.1007/s11045-022-00854-8

Published: 16 November 2022

Volume 34, pages 179–204, (2023)

Multidimensional Systems and Signal Processing

Abstract

Shape-from-focus (SFF) refers to the challenging inverse problem of recovering the scene depth from a given set of focused images using a static camera. Standard approaches model the interactions between neighboring pixels to get a regularized solution. Nevertheless, isotropic regularization is known to introduce undesired artifacts and to remove early thin structures. These structures have a small size in at least one dimension and are more numerous when considering superpixel preprocessing. This paper addresses the improvement of SFF regularization through the estimation of the presence of such structures and the construction of anisotropic neighborhoods sticking along image edges, and proposes a flexible formulation over pixels or superpixels. A thorough study comparing different strategies for constructing these neighborhoods in terms of accuracy and running time for the targeted application is provided. Notably, experiments performed on a reference dataset show the overall superiority of the approach, e.g. a decrease of the RMSE value by about 20%, and its robustness against generated superpixels.

Data availability

The datasets generated during and/or analysed during the current study are available at https://vision.middlebury.edu/stereo/data/.

Notes

https://vision.middlebury.edu/stereo/data/.

References

Achanta, R., Shaji, A., Smith, K., Lucchi, A., Fua, P., & Süsstrunk, S. (2012). SLIC superpixels compared to state-of-the-art superpixel methods. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(11), 2274–2282. https://doi.org/10.1109/TPAMI.2012.120
Ali, U., & Mahmood, M. (2021). Robust focus volume regularization in shape from focus. IEEE Transactions on Image Processing, 30, 7215–7227. https://doi.org/10.1109/TIP.2021.3100268
Ali, U., Pruks, V., & Mahmood, M. T. (2019). Image focus volume regularization for shape from focus through 3D weighted least squares. Information Sciences, 489, 155–166. https://doi.org/10.1016/j.ins.2019.03.056
Arbeláez, P., Maire, M., Fowlkes, C., & Malik, J. (2011). Contour detection and hierarchical image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(5), 898–916. https://doi.org/10.1109/TPAMI.2010.161
Boykov, Y., & Jolly, M.-P. (2001). Interactive graph cuts for optimal boundary & region segmentation of objects in N–D images. In Proceedings of the international conference on computer vision (vol. 1, pp. 105–112). https://doi.org/10.1109/ICCV.2001.937505
Boykov, Y., & Kolmogorov, V. (2004). An experimental comparison of min-cut/max-flow algorithms for energy minimization in vision. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(9), 1124–1137. https://doi.org/10.1109/TPAMI.2004.60
Cui, B., Xie, X., Ma, X., Ren, G., & Ma, Y. (2018). Superpixel-based extended random walker for hyperspectral image classification. IEEE Transactions on Geoscience and Remote Sensing, 56(6), 1–11. https://doi.org/10.1109/TGRS.2018.2796069
Favaro, P. (2010). Recovering thin structures via nonlocal-means regularization with application to depth from defocus. In Proceedings of the international conference on computer vision and pattern recognition (pp. 1133–1140). https://doi.org/10.1109/CVPR.2010.5540089
Fulkerson, B., Vedaldi, A., & Soatto, S. (2009). Class segmentation and object localization with superpixel neighborhoods. In Proceedings of the international conference on computer vision (pp. 670–677). https://doi.org/10.1109/ICCV.2009.5459175
Gaganov, V., & Ignatenko, A. (2009). Robust shape from focus via Markov random fields. In Conference “GraphiCon’2009” (pp. 74–80).
Geman, S., & Geman, D. (1984). Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 6(6), 721–741. https://doi.org/10.1109/TPAMI.1984.4767596
Giraud, R., Ta, V.-T., Bugeau, A., Coupe, P., & Papadakis, N. (2017). Super-PatchMatch: An algorithm for robust correspondences using superpixel patches. IEEE Transactions on Image Processing, 26(8), 4068–4078. https://doi.org/10.1109/TIP.2017.2708504
Gould, S., Fulton, R., & Koller, D. (2009). Decomposing a scene into geometric and semantically consistent regions. In Proceedings of the international conference on computer vision (pp. 1–8). https://doi.org/10.1109/ICCV.2009.5459211
Kumar G. P., & Sahay, R. R. (2017). Accurate structure recovery via weighted nuclear norm: A low rank approach to shape-from-focus. In 2017 IEEE international conference on computer vision workshops (ICCVW) (pp. 563–574). IEEE. https://doi.org/10.1109/ICCVW.2017.73
Lai, K.-N., & Leou, J.-J. (2021). Superpixel-based multi-focus image fusion. Advances in Computer Vision and Computational Biology, 66, 221–233. https://doi.org/10.1007/978-3-030-71051-4_17
Liu, Y.-J., Yu, C.-C., Yu, M.-J., & He, Y. (2016). Manifold SLIC: A fast method to compute content-sensitive superpixels. In Proceedings of international conference on computer vision and pattern recognition (pp. 651–659). https://doi.org/10.1109/CVPR.2016.77
Machairas, V., Faessel, M., Cárdenas-Peña, S., Chabardes, T., Walter, T., & Decencière, E. (2015). Waterpixels. IEEE Transactions on Image Processing, 24(11), 3707–3716.
Medioni, G., Mordohai, P., & Nicolescu, M. (2005). The tensor voting framework. Handbook of geometric computing (pp. 535–568). Springer. https://doi.org/10.1007/3-540-28247-5_16
Medioni, G., Tang, C.-K., & Lee, M.-S. (2000). Tensor Voting: Theory and applications. In Proceedings of the RFIA.
Merveille, O., Naegel, B., Talbot, H., & Passat, N. (2019). nD variational restoration of curvilinear structures with prior-based directional regularization. IEEE Transactions on Image Processing, 28(8), 3848–3859.
Merveille, O., Talbot, H., Najman, L., & Passat, N. (2018). Curvilinear structure analysis by ranking the orientation responses of path operators. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(2), 304–317. https://doi.org/10.1109/TPAMI.2017.2672972
Moeller, M., Benning, M., Schonlieb, C., & Cremers, D. (2015). Variational depth from focus reconstruction. IEEE Transactions on Image Processing, 24(12), 5369–5378. https://doi.org/10.1109/TIP.2015.2479469
Nayar, S., & Nakagawa, Y. (1994). Shape from focus. IEEE Transactions on Pattern Analysis and Machine Intelligence, 16(8), 824–831. https://doi.org/10.1109/34.308479
Pei, S.-C., Chang, W.-W., & Shen, C.-T. (2014). Saliency detection using superpixel belief propagation. In Proceedings of the international conference on image processing (pp. 1135–1139). https://doi.org/10.1109/ICIP.2014.7025226
Pertuz, S., Puig, D., & Garcia, M. (2013). Analysis of focus measure operators for shape-from-focus. Pattern Recognition, 46(5), 1415–1432. https://doi.org/10.1016/j.patcog.2012.11.011
Ribal, C., Lermé, N., & Le Hégarat-Mascle, S. (2018). Efficient graph cut optimization for shape from focus. Journal of Visual Communication and Image Representation, 55, 529–539. https://doi.org/10.1016/j.jvcir.2018.06.029
Ribal, C., Lermé, N., & Le Hégarat-Mascle, S. (2020). Thin structures segmentation using anisotropic neighborhoods. Information processing and management of uncertainty in knowledge-based systems (vol. 1237, pp. 601–612). https://doi.org/10.1007/978-3-030-50146-4_44
Scharstein, D., & Pal, C. (2007). Learning conditional random fields for stereo. In Proceedings of the international conference on computer vision and pattern recognition (pp. 1–8). https://doi.org/10.1109/CVPR.2007.383191
Stawiaski, J., & Decencière, E. (2011). Region merging via graph-cuts. Image Analysis & Stereology, 27(1), 39.
Stutz, D., Hermans, A., & Leibe, B. (2018). Superpixels: An evaluation of the state-of-the-art. Computer Vision and Image Understanding, 166, 1–27. https://doi.org/10.1016/j.cviu.2017.03.007
Szeliski, R., Zabih, R., Scharstein, D., Veksler, O., Kolmogorov, V., Agarwala, A., & Rother, C. (2008). A comparative study of energy minimization methods for Markov random fields with smoothness-based priors. IEEE Transactions on Pattern Analysis and Machine Intelligence, 30(6), 1068–1080. https://doi.org/10.1109/TPAMI.2007.70844
Tang, D., Fu, H., & Cao, X. (2012). Topology preserved regular superpixel. In 2012 IEEE international conference on multimedia and expo (pp. 765–768). IEEE. Retrieved 2018-05-03, from http://ieeexplore.ieee.org/document/6298495/, https://doi.org/10.1109/ICME.2012.184
Ulen, J., Strandmark, P., & Kahl, F. (2015). Shortest paths with higher-order regularization. IEEE Transactions on Pattern Analysis and Machine Intelligence, 37(12), 2588–2600.
Wang, Z., & Sheikh, H. R. (2004). Image quality assessment: From error visibility to structural similarity. IEEE Transactions on Image Processing, 13(4), 14. https://doi.org/10.1109/TIP.2003.819861
Yao, J., Boben, M., Fidler, S., & Urtasun, R. (2015). Real-time coarse-to-fine topologically preserving segmentation. In Proceedings of international conference on computer vision and pattern recognition (pp. 2947–2955). https://doi.org/10.1109/CVPR.2015.7298913
Yu, Y., Guan, H., & Ji, Z. (2015). Rotation-invariant object detection in high-resolution satellite imagery using superpixel-based deep Hough forests. IEEE Geoscience and Remote Sensing Letters, 12(11), 2183–2187. https://doi.org/10.1109/LGRS.2015.2432135
Zou, Q., Cao, Y., Li, Q., Mao, Q., & Wang, S. (2012). Cracktree: Automatic crack detection from pavement images. Pattern Recognition Letters, 33(3), 227–238. https://doi.org/10.1016/j.patrec.2011.11.004

Funding

The authors did not receive support from any organization for the submitted work.

Author information

Authors and Affiliations

Université Paris-Saclay, SATIE Laboratory UMR 8029, Avenue des sciences, 91190, Gif-sur-Yvette, France
Christophe Ribal, Sylvie Le Hégarat-Mascle & Nicolas Lermé

Contributions

All authors equally contributed to the writing and the reviewing of this paper. All authors approved the current version of this paper.

Corresponding author

Correspondence to Nicolas Lermé.

Ethics declarations

Conflict of interest

The authors have no relevant financial or non-financial interests to disclose.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A: 3D Tensor Voting

Let ${\mathbb {R}}^{3\times 3}$ be the considered vector space, with an origin O in ${\mathbb {R}}^3$, endowed with a voting function $VF:{\mathbb {R}}^{3\times 3}\times {\mathbb {R}}^{3} \rightarrow {\mathbb {R}}^{3\times 3}$. A tensor is represented by a matrix ${\mathbb {T}}\in {\mathbb {R}}^{3\times 3}$. The voting operation VF builds a new tensor ${\mathbb {T}}'$ at the cast location $P\in {\mathbb {R}}^3$ and adds it to the tensor already present at this location, which is possible because tensors have good summation properties. The tensor ${\mathbb {T}}'$ is obtained by rotating and scaling the source tensor ${\mathbb {T}}$, and these transformations are all derived from the stick kernel. Indeed, tensors can be decomposed in a basis of tensors in which the stick tensor is the simplest element, and the stick kernel refers to the voting operation of this stick tensor.

In tensor voting, a tensor is a second-order symmetric tensor that can be represented by a positive semidefinite matrix ${\mathbb {T}}\in {\mathbb {R}}^{3\times 3}$, which is diagonalizable with orthogonal eigenvectors. In addition to its coordinates, a tensor can be characterized either by six scalar values, corresponding to the coefficients of the symmetric matrix, or by three eigenvalues and a rotation. This rotation maps the orthonormal basis $({\textbf{e}}_0,{\textbf{e}}_1,{\textbf{e}}_2)$ onto $({\hat{\textbf{e}}}_0,{\hat{\textbf{e}}}_1,{\hat{\textbf{e}}}_2)\in {\mathbb {R}}^{3\times 3}$, the set of eigenvectors sorted by decreasing eigenvalue. The decomposition of the matrix into a set of diagonal matrices is a key point introduced by Medioni et al. (2000). By definition, the tensor is a diagonal matrix in the system $({\hat{\textbf{e}}}_0,{\hat{\textbf{e}}}_1,{\hat{\textbf{e}}}_2)$, so that:

$$\begin{aligned} \begin{pmatrix} \lambda _0 &{} 0 &{} 0 \\ 0 &{} \lambda _1 &{} 0 \\ 0 &{} 0 &{} \lambda _2 \end{pmatrix} = (\lambda _0-\lambda _1) {\mathbb {T}}_{stick} + (\lambda _1-\lambda _2) {\mathbb {T}}_{plate} + \lambda _2 {\mathbb {T}}_{ball} \text{, } \end{aligned} \tag{A1}$$

where ${\mathbb {T}}_{stick}$, ${\mathbb {T}}_{plate}$ and ${\mathbb {T}}_{ball}$ are respectively the stick tensor, the plate tensor and the ball tensor, named according to their representations as ellipsoids (see figure in Medioni et al., 2005). Each of them represents a different type of structure: the stick component encodes the saliency of surfaces that are normal to ${\hat{\textbf{e}}}_0$, the plate component encodes curves with tangent direction ${\hat{\textbf{e}}}_2$, and the ball component encodes points, e.g. those corresponding to thin structure junctions.
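As an illustration of the decomposition in Eq. (A1), the following minimal sketch extracts the three saliencies from a symmetric positive semidefinite matrix and rebuilds the matrix from them. It assumes NumPy, and it assumes that the stick tensor is ${\hat{\textbf{e}}}_0 {\hat{\textbf{e}}}_0^T$, which is what Eq. (A1) implies in the eigenvector basis. The function names `decompose_tensor` and `recompose_tensor` are hypothetical and do not come from the paper.

```python
import numpy as np


def decompose_tensor(T):
    """Return eigenvalues (decreasing), eigenvectors (columns) and saliencies."""
    T = np.asarray(T, dtype=float)
    # eigh is meant for symmetric matrices and returns ascending eigenvalues.
    values, vectors = np.linalg.eigh(T)
    order = np.argsort(values)[::-1]
    lam = values[order]
    E = vectors[:, order]
    saliency = {
        "stick": lam[0] - lam[1],  # surface-like structure, normal e0
        "plate": lam[1] - lam[2],  # curve-like structure, tangent e2
        "ball": lam[2],            # point-like structure (e.g. junction)
    }
    return lam, E, saliency


def recompose_tensor(lam, E):
    """Rebuild the tensor from Eq. (A1), expressed in the original basis."""
    e0, e1, e2 = E[:, 0], E[:, 1], E[:, 2]
    T_stick = np.outer(e0, e0)            # assumption: stick tensor is e0 e0^T
    T_plate = T_stick + np.outer(e1, e1)
    T_ball = T_plate + np.outer(e2, e2)   # identity in an orthonormal basis
    return ((lam[0] - lam[1]) * T_stick
            + (lam[1] - lam[2]) * T_plate
            + lam[2] * T_ball)
```

Recomposing the output of `decompose_tensor` should return the input matrix up to floating-point error, which is a convenient sanity check of the three-term decomposition.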

The stick kernel, which defines the vote cast by a stick tensor ${\mathbb {T}}_{stick}\in {\mathbb {R}}^{3\times 3}$, involves a multiplication of ${\mathbb {T}}_{stick}$ by a decay function DF and a rotation defined by a vector $\varvec{\Omega }$. Specifically, DF is defined as follows:

$$\begin{aligned} DF(r,\phi ,\sigma _T) = \exp \left( -\frac{r^2 + v\phi ^2}{\sigma _T^2} \right) \text{, } \end{aligned}$$

where $\sigma _T$ is the scale parameter, v is a constant that controls the decay with curvature, $r\in {\mathbb {R}}_{>0}$ is the length of the circular arc between O and P on the osculating circle joining O and P with normal ${\hat{\textbf{e}}}_0$ at point O, and $\phi \in ]-\pi ,\pi ]$ is the angle between the tangent to the same osculating circle at O and $\vec {OP}$. The decay function yields a smooth voting kernel whose support can be bounded to a sphere of radius $3\sigma _T$. Along with the term $v\phi ^2$ used for increasing the decay with curvature, Medioni et al. (2000) also propose to restrict votes to the region where $\phi <\frac{\pi }{4}$ and to consider that the term $DF(r,\phi ,\sigma _T)$ is null otherwise.
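The decay function can be sketched in a few lines. This is only an illustration under stated assumptions: the default value of `v` is arbitrary, the angular restriction is interpreted here as $|\phi | < \frac{\pi }{4}$ (the text only states $\phi <\frac{\pi }{4}$), and the name `decay` is hypothetical.

```python
import math


def decay(r, phi, sigma_t, v=1.0, max_angle=math.pi / 4):
    """Decay DF(r, phi, sigma_T) = exp(-(r^2 + v * phi^2) / sigma_T^2).

    r       : arc length between O and P on the osculating circle
    phi     : angle between the tangent at O and the vector OP
    sigma_t : scale parameter
    v       : curvature decay constant (default value is an assumption)
    """
    # Assumption: the vote is cancelled when |phi| reaches max_angle.
    if abs(phi) >= max_angle:
        return 0.0
    return math.exp(-(r * r + v * phi * phi) / (sigma_t ** 2))
```

For example, `decay(0.0, 0.0, 1.0)` gives the maximal weight at the voter itself, and the weight decreases with both the arc length and the angle.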

The rotation ${\textbf{R}}(\varvec{\Omega })\in {\mathbb {R}}^{3\times 3}$ is defined by the rotation vector $\varvec{\Omega }\in {\mathbb {R}}^3$ and transforms the vector ${\hat{\textbf{e}}}_0$ into the vector ${\hat{\textbf{e}}}'_0$, where ${\hat{\textbf{e}}}'_0$ and ${\hat{\textbf{e}}}_0$ are symmetrical with respect to the perpendicular bisector of the segment OP. This allows for computing the cast tensor ${\mathbb {T}}'_{stick}\in {\mathbb {R}}^{3\times 3}$ as follows:

$$\begin{aligned} {\mathbb {T}}'_{stick} = DF(r,\phi ,\sigma _T){\textbf{R}}(\varvec{\Omega }){\mathbb {T}}_{stick}{\textbf{R}}^{T}(\varvec{\Omega }) \text{, } \end{aligned}$$

where $\cdot ^{T}$ denotes transposition.
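The stick vote can be sketched as follows, under several assumptions. The stick tensor is taken as ${\hat{\textbf{e}}}_0 {\hat{\textbf{e}}}_0^T$, so that rotating it amounts to rotating its normal and the matrix ${\textbf{R}}(\varvec{\Omega })$ never needs to be built explicitly. The rotated normal is obtained by reflecting ${\hat{\textbf{e}}}_0$ across the bisector of OP, which for an angle $\phi $ between the tangent and $\vec {OP}$ is a rotation by $2\phi $, and the arc length is $r = \Vert \vec {OP}\Vert \, \phi / \sin \phi $. The angular cutoff is interpreted as $|\phi | < \frac{\pi }{4}$, and the function name `stick_vote` and the default value of `v` are hypothetical.

```python
import numpy as np


def stick_vote(normal, op, sigma_t, v=1.0):
    """Tensor cast at displacement `op` by a unit stick tensor with this normal."""
    n = np.asarray(normal, dtype=float)
    n = n / np.linalg.norm(n)
    op = np.asarray(op, dtype=float)
    length = np.linalg.norm(op)
    if length == 0.0:
        return np.zeros((3, 3))
    height = op @ n                  # component of OP along the normal
    tangent = op - height * n        # component of OP in the tangent plane
    t_norm = np.linalg.norm(tangent)
    phi = np.arctan2(height, t_norm)  # angle between tangent and OP
    if abs(phi) >= np.pi / 4:        # assumption: cutoff on |phi|
        return np.zeros((3, 3))
    # Arc length on the osculating circle (chord length when phi is ~0).
    r = length if abs(phi) < 1e-12 else length * phi / np.sin(phi)
    # Normal at P: reflection of n across the bisector of OP.
    n_rot = np.cos(2 * phi) * n - np.sin(2 * phi) * tangent / t_norm
    weight = np.exp(-(r ** 2 + v * phi ** 2) / sigma_t ** 2)
    return weight * np.outer(n_rot, n_rot)
```

The returned matrix is a rank-one tensor whose only non-zero eigenvalue is the decay weight, so adding it to the tensor at P reinforces the surface orientation predicted by the voter.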

The plate tensor can be written ${{\mathbb {T}}_{plate} = {\hat{\textbf{e}}}_0 {\hat{\textbf{e}}}_0^T + {\hat{\textbf{e}}}_1 {\hat{\textbf{e}}}_1^T}$, while the ball tensor is written ${{\mathbb {T}}_{ball} = {\hat{\textbf{e}}}_0 {\hat{\textbf{e}}}_0^T + {\hat{\textbf{e}}}_1 {\hat{\textbf{e}}}_1^T + {\hat{\textbf{e}}}_2 {\hat{\textbf{e}}}_2^T}$. The plate and ball kernels are derived from the stick kernel by integration of stick tensors. Approximating these integrals as sums of tensors gives

$$\begin{aligned} {\mathbb {T}}_{plate}' \approx \sum _{i=0}^{I} DF(r,\phi ,\sigma _T) {\textbf{R}}(\varvec{\Omega }){\mathbb {T}}_{stick}(i\Delta _\rho ){\textbf{R}}^{T}(\varvec{\Omega }) \Delta _\rho , \end{aligned}$$

$$\begin{aligned} \begin{array}{rl} {\mathbb {T}}_{ball}' \approx \sum _{i=0}^{I} \sum _{j=-J/2}^{J/2}&{} DF(r,\phi ,\sigma _T) {\textbf{R}}(\varvec{\Omega }){\mathbb {T}}_{stick}(i\Delta _\rho ,j\Delta _\psi )\\ &{}{\textbf{R}}^{T}(\varvec{\Omega })\sin (j\Delta _\psi ) \Delta _\psi \Delta _\rho , \end{array} \end{aligned}$$

where $\Delta _\rho = \frac{\pi }{I}$ and $\Delta _\psi =\frac{\pi }{J}$, and $I,J\in {\mathbb {N}}$ are arbitrary constants. Note that these kernels are usually precomputed for computational efficiency.
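The discrete plate vote can be illustrated by summing stick votes whose normals sweep the plane spanned by ${\hat{\textbf{e}}}_0$ and ${\hat{\textbf{e}}}_1$. This is a sketch under assumptions: ${\mathbb {T}}_{stick}(i\Delta _\rho )$ is read as the stick tensor with normal $\cos (i\Delta _\rho ){\hat{\textbf{e}}}_0 + \sin (i\Delta _\rho ){\hat{\textbf{e}}}_1$, the stick vote uses the reflected-normal geometry with a cutoff on $|\phi |$, and the names `stick_vote`, `plate_vote` and `n_steps` (standing for I) are hypothetical.

```python
import numpy as np


def stick_vote(normal, op, sigma_t, v=1.0):
    """Tensor cast at displacement `op` by a unit stick tensor with this normal."""
    n = np.asarray(normal, dtype=float)
    n = n / np.linalg.norm(n)
    op = np.asarray(op, dtype=float)
    length = np.linalg.norm(op)
    if length == 0.0:
        return np.zeros((3, 3))
    height = op @ n
    tangent = op - height * n
    t_norm = np.linalg.norm(tangent)
    phi = np.arctan2(height, t_norm)
    if abs(phi) >= np.pi / 4:        # assumption: cutoff on |phi|
        return np.zeros((3, 3))
    r = length if abs(phi) < 1e-12 else length * phi / np.sin(phi)
    n_rot = np.cos(2 * phi) * n - np.sin(2 * phi) * tangent / t_norm
    weight = np.exp(-(r ** 2 + v * phi ** 2) / sigma_t ** 2)
    return weight * np.outer(n_rot, n_rot)


def plate_vote(e0, e1, op, sigma_t, n_steps=16, v=1.0):
    """Approximate the plate vote as a sum of n_steps + 1 stick votes."""
    e0 = np.asarray(e0, dtype=float)
    e1 = np.asarray(e1, dtype=float)
    d_rho = np.pi / n_steps
    total = np.zeros((3, 3))
    for i in range(n_steps + 1):     # i = 0, ..., I as in the sum above
        rho = i * d_rho
        normal = np.cos(rho) * e0 + np.sin(rho) * e1
        total += stick_vote(normal, op, sigma_t, v) * d_rho
    return total
```

Since the result depends only on the displacement and on the local basis, such kernels can be tabulated once on a grid and reused, which matches the remark on precomputation.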

Then, any tensor ${\mathbb {T}}(s)$ at location $s\in {\mathbb {R}}^3$ can be decomposed from Eq. (A1) in a basis $({\hat{\textbf{e}}}_0,{\hat{\textbf{e}}}_1,{\hat{\textbf{e}}}_2 )$ as ${\mathbb {T}}(s) = (\lambda _0-\lambda _1){\hat{\textbf{e}}}_0{\hat{\textbf{e}}}_0^T + (\lambda _1-\lambda _2)({\hat{\textbf{e}}}_0{\hat{\textbf{e}}}_0^T + {\hat{\textbf{e}}}_1{\hat{\textbf{e}}}_1^T) + \lambda _2({\hat{\textbf{e}}}_0{\hat{\textbf{e}}}_0^T + {\hat{\textbf{e}}}_1{\hat{\textbf{e}}}_1^T + {\hat{\textbf{e}}}_2{\hat{\textbf{e}}}_2^T)$, and the vote cast at location $t\in {\mathbb {R}}^3$ is written:

$$\begin{aligned} \begin{array}{ll} VF({\mathbb {T}},\vec {st}) &{} = (\lambda _0-\lambda _1)VF({\mathbb {T}}_{stick}(t),\vec {st}) \\ &{}\quad + (\lambda _1-\lambda _2)VF({\mathbb {T}}_{plate}(t),\vec {st}) \\ &{}\quad + \lambda _2 VF({\mathbb {T}}_{ball}(t),\vec {st}) \end{array} \end{aligned}$$

Having introduced the voting operation for one tensor, let us specify the global voting process.

Given ${\mathcal {S}}_0,{\mathcal {S}}_1\subset {\mathcal {S}}$, the sets of voters and of cast locations respectively, the tensors are updated as follows:

$$\begin{aligned} \left\{ \begin{array}{ccl} \forall p \not \in {\mathcal {S}}_1, &{} {\mathbb {T}}'(p) = &{} {\mathbb {T}}(p) \text{, }\\ \forall p\in {\mathcal {S}}_1, &{} {\mathbb {T}}'(p) = &{} {\mathbb {T}}(p) + \sum _{s\in {\mathcal {S}}_0} VF({\mathbb {T}}(s),\vec {sp}) \text{, } \end{array} \right. \end{aligned}$$

where ${\mathbb {T}}'(p)$ is the tensor at location p after the vote and ${\mathbb {T}}(p)$ the tensor before it.
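The global voting process can be sketched as a double loop over cast locations and voters. The sketch assumes that sites are stored as rows of NumPy arrays and that the voting function is passed as a callable `vote_fn(tensor, displacement)`; these names, and the isotropic placeholder used in the usage lines, are hypothetical and only serve the illustration.

```python
import numpy as np


def global_vote(positions, tensors, voters, receivers, vote_fn):
    """Accumulate votes from `voters` (S0) at `receivers` (S1).

    positions : (N, 3) array of site coordinates
    tensors   : (N, 3, 3) array of tensors before the vote
    vote_fn   : callable (tensor, displacement) -> (3, 3) cast tensor
    """
    positions = np.asarray(positions, dtype=float)
    tensors = np.asarray(tensors, dtype=float)
    # Sites outside S1 keep their tensor unchanged.
    updated = tensors.copy()
    for p in receivers:
        for s in voters:
            # Votes always use the tensors before the vote, never `updated`.
            updated[p] += vote_fn(tensors[s], positions[p] - positions[s])
    return updated


# Usage with a placeholder isotropic vote (not the tensor voting kernel).
def toy_vote(tensor, displacement):
    return np.exp(-displacement @ displacement) * tensor


sites = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 2.0, 0.0]])
before = np.stack([np.eye(3)] * 3)
after = global_vote(sites, before, voters=[0], receivers=[1], vote_fn=toy_vote)
```

Reading the votes from the tensors before the update keeps the result independent of the order in which sites are visited, as in the formula above.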

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ribal, C., Le Hégarat-Mascle, S. & Lermé, N. Thin structures retrieval using anisotropic neighborhoods of superpixels: application to shape-from-focus. Multidim Syst Sign Process 34, 179–204 (2023). https://doi.org/10.1007/s11045-022-00854-8

Received: 11 May 2022
Revised: 30 September 2022
Accepted: 02 October 2022
Published: 16 November 2022
Issue Date: March 2023
DOI: https://doi.org/10.1007/s11045-022-00854-8
