Thin structures retrieval using anisotropic neighborhoods of superpixels: application to shape-from-focus

Multidimensional Systems and Signal Processing

Abstract

Shape-from-focus (SFF) refers to the challenging inverse problem of recovering the scene depth from a given set of focused images using a static camera. Standard approaches model the interactions between neighboring pixels to get a regularized solution. Nevertheless, isotropic regularization is known to introduce undesired artifacts and to prematurely remove thin structures. These structures have a small size in at least one dimension and are more numerous when considering superpixel preprocessing. This paper improves SFF regularization by estimating the presence of such structures and constructing anisotropic neighborhoods that follow image edges, and it proposes a flexible formulation over pixels or superpixels. A thorough study comparing different strategies for constructing these neighborhoods, in terms of accuracy and running time for the targeted application, is provided. Notably, experiments performed on a reference dataset show the overall superiority of the approach, e.g. a decrease of the RMSE value by about 20%, and its robustness against generated superpixels.

Data availability

The datasets generated during and/or analysed during the current study are available at https://vision.middlebury.edu/stereo/data/.

Notes

  1. https://vision.middlebury.edu/stereo/data/.

Funding

The authors did not receive support from any organization for the submitted work.

Author information

Contributions

All authors equally contributed to the writing and the reviewing of this paper. All authors approved the current version of this paper.

Corresponding author

Correspondence to Nicolas Lermé.

Ethics declarations

Conflict of interest

The authors have no relevant financial or non-financial interests to disclose.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A: 3D Tensor Voting

Let \({\mathbb {R}}^{3\times 3}\), with an origin O in \({\mathbb {R}}^3\), be the considered vector space, endowed with a voting function \(VF:{\mathbb {R}}^{3\times 3}\times {\mathbb {R}}^{3} \mapsto {\mathbb {R}}^{3\times 3}\). A tensor can be represented by a matrix \({\mathbb {T}}\in {\mathbb {R}}^{3\times 3}\). The voting operation VF builds a new tensor \({\mathbb {T}}'\) at the cast location \(P\in {\mathbb {R}}^3\) and adds it to the tensor already present at this location, which is possible because tensors have good summation properties. The tensor \({\mathbb {T}}'\) is obtained by rotating and scaling the source tensor \({\mathbb {T}}\), and all such combinations are derived from the stick kernel. Indeed, tensors can be decomposed in a basis of tensors in which the stick tensor is the simplest element, and the stick kernel refers to the voting operation of this stick tensor.

In tensor voting, a tensor is a second-order symmetric tensor that can be represented by a positive semidefinite diagonalizable matrix \({\mathbb {T}}\in {\mathbb {R}}^{3\times 3}\), whose eigenvectors are orthogonal. In addition to its coordinates, a tensor can be characterized either by six scalar values, corresponding to the coefficients of the symmetric matrix, or by three eigenvalues and a rotation. This rotation transforms the orthonormal basis \(({\textbf{e}}_0,{\textbf{e}}_1,{\textbf{e}}_2)\) so that it aligns with \(({\hat{\textbf{e}}}_0,{\hat{\textbf{e}}}_1,{\hat{\textbf{e}}}_2)\in {\mathbb {R}}^{3\times 3}\), the set of eigenvectors sorted by decreasing eigenvalue. The decomposition of the matrix into a set of diagonal matrices is a key point introduced by Medioni et al. (2000). By definition, the tensor is a diagonal matrix in the system \(({\hat{\textbf{e}}}_0,{\hat{\textbf{e}}}_1,{\hat{\textbf{e}}}_2)\), so that:

$$\begin{aligned} \begin{pmatrix} \lambda _0 &{} 0 &{} 0 \\ 0 &{} \lambda _1 &{} 0 \\ 0 &{} 0 &{} \lambda _2 \end{pmatrix} = (\lambda _0-\lambda _1) {\mathbb {T}}_{stick} + (\lambda _1-\lambda _2) {\mathbb {T}}_{plate} + \lambda _2 {\mathbb {T}}_{ball} \text{, } \end{aligned}$$
(A1)

where \({\mathbb {T}}_{stick}\), \({\mathbb {T}}_{plate}\) and \({\mathbb {T}}_{ball}\) are respectively the stick tensor, the plate tensor and the ball tensor, named according to their representations as ellipsoids (see figure in Medioni et al., 2005). Each of them represents a different type of structure: the stick component encodes the saliency of surfaces that are normal to \({\hat{\textbf{e}}}_0\), the plate component encodes curves with tangent direction \({\hat{\textbf{e}}}_2\), and the ball component encodes points, e.g. corresponding to thin structure junctions.
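The decomposition of Eq. (A1) can be illustrated with a minimal NumPy sketch. It assumes that the stick tensor is \({\hat{\textbf{e}}}_0 {\hat{\textbf{e}}}_0^T\), which is consistent with the plate and ball expressions given later in this appendix. The function name `decompose_tensor` and the example matrix are hypothetical and only serve the illustration.

```python
import numpy as np

def decompose_tensor(T):
    """Split a symmetric PSD 3x3 tensor into stick, plate and ball parts (Eq. A1)."""
    T = np.asarray(T, dtype=float)
    # eigh returns eigenvalues in ascending order; reverse them so that
    # lambda_0 >= lambda_1 >= lambda_2, as in the text.
    eigvals, eigvecs = np.linalg.eigh(T)
    lam = eigvals[::-1]
    e0, e1, e2 = eigvecs[:, 2], eigvecs[:, 1], eigvecs[:, 0]
    # Assumed basis tensors: stick = e0 e0^T, plate adds e1 e1^T, ball adds e2 e2^T.
    T_stick = np.outer(e0, e0)
    T_plate = T_stick + np.outer(e1, e1)
    T_ball = T_plate + np.outer(e2, e2)
    saliencies = (lam[0] - lam[1], lam[1] - lam[2], lam[2])
    return saliencies, (T_stick, T_plate, T_ball)

# Hypothetical example tensor, already diagonal for readability.
T = np.diag([3.0, 2.0, 0.5])
(s_stick, s_plate, s_ball), (Ts, Tp, Tb) = decompose_tensor(T)
T_rebuilt = s_stick * Ts + s_plate * Tp + s_ball * Tb
```

In this sketch, the three saliencies weight nested basis tensors, so recombining them recovers the input tensor.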

The stick kernel, which defines the vote cast by a stick tensor \({\mathbb {T}}_{stick}\in {\mathbb {R}}^{3\times 3}\), multiplies \({\mathbb {T}}_{stick}\) by a decay function DF and applies a rotation defined by a vector \(\varvec{\Omega }\). Specifically, DF is defined as follows:

$$\begin{aligned} DF(r,\phi ,\sigma _T) = \exp \left( -\frac{r^2 + v\phi ^2}{\sigma _T^2} \right) \text{, } \end{aligned}$$

where \(\sigma _T\) is the scale parameter, v is a constant that controls the decay with curvature, \(r\in {\mathbb {R}}_{>0}\) is the length of the circle arc between O and P on the osculating circle that joins O and P and has normal \({\hat{\textbf{e}}}_0\) at point O, and \(\phi \in ]-\pi ,\pi ]\) is the angle between the tangent to the same osculating circle at O and \(\vec {OP}\). The decay function yields a smooth voting kernel whose support can be bounded to a sphere of radius \(3\sigma _T\). Along with the term \(v\phi ^2\), which increases the decay with curvature, Medioni et al. (2000) also propose to restrict votes to the region where \(\phi <\frac{\pi }{4}\) and to consider the term \(DF(r,\phi ,\sigma _T)\) null elsewhere.
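As a rough illustration of the decay function, the sketch below computes \(r\) and \(\phi\) from the vector \(\vec {OP}\) and the normal at O, then evaluates DF. Several choices are assumptions rather than part of the source: \(\phi\) is taken as an unsigned angle, so the cut-off is applied to \(|\phi |\); the default value `v=1.0` is arbitrary; and the names `arc_length_and_angle` and `decay` are hypothetical. The arc length uses the elementary circle relations chord \(=2R\sin \phi\) and arc \(=2R\phi\).

```python
import math

def arc_length_and_angle(op, normal):
    """Return (r, phi) for the osculating circle through O and P.

    op: vector from O to P; normal: unit normal e_hat_0 at O.
    Assumption: phi is the unsigned angle between OP and the tangent plane at O.
    """
    chord = math.sqrt(sum(c * c for c in op))
    if chord == 0.0:
        raise ValueError("P must differ from O")
    sin_phi = abs(sum(a * b for a, b in zip(op, normal))) / chord
    phi = math.asin(min(1.0, sin_phi))
    # chord = 2 R sin(phi) and arc = 2 R phi, hence r = chord * phi / sin(phi).
    r = chord if phi < 1e-12 else chord * phi / math.sin(phi)
    return r, phi

def decay(r, phi, sigma_t, v=1.0, max_angle=math.pi / 4):
    """Decay function DF, set to zero outside the angular cut-off."""
    if abs(phi) >= max_angle:
        return 0.0
    return math.exp(-(r * r + v * phi * phi) / (sigma_t * sigma_t))

# P lies in the tangent plane at O: the arc degenerates to the straight segment OP.
r, phi = arc_length_and_angle((1.0, 0.0, 0.0), (0.0, 0.0, 1.0))
weight = decay(r, phi, sigma_t=1.0)
```

With P in the tangent plane, the curvature term vanishes and the weight reduces to a Gaussian of the distance.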

The rotation \({\textbf{R}}(\varvec{\Omega })\in {\mathbb {R}}^{3\times 3}\) is defined by the rotation vector \(\varvec{\Omega }\in {\mathbb {R}}^3\), which transforms the vector \({\hat{\textbf{e}}}_0\) into the vector \({\hat{\textbf{e}}}'_0\), where \({\hat{\textbf{e}}}'_0\) and \({\hat{\textbf{e}}}_0\) are symmetrical with respect to the perpendicular bisector of the segment OP. This allows for computing the cast tensor \({\mathbb {T}}'_{stick}\in {\mathbb {R}}^{3\times 3}\) as follows:

$$\begin{aligned} {\mathbb {T}}'_{stick} = DF(r,\phi ,\sigma _T){\textbf{R}}(\varvec{\Omega }){\mathbb {T}}_{stick}{\textbf{R}}^{T}(\varvec{\Omega }) \text{, } \end{aligned}$$

where \(\cdot ^{T}\) is the transposition operation.
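The stick vote can be sketched as follows, under stated assumptions. The voted normal \({\hat{\textbf{e}}}'_0\) is obtained by mirroring \({\hat{\textbf{e}}}_0\) across the bisector of OP, and \({\textbf{R}}(\varvec{\Omega })\) is then built with Rodrigues' formula as the rotation sending \({\hat{\textbf{e}}}_0\) onto \({\hat{\textbf{e}}}'_0\). The angle \(\phi\) is taken as unsigned, `v=1.0` is an arbitrary default, and the names `rotation_between` and `stick_vote` are hypothetical.

```python
import numpy as np

def rotation_between(a, b):
    """Rotation matrix sending unit vector a onto unit vector b (requires a != -b)."""
    axis = np.cross(a, b)
    s2 = float(axis @ axis)  # squared sine of the rotation angle
    c = float(a @ b)         # cosine of the rotation angle
    if s2 < 1e-24:
        return np.eye(3)
    K = np.array([[0.0, -axis[2], axis[1]],
                  [axis[2], 0.0, -axis[0]],
                  [-axis[1], axis[0], 0.0]])
    # Rodrigues' formula written with the unnormalized axis.
    return np.eye(3) + K + K @ K * ((1.0 - c) / s2)

def stick_vote(normal, op, sigma_t, v=1.0):
    """Tensor cast at P = O + op by a stick tensor with the given normal at O."""
    n = np.asarray(normal, dtype=float)
    n = n / np.linalg.norm(n)
    op = np.asarray(op, dtype=float)
    chord = np.linalg.norm(op)
    if chord == 0.0:
        raise ValueError("the cast location P must differ from O")
    u = op / chord
    phi = np.arcsin(min(1.0, abs(float(n @ u))))
    if phi >= np.pi / 4:  # outside the angular cut-off: no vote
        return np.zeros((3, 3))
    r = chord if phi < 1e-12 else chord * phi / np.sin(phi)
    df = np.exp(-(r**2 + v * phi**2) / sigma_t**2)
    n_prime = n - 2.0 * (n @ u) * u  # mirror image of n across the bisector of OP
    R = rotation_between(n, n_prime)
    return df * R @ np.outer(n, n) @ R.T

vote = stick_vote([0.0, 0.0, 1.0], [1.0, 0.0, 0.5], sigma_t=1.0)
```

For this example the osculating circle has its center at (0, 0, 1.25), so the cast tensor is aligned with the direction (-0.8, 0, 0.6), which points from P toward that center.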

The plate tensor can be written \({{\mathbb {T}}_{plate} = {\hat{\textbf{e}}}_0 {\hat{\textbf{e}}}_0^T + {\hat{\textbf{e}}}_1 {\hat{\textbf{e}}}_1^T}\), while the ball tensor is written \({{\mathbb {T}}_{ball} = {\hat{\textbf{e}}}_0 {\hat{\textbf{e}}}_0^T + {\hat{\textbf{e}}}_1 {\hat{\textbf{e}}}_1^T + {\hat{\textbf{e}}}_2 {\hat{\textbf{e}}}_2^T}\). The plate and ball kernels are derived from the stick kernel by integrating stick tensors. Approximating these integrals by sums of tensors gives:

$$\begin{aligned} {\mathbb {T}}_{plate}' \approx \sum _{i=0}^{I} DF(r,\phi ,\sigma _T) {\textbf{R}}(\varvec{\Omega }){\mathbb {T}}_{stick}(i\Delta _\rho ){\textbf{R}}^{T}(\varvec{\Omega }) \Delta _\rho , \end{aligned}$$
$$\begin{aligned} \begin{array}{rl} {\mathbb {T}}_{ball}' \approx \sum _{i=0}^{I} \sum _{j=-J/2}^{J/2}&{} DF(r,\phi ,\sigma _T) {\textbf{R}}(\varvec{\Omega }){\mathbb {T}}_{stick}(i\Delta _\rho ,j\Delta _\psi )\\ &{}{\textbf{R}}^{T}(\varvec{\Omega })\sin (j\Delta _\psi ) \Delta _\psi \Delta _\rho , \end{array} \end{aligned}$$

where \(\Delta _\rho = \frac{\pi }{I}\) and \(\Delta _\psi =\frac{\pi }{J}\), and \(I,J\in {\mathbb {N}}\) are arbitrary constants. Note that these kernels are usually precomputed for computational efficiency.
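The plate sum can be sketched by rotating a stick normal in the plane spanned by \({\hat{\textbf{e}}}_0\) and \({\hat{\textbf{e}}}_1\) and accumulating the corresponding stick votes. This is only one possible reading of \({\mathbb {T}}_{stick}(i\Delta _\rho )\), stated here as an assumption. The compact `stick_vote` helper, the default `v=1.0` and `num_steps=32` are likewise hypothetical. The ball kernel would add a second loop over \(j\Delta _\psi\) with the \(\sin (j\Delta _\psi )\) weight.

```python
import numpy as np

def stick_vote(normal, op, sigma_t, v=1.0):
    """Compact stick vote DF * n' n'^T, with n' the mirrored normal (assumed geometry)."""
    chord = np.linalg.norm(op)
    u = op / chord
    phi = np.arcsin(min(1.0, abs(float(normal @ u))))
    if phi >= np.pi / 4:  # outside the angular cut-off: no vote
        return np.zeros((3, 3))
    r = chord if phi < 1e-12 else chord * phi / np.sin(phi)
    n_prime = normal - 2.0 * (normal @ u) * u
    return np.exp(-(r**2 + v * phi**2) / sigma_t**2) * np.outer(n_prime, n_prime)

def plate_vote(e0, e1, op, sigma_t, num_steps=32):
    """Approximate plate vote as a sum of stick votes over rho in [0, pi]."""
    delta_rho = np.pi / num_steps
    vote = np.zeros((3, 3))
    for i in range(num_steps + 1):
        rho = i * delta_rho
        normal = np.cos(rho) * e0 + np.sin(rho) * e1  # unit normal in the (e0, e1) plane
        vote += stick_vote(normal, op, sigma_t) * delta_rho
    return vote

e0 = np.array([1.0, 0.0, 0.0])
e1 = np.array([0.0, 1.0, 0.0])
# Cast location along e2, i.e. along the tangent direction of the encoded curve.
vote = plate_vote(e0, e1, np.array([0.0, 0.0, 1.0]), sigma_t=1.0)
```

Because the loop includes both endpoints \(\rho =0\) and \(\rho =\pi\), the direction \({\hat{\textbf{e}}}_0\) is counted twice, which matches the summation bounds above but slightly unbalances the result for small step counts.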

Then, any tensor \({\mathbb {T}}(s)\) at location \(s\in {\mathbb {R}}^3\) can be decomposed from Eq. (A1) in a basis \(({\hat{\textbf{e}}}_0,{\hat{\textbf{e}}}_1,{\hat{\textbf{e}}}_2 )\) as \({\mathbb {T}}(s) = (\lambda _0-\lambda _1){\hat{\textbf{e}}}_0{\hat{\textbf{e}}}_0^T + (\lambda _1-\lambda _2)({\hat{\textbf{e}}}_0{\hat{\textbf{e}}}_0^T + {\hat{\textbf{e}}}_1{\hat{\textbf{e}}}_1^T) + \lambda _2({\hat{\textbf{e}}}_0{\hat{\textbf{e}}}_0^T + {\hat{\textbf{e}}}_1{\hat{\textbf{e}}}_1^T + {\hat{\textbf{e}}}_2{\hat{\textbf{e}}}_2^T)\), and the vote cast at location \(t\in {\mathbb {R}}^3\) is written:

$$\begin{aligned} \begin{array}{ll} VF({\mathbb {T}},\vec {st}) &{} = (\lambda _0-\lambda _1)VF({\mathbb {T}}_{stick}(t),\vec {st}) \\ &{}\quad + (\lambda _1-\lambda _2)VF({\mathbb {T}}_{plate}(t),\vec {st}) \\ &{}\quad + \lambda _2 VF({\mathbb {T}}_{ball}(t),\vec {st}) \end{array} \end{aligned}$$

Having introduced the voting operation for one tensor, let us specify the global voting process.

Let \({\mathcal {S}}_0,{\mathcal {S}}_1\subset {\mathcal {S}}\) be the sets of voters and of cast locations, respectively. Then, for every location \(p\in {\mathcal {S}}\),

$$\begin{aligned} \left\{ \begin{array}{ccl} \forall p \not \in {\mathcal {S}}_1, &{} {\mathbb {T}}'(p) = &{} {\mathbb {T}}(p) \text{, }\\ \forall p\in {\mathcal {S}}_1, &{} {\mathbb {T}}'(p) = &{} {\mathbb {T}}(p) + \sum _{s\in {\mathcal {S}}_0} VF({\mathbb {T}}(s),\vec {sp}) \text{, } \end{array} \right. \end{aligned}$$

where \({\mathbb {T}}(p)\) and \({\mathbb {T}}'(p)\) are the tensors at location p before and after the vote, respectively.
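The global voting process can be sketched as a double loop over cast locations and voters. The function `global_vote`, the dictionary-based storage and the isotropic `toy_vote` stand-in are hypothetical: an actual implementation would plug in the stick, plate and ball votes described above. Skipping the case where a voter coincides with the cast location is also an assumption, motivated by the requirement \(r>0\).

```python
import numpy as np

def global_vote(tensors, voters, receivers, vote_fn):
    """Accumulate votes from the voters (S_0) at the cast locations (S_1).

    tensors: dict mapping a location tuple to a 3x3 array.
    vote_fn(T, sp): tensor cast by T along the vector sp (any voting function).
    """
    updated = {p: T.copy() for p, T in tensors.items()}
    for p in receivers:
        for s in voters:
            if s == p:
                continue  # assumption: a voter does not vote for itself (r > 0)
            sp = np.subtract(p, s)
            updated[p] += vote_fn(tensors[s], sp)
    return updated

def toy_vote(T, sp):
    """Hypothetical isotropic stand-in for VF: Gaussian attenuation of the source tensor."""
    return np.exp(-float(sp @ sp)) * T

tensors = {
    (0.0, 0.0, 0.0): np.eye(3),
    (1.0, 0.0, 0.0): np.zeros((3, 3)),
    (2.0, 0.0, 0.0): np.diag([1.0, 0.0, 0.0]),
}
updated = global_vote(tensors, voters=[(0.0, 0.0, 0.0)],
                      receivers=[(1.0, 0.0, 0.0)], vote_fn=toy_vote)
```

Locations outside the receiver set keep their original tensor, which mirrors the first case of the definition above, and all votes are computed from the tensors before the vote.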

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ribal, C., Le Hégarat-Mascle, S. & Lermé, N. Thin structures retrieval using anisotropic neighborhoods of superpixels: application to shape-from-focus. Multidim Syst Sign Process 34, 179–204 (2023). https://doi.org/10.1007/s11045-022-00854-8

  • DOI: https://doi.org/10.1007/s11045-022-00854-8