A Bimodal Co-sparse Analysis Model for Image Processing

Kiechle, Martin; Habigt, Tim; Hawe, Simon; Kleinsteuber, Martin

doi:10.1007/s11263-014-0786-5

A Bimodal Co-sparse Analysis Model for Image Processing

Published: 22 November 2014

Volume 114, pages 233–247, (2015)
Cite this article

International Journal of Computer Vision Aims and scope Submit manuscript

Martin Kiechle¹,
Tim Habigt¹,
Simon Hawe¹ &
…
Martin Kleinsteuber¹

787 Accesses
13 Citations
Explore all metrics

Abstract

The success of many computer vision tasks lies in the ability to exploit the interdependency between different image modalities such as intensity and depth. Fusing corresponding information can be achieved on several levels, and one promising approach is the integration at a low level. Moreover, sparse signal models have successfully been used in many vision applications. Within this area of research, the so-called co-sparse analysis model has attracted considerably less attention than its well-known counterpart, the sparse synthesis model, although it has been proven to be very useful in various image processing applications. In this paper, we propose a bimodal co-sparse analysis model that is able to capture the interdependency of two image modalities. It is based on the assumption that a pair of analysis operators exists, so that the co-supports of the corresponding bimodal image structures have a large overlap. We propose an algorithm that is able to learn such a coupled pair of operators from registered and noise-free training data. Furthermore, we explain how this model can be applied to solve linear inverse problems in image processing and how it can be used as a prior in bimodal image registration tasks. This paper extends the work of some of the authors by two major contributions. Firstly, a modification of the learning process is proposed that a priori guarantees unit norm and zero-mean of the rows of the operator. This accounts for the intuition that local texture carries the most important information in image modalities independent of brightness and contrast. Secondly, the model is used in a novel bimodal image registration algorithm, which estimates the transformation parameters of unregistered images of different modalities.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Image Matching from Handcrafted to Deep Features: A Survey

Article Open access 04 August 2020

Ncorr: Open-Source 2D Digital Image Correlation Matlab Software

Article 31 March 2015

Image Fusion Techniques: A Survey

Article 24 January 2021

Notes

http://vision.middlebury.edu/stereo/eval/.

References

Absil, P. A., Mahony, R., & Sepulchre, R. (2008). Optimization Algorithms on Matrix Manifolds. Princeton: Princeton University Press.
Book MATH Google Scholar
Baker, S., & Kanade, T. (2002). Limits on super-resolution and how to break them. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(9), 1167–1183.
Article Google Scholar
Brown, L. G. (1992). A survey of image registration techniques. ACM Computing Surveys (CSUR), 24, 325–376.
Article Google Scholar
Brown, M., & Süsstrunk, S. (2011). Multi-spectral SIFT for scene category recognition. IEEE conference on computer vision and pattern recognition, pp. 177–184.
Candès, E. J., Wakin, M. B., & Boyd, S. P. (2008). Enhancing sparsity by reweighted $\ell _1$ minimization. Journal of Fourier Analysis and Applications, 14(5–6), 877–905.
Article MathSciNet MATH Google Scholar
Chan, D., Buisman, H., Theobalt, C., Thrun, S. (2008). A noise-aware filter for real-time depth upsampling. Workshop on multi-camera and multi-modal sensor fusion algorithms and applications.
Chen, Y., Ranftl, R., & Pock, T. (2014). Insights into analysis operator learning: From patch-based sparse models to higher order MRFs. IEEE Transactions on Image Processing, 23(3), 1060–1072.
Article MathSciNet Google Scholar
Cole-Rhodes, A. A., Johnson, K. L., LeMoigne, J., & Zavorin, I. (2003). Multiresolution registration of remote sensing imagery by optimization of mutual information using a stochastic gradient. IEEE Transactions on Image Processing, 12(12), 1495–1511.
Article MathSciNet Google Scholar
Collignon, A., Maes, F., Delaere, D., Vandermeulen, D., Suetens, P., & Marchal, G. (1995). Automated multi-modality image registration based on information theory. Information Processing in Medical Imaging, 3, 263–274.
Google Scholar
Dai, Y., & Yuan, Y. (2001). An efficient hybrid conjugate gradient method for unconstrained optimization. Annals of Operations Research, 103(1–4), 33–47.
Article MathSciNet MATH Google Scholar
Diebel, J., & Thrun, S. (2005). An application of Markov random fields to range sensing. NIPS, 18, 291–298.
Google Scholar
Elad, M., Milanfar, P., & Rubinstein, R. (2007). Analysis versus synthesis in signal priors. Inverse Problems, 23(3), 947–968.
Article MathSciNet MATH Google Scholar
Fan, X., Rhody, H., & Saber, E. (2010). A spatial-feature-enhanced MMI algorithm for multimodal airborne image registration. IEEE Transactions on Geoscience and Remote Sensing, 48(6), 2580–2589.
Article Google Scholar
Freeman, W. T., Pasztor, E. C., & Carmichael, O. T. (2000). Learning Low-Level Vision. International Journal of Computer Vision, 40(1), 25–47.
Article MATH Google Scholar
Hawe, S., Kleinsteuber, M., & Diepold, K. (2013). Analysis operator learning and its application to image reconstruction. IEEE Transactions on Image Processing, 22(6), 2138–2150.
Article MathSciNet Google Scholar
Hong, C., Dit-Yan, Y., & Yimin, Xiong. (2004). Super-resolution through neighbor embedding. Computer Vision and Pattern Recognition, 1, 275–282.
MATH Google Scholar
Hyder, M., & Mahata, K. (2009). A robust algorithm for joint-sparse recovery. IEEE Signal Processing Letters, 16(12), 1091–1094.
Article Google Scholar
Jia, K., Wang, X., & Tang, X. (2013). Image transformation based on learning dictionaries across image spaces. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(2), 367–380.
Article Google Scholar
Kiechle, M., Hawe, S., & Kleinsteuber, M. (2013). A Joint Intensity and Depth Co-Sparse Analysis Model for Depth Map Super-Resolution. Proceedings of the international conference on computer vision.
Klein, S., Staring, M., Murphy, K., Viergever, M. A., & Pluim, J. P. W. (2010). Elastix: A toolbox for intensity-based medical image registration. IEEE Transactions on Medical Imaging, 29(1), 196–205.
Article Google Scholar
Krotosky, S. J., & Trivedi, M. M. (2007). Mutual information based registration of multimodal stereo videos for person tracking. Computer Vision and Image Understanding, 106(2–3), 270–287.
Article Google Scholar
Li, Y., Xue, T., Sun, L., & Liu, J. (2012) Joint example-based depth map super-resolution. In IEEE international conference on multimedia and expo pp. 152–157.
Liu, C., Shum, H. Y., & Freeman, W. T. (2007). Face hallucination: Theory and practice. International Journal of Computer Vision, 75(1), 115–134.
Article Google Scholar
Lu, J., Min, D., Pahwa, R.S., & Do, M.N. (2011). A revisit to MRF-based depth map super-resolution and enhancement. ICASSP, pp. 985–988.
Mairal, J., Bach, F., & Ponce, J. (2012). Task-driven dictionary learning. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(4), 791–804.
Article Google Scholar
Mattes, D., Haynor, D. R., Vesselle, H., Lewellen, T. K., & Eubank, W. (2003). PET-CT image registration in the chest using free-form deformations. IEEE Transactions on Medical Imaging, 22(1), 120–128.
Article Google Scholar
Mishali, M., & Eldar, Y. (2008). Reduce and boost: Recovering arbitrary sets of jointly sparse vectors. IEEE Transactions on Signal Processing, 56(10), 4692–4702.
Article MathSciNet Google Scholar
Nam, S., Davies, M. E., Elad, M., & Gribonval, R. (2013). The cosparse analysis model and algorithms. Applied and Computational Harmonic Analysis, 34(1), 30–56.
Article MathSciNet MATH Google Scholar
Ophir, B., Elad, M., Bertin, N., & Plumbley, M.D. (2011). Sequential minimal eigenvalues: An approach to analysis dictionary learning. EUSIPCO, pp. 1465–1469.
Orchard, J. (2007). Efficient least squares multimodal registration with a globally exhaustive alignment search. IEEE Transactions on Image Processing, 16(10), 2526–2534.
Article MathSciNet Google Scholar
Peng, Y., Ganesh, A., Wright, J., Xu, W., & Ma, Y. (2012). RASL: Robust alignment by sparse and low-rank decomposition for linearly correlated images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(11), 2233–2246.
Article Google Scholar
Pluim, J. P. W., Maintz, J. B. A., & Viergever, M. A. (2003). Mutual-information-based registration of medical images: A survey. IEEE Transactions on Medical Imaging, 22(8), 986–1004.
Article Google Scholar
Ravishankar, S., & Bresler, Y. (2013). Learning sparsifying transforms. IEEE Transactions on Signal Processing, 61(5), 1072–1086.
Article MathSciNet Google Scholar
Rubinstein, R., Peleg, T., & Elad, M. (2013). Analysis K-SVD: A dictionary-learning algorithm for the analysis sparse model. IEEE Transactions on Signal Processing, 61(3), 661–677.
Article MathSciNet Google Scholar
Scharstein, D., & Szeliski, R. (2003). High-accuracy stereo depth maps using structured light. IEEE Conference on Computer Vision and Pattern Recognition, pp. 195–202.
Studholme, C., Hill, D., & Hawkes, D. (1999). An overlap invariant entropy measure of 3D medical image alignment. Pattern Recognition, 32(1), 71–86.
Article Google Scholar
Tropp, J. A., Gilbert, A. C., & Strauss, M. J. (2006). Algorithms for simultaneous sparse approximation. Part I: Greedy pursuit. Signal Processing, 86(3), 572–588.
Article MATH Google Scholar
Viola, P., & Wells, W. M, I. I. I. (1997). Alignment by maximization of mutual information. International Journal of Computer Vision, 24(2), 137–154.
Article Google Scholar
Wang, S., Zhang, D., Liang, Y., & Pan, Q. (2012). Semi-coupled dictionary learning with applications to image super-resolution and photo-sketch synthesis. IEEE conference on computer vision and pattern recognition, pp. 2216–2223.
Yaghoobi, M., Nam, S., Gribonval, R., & Davies, M. E. (2013). Constrained overcomplete analysis operator learning for cosparse signal modelling. IEEE Transactions on Signal Processing, 61(9), 2341–2355.
Article Google Scholar
Yang, J., Wright, J., Huang, T., & Ma, Y. (2010). Image super-resolution via sparse representation. IEEE Transactions on Image Processing, 19(11), 2861–2873.
Article MathSciNet Google Scholar
Yang, Q., Yang, R., Davis, J., Nistér, D. (2007). Spatial-depth super resolution for range images. IEEE conference on computer vision and pattern recognition, pp. 1–8.
Zeyde, R., Elad, M., & Protter, M. (2012). On single image scale-up using sparse-representations. Curves and Surfaces.
Zitová, B., & Flusser, J. (2003). Image registration methods: A survey. Image and Vision Computing, 21(11), 977–1000.
Article Google Scholar

Download references

Acknowledgments

This work was supported by the German Federal Ministry of Economics and Technology (BMWi) through Project KF3057001TL2 and by the Cluster of Excellence CoTeSys - Cognition for Technical Systems, funded by the German Research Foundation (DFG).

Author information

Authors and Affiliations

Department of Electrical, Electronic and Computer Engineering, Technische Universität München, Arcisstr. 21, 80333, Munich, Germany
Martin Kiechle, Tim Habigt, Simon Hawe & Martin Kleinsteuber

Authors

Martin Kiechle
View author publications
You can also search for this author in PubMed Google Scholar
Tim Habigt
View author publications
You can also search for this author in PubMed Google Scholar
Simon Hawe
View author publications
You can also search for this author in PubMed Google Scholar
Martin Kleinsteuber
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Martin Kiechle.

Additional information

Communicated by Julien Mairal, Francis Bach, Michael Elad.

Appendix I

1.1 Derivation of the Riemannian Gradient in Sect. 5

In this section we derive the Riemannian gradient in Eq. (53) for the bimodal alignment algorithm. We make use of the following criterion for its derivation. Let $\langle \cdot ,\cdot \rangle _\mathbf {P}$ be the Riemannian metric on the Lie group $\mathcal {G}$ inherited from (52) and let $F(\cdot )$ be a smooth real valued function on $\mathcal {G}$. Then the Riemannian gradient of $F$ at $\delta \in \mathcal {G}$ is the unique vector $\mathbf {G} \in T_\delta \mathcal {G}$, the tangent space at $\delta $, such that

$$\begin{aligned} \frac{\mathrm{d} }{\mathrm{d} t } \Big |_{t=0} F(e^{t \mathbf {H}}\delta ) = \langle \mathbf {H},\mathbf {G} \rangle _\mathbf {P} \end{aligned}$$

(56)

holds for all tangent elements $\mathbf {H} \in T_\delta \mathcal {G}$.

For our purpose, we compute the gradient at $\delta = \mathrm{id}$. Now let $B$ be the image region in which we want to align the modalities $I_U$ and $I_V$. We assume that $B$ is rectangular and denote by

$$\begin{aligned} I(\mathbf {x})_{\mathbf {x} \in B} \end{aligned}$$

(57)

the vectorized version of $I$ over the domain $B$. Using Eq. (51) and the fact that $\mathbf {c}:= \mathbf {{\varvec{\Omega }}}^F_U I_U$ is a constant vector, we compute by the chain rule that

$$\begin{aligned}&\tfrac{\mathrm{d} }{\mathrm{d} t } \Big |_{t=0} F(e^{t \mathbf {H}}\tau ) = \tfrac{\mathrm{d}}{\mathrm{d} t } \Big |_{t=0} g\left( \mathbf {c}, \mathbf {{\varvec{\Omega }}}^F_V \left[ (e^{t \mathbf {H}}\tau ) \circ I_V \right] \right) \nonumber \\&\quad = \nabla g\left( \mathbf {c}, \mathbf {{\varvec{\Omega }}}^F_V I_V (\tau \mathbf {x})_{\mathbf {x} \in B} \right) ^\top \mathbf {{\varvec{\Omega }}}^F_V \left[ \tfrac{\mathrm{d} }{\mathrm{d} t } \Big |_{t=0} I_V (e^{t \mathbf {H}} \tau \mathbf {x})_{\mathbf {x} \in B}\right] . \end{aligned}$$

(58)

The last bracket is a vector where each of its entries is computed as

$$\begin{aligned} \tfrac{\mathrm{d} }{\mathrm{d} t } \Big |_{t=0} I_V (e^{t \mathbf {H}}\tau \mathbf {x})&= \nabla I_V(\tau \mathbf {x})^\top \mathbf {H} \tau \mathbf {x} \nonumber \\&= \mathrm{vec} (\tau \mathbf {x} \otimes \nabla I_V(\tau \mathbf {x}))^\top \mathrm{vec}(\mathbf {H}), \end{aligned}$$

(59)

where, as usual, $\mathrm{vec}(\cdot )$ denotes the linear operator that stacks the columns of a matrix among each other and $\otimes $ is the Kronecker product. Note, that since we stick to the representation with homogeneous coordinates, $\nabla I_V(\mathbf {x}) \in \mathbb {R}^3$ is the common image gradient of $I_V$ with an additional $0$ in the third component.

Thus, with

$$\begin{aligned}&\mathbf {r}^\top \nonumber \\&\quad :=\nabla g\left( \mathbf {c}, \mathbf {{\varvec{\Omega }}}_V^F I_V (\tau \mathbf {x})_{\mathbf {x} \in B} \right) ^\top \mathbf {{\varvec{\Omega }}}_V^F \left( \mathrm{vec} (\tau \mathbf {x} \otimes \nabla I_V(\tau \mathbf {x}))^\top \right) _{\mathbf {x} \in B}, \end{aligned}$$

(60)

we have

$$\begin{aligned} \frac{\mathrm{d} }{\mathrm{d} t } \Big |_{t=0} F(e^{t \mathbf {H}}\delta ) = \mathbf {r}^\top \mathrm{vec}(\mathbf {H})&= {{\mathrm{tr}}}(\mathrm{vec}^{-1}(\mathbf {r}) \mathbf {H}^\top ) \nonumber \\&= \langle \mathrm{vec}^{-1}(\mathbf {r})\odot \hat{\mathbf {P}} , \mathbf {H} \rangle _{\mathbf {P}}, \end{aligned}$$

(61)

where the entries of $\hat{\mathbf {P}}$ are the inverse of the entries of $\mathbf {P}$.

Using Eq. (56), the Riemannian gradient is therefore the orthogonal projection of $\mathrm{vec}^{-1}(\mathbf {r})\odot \hat{\mathbf {P}}$ with respect to $\langle \cdot , \cdot \rangle _{\mathbf {P}}$ onto the tangent space of $\delta = \mathrm{id}$, which is nothing else than the Lie algebra $\mathfrak {g}$, i.e.

$$\begin{aligned} \mathrm{grad}_\delta F(\delta \circ \tau ) = \Pi _{\mathfrak g}\left( \mathrm{vec}^{-1}(\mathbf {r})\odot \hat{\mathbf {P}} \right) . \end{aligned}$$

(62)

If we further assume for the entries $p_{ij}$ of $\mathbf {P}$ that

$$\begin{aligned} p_{11}=p_{22} \text { and } p_{12}=p_{21}, \end{aligned}$$

(63)

then, for the considered Lie groups, these projections are explicitly given by

$$\begin{aligned} \Pi _{SO}(\mathbf {X})&= \begin{bmatrix} \tfrac{1}{2}(\mathbf {X}_{11}-\mathbf {X}_{11}^\top )&0 \\ 0&0 \end{bmatrix} \end{aligned}$$

(64)

$$\begin{aligned} \Pi _{SE}(\mathbf {X})&= \begin{bmatrix} \tfrac{1}{2}(\mathbf {X}_{11}-\mathbf {X}_{11}^\top )&\mathbf {x}_{12} \\ 0&0 \end{bmatrix} \end{aligned}$$

(65)

$$\begin{aligned} \Pi _{SA}(\mathbf {X})&= \begin{bmatrix} (\mathbf {X}_{11}-\tfrac{1}{2}{{\mathrm{tr}}}(\mathbf {X}_{11}) \mathbf {I}_2)&\mathbf {x}_{12} \\ 0&0 \end{bmatrix}\end{aligned}$$

(66)

$$\begin{aligned} \Pi _{A}(\mathbf {X})&= \begin{bmatrix} \mathbf {X}_{11}&\mathbf {x}_{12} \\ 0&0 \end{bmatrix}, \end{aligned}$$

(67)

where $\mathbf {X} \in \mathbb {R}^{3 \times 3}$ is partitioned as

$$\begin{aligned} \mathbf {X} = \begin{bmatrix} \mathbf {X}_{11}&\mathbf {x}_{12} \\ \mathbf {x}_{21}^\top&x_{22} \end{bmatrix}. \end{aligned}$$

(68)

Rights and permissions

Reprints and permissions

About this article

Cite this article

Kiechle, M., Habigt, T., Hawe, S. et al. A Bimodal Co-sparse Analysis Model for Image Processing. Int J Comput Vis 114, 233–247 (2015). https://doi.org/10.1007/s11263-014-0786-5

Download citation

Received: 14 February 2014
Accepted: 06 November 2014
Published: 22 November 2014
Issue Date: September 2015
DOI: https://doi.org/10.1007/s11263-014-0786-5

Keywords

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

A Bimodal Co-sparse Analysis Model for Image Processing

Abstract

Access this article

Similar content being viewed by others

Image Matching from Handcrafted to Deep Features: A Survey

Ncorr: Open-Source 2D Digital Image Correlation Matlab Software

Image Fusion Techniques: A Survey

Notes

References

Acknowledgments

Author information

Authors and Affiliations

Corresponding author

Additional information

Appendix I

1.1 Derivation of the Riemannian Gradient in Sect. 5

Rights and permissions

About this article

Cite this article

Keywords

Navigation

A Bimodal Co-sparse Analysis Model for Image Processing

Abstract

Access this article

Similar content being viewed by others

Image Matching from Handcrafted to Deep Features: A Survey

Ncorr: Open-Source 2D Digital Image Correlation Matlab Software

Image Fusion Techniques: A Survey

Notes

References

Acknowledgments

Author information

Authors and Affiliations

Corresponding author

Additional information

Appendix I

Appendix I

1.1 Derivation of the Riemannian Gradient in Sect. 5

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation