Abstract
The rapid development of deep learning techniques has driven research improvements across many stages of the stereo matching pipeline. Because 3D convolution places high demands on computing resources and 2D convolution is domain sensitive, some stereo matching networks have begun to shift from fully convolutional structures to recurrent structures, using the hidden-state update mechanism of recurrent units to achieve global consistency of disparity-related information. In this paper, a new recurrent convolutional model is constructed based on a two-dimensional spiking neural computational system, and three types of recurrent units are designed by setting different parameters. The newly designed recurrent units are applied to a recent recurrent stereo matching network for better disparity propagation. Starting from the definition of two-dimensional gated spiking neural P systems, the spiking mechanism of a single neuron is extended to multiple neurons arranged in a two-dimensional array with local topological connections. The state update mechanism is parameterized in a form that supports back-propagation, thus realizing a new type of recurrent convolutional model. The proposed model can be embedded into existing recurrent stereo matching networks. Experimental results demonstrate that it effectively reduces the computational load of the baseline method and achieves accuracy comparable to existing state-of-the-art methods.
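As a rough illustration of what a gated state update over a locally connected two-dimensional array of neurons can look like, the following sketch implements a generic single-gate convolutional recurrent step in plain Python. It is not the update rule of the proposed model, whose equations are not given in the abstract. The function names, the parameter dictionary keys, the 3x3 neighborhood, and the sigmoid/tanh choices are all assumptions made for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def conv3x3(grid, kernel):
    """Zero-padded 3x3 cross-correlation over a 2D list of floats."""
    h, w = len(grid), len(grid[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    y, x = i + di, j + dj
                    if 0 <= y < h and 0 <= x < w:
                        acc += kernel[di + 1][dj + 1] * grid[y][x]
            out[i][j] = acc
    return out


def gated_step(state, inp, params):
    """One gated update of a 2D state grid (hypothetical, illustrative form).

    Each neuron reads its 3x3 neighborhood in the previous state and in the
    input, then blends its old state with a candidate value through a gate.
    """
    gate_s = conv3x3(state, params["k_gate_state"])
    gate_x = conv3x3(inp, params["k_gate_input"])
    cand_s = conv3x3(state, params["k_cand_state"])
    cand_x = conv3x3(inp, params["k_cand_input"])
    h, w = len(state), len(state[0])
    new_state = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            g = sigmoid(gate_s[i][j] + gate_x[i][j])    # gate value in (0, 1)
            c = math.tanh(cand_s[i][j] + cand_x[i][j])  # candidate in (-1, 1)
            new_state[i][j] = (1.0 - g) * state[i][j] + g * c
    return new_state


# Usage: a single active input location spreads through local connections.
avg = [[1.0 / 9.0] * 3 for _ in range(3)]  # assumed averaging kernel
params = {
    "k_gate_state": avg,
    "k_gate_input": avg,
    "k_cand_state": avg,
    "k_cand_input": avg,
}
state = [[0.0] * 4 for _ in range(4)]
inp = [[1.0 if (i, j) == (1, 1) else 0.0 for j in range(4)] for i in range(4)]
for _ in range(3):
    state = gated_step(state, inp, params)
```

Every operation in the step (sums, products, sigmoid, tanh) is differentiable, so in an automatic differentiation framework the kernels could be learned by back-propagation. Repeating the step lets information from one location reach neurons outside its immediate 3x3 neighborhood, which is the kind of propagation a recurrent unit relies on.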
Acknowledgements
This work was partially supported by the National Natural Science Foundation of China (Nos. 62076206 and 62176216).
Ethics declarations
Conflicts of interest
The authors have no competing interests to declare that are relevant to the content of this article.
Consent for publication
The paper is original in its contents and is not under consideration for publication in any other journals/proceedings. The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.
About this article
Cite this article
Guo, C., Peng, H. & Wang, J. Recurrent convolutional model based on gated spiking neural P system for stereo matching networks. Appl Intell 53, 29570–29584 (2023). https://doi.org/10.1007/s10489-023-05091-5
DOI: https://doi.org/10.1007/s10489-023-05091-5