Abstract
The Internet of Things (IoT) has been greatly empowered by the rapid development of big-data ecosystems. One example is remote sensing technology, which continuously acquires accurate, high-quality images to support subsequent image processing and content analysis on embedded devices. Object counting, which aims to estimate the number of objects in a captured image, is one of the most crucial tasks in multimedia data analysis and wireless networks. However, several inherent factors seriously degrade counting performance in remote sensing, e.g., background clutter, scale variation, and orientation arbitrariness. In this paper, we tackle these problems in a divide-and-conquer manner by devising the dense attention fusion network (DAFNet). Specifically, we introduce an iterative attention fusion (IAF) module, which relies mainly on the multiscale channel attention (MCA) unit, to alleviate the side effects of background clutter. Meanwhile, to overcome intrinsic scale variations, we build a dense spatial pyramid (DSP) module that exploits the hierarchical information obtained under diverse receptive fields. Finally, we stack deformable convolution layers to deal with orientation arbitrariness. The synergy of the proposed IAF and DSP modules substantially improves the effectiveness of DAFNet, as demonstrated by its notable superiority over state-of-the-art competitors in extensive experiments on remote sensing counting datasets.
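As a rough illustration of the attention-fusion idea named in the abstract, the sketch below blends two feature maps with per-channel weights and then re-estimates those weights from the fused result. It is not the DAFNet implementation: the abstract does not specify the internals of the MCA unit or the IAF module, so the function names, the parameter-free sigmoid gate over global average pooling, and the default of two refinement steps are all assumptions made for this example. A real unit would presumably use learned convolutions at several scales.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def channel_weights(feature):
    """Squash each channel's global average into a weight in (0, 1).

    Assumption: a parameter-free stand-in for a learned channel-attention unit.
    """
    weights = []
    for channel in feature:
        total = sum(sum(row) for row in channel)
        count = len(channel) * len(channel[0])
        weights.append(sigmoid(total / count))
    return weights


def add_maps(x, y):
    """Element-wise sum of two feature maps shaped [channel][row][col]."""
    return [[[a + b for a, b in zip(rx, ry)] for rx, ry in zip(cx, cy)]
            for cx, cy in zip(x, y)]


def blend(x, y, weights):
    """Per channel: w * x + (1 - w) * y."""
    return [[[w * a + (1.0 - w) * b for a, b in zip(rx, ry)]
             for rx, ry in zip(cx, cy)]
            for w, cx, cy in zip(weights, x, y)]


def iterative_attention_fuse(x, y, steps=2):
    """Hypothetical iterative fusion: refine the weights from the fused map."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    fused = add_maps(x, y)  # initial integration of the two inputs
    for _ in range(steps):
        weights = channel_weights(fused)
        fused = blend(x, y, weights)
    return fused, weights


# Two toy feature maps with 2 channels of size 2x2.
x = [[[1.0, 1.0], [1.0, 1.0]], [[-2.0, -2.0], [-2.0, -2.0]]]
y = [[[1.0, 1.0], [1.0, 1.0]], [[2.0, 2.0], [2.0, 2.0]]]
fused, weights = iterative_attention_fuse(x, y)
```

In this toy setup the fused map keeps the shape of the inputs, and each channel receives a single weight that decides how much of `x` versus `y` survives the fusion.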
Data Availability
Not applicable.
Funding
This work is supported in part by the National Natural Science Foundation of China (Nos. 61601266 and 61801272) and National Natural Science Foundation of Shandong Province (Nos. ZR2021QD041 and ZR2020MF127).
Author information
Contributions
Xiangyu Guo: Conceptualization, Methodology, Data Curation, and Writing - Original Draft. Mingliang Gao: Supervision, Formal Analysis, Investigation, and Funding Acquisition. Wenzhe Zhai: Data Curation, Data Visualization, and Investigation. Qilei Li: Investigation and Software. Kyu Hyung Kim: Formal Analysis and Writing - Review & Editing. Gwanggil Jeon: Validation and Writing - Review & Editing.
Ethics declarations
Ethics approval
We declare that there are no ethical issues.
Conflict of Interests
We declare that we have no conflict of interest.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Guo, X., Gao, M., Zhai, W. et al. Dense Attention Fusion Network for Object Counting in IoT System. Mobile Netw Appl 28, 359–368 (2023). https://doi.org/10.1007/s11036-023-02090-1
DOI: https://doi.org/10.1007/s11036-023-02090-1