Abstract
Recently, Meta AI Research introduced a general, promptable Segment Anything Model (SAM) pre-trained on an unprecedentedly large segmentation dataset (SA-1B). The emergence of SAM will undoubtedly yield significant benefits for a wide array of practical image segmentation applications. In this study, we conduct a series of intriguing investigations into the performance of SAM across various applications, particularly in the fields of natural images, agriculture, manufacturing, remote sensing and healthcare. We analyze and discuss the benefits and limitations of SAM, and present an outlook on its future development in segmentation tasks. By doing so, we aim to provide a comprehensive understanding of SAM's practical applications. This work is expected to provide insights that facilitate future research activities toward generic segmentation. Source code is publicly available at https://github.com/LiuTingWed/SAM-Not-Perfect.
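For context on what "promptable" means in practice, the following minimal sketch shows how SAM is typically queried with a single point prompt through Meta's open-source segment-anything package; the checkpoint file name, example image and click coordinates are illustrative assumptions rather than the exact setup evaluated in this paper.

    import numpy as np
    import cv2
    from segment_anything import sam_model_registry, SamPredictor

    # Build SAM with a ViT-H backbone from a locally downloaded checkpoint (assumed path).
    sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
    predictor = SamPredictor(sam)

    # SAM expects an RGB image; the image embedding is computed once, then prompted repeatedly.
    image = cv2.cvtColor(cv2.imread("example.jpg"), cv2.COLOR_BGR2RGB)
    predictor.set_image(image)

    # A single foreground click at (x, y); label 1 marks foreground, 0 would mark background.
    point_coords = np.array([[500, 375]])
    point_labels = np.array([1])

    # Ask for multiple candidate masks and keep the one SAM scores highest.
    masks, scores, _ = predictor.predict(
        point_coords=point_coords,
        point_labels=point_labels,
        multimask_output=True,
    )
    best_mask = masks[np.argmax(scores)]

Requesting several candidate masks and ranking them by the predicted scores is the usual way to resolve the inherent ambiguity of a single click prompt.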
Change history
13 September 2024
An Erratum to this paper has been published: https://doi.org/10.1007/s11633-024-1526-0
11 September 2024
An Erratum to this paper has been published: https://doi.org/10.1007/s11633-024-1526-0
Acknowledgements
We thank Meta AI Research for their valuable and impressive work in providing the open-source SAM model and SA-1B dataset. This study is partially supported by Mitacs, CFI-JELF and NSERC Discovery grants. The views and conclusions contained in this paper are those of the authors and should not be interpreted as representing any funding agency.
Ethics declarations
The authors declare that they have no conflicts of interest regarding this work.
Additional information
Colored figures are available in the online version at https://link.springer.com/journal/11633
The original version of this article was revised due to a retrospective Open Access order.
Wei Ji is currently a Ph.D. degree candidate at the University of Alberta, Canada. He worked as a visiting Ph.D. student at Johns Hopkins University, USA. He was a CVPR Best Paper Award candidate and a MICCAI Young Scientist Award nominee.
His research interests include saliency detection, image segmentation, and multimodal robust learning.
Jingjing Li is currently a Ph.D. degree candidate at the University of Alberta, Canada.
Her research interests include designing deep neural networks and applying deep learning in various fields of low-level vision, such as RGB salient object detection, RGB-D salient object detection, video object segmentation, and medical image segmentation.
Qi Bi is currently a Ph.D. degree candidate with the Computer Vision Research Group, University of Amsterdam, The Netherlands. He was recognized as an outstanding reviewer for CVPR 2023, and his work was shortlisted as a CVPR 2021 best paper candidate.
His research interests include image understanding, robust vision in bad weather and domain generalization.
Tingwei Liu is currently a Ph.D. degree candidate at Dalian University of Technology, China.
His research interests include scene understanding, salient object detection and medical image segmentation.
Wenbo Li received the Ph.D. degree in computer science from the State University of New York at Albany, USA in 2019. He is currently a staff researcher at Samsung Research America, USA.
His research interests include visual generation and computer vision.
Li Cheng received the Ph.D. degree in computer science from the University of Alberta, Canada in 2004. He is currently a full professor at the University of Alberta, Canada. He previously worked with the Statistical Machine Learning Group at National Information and Communications Technology Australia Limited (NICTA), Australia, the Toyota Technological Institute at Chicago, USA, and the University of Alberta, Canada.
His research interests include computer vision and machine learning.
Rights and permissions
Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.
The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Ji, W., Li, J., Bi, Q. et al. Segment Anything Is Not Always Perfect: An Investigation of SAM on Different Real-world Applications. Mach. Intell. Res. 21, 617–630 (2024). https://doi.org/10.1007/s11633-023-1385-0