Computer Engineering ›› 2025, Vol. 51 ›› Issue (8): 16-38. doi: 10.19678/j.issn.1000-3428.0070619

• Hot Topics and Reviews •

Review of Application of SAM and Its Improved Models in Image Segmentation

Mayilamu Musideke, GAO Yuxin, ZHANG Situo, FENG Ke, Abudukelimu Abulizi, Halidanmu Abudukelimu*

  1. College of Information Management, Xinjiang University of Finance and Economics, Urumqi 830012, Xinjiang, China
  • Received: 2024-11-18 Revised: 2025-02-11 Online: 2025-08-15 Published: 2025-08-15
  • Contact: Halidanmu Abudukelimu
  • Supported by:
    National Natural Science Foundation of China (62366050); Natural Science Foundation of Xinjiang Uygur Autonomous Region (2024D01A38); Project of the Key Laboratory of Opto-Electronic Information Technology, Ministry of Education (Tianjin University) (2024KFKTO16)

Abstract:

With the rapid development of artificial general intelligence, foundation models are attracting growing attention across many fields. In image segmentation, the Segment Anything Model (SAM), a core foundation model, offers clear advantages in image understanding and processing efficiency. Although SAM performs well on image segmentation tasks, it leaves room for improvement in power consumption, computational efficiency, and adaptability to different application scenarios. This review therefore examines improvements to SAM along five dimensions: speed and computational efficiency, accuracy and robustness, adaptability and generality, prompt engineering design, and data utilization efficiency and transfer learning capability. With these improvements, SAM can maintain efficient performance on more complex tasks and better meet the requirements of different fields and application scenarios. The review then summarizes practical applications of SAM in fields such as medical imaging, remote sensing, and the mechanical industry, showing both its applicability and the challenges it faces in each setting. It also describes the datasets and evaluation metrics commonly used in image segmentation and, through comparative experiments, evaluates the effect of Vision Transformer (ViT) variants on the performance of SAM, as well as the performance of improved models such as EfficientSAM, EfficientViT-SAM, MobileSAM, and RobustSAM. Finally, it summarizes the challenges that SAM and its improved models face in real-world applications and outlines future research directions. The review aims to give researchers a fuller understanding of the improvements and applications of SAM and its variants and to inform the design of new models.

Key words: Segment Anything Model (SAM), Vision Foundation Model (VFM), improved model, image segmentation, general model