MAYILAMU Musideke, GAO Yuxin, ZHANG Situo, FENG Ke, ABUDUKELIMU Abulizi, HALIDANMU Abudukelimu
Accepted: 2025-04-08
With the rapid advancement of general artificial intelligence technology, the application of foundation models across fields has attracted increasing attention. In image segmentation, the Segment Anything Model (SAM), a core foundation model, has demonstrated significant advantages in both image understanding and processing efficiency. Although SAM performs strongly on segmentation tasks, considerable room for optimization remains in power consumption, computational efficiency, and adaptability to diverse application scenarios. This paper examines improvements to SAM along several key dimensions: speed and computational efficiency, accuracy and robustness, adaptability and generalization, prompt engineering, and data utilization and transfer learning. These improvements aim to let SAM sustain high efficiency on more complex tasks while better meeting the requirements of different fields and application contexts. The paper also summarizes practical applications of SAM in fields including medical imaging, remote sensing, and the mechanical industry, showing both its suitability and its challenges in each scenario, and gives a detailed overview of the datasets and evaluation metrics commonly used in image segmentation. Comparative experiments assess the impact of Vision Transformer variants on SAM’s performance and evaluate enhanced models such as EfficientSAM, EfficientViT-SAM, MobileSAM, and RobustSAM. Finally, the challenges that SAM and its improved models face in real-world applications are discussed, and future research directions are proposed.
The aim is to provide researchers with a comprehensive understanding of the advancements and applications of SAM and its variants, offering insights that may inform the development of new models.