Abstract: Nuts are widely used as fastening components in mechanical manufacturing, and the quality of their internal threads is critical to mechanical connections. To achieve non-contact defect detection of nut internal threads, this paper first proposes an image acquisition device based on the principle of spherical catadioptric panoramic imaging, then uses the device to collect an image dataset and proposes a defect detection algorithm based on an improved YOLOv7. The imaging device captures the entire internal thread in a single shot without reaching into the bore and preserves complete thread detail, effectively overcoming the low imaging resolution and small camera field-of-view coverage of traditional visual inspection schemes. Tailored to the defect characteristics of nut internal threads, the improved YOLOv7 uses the k-means++ algorithm to cluster anchor boxes so that training converges more easily, adds a coordinate attention (CA) mechanism to the feature fusion network to strengthen the network's feature representation, and replaces the CIoU (complete intersection over union) loss of the original YOLOv7 with the SIoU (Scylla intersection over union) loss to better estimate model performance and reduce regression error. Experiments show that for four defect types, namely thread breaches, missed tapping (unthreaded), scratches, and debris, the improved algorithm achieves average precision (AP) of 96.89%, 100%, 98.07%, and 99.98%, respectively, with a mean average precision (mAP) of 98.74% and a detection speed of 39.64 FPS. Compared with other common models, the algorithm attains the highest accuracy and meets the real-time inspection requirements of industrial sites.
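The abstract mentions clustering anchor boxes with k-means++. As a minimal illustrative sketch (not the paper's implementation, and independent of any YOLOv7 code), anchors can be derived from the labeled box sizes `(w, h)` in the dataset, using `1 − IoU` as the distance measure, the metric commonly used for YOLO-style anchor clustering; the function name and parameters here are hypothetical:

```python
import random

def wh_iou(box, centroid):
    """IoU of two (w, h) boxes assumed to share the same top-left corner."""
    iw = min(box[0], centroid[0])
    ih = min(box[1], centroid[1])
    inter = iw * ih
    union = box[0] * box[1] + centroid[0] * centroid[1] - inter
    return inter / union

def kmeans_pp_anchors(boxes, k, iters=100, seed=0):
    """Cluster (w, h) label boxes into k anchor boxes.

    k-means++ seeding: each new centroid is sampled with probability
    proportional to its distance (1 - IoU) from the nearest chosen
    centroid, which spreads the initial anchors over the size range
    and makes convergence easier than uniform random seeding.
    """
    rng = random.Random(seed)
    centroids = [rng.choice(boxes)]
    while len(centroids) < k:
        dists = [min(1 - wh_iou(b, c) for c in centroids) for b in boxes]
        r = rng.uniform(0, sum(dists))
        acc = 0.0
        for b, d in zip(boxes, dists):
            acc += d
            if acc >= r:
                centroids.append(b)
                break
        else:  # guard against float rounding at the end of the scan
            centroids.append(boxes[-1])
    # Standard Lloyd iterations: assign each box to the highest-IoU
    # centroid, then recompute each centroid as the cluster mean.
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for b in boxes:
            j = max(range(k), key=lambda i: wh_iou(b, centroids[i]))
            clusters[j].append(b)
        new = []
        for j, cl in enumerate(clusters):
            if cl:
                new.append((sum(w for w, _ in cl) / len(cl),
                            sum(h for _, h in cl) / len(cl)))
            else:
                new.append(centroids[j])
        if new == centroids:
            break
        centroids = new
    return sorted(centroids, key=lambda wh: wh[0] * wh[1])
```

For example, clustering boxes drawn from two clearly separated size groups with `k=2` recovers one small and one large anchor, sorted by area.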