
Computer Engineering ›› 2024, Vol. 50 ›› Issue (8): 216-228. doi: 10.19678/j.issn.1000-3428.0068285

• Graphics and Image Processing •

Grape Instance De-Overlapping Occlusion Algorithm Based on Self-Supervised Learning

Mei ZENG1, Yihan WANG1, Zhiwei LEI1, Xueyin LIU1,2, Bailin LI1,*

  1. School of Mechanical Engineering, Southwest Jiaotong University, Chengdu 610031, Sichuan, China
  2. Research and Development Center, Sichuan Machinery Research and Design Institute, Chengdu 610063, Sichuan, China
  • Received: 2023-08-24 Published: 2024-08-15 Online: 2023-12-28
  • Contact: Bailin LI
  • Funding:
    Sichuan Science and Technology Program, Key Research and Development Project (2021YFN0020)

Abstract:

Conventional random occlusion algorithms distort the synthetic data they produce when compositing occluded grape images, which can cause grape occlusion prediction to fail. This study therefore proposes an occlusion data synthesis method suited to grape occlusion prediction, together with a self-supervised grape instance de-occlusion prediction algorithm. During data synthesis, a proximity-based occlusion strategy replaces random occlusion when turning complete grape instances into differently occluded instances. Before synthesis, a series of preprocessing steps controls the sizes of the mutually occluding grape instances, so that the synthesized occlusions match real-world conditions and are free of distortion. Occlusion prediction is then split into two parts, mask reconstruction and semantic inpainting, and the corresponding synthetic data are selected to train a mask reconstruction network and a semantic inpainting network, both based on a generic U-Net. To overcome the inability to predict complete instances caused by the crop-size limits of instance segmentation, the algorithm takes both the occluded instance and the occluder instance into account during data synthesis and defines corresponding reconstruction and inpainting functions. In the occlusion prediction phase, a PointRend instance segmentation network trained with an open-source framework, the proposed mask reconstruction network, and the proposed semantic inpainting network are applied in sequence to predict the occluded grapes. On the collected occlusion estimation dataset, the Intersection-over-Union (IoU) between the predicted occluded grape masks and the ground-truth annotations reaches 81.16%, higher than the other compared methods, which indicates that the proposed synthesis algorithm and reconstruction framework are effective for grape occlusion prediction.
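
The abstract does not spell out the synthesis procedure, so the following is only a minimal sketch of what a proximity-based occlusion step could look like, not the authors' implementation. It assumes binary instance masks, uses disks as stand-ins for grape instances, and introduces hypothetical helpers (`disk_mask`, `shift_mask`, `sizes_compatible`, `synthesize_occlusion`) and hypothetical parameters (`max_ratio`, `min_frac`, `max_frac`). The occluder is placed close to the target rather than at a random image location, and the visible remainder together with the original complete mask forms a self-supervised training pair for mask reconstruction.

```python
import numpy as np


def disk_mask(shape, center, radius):
    """Binary mask of a filled disk, a stand-in for one grape instance."""
    rows, cols = np.ogrid[:shape[0], :shape[1]]
    return (rows - center[0]) ** 2 + (cols - center[1]) ** 2 <= radius ** 2


def shift_mask(mask, offset):
    """Translate a binary mask by an integer (row, col) offset.

    Pixels that would leave the frame are dropped.
    """
    out = np.zeros_like(mask)
    rows, cols = np.nonzero(mask)
    rows, cols = rows + offset[0], cols + offset[1]
    keep = (rows >= 0) & (rows < mask.shape[0]) & (cols >= 0) & (cols < mask.shape[1])
    out[rows[keep], cols[keep]] = True
    return out


def sizes_compatible(target, occluder, max_ratio=1.5):
    """Hypothetical size check: reject pairs whose areas differ too much."""
    a, b = int(target.sum()), int(occluder.sum())
    return max(a, b) / min(a, b) <= max_ratio


def synthesize_occlusion(target, occluder, rng, min_frac=0.4, max_frac=0.8):
    """Place the occluder near the target and return (visible, placed).

    Assumption: the occluder centroid is moved to a random direction from
    the target centroid, at a distance of min_frac..max_frac times the
    target's equivalent diameter, so the two instances partially overlap.
    """
    target_centroid = np.argwhere(target).mean(axis=0)
    occluder_centroid = np.argwhere(occluder).mean(axis=0)
    diameter = 2.0 * np.sqrt(target.sum() / np.pi)
    angle = rng.uniform(0.0, 2.0 * np.pi)
    dist = rng.uniform(min_frac, max_frac) * diameter
    goal = target_centroid + dist * np.array([np.sin(angle), np.cos(angle)])
    offset = np.round(goal - occluder_centroid).astype(int)
    placed = shift_mask(occluder, offset)
    visible = target & ~placed  # what remains of the target after occlusion
    return visible, placed


rng = np.random.default_rng(0)
target = disk_mask((128, 128), (64, 64), 20)    # complete instance (label)
occluder = disk_mask((128, 128), (30, 30), 18)  # another complete instance
if sizes_compatible(target, occluder):
    visible, placed = synthesize_occlusion(target, occluder, rng)
    print("occluded fraction:", 1.0 - visible.sum() / target.sum())
```

Here `visible` would serve as the network input and `target` as the reconstruction label. A random-placement baseline would instead draw the occluder position uniformly over the image, which can leave the target untouched or cover it in ways that do not resemble neighbouring grapes.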

Key words: grape occlusion, occlusion prediction, reconstruction network, occlusion synthesis, instance segmentation
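
For reference, the IoU reported in the abstract is the area of the intersection of the predicted and ground-truth masks divided by the area of their union, IoU = |P ∩ G| / |P ∪ G|. The sketch below computes it for two binary masks; the function name `mask_iou` and the convention of returning 1.0 for two empty masks are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np


def mask_iou(pred, gt):
    """Intersection-over-Union of two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return float(np.logical_and(pred, gt).sum() / union)


# Two 2x3 rectangles offset by one column inside a 4x4 image.
pred = np.zeros((4, 4), dtype=bool)
pred[0:2, 0:3] = True   # 6 pixels
gt = np.zeros((4, 4), dtype=bool)
gt[0:2, 1:4] = True     # 6 pixels
# Intersection is 4 pixels, union is 8 pixels.
print(mask_iou(pred, gt))  # 0.5
```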