
计算机工程 (Computer Engineering), 2022, Vol. 48, Issue (12): 232-240. doi: 10.19678/j.issn.1000-3428.0064011

• Graphics and Image Processing •

Few-Shot Image Classification Based on Multi-Resolution Self-Distillation Network

QIU Zhen1,2,3, XI Xuefeng1,2,3, CUI Zhiming1,2,3, SHENG Shengli4, HU Fuyuan1,2,3

  1. School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou, Jiangsu 215000, China;
    2. Suzhou Key Laboratory of Virtual Reality Intelligent Interaction and Application Technology, Suzhou, Jiangsu 215000, China;
    3. Suzhou Smart City Research Institute, Suzhou, Jiangsu 215000, China;
    4. Department of Computer Science, Texas Tech University, Lubbock 79401, USA
  • Received: 2022-02-23; Revised: 2022-05-02; Published: 2022-05-24
  • About the authors: QIU Zhen (born in 1998), male, master's student; his main research interests are few-shot learning and image processing. XI Xuefeng (corresponding author), associate professor, Ph.D. CUI Zhiming, professor, Ph.D., doctoral supervisor. SHENG Shengli and HU Fuyuan, professors, Ph.D.
  • Funding:
    National Natural Science Foundation of China (61876217, 61876121, 62176175); the "Six Talent Peaks" High-Level Talent Project of Jiangsu Province (XYDXX-086); Suzhou Science and Technology Plan Project (SGC2021078).

Abstract: Traditional multi-resolution networks incur high computational costs when processing image data, owing to the large amount of spatially redundant information the data contain. Self-Distillation (SD) learning strikes a dynamic balance between accuracy and computational cost, effectively improving model accuracy without increasing network depth or width. A Multi-Resolution Self-Distillation Network (MRSDN) is proposed to address the spatial redundancy of input samples in Few-Shot Learning (FSL). A shallow sub-network is split off from the original network to recognize low-resolution representations of images, while the original network retains its ability to recognize high-resolution image features. In addition, an improved Global Attention Mechanism (GAM) is introduced into the multi-resolution network to reduce information loss and amplify global interaction representations. The SD learning method compresses knowledge from the deeper layers of the network into the shallow sub-network to improve the sub-network's generalization ability. On this basis, coarse-grained features from the low-resolution network are fused into the high-resolution network to strengthen the model's ability to extract image features. Experimental results show that MRSDN achieves accuracies of 56.34% and 74.35% on the 5-way 1-shot and 5-way 5-shot tasks of the Mini-ImageNet dataset, respectively, and 59.56% and 78.96% on the corresponding tasks of the Tiered-ImageNet dataset. The proposed network effectively alleviates spatial redundancy in high-resolution image inputs and improves the accuracy of few-shot image classification.
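
The abstract describes the architecture only at a high level. As a rough illustration, the sketch below shows one way the multi-resolution self-distillation idea could be realized in PyTorch: a shallow sub-network classifies a half-resolution copy of the input, its coarse features are fused back into the high-resolution path, and a soft distillation term transfers the deeper branch's knowledge to the shallow branch. The class and function names (MRSDNSketch, self_distillation_loss), all layer sizes, the fusion strategy, and the loss parameters T and alpha are assumptions made for illustration only; the improved global attention mechanism is omitted, and none of this reflects the authors' actual implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class MRSDNSketch(nn.Module):
    """Illustrative two-branch network: a deeper high-resolution path plus a
    shallow sub-network that classifies a downsampled copy of the input."""
    def __init__(self, num_classes: int = 5):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.BatchNorm2d(32), nn.ReLU(inplace=True))
        self.deep_blocks = nn.Sequential(
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.BatchNorm2d(64), nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, 3, padding=1), nn.BatchNorm2d(64), nn.ReLU(inplace=True))
        self.shallow_block = nn.Sequential(
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.BatchNorm2d(64), nn.ReLU(inplace=True))
        self.fuse = nn.Conv2d(128, 64, 1)  # coarse-to-fine feature fusion
        self.deep_head = nn.Linear(64, num_classes)
        self.shallow_head = nn.Linear(64, num_classes)

    def forward(self, x):
        # The shallow branch sees a half-resolution copy of the input.
        x_low = F.interpolate(x, scale_factor=0.5, mode="bilinear", align_corners=False)
        feat_high = self.deep_blocks(self.stem(x))
        feat_low = self.shallow_block(self.stem(x_low))
        # Fuse coarse low-resolution features into the high-resolution path.
        feat_low_up = F.interpolate(feat_low, size=feat_high.shape[-2:],
                                    mode="bilinear", align_corners=False)
        fused = self.fuse(torch.cat([feat_high, feat_low_up], dim=1))
        logits_deep = self.deep_head(F.adaptive_avg_pool2d(fused, 1).flatten(1))
        logits_shallow = self.shallow_head(F.adaptive_avg_pool2d(feat_low, 1).flatten(1))
        return logits_deep, logits_shallow

def self_distillation_loss(logits_deep, logits_shallow, labels, T=4.0, alpha=0.5):
    # Hard-label loss on both branches, plus a soft KL term that compresses the
    # deeper branch's knowledge into the shallow sub-network.
    ce = F.cross_entropy(logits_deep, labels) + F.cross_entropy(logits_shallow, labels)
    kd = F.kl_div(F.log_softmax(logits_shallow / T, dim=1),
                  F.softmax(logits_deep.detach() / T, dim=1),
                  reduction="batchmean") * T * T
    return ce + alpha * kd

# Usage example: a batch of 84x84 images from a 5-way episode.
if __name__ == "__main__":
    model = MRSDNSketch(num_classes=5)
    images = torch.randn(8, 3, 84, 84)
    labels = torch.randint(0, 5, (8,))
    loss = self_distillation_loss(*model(images), labels)
    loss.backward()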

Key words: Self-Distillation (SD), Few-Shot Learning (FSL), multi-resolution network, spatial redundancy, global attention

CLC Number: