Few-shot Object Detection Method Based on Query Guidance and Semantic Enhancement

doi:10.19678/j.issn.1000-3428.0070093

Abstract

This study proposes a Few-Shot Object Detection (FSOD) method based on a query-guided strategy and a semantic enhancement mechanism. It addresses three problems: in the meta-learning paradigm, prototypes lack key information and adapt poorly to query images, and the detector's sensitivity to the variance of novel classes leads to misclassification. The Query Guidance Module (QGM) learns the correlation between query and support features and conditionally couples query-aware information into the support features, so that a specific and representative prototype is generated for each query image. The Visual Semantic Enhancement Module (VSEM) distills, from textual semantic information, knowledge that matches the visual features of novel classes and adaptively enhances those features, improving their discriminability and mitigating variance sensitivity for better classification. In addition, the classification and regression tasks are decoupled, and semantic enhancement is performed on the classification branch to help the model understand target semantics. Experimental results show that, compared with the recent state-of-the-art SMPCCNet method, the proposed approach improves novel Average Precision (nAP) by an average of 2.2 percentage points on the PASCAL VOC dataset and Average Precision (AP) by an average of 1.0 percentage points on the MS COCO dataset, validating its effectiveness.
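
The idea of a query-guided prototype can be illustrated with a minimal sketch. It assumes a scaled dot-product correlation followed by softmax weighting, which the abstract does not specify; the function name `query_guided_prototype` and the toy feature vectors are hypothetical, and this is not the authors' QGM implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def query_guided_prototype(query_feats, support_feats):
    """Build a prototype from support features, weighted by the query.

    Assumption (not from the paper): each support vector is scored by
    its mean scaled dot-product correlation with all query vectors,
    and the prototype is the softmax-weighted average of the support
    vectors.
    """
    dim = len(support_feats[0])
    scale = math.sqrt(dim)
    scores = [
        sum(dot(s, q) for q in query_feats) / (len(query_feats) * scale)
        for s in support_feats
    ]
    weights = softmax(scores)
    prototype = [
        sum(w * s[d] for w, s in zip(weights, support_feats))
        for d in range(dim)
    ]
    return prototype, weights


# Toy usage: the same support set yields a different prototype
# depending on which query it is paired with.
support = [[1.0, 0.0], [0.0, 1.0]]
proto_a, weights_a = query_guided_prototype([[2.0, 0.0]], support)
proto_b, weights_b = query_guided_prototype([[0.0, 2.0]], support)
print(proto_a, proto_b)
```

In this sketch the prototype is recomputed per query, which is the property the abstract attributes to the QGM; a fixed class prototype would instead average the support vectors with equal weights.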

Keywords: object detection, few-shot learning, meta-learning, query-guided prototype, semantic enhancement

XIE Binhong, SHI Yufei, ZHANG Rui, ZHANG Yingjun. Few-shot Object Detection Method Based on Query Guidance and Semantic Enhancement[J]. Computer Engineering, 2026, 52(3): 141-151.

URL: https://www.ecice06.com/EN/10.19678/j.issn.1000-3428.0070093

https://www.ecice06.com/EN/Y2026/V52/I3/141

Figures/Tables 15

Fig.1 The overall framework of the QGSE method

Fig.2 Background suppression module

Fig.3 Query guidance module

Fig.4 Visual semantic enhancement module

Fig.5 Visualization of the query guidance module

Fig.6 Visualization comparison of feature aggregation

Fig.7 t-SNE visualization

Fig.8 Impact of different values of the balance coefficient λ

Fig.9 Comparison of some test results
