| 1 |
GONG X, ZHANG Z Y, LIU L, et al. A survey of human-object interaction detection. Journal of Southwest Jiaotong University, 2022, 57(4): 693-704. (in Chinese)
|
| 2 |
REN S, HE K, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[C]//Proceedings of the Advances in Neural Information Processing Systems. Cambridge, USA: MIT Press, 2015: 28-36.
|
| 3 |
GAO C, ZOU Y, HUANG J B. iCAN: instance-centric attention network for human-object interaction detection[EB/OL]. [2024-03-01]. https://arxiv.org/abs/1808.10437.
|
| 4 |
ZHONG X B, DING C X, QU X, et al. Polysemy deciphering network for human-object interaction detection[C]//Proceedings of the European Conference on Computer Vision. Berlin, Germany: Springer, 2020.
|
| 5 |
ULUTAN O, IFTEKHAR A S M, MANJUNATH B S. VSGNet: spatial attention network for detecting human object interactions using graph convolutions[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE Press, 2020: 13617-13626.
|
| 6 |
LIAO Y, LIU S, WANG F, et al. PPDM: parallel point detection and matching for real-time human-object interaction detection[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE Press, 2020: 482-490.
|
| 7 |
KIM B, CHOI T, KANG J, et al. UnionDet: union-level detector towards real-time human-object interaction detection[C]//Proceedings of the European Conference on Computer Vision. Berlin, Germany: Springer, 2020.
|
| 8 |
CARION N, MASSA F, SYNNAEVE G, et al. End-to-end object detection with Transformers[C]//Proceedings of the European Conference on Computer Vision. Berlin, Germany: Springer, 2020: 213-229.
|
| 9 |
VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]//Proceedings of the Advances in Neural Information Processing Systems. Cambridge, USA: MIT Press, 2017: 5998-6008.
|
| 10 |
ZOU C, WANG B H, HU Y, et al. End-to-end human object interaction detection with HOI Transformer[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, USA: IEEE Press, 2021: 11825-11834.
|
| 11 |
TAMURA M, OHASHI H, YOSHINAGA T. QPIC: query-based pairwise human-object interaction detection with image-wide contextual information[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, USA: IEEE Press, 2021: 10410-10419.
|
| 12 |
KIM B, LEE J, KANG J, et al. HOTR: end-to-end human-object interaction detection with Transformers[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, USA: IEEE Press, 2021: 74-83.
|
| 13 |
CHEN M F, LIAO Y, LIU S, et al. Reformulating HOI detection as adaptive set prediction[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, USA: IEEE Press, 2021: 9004-9013.
|
| 14 |
DONG L Z, LI Z M, XU K L, et al. Category-aware Transformer network for better human-object interaction detection[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans, USA: IEEE Press, 2022: 19538-19547.
|
| 15 |
IFTEKHAR A S M, CHEN H, KUNDU K, et al. What to look at and where: semantic and spatial refined Transformer for detecting human-object interactions[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans, USA: IEEE Press, 2022: 5353-5363.
|
| 16 |
BAI X B, CHE J, WU J M, et al. Image captioning method based on Transformer visual features fusion. Computer Engineering, 2024, 50(8): 229-238. (in Chinese)
doi: 10.19678/j.issn.1000-3428.0068402
|
| 17 |
HENG H J, FAN Y C, WANG J L. Multifaceted feature coding image caption generation algorithm based on Transformer. Computer Engineering, 2023, 49(2): 199-205. (in Chinese)
doi: 10.19678/j.issn.1000-3428.0064450
|
| 18 |
RADFORD A, KIM J W, HALLACY C, et al. Learning transferable visual models from natural language supervision[C]//Proceedings of the International Conference on Machine Learning. [S.l.]: PMLR, 2021: 8748-8763.
|
| 19 |
HE K M, ZHANG X Y, REN S Q, et al. Deep residual learning for image recognition[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE Press, 2016: 770-778.
|
| 20 |
REZATOFIGHI H, TSOI N, GWAK J, et al. Generalized intersection over union: a metric and a loss for bounding box regression[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE Press, 2019: 658-666.
|
| 21 |
LIN T Y, GOYAL P, GIRSHICK R, et al. Focal loss for dense object detection[C]//Proceedings of the IEEE International Conference on Computer Vision. Washington D. C., USA: IEEE Press, 2017: 2980-2988.
|
| 22 |
CHAO Y W, LIU Y F, LIU X Y, et al. Learning to detect human-object interactions[C]//Proceedings of the IEEE Winter Conference on Applications of Computer Vision. Washington D. C., USA: IEEE Press, 2018: 381-389.
|
| 23 |
|
| 24 |
LIN T Y, MAIRE M, BELONGIE S, et al. Microsoft COCO: common objects in context[C]//Proceedings of the 13th European Conference on Computer Vision. Berlin, Germany: Springer, 2014: 740-755.
|
| 25 |
|
| 26 |
XU K L, LI Z M, ZHANG Z J, et al. Effective actor-centric human-object interaction detection. Image and Vision Computing, 2022, 121: 104422.
doi: 10.1016/j.imavis.2022.104422
|
| 27 |
LI Y L, ZHOU S Y, HUANG X J, et al. Transferable interactiveness knowledge for human-object interaction detection[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE Press, 2019: 3585-3594.
|
| 28 |
LI Y L, XU L, LIU X P, et al. PaStaNet: toward human activity knowledge engine[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE Press, 2020: 382-391.
|
| 29 |
GAO C, XU J R, ZOU Y L, et al. DRG: dual relation graph for human-object interaction detection[C]//Proceedings of the 16th European Conference on Computer Vision. Berlin, Germany: Springer, 2020: 696-712.
|
| 30 |
LI Y L, LIU X, WU X, et al. HOI analysis: integrating and decomposing human-object interaction[C]//Proceedings of the Advances in Neural Information Processing Systems. Cambridge, USA: MIT Press, 2020: 5011-5022.
|
| 31 |
HOU Z, PENG X J, QIAO Y, et al. Visual compositional learning for human-object interaction detection[C]//Proceedings of the 16th European Conference on Computer Vision. Berlin, Germany: Springer, 2020: 584-600.
|
| 32 |
ZHONG X B, QU X, DING C X, et al. Glance and gaze: inferring action-aware points for one-stage human-object interaction detection[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, USA: IEEE Press, 2021: 13234-13243.
|
| 33 |
DONG Q, TU Z W, LIAO H F, et al. Visual relationship detection using part-and-sum Transformers with composite queries[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Washington D. C., USA: IEEE Press, 2021: 3550-3559.
|
| 34 |
ZHANG F Z, CAMPBELL D, GOULD S. Spatially conditioned graphs for detecting human-object interactions[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Washington D. C., USA: IEEE Press, 2021: 13319-13327.
|
| 35 |
ZHANG F Z, CAMPBELL D, GOULD S. Efficient two-stage detection of human-object interactions with a novel unary-pairwise Transformer[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans, USA: IEEE Press, 2022: 20104-20112.
|
| 36 |
KIM B, MUN J, ON K W, et al. MSTR: multi-scale Transformer for end-to-end human-object interaction detection[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans, USA: IEEE Press, 2022: 19578-19587.
|