1 |
CHANG X J, REN P Z, XU P F, et al. A comprehensive survey of scene graphs: generation and application. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 45(1): 1- 26.
|
2 |
JI C Q, WANG B B, JING Q, et al. Survey of deep feature instance level image retrieval algorithms. Journal of Frontiers of Computer Science & Technology, 2023, 17(7): 1565.
|
3 |
SHARMA H, PADHA D. A comprehensive survey on image captioning: from handcrafted to deep learning-based techniques, a taxonomy and open research issues. Artificial Intelligence Review, 2023, 56(11): 13619- 13661.
doi: 10.1007/s10462-023-10488-2
|
4 |
WANG Y, SUN H C. Review of visual question answering technology. Journal of Frontiers of Computer Science & Technology, 2023, 17(7): 1487- 1495.
|
5 |
DUAN J W, MIN W D, LIN D Y, et al. Multimodal graph inference network for scene graph generation. Applied Intelligence, 2021, 51, 8768- 8783.
doi: 10.1007/s10489-021-02304-7
|
6 |
段静雯, 闵卫东, 杨子元, 等. 提取全局语义信息的场景图生成算法. 中国图象图形学报, 2022, 27(7): 2214- 2225.
|
|
DUAN J W, MIN W D, YANG Z Y, et al. Global semantic information extraction based scene graph generation algorithm. Journal of image and Graphics, 2022, 27(7): 2214- 2225.
|
7 |
CHEN C, ZHAN Y B, YU B S, et al. Resistance training using prior bias: toward unbiased scene graph generation[EB/OL]. [2023-11-28]. https://arxiv.org/pdf/2201.06794.
|
8 |
ZHANG J, ELHOSEINY M, COHEN S, et al. Relationship proposal networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Washington D.C., USA: IEEE Press, 2017: 5678-5686.
|
9 |
XU D F, ZHU Y K, CHOY C B, et al. Scene graph generation by iterative message passing[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Washington D.C., USA: IEEE Press, 2017: 5410-5419.
|
10 |
CHEN Y N, WANG Y J, ZHANG Y, et al. PANet: a context based predicate association network for scene graph generation[C]//Proceedings of IEEE International Conference on Multimedia and Expo (ICME). Washington D.C., USA: IEEE Press, 2019: 508-513.
|
11 |
ZELLERS R, YATSKAR M, THOMSON S, et al. Neural Motifs: scene graph parsing with global context[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Washington D.C., USA: IEEE Press, 2018: 5831-5840.
|
12 |
YIN G J, SHENG L, LIU B, et al. Zoom-Net: mining deep feature interactions for visual relationship recognition[EB/OL]. [2023-11-28]. https://arxiv.org/pdf/1807.04979.
|
13 |
LI Y K, OUYANG W L, ZHOU B L, et al. Factorizable Net: an efficient subgraph-based framework for scene graph generation[EB/OL]. [2023-11-28]. https://arxiv.org/pdf/1806.11538.
|
14 |
TANG K H, ZHANG H W, WU B Y, et al. Learning to compose dynamic tree structures for visual contexts[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington D.C., USA: IEEE Press, 2019: 6619-6628.
|
15 |
HE H B, GARCIA E A. Learning from imbalanced data. IEEE Transactions on Knowledge and Data Engineering, 2009, 21(9): 1263- 1284.
doi: 10.1109/TKDE.2008.239
|
16 |
BYRD J, LIPTON Z. What is the effect of importance weighting in deep learning?[C]//Proceedings of International Conference on Machine Learning. [S.l.]: AAAI Press. 2019: 872-881.
|
17 |
LI R J, ZHANG S Y, WAN B, et al. Bipartite graph network with adaptive message passing for unbiased scene graph generation[EB/OL]. [2023-11-28]. https://arxiv.org/pdf/2104.00308.
|
18 |
CUI Y, JIA M L, LIN T Y, et al. Class-balanced loss based on effective number of samples[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington D.C., USA: IEEE Press, 2019: 9268-9277.
|
19 |
LIN T Y, GOYAL P, GIRSHICK R, et al. Focal loss for dense object detection[C]//Proceedings of the IEEE International Conference on Computer Vision. Washington D.C., USA: IEEE Press, 2017: 2980-2988.
|
20 |
|
21 |
|
22 |
REN S Q, HE K M, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 39(6): 1137- 1149.
|
23 |
VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]//Proceedings of the 31st International Conference on Neural Information Processing Systems. New York, USA: ACM Press, 2017: 6000-6010.
|
24 |
PENNINGTON J, SOCHER R, MANNING C D. GloVe: global vectors for word representation[C]//Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). San Diego, USA: Association for Computational linguistics, 2014: 1532-1543.
|
25 |
KRISHNA R, ZHU Y K, GROTH O, et al. Visual Genome: connecting language and vision using crowdsourced dense image annotations. International Journal of Computer Vision, 2017, 123, 32- 73.
|
26 |
|
27 |
|
28 |
XIE S, GIRSHICK R, DOLLÁR P, et al. Aggregated residual transformations for deep neural networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Washington D.C., USA: IEEE Press, 2017: 1492-1500.
|
29 |
DONG X N, GAN T, SONG X M, et al. Stacked hybrid-attention and group collaborative learning for unbiased scene graph generation[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington D.C., USA: IEEE Press, 2022: 19427-19436.
|
30 |
LI W, ZHANG H, BAI Q, et al. PPDL: predicate probability distribution based loss for unbiased scene graph generation[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington D.C., USA: IEEE Press, 2022: 19447-19456.
|
31 |
LIN X, DING C, ZHANG J, et al. RU-Net: regularized unrolling network for scene graph generation[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington D.C., USA: IEEE Press, 2022: 19457-19466.
|