1 |
PENG Y L, LIU F. UMass at ImageCLEF Medical Visual Question Answering (Med-VQA) 2018 task[C]//Proceedings of ImageCLEF 2018. New York, USA: ACM Press, 2018: 1-9.
|
2 |
ABHISHEK T, KRISHNAMOORTHI M. MIT Manipal at ImageCLEF 2019 visual question answering in medical domain[C]//Proceedings of ImageCLEF 2019. New York, USA: ACM Press, 2019: 1-6.
|
3 |
HUANG G, LIU Z, VAN DER MAATEN L, et al. Densely connected convolutional networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Washington D. C., USA: IEEE Press, 2017: 4700-4708.
|
4 |
REN F J, ZHOU Y Y. CGMVQA: a new classification and generative model for medical visual question answering. IEEE Access, 2020, 8: 50626-50636.
doi: 10.1109/ACCESS.2020.2980024
|
5 |
DEVLIN J, CHANG M W, LEE K, et al. BERT: pre-training of deep bidirectional transformers for language understanding[EB/OL]. [2023-10-11]. https://arxiv.org/abs/1810.04805v2.
|
6 |
GONG H F, CHEN G Q, LIU S S, et al. Cross-modal self-attention with multi-task pre-training for medical visual question answering[C]//Proceedings of the 2021 International Conference on Multimedia Retrieval. New York, USA: ACM Press, 2021: 456-460.
|
7 |
KIM W, SON B, KIM I. ViLT: vision-and-language transformer without convolution or region supervision[C]//Proceedings of International Conference on Machine Learning. [S. l. ]: PMLR, 2021: 5583-5594.
|
8 |
LIU B, ZHAN L M, WU X M. Contrastive pre-training and representation distillation for medical visual question answering based on radiology images[C]//Proceedings of International Conference on Medical Image Computing and Computer Assisted Intervention. Berlin, Germany: Springer, 2021: 210-220.
|
9 |
ANDERSON P, HE X D, BUEHLER C, et al. Bottom-up and top-down attention for image captioning and visual question answering[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington D. C., USA: IEEE Press, 2018: 6077-6086.
|
10 |
KIM J H, JUN J, ZHANG B T. Bilinear attention networks[C]//Proceedings of the 32nd International Conference on Neural Information Processing Systems. Berlin, Germany: Springer, 2018: 1571-1581.
|
11 |
YU Z, YU J, CUI Y H, et al. Deep modular co-attention networks for visual question answering[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington D. C., USA: IEEE Press, 2019: 6281-6290.
|
12 |
ZOU P R, XIAO F, ZHANG W J, et al. Multi-module co-attention model for visual question answering. Computer Engineering, 2022, 48(2): 250-260. (in Chinese)
doi: 10.19678/j.issn.1000-3428.0061159
|
13 |
ZHENG W B, YAN L, WANG F Y, et al. Learning from the guidance: knowledge embedded meta-learning for medical visual question answering[C]//Proceedings of the 27th International Conference on Neural Information Processing. Berlin, Germany: Springer, 2020: 194-202.
|
14 |
CHEN Z H, DU Y H, HU J P, et al. Multi-modal masked autoencoders for medical vision-and-language pre-training[C]//Proceedings of International Conference on Medical Image Computing and Computer Assisted Intervention. Berlin, Germany: Springer, 2022: 679-689.
|
15 |
BAI Y L. Research and application of multimodal relevance learning for image and text[D]. Harbin: Harbin Institute of Technology, 2018. (in Chinese)
|
16 |
HE J, ZHANG C Q, LI X Z, et al. Survey of research on multimodal fusion technology for deep learning. Computer Engineering, 2020, 46(5): 1-11. (in Chinese)
doi: 10.19678/j.issn.1000-3428.0057370
|
17 |
FUKUI A, PARK D H, YANG D, et al. Multimodal compact bilinear pooling for visual question answering and visual grounding[EB/OL]. [2023-10-11]. https://arxiv.org/abs/1606.01847v3.
|
18 |
YU Z, YU J, FAN J P, et al. Multi-modal factorized bilinear pooling with co-attention learning for visual question answering[C]//Proceedings of the IEEE International Conference on Computer Vision. Washington D. C., USA: IEEE Press, 2017: 1821-1830.
|
19 |
CADENE R, BEN-YOUNES H, CORD M, et al. MUREL: multimodal relational reasoning for visual question answering[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington D. C., USA: IEEE Press, 2019: 1989-1998.
|
20 |
YU Y L, LI H F, SHI H R, et al. Question-guided feature pyramid network for medical visual question answering. Expert Systems with Applications, 2023, 214: 119148.
doi: 10.1016/j.eswa.2022.119148
|
21 |
NGUYEN B D, DO T T, NGUYEN B X, et al. Overcoming data limitation in medical visual question answering[C]//Proceedings of the 22nd International Conference on Medical Image Computing and Computer Assisted Intervention. Berlin, Germany: Springer, 2019: 522-530.
|
22 |
ZHANG Y J, CHEN Q Y, YANG Z H, et al. BioWordVec, improving biomedical word embeddings with subword information and MeSH. Scientific Data, 2019, 6: 52.
doi: 10.1038/s41597-019-0055-0
|
23 |
LAU J J, GAYEN S, BEN ABACHA A, et al. A dataset of clinically generated visual questions and answers about radiology images. Scientific Data, 2018, 5: 180251.
doi: 10.1038/sdata.2018.251
|
24 |
HU J, SHEN L, SUN G. Squeeze-and-excitation networks[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington D. C., USA: IEEE Press, 2018: 7132-7141.
|
25 |
VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]//Proceedings of the 31st International Conference on Neural Information Processing Systems. New York, USA: ACM Press, 2017: 6000-6010.
|
26 |
DO T, NGUYEN B X, TJIPUTRA E, et al. Multiple meta-model quantifying for medical visual question answering[C]//Proceedings of the 24th International Conference on Medical Image Computing and Computer Assisted Intervention. Berlin, Germany: Springer, 2021: 64-74.
|
27 |
LIU B, ZHAN L M, XU L, et al. Medical visual question answering via conditional reasoning and contrastive learning. IEEE Transactions on Medical Imaging, 2022, 42(5): 1532-1545.
|