Research on Text Representation of Video Content Based on Multi-Modal Fusion and Multi-Layer Attention
ZHAO Hong, GUO Lan, CHEN Zhiwen, ZHENG Houze
Computer Engineering (计算机工程). 2022, (10): 45-54.
DOI: 10.19678/j.issn.1000-3428.0063294