Multi-modal Image Retrieval Based on Semantic Learning

doi:10.3969/j.issn.1000-3428.2013.03.051

Computer Engineering, 2013, Vol. 39, Issue (3): 258-263. doi: 10.3969/j.issn.1000-3428.2013.03.051

LI Zhi-xin¹, SHI Zhi-ping², CHEN Hong-chao¹, WU Jing-li¹

(1. College of Computer Science and Information Technology, Guangxi Normal University, Guilin 541004, China; 2. College of Information Engineering, Capital Normal University, Beijing 100048, China)

Received: 2012-03-08 Online: 2013-03-15 Published: 2013-03-13

About the authors: LI Zhi-xin (born 1971), male, associate professor, Ph.D.; main research interests: image understanding, pattern recognition, and machine learning. SHI Zhi-ping, associate research fellow, Ph.D. CHEN Hong-chao, associate professor, M.S. WU Jing-li, associate professor, Ph.D.

Supported by: National Natural Science Foundation of China (61165009, 60903141); Guangxi Natural Science Foundation (2012GXNSFAA053219, 2011GXNSFB018068); "Bagui Scholar" Program special fund

Abstract: To bridge the semantic gap, a multi-modal image retrieval system is designed on the basis of semantic learning. The system combines three query modes to retrieve images. Query by visual feature ranks images through feature extraction and similarity matching. Query by label builds on automatic image annotation, but it generalizes poorly outside the semantic space. Query by Semantic Example (QBSE) largely overcomes this limitation: by executing the query in either an explicit or an implicit semantic space, it makes the retrieval results agree better with human perception. Experimental results show that QBSE achieves higher precision and recall than image retrieval based on texture features.

Key words: multi-modal image retrieval, automatic image annotation, probabilistic topic modeling, probabilistic latent semantic analysis, semantic gap, semantic learning, semantic multinomial
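The query-by-semantic-example idea in the abstract can be illustrated with a minimal sketch. It assumes each image has already been mapped to a semantic multinomial, that is, a probability vector over a fixed concept vocabulary, and it ranks database images by Kullback-Leibler divergence from the query's vector. KL divergence is a common choice in the QBSE literature, but the abstract does not say which similarity measure this paper uses, so treat it as an assumption. The function names, file names, concept vocabulary, and probabilities below are all hypothetical.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length.

    eps guards against log(0) when q has zero-probability entries.
    """
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def rank_by_semantic_example(query_smn, database):
    """Return image ids ordered from smallest to largest divergence."""
    scored = [
        (kl_divergence(query_smn, smn), image_id)
        for image_id, smn in database.items()
    ]
    return [image_id for _, image_id in sorted(scored)]


# Hypothetical semantic multinomials over the concepts (sky, water, building).
database = {
    "beach.jpg": [0.45, 0.50, 0.05],
    "city.jpg": [0.20, 0.05, 0.75],
    "lake.jpg": [0.30, 0.65, 0.05],
}

# Semantic multinomial of the (hypothetical) query image.
query = [0.40, 0.55, 0.05]

ranking = rank_by_semantic_example(query, database)
print(ranking)
```

Because the comparison happens between whole concept distributions rather than single labels, a query image can still be matched sensibly when no one label in the vocabulary describes it well, which is the generalization advantage the abstract attributes to QBSE over query by label.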

CLC number: TN911.73

LI Zhi-xin, SHI Zhi-ping, CHEN Hong-chao, WU Jing-li. Multi-modal Image Retrieval Based on Semantic Learning[J]. Computer Engineering, 2013, 39(3): 258-263.

http://www.ecice06.com/CN/Y2013/V39/I3/258
