
Computer Engineering ›› 2007, Vol. 33 ›› Issue (11): 166-167. doi: 10.3969/j.issn.1000-3428.2007.11.060

• Artificial Intelligence and Recognition Technology •

Translation Transforming Algorithm of Query Sentence in Cross Language Information Retrieval

ZHANG Xiaofei, HUANG Heyan, CHEN Zhaoxiong, DAI Liuling

  1. (Research Center of Computer & Language Information Engineering, Chinese Academy of Sciences, Beijing 100083)
  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2007-06-05 Published: 2007-06-05

Abstract: In cross-language information retrieval, the input query is usually a series of keywords rather than a complete sentence. The keyword sequence therefore lacks the syntactic and contextual information needed to translate the query accurately. Based on a large-scale bilingual corpus, and taking the vector space model and lexical co-occurrence mutual information as its theoretical basis, this paper applies traditional monolingual information retrieval techniques to convert the query translation problem into computing a "boost" value for each dictionary sense of the query keywords, and then reconstructs the target-language query.

Key words: Cross language information retrieval, Query sentence, Translation transforming, Bilingual corpora
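
The abstract does not give the formulas, so the following is only a minimal sketch of one way the idea could look, not the authors' algorithm. It assumes document-level pointwise mutual information as the co-occurrence measure, a baseline boost of 1.0, and a Lucene-style `term^boost` output format. The toy corpus, the toy dictionary, and the function names `pmi`, `boost_values`, and `build_query` are all hypothetical.

```python
import math

def pmi(term_a, term_b, docs):
    """Document-level pointwise mutual information (assumed measure).

    Returns 0.0 when the two terms never appear in the same document.
    """
    n = len(docs)
    count_a = sum(1 for d in docs if term_a in d)
    count_b = sum(1 for d in docs if term_b in d)
    count_ab = sum(1 for d in docs if term_a in d and term_b in d)
    if count_ab == 0:
        return 0.0
    return math.log((count_ab * n) / (count_a * count_b))

def boost_values(keywords, dictionary, docs):
    """Score each dictionary sense by how well it co-occurs with the
    senses of the other query keywords in the target-language corpus."""
    boosts = {}
    for kw in keywords:
        for cand in dictionary[kw]:
            support = 0.0
            for other in keywords:
                if other == kw:
                    continue
                # Strongest association with any sense of the other keyword;
                # negative associations are clipped to zero.
                best = max((pmi(cand, o, docs) for o in dictionary[other]),
                           default=0.0)
                support += max(0.0, best)
            boosts[(kw, cand)] = 1.0 + support  # assumed baseline of 1.0
    return boosts

def build_query(keywords, dictionary, boosts):
    """Rebuild the target-language query as 'term^boost' tokens."""
    return " ".join(
        f"{cand}^{boosts[(kw, cand)]:.2f}"
        for kw in keywords
        for cand in dictionary[kw]
    )

# Hypothetical toy data: each document is a set of target-language tokens.
docs = [
    {"computer", "program", "software"},
    {"computer", "program", "virus"},
    {"legal", "procedure", "court"},
    {"computer", "hardware"},
]
dictionary = {"计算机": ["computer"], "程序": ["program", "procedure"]}
keywords = ["计算机", "程序"]

boosts = boost_values(keywords, dictionary, docs)
query = build_query(keywords, dictionary, boosts)
print(query)
```

In this toy run, "program" receives a higher boost than "procedure" because only "program" co-occurs with "computer" in the corpus. Every sense stays in the query, so a monolingual retrieval engine can weight the senses rather than commit to a single translation.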
