Computer Engineering

• Artificial Intelligence and Recognition Technology •

Research on Microblog Diversification Retrieval Problem Based on Rank Learning Model

WANG Ying, LUO Zhunchen, YU Yang

  1. (China Defense Science and Technology Information Center, Beijing 100142, China)
  • Received: 2016-09-07 Online: 2017-11-15 Published: 2017-11-15
  • About the authors: WANG Ying (born 1992), male, master's student, whose main research interests are natural language processing and information retrieval; LUO Zhunchen, engineer, Ph.D.; YU Yang, senior engineer, Ph.D.
  • Funding:
    National Natural Science Foundation of China, Young Scientists Fund (61602490).

Abstract: Diversification retrieval addresses the query ambiguity problem in traditional information retrieval: a user's information need, usually expressed as a short query phrase, is often ambiguous and admits more than one interpretation. This paper studies diversification retrieval for microblogs and proposes a new method that applies a diversification learning to rank approach to the task. A series of social media features and subtopic distribution features is developed. A model that uses only the relevance features between the query phrase and a post, together with the text diversity feature between posts, serves as the baseline, and the new features are added to it one at a time to test their effect on microblog diversification. Experimental results show that the diversification learning to rank approach effectively solves the microblog diversification retrieval problem and markedly improves the effectiveness of microblog retrieval.

Key words: machine learning, information retrieval, diversification retrieval, rank learning, social media

CLC Number:
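
The idea summarized in the abstract, scoring each post by its relevance to the query plus its difference from posts already ranked, can be illustrated with a minimal sketch. This is not the authors' model: the function names `diversified_rank` and `jaccard_distance`, the Jaccard-based novelty measure, and the hand-set weights `w_rel` and `w_div` are all assumptions made for illustration. In a learning to rank setting the weights would be fitted from labeled data, and the social media and subtopic distribution features would enter as additional terms.

```python
from typing import List, Optional, Sequence, Set


def jaccard_distance(a: Set[str], b: Set[str]) -> float:
    """Return 1 - |a & b| / |a | b|; two empty sets count as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def diversified_rank(
    relevance: Sequence[float],
    token_sets: Sequence[Set[str]],
    w_rel: float = 1.0,   # hypothetical weight; would be learned in practice
    w_div: float = 1.0,   # hypothetical weight; would be learned in practice
    k: Optional[int] = None,
) -> List[int]:
    """Greedily order posts by relevance plus novelty.

    At each step the next post is the one maximizing
    w_rel * relevance + w_div * (minimum distance to posts already chosen).
    The first pick has no predecessors, so its novelty term is 0.
    """
    n = len(relevance)
    k = n if k is None else min(k, n)
    selected: List[int] = []
    remaining = list(range(n))

    while remaining and len(selected) < k:
        def score(i: int) -> float:
            if selected:
                novelty = min(
                    jaccard_distance(token_sets[i], token_sets[j])
                    for j in selected
                )
            else:
                novelty = 0.0
            return w_rel * relevance[i] + w_div * novelty

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)

    return selected


# Toy example for the ambiguous query "apple": two near-duplicate posts
# about a phone launch and one post about fruit.
relevance = [0.9, 0.85, 0.6]
token_sets = [
    {"apple", "iphone", "launch"},
    {"apple", "iphone", "launch", "event"},
    {"apple", "fruit", "harvest"},
]
print(diversified_rank(relevance, token_sets))             # [0, 2, 1]
print(diversified_rank(relevance, token_sets, w_div=0.0))  # [0, 1, 2]
```

With the diversity term switched off (`w_div=0.0`) the ranking falls back to pure relevance order. With it on, the post covering the second interpretation of the query moves ahead of the near-duplicate, which is the behavior that diversification retrieval aims for.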