Computer Engineering

• Artificial Intelligence and Recognition Technology •

Expert Recommendation Method for Project Evaluation Based on Topic Information

YU Feng 1, YU Zheng-tao 1, YANG Jian-feng 2, GUO Jian-yi 1, YAN Xin 1

  1. School of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China
  2. Qujing Cigarette Factory, Hongyunhonghe Tobacco Group Co. Ltd., Qujing 655001, China
  • Received: 2013-04-15 Online: 2014-06-15 Published: 2014-06-13
  • About the authors: YU Feng (born 1986), male, master's student, research interest: information retrieval; YU Zheng-tao (corresponding author), professor, Ph.D., doctoral supervisor; YANG Jian-feng, senior engineer; GUO Jian-yi, professor; YAN Xin, associate professor.
  • Supported by:
    National Natural Science Foundation of China (61175068); Major Special Fund Project of the Yunnan Provincial Department of Education.

Abstract: To address the task of automatically recommending review experts for projects, this paper proposes an expert recommendation method based on topic information. After analyzing the attribute characteristics of the project and expert description documents, it uses the Latent Dirichlet Allocation (LDA) topic model to obtain the topic words of each document's content. It then constructs the topic feature space from topic-word frequency statistics and, taking into account the importance of each document attribute field (column), applies the TF-IDF feature extraction algorithm to obtain topic feature vectors for the project documents and the expert documents respectively. Finally, an improved similarity algorithm computes the relevance between the project's topic feature vector and that of each expert, and the experts most relevant to the project are selected as the recommendation result. Experimental results show that the proposed method outperforms the recommendation method based on TF-IDF with cosine similarity: precision, recall and F-score increase on average by 4.87%, 5.04% and 4.97%, respectively.

Key words: expert recommendation, Latent Dirichlet Allocation (LDA) model, topic word, Vector Space Model (VSM), TF-IDF feature, similarity calculation
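
The pipeline summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several parts are assumptions made only for illustration: the LDA step is not shown (the set `TOPIC_WORDS` stands in for its output), the `FIELD_WEIGHTS` values are hypothetical because the abstract does not give the field importance weights, the smoothed IDF formula is one common variant, and plain cosine similarity (the baseline named in the abstract) is used in place of the improved similarity algorithm, which the abstract does not specify.

```python
import math
from collections import Counter

# Hypothetical field weights: the abstract says document fields differ in
# importance but does not state the values.
FIELD_WEIGHTS = {"title": 3.0, "keywords": 2.0, "abstract": 1.0}


def weighted_tf(doc, vocabulary):
    """Field-weighted frequency of each topic word in one document.

    `doc` maps a field name to its list of tokens; words outside the
    topic-word vocabulary are ignored.
    """
    tf = Counter()
    for field, tokens in doc.items():
        weight = FIELD_WEIGHTS.get(field, 1.0)
        for token in tokens:
            if token in vocabulary:
                tf[token] += weight
    return tf


def tfidf_vectors(docs, vocabulary):
    """Sparse TF-IDF vectors (dicts) over the topic-word vocabulary."""
    tfs = [weighted_tf(doc, vocabulary) for doc in docs]
    n_docs = len(docs)
    df = Counter(word for tf in tfs for word in tf)
    # Smoothed IDF (an assumed variant): stays positive for every word.
    idf = {w: math.log((1 + n_docs) / (1 + df[w])) + 1.0 for w in vocabulary}
    return [{w: f * idf[w] for w, f in tf.items()} for tf in tfs]


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(x * v.get(w, 0.0) for w, x in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend(project, experts, vocabulary, top_k=2):
    """Rank experts by similarity to the project and keep the top_k."""
    names = list(experts)
    vectors = tfidf_vectors([project] + [experts[n] for n in names], vocabulary)
    project_vec, expert_vecs = vectors[0], vectors[1:]
    scored = [(n, cosine(project_vec, vec)) for n, vec in zip(names, expert_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy data. TOPIC_WORDS stands in for the topic words an LDA model would yield.
TOPIC_WORDS = {"retrieval", "topic", "recommendation", "tobacco", "chemistry"}

project = {
    "title": ["expert", "recommendation"],
    "keywords": ["topic", "retrieval"],
    "abstract": ["topic", "model", "recommendation"],
}
experts = {
    "expert_a": {
        "title": ["information", "retrieval"],
        "keywords": ["topic", "recommendation"],
        "abstract": ["retrieval", "topic"],
    },
    "expert_b": {
        "title": ["tobacco", "chemistry"],
        "keywords": ["chemistry"],
        "abstract": ["tobacco", "processing"],
    },
    "expert_c": {
        "title": ["topic", "model"],
        "keywords": ["chemistry"],
        "abstract": ["tobacco"],
    },
}

ranking = recommend(project, experts, TOPIC_WORDS, top_k=2)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

On this toy data `expert_a` ranks first because it shares all three of the project's topic words, while `expert_b` shares none and scores zero. Restricting the vocabulary to topic words keeps the vectors short, and the field weights let a match in a title count for more than a match in body text.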

CLC Number: