Abstract:
To address the task of automatically recommending experts for project evaluation, this paper proposes an expert recommendation method based on topic information. First, the attribute characteristics of the project documents and the expert documents are analyzed, and a Latent Dirichlet Allocation (LDA) topic model is used to obtain the topic words of each document. Second, the topic feature space of the documents is constructed from topic-word frequency statistics, and a TF-IDF feature extraction algorithm, combined with the importance of each document column, is used to obtain the topic feature vectors of the project documents and the expert documents respectively. Finally, an improved similarity algorithm calculates the correlation between the topic feature vector of the project and the vector of each expert, and the experts most highly correlated with the project are chosen as the recommendation result. Experimental results show that the proposed method recommends better than a method that uses TF-IDF with cosine similarity: precision, recall and F-score increase by 4.87%, 5.04% and 4.97% on average, respectively.
Keywords:
expert recommendation,
Latent Dirichlet Allocation (LDA) model,
topic word,
Vector Space Model (VSM),
TF-IDF feature,
similarity calculation
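
The pipeline summarized in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not from the paper: the topic words are taken as given (standing in for the LDA output), the column names and weights in `COLUMN_WEIGHTS` are hypothetical, the IDF is a common smoothed variant, and plain cosine similarity stands in for the paper's improved similarity algorithm, whose form the abstract does not specify.

```python
import math
from collections import Counter

# Hypothetical column weights; the abstract says column importance is used
# but gives no concrete values.
COLUMN_WEIGHTS = {"title": 2.0, "keywords": 1.5, "abstract": 1.0}


def weighted_counts(doc, topic_words):
    """Column-weighted frequency of each topic word in one document."""
    counts = Counter()
    for column, text in doc.items():
        weight = COLUMN_WEIGHTS.get(column, 1.0)
        for token in text.lower().split():
            if token in topic_words:
                counts[token] += weight
    return counts


def tfidf_vectors(docs, topic_words):
    """Return one sparse TF-IDF vector (dict: word -> weight) per document."""
    counts = [weighted_counts(doc, topic_words) for doc in docs]
    n_docs = len(docs)
    # Document frequency: number of documents containing each topic word.
    doc_freq = Counter(word for c in counts for word in c)
    vectors = []
    for c in counts:
        total = sum(c.values()) or 1.0
        vectors.append({
            # Smoothed IDF (an assumption) so shared words keep a nonzero weight.
            word: (freq / total) * (math.log((1 + n_docs) / (1 + doc_freq[word])) + 1.0)
            for word, freq in c.items()
        })
    return vectors


def cosine(u, v):
    """Plain cosine similarity between two sparse vectors."""
    dot = sum(weight * v.get(word, 0.0) for word, weight in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend(project, experts, topic_words, top_k=1):
    """Rank experts by similarity to the project and return the top_k names."""
    names = list(experts)
    vectors = tfidf_vectors([project] + [experts[n] for n in names], topic_words)
    scores = {name: cosine(vectors[0], vec) for name, vec in zip(names, vectors[1:])}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Toy data, invented for illustration only.
topic_words = {"topic", "model", "recommendation", "protein", "folding"}
project = {"title": "expert recommendation with topic model",
           "abstract": "topic model for recommendation"}
experts = {
    "expert_a": {"title": "topic model research",
                 "abstract": "recommendation and topic analysis"},
    "expert_b": {"title": "protein folding",
                 "abstract": "protein structure folding simulation"},
}
print(recommend(project, experts, topic_words))
```

On this toy data the project shares topic words only with `expert_a`, so that expert ranks first. Replacing `cosine` with another similarity function is the only change needed to try a different correlation measure.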
YU Feng, YU Zheng-tao, YANG Jian-feng, GUO Jian-yi, YAN Xin. Expert Recommendation Method for Project Evaluation Based on Topic Information[J]. Computer Engineering.