
Computer Engineering


Application of LDA Model in Microblog User Recommendation

DI Liang, DU Yong-ping   

  1. (Institute of Computer Science and Technology, Beijing University of Technology, Beijing 100124, China)
  • Received: 2013-09-22 Online: 2014-05-15 Published: 2014-05-14

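  • About the authors: DI Liang (邸亮, born 1988), male, master's student, main research interest: natural language processing; DU Yong-ping (杜永萍), associate professor.
  • Supported by:
    National Key Technology R&D Program of China (2013BAH21B00); Beijing Natural Science Foundation (4123091); Funding Project for Academic Human Resources Development in Institutions of Higher Learning under the Jurisdiction of Beijing Municipality, "Young and Middle-aged Backbone Talent Training Program" (PHR20110815).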

Abstract: The Latent Dirichlet Allocation (LDA) topic model can identify latent topic information in large-scale document sets, but it performs poorly on short texts such as microblogs. This paper proposes an LDA-based microblog user model that partitions microblogs by user and merges the microblogs each user has posted into a single representation of that user. The standard three-layer document-topic-word LDA model thus becomes a user-topic-word user model, which is then applied to user recommendation. Experiments on a real microblog data set show that the proposed method recommends users more effectively than the traditional vector space model approach; with a suitable number of topics, accuracy improves by nearly 10%.

Key words: topic model, Latent Dirichlet Allocation (LDA), microblog, user model, interest analysis, user recommendation
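
The user-topic-word idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not say how the model is estimated or how users are ranked, so the collapsed Gibbs sampler, the hyperparameters `alpha` and `beta`, the whitespace tokenizer, the cosine-similarity ranking, and the toy posts below are all assumptions chosen for illustration. All function names are hypothetical.

```python
import math
import random
from collections import defaultdict

def build_user_documents(posts):
    """Merge every post of a user into one token list (one 'document' per user)."""
    docs = defaultdict(list)
    for user, text in posts:
        # Assumption: whitespace tokenization; real microblog text needs a proper tokenizer.
        docs[user].extend(text.lower().split())
    return dict(docs)

def fit_user_lda(user_docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA in which each document is one user.

    Returns {user: [P(topic | user) for each topic]}.
    """
    rng = random.Random(seed)
    users = sorted(user_docs)
    vocab = sorted({w for u in users for w in user_docs[u]})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    user_topic = [[0] * num_topics for _ in users]            # count(user, topic)
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # count(topic, word)
    topic_total = [0] * num_topics                            # count(topic)
    assignments = []

    # Give every token a random initial topic.
    for d, u in enumerate(users):
        z_d = []
        for w in user_docs[u]:
            z = rng.randrange(num_topics)
            z_d.append(z)
            user_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assignments.append(z_d)

    for _ in range(iterations):
        for d, u in enumerate(users):
            for i, w in enumerate(user_docs[u]):
                wid = word_id[w]
                z = assignments[d][i]
                # Remove the token's current assignment from the counts.
                user_topic[d][z] -= 1
                topic_word[z][wid] -= 1
                topic_total[z] -= 1
                # Resample the topic from its conditional distribution.
                weights = [
                    (user_topic[d][k] + alpha)
                    * (topic_word[k][wid] + beta) / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                user_topic[d][z] += 1
                topic_word[z][wid] += 1
                topic_total[z] += 1

    theta = {}
    for d, u in enumerate(users):
        total = len(user_docs[u]) + num_topics * alpha
        theta[u] = [(user_topic[d][k] + alpha) / total for k in range(num_topics)]
    return theta

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def recommend(theta, target, top_n=2):
    """Rank the other users by cosine similarity of their topic distributions."""
    scores = [(cosine(theta[target], theta[u]), u) for u in theta if u != target]
    return [u for _, u in sorted(scores, reverse=True)[:top_n]]

# Toy data (invented for illustration): (user, post) pairs.
posts = [
    ("alice", "football match goal"), ("alice", "goal striker football"),
    ("bob", "striker match football"), ("bob", "goal match league"),
    ("carol", "python code compiler"), ("carol", "compiler bug code"),
    ("dave", "code python bug"), ("dave", "python library code"),
]
theta = fit_user_lda(build_user_documents(posts), num_topics=2)
print(recommend(theta, "alice", top_n=1))
```

Merging posts before fitting gives each "document" enough tokens for topic co-occurrence to be observable, which is the motivation the abstract gives for moving from individual short posts to users. The number of topics (`num_topics`) is the parameter the abstract reports as affecting accuracy, so in practice it would be tuned on held-out data.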
