Computer Engineering

• Development Research and Engineering Application

Adaptive On-line Topic Evolution Model of Internet Public Opinion for BBS

YANG Chun-ming, ZHANG Hui, SHI Da-wen

  1. (School of Computer Science and Technology, Southwest University of Science and Technology, Mianyang 621010, Sichuan, China)
  • Received: 2013-05-07 Online: 2014-07-15 Published: 2014-07-14
  • About the authors: YANG Chun-ming (born 1980), male, lecturer, M.S.; main research interests: text mining and knowledge engineering. ZHANG Hui, professor, Ph.D. SHI Da-wen, master's student.
  • Funding:
    Sichuan Provincial Department of Education Fund (12ZB326); Mianyang Network Convergence Laboratory Fund (12ZXWK04); Southwest University of Science and Technology Doctoral Fund (12zx7116).

Abstract: Because the number of topics changes dynamically as Bulletin Board System (BBS) content evolves, an adaptive on-line topic evolution model based on Latent Dirichlet Allocation (LDA) is proposed. The model uses a linear weighting of the posterior topic and word distributions from historical time windows to adjust the prior of the topic and word distributions in the current time window. It also provides on-line methods for detecting new topics and vanished topics, so that the number of topics adapts automatically to the data stream. Experimental results show that the model effectively identifies the emergence and disappearance of topics as BBS content evolves, analyzes their evolution in time and content, and discovers hot events promptly.

Key words: Internet public opinion, topic model, topic evolution, unsupervised learning, multinomial distribution, time window

CLC Number:
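The abstract describes adjusting the current window's prior by a linear weighting of posteriors from historical time windows, but it does not give the formula. The following is only a minimal sketch of that general idea, not the authors' implementation. The function name `weighted_prior`, the weight values, and the smoothing constant `base_prior` are all assumptions introduced for illustration.

```python
import numpy as np

def weighted_prior(history, weights, base_prior=0.01):
    """Blend past posteriors into a prior for the current time window.

    history:    sequence of arrays, each (num_topics, vocab_size), holding the
                posterior topic-word distributions of earlier windows.
    weights:    one weight per historical window (hypothetical values; the
                abstract does not state how the weights are chosen).
    base_prior: small symmetric constant so no word gets zero prior mass
                (an assumption, not taken from the paper).
    """
    history = np.asarray(history, dtype=float)   # (windows, topics, vocab)
    weights = np.asarray(weights, dtype=float)   # (windows,)
    if history.shape[0] != weights.shape[0]:
        raise ValueError("need exactly one weight per historical window")
    # Linear weighted sum over the window axis gives a (topics, vocab) array.
    blended = np.tensordot(weights, history, axes=(0, 0))
    return blended + base_prior

# Two historical windows, two topics, a three-word vocabulary (toy numbers).
phi_old = [[0.7, 0.2, 0.1], [0.1, 0.1, 0.8]]
phi_recent = [[0.5, 0.4, 0.1], [0.2, 0.2, 0.6]]
prior = weighted_prior([phi_old, phi_recent], weights=[0.3, 0.7])
```

In this sketch the more recent window receives the larger weight, so the prior for the current window leans toward recent topic content while still retaining some older history. How new and vanished topics are detected is not specified in the abstract and is deliberately left out.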