Computer Engineering (计算机工程)

• Artificial Intelligence and Recognition Technology •

Hot Topic Discovering Algorithm for Microblog Based on Discrete Particle Swarm Optimization

MA Huifang, JI Yugang, LI Xiaohong, ZHOU Runan

  1. (College of Computer Science & Engineering, Northwest Normal University, Lanzhou 730070, China)
  • Received: 2015-02-04 Online: 2016-03-15 Published: 2016-03-15
  • About the authors: MA Huifang (b. 1981), female, associate professor, Ph.D.; main research interests: artificial intelligence, data mining, and machine learning. JI Yugang, undergraduate student. LI Xiaohong, lecturer. ZHOU Runan, undergraduate student.
  • Supported by:

    National Natural Science Foundation of China (61363058, 61163039); Open Fund of the Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences (IIP2014-4); Natural Science Foundation of Gansu Province (145RJZA232); Youth Science and Technology Fund of Gansu Province (145RJYA259).

Abstract:

Combining term relationships with the characteristics of Particle Swarm Optimization (PSO), this paper proposes a hot topic discovery algorithm for microblogs based on Discrete Particle Swarm Optimization (DPSO). Term mutual information and the intra- and inter-relation information between terms are mined to update the traditional text representation model. DPSO is then used to find hot topics from an optimization perspective, which simplifies the microblog clustering process. A clustering quality evaluation index serves as the fitness function, so the clustering result is optimized iteratively until the best solution is obtained. Experimental results show that the algorithm discovers hot topics quickly in a large number of microblogs, with high accuracy and running efficiency.

Key words: microblog, hot topic discovering, term relationship, text representation model, Particle Swarm Optimization (PSO)
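
The abstract does not specify the particle encoding, the position update rule, or the exact clustering quality index, so the following Python sketch is only an illustration of the general idea, not the authors' algorithm. It assumes that each particle is a vector of cluster labels (one per microblog), that a label is copied from the particle's personal best or from the global best with fixed probabilities (the hypothetical parameters `p_personal`, `p_global`, `p_mutate`), and that the fitness is the mean cosine similarity between each document vector and its cluster centroid. The function names and the toy vectors are likewise made up for the example.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def fitness(labels, docs, k):
    """Assumed quality index: mean cosine similarity of each doc to its cluster centroid."""
    dim = len(docs[0])
    sums = [[0.0] * dim for _ in range(k)]
    for lab, doc in zip(labels, docs):
        for j, x in enumerate(doc):
            sums[lab][j] += x
    # Cosine is scale-invariant, so the summed vector stands in for the centroid.
    return sum(cosine(doc, sums[lab]) for lab, doc in zip(labels, docs)) / len(docs)


def dpso_cluster(docs, k, n_particles=20, n_iters=50,
                 p_personal=0.3, p_global=0.4, p_mutate=0.05, seed=0):
    """Search for a k-way labelling of docs that maximises fitness()."""
    rng = random.Random(seed)
    n = len(docs)
    # Each particle is a discrete position: one cluster label per document.
    swarm = [[rng.randrange(k) for _ in range(n)] for _ in range(n_particles)]
    pbest = [p[:] for p in swarm]
    pbest_fit = [fitness(p, docs, k) for p in swarm]
    g = max(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iters):
        for i, particle in enumerate(swarm):
            for d in range(n):
                r = rng.random()
                if r < p_personal:                          # follow personal best
                    particle[d] = pbest[i][d]
                elif r < p_personal + p_global:             # follow global best
                    particle[d] = gbest[d]
                elif r < p_personal + p_global + p_mutate:  # random exploration
                    particle[d] = rng.randrange(k)
                # otherwise the current label is kept
            f = fitness(particle, docs, k)
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = particle[:], f
                if f > gbest_fit:
                    gbest, gbest_fit = particle[:], f
    return gbest, gbest_fit


# Toy term-weight vectors: two documents per topic (invented data).
docs = [
    [1.0, 0.9, 0.0, 0.0],
    [0.8, 1.0, 0.1, 0.0],
    [0.0, 0.1, 1.0, 0.9],
    [0.0, 0.0, 0.9, 1.0],
]
labels, score = dpso_cluster(docs, k=2)
print(labels, round(score, 4))
```

In this sketch the document vectors would come from the updated text representation model that the abstract describes. A different clustering quality index can be used by replacing `fitness` without touching the search loop.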

CLC number: