Computer Engineering ›› 2013, Vol. 39 ›› Issue (8): 55-59. doi: 10.3969/j.issn.1000-3428.2013.08.011

• Advanced Computing and Data Processing •

Research on Weibo Community Discovery Based on Behavior Similarity

CAI Bo-si, CHEN Xiang

  1. (School of Management and Economics, Beijing Institute of Technology, Beijing 100081, China)
  • Received: 2012-05-07 Online: 2013-08-15 Published: 2013-08-13
  • About the authors: CAI Bo-si (born 1988), male, master's student, research interests: social computing and data mining; CHEN Xiang, associate professor, Ph.D.
  • Funding:
    National Natural Science Foundation of China (71102111)

Abstract: Real-world weibo (microblog) relationship matrices are usually sparse, and communities partitioned from relationship links alone reflect only the friendship ties among their members. To address both problems, this paper proposes a weibo community discovery model based on behavior similarity. It selects behavior-related indicators from weibo and applies Principal Component Analysis (PCA) to reduce their dimension and construct the behavior similarity, which addresses the sparseness of the relationship matrix. It then applies a modified Clique Percolation Method (CPM), dichotomizing the input matrix to reduce the computational cost of CPM. Using real data collected from Sina Weibo, the model is compared with a community partition model built on relationship attributes. The results show that the average clustering coefficient of the proposed model is 5 times higher than that of the relationship-based model, indicating that the communities it discovers are more cohesive.

Key words: behavior similarity, weibo, community discovery, community division, Principal Component Analysis (PCA), Clique Percolation Method (CPM)
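
The dichotomization step and the clustering-coefficient comparison mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the use of cosine similarity, the threshold of 0.9, and the sample behavior scores are all assumptions made for illustration, and the PCA and CPM steps are omitted.

```python
from itertools import combinations
from math import sqrt


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def dichotomize(features, threshold):
    """Turn pairwise similarities into an undirected 0/1 graph (neighbour sets)."""
    n = len(features)
    adj = {i: set() for i in range(n)}
    for i, j in combinations(range(n), 2):
        # Assumption: an edge exists when similarity reaches the threshold.
        if cosine(features[i], features[j]) >= threshold:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def average_clustering(adj):
    """Mean local clustering coefficient; nodes with fewer than 2 neighbours count as 0."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        # Count edges that exist among this node's neighbours.
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj) if adj else 0.0


# Hypothetical behaviour scores (one row per user), e.g. after dimension reduction.
users = [
    [1.0, 0.0],
    [0.9, 0.1],
    [0.8, 0.2],
    [0.0, 1.0],
]
graph = dichotomize(users, threshold=0.9)
score = average_clustering(graph)
```

In this toy input the first three users behave alike and form a triangle, while the fourth is isolated. A binary graph of this kind is the sort of input a clique-based method such as CPM would then operate on.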

CLC Number: