
Computer Engineering (计算机工程), 2010, Vol. 36, Issue 7: 79-81. doi: 10.3969/j.issn.1000-3428.2010.07.028

• Software Technology and Database •

基于BBS热点主题发现的文本聚类方法

唐 果,陈宏刚   


Text Clustering Method Based on BBS Hot Topics Discovery

TANG Guo, CHEN Hong-gang   

  1. (Faculty of Computer and Information Science, Southwest University, Chongqing 400715)
  • Online: 2010-04-05 Published: 2010-04-05

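Abstract (translated from the Chinese): To address the inadequate post-browsing mechanism of bulletin board systems (BBS) and the low efficiency of topic discovery, a text clustering method based on BBS hot topic discovery is proposed. The vectors of the documents that contain a given keyword are summed; after weighting, the pairwise distances between the resulting vectors are computed and the two closest classes are merged. This step is repeated so that the final classes are fairly uniform in size, and the posts are organized as a hierarchical menu for level-by-level browsing. Experimental results show that the method is better suited to BBS topic browsing than conventional methods.

Keywords (translated from the Chinese): browsing mechanism, text clustering, hot topics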

Abstract: To address the inadequate post-browsing mechanism of bulletin board systems (BBS) and the low efficiency of topic discovery, a text clustering method based on BBS hot topic discovery is proposed. The vectors of the documents that contain a keyword are summed and weighted, the pairwise distances between the resulting vectors are computed, and the two classes with the minimum distance are merged repeatedly, so that the final classes contain roughly equal numbers of vectors. Posts are then organized into a hierarchical menu, which guides users to the posts they are interested in and gives them an overview of the current hot topics on the BBS. Experimental results show that the method is better suited to browsing hot topics on a BBS than conventional clustering methods.

Key words: browsing mechanism, text clustering, hot topics
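
The merge procedure outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the weighting scheme, the distance measure, or how class sizes are balanced, so this sketch assumes unit-length normalization as the weighting, cosine distance as the distance, and a fixed target number of classes as the stopping rule. All function names and the toy data are hypothetical.

```python
import math
from itertools import combinations


def normalize(vec):
    """Scale a vector to unit length (assumed stand-in for the weighting step)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def keyword_vector(doc_vectors, doc_terms, keyword):
    """Sum the vectors of all documents whose term set contains `keyword`."""
    total = [0.0] * len(doc_vectors[0])
    for vec, terms in zip(doc_vectors, doc_terms):
        if keyword in terms:
            total = [a + b for a, b in zip(total, vec)]
    return total


def cosine_distance(u, v):
    """1 - cosine similarity of two vectors (assumed distance measure)."""
    return 1.0 - sum(a * b for a, b in zip(normalize(u), normalize(v)))


def cluster_keywords(doc_vectors, doc_terms, keywords, n_clusters):
    """Repeatedly merge the two closest classes until n_clusters remain."""
    # Each class is a pair: (member keywords, summed vector).
    classes = [([kw], keyword_vector(doc_vectors, doc_terms, kw)) for kw in keywords]
    while len(classes) > n_clusters:
        # Find the pair of classes with the smallest pairwise distance.
        i, j = min(
            combinations(range(len(classes)), 2),
            key=lambda p: cosine_distance(classes[p[0]][1], classes[p[1]][1]),
        )
        merged = (
            classes[i][0] + classes[j][0],
            [a + b for a, b in zip(classes[i][1], classes[j][1])],
        )
        classes = [c for k, c in enumerate(classes) if k not in (i, j)] + [merged]
    return [sorted(members) for members, _ in classes]


# Toy posts as term-count vectors over the vocabulary below (hypothetical data).
vocabulary = ["exam", "grade", "football", "match"]
doc_vectors = [[2, 1, 0, 0], [1, 0, 0, 0], [0, 0, 2, 1], [0, 0, 0, 3]]
doc_terms = [{"exam", "grade"}, {"exam"}, {"football", "match"}, {"match"}]

clusters = cluster_keywords(doc_vectors, doc_terms, vocabulary, n_clusters=2)
print(clusters)
```

On this toy input the two study-related keywords end up in one class and the two sports-related keywords in the other. Recording the order of the merges, rather than stopping at a fixed count, would give the hierarchy needed for a multi-level menu.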
