Abstract:
To improve text clustering performance, this paper proposes a text clustering method based on word hypercliques. The method evaluates the similarity between documents using the association patterns of the words they contain, uses word hypercliques as auxiliary information for the document vectors, and clusters the document set by graph partitioning. A comparison of the results of different clustering methods shows that the word-hyperclique-based method improves text clustering accuracy.
Key words: text clustering, word hyperclique, clustering mode, feature selection
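The abstract does not define a word hyperclique. As a minimal sketch, assuming the usual hyperclique-pattern definition from association mining (a word set whose h-confidence, i.e. its joint support divided by the largest single-word support, reaches a threshold), candidate word hypercliques could be found as below. The function names, the toy documents, and the 0.6 threshold are illustrative assumptions, not the authors' implementation.

```python
from itertools import combinations


def support(docs, words):
    """Fraction of documents that contain every word in `words`."""
    words = set(words)
    return sum(words <= doc for doc in docs) / len(docs)


def h_confidence(docs, words):
    """Joint support divided by the largest single-word support."""
    max_single = max(support(docs, [w]) for w in words)
    return support(docs, words) / max_single if max_single else 0.0


def word_hypercliques(docs, size=2, min_hconf=0.6):
    """Brute-force search over word sets of a fixed size (assumed threshold)."""
    vocab = sorted(set().union(*docs))
    return [
        combo
        for combo in combinations(vocab, size)
        if h_confidence(docs, combo) >= min_hconf
    ]


# Toy corpus: each document is represented as a set of words.
docs = [
    {"graph", "cluster", "partition"},
    {"graph", "cluster"},
    {"word", "document", "vector"},
    {"word", "document"},
]

print(word_hypercliques(docs))
# → [('cluster', 'graph'), ('document', 'word')]
```

Words that almost always occur together score close to 1, while a pair such as "cluster" and "partition" scores only 0.5 here because "partition" appears in half of the documents that contain "cluster".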
QU Chao, PAN Xiao-Heng, ZHU Jun, CAI Shao-Zhong, HU Tian-Ming. Text Clustering Method Based on Word Hyperclique[J]. Computer Engineering, 2011, 37(11): 86-88.
曲超, 潘晓衡, 朱君, 蔡少仲, 胡天明. 基于单词超团的文本聚类方法[J]. 计算机工程, 2011, 37(11): 86-88.