Text Clustering Method Based on Word Hyperclique

Computer Engineering, 2011, Vol. 37, Issue (11): 86-88. doi: 10.3969/j.issn.1000-3428.2011.11.029

Networks and Communications

QU Chao¹, PAN Xiao-heng¹, ZHU Jun¹, CAI Shao-zhong², HU Tian-ming¹

(1. College of Computer Science, Dongguan University of Technology, Dongguan 523000, China; 2. Yihao Electronics Technology Co., Ltd., Dongguan 523000, China)

Online: 2011-06-05 Published: 2011-06-09

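About the authors: QU Chao (b. 1979), male, lecturer, M.S., main research interests: information networks and information retrieval; PAN Xiao-heng, master's student; ZHU Jun, associate professor, Ph.D.; CAI Shao-zhong, B.S.; HU Tian-ming, associate professor, Ph.D.
Funding:
National Natural Science Foundation of China (60773050, U0935003)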

Abstract

Abstract: To improve text clustering performance, this paper proposes a text clustering method based on word hypercliques. The method evaluates the similarity between documents using the association patterns of the words they contain, uses word hypercliques as auxiliary information for the document vectors, and clusters the document set by graph partitioning. A comparison of the results of different clustering methods shows that the method based on word hypercliques improves the accuracy of text clustering.

Key words: text clustering, word hyperclique, clustering mode, feature selection
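
The abstract does not describe how word hypercliques are mined. As an illustration only, the sketch below uses the standard definition of a hyperclique pattern from the association-pattern literature: a word set is a hyperclique when its h-confidence (the support of the whole set divided by the largest support of any single word in it) reaches a threshold. The function names, toy documents, and threshold values are assumptions made for this example and are not taken from the paper.

```python
from itertools import combinations

def support(docs, words):
    """Fraction of documents that contain every word in `words`."""
    words = set(words)
    return sum(words <= doc for doc in docs) / len(docs)

def h_confidence(docs, words):
    """Support of the word set divided by the largest single-word support."""
    max_single = max(support(docs, [w]) for w in words)
    return support(docs, words) / max_single if max_single > 0 else 0.0

def word_hypercliques(docs, size=2, min_support=0.0, min_hconf=0.5):
    """Brute-force search for word sets of a given size that pass both thresholds."""
    vocab = sorted(set().union(*docs))
    result = []
    for candidate in combinations(vocab, size):
        if (support(docs, candidate) > min_support
                and h_confidence(docs, candidate) >= min_hconf):
            result.append(candidate)
    return result

# Toy corpus (hypothetical): each document is the set of words it contains.
docs = [
    {"cluster", "graph", "partition"},
    {"cluster", "graph"},
    {"word", "vector"},
    {"word", "vector", "graph"},
]

pairs = word_hypercliques(docs, size=2, min_hconf=0.6)
```

In this toy corpus, "word" and "vector" always occur together, so their pair has h-confidence 1.0, while "graph" and "partition" co-occur in only one of the three documents that contain "graph" and are filtered out. How such word sets are then attached to the document vectors and passed to the graph-partitioning step is not specified in the abstract, so it is not sketched here.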

CLC Number: TP391.1

QU Chao, PAN Xiao-heng, ZHU Jun, CAI Shao-zhong, HU Tian-ming. Text Clustering Method Based on Word Hyperclique[J]. Computer Engineering, 2011, 37(11): 86-88.

URL: http://www.ecice06.com/EN/10.3969/j.issn.1000-3428.2011.11.029

http://www.ecice06.com/EN/Y2011/V37/I11/86
