
Computer Engineering ›› 2010, Vol. 36 ›› Issue (13): 90-92. doi: 10.3969/j.issn.1000-3428.2010.13.032

• Software Technology and Database •

Text Classification Approach Based on Fuzzy Soft Set Theory

HONG Zhi-yong1,2, QIN Ke-yun1

  (1. School of Mathematics, Southwest Jiaotong University, Chengdu 610031; 2. School of Computer Science, Wuyi University, Jiangmen 529020)
  • Online: 2010-07-05 Published: 2010-07-05
  • About the authors: HONG Zhi-yong (born 1978), male, lecturer and Ph.D. candidate, research interests: intelligent information processing and data mining; QIN Ke-yun, professor, Ph.D., doctoral supervisor
  • Supported by:

    Natural Science Foundation of Guangdong Province (9151001003000005)

Abstract:

To improve the accuracy of text classification, a text classification approach based on fuzzy soft set theory is proposed. The text training set is represented as a fuzzy soft set in tabular form, and the category of a new text is determined by reducing the soft set table and constructing the soft set comparison table. To address the loss of classification accuracy that occurs during feature extraction when a candidate feature is closely related to features already selected, a feature selection algorithm based on normalized mutual information is given, which effectively resolves this problem. Compared with the traditional KNN and SVM classification algorithms, the fuzzy soft set approach improves both the precision and the accuracy of text classification.

Key words: text classification, soft set, fuzzy soft set, feature selection, mutual information
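
The comparison-table step mentioned in the abstract can be illustrated with a small sketch. The abstract does not give the paper's exact construction, so the code below follows the widely used comparison-table scheme for fuzzy soft sets (entry c[i][j] counts the parameters on which object i has membership at least that of object j, and an object's score is its row sum minus its column sum). Treating categories as objects, terms as parameters, and `1 - |profile - doc|` as the match degree are assumptions made for illustration, as are the names `comparison_table`, `scores`, and `classify`.

```python
from typing import Dict, List


def comparison_table(memberships: List[List[float]]) -> List[List[int]]:
    """c[i][j] = number of parameters where object i's membership >= object j's."""
    n = len(memberships)
    return [
        [sum(1 for a, b in zip(memberships[i], memberships[j]) if a >= b)
         for j in range(n)]
        for i in range(n)
    ]


def scores(table: List[List[int]]) -> List[int]:
    """Score of object i = row sum of i minus column sum of i."""
    n = len(table)
    row_sums = [sum(table[i]) for i in range(n)]
    col_sums = [sum(table[i][j] for i in range(n)) for j in range(n)]
    return [r - t for r, t in zip(row_sums, col_sums)]


def classify(class_profiles: Dict[str, List[float]], doc: List[float]) -> str:
    """Pick the category whose fuzzy profile best matches the document.

    Assumption (not from the paper): the match degree of a category on a
    term is 1 - |profile membership - document membership|.
    """
    labels = list(class_profiles)
    match = [
        [1.0 - abs(p - d) for p, d in zip(class_profiles[c], doc)]
        for c in labels
    ]
    s = scores(comparison_table(match))
    return labels[s.index(max(s))]


# Hypothetical toy data: three term memberships per category.
profiles = {
    "sports": [0.9, 0.8, 0.1],
    "finance": [0.1, 0.2, 0.9],
}
new_doc = [0.8, 0.7, 0.2]
predicted = classify(profiles, new_doc)
```

With this toy data the "sports" profile matches the new document more closely on every term, so it receives the higher score. The reduction step and the normalized mutual information feature selection described in the abstract are not covered by this sketch.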

CLC number: