
Computer Engineering, 2011, Vol. 37, Issue 16: 149-151. doi: 10.3969/j.issn.1000-3428.2011.16.051

• Artificial Intelligence and Recognition Technology •

Text Classification Approach Based on Fuzzy Relationship

ZHANG Yu-fang, LOU Juan, LI Zhi-xing, XIONG Zhong-yang

  1. (College of Computer, Chongqing University, Chongqing 400030, China)
  • Received: 2011-01-21 Online: 2011-08-20 Published: 2011-08-20
  • About the authors: ZHANG Yu-fang (born 1965), female, associate professor, research interests: text classification, data mining; LOU Juan, master's student; LI Zhi-xing, doctoral student; XIONG Zhong-yang, professor and doctoral supervisor
  • Funding:
    China Postdoctoral Science Foundation (20070420711); Chongqing Science and Technology Commission Foundation (CSTC, 2008BB2191)

Abstract: To classify unlabeled text more effectively, this paper presents a text classification approach based on fuzzy relationships. Membership functions are defined for documents and for categories, so that each test document and each category is represented as a fuzzy set over features. The correlation coefficient between these fuzzy sets is computed and used as the membership degree of the test document in each category, and the document is then assigned to a category by the maximum membership principle. Experimental results show that, compared with k-NN, the approach achieves better precision and considerably faster classification.

Key words: text classification, membership function, fuzzy relationship, correlation coefficient, membership degree
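
The pipeline described in the abstract (fuzzy sets over features, a correlation coefficient between them, then the maximum membership principle) can be illustrated with a minimal Python sketch. The abstract does not give the paper's actual membership functions or its correlation formula, so everything concrete below is an assumption made for illustration: the document membership is term frequency scaled by the largest frequency, the category membership is the share of training documents containing a term, and the correlation coefficient is a cosine-style form. The function names and the toy data are hypothetical as well.

```python
import math
from collections import Counter


def doc_membership(tokens):
    """Fuzzy set of a document over its terms.

    Assumed form: term frequency divided by the largest term frequency,
    so every membership value lies in (0, 1].
    """
    counts = Counter(tokens)
    if not counts:
        return {}
    peak = max(counts.values())
    return {term: n / peak for term, n in counts.items()}


def category_membership(docs):
    """Fuzzy set of a category over terms.

    Assumed form: fraction of the category's training documents that
    contain the term at least once.
    """
    doc_freq = Counter(term for tokens in docs for term in set(tokens))
    return {term: n / len(docs) for term, n in doc_freq.items()}


def correlation(a, b):
    """Cosine-style correlation coefficient between two fuzzy sets (assumed form)."""
    dot = sum(mu * b.get(term, 0.0) for term, mu in a.items())
    norm = math.sqrt(sum(v * v for v in a.values()) * sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def classify(tokens, category_sets):
    """Pick the category with the largest membership degree."""
    doc = doc_membership(tokens)
    degrees = {label: correlation(doc, fs) for label, fs in category_sets.items()}
    return max(degrees, key=degrees.get), degrees


# Hypothetical toy training data, already tokenized.
training = {
    "sports": [["match", "team", "goal"], ["team", "coach", "season"]],
    "finance": [["stock", "market", "bank"], ["bank", "loan", "rate"]],
}
category_sets = {label: category_membership(docs) for label, docs in training.items()}

label, degrees = classify(["team", "goal", "team", "bank"], category_sets)
print(label, degrees)
```

In a scheme of this shape each test document is compared against one fuzzy set per category rather than against every training document, as k-NN requires, which is consistent with the faster classification the abstract reports.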

CLC number: