Computer Engineering, 2008, Vol. 34, Issue 15: 208-210. doi: 10.3969/j.issn.1000-3428.2008.15.075

• Artificial Intelligence and Recognition Technology •

Application of Rough Set Theory and DT_SVM in Web Information Filtering

YI Zhi-an, LIU Yang   

  1. (College of Computer and Information Technology, Daqing Petroleum Institute, Daqing 163318)
  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2008-08-05 Published: 2008-08-05

Abstract: This paper proposes a new data classification and filtering method for Web information filtering that combines rough set theory with Decision Tree SVM (DT_SVM). The method uses an improved heuristic relative attribute reduction algorithm to eliminate redundancy and reduce the dimensionality of the sample space. It then trains the SVM by combining clustering with DT_SVM, which converts the multiclass problem into binary classification problems and improves both training speed and filtering precision. Experimental results show that the algorithm achieves high recall and precision, demonstrating the advantage of combining rough set theory with DT_SVM.

Key words: Web information filtering, rough set theory, DT_SVM, attribute reduction, clustering
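
To make the attribute reduction step concrete, here is a minimal sketch of a greedy, dependency-based relative reduct in the style of the standard QuickReduct heuristic. It is not the improved algorithm proposed in the paper, whose details are not given in the abstract. The toy decision table, the attribute names (`sports`, `stock`, `game`, `label`), and the function names are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical toy decision table: term-presence features for Web pages
# plus a filtering decision. Not data from the paper.
ROWS = [
    {"sports": 1, "stock": 0, "game": 1, "label": "keep"},
    {"sports": 1, "stock": 0, "game": 0, "label": "keep"},
    {"sports": 0, "stock": 1, "game": 0, "label": "drop"},
    {"sports": 0, "stock": 1, "game": 1, "label": "drop"},
    {"sports": 0, "stock": 0, "game": 1, "label": "drop"},
]
CONDITIONS = ["sports", "stock", "game"]


def dependency(rows, attrs, decision):
    """Fraction of rows lying in the positive region of `decision` w.r.t. `attrs`.

    Rows are grouped into equivalence classes by their values on `attrs`;
    a class counts toward the positive region only if all of its rows
    share the same decision value.
    """
    blocks = defaultdict(list)
    for row in rows:
        key = tuple(row[a] for a in attrs)
        blocks[key].append(row[decision])
    consistent = sum(len(d) for d in blocks.values() if len(set(d)) == 1)
    return consistent / len(rows)


def greedy_reduct(rows, conditions, decision):
    """Add attributes one at a time until the dependency of the full set is reached."""
    target = dependency(rows, conditions, decision)
    reduct = []
    while dependency(rows, reduct, decision) < target:
        remaining = [a for a in conditions if a not in reduct]
        # Heuristic: pick the attribute that raises the dependency the most.
        best = max(remaining, key=lambda a: dependency(rows, reduct + [a], decision))
        reduct.append(best)
    return reduct


REDUCT = greedy_reduct(ROWS, CONDITIONS, "label")
```

In this toy table the single attribute `sports` already separates the two decisions, so the other two attributes are redundant and can be dropped before any classifier is trained. A greedy search of this kind returns a set that preserves the dependency of the full attribute set, but it is not guaranteed to find a minimal reduct.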
