Computer Engineering ›› 2007, Vol. 33 ›› Issue (09): 191-193.

• Artificial Intelligence and Recognition Technology •

Method for Extracting Strongly Negative Data Set
by Improved 1-DNF Algorithm

HE Fengling (赫枫龄), ZUO Wanli (左万利), YU Hailong (于海龙)

  1. (College of Computer Science and Technology Engineering, Jilin University, Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Changchun 130012)
  • Online: 2007-05-05 Published: 2007-05-05

Abstract: Extracting an initial set of strongly negative examples from positive data and unlabeled data is the basis for constructing a text classifier for the PU (positive and unlabeled) problem with the two-step framework. This paper describes the limitations of using the 1-DNF algorithm to extract that initial set and proposes an improved 1-DNF algorithm. Experimental results show that, compared with the original algorithm, the improved algorithm greatly increases the number of strongly negative examples obtained, speeds up convergence, and raises the precision of the classifier.

Key words: Text classification, PU-oriented text classification, Text classifier
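
The abstract does not spell out the steps of 1-DNF. As a rough illustration only, the sketch below follows the baseline 1-DNF procedure as it is commonly described in the PU-learning literature: a term counts as a positive feature if it appears in a larger fraction of positive documents than of unlabeled documents, and an unlabeled document containing no positive feature is taken as a strongly negative example. This is not the improved algorithm proposed in the paper. The function name `one_dnf_strong_negatives`, the tokenized-list input format, and the toy data are all assumptions made for this example.

```python
from collections import Counter


def one_dnf_strong_negatives(positive_docs, unlabeled_docs):
    """Baseline 1-DNF sketch (hypothetical helper, not the paper's improved method).

    Each document is an iterable of tokens. Returns a pair:
    (set of positive features, indices of strongly negative unlabeled docs).
    """
    pos_sets = [set(doc) for doc in positive_docs]
    unl_sets = [set(doc) for doc in unlabeled_docs]

    # Document frequency of every term in P and in U.
    df_pos = Counter(term for s in pos_sets for term in s)
    df_unl = Counter(term for s in unl_sets for term in s)
    n_pos, n_unl = len(pos_sets), len(unl_sets)

    # A term is a positive feature when its document fraction in P exceeds
    # its document fraction in U. Cross-multiplying avoids division by zero.
    positive_features = {
        term for term, count in df_pos.items()
        if count * n_unl > df_unl[term] * n_pos
    }

    # An unlabeled document with no positive feature is a strong negative.
    strong_negatives = [
        i for i, s in enumerate(unl_sets) if s.isdisjoint(positive_features)
    ]
    return positive_features, strong_negatives


# Toy data, invented for illustration.
positive = [["neural", "network", "training"], ["neural", "model"]]
unlabeled = [
    ["neural", "network"],
    ["stock", "market"],
    ["football", "match", "model"],
    ["market", "training"],
]

features, negatives = one_dnf_strong_negatives(positive, unlabeled)
print(sorted(features))  # ['model', 'network', 'neural', 'training']
print(negatives)         # [1]
```

In this toy input every term of the positive documents passes the frequency test, so a single shared word such as "model" or "training" is enough to exclude an unlabeled document, and only one of the four is selected. That is one way a baseline of this kind can end up with few strongly negative examples, the quantity the abstract reports the improved algorithm increases.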
