Computer Engineering ›› 2009, Vol. 35 ›› Issue (18): 182-184. doi: 10.3969/j.issn.1000-3428.2009.18.064

• Artificial Intelligence and Recognition Technology •

Feature Selection Method for Inequality Maximum Entropy

ZHANG Yong, LI Xiao-hong, FAN Bin   

  1. (School of Computer and Communication, Lanzhou University of Technology, Lanzhou 730050)
  • Online: 2009-09-20 Published: 2009-09-20

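Abstract (translated from Chinese): The inequality maximum entropy model has been fairly successful at alleviating overfitting in text classification tasks, but the feature selection algorithm it uses cannot fully exploit the strengths of inequality maximum entropy. To address this problem, an improved sequential forward selection algorithm is adopted to raise the recognition rate in text classification. Experimental results show that the algorithm selects representative text features more accurately and yields some improvement in the classification performance of the inequality maximum entropy model.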

Abstract: The inequality maximum entropy method, with its flexible modeling capability, alleviates data sparseness in text classification tasks more successfully than other probabilistic models, but the feature selection algorithm used by the model cannot fully exploit its advantages. This paper proposes a new feature selection method that improves the recognition rate in text classification. Experimental results show that the algorithm selects representative features more effectively and improves the text classification performance of the model.

Key words: inequality maximum entropy, feature selection, text classification
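
The abstract names sequential forward selection as the starting point of the proposed method but does not describe the improvement itself. As orientation only, the following is a minimal sketch of plain (unimproved) sequential forward selection, not the authors' algorithm. The function name, the `score` callback, the toy feature weights and the per-feature penalty are all hypothetical and chosen purely for illustration.

```python
from typing import Callable, FrozenSet, List, Sequence


def sequential_forward_selection(
    candidates: Sequence[str],
    score: Callable[[FrozenSet[str]], float],
    max_features: int,
) -> List[str]:
    """Greedily add the feature that raises `score` the most.

    Stops when no remaining feature improves the criterion or when
    `max_features` features have been selected.
    """
    selected: List[str] = []
    remaining = list(candidates)
    best = score(frozenset())
    while remaining and len(selected) < max_features:
        # Score every remaining candidate when added to the current subset.
        trials = [(score(frozenset(selected + [f])), f) for f in remaining]
        top_score, top_feature = max(trials, key=lambda t: t[0])
        if top_score <= best:
            break  # no candidate improves the criterion
        selected.append(top_feature)
        remaining.remove(top_feature)
        best = top_score
    return selected


# Hypothetical toy criterion: summed usefulness minus a cost per feature.
# A real text classifier would use held-out accuracy or a similar measure.
WEIGHTS = {"goal": 0.5, "match": 0.3, "the": 0.0, "stock": 0.4}


def toy_score(subset: FrozenSet[str]) -> float:
    return sum(WEIGHTS[f] for f in subset) - 0.1 * len(subset)


chosen = sequential_forward_selection(list(WEIGHTS), toy_score, max_features=4)
print(chosen)
```

In this toy run the uninformative word "the" is never selected, because adding it lowers the criterion. In practice the criterion would be supplied by the classifier being tuned, for example validation accuracy of a maximum entropy model trained on the candidate subset.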
