
Computer Engineering ›› 2011, Vol. 37 ›› Issue (10): 52-54. doi: 10.3969/j.issn.1000-3428.2011.10.017

• Software Technology and Database •

Attribute Reduction Based on Reproducing Kernel Hilbert Space PCA

HUANG Gan-ji, LV Yue-jin

  1. (College of Mathematics and Information Science, Guangxi University, Nanning 530004, China)
  • Online: 2011-05-20 Published: 2011-05-20
  • About the authors: HUANG Gan-ji (b. 1972), male, lecturer, M.S.; research interests: probability and statistics, data mining. LV Yue-jin, professor.
  • Supported by:
    National Natural Science Foundation of China (No. 11061002); Natural Science Foundation of Guangxi (No. 0991027)

Dimensionality Reduction Based on Principal Component Analysis in Reproducing Kernel Hilbert Space

HUANG Gan-ji, LV Yue-jin   

  1. (College of Mathematics and Information Science, Guangxi University, Nanning 530004, China)
  • Online:2011-05-20 Published:2011-05-20

Abstract: The traditional kernel principal component analysis method projects the original data into a high-dimensional space through an implicit real-valued function before attribute reduction, which increases the time needed to search for the classification hyperplane and lowers classification accuracy. To address this, an attribute reduction method based on principal component analysis in a reproducing kernel Hilbert space is proposed: the original data are projected through an explicit continuous-valued function into a high- or infinite-dimensional reproducing kernel space, and attribute reduction is then performed there. Experimental results on real data sets show that the method effectively improves classification accuracy and reduces running time.

Keywords: data mining, attribute reduction, Hilbert space, principal component analysis

Abstract: The traditional Kernel Principal Component Analysis(KPCA) method maps the original data into a high-dimensional space via an implicit mapping defined through a real-valued kernel function; such a mapping makes searching for the separating hyperplane in classification tasks time-consuming and lowers classification accuracy. To address this problem, this paper maps the input via an explicit mapping into a Reproducing Kernel Hilbert Space(RKHS), a space of continuous-valued functions, and then performs dimensionality reduction in the RKHS. Experimental results on real text data show that the proposed method outperforms the compared method in both classification accuracy and running time.
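The pipeline the abstract describes — map the data through an explicit feature map into a reproducing kernel feature space, then run ordinary PCA there — can be sketched as follows. The paper's exact RKHS map is not given here, so this illustration substitutes an explicit degree-2 polynomial feature map, whose inner products reproduce the kernel k(x, y) = (x·y + c)²; the names `explicit_poly_map` and `pca_reduce` are hypothetical.

```python
import numpy as np

def explicit_poly_map(X, c=1.0):
    """Explicit degree-2 polynomial feature map phi(x) such that
    phi(x) . phi(y) = (x . y + c)**2 (an illustrative stand-in for
    the paper's explicit RKHS map, which is not specified here)."""
    n, d = X.shape
    feats = [np.full((n, 1), c)]            # constant term, value c
    feats.append(np.sqrt(2.0 * c) * X)      # scaled linear terms
    for i in range(d):                      # quadratic cross terms
        for j in range(i, d):
            coef = 1.0 if i == j else np.sqrt(2.0)
            feats.append((coef * X[:, i] * X[:, j])[:, None])
    return np.hstack(feats)

def pca_reduce(Phi, k):
    """Center the mapped data and project onto the top-k principal
    components, obtained from the SVD of the centered matrix."""
    Phi_c = Phi - Phi.mean(axis=0)
    _, _, Vt = np.linalg.svd(Phi_c, full_matrices=False)
    return Phi_c @ Vt[:k].T

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 3))   # 50 samples, 3 original attributes
Phi = explicit_poly_map(X)         # explicit 10-dimensional feature space
Z = pca_reduce(Phi, k=4)           # reduced representation for a classifier
```

Because the map is explicit, the reduced features `Z` can be fed directly to any downstream classifier, with no kernel matrix needed at prediction time — which is the source of the running-time advantage the abstract claims.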

Key words: data mining, dimensionality reduction, Hilbert space, Principal Component Analysis(PCA)
