Abstract:
This paper defines the large-scale complex dataset as a dataset with a very large number of sample points, many indicators (attributes) describing each object, temporal and spatial dynamics, and a large amount of noise. To meet the requirements of mining such datasets, it draws on the data reduction ideas and methods of statistical analysis, rough set theory, and fuzzy set theory, and proposes an efficient reduction method for large-scale complex datasets based on fuzzy clustering of samples and rough set attribute reduction.
Key words:
large-scale complex data,
data mining,
data reduction,
rough set,
fuzzy set
张诤, 王惠文. 大规模复杂数据集的约简方法[J]. 计算机工程, 2010, 36(23): 13-15,18.
ZHANG Zheng, WANG Hui-Wen. Reduction Method for Mining Largescale Complex Datasets[J]. Computer Engineering, 2010, 36(23): 13-15,18.
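The abstract names two building blocks, fuzzy clustering of samples and rough set attribute reduction, without giving their details. The following is a minimal illustrative sketch of the textbook versions of both (standard fuzzy c-means and a greedy dependency-based reduct), not the authors' algorithm. The function names, parameters, and the toy decision table are assumptions introduced here for illustration, and how the paper combines the two stages is not stated in the abstract.

```python
import random


def fuzzy_c_means(points, c, m=2.0, iters=100, seed=0):
    """Textbook fuzzy c-means (assumed variant, not taken from the paper).

    Returns (centers, u), where u[i][j] is the membership of point i
    in cluster j and each row of u sums to 1.
    """
    rng = random.Random(seed)
    n, d = len(points), len(points[0])
    # Random initial memberships, normalized per sample.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([x / total for x in row])
    centers = []
    for _ in range(iters):
        # Update centers as membership-weighted means.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            tot = sum(w)
            centers.append(
                [sum(w[i] * points[i][k] for i in range(n)) / tot for k in range(d)]
            )
        # Update memberships from distances to the new centers.
        for i in range(n):
            dist = [
                max(sum((points[i][k] - centers[j][k]) ** 2 for k in range(d)) ** 0.5, 1e-12)
                for j in range(c)
            ]
            for j in range(c):
                u[i][j] = 1.0 / sum(
                    (dist[j] / dist[l]) ** (2.0 / (m - 1.0)) for l in range(c)
                )
    return centers, u


def dependency(rows, attrs, decision):
    """Rough set dependency degree: share of rows in the positive region.

    A row is in the positive region if every row with the same values
    on `attrs` has the same value of the decision attribute.
    """
    classes = {}
    for r in rows:
        key = tuple(r[a] for a in attrs)
        classes.setdefault(key, []).append(r[decision])
    pos = sum(len(v) for v in classes.values() if len(set(v)) == 1)
    return pos / len(rows)


def quick_reduct(rows, cond_attrs, decision):
    """Greedy attribute reduction: add the attribute that raises the
    dependency degree most until it matches that of the full attribute set."""
    full = dependency(rows, cond_attrs, decision)
    reduct = []
    while dependency(rows, reduct, decision) < full:
        best = max(
            (a for a in cond_attrs if a not in reduct),
            key=lambda a: dependency(rows, reduct + [a], decision),
        )
        reduct.append(best)
    return reduct


if __name__ == "__main__":
    # Hypothetical toy data, for illustration only.
    pts = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    centers, u = fuzzy_c_means(pts, c=2)
    print(centers)

    table = [
        {"a": 0, "b": 0, "c": 1, "d": "n"},
        {"a": 0, "b": 1, "c": 1, "d": "y"},
        {"a": 1, "b": 0, "c": 0, "d": "n"},
        {"a": 1, "b": 1, "c": 0, "d": "y"},
    ]
    print(quick_reduct(table, ["a", "b", "c"], "d"))
```

In this sketch the clustering step would shrink the number of samples (for example by keeping cluster centers or representative points), while the reduct step would shrink the number of attributes. The greedy reduct is a heuristic and is not guaranteed to return a minimal attribute subset.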